needs to be run to make sure data is in sync.
Sent from my iPhone
On May 12, 2018, at 3:54 AM, onmstester onmstester onmstes...@zoho.com
wrote:
In an insert-only use case with TTL (6 months), should i run this command,
every 5-7 days on all the nodes of production cluster (according to this:
http://cassandra.apache.org/doc/latest/operating/repair.html )?
nodetool repair -pr --full
When none of the nodes was down in 4 months (ever
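The 5-7 day repair cadence above can be staggered so only one node runs `nodetool repair -pr --full` per day of the window. A minimal sketch (hostnames and the one-node-per-day policy are my assumptions, not from the thread):

```python
# Assign each node a day offset inside the repair window so full
# primary-range repairs don't all run at once. Hostnames are placeholders.
def repair_schedule(nodes, window_days=7):
    """Map each node to the day of the window on which it repairs."""
    schedule = {}
    for i, node in enumerate(nodes):
        schedule[node] = i % window_days  # wrap around if more nodes than days
    return schedule

nodes = ["node1", "node2", "node3", "node4", "node5", "node6", "node7"]
for node, day in repair_schedule(nodes).items():
    print(f"day {day}: ssh {node} nodetool repair -pr --full")
```

Each node then completes a full primary-range repair inside the window, without overlapping repair load.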
Hi,
I'm getting "Pool is Busy (limit is 256)" while connecting to a single-node
Cassandra cluster. The whole client-side application is a 3rd-party lib
whose source I can't change, and its session builder is not using any
PoolingOptions.
Is there any config on the Cassandra side that could
I recommend reviewing the Newts data model, which is a time-series data model
built on Cassandra:
https://github.com/OpenNMS/newts/wiki/DataModel
Sent using Zoho Mail
First the use-case: We have time-series of data from devices on several sites,
where each device (with a unique dev_id)
I've increased column_index_size_in_kb to 512 and then 4096: no change in
response time; it even got worse.
Even increasing the key cache size and row cache size did not help.
On Sun, 20 May 2018 08:52:03 +0430 Jeff Jirsa jji...@gmail.com
wrote
Column
The table is something like
Samples
...
primary key ((partition, resource), timestamp, metric_name)
creating the prepared statement:
session.prepare("select * from samples where partition=:partition and
resource=:resource and timestamp >= :start and timestamp <= :end and
metric_name in
Should I run compaction after changing column_index_size_in_kb?
Hi,
Due to some unpredictable behavior in the input data, I ended up with some hundred
partitions larger than 300 MB. Reading any sequence of data
from these partitions took about 5 seconds, while reading from other partitions
(under 50 MB) took less than 10 ms.
Since I can't
Data is spread between an SSD and a 15K RPM disk.
The keyspace has 26 tables in total.
I haven't tried tracing, but I will and will report back!
On Sun, 20 May 2018 08:26:33 +0430 Jonathan Haddad
j...@jonhaddad.com wrote
What disks are you using? How many sstables
It seems there is no way to do this using Cassandra, and even something
like Spark won't help, because I'm going to read from a big Cassandra partition
(the bottleneck is reading from Cassandra)
On Tue, 22 May 2018 09:08:55 +0430 onmstester onmstester
onmstes
practice, you shouldn’t do select * (as a production query) against
any database. You want to list the columns you actually want to select. That
way a later “alter table add column” (or similar) doesn’t cause unpredictable
results to the application.
Sean Durity
From: onmstester onmstester
By reading 90 partitions concurrently (each about 200 MB in size), my single-node
Apache Cassandra became unresponsive;
no reads or writes worked for almost 10 minutes.
I'm using these configs:
memtable_allocation_type: offheap_buffers
gc: G1GC
heap: 128GB
concurrent_reads: 128 (having more
I'm using RF=2 (I know it should be at least 3, but I'm short on resources) with
WCL=ONE and RCL=ONE in a cluster of 10 nodes in an insert-only scenario.
The problem: I don't want to use nodetool repair because it would put huge load
on my cluster for a long time, but I also need data
Hi,
I needed to save a distinct value for a key in each hour; the problem with
saving everything and computing distincts in memory is that there
is too much repeated data.
Table schema:
CREATE TABLE distinct (
    hourNumber int,
    key text,
    distinctValue bigint,
    PRIMARY KEY (hourNumber)
)
I want
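One way to cut down the repeated data described above is to deduplicate per (hour, key) on the client before writing, so Cassandra only ever receives the first occurrence of a value in an hour. A minimal sketch (names mirror the schema, the in-memory set approach is my assumption):

```python
from collections import defaultdict

# One set of already-seen values per (hourNumber, key); only the first
# occurrence of a value in that hour needs to be written to Cassandra.
seen = defaultdict(set)

def should_write(hour_number, key, value):
    """Return True only the first time (hour, key, value) is observed."""
    if value in seen[(hour_number, key)]:
        return False
    seen[(hour_number, key)].add(value)
    return True

print(should_write(1, "k", 42))  # True, first time in hour 1
print(should_write(1, "k", 42))  # False, duplicate within hour 1
print(should_write(2, "k", 42))  # True, new hour bucket
```

Memory stays bounded if the `seen` buckets for past hours are dropped once the hour rolls over.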
Hi,
I was doing 500K inserts + 100K counter updates per second on my cluster of 12
nodes (20 cores / 128 GB RAM / 4 * 600 GB 10K HDD) using batch statements,
with no problem.
I saw a lot of warnings showing that most batches do not concern a single node,
so they should not be in a batch; on the other
Hi
I want to load all rows from many partitions and change a column value in each
row. Which of the following ways is better regarding disk space and performance?
1. Create an update statement for every row and batch the updates for each partition
2. Create an insert statement for every row and batch
Thanks for your replies
But my current situation is that I do not have enough free disk space for my biggest
SSTable, so I can't run a major compaction or nodetool garbagecollect
On Thu, 31 May 2018 22:32:32 +0430 Alain RODRIGUEZ
arodr...@gmail.com wrote
Hi,
I've deleted 50% of my data row by row; now the disk usage of Cassandra data is more
than 80%.
The table's gc_grace was the default (10 days); I've now set it to 0. Although many
compactions have finished, no space has been reclaimed so far.
How can I force deletion of tombstones in SSTables and reclaim
ter to add /
update or insert data and do a soft delete on old data and apply a TTL to
remove it at a future time.
--
Rahul Singh
rahul.si...@anant.us
Anant Corporation
On May 27, 2018, 5:36 AM -0400, onmstester onmstester
onmstes...@zoho.com, wrote:
Hi
I want to load all rows f
jia creative city,Luoyu Road,Wuhan,HuBei
Mob: +86 13797007811|Tel: + 86 27 5024 2516
From: onmstester onmstester onmstes...@zoho.com
Sent: May 28, 2018 14:33
To: user user@cassandra.apache.org
Subject: Fwd: Re: cassandra update vs insert + delete
How does update work underneath?
Does it cr
Hi, I'm using two directories on different disks as Cassandra data storage. The
small disk is 90% full and the bigger disk is 30% full (the bigger one was added
later, when we found out we needed more storage!). I want to move all data to
the big disk; one way is to stop my application and copy
cycle.
On Feb 18, 2018, 9:23 AM -0500, onmstester onmstester
onmstes...@zoho.com, wrote:
But monitoring Cassandra over JMX using JVisualVM shows no problem; less than
30% of the heap size is used
Another question on node density, in this scenario:
1. We should keep some years of time-series data for a heavy-write system in
Cassandra (10K ops per second)
2. The system is insert-only and inserted data would never be updated
3. In the partition key, we used the number of months since 1970, so
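The months-since-1970 partition key in point 3 can be computed like this (a sketch; the function name and the use of UTC are my assumptions):

```python
from datetime import datetime, timezone

def month_bucket(ts: datetime) -> int:
    """Number of whole months since January 1970, used as the partition key."""
    return (ts.year - 1970) * 12 + (ts.month - 1)

# Jan 1970 -> bucket 0; May 2018 -> bucket 580
print(month_bucket(datetime(2018, 5, 1, tzinfo=timezone.utc)))
```

All writes within a calendar month then land in the same partition bucket, which is what makes per-month partition sizing predictable for an insert-only workload.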
I've configured a simple cluster using two PCs with identical specs:
CPU: Core i5, RAM: 8 GB DDR3, Disk: 1 TB 5400 RPM, Network: 1 Gb (I've tested it
with iperf, it really is!)
using the common configs described on many sites including DataStax itself:
cluster_name: 'MyCassandraCluster' num_tokens: 256
I have a single structured row as input, with a rate of 10K per second. Each row
has 20 columns. Some queries should be answered on these inputs. Because most
queries need different WHERE, GROUP BY, or ORDER BY clauses, the final data model
ended up like this:
primary key for the table of query1:
On Tue, 19 Jun 2018 08:16:28 +0430 onmstester onmstester
onmstes...@zoho.com wrote
Can I set gc_grace_seconds to 0 in this case? Reappearing deleted data
has no impact on my business logic; I'm just either creating a new row or
replacing exactly the same row.
16:24:48 +0430 DuyHai Doan
doanduy...@gmail.com wrote
Maybe the disk I/O cannot keep up with the high mutation rate?
Check the number of pending compactions
On Sun, Jun 17, 2018 at 9:24 AM, onmstester onmstester
onmstes...@zoho.com wrote:
Hi,
I was doing 500K inserts
data if the
table hasn't been repaired during the grace interval. You can also just
increase the tombstone thresholds, but the queries will be pretty
expensive/wasteful.
On Tue, Jun 12, 2018 at 2:02 AM, onmstester onmstester
onmstes...@zoho.com wrote:
Hi,
I needed to save
The current data model, described as table_name: ((partition_key), clustering_key), other_column1, other_column2, ...
user_by_name: ((time_bucket, username)), ts, request, email
user_by_mail: ((time_bucket, email)), ts, request, username
The reason that both keys (username, email) are repeated in all tables is
How many rows on average per partition? Around 10K. Let me get this straight:
you are bifurcating your partitions on either email or username, essentially
potentially doubling the data, because you don't have a way to manage a central
system of record for users? We are just analyzing output
I need to do a full text search (like) on one of my clustering keys and one of my
partition keys (they use text as the data type). The input rate is high, so only
Cassandra can handle it. Is there any open-source project which helps with
using Cassandra + Solr or Cassandra + Elasticsearch? Any
Subject : Re: [EXTERNAL] full text search on
some text columns Forwarded message Maybe this plugin
could do the job: https://github.com/Stratio/cassandra-lucene-index On Tue, 31
Jul 2018 at 22:37, onmstester onmstester wrote:
Thanks Jordan. There would be millions of rows per day; is SASI capable of
sustaining such a rate? On Tue, 31 Jul 2018 19:47:55
+0430 Jordan West wrote: On Tue, Jul 31, 2018 at 7:45
AM, onmstester onmstester wrote: I need to do a full text
search (like) on one
From: onmstester onmstester Sent: Tuesday,
July 31, 2018 10:46 AM To: user Subject: [EXTERNAL]
full text search on some text columns I need to do a full text search (like)
on one of my clustering keys and one of partition keys (it use text as data
type). The input rate is high so only
I read in some best-practice documents on data modeling: do not update old
partitions while using STCS. But I always use clustering keys in my queries, and
cqlsh tracing reports that it only accesses SSTables with data having the specified
clustering key (not all SSTables containing part of
I am inserting into Cassandra with a simple insert query and a counter update
query for every input record. The input rate is very high. I've configured the update
query with idempotent = true (no config for the insert query; the default is false,
IMHO). I've seen multiple records having rows in the counter table
I've noticed this new feature of 4.0: Streaming optimizations
(https://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html)
Does this mean that we could have much higher data density with Cassandra 4.0
(fewer problems than 3.x)? I mean > 10 TB of data on each node without
Thanks Kurt. Actually my cluster has > 10 nodes, so there is only a tiny chance of
streaming a complete SSTable. Logically, any columnar NoSQL DB like
Cassandra always needs to re-sort grouped data for later fast reads, and having
nodes with a big amount of data (> 2 TB) would be annoying for this
I actually never set Xmx > 32 GB for any Java application, unless it
necessarily needs more, just because of the fact: "once you exceed this 32 GiB
border the JVM will stop using compressed object pointers, effectively reducing the
available memory. That means increasing your JVM heap above 32 GiB
Hi, on a cluster with 10 nodes, out of 20K/second native transport requests,
200/second are blocked. They are mostly small single writes. I'm also
experiencing random read delays, for which I suspect the filled native queue. On all
nodes, CPU usage is less than 20 percent, and there is no problem in
I tested the single node scenario on all nodes iteratively and it worked:
https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/operations/opsChangeIp.html
I need to assign a new IP range to my cluster. What's the procedure? Thanks in
advance.
I'm using RF=2 and write consistency = ONE. Is there a counter in Cassandra JMX
to report the number of writes that were only acknowledged by one node (instead of
both replicas)? Although I don't require all replicas to acknowledge the write, I
consider that the normal status of the cluster.
: Sun, 22 Jul 2018 10:43:38 +0430 Subject :
Re: Cassandra crashed with no log Forwarded message
Anything in non-Cassandra logs? Dmesg? -- Jeff Jirsa On Jul 21, 2018, at 11:07
PM, onmstester onmstester wrote: Cassandra in one of my
nodes, crashed without any error/warning
Cassandra on one of my nodes crashed without any error or warning in the
system/gc/debug logs. All JMX metrics are being monitored; the last fetched values
were 50% heap usage and 20% CPU usage. How can I find the cause of the
crash?
Currently I have a cluster with 10 nodes dedicated to one keyspace (hardware
sizing was done according to input rate and TTL, just for current application
requirements). I need to launch a new application with a new keyspace on another
set of servers (8 nodes); there is no relation between the
Hi, cluster spec: 30 nodes, RF = 2, NetworkTopologyStrategy,
GossipingPropertyFileSnitch + rack aware. Suddenly I lost all
cassandra-data disks on one of my racks; after replacing the disks, I tried to replace
the nodes with the same IP using this:
Thanks Jeff. You mean that with RF=2, num_tokens = 256, and fewer than 256
nodes, I should not worry about data distribution? On
Sat, 08 Sep 2018 21:30:28 +0430 Jeff Jirsa wrote:
Virtual nodes accomplish two primary goals 1) it makes it easier to gradually
IMHO, a Cassandra write is more of a CPU-bound task, so when determining cluster
write throughput, what CPU usage percentage (averaged among all cluster nodes) should
be treated as the limit? Rephrased: what's the normal CPU usage in a Cassandra
cluster (while no compaction, streaming, or heavy read
Any idea?
Cheers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com France / Spain
The Last Pickle - Apache Cassandra Consulting http://www.thelastpickle.com
On Mon, 10 Sep 2018 at 09:09, onmstester onmstester wrote: Any idea?
Why not set the default vnode count to that recommendation in the Cassandra
installation files? On Tue, 04 Sep 2018 17:35:54
+0430 Durity, Sean R wrote: Longer term, I
agree with Oleksandr; the recommendation for the number of vnodes is now much
smaller than 256. I
not enough), but all the advice I've seen is that a lower write thread count is
optimal for most cases. On Thu, Sep 6, 2018 at 5:51 AM, onmstester onmstester
wrote: IMHO, Cassandra write is more of a CPU bound task,
so while determining cluster write throughput, what CPU usage percent (avg
among a
Hi, because of "Cannot use selection function ttl on PRIMARY KEY part type",
I'm adding a boolean column to a table that has no non-primary-key columns; I'm just
worried that someday I would need to debug TTLs! Is this the right approach?
Is anyone else doing this?
Cassandra crashed on two out of 10 nodes in my cluster within 1 day; the error
is: ERROR [CompactionExecutor:3389] 2018-07-10 11:27:58,857
CassandraDaemon.java:228 - Exception in thread
Thread[CompactionExecutor:3389,1,main] org.apache.cassandra.io.FSReadError:
java.io.IOException: Map failed
Would it be possible to copy the Cassandra data directory from one of the nodes
(whose OS partition is corrupted) and use it in a fresh Cassandra node? I've
used RF=1, so that's my only chance!
#ops_backup_snapshot_restore_t
Cheers
Ben
On Thu, 8 Mar 2018 at 17:07 onmstester onmstester onmstes...@zoho.com
wrote:
--
Ben Slater
Chief Product Officer
Read our latest technical blog posts here.
This email has been sent on behalf of Instaclustr Pty. Limited (Australia) and
Instaclustr Inc (USA
I'm going to benchmark Cassandra's write throughput on a node with the following
spec:
CPU: 20 cores
Memory: 128 GB (32 GB as Cassandra heap)
Disk: 3 separate disks for OS, data, and commitlog
Network: 10 Gb (tested with iperf)
OS: Ubuntu 16
Running Cassandra-stress:
cassandra-stress write
?
On Sun, Mar 11, 2018 at 10:44 PM, onmstester onmstester
onmstes...@zoho.com wrote:
On Mon, 12 Mar 2018 09:34:26 +0330 onmstester onmstester
onmstes...@zoho.com wrote
Apache-cassandra-3.11.1
Yes, I'm doing a single-host test
On Mon, 12 Mar 2018 09:24:04 +0330 Jeff Jirsa jji...@gmail.com
wrote
Would help
writes: 32
concurrent_counter_writes: 32
Jumping directly to 160 would be a bit high with spinning disks, maybe start
with 64 just to see if it gets better.
--
Jacques-Henri Berthemet
From: onmstester onmstester [mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 12:08 PM
To:
?
From: onmstester onmstester [mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 12:50 PM
To: user user@cassandra.apache.org
Subject: RE: yet another benchmark bottleneck
no luck even with 320 threads for write
2018 14:25:12 +0330 Jacques-Henri Berthemet
jacques-henri.berthe...@genesys.com wrote
Any errors/warning in Cassandra logs? What’s your RF?
Using 300MB/s of network bandwidth for only 130 op/s looks very high.
From: onmstester onmstester [mailto:onmstes
From: onmstester onmstester [mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 10:48 AM
To: user user@cassandra.apache.org
Subject: Re: yet another benchmark bottleneck
Running two instances of Apache Cassandra on the same server, each having its own
commit log disk
-path at least that prevent scaling with high CPU core count.
- Micke
On 03/12/2018 03:14 PM, onmstester onmstester wrote:
I mentioned that I already tested increasing client threads + many
stress-client instances on one node + two stress clients on two
separate nodes; in all
could I calculate disk usage
approximately (without inserting actual data)?
On Sat, 10 Mar 2018 11:21:44 +0330 onmstester onmstester
onmstes...@zoho.com wrote
I've found out that blobs give no gain in storage saving!
I had some 16-digit numbers which were saved
...@smartthings.com wrote
If you're willing to do the data type conversion on insert and retrieval, then
you could use blobs as a sort of "adaptive length int" AFAIK
On Tue, Mar 6, 2018 at 6:02 AM, onmstester onmstester
onmstes...@zoho.com wrote:
I'm using int data type for
Running this command:
nodetool cfhistograms keyspace1 table1
throws this exception on the production server:
javax.management.InstanceNotFoundException:
org.apache.cassandra.metrics:type=Table,keyspace=keyspace1,scope=table1,name=EstimatePartitionSizeHistogram
But I have no problem on a test
I've defined a table like this:
create table test (
hours int,
key1 int,
value1 varchar,
primary key (hours, key1)
)
For one hour, every input would be written to a single partition, because I need
to group some 500K records in the partition for a report with an expected
response time in
I'm using Apache Spark on top of Cassandra for such cases
On Mon, 09 Apr 2018 18:00:33 +0430 DuyHai Doan
doanduy...@gmail.com wrote
No, sorting by a column other than the clustering column is not possible
On Mon, Apr 9, 2018 at 11:42 AM, Eunsu Kim
Is there any way to copy part of a table to another table in Cassandra? A
large amount of data must be copied, so I don't want to fetch the data to a client
and stream it back to Cassandra using CQL.
art target node or run nodetool refresh
I was going to estimate hardware requirements for a project which mainly uses
Apache Cassandra.
Because of the rule "Cassandra node size had better be 2 TB", the total disk
usage determines the number of nodes,
and in most cases the result of this calculation would be OK for satisfying
the required
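A back-of-envelope sketch of turning that 2 TB rule into a node count (the replication factor, fill target, and example numbers are my assumptions, not from the thread):

```python
import math

def nodes_needed(raw_data_tb, rf=2, per_node_tb=2.0, fill_target=0.5):
    """Estimate node count: replicated data divided by usable space per node.

    fill_target leaves headroom for compaction, which can temporarily
    need as much free space as the data being compacted.
    """
    total_tb = raw_data_tb * rf                 # data after replication
    usable_per_node = per_node_tb * fill_target  # space we allow ourselves to fill
    return math.ceil(total_tb / usable_per_node)

print(nodes_needed(raw_data_tb=10, rf=2))  # 10 TB raw at RF=2 -> 20 nodes
```

The same estimate then has to be checked against write throughput and CPU, since disk is only one of the sizing constraints.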
and your
Cassandra cluster doesn't sound terribly stressed, then there is room to
increase threads on the client to raise throughput (unless you're bottlenecked on IO
or something)?
On Sun, 18 Mar 2018 at 20:27 onmstester onmstester onmstes...@zoho.com
wrote:
I need to insert some millions of records in seconds into Cassandra. Using one
client with asyncExecute and the following configs:
maxConnectionsPerHost = 5
maxRequestsPerHost = 32K
maxAsyncQueue at client side = 100K
I could achieve 25% of the throughput I needed; client CPU is more than 80% and
I'm querying a single Cassandra partition using sqlContext and its temp view,
which creates more than 2000 tasks on Spark and takes about 360 seconds:
sqlContext.read().format("org.apache.spark.sql.cassandra").options(ops).load().createOrReplaceTempView("tableName")
But using
8144 9872
What I've got to set up my Apache Cassandra cluster are some servers with 20-core
CPUs * 2 threads, 128 GB RAM, and 8 * 2 TB disks.
Having read all over the web "do not use big nodes for your cluster", I'm
convinced to run multiple nodes on a single physical server.
So the question is which
I'm using the int data type for one of my columns, but for 99.99...% of its data
the value would never exceed 65K. Should I change it to smallint (it would save some
gigabytes of disk in a few months), or would Cassandra compression take care of it in
storage?
What about the blob data type? Isn't it better to use it in
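For intuition, the fixed widths of the CQL integer types can be compared against a minimal variable-length blob encoding off-line (this counts only the unencoded cell value, before compression and per-cell overhead, and the encoding choice is my assumption):

```python
def blob_bytes(n: int) -> int:
    """Bytes needed to store non-negative n as a minimal signed big-endian blob."""
    return max(1, (n.bit_length() + 8) // 8)  # +1 sign bit, rounded up to bytes

INT_BYTES, SMALLINT_BYTES, BIGINT_BYTES = 4, 2, 8  # fixed-width CQL types

n = 9_999_999_999_999_999     # a 16-digit number
print(blob_bytes(n))          # 7 bytes as a blob, vs 8 as a bigint
print(blob_bytes(65_000))     # 3 bytes (17 bits signed), vs 4 as an int
```

Whether that saving survives on disk depends on compression, which is why measuring real SSTable sizes (as the thread ends up doing) is the decisive test.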
Currently, before launching the production cluster, I run 'iperf -s' on half of
the cluster and then run 'iperf -c $nextIP' on the other half using parallel
SSH, so simultaneously all the cluster's nodes connect together (paired); and
then I examine the iperf results, doing the math that
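The pairing described above can be generated with a short script (the host list and command strings are illustrative placeholders):

```python
def iperf_pairs(hosts):
    """Split hosts in half: first half runs `iperf -s`, second half connects."""
    half = len(hosts) // 2
    servers, clients = hosts[:half], hosts[half:half * 2]
    cmds = [(s, "iperf -s") for s in servers]
    cmds += [(c, f"iperf -c {s}") for c, s in zip(clients, servers)]
    return cmds

for host, cmd in iperf_pairs(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]):
    print(f"{host}: {cmd}")
```

Feeding these (host, command) pairs to a parallel-SSH tool exercises every link at once, which is the point of the all-pairs test.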
What takes the most CPU, system or user? Most of it is used by
org.apache.cassandra.util.coalesceInternal and SepWorker.run. Did you try
removing a problematic node and installing a brand-new one (instead of
re-adding)? I did not install a new node, but I did remove the problematic node,
and the CPU
or if the
load your application is producing exceeds what your cluster can handle (needs
more nodes). Chris On Oct 20, 2018, at 5:18 AM, onmstester onmstester
wrote: 3 nodes in my cluster have 100% cpu usage
and most of it is used by org.apache.cassandra.util.coalesceInternal and
SepWorker.run
Any cron or other scheduler running on those nodes? No.
Lots of Java processes running simultaneously? No, just Apache Cassandra.
Heavy repair continuously running? None.
Lots of pending compactions? None; the CPU goes to 100% in the first
seconds of inserts (write load), so no memtable has been flushed yet. Is
Read this: https://docs.datastax.com/en/cql/3.3/cql/cql_reference/batch_r.html
Please use batch (any type of batch) only for statements that concern a single
partition; otherwise it causes a lot of performance degradation on your cluster,
and after a while throughput would be a lot less than
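Grouping statements by partition key before batching, as the advice above suggests, can be sketched like this (the row shape and key function are made up for illustration):

```python
from collections import defaultdict

def group_by_partition(rows, key_fn):
    """Bucket rows by partition key so each batch touches one partition."""
    batches = defaultdict(list)
    for row in rows:
        batches[key_fn(row)].append(row)
    return batches

rows = [{"pk": 1, "v": "a"}, {"pk": 2, "v": "b"}, {"pk": 1, "v": "c"}]
batches = group_by_partition(rows, key_fn=lambda r: r["pk"])
print(len(batches[1]))  # 2 rows share partition 1, so they can be one batch
```

Each bucket then becomes one single-partition batch, while rows for different partitions are sent as individual (or parallel) statements.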
Hi, one of my applications requires a cluster with more than 100
nodes. I've read documents recommending clusters with fewer than 50 or 100
nodes (Netflix has hundreds of clusters with fewer than 100 nodes each). Is it a
good idea to use multiple clusters for a single application,
IMHO, the best option with two datacenters is to configure the replication strategy to
stream data from the DC with the wrong num_tokens to the correct one; then a repair on
each node would move your data to the other DC.
Forwarded message From : Goutham reddy
To
I am facing. Any comments? Thanks and Regards, Goutham
On Fri, Nov 2, 2018 at 1:08 AM onmstester onmstester wrote:
unlogged batch meaningfully outperforms parallel execution of individual
statements, especially at scale, and creates lower memory pressure on both the
clients and the cluster. They do outperform parallel individual statements, but at
the cost of higher pressure on the coordinators, which leads to more blocked
3 nodes in my cluster have 100% CPU usage, and most of it is used by
org.apache.cassandra.util.coalesceInternal and SepWorker.run. The most active
threads are messaging-service-incoming. The other nodes are normal. I have 30
nodes, using a rack-aware strategy, with 10 racks each having 3 nodes.
Thank you all. Actually, "the documents" I mentioned in my question was a talk
on YouTube seen a long time ago which I could not find again. Also, noticing that
a lot of companies like Netflix built hundreds of clusters each having tens of nodes
and say that it's much more stable, I just concluded that big
Since I failed to find a document on how to configure and use the token
allocation algorithm (to replace the random algorithm), I just wanted to be sure
about the procedure I've done: 1. Using Apache Cassandra 3.11.2. 2. Configured
one of the seed nodes with num_tokens=8 and started it. 3. Using cqlsh