RE: [Cassandra] nodetool compactionstats not showing pending task.

2017-05-04 Thread Abhishek Kumar Maheshwari
I just restarted the cluster but am still facing the same issue. Please let me know 
where I can search on JIRA, or should I raise a new ticket for this?

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.

From: kurt greaves [mailto:k...@instaclustr.com]
Sent: Tuesday, May 2, 2017 11:30 AM
To: Abhishek Kumar Maheshwari 
Cc: Alain RODRIGUEZ ; user@cassandra.apache.org
Subject: Re: [Cassandra] nodetool compactionstats not showing pending task.

I believe this is a bug in the estimation of pending tasks; however, I am not aware of any 
JIRA ticket that covers the issue.

On 28 April 2017 at 06:19, Abhishek Kumar Maheshwari wrote:
Hi,

I will try with JMX, but I did try with tpstats. tpstats shows pending 
compactions as 0, while nodetool compactionstats shows 3, which seems 
strange to me.

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Thursday, April 27, 2017 4:45 PM
To: user@cassandra.apache.org
Subject: Re: [Cassandra] nodetool compactionstats not showing pending task.

Maybe try monitoring through JMX with the 
'org.apache.cassandra.db:type=CompactionManager' MBean, attribute 'Compactions' or 
'CompactionSummary'.
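
If it helps, here is a minimal standalone sketch of reading those MBeans over JMX from 
Java (assuming the default JMX endpoint on port 7199 with no authentication; adjust 
host, port and credentials to your setup). It prints the compactions currently running 
from the CompactionManager MBean, plus the pending-tasks estimate that 
'nodetool compactionstats' reports:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionPeek {
    public static void main(String[] args) throws Exception {
        // Assumes Cassandra's default JMX endpoint: localhost:7199, no auth.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();

            // Compactions currently running, from the CompactionManager MBean.
            ObjectName manager =
                    new ObjectName("org.apache.cassandra.db:type=CompactionManager");
            System.out.println("Running: " + mbs.getAttribute(manager, "Compactions"));

            // Estimated pending tasks, the number printed by nodetool compactionstats.
            ObjectName pending = new ObjectName(
                    "org.apache.cassandra.metrics:type=Compaction,name=PendingTasks");
            System.out.println("Pending: " + mbs.getAttribute(pending, "Value"));
        }
    }
}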

C*heers
---
Alain Rodriguez - @arodream - 
al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-04-27 12:27 GMT+02:00 Alain RODRIGUEZ:
Hi,

I am not sure about this one. It has happened to me in the past as well. I never 
really dug into it since, off the top of my head, it was gone after a while or after a 
restart. To get rid of it, a restart might be enough.

But if you feel like troubleshooting this, I think the first thing is to try to 
see if compactions are really happening, maybe using JMX. I believe 
`org.apache.cassandra.metrics:type=Compaction,name=PendingTasks` is what is 
used by 'nodetool compactionstats', but there might be more info there. Actually 
I don't really know what the 'system.compactions_in_progress' table was replaced by, 
but any way to double check that you can think of would probably help in 
understanding better what's happening.

Does someone know a way to check pending compaction details in 3.0.9?

C*heers,
---
Alain Rodriguez - @arodream - 
al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-04-25 15:13 GMT+02:00 Abhishek Kumar Maheshwari:
Hi All,

In Production, I am using Cassandra 3.0.9.

When I run the nodetool compactionstats command, it just shows the count and no 
other information, like below:

[mohit.kundra@AdtechApp bin]$ ./nodetool -h XXX.XX.XX.XX compactionstats
pending tasks: 3
[mohit.kundra@AdtechAppX bin]$

So is this a Cassandra bug, or something else? I am not able to figure it out.


Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.

"Learn journalism at India's largest media house - The Times of India Group. 
Last Date 28 April, 2017. Visit www.tcms.in for details."





Re: DTCS to TWCS

2017-05-04 Thread Jon Haddad
We (The Last Pickle) wrote a blog post on using TWCS pre-3.0: 
http://thelastpickle.com/blog/2017/01/10/twcs-part2.html 


Alex Dejanovski wrote a very comprehensive guide to TWCS that I recommend reading 
before putting it in prod: 
http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html 
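
For reference, once TWCS is available to the node (built in from 3.0 onwards, or via 
the backported jar for 2.1/2.2 covered in the pre-3.0 post above), switching an 
existing table is a single ALTER TABLE. This is only a sketch: the table name is 
hypothetical and the window unit/size should be chosen from your own TTL and ingest 
pattern, as Alex's post explains (on a backport build you would reference the 
backported strategy's fully qualified class name instead of the short name):

ALTER TABLE my_ks.my_table
  WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
  };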


Hope this helps,
Jon

> On May 4, 2017, at 2:05 PM, Cogumelos Maravilha  
> wrote:
> 
> Hi,
> 
> Take a look to https://issues.apache.org/jira/browse/CASSANDRA-13038 
> 
> Regards
> 
> On 04-05-2017 18:22, vasu gunja wrote:
>> Hi All,
>> 
>> We are currently on C* 2.1.13 version and we are using DTCS for our tables.
>> We planning to move to TWCS. 
>> 
>> My questions
>> From which versions TWCS is available ?
>> Do we need to perform upgrade of cassandra before changing DTCS to TWCS ?
>> 
>> 
>> Thanks
> 



Re: DTCS to TWCS

2017-05-04 Thread Cogumelos Maravilha
Hi,

Take a look at https://issues.apache.org/jira/browse/CASSANDRA-13038

Regards


On 04-05-2017 18:22, vasu gunja wrote:
> Hi All,
>
> We are currently on C* 2.1.13 version and we are using DTCS for our
> tables.
> We planning to move to TWCS. 
>
> My questions
> From which versions TWCS is available ?
> Do we need to perform upgrade of cassandra before changing DTCS to TWCS ?
>
>
> Thanks



DTCS to TWCS

2017-05-04 Thread vasu gunja
Hi All,

We are currently on C* version 2.1.13 and we are using DTCS for our tables.
We are planning to move to TWCS.

My questions:
From which version is TWCS available?
Do we need to upgrade Cassandra before changing from DTCS to TWCS?


Thanks


Re: Totally unbalanced cluster

2017-05-04 Thread Jon Haddad
Adding nodes with NTS is easier, in my opinion. You don't need to worry about 
replica placement if you do it right.
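
As a sketch of what the switch could look like for the keyspace in this thread 
(assuming a single datacenter whose name, 'dc1' here, you would take from 
'nodetool status', and keeping RF = 2), followed by a repair since replica placement 
may change:

ALTER KEYSPACE mykeyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 2};

-- then, on each node: nodetool repair -full mykeyspace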

> On May 4, 2017, at 7:43 AM, Cogumelos Maravilha  
> wrote:
> 
> Hi Alain thanks for your kick reply.
> 
> 
> Regarding SimpleStrategy perhaps you are right but it's so easy to add nodes.
> 
> I'm not using vnodes and the default 256. The information that I've posted it 
> a regular nodetool status keyspace.
> 
> My partition key is a sequencial big int but nodetool cfstatus shows that the 
> number of keys are not balanced (data from 3 nodes):
> 
> Number of keys (estimate): 442779640
> 
> Number of keys (estimate): 736380940
> 
> Number of keys (estimate): 451097313
> 
> Should I use nodetool rebuild?
> 
> Running:
> 
> nodetool getendpoints mykeyspace data 9213395123941039285
> 
> 10.1.1.52
> 10.1.1.185
> 
> nodetool getendpoints mykeyspace data 9213395123941039286
> 
> 10.1.1.161
> 10.1.1.19
> All nodes are working hard because my TTL is for 18 days and daily data 
> ingestion is around 120,000,000 records:
> nodetool compactionstats -H
> pending tasks: 3
> - mykeyspace.data: 3
> 
> id   compaction type keyspace  table 
> completed  total  unit  progress
> c49599b1-308d-11e7-ba5b-67e232f1bee1 Remove deleted data mykeyspace data 
> 133.89 GiB 158.33 GiB bytes 84.56%  
> c49599b0-308d-11e7-ba5b-67e232f1bee1 Remove deleted data mykeyspace data 
> 136.2 GiB  278.96 GiB bytes 48.83%
> 
> Active compaction remaining time :   0h00m00s
> 
> 
> nodetool compactionstats -H
> pending tasks: 2
> - mykeyspace.data: 2
> 
> id   compaction type keyspace  table 
> completed total  unit  progress
> b6e8ce80-30d4-11e7-a2be-9b830f114108 Compaction  mykeyspace data 4.05 GiB 
>  133.02 GiB bytes 3.04%   
> Active compaction remaining time :   2h17m34s
> 
> The nodetool repair by default in this C* version is incremental and since 
> the repair is run in all nodes in different hours and I don't want snapshots 
> that's why I'm cleaning twice a day (not sure that with -pr a snapshot is 
> created).
> 
> The cleanup was already remove was there because last node was created a few 
> days ago.
> 
> I'm using garbagecollect to force the cleanup since I'm running out of space.
> 
> 
> Regards.
> 
> 
> 
> On 05/04/2017 12:50 PM, Alain RODRIGUEZ wrote:
>> Hi,
>> 
>> CREATE KEYSPACE mykeyspace WITH replication = {'class':
>> 'SimpleStrategy', 'replication_factor': '2'}  AND durable_writes = false;
>> 
>> The SimpleStrategy is never recommended for production clusters as it does 
>> not recognise racks or datacenter, inducing possible availability issues and 
>> unpredictable latency when using those. I would not even use it for testing 
>> purposes, I see no point in most cases.
>> 
>> Even if this should be changed, carefully but as soon as possible imho, it 
>> is probably not related to your main issue at hand.
>> 
>> If nodes are imbalanced, there are 3 mains questions that come to my mind:
>> 
>> Are the token well distributed among the available nodes?
>> Is the data correctly balanced on the token ring (i.e. are the 'id' values 
>> of 'mykeyspace.data' table well spread between the nodes?
>> Are the compaction processes running smoothly on every nodes
>> 
>> Point 1 depends on whether you are using vnodes or not and what number of 
>> vnodes ('num_token' in cassandra.yaml).
>> If not using vnodes, you have to manually set the positions of the nodes and 
>> move them around when adding more nodes so thing remain balanced
>> If using vnodes, make sure to use a high enough number of vnodes so 
>> distribution is 'good enough' (More than 32 in most cases, default is 256, 
>> which lead to quite balanced rings, but brings other issues)
>> 
>> UN  10.1.1.161  398.39 GiB  256  28.9%
>> UN  10.1.1.19   765.32 GiB  256  29.9%
>> UN  10.1.1.52   574.24 GiB  256  28.2%
>> UN  10.1.1.213  817.56 GiB  256  28.2%
>> UN  10.1.1.85   638.82 GiB  256  28.2%
>> UN  10.1.1.245  408.95 GiB  256  28.7%
>> UN  10.1.1.185  574.63 GiB  256  27.9%
>> 
>> You can have the token ownership information by running 'nodetool status 
>> '. Adding the keyspace name in the command give you the real 
>> ownership. Also, RF = 2 means the total of the ownership should be 200%, 
>> ideally evenly balanced. I am not sure about the command you ran here. Also 
>> as a global advice, let us the command you ran and what you expect us to see 
>> in the output.
>> 
>> Still the tokens seems to be well distributed, and I guess you are using the 
>> default 'num_token': 256. So I believe you are not having this issue. But 
>> the delta between the data hold on each node is up to x2 (400 GB on some 
>> nodes, 800 GB on some others).
>> 
>> Point 2 highly depends on the workload. Are your partitions evenly 
>> distributed among the nodes? It depends on your primary key. Using an UUID 

Re: Totally unbalanced cluster

2017-05-04 Thread Cogumelos Maravilha
Hi Alain, thanks for your quick reply.


Regarding SimpleStrategy, perhaps you are right, but it's so easy to add
nodes.

I'm *not* using vnodes and the default 256. The information that I've
posted is from a regular nodetool status keyspace.

My partition key is a sequential bigint, but nodetool cfstats shows
that the number of keys is not balanced (data from 3 nodes):

Number of keys (estimate): 442779640

Number of keys (estimate): 736380940

Number of keys (estimate): 451097313

*Should I use nodetool rebuild?*

Running:

nodetool getendpoints mykeyspace data 9213395123941039285

10.1.1.52
10.1.1.185

nodetool getendpoints mykeyspace data 9213395123941039286

10.1.1.161
10.1.1.19

All nodes are working hard because my TTL is 18 days and the daily data
ingestion is around 120,000,000 records:

nodetool compactionstats -H
pending tasks: 3
- mykeyspace.data: 3

id   compaction type keyspace 
table completed  total  unit  progress
c49599b1-308d-11e7-ba5b-67e232f1bee1 Remove deleted data mykeyspace data
133.89 GiB 158.33 GiB bytes 84.56% 
c49599b0-308d-11e7-ba5b-67e232f1bee1 Remove deleted data mykeyspace data
136.2 GiB  278.96 GiB bytes 48.83%

Active compaction remaining time :   0h00m00s


nodetool compactionstats -H
pending tasks: 2
- mykeyspace.data: 2

id   compaction type keyspace  table
completed total  unit  progress
b6e8ce80-30d4-11e7-a2be-9b830f114108 Compaction  mykeyspace data
4.05 GiB  133.02 GiB bytes 3.04%  
Active compaction remaining time :   2h17m34s


nodetool repair is incremental by default in this C* version; since the repair
is run on all nodes at different hours and I don't want snapshots, I'm
clearing snapshots twice a day (I'm not sure whether a snapshot is created
with -pr).

The cleanup has already been removed; it was there because the last node was
added a few days ago.

I'm using garbagecollect to force the cleanup since I'm running out of
space.


Regards.


On 05/04/2017 12:50 PM, Alain RODRIGUEZ wrote:
> Hi,
>
> CREATE KEYSPACE mykeyspace WITH replication = {'class':
> 'SimpleStrategy', 'replication_factor': '2'}  AND durable_writes =
> false;
>
>
> The SimpleStrategy is never recommended for production clusters as it
> does not recognise racks or datacenter, inducing possible availability
> issues and unpredictable latency when using those. I would not even
> use it for testing purposes, I see no point in most cases.
>
> Even if this should be changed, carefully but as soon as possible
> imho, it is probably not related to your main issue at hand.
>
> If nodes are imbalanced, there are 3 mains questions that come to my mind:
>
>  1. Are the token well distributed among the available nodes?
>  2. Is the data correctly balanced on the token ring (i.e. are the
> 'id' values of 'mykeyspace.data' table well spread between the nodes?
>  3. Are the compaction processes running smoothly on every nodes
>
>
> *Point 1* depends on whether you are using vnodes or not and what
> number of vnodes ('num_token' in cassandra.yaml).
>
>   * If not using vnodes, you have to manually set the positions of the
> nodes and move them around when adding more nodes so thing remain
> balanced
>   * If using vnodes, make sure to use a high enough number of vnodes
> so distribution is 'good enough' (More than 32 in most cases,
> default is 256, which lead to quite balanced rings, but brings
> other issues)
>
>
> UN  10.1.1.161  398.39 GiB  256  28.9%
> UN  10.1.1.19   765.32 GiB  256  29.9%
> UN  10.1.1.52   574.24 GiB  256  28.2%
> UN  10.1.1.213  817.56 GiB  256  28.2%
> UN  10.1.1.85   638.82 GiB  256  28.2%
> UN  10.1.1.245  408.95 GiB  256  28.7%
> UN  10.1.1.185  574.63 GiB  256  27.9%
>
>
> You can have the token ownership information by running 'nodetool
> status '. Adding the keyspace name in the command give you
> the real ownership. Also, RF = 2 means the total of the ownership
> should be 200%, ideally evenly balanced. I am not sure about the
> command you ran here. Also as a global advice, let us the command you
> ran and what you expect us to see in the output.
>
> Still the tokens seems to be well distributed, and I guess you are
> using the default 'num_token': 256. So I believe you are not having
> this issue. But the delta between the data hold on each node is up to
> x2 (400 GB on some nodes, 800 GB on some others).
>
> *Point 2* highly depends on the workload. Are your partitions evenly
> distributed among the nodes? It depends on your primary key. Using an
> UUID as the partition key is often a good idea, but it depends on your
> needs as well, of course. You could look at the distribution on the
> distinct nodes through: 'nodetool cfstats'.
> *
> *
> *Point 3* : even if the tokens are perfectly distributed and the
> primary key perfectly randomized, some node can have some disk issue
> or any 

Re: Slow performance after upgrading from 2.0.9 to 2.1.11

2017-05-04 Thread techpyaasa .
Hi guys,

Has anybody got a fix for this issue?
Recently we upgraded our C* cluster from 2.0.17 to 2.1.17, and we saw an
increase in read latency on a few tables; read latency increased to almost 2 to
3 times what it was in 2.0.17.

Kindly let me know the fix if anybody knows it.

Thanks
TechPyaasa

On Wed, Nov 9, 2016 at 1:28 AM, Dikang Gu  wrote:

> Michael, thanks for the info. It sounds to me a very serious performance
> regression. :(
>
> On Tue, Nov 8, 2016 at 11:39 AM, Michael Kjellman <
> mkjell...@internalcircle.com> wrote:
>
>> Yes, We hit this as well. We have a internal patch that I wrote to mostly
>> revert the behavior back to ByteBuffers with as small amount of code change
>> as possible. Performance of our build is now even with 2.0.x and we've also
>> forward ported it to 3.x (although the 3.x patch was even more complicated
>> due to Bounds, RangeTombstoneBound, ClusteringPrefix which actually
>> increases the number of allocations to somewhere between 11 and 13
>> depending on how I count it per indexed block -- making it even worse than
>> what you're observing in 2.1.
>>
>> We haven't upstreamed it as 2.1 is obviously not taking any changes at
>> this point and the longer term solution is https://issues.apache.org/jira
>> /browse/CASSANDRA-9754 (which also includes the changes to go back to
>> ByteBuffers and remove as much of the Composites from the storage engine as
>> possible.) Also, the solution is a bit of a hack -- although it was a
>> blocker from us deploying 2.1 -- so i'm not sure how "hacky" it is if it
>> works..
>>
>> best,
>> kjellman
>>
>>
>> On Nov 8, 2016, at 11:31 AM, Dikang Gu > an...@gmail.com>> wrote:
>>
>> This is very expensive:
>>
>> "MessagingService-Incoming-/2401:db00:21:1029:face:0:9:0" prio=10
>> tid=0x7f2fd57e1800 nid=0x1cc510 runnable [0x7f2b971b]
>>java.lang.Thread.State: RUNNABLE
>> at org.apache.cassandra.db.marshal.IntegerType.compare(IntegerT
>> ype.java:29)
>> at org.apache.cassandra.db.composites.AbstractSimpleCellNameTyp
>> e.compare(AbstractSimpleCellNameType.java:98)
>> at org.apache.cassandra.db.composites.AbstractSimpleCellNameTyp
>> e.compare(AbstractSimpleCellNameType.java:31)
>> at java.util.TreeMap.put(TreeMap.java:545)
>> at java.util.TreeSet.add(TreeSet.java:255)
>> at org.apache.cassandra.db.filter.NamesQueryFilter$Serializer.
>> deserialize(NamesQueryFilter.java:254)
>> at org.apache.cassandra.db.filter.NamesQueryFilter$Serializer.
>> deserialize(NamesQueryFilter.java:228)
>> at org.apache.cassandra.db.SliceByNamesReadCommandSerializer.
>> deserialize(SliceByNamesReadCommand.java:104)
>> at org.apache.cassandra.db.ReadCommandSerializer.deserialize(
>> ReadCommand.java:156)
>> at org.apache.cassandra.db.ReadCommandSerializer.deserialize(
>> ReadCommand.java:132)
>> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99)
>> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessag
>> e(IncomingTcpConnection.java:195)
>> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessag
>> es(IncomingTcpConnection.java:172)
>> at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingT
>> cpConnection.java:88)
>>
>>
>> Checked the git history, it comes from this jira:
>> https://issues.apache.org/jira/browse/CASSANDRA-5417
>>
>> Any thoughts?
>> ​
>>
>> On Fri, Oct 28, 2016 at 10:32 AM, Paulo Motta > > wrote:
>> Haven't seen this before, but perhaps it's related to CASSANDRA-10433?
>> This is just a wild guess as it's in a related codepath, but maybe worth
>> trying out the patch available to see if it helps anything...
>>
>> 2016-10-28 15:03 GMT-02:00 Dikang Gu > an...@gmail.com>>:
>> We are seeing huge cpu regression when upgrading one of our 2.0.16
>> cluster to 2.1.14 as well. The 2.1.14 node is not able to handle the same
>> amount of read traffic as the 2.0.16 node, actually, it's less than 50%.
>>
>> And in the perf results, the first line could go as high as 50%, as we
>> turn up the read traffic, which never appeared in 2.0.16.
>>
>> Any thoughts?
>> Thanks
>>
>>
>> Samples: 952K of event 'cycles', Event count (approx.): 229681774560
>> Overhead  Shared Object  Symbol
>>6.52%  perf-196410.map[.]
>> Lorg/apache/cassandra/db/marshal/IntegerType;.compare in
>> Lorg/apache/cassandra/db/composites/AbstractSimpleCellNameType;.compare
>>4.84%  libzip.so  [.] adler32
>>2.88%  perf-196410.map[.]
>> Ljava/nio/HeapByteBuffer;.get in Lorg/apache/cassandra/db/marsh
>> al/IntegerType;.compare
>>2.39%  perf-196410.map[.]
>> Ljava/nio/Buffer;.checkIndex in Lorg/apache/cassandra/db/marsh
>> 

Re: cql3 - adding two numbers in insert statement

2017-05-04 Thread Alain RODRIGUEZ
Thanks for the feedback Shreyas,

it's good to keep a trace of solutions and workarounds in the archives of
the mailing list :-).

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-04-28 13:47 GMT+01:00 Shreyas Chandra Sekhar :

> Hi Alain,
>
> Thanks for the reply.
>
> I figured out the issue. We are using Cassandra 2.x. Cassandra-stress tool
> of 2.x could not handle upper case in its query.
>
> However this issue has been fixed in Cassandra 3.x stress tool. So used
> that binary of 3.x and had to use command line options to convert from v4
> to v3 to target Cassandra 2.x.
>
>
>
> Shreyas
>
>
>
> *From:* Alain RODRIGUEZ [mailto:arodr...@gmail.com]
> *Sent:* Friday, April 28, 2017 3:12 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: cql3 - adding two numbers in insert statement
>
>
>
> Hi Shreyas,
>
>
>
> It's good to paste the error if you want us to give you a quick and good
> answer. In this case the table schema would have been helpful as well.
>
>
>
> That being said, you are probably missing some quotes around the values
> for 'key' and 'column1'.
>
>
>
> C*heers,
>
> ---
>
> Alain Rodriguez - @arodream - al...@thelastpickle.com
>
> France
>
>
>
> The Last Pickle - Apache Cassandra Consulting
>
> http://www.thelastpickle.com
>
>
>
> 2017-04-03 22:50 GMT+02:00 Shreyas Chandra Sekhar :
>
> Hi,
>
> I am trying to generate a random value of certain length and use that as
> one of the value in CQL3. Below is an example
>
>
>
> INSERT INTO "KS"."CF" (key, column1, value) VALUES (
> 613462303431313435313838306530667c6263317431756331,
> 2633174317563312f6f36, blobAsUuid(timeuuidAsBlob(now())) + 1000);
>
>
>
> This errors, can anyone help with right syntax?
>
>
>
> Thanks,
>
> Shreyas
>
>
>


Re: Totally unbalanced cluster

2017-05-04 Thread Alain RODRIGUEZ
Hi,

CREATE KEYSPACE mykeyspace WITH replication = {'class':
> 'SimpleStrategy', 'replication_factor': '2'}  AND durable_writes = false;


The SimpleStrategy is never recommended for production clusters, as it does
not recognise racks or datacenters, inducing possible availability issues
and unpredictable latency when using those. I would not even use it for
testing purposes; I see no point in most cases.

Even though this should be changed, carefully but as soon as possible imho, it
is probably not related to your main issue at hand.

If nodes are imbalanced, there are 3 main questions that come to my mind:


   1. Are the tokens well distributed among the available nodes?
   2. Is the data correctly balanced on the token ring (i.e. are the 'id'
   values of the 'mykeyspace.data' table well spread between the nodes)?
   3. Are the compaction processes running smoothly on every node?


*Point 1* depends on whether you are using vnodes or not and what number of
vnodes ('num_tokens' in cassandra.yaml).

   - If not using vnodes, you have to manually set the positions of the
   nodes and move them around when adding more nodes so things remain balanced
   - If using vnodes, make sure to use a high enough number of vnodes so
   distribution is 'good enough' (more than 32 in most cases; the default is 256,
   which leads to quite balanced rings but brings other issues)


UN  10.1.1.161  398.39 GiB  256  28.9%
> UN  10.1.1.19   765.32 GiB  256  29.9%
> UN  10.1.1.52   574.24 GiB  256  28.2%
> UN  10.1.1.213  817.56 GiB  256  28.2%
> UN  10.1.1.85   638.82 GiB  256  28.2%
> UN  10.1.1.245  408.95 GiB  256  28.7%
> UN  10.1.1.185  574.63 GiB  256  27.9%


You can get the token ownership information by running 'nodetool status
<keyspace>'. Adding the keyspace name to the command gives you the real
ownership. Also, RF = 2 means the total of the ownership should be 200%,
ideally evenly balanced. I am not sure about the command you ran here. Also,
as a general piece of advice, let us know the command you ran and what you
expect us to see in the output.

Still, the tokens seem to be well distributed, and I guess you are using
the default 'num_tokens': 256, so I believe you are not having this issue.
But the delta between the data held on each node is up to x2 (400 GB on
some nodes, 800 GB on some others).

*Point 2* highly depends on the workload. Are your partitions evenly
distributed among the nodes? It depends on your primary key. Using a UUID
as the partition key is often a good idea, but it depends on your needs as
well, of course. You could look at the distribution on the distinct nodes
through 'nodetool cfstats'.

*Point 3*: even if the tokens are perfectly distributed and the primary
key perfectly randomized, some node can have a disk issue or some other
reason for compactions falling behind. This would lead that node to hold
more data and, in some cases, not evict tombstones properly, increasing the
disk space used. Other than that, you can have a big SSTable being compacted
on a node, making the size of the node grow quite suddenly (that's why 20 to
50% of the disk should always be free, depending on the compaction strategy
in use and the number of concurrent compactions). Here, running 'nodetool
compactionstats -H' on all the nodes would probably help you to troubleshoot.
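
As a quick way to compare them (a sketch, assuming remote JMX is reachable from
wherever you run it, and using the node IPs from your status output):

for ip in 10.1.1.161 10.1.1.19 10.1.1.52 10.1.1.213 10.1.1.85 10.1.1.245 10.1.1.185; do
    echo "== $ip =="
    nodetool -h "$ip" compactionstats -H
done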

*About crontab*


> 08 05   * * *   root    nodetool repair -pr
> 11 11   * * *   root    fstrim -a
> 04 12   * * *   root    nodetool clearsnapshot
> 33 13   * * 2   root    nodetool cleanup
> 35 15   * * *   root    nodetool garbagecollect
> 46 19   * * *   root    nodetool clearsnapshot
> 50 23   * * *   root    nodetool flush
>

I don't understand what you are trying to achieve with some of the commands:

nodetool repair -pr


Repairing the cluster regularly is good in most cases, but as defaults
change with versions, I would specify whether the repair is supposed to be
'incremental' or 'full', and whether it is supposed to be 'sequential' or
'parallel', for example. Also, as the dataset grows, some issues will appear
with repairs. Just search for 'cassandra repair' on Google or any search engine
you are using and you will see that repair is a complex topic. Look for
videos and you will find a lot of information about it from nice talks
like these 2 from the last summit:

https://www.youtube.com/watch?v=FrF8wQuXXks
https://www.youtube.com/watch?v=1Sz_K8UID6E

Also some nice tools exist to help with repairs:

The Reaper (originally made at Spotify, now maintained by The Last Pickle):
https://github.com/thelastpickle/cassandra-reaper
'cassandra_range_repair.py': https://github.com/BrianGallew/cassandra_range_repair
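
To make the intent explicit in the crontab entry, the invocation could look like this
(a sketch for 3.10: an explicitly full, partitioner-range repair of the keyspace; with
-pr it still has to run on every node, and you can add -seq if you want it sequential):

nodetool repair -full -pr mykeyspace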

11 11   * * *   rootfstrim -a


I am not really sure about this one, but as long as 'fstrim' does not create
performance issues while it is running, it seems fine.

04 12   * * *   rootnodetool clearsnapshot


This will automatically erase any snapshot you might want to keep. It might
be good to 

Totally unbalanced cluster

2017-05-04 Thread Cogumelos Maravilha
Hi all,

I'm using C* 3.10.

CREATE KEYSPACE mykeyspace WITH replication = {'class':
'SimpleStrategy', 'replication_factor': '2'}  AND durable_writes = false;

CREATE TABLE mykeyspace.data (
id bigint PRIMARY KEY,
kafka text
) WITH bloom_filter_fp_chance = 0.5
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class':
'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
'compaction_window_size': '10', 'compaction_window_unit': 'HOURS',
'max_threshold': '32', 'min_threshold': '6'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 0.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 1555200
AND gc_grace_seconds = 10800
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

UN  10.1.1.161  398.39 GiB  256  28.9%
UN  10.1.1.19   765.32 GiB  256  29.9%
UN  10.1.1.52   574.24 GiB  256  28.2%
UN  10.1.1.213  817.56 GiB  256  28.2%
UN  10.1.1.85   638.82 GiB  256  28.2%
UN  10.1.1.245  408.95 GiB  256  28.7%
UN  10.1.1.185  574.63 GiB  256  27.9%

In the crontab on all nodes (only the times change):

08 05   * * *   root    nodetool repair -pr
11 11   * * *   root    fstrim -a
04 12   * * *   root    nodetool clearsnapshot
33 13   * * 2   root    nodetool cleanup
35 15   * * *   root    nodetool garbagecollect
46 19   * * *   root    nodetool clearsnapshot
50 23   * * *   root    nodetool flush

How can I fix this?

Thanks in advance.






Re: Fail to add a new node to a exist cluster

2017-05-04 Thread Alain RODRIGUEZ
Hello Kevin, here are a few thoughts and things you could try:

each node has stored about 4TB

When I joined a new node, I found that the process has not been completed
> for more than a week


So my first thought is that this is a lot of data per node. I think the sweet
spot is around 1-2 TB / node. That being said, since 8 TB are available,
this should not be an issue as long as you are ready to wait days to have
data moving around (repairs / bootstrap / etc). Also, modern versions of
Apache Cassandra aim at increasing the amount of data per node allowed by
reducing operation time and managing memory (among other things)  better
and better. Also, I have worked with perfectly healthy nodes with 4 TB of
data. So it's nothing completely undoable.

By default in your version you should have 'streaming_socket_timeout_in_ms:
86400000'; if it is set to zero for some reason, set it back to 1 day (86400000
ms). If it is already set to one day and you have some 'VERY LARGE'™ (like
1+ TB) SSTables, you might want to try setting this value to 2 days or so.
You should be able to find traces of streaming failures and retries in the logs
if that's the issue you are facing.
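
That is, in cassandra.yaml on every node (a sketch of the single relevant line, with
1 day expressed in milliseconds):

streaming_socket_timeout_in_ms: 86400000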

CPU load of new node and some other nodes continued to be high


That's normal for new nodes: they are receiving tons of data that needs to
be compacted. Yet as long as they are not showing 'UN' in 'nodetool status',
they are not serving reads and so not creating any latency. No worries about
that. On old nodes, it's interesting to dig into what the CPUs are actually doing.

finally had to give up join


So you stopped the joining node, right? Next time, 'nodetool netstats -H'
could give you interesting progress information on streams. You could also
see if a stream gets blocked using 'watch -d nodetool netstats -H'.

Is there any good way to speed up the process of adding new nodes?


It mostly depends on the load that the existing nodes (CPU / disk) and the
network can handle. But you can tune the streaming speed with
'stream_throughput_outbound_megabits_per_sec: 200' (200 is the default value)
and 'inter_dc_stream_throughput_outbound_megabits_per_sec' if streaming
crosses datacenters, in cassandra.yaml, or even dynamically with 'nodetool
setstreamthroughput X'.

Also, new nodes can probably compact faster as they are not used for
serving reads: 'compaction_throughput_mb_per_sec'. Again, with nodetool
(and so without restarting the node): 'nodetool setcompactionthroughput X'.
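
For example (the numbers below are only illustrative and should be adjusted to what
your disks and network can actually absorb):

# on the nodes streaming data out, raise the streaming cap (default 200 Mb/s)
nodetool setstreamthroughput 400
# on the joining node, let compactions keep up with the incoming SSTables (MB/s)
nodetool setcompactionthroughput 64
# check the values currently in effect
nodetool getstreamthroughput
nodetool getcompactionthroughput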

Next time you add a node, give us some output from the commands above and try
some distinct tuning. Let us know how it goes before it's too late :-).

Also, in recent versions you can 'resume' a bootstrap, I believe ('nodetool
bootstrap resume'). That may be handy when using Apache Cassandra nodes as big
as 4 TB.

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-05-04 6:56 GMT+01:00 kevin :

> I have a Cassandra(v3.7) cluster with 31 nodes, each node’s hard
> configuration is 4cpu, 64GB memory, 8TB hard disk, and each node has stored
> about 4TB of data now. When I joined a new node, I found that the process
> has not been completed for more than a week, while the CPU load of new
> node and some other nodes continued to be high, and finally had to give up
> join. Is it a new node to join the process itself is very slow, or our way
> of use (too much data per node) improper and cause this problem? Is there
> any good way to speed up the process of adding new nodes?
>
> Thanks,
> Kevin
>
>
>
>
>
>
>
>