Re: Related to https://issues.apache.org/jira/browse/CASSANDRA-11143

2020-06-02 Thread Deepak Sharma
Thanks! That's what I realized later too.

On Tue, Jun 2, 2020 at 9:57 AM Jeff Jirsa  wrote:

> It's marked as a duplicate of
> https://issues.apache.org/jira/browse/CASSANDRA-10699 which is not yet
> fixed
>
>
> On Tue, Jun 2, 2020 at 9:39 AM Deepak Sharma
>  wrote:
>
>> Hi There,
>>
>> I see this (https://issues.apache.org/jira/browse/CASSANDRA-11143) issue
>> in the resolved state. Does it mean it has been fixed? This question is
>> specific in the context of 3.0.13 and 3.11.4 versions of Cassandra.
>>
>> Thanks,
>> Deepak
>>
>>


Re: Cassandra Bootstrap Sequence

2020-06-02 Thread Jai Bheemsen Rao Dhanwada
I did some more debugging, and it looks like "nodetool compactionstats" is
hung/taking a long time during this period, which is causing the delay in metrics.
I'm still puzzled why the nodetool compactionstats command takes longer on
all the nodes at the same time when only one node is being restarted.

$ time nodetool compactionstats
> pending tasks: 0
>
> real 1m17.559s
> user 0m2.340s
> sys 0m0.248s
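
(Not a root cause, but a quick way to confirm the stall is cluster-wide: the
sketch below times compactionstats on every node in parallel while one node
restarts. The hostnames are placeholders, and it assumes SSH access and
nodetool on each host's PATH.)

for node in cass-node1 cass-node2 cass-node3; do   # hypothetical hostnames
  ( t0=$(date +%s)
    ssh "$node" nodetool compactionstats > /dev/null 2>&1
    echo "$node: $(( $(date +%s) - t0 ))s" ) &
done
wait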


On Tue, Jun 2, 2020 at 10:25 AM Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Also during this time, I am losing metrics for all the nodes in the
> cluster (metrics agent is timing out collecting within 10s) and recovers
> once the node starts the CQL port. Is there any known issue which could
> cause this? In my case the delay between Gossip settle and CQL port open is
> 3 minutes, metrics were lost for all the nodes during the 3 minute period.
>
> On Tue, Jun 2, 2020 at 7:55 AM Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
>> Thank you,
>>
>> Does that mean there is no way to improve this delay? And i have to live
>> with it since i have more tables?
>>
>> On Tuesday, June 2, 2020, Durity, Sean R 
>> wrote:
>>
>>> As I understand it, Cassandra clusters should be limited to a number of
>>> tables in the low hundreds (under 200), at most. What you are seeing is the
>>> carving up of memtables for each of those 3,000. I try to limit my clusters
>>> to roughly 100 tables.
>>>
>>>
>>>
>>>
>>>
>>> Sean Durity
>>>
>>>
>>>
>>> *From:* Jai Bheemsen Rao Dhanwada 
>>> *Sent:* Tuesday, June 2, 2020 10:48 AM
>>> *To:* user@cassandra.apache.org
>>> *Subject:* [EXTERNAL] Re: Cassandra Bootstrap Sequence
>>>
>>>
>>>
>>> 3000 tables
>>>
>>> On Tuesday, June 2, 2020, Durity, Sean R 
>>> wrote:
>>>
>>> How many total tables in the cluster?
>>>
>>>
>>>
>>>
>>>
>>> Sean Durity
>>>
>>>
>>>
>>> *From:* Jai Bheemsen Rao Dhanwada 
>>> *Sent:* Monday, June 1, 2020 8:36 PM
>>> *To:* user@cassandra.apache.org
>>> *Subject:* [EXTERNAL] Re: Cassandra Bootstrap Sequence
>>>
>>>
>>>
>>> Thanks Erick,
>>>
>>>
>>>
>>> I see below tasks are being run mostly. I didn't quite understand what
>>> exactly these scheduled tasks are for? Is there a way to reduce the boot-up
>>> time or do I have to live with this delay?
>>>
>>>
>>>
>>> $ zgrep "CompactionStrategyManager.java:380 - Recreating compaction
>>> strategy" debug.log*  | wc -l
>>> 3249
>>> $ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache
>>> for" debug.log*  | wc -l
>>> 6293
>>> $ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log*  |
>>> wc -l
>>> 6308
>>> $ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from
>>> DiskBoundaries" debug.log*  | wc -l
>>> 3249
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Jun 1, 2020 at 5:01 PM Erick Ramirez 
>>> wrote:
>>>
>>> There's quite a lot of steps that takes place during the startup
>>> sequence between these 2 lines:
>>>
>>>
>>>
>>>
>>> INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
>>> INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop
>>>
>>>
>>>
>>> For the most part, it's taken up by CompactionStrategyManager and
>>> DiskBoundaryManager. If you check debug.log, you'll see that it's
>>> mostly updating disk boundaries. The length of time it takes is
>>> proportional to the number of tables in the cluster.
>>>
>>>
>>>
>>> Have a look at this section [1] of CassandraDaemon if you're interested
>>> in the details of the startup sequence. Cheers!
>>>
>>>
>>>
>>> [1] https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435
>>> 
>>>
>>>

Re: Cassandra Bootstrap Sequence

2020-06-02 Thread Jai Bheemsen Rao Dhanwada
Also during this time, I am losing metrics for all the nodes in the cluster
(the metrics agent times out collecting within 10s), and collection recovers once
the node opens the CQL port. Is there any known issue which could cause this?
In my case the delay between gossip settling and the CQL port opening is 3 minutes,
and metrics were lost for all the nodes during that 3-minute period.

On Tue, Jun 2, 2020 at 7:55 AM Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Thank you,
>
> Does that mean there is no way to improve this delay? And i have to live
> with it since i have more tables?
>
> On Tuesday, June 2, 2020, Durity, Sean R 
> wrote:
>
>> As I understand it, Cassandra clusters should be limited to a number of
>> tables in the low hundreds (under 200), at most. What you are seeing is the
>> carving up of memtables for each of those 3,000. I try to limit my clusters
>> to roughly 100 tables.
>>
>>
>>
>>
>>
>> Sean Durity
>>
>>
>>
>> *From:* Jai Bheemsen Rao Dhanwada 
>> *Sent:* Tuesday, June 2, 2020 10:48 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* [EXTERNAL] Re: Cassandra Bootstrap Sequence
>>
>>
>>
>> 3000 tables
>>
>> On Tuesday, June 2, 2020, Durity, Sean R 
>> wrote:
>>
>> How many total tables in the cluster?
>>
>>
>>
>>
>>
>> Sean Durity
>>
>>
>>
>> *From:* Jai Bheemsen Rao Dhanwada 
>> *Sent:* Monday, June 1, 2020 8:36 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* [EXTERNAL] Re: Cassandra Bootstrap Sequence
>>
>>
>>
>> Thanks Erick,
>>
>>
>>
>> I see below tasks are being run mostly. I didn't quite understand what
>> exactly these scheduled tasks are for? Is there a way to reduce the boot-up
>> time or do I have to live with this delay?
>>
>>
>>
>> $ zgrep "CompactionStrategyManager.java:380 - Recreating compaction
>> strategy" debug.log*  | wc -l
>> 3249
>> $ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache
>> for" debug.log*  | wc -l
>> 6293
>> $ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log*  | wc
>> -l
>> 6308
>> $ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from
>> DiskBoundaries" debug.log*  | wc -l
>> 3249
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Jun 1, 2020 at 5:01 PM Erick Ramirez 
>> wrote:
>>
>> There's quite a lot of steps that takes place during the startup sequence
>> between these 2 lines:
>>
>>
>>
>>
>> INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
>> INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop
>>
>>
>>
>> For the most part, it's taken up by CompactionStrategyManager and
>> DiskBoundaryManager. If you check debug.log, you'll see that it's mostly
>> updating disk boundaries. The length of time it takes is proportional to
>> the number of tables in the cluster.
>>
>>
>>
>> Have a look at this section [1] of CassandraDaemon if you're interested
>> in the details of the startup sequence. Cheers!
>>
>>
>>
>> [1] https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435
>> 
>>
>>

Re: Related to https://issues.apache.org/jira/browse/CASSANDRA-11143

2020-06-02 Thread Jeff Jirsa
It's marked as a duplicate of
https://issues.apache.org/jira/browse/CASSANDRA-10699 which is not yet fixed


On Tue, Jun 2, 2020 at 9:39 AM Deepak Sharma
 wrote:

> Hi There,
>
> I see this (https://issues.apache.org/jira/browse/CASSANDRA-11143) issue
> in the resolved state. Does it mean it has been fixed? This question is
> specific in the context of 3.0.13 and 3.11.4 versions of Cassandra.
>
> Thanks,
> Deepak
>
>


Related to https://issues.apache.org/jira/browse/CASSANDRA-11143

2020-06-02 Thread Deepak Sharma
Hi There,

I see this (https://issues.apache.org/jira/browse/CASSANDRA-11143) issue in
the resolved state. Does it mean it has been fixed? This question is
specific in the context of 3.0.13 and 3.11.4 versions of Cassandra.

Thanks,
Deepak


Re: CDC Tools

2020-06-02 Thread Johnny Miller
Dor - that looks very useful. Looking forward to trying the CDC Kafka
connector!

On Thu, 28 May 2020 at 02:53, Dor Laor  wrote:

> If it's helpful, IMO, the approach Cassandra needs to take isn't
> tracking the individual node commit log and putting the burden
> on the client. At Scylla, we had the 'opportunity' to be a latecomer
> and see what approach Cassandra took and what DynamoDB Streams
> took.
>
> We've implemented CDC as a regular CQL table [1].
> Not only is it super easy to consume, it's also consistent, and
> you can choose to read the older values.
>
> I recommend Cassandra pick up our design, as a small
> contribution back. We're implementing an OSS Kafka CDC connector
> too.
>
> Dor
>
> [1]
> https://www.scylladb.com/2020/03/31/observing-data-changes-with-change-data-capture-cdc/
>
> On Wed, May 27, 2020 at 5:41 PM Erick Ramirez 
> wrote:
>
>> I have looked at DataStax CDC but I think it works only for DSE !
>>>
>>
>> Yes, thanks for the correction.  I just got confirmation myself -- the
>> Kafka-Cassandra connector works with OSS C* but the CDC connector relies on
>> a DSE feature that's not yet available in OSS C*. Cheers!
>>
>



Re: Cassandra crashes when using offheap_objects for memtable_allocation_type

2020-06-02 Thread Reid Pinchback
I’d also take a look at the O/S level.  You might be queued up on flushing of 
dirty pages, which would also throttle your ability to write mempages.  Once 
the I/O gets throttled badly, I’ve seen it push back into what you see in C*. 
To Aaron’s point, you want a balance in memory between C* and O/S buffer cache, 
because to write to disk you pass through buffer cache first.
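
A few commands worth running on the wedged node to see whether the kernel is
backed up on dirty-page writeback (a sketch; assumes a Linux host, and iostat
comes from the sysstat package):

$ sysctl vm.dirty_ratio vm.dirty_background_ratio
$ grep -E 'Dirty|Writeback' /proc/meminfo
$ iostat -xm 5 3    # watch %util and await on the data and commitlog devices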

From: Aaron Ploetz 
Reply-To: "user@cassandra.apache.org" 
Date: Tuesday, June 2, 2020 at 9:38 AM
To: "user@cassandra.apache.org" 
Subject: Re: Cassandra crashes when using offheap_objects for 
memtable_allocation_type

Message from External Sender
I would try running it with memtable_offheap_space_in_mb at the default for 
sure, but definitely lower than 8GB.  With 32GB of RAM, you're already 
allocating half of that for your heap, and then halving the remainder for off 
heap memtables.  What's left may not be enough for the OS, etc.  Giving some of 
that back will allow more to be used for page cache, which always helps.

"JVM heap size: 16GB, CMS, 1GB newgen"

For CMS GC with a 16GB heap, 1GB is way too small for new gen.  You're going to 
want that to be at least 40% of the max heap size.  Some folks here even 
advocate for setting Xmn as high as 50% of Xmx.

If you want to stick with CMS GC, take a look at 
https://issues.apache.org/jira/browse/CASSANDRA-8150.
  There's plenty of good info in there on CMS GC tuning.  Make sure to read 
through the whole ticket, so that you understand what each setting does.  You 
can't just pick-and-choose.

Regards,

Aaron


On Tue, Jun 2, 2020 at 1:31 AM onmstester onmstester 
 wrote:
I just changed these properties to increase flushed file size (decrease number 
of compactions):

  *   memtable_allocation_type from heap_buffers to offheap_objects
  *   memtable_offheap_space_in_mb: from default (2048) to 8192
Using default value for other memtable/compaction/commitlog configurations .

After a few hours some of nodes stopped to do any mutations (dropped mutaion 
increased) and also pending flushes increased, they were just up and running 
and there was only a single CPU core with 100% usage(other cores was 0%). other 
nodes on the cluster determines the node as DN. Could not access 7199 and also 
could not create thread dump even with jstack -F.

Restarting Cassandra service fixes the problem but after a while some other 
node would be DN.

Am i missing some configurations?  What should i change in cassandra default 
configuration to maximize write throughput in single node/cluster in 
write-heavy scenario for the data model:
Data mode is a single table:
  create table test(
  text partition_key,
  text clustering_key,
  set rows,
  primary key ((partition_key, clustering_key))


vCPU: 12
Memory: 32GB
Node data size: 2TB
Apache cassandra 3.11.2
JVM heap size: 16GB, CMS, 1GB newgen


Sent using Zoho Mail





Re: Cassandra Bootstrap Sequence

2020-06-02 Thread Jai Bheemsen Rao Dhanwada
Thank you,

Does that mean there is no way to improve this delay, and I have to live
with it since I have this many tables?

On Tuesday, June 2, 2020, Durity, Sean R 
wrote:

> As I understand it, Cassandra clusters should be limited to a number of
> tables in the low hundreds (under 200), at most. What you are seeing is the
> carving up of memtables for each of those 3,000. I try to limit my clusters
> to roughly 100 tables.
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Jai Bheemsen Rao Dhanwada 
> *Sent:* Tuesday, June 2, 2020 10:48 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: Cassandra Bootstrap Sequence
>
>
>
> 3000 tables
>
> On Tuesday, June 2, 2020, Durity, Sean R 
> wrote:
>
> How many total tables in the cluster?
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Jai Bheemsen Rao Dhanwada 
> *Sent:* Monday, June 1, 2020 8:36 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: Cassandra Bootstrap Sequence
>
>
>
> Thanks Erick,
>
>
>
> I see below tasks are being run mostly. I didn't quite understand what
> exactly these scheduled tasks are for? Is there a way to reduce the boot-up
> time or do I have to live with this delay?
>
>
>
> $ zgrep "CompactionStrategyManager.java:380 - Recreating compaction
> strategy" debug.log*  | wc -l
> 3249
> $ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache for"
> debug.log*  | wc -l
> 6293
> $ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log*  | wc
> -l
> 6308
> $ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from
> DiskBoundaries" debug.log*  | wc -l
> 3249
>
>
>
>
>
>
>
>
>
>
>
> On Mon, Jun 1, 2020 at 5:01 PM Erick Ramirez 
> wrote:
>
> There's quite a lot of steps that takes place during the startup sequence
> between these 2 lines:
>
>
>
>
> INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
> INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop
>
>
>
> For the most part, it's taken up by CompactionStrategyManager and
> DiskBoundaryManager. If you check debug.log, you'll see that it's mostly
> updating disk boundaries. The length of time it takes is proportional to
> the number of tables in the cluster.
>
>
>
> Have a look at this section [1] of CassandraDaemon if you're interested
> in the details of the startup sequence. Cheers!
>
>
>
> [1] https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435
> 
>
>


RE: Cassandra Bootstrap Sequence

2020-06-02 Thread Durity, Sean R
As I understand it, Cassandra clusters should be limited to a number of tables 
in the low hundreds (under 200), at most. What you are seeing is the carving up 
of memtables for each of those 3,000. I try to limit my clusters to roughly 100 
tables.
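
(For reference, a quick way to get the total count the cluster sees; a sketch,
run from any node with cqlsh. The result includes the few dozen internal
system tables, so subtract those.)

$ cqlsh -e "SELECT COUNT(*) FROM system_schema.tables;"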


Sean Durity

From: Jai Bheemsen Rao Dhanwada 
Sent: Tuesday, June 2, 2020 10:48 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra Bootstrap Sequence

3000 tables

On Tuesday, June 2, 2020, Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:
How many total tables in the cluster?


Sean Durity

From: Jai Bheemsen Rao Dhanwada 
mailto:jaibheem...@gmail.com>>
Sent: Monday, June 1, 2020 8:36 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra Bootstrap Sequence

Thanks Erick,

I see the tasks below being run most of the time. I don't quite understand what
exactly these scheduled tasks are for. Is there a way to reduce the boot-up time,
or do I have to live with this delay?

$ zgrep "CompactionStrategyManager.java:380 - Recreating compaction strategy" 
debug.log*  | wc -l
3249
$ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache for" 
debug.log*  | wc -l
6293
$ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log*  | wc -l
6308
$ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from DiskBoundaries" 
debug.log*  | wc -l
3249





On Mon, Jun 1, 2020 at 5:01 PM Erick Ramirez 
mailto:erick.rami...@datastax.com>> wrote:
There are quite a few steps that take place during the startup sequence
between these two lines:

INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; 
proceeding
INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty 
using native Epoll event loop

For the most part, it's taken up by CompactionStrategyManager and 
DiskBoundaryManager. If you check debug.log, you'll see that it's mostly 
updating disk boundaries. The length of time it takes is proportional to the 
number of tables in the cluster.

Have a look at this section [1] of CassandraDaemon if you're interested in the 
details of the startup sequence. Cheers!

[1] https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435





Re: Cassandra Bootstrap Sequence

2020-06-02 Thread Reid Pinchback
Would updating disk boundaries be sensitive to disk I/O tuning?  I’m 
remembering Jon Haddad’s talk about typical throughput problems in disk page 
sizing.
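
If you want to check the read-ahead value that talk calls out, a sketch (the
device name is an assumption; use whatever backs your data directory):

$ blockdev --getra /dev/sda        # value is in 512-byte sectors
$ blockdev --setra 16 /dev/sda     # 16 sectors = 8 KB, often suggested for SSD-backed nodes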

From: Jai Bheemsen Rao Dhanwada 
Reply-To: "user@cassandra.apache.org" 
Date: Tuesday, June 2, 2020 at 10:48 AM
To: "user@cassandra.apache.org" 
Subject: Re: Cassandra Bootstrap Sequence

Message from External Sender
3000 tables

On Tuesday, June 2, 2020, Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:
How many total tables in the cluster?


Sean Durity

From: Jai Bheemsen Rao Dhanwada 
mailto:jaibheem...@gmail.com>>
Sent: Monday, June 1, 2020 8:36 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra Bootstrap Sequence

Thanks Erick,

I see the tasks below being run most of the time. I don't quite understand what
exactly these scheduled tasks are for. Is there a way to reduce the boot-up time,
or do I have to live with this delay?

$ zgrep "CompactionStrategyManager.java:380 - Recreating compaction strategy" 
debug.log*  | wc -l
3249
$ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache for" 
debug.log*  | wc -l
6293
$ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log*  | wc -l
6308
$ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from DiskBoundaries" 
debug.log*  | wc -l
3249





On Mon, Jun 1, 2020 at 5:01 PM Erick Ramirez 
mailto:erick.rami...@datastax.com>> wrote:
There are quite a few steps that take place during the startup sequence
between these two lines:

INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; 
proceeding
INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty 
using native Epoll event loop

For the most part, it's taken up by CompactionStrategyManager and 
DiskBoundaryManager. If you check debug.log, you'll see that it's mostly 
updating disk boundaries. The length of time it takes is proportional to the 
number of tables in the cluster.

Have a look at this section [1] of CassandraDaemon if you're interested in the 
details of the startup sequence. Cheers!

[1] https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435





RE: Impact of enabling authentication on performance

2020-06-02 Thread Durity, Sean R
To flesh this out a bit, I set roles_validity_in_ms and
permissions_validity_in_ms to 600000 (10 minutes). The default of 2000 means
refreshing far too often for my use cases. Usually I set the RF for system_auth
to 3 per DC. On a larger, busier cluster I have set it to 6 per DC. NOTE: if you
set the validity higher, it may take up to that amount of time before a change in
password or table permissions is picked up (usually less).
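
Concretely, that looks something like the sketch below (package-install paths
and the DC name are assumptions; credentials_validity_in_ms is the related
third cache if you use password authentication):

# /etc/cassandra/cassandra.yaml: 10-minute auth caches as described above
#   roles_validity_in_ms: 600000
#   permissions_validity_in_ms: 600000
#   credentials_validity_in_ms: 600000

$ cqlsh -e "ALTER KEYSPACE system_auth WITH replication =
            {'class': 'NetworkTopologyStrategy', 'DC1': 3};"  # hypothetical DC name
$ nodetool repair --full system_auth    # run on each node after changing the RF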


Sean Durity

-Original Message-
From: Jeff Jirsa 
Sent: Tuesday, June 2, 2020 2:39 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Impact of enabling authentication on performance

Set the Auth cache to a long validity

Don’t go crazy with RF of system auth

Drop bcrypt rounds if you see massive cpu spikes on reconnect storms


> On Jun 1, 2020, at 11:26 PM, Gil Ganz  wrote:
>
> 
> Hi
> I have a production 3.11.6 cluster which I'm might want to enable 
> authentication in, I'm trying to understand what will be the performance 
> impact, if any.
> I understand each use case might be different, trying to understand if there 
> is a common % people usually see their performance hit, or if someone has 
> looked into this.
> Gil

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org






Re: Cassandra Bootstrap Sequence

2020-06-02 Thread Jai Bheemsen Rao Dhanwada
3000 tables

On Tuesday, June 2, 2020, Durity, Sean R 
wrote:

> How many total tables in the cluster?
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Jai Bheemsen Rao Dhanwada 
> *Sent:* Monday, June 1, 2020 8:36 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: Cassandra Bootstrap Sequence
>
>
>
> Thanks Erick,
>
>
>
> I see below tasks are being run mostly. I didn't quite understand what
> exactly these scheduled tasks are for? Is there a way to reduce the boot-up
> time or do I have to live with this delay?
>
>
>
> $ zgrep "CompactionStrategyManager.java:380 - Recreating compaction
> strategy" debug.log*  | wc -l
> 3249
> $ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache for"
> debug.log*  | wc -l
> 6293
> $ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log*  | wc
> -l
> 6308
> $ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from
> DiskBoundaries" debug.log*  | wc -l
> 3249
>
>
>
>
>
>
>
>
>
>
>
> On Mon, Jun 1, 2020 at 5:01 PM Erick Ramirez 
> wrote:
>
> There's quite a lot of steps that takes place during the startup sequence
> between these 2 lines:
>
>
>
>
> INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
> INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop
>
>
>
> For the most part, it's taken up by CompactionStrategyManager and
> DiskBoundaryManager. If you check debug.log, you'll see that it's mostly
> updating disk boundaries. The length of time it takes is proportional to
> the number of tables in the cluster.
>
>
>
> Have a look at this section [1] of CassandraDaemon if you're interested
> in the details of the startup sequence. Cheers!
>
>
>
> [1] https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435
> 
>
>
>


RE: Cassandra Bootstrap Sequence

2020-06-02 Thread Durity, Sean R
How many total tables in the cluster?


Sean Durity

From: Jai Bheemsen Rao Dhanwada 
Sent: Monday, June 1, 2020 8:36 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra Bootstrap Sequence

Thanks Erick,

I see the tasks below being run most of the time. I don't quite understand what
exactly these scheduled tasks are for. Is there a way to reduce the boot-up time,
or do I have to live with this delay?

$ zgrep "CompactionStrategyManager.java:380 - Recreating compaction strategy" 
debug.log*  | wc -l
3249
$ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache for" 
debug.log*  | wc -l
6293
$ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log*  | wc -l
6308
$ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from DiskBoundaries" 
debug.log*  | wc -l
3249





On Mon, Jun 1, 2020 at 5:01 PM Erick Ramirez 
mailto:erick.rami...@datastax.com>> wrote:
There are quite a few steps that take place during the startup sequence
between these two lines:

INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; 
proceeding
INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty 
using native Epoll event loop

For the most part, it's taken up by CompactionStrategyManager and 
DiskBoundaryManager. If you check debug.log, you'll see that it's mostly 
updating disk boundaries. The length of time it takes is proportional to the 
number of tables in the cluster.

Have a look at this section [1] of CassandraDaemon if you're interested in the 
details of the startup sequence. Cheers!

[1] https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435
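
To put a number on that gap for your own node, a small sketch that pulls the
two timestamps above out of system.log (assumes GNU date and a package-install
log path):

$ LOG=/var/log/cassandra/system.log
$ t1=$(grep -m1 'No gossip backlog; proceeding' "$LOG" | awk '{print $3" "$4}')
$ t2=$(grep -m1 'Netty using native Epoll event loop' "$LOG" | awk '{print $3" "$4}')
$ echo "gossip settle -> native transport: $(( $(date -d "${t2%,*}" +%s) - $(date -d "${t1%,*}" +%s) ))s"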





Re: Cassandra crashes when using offheap_objects for memtable_allocation_type

2020-06-02 Thread Aaron Ploetz
primary key ((partition_key, clustering_key))

Also, this primary key definition does not define a partitioning key and a
clustering key.  It defines a *composite* partition key.

If you want it to define both a partition key and a clustering key, get rid
of one set of parentheses.

primary key (partition_key, clustering_key)
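
For completeness, a side-by-side sketch of the two definitions (the keyspace
name and the set's element type are assumptions here, since the original post
elides them):

cqlsh <<'CQL'
CREATE KEYSPACE IF NOT EXISTS test_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- What the original definition creates: a two-column composite partition key,
-- no clustering key.
CREATE TABLE IF NOT EXISTS test_ks.test_composite (
    partition_key  text,
    clustering_key text,
    rows           set<text>,
    PRIMARY KEY ((partition_key, clustering_key))
);

-- What dropping one set of parens gives you: a partition key plus a
-- clustering key.
CREATE TABLE IF NOT EXISTS test_ks.test_clustered (
    partition_key  text,
    clustering_key text,
    rows           set<text>,
    PRIMARY KEY (partition_key, clustering_key)
);
CQL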


On Tue, Jun 2, 2020 at 1:31 AM onmstester onmstester
 wrote:

> I just changed these properties to increase flushed file size (decrease
> number of compactions):
>
>- memtable_allocation_type from heap_buffers to offheap_objects
>- memtable_offheap_space_in_mb: from default (2048) to 8192
>
> Using default value for other memtable/compaction/commitlog configurations
> .
>
> After a few hours some of nodes stopped to do any mutations (dropped
> mutaion increased) and also pending flushes increased, they were just up
> and running and there was only a single CPU core with 100% usage(other
> cores was 0%). other nodes on the cluster determines the node as DN. Could
> not access 7199 and also could not create thread dump even with jstack -F.
>
> Restarting Cassandra service fixes the problem but after a while some
> other node would be DN.
>
> Am i missing some configurations?  What should i change in cassandra
> default configuration to maximize write throughput in single node/cluster
> in write-heavy scenario for the data model:
> Data mode is a single table:
>   create table test(
>   text partition_key,
>   text clustering_key,
>   set rows,
>   primary key ((partition_key, clustering_key))
>
>
> vCPU: 12
> Memory: 32GB
> Node data size: 2TB
> Apache cassandra 3.11.2
> JVM heap size: 16GB, CMS, 1GB newgen
>
> Sent using Zoho Mail 
>
>
>
>


Re: Cassandra crashes when using offheap_objects for memtable_allocation_type

2020-06-02 Thread Aaron Ploetz
I would try running it with memtable_offheap_space_in_mb at the default for
sure, but definitely lower than 8GB.  With 32GB of RAM, you're already
allocating half of that for your heap, and then halving the remainder for
off heap memtables.  What's left may not be enough for the OS, etc.  Giving
some of that back will allow more to be used for page cache, which always
helps.
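
As rough arithmetic (the "other off-heap" line is an assumption, the rest are
the numbers from the post):

#  16 GB   JVM heap (-Xmx)
#   8 GB   off-heap memtables (memtable_offheap_space_in_mb: 8192)
#  ~1 GB   other off-heap (bloom filters, compression metadata, network buffers)
#  -----
# ~25 GB   of 32 GB, leaving only ~7 GB for the OS and page cache.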

"JVM heap size: 16GB, CMS, 1GB newgen"

For CMS GC with a 16GB heap, 1GB is way too small for new gen.  You're
going to want that to be at least 40% of the max heap size.  Some folks
here even advocate for setting Xmn as high as 50% of Xmx.

If you want to stick with CMS GC, take a look at
https://issues.apache.org/jira/browse/CASSANDRA-8150.  There's plenty of
good info in there on CMS GC tuning.  Make sure to read through the whole
ticket, so that you understand what each setting does.  You can't just
pick-and-choose.
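
As a sketch only (the values are illustrative, not tuned advice; these are the
standard knobs in cassandra-env.sh, or the equivalent -Xms/-Xmx/-Xmn lines in
jvm.options):

# conf/cassandra-env.sh
MAX_HEAP_SIZE="16G"
HEAP_NEWSIZE="6400M"    # roughly 40% of the 16 GB heap, per the guidance above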

Regards,

Aaron


On Tue, Jun 2, 2020 at 1:31 AM onmstester onmstester
 wrote:

> I just changed these properties to increase flushed file size (decrease
> number of compactions):
>
>- memtable_allocation_type from heap_buffers to offheap_objects
>- memtable_offheap_space_in_mb: from default (2048) to 8192
>
> Using default value for other memtable/compaction/commitlog configurations
> .
>
> After a few hours some of nodes stopped to do any mutations (dropped
> mutaion increased) and also pending flushes increased, they were just up
> and running and there was only a single CPU core with 100% usage(other
> cores was 0%). other nodes on the cluster determines the node as DN. Could
> not access 7199 and also could not create thread dump even with jstack -F.
>
> Restarting Cassandra service fixes the problem but after a while some
> other node would be DN.
>
> Am i missing some configurations?  What should i change in cassandra
> default configuration to maximize write throughput in single node/cluster
> in write-heavy scenario for the data model:
> Data mode is a single table:
>   create table test(
>   text partition_key,
>   text clustering_key,
>   set rows,
>   primary key ((partition_key, clustering_key))
>
>
> vCPU: 12
> Memory: 32GB
> Node data size: 2TB
> Apache cassandra 3.11.2
> JVM heap size: 16GB, CMS, 1GB newgen
>
> Sent using Zoho Mail 
>
>
>
>


Re: Impact of enabling authentication on performance

2020-06-02 Thread Jeff Jirsa
Set the Auth cache to a long validity

Don’t go crazy with RF of system auth

Drop bcrypt rounds if you see massive cpu spikes on reconnect storms
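
For the last point, the bcrypt work factor is controlled by a system property
(sketch below for cassandra-env.sh; 10 is the default in 3.x, and how low you
can go is a security trade-off):

# conf/cassandra-env.sh
JVM_OPTS="$JVM_OPTS -Dcassandra.auth_bcrypt_gensalt_log2_rounds=8"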


> On Jun 1, 2020, at 11:26 PM, Gil Ganz  wrote:
> 
> 
> Hi
> I have a production 3.11.6 cluster which I'm might want to enable 
> authentication in, I'm trying to understand what will be the performance 
> impact, if any.
> I understand each use case might be different, trying to understand if there 
> is a common % people usually see their performance hit, or if someone has 
> looked into this.
> Gil

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Cassandra crashes when using offheap_objects for memtable_allocation_type

2020-06-02 Thread onmstester onmstester
I just changed these properties to increase flushed file size (decrease number 
of compactions):

memtable_allocation_type from heap_buffers to offheap_objects

memtable_offheap_space_in_mb: from default (2048) to 8192


Using default value for other memtable/compaction/commitlog configurations .


After a few hours, some nodes stopped doing any mutations (dropped mutations
increased) and pending flushes increased; the nodes were still up and running,
but only a single CPU core was at 100% usage (the other cores were at 0%). Other
nodes in the cluster marked the node as DN. I could not access JMX on port 7199 and
could not create a thread dump, even with jstack -F.



Restarting the Cassandra service fixes the problem, but after a while some other
node would go DN.



Am I missing some configuration? What should I change in the default Cassandra
configuration to maximize write throughput on a single node/cluster in a
write-heavy scenario for this data model?

The data model is a single table:

  create table test(
  text partition_key,
  text clustering_key,
  set rows,
  primary key ((partition_key, clustering_key))






vCPU: 12

Memory: 32GB

Node data size: 2TB
Apache cassandra 3.11.2

JVM heap size: 16GB, CMS, 1GB newgen



Sent using https://www.zoho.com/mail/

Impact of enabling authentication on performance

2020-06-02 Thread Gil Ganz
Hi
I have a production 3.11.6 cluster in which I might want to enable
authentication, and I'm trying to understand what the performance
impact will be, if any.
I understand each use case might be different; I'm trying to understand whether
there is a common percentage hit people usually see, or whether someone
has looked into this.
Gil