Re: Disabling Swap for Cassandra

2020-04-16 Thread Jeff Jirsa


> On Apr 16, 2020, at 5:50 PM, Dor Laor  wrote:
> 
> You should configure swap for safety; better to be slow than to crash.
> 

For most production use cases, it’s almost always better to crash than be slow.



-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Disabling Swap for Cassandra

2020-04-16 Thread Dor Laor
On Thu, Apr 16, 2020 at 5:09 PM Kunal  wrote:
>
> Thanks for the responses. Appreciate it.
>
> @Dor, so you are saying if we add "memlock unlimited" in limits.conf, the 
> entire heap (Xms=Xmx) can be locked at startup? Will this be applied to all 
> Java processes? We have a couple of Java programs running with the same owner.

Each process is responsible for calling mlock on its own (in the code itself).
I only see mlock in C* under JNA; my knowledge is mostly in Scylla, so
I'm not sure about this.
limits.conf just makes sure the limits are high enough.

You should configure swap for safety; better to be slow than to crash. The
memory locking is another safety measure and isn't a must. You can also run
your daemons in a separate cgroup and cap their memory usage, as explained
in one of the answers here:
https://stackoverflow.com/questions/12520499/linux-how-to-lock-the-pages-of-a-process-in-memory
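For illustration, a rough sketch of what that looks like with cgroup v2 (the systemd-run invocation and the MemoryMax values are examples, not Cassandra defaults; applying a cap needs root, but the inspection part is unprivileged):

```shell
# One-off: launch a daemon under a transient systemd scope with a hard
# memory cap (needs root; unit and values are examples):
#   systemd-run --scope -p MemoryMax=8G -p MemorySwapMax=0 /path/to/daemon
#
# Unprivileged inspection: which cgroup is this shell in, and does it
# have a memory.max limit? (cgroup v2 layout assumed)
cgroup_path=$(awk -F: '$1 == "0" {print $3}' /proc/self/cgroup)
echo "cgroup: ${cgroup_path}"
limit_file="/sys/fs/cgroup${cgroup_path}/memory.max"
if [ -r "$limit_file" ]; then
  echo "memory.max: $(cat "$limit_file")"
else
  echo "memory.max: (no readable limit file; possibly cgroup v1)"
fi
```

With a cap in place, the daemon gets OOM-killed inside its own cgroup instead of dragging the whole box into swap.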

>
>
> Thanks
> Kunal
>
> On Thu, Apr 16, 2020 at 4:31 PM Dor Laor  wrote:
>>
>> It is good to configure swap for the OS but exempt Cassandra
>> from swapping. Why is it good? Because you never know the
>> memory utilization of additional agents and processes you or
>> other admins will run on your server.
>>
>> So do configure a swap partition.
>> You can control the eagerness of the kernel by the swappiness
>> sysctl parameter. You can even control it per cgroup:
>> https://askubuntu.com/questions/967588/how-can-i-prevent-certain-process-from-being-swapped
>>
>> You should make sure Cassandra locks its memory so the kernel
>> won't choose its memory to be swapped out (since it will kill
>> your latency). You do it by mlock. Read more on:
>> https://stackoverflow.com/questions/578137/can-i-tell-linux-not-to-swap-out-a-particular-processes-memory
>>
>> The scylla /dist/common/limits.d/scylladb.com looks like this:
>> scylla  -  core unlimited
>> scylla  -  memlock  unlimited
>> scylla  -  nofile   20
>> scylla  -  as   unlimited
>> scylla  -  nproc8096
>>
>> On Thu, Apr 16, 2020 at 3:57 PM Nitan Kainth  wrote:
>> >
>> > Swap is controlled by the OS, which will use it when running short of memory. I 
>> > don't think you can disable it at the Cassandra level
>> >
>> >
>> > Regards,
>> >
>> > Nitan
>> >
>> > Cell: 510 449 9629
>> >
>> >
>> > On Apr 16, 2020, at 5:50 PM, Kunal  wrote:
>> >
>> > 
>> >
>> > Hello,
>> >
>> >
>> >
>> > I need some suggestions from you all. I am new to Cassandra and was reading 
>> > Cassandra best practices. One document mentioned that Cassandra 
>> > should not use swap, as it degrades performance.
>> >
>> > My question is: instead of disabling swap system wide, can we force 
>> > Cassandra not to use swap? Some documentation suggests using 
>> > memory_locking_policy in cassandra.yaml.
>> >
>> >
>> > How do I check if our Cassandra already has this parameter and still uses 
>> > swap? Is there any way I can check this? I already checked cassandra.yaml 
>> > and don't see this parameter. Is there any other place I can check and 
>> > confirm?
>> >
>> >
>> > Also, can I set the memlock parameter to unlimited (64kB default), so the entire 
>> > heap (Xms = Xmx) can be locked at node startup? Will that help?
>> >
>> >
>> > Or if you have any other suggestions, please let me know.
>> >
>> >
>> >
>> >
>> >
>> > Regards,
>> >
>> > Kunal
>> >
>> >
>>
>>
>
>
> --
>
>
>
> Regards,
> Kunal Vaid




Re: Multi DC replication between different Cassandra versions

2020-04-16 Thread Ashika Umagiliya
Thank you for the clarifications,

If this is not recommended, our last resort is to upgrade the entire
cluster.

About Kafka Connect, we found the following Source Connectors, which can be
used to ingest data from C* to Kafka:

https://debezium.io/documentation/reference/connectors/cassandra.html
https://docs.lenses.io/2.0/connectors/source/cassandra-cdc.html
https://docs.lenses.io/2.0/connectors/source/cassandra.html
https://www.datastax.com/press-release/datastax-announces-change-data-capture-cdc-connector-apache-kafka




On Thu, Apr 16, 2020 at 9:42 PM Durity, Sean R 
wrote:

> I agree – do not aim for a mixed version as normal. Mixed versions are
> fine during an upgrade process, but the goal is to complete the upgrade as
> soon as possible.
>
>
>
> As for other parts of your plan, the Kafka Connector is a “sink-only,”
> which means that it can only insert into Cassandra. It doesn’t go the other
> way.
>
>
>
> I usually suggest that if the data is needed in two (or more) places, that
> the application write to a queue. Then, let the queue feed all the
> downstream destinations.
>
>
>
>
>
> Sean Durity – Staff Systems Engineer, Cassandra
>
>
>
> *From:* Christopher Bradford 
> *Sent:* Thursday, April 16, 2020 1:13 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: Multi DC replication between different
> Cassandra versions
>
>
>
> It’s worth noting there can be issues with streaming between different
> versions of C*. Note this excerpt from
>
> https://thelastpickle.com/blog/2019/02/26/data-center-switch.html
> 
>
>
>
>
> Note that with an upgrade it’s important to keep in mind that *streaming
> in a cluster running mixed versions of Cassandra is not recommended*
>
>
>
> Emphasis mine. With the approach you’re suggesting streaming would be
> involved both during bootstrap and repair. Would it be possible to upgrade
> to a more recent release prior to pursuing this course of action?
>
>
>
> On Thu, Apr 16, 2020 at 1:02 AM Erick Ramirez 
> wrote:
>
> I don't mean any disrespect but let me offer you some friendly advice --
> don't do it to yourself. I think you would have a very hard time finding
> someone who would recommend implementing a solution that involves mixed
> versions. If you run into issues, it would be hell trying to unscramble
> that egg.
>
>
>
> On top of that, Cassandra 3.0.9 is an ancient version released 4 years ago
> (September 2016). There are several pages of fixes deployed since then. So
> in the nicest possible way, what you're planning to do is not a good idea.
> I personally wouldn't do it. Cheers!
>
> --
>
>
> Christopher Bradford
>
>
>
> --
>
> The information in this Internet Email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this Email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful. When addressed
> to our clients any opinions or advice contained in this Email are subject
> to the terms and conditions expressed in any applicable governing The Home
> Depot terms of business or client engagement letter. The Home Depot
> disclaims all responsibility and liability for the accuracy and content of
> this attachment and for any damages or losses arising from any
> inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other
> items of a destructive nature, which may be contained in this attachment
> and shall not be liable for direct, indirect, consequential or special
> damages in connection with this e-mail message or its attachment.
>


Re: Disabling Swap for Cassandra

2020-04-16 Thread Kunal
Thanks for the responses. Appreciate it.

@Dor, so you are saying if we add "memlock unlimited" in limits.conf, the
entire heap (Xms=Xmx) can be locked at startup? Will this be applied to
all Java processes? We have a couple of Java programs running with the
same owner.


Thanks
Kunal

On Thu, Apr 16, 2020 at 4:31 PM Dor Laor  wrote:

> It is good to configure swap for the OS but exempt Cassandra
> from swapping. Why is it good? Because you never know the
> memory utilization of additional agents and processes you or
> other admins will run on your server.
>
> So do configure a swap partition.
> You can control the eagerness of the kernel by the swappiness
> sysctl parameter. You can even control it per cgroup:
>
> https://askubuntu.com/questions/967588/how-can-i-prevent-certain-process-from-being-swapped
>
> You should make sure Cassandra locks its memory so the kernel
> won't choose its memory to be swapped out (since it will kill
> your latency). You do it by mlock. Read more on:
>
> https://stackoverflow.com/questions/578137/can-i-tell-linux-not-to-swap-out-a-particular-processes-memory
>
> The scylla /dist/common/limits.d/scylladb.com looks like this:
> scylla  -  core unlimited
> scylla  -  memlock  unlimited
> scylla  -  nofile   20
> scylla  -  as   unlimited
> scylla  -  nproc8096
>
> On Thu, Apr 16, 2020 at 3:57 PM Nitan Kainth 
> wrote:
> >
> > Swap is controlled by the OS, which will use it when running short of memory. I
> don't think you can disable it at the Cassandra level
> >
> >
> > Regards,
> >
> > Nitan
> >
> > Cell: 510 449 9629
> >
> >
> > On Apr 16, 2020, at 5:50 PM, Kunal  wrote:
> >
> > 
> >
> > Hello,
> >
> >
> >
> > I need some suggestions from you all. I am new to Cassandra and was
> reading Cassandra best practices. One document mentioned that
> Cassandra should not use swap, as it degrades performance.
> >
> > My question is: instead of disabling swap system wide, can we force
> Cassandra not to use swap? Some documentation suggests using
> memory_locking_policy in cassandra.yaml.
> >
> >
> > How do I check if our Cassandra already has this parameter and still
> uses swap? Is there any way I can check this? I already checked
> cassandra.yaml and don't see this parameter. Is there any other place I can
> check and confirm?
> >
> >
> > Also, can I set the memlock parameter to unlimited (64kB default), so the entire
> heap (Xms = Xmx) can be locked at node startup? Will that help?
> >
> >
> > Or if you have any other suggestions, please let me know.
> >
> >
> >
> >
> >
> > Regards,
> >
> > Kunal
> >
> >
>
>
>

-- 



Regards,
Kunal Vaid


Re: Disabling Swap for Cassandra

2020-04-16 Thread Dor Laor
It is good to configure swap for the OS but exempt Cassandra
from swapping. Why is it good? Because you never know the
memory utilization of additional agents and processes you or
other admins will run on your server.

So do configure a swap partition.
You can control the kernel's eagerness to swap with the vm.swappiness
sysctl parameter. You can even control it per cgroup:
https://askubuntu.com/questions/967588/how-can-i-prevent-certain-process-from-being-swapped
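For example (reading the value is safe anywhere; the write shown in comments needs root, and 1 is just a common conservative choice, not an official Cassandra recommendation):

```shell
# Current value: 60 is a common distro default; lower means the kernel
# is less eager to swap anonymous memory out.
echo "current vm.swappiness: $(cat /proc/sys/vm/swappiness)"

# To lower it at runtime (as root):
#   sysctl vm.swappiness=1
# To persist across reboots, drop a file under /etc/sysctl.d/:
#   echo 'vm.swappiness = 1' > /etc/sysctl.d/99-swappiness.conf
```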

You should make sure Cassandra locks its memory so the kernel
won't choose its memory to be swapped out (since swapping will kill
your latency). You do this with mlock. Read more at:
https://stackoverflow.com/questions/578137/can-i-tell-linux-not-to-swap-out-a-particular-processes-memory

The scylla /dist/common/limits.d/scylladb.com looks like this:
scylla  -  core unlimited
scylla  -  memlock  unlimited
scylla  -  nofile   20
scylla  -  as   unlimited
scylla  -  nproc8096
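A hypothetical equivalent for a "cassandra" user could look like the sketch below (the file name and the nofile/nproc values are illustrative guesses, not project defaults; "memlock unlimited" is the line that lets mlock cover the whole heap). PAM applies these at login, and ulimit shows what the current session actually got:

```shell
# Hypothetical /etc/security/limits.d/cassandra.conf (values illustrative):
#
#   cassandra  -  memlock  unlimited
#   cassandra  -  nofile   100000
#   cassandra  -  as       unlimited
#   cassandra  -  nproc    32768
#
# Verify what the current session actually received:
echo "memlock (kB): $(ulimit -l)"
echo "nofile:       $(ulimit -n)"
```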

On Thu, Apr 16, 2020 at 3:57 PM Nitan Kainth  wrote:
>
> Swap is controlled by the OS, which will use it when running short of memory. I 
> don't think you can disable it at the Cassandra level
>
>
> Regards,
>
> Nitan
>
> Cell: 510 449 9629
>
>
> On Apr 16, 2020, at 5:50 PM, Kunal  wrote:
>
> 
>
> Hello,
>
>
>
> I need some suggestions from you all. I am new to Cassandra and was reading 
> Cassandra best practices. One document mentioned that Cassandra 
> should not use swap, as it degrades performance.
>
> My question is: instead of disabling swap system wide, can we force Cassandra 
> not to use swap? Some documentation suggests using memory_locking_policy in 
> cassandra.yaml.
>
>
> How do I check if our Cassandra already has this parameter and still uses 
> swap? Is there any way I can check this? I already checked cassandra.yaml 
> and don't see this parameter. Is there any other place I can check and confirm?
>
>
> Also, can I set the memlock parameter to unlimited (64kB default), so the entire heap 
> (Xms = Xmx) can be locked at node startup? Will that help?
>
>
> Or if you have any other suggestions, please let me know.
>
>
>
>
>
> Regards,
>
> Kunal
>
>




Re: Disabling Swap for Cassandra

2020-04-16 Thread J. D. Jordan

Cassandra attempts to lock the heap at startup, but memory allocated
after startup is not locked. So you do want to make sure the allowed locked
memory limit is large.

Disabling swap, or vastly dialing down swappiness, is a best practice for all
server software, not just Cassandra, so you should still at the very least set
the swappiness to some small number if you don't want to disable it completely.
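As a sketch of how to check both things on a live process (the pgrep pattern and the log path in the comments are assumptions; adapt them to your install):

```shell
# Inspect locked vs swapped memory for a process via /proc.
# For Cassandra you might find the PID with something like:
#   pid=$(pgrep -f CassandraDaemon)
pid=$$   # use this shell itself so the sketch runs anywhere
grep -E 'VmLck|VmSwap' "/proc/${pid}/status"
# VmLck  > 0 kB  -> some memory is mlocked
# VmSwap > 0 kB  -> the kernel has swapped part of the process out

# If mlock failed at startup, Cassandra logs a warning; something like:
#   grep -i 'Unable to lock JVM memory' /var/log/cassandra/system.log
```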

-Jeremiah

> On Apr 16, 2020, at 5:57 PM, Nitan Kainth  wrote:
> 
> Swap is controlled by the OS, which will use it when running short of memory. I 
> don't think you can disable it at the Cassandra level
> 
> 
> Regards,
> Nitan
> Cell: 510 449 9629
> 
>>> On Apr 16, 2020, at 5:50 PM, Kunal  wrote:
>>> 
>> 
>> Hello,
>>  
>> I need some suggestions from you all. I am new to Cassandra and was reading 
>> Cassandra best practices. One document mentioned that Cassandra 
>> should not use swap, as it degrades performance. 
>> My question is: instead of disabling swap system wide, can we force Cassandra 
>> not to use swap? Some documentation suggests using memory_locking_policy in 
>> cassandra.yaml. 
>> 
>> How do I check if our Cassandra already has this parameter and still uses 
>> swap? Is there any way I can check this? I already checked cassandra.yaml 
>> and don't see this parameter. Is there any other place I can check and 
>> confirm?
>> 
>> Also, can I set the memlock parameter to unlimited (64kB default), so the entire 
>> heap (Xms = Xmx) can be locked at node startup? Will that help?
>> 
>> Or if you have any other suggestions, please let me know. 
>>  
>>  
>> Regards,
>> Kunal
>>  


Re: Disabling Swap for Cassandra

2020-04-16 Thread Nitan Kainth
Swap is controlled by the OS, which will use it when running short of memory. I
don't think you can disable it at the Cassandra level.


Regards,
Nitan
Cell: 510 449 9629

> On Apr 16, 2020, at 5:50 PM, Kunal  wrote:
> 
> 
> Hello,
>  
> I need some suggestions from you all. I am new to Cassandra and was reading 
> Cassandra best practices. One document mentioned that Cassandra 
> should not use swap, as it degrades performance. 
> My question is: instead of disabling swap system wide, can we force Cassandra 
> not to use swap? Some documentation suggests using memory_locking_policy in 
> cassandra.yaml. 
> 
> How do I check if our Cassandra already has this parameter and still uses 
> swap? Is there any way I can check this? I already checked cassandra.yaml 
> and don't see this parameter. Is there any other place I can check and confirm?
> 
> Also, can I set the memlock parameter to unlimited (64kB default), so the entire heap 
> (Xms = Xmx) can be locked at node startup? Will that help?
> 
> Or if you have any other suggestions, please let me know. 
>  
>  
> Regards,
> Kunal
>  


Disabling Swap for Cassandra

2020-04-16 Thread Kunal
Hello,



I need some suggestions from you all. I am new to Cassandra and was reading
Cassandra best practices. One document mentioned that Cassandra
should not use swap, as it degrades performance.

My question is: instead of disabling swap system wide, can we force
Cassandra not to use swap? Some documentation suggests using
memory_locking_policy in cassandra.yaml.


How do I check if our Cassandra already has this parameter and still uses
swap? Is there any way I can check this? I already checked cassandra.yaml
and don't see this parameter. Is there any other place I can check and
confirm?


Also, can I set the memlock parameter to unlimited (64kB default), so the
entire heap (Xms = Xmx) can be locked at node startup? Will that help?


Or if you have any other suggestions, please let me know.





Regards,

Kunal


Re: How quickly off heap memory freed by compacted tables is reclaimed

2020-04-16 Thread Reid Pinchback
If I understand the logic of things like SlabAllocator properly, this is
essentially buffer space that has been allocated for the purpose, and C* pulls
off ByteBuffer hunks of it as needed. The notion of reclaiming by the kernel
wouldn't apply; C* would be managing the use of the space itself.

Whether GC cycles matter at all isn’t obvious at a quick glance.  C* makes use 
of weak and phantom references so it’s possible that there is a code path where 
release of a ByteBuffer would wait upon a GC, but I can’t say for sure.

From: HImanshu Sharma 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, April 15, 2020 at 10:34 PM
To: "user@cassandra.apache.org" 
Subject: How quickly off heap memory freed by compacted tables is reclaimed

Message from External Sender
Hi

As we know, data structures like bloom filters, compression metadata, and index
summaries are kept off heap. But once a table gets compacted, how quickly is
that memory reclaimed by the kernel?
Is it instant, or does it depend on when the reference is GCed?

Regards
Himanshu


Re: Understanding "nodetool netstats" on a multi region cluster

2020-04-16 Thread Jai Bheemsen Rao Dhanwada
Hello Erick,

I can see the bootstrap message "Bootstrap
24359390-4443-11ea-af19-1fbf341b76a0" every time I spin up a 3rd
datacenter. I can confirm that there are no network issues and the cluster is
not overloaded, as this is a new cluster without any data. Regions 1 and 2
don't have any Bootstrap message in "nodetool netstats"; only the 3rd
region has it, and the nodes bootstrapped successfully with no issues.

On Wed, Apr 15, 2020 at 9:22 PM Erick Ramirez 
wrote:

> I can reproduce this behavior every time in the 3rd datacenter, and there
>> are no network connectivity issues. Also the cluster is not overloaded, as
>> this is a brand new cluster.
>
>
> I don't quite understand what you mean. What can you re-produce? It would
> be good if you could elaborate. Cheers!
>


RE: Multi DC replication between different Cassandra versions

2020-04-16 Thread Durity, Sean R
I agree – do not aim for a mixed version as normal. Mixed versions are fine 
during an upgrade process, but the goal is to complete the upgrade as soon as 
possible.

As for other parts of your plan, the Kafka Connector is a “sink-only,” which 
means that it can only insert into Cassandra. It doesn’t go the other way.

I usually suggest that if the data is needed in two (or more) places, that the 
application write to a queue. Then, let the queue feed all the downstream 
destinations.


Sean Durity – Staff Systems Engineer, Cassandra

From: Christopher Bradford 
Sent: Thursday, April 16, 2020 1:13 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Multi DC replication between different Cassandra 
versions

It’s worth noting there can be issues with streaming between different versions 
of C*. Note this excerpt from
https://thelastpickle.com/blog/2019/02/26/data-center-switch.html 

Note that with an upgrade it’s important to keep in mind that streaming in a 
cluster running mixed versions of Cassandra is not recommended

Emphasis mine. With the approach you’re suggesting streaming would be involved 
both during bootstrap and repair. Would it be possible to upgrade to a more 
recent release prior to pursuing this course of action?

On Thu, Apr 16, 2020 at 1:02 AM Erick Ramirez 
mailto:erick.rami...@datastax.com>> wrote:
I don't mean any disrespect but let me offer you some friendly advice -- don't do 
it to yourself. I think you would have a very hard time finding someone who 
would recommend implementing a solution that involves mixed versions. If you 
run into issues, it would be hell trying to unscramble that egg.

On top of that, Cassandra 3.0.9 is an ancient version released 4 years ago 
(September 2016). There are several pages of fixes deployed since then. So in 
the nicest possible way, what you're planning to do is not a good idea. I 
personally wouldn't do it. Cheers!
--

Christopher Bradford






Re: Cassandra node JVM hang during node repair a table with materialized view

2020-04-16 Thread Ben G
Thanks a lot. We are working on removing the views and controlling the
partition size. I hope the improvements help us.

Best regards

Gb

Erick Ramirez  于2020年4月16日周四 下午2:08写道:

> The GC collector is G1. I repaired the node after scaling up, and the JVM issue
>> reproduced. Can I increase the heap to 40 GB on a 64GB VM?
>>
>
> I wouldn't recommend going beyond 31GB on G1. It will be diminishing
> returns as I mentioned before.
>
> Do you think the issue is related to materialized view or big partition?
>>
>
> Yes, materialised views are problematic and I don't recommend them for
> production since they're still experimental. But if I were to guess, I'd
> say your problem is more an issue with large partitions and too many
> tombstones both putting pressure on the heap.
>
> The thing is if you can't bootstrap because you're running into the
> TombstoneOverwhelmException (I'm guessing), I can't see how you wouldn't
> run into it with repairs. In any case, try running repairs on the smaller
> tables first and work on the remaining tables one-by-one. But bootstrapping
> a node with repairs is a much more expensive exercise than just plain old
> bootstrap. I get that you're in a tough spot right now so good luck!
>


-- 

Thanks
Guo Bin


Re: Cassandra node JVM hang during node repair a table with materialized view

2020-04-16 Thread Erick Ramirez
>
> The GC collector is G1. I repaired the node after scaling up, and the JVM issue
> reproduced. Can I increase the heap to 40 GB on a 64GB VM?
>

I wouldn't recommend going beyond 31GB on G1. It will be diminishing
returns as I mentioned before.

Do you think the issue is related to materialized view or big partition?
>

Yes, materialised views are problematic and I don't recommend them for
production since they're still experimental. But if I were to guess, I'd
say your problem is more an issue with large partitions and too many
tombstones both putting pressure on the heap.

The thing is if you can't bootstrap because you're running into the
TombstoneOverwhelmException (I'm guessing), I can't see how you wouldn't
run into it with repairs. In any case, try running repairs on the smaller
tables first and work on the remaining tables one-by-one. But bootstrapping
a node with repairs is a much more expensive exercise than just plain old
bootstrap. I get that you're in a tough spot right now so good luck!