Re: sstableloader & num_tokens change

2020-01-24 Thread Erick Ramirez
On the subject of DSBulk vs sstableloader: sstableloader is the tool of choice
for this scenario.

+1 to Sergio and I'm confirming that DSBulk is designed as a bulk loader
for CSV/JSON formats. Cheers!
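
P.S. For anyone searching the archives later, a rough sketch of the DSBulk
cluster-to-cluster path, which goes through an intermediate export (hypothetical
hosts, keyspace, and table names):

# export from the old cluster to CSV
dsbulk unload -h old-node-1 -k my_keyspace -t my_table -url /tmp/my_table_export

# load the same export into the new cluster
dsbulk load -h new-node-1 -k my_keyspace -t my_table -url /tmp/my_table_export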


Re: sstableloader & num_tokens change

2020-01-24 Thread Erick Ramirez
> If I may just loop this back to the question at hand:
>
> I'm curious if there are any gotchas with using sstableloader to restore
> snapshots taken from 256-token nodes into a cluster with 32-token (or your
> preferred number of tokens) nodes (otherwise same # of nodes and same RF).
>

No, there aren't any gotchas; it will work as designed, so you're good to go. Cheers!
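
P.S. A minimal sketch of the restore itself, assuming the schema already exists
on the target cluster and the snapshot files were copied into a directory ending
in keyspace_name/table_name (hypothetical hosts and paths):

sstableloader -d new-node-1,new-node-2 /tmp/restore/my_keyspace/my_table

sstableloader streams each partition to whichever target nodes own its token
range, which is why the change in num_tokens doesn't matter.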


>


Re: sstableloader & num_tokens change

2020-01-24 Thread Voytek Jarnot
If I may just loop this back to the question at hand:

I'm curious if there are any gotchas with using sstableloader to restore
snapshots taken from 256-token nodes into a cluster with 32-token (or your
preferred number of tokens) nodes (otherwise same # of nodes and same RF).

On Fri, Jan 24, 2020 at 11:15 AM Sergio  wrote:

> https://docs.datastax.com/en/dsbulk/doc/dsbulk/reference/dsbulkLoad.html
>
> Just skimming through the docs
>
> I see examples by loading from CSV / JSON
>
> Maybe there is some other command or doc page that I am missing
>
>
>
>
> On Fri, Jan 24, 2020, 9:10 AM Nitan Kainth  wrote:
>
>> DSBulk works the same as sstableloader.
>>
>>
>> Regards,
>>
>> Nitan
>>
>> Cell: 510 449 9629
>>
>> On Jan 24, 2020, at 10:40 AM, Sergio  wrote:
>>
>> 
>> I was wondering if that improvement for token allocation would work even
>> with just one rack. It should but I am not sure.
>>
>> Does Dsbulk support migration cluster to cluster without CSV or JSON
>> export?
>>
>> Thanks and Regards
>>
>> On Fri, Jan 24, 2020, 8:34 AM Nitan Kainth  wrote:
>>
>>> Instead of sstableloader consider dsbulk by datastax.
>>>
>>> On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback <
>>> rpinchb...@tripadvisor.com> wrote:
>>>
 Jon Haddad has previously made the case for num_tokens=4.  His
 Accelerate 2019 talk is available at:



 https://www.youtube.com/watch?v=swL7bCnolkU



 You might want to check that out.  Also I think the amount of effort
 you put into evening out the token distribution increases as vnode count
 shrinks.  The caveats are explored at:




 https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html





 *From: *Voytek Jarnot 
 *Reply-To: *"user@cassandra.apache.org" 
 *Date: *Friday, January 24, 2020 at 10:39 AM
 *To: *"user@cassandra.apache.org" 
 *Subject: *sstableloader & num_tokens change




 Running 3.11.x, 4 nodes RF=3, default 256 tokens; moving to a different
 4 node RF=3 cluster.



 I've read that 256 is not an optimal default num_tokens value, and that
 32 is likely a better option.



 We have the "opportunity" to switch, as we're migrating environments
 and will likely be using sstableloader to do so. I'm curious if there are
 any gotchas with using sstableloader to restore snapshots taken from
 256-token nodes into a cluster with 32-token nodes (otherwise same # of
 nodes and same RF).



 Thanks in advance.

>>>


Re: Cassandra going OOM due to tombstones (heapdump screenshots provided)

2020-01-24 Thread Reid Pinchback
Just a thought along those lines.  If the memtable flush isn't keeping up, you
might find that manifested in the I/O queue length and dirty page stats in the
lead-up to the OOM event. If you do see that, then you might need to do some
I/O tuning as well.
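
A minimal sketch of how to watch for that, assuming sysstat is installed
(column names vary a bit between versions):

iostat -x 5                                # per-device queue depth (avgqu-sz/aqu-sz) and %util, sampled every 5s
grep -E 'Dirty|Writeback' /proc/meminfo    # dirty/writeback page counters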

From: Jeff Jirsa 
Reply-To: "user@cassandra.apache.org" 
Date: Friday, January 24, 2020 at 12:09 PM
To: cassandra 
Subject: Re: Cassandra going OOM due to tombstones (heapdump screenshots 
provided)

Ah, I focused too much on the literal meaning of startup. If it's happening 
JUST AFTER startup, it's probably getting flooded with hints from the other 
hosts when it comes online.

If that's the case, it may simply be overrunning the memtable, or it may
be a deadlock like https://issues.apache.org/jira/browse/CASSANDRA-15367
(which Benedict just updated this morning, good timing)


If it's after the host comes online and it's hint replay from the other hosts, 
you probably want to throttle hint replay significantly on the rest of the 
cluster. Whatever your hinted handoff throttle is, consider dropping it by 
50-90% to work around whichever of those two problems it is.


On Fri, Jan 24, 2020 at 9:06 AM Jeff Jirsa  wrote:
6 GB of mutations on heap
Startup would replay commitlog, which would re-materialize all of those 
mutations and put them into the memtable. The memtable would flush over time to 
disk, and clear the commitlog.

It looks like PERHAPS the commitlog replay is faster than the memtable flush, 
so you're blowing out the memtable while you're replaying the commitlog.

How much memory does the machine have? How much of that is allocated to the 
heap? What are your memtable settings? Do you see log lines about flushing 
memtables to free room (probably something like the slab pool cleaner)?



On Fri, Jan 24, 2020 at 3:16 AM Behroz Sikander  wrote:
We recently had a lot of OOM in C* and it was generally happening during 
startup.
We took some heap dumps but still cannot pin point the exact reason. So, we 
need some help from experts.

Our clients are not explicitly deleting data but they have TTL enabled.

C* details:
> show version
[cqlsh 5.0.1 | Cassandra 2.2.9 | CQL spec 3.3.1 | Native protocol v4]

Most of the heap was allocated to object[] arrays of org.apache.cassandra.db.Cell objects.

Heap dump images:
Heap usage by class: 
https://pasteboard.co/IRrfu70.png
Classes using most heap: 
https://pasteboard.co/IRrgszZ.png
Overall heap usage: 
https://pasteboard.co/IRrg7t1.png

What could be the reason for such OOM? Something that we can tune to improve 
this?
Any help would be much appreciated.


-
To unsubscribe, e-mail: 
user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: 
user-h...@cassandra.apache.org


Re: sstableloader & num_tokens change

2020-01-24 Thread Sergio
https://docs.datastax.com/en/dsbulk/doc/dsbulk/reference/dsbulkLoad.html

Just skimming through the docs

I see examples by loading from CSV / JSON

Maybe there is some other command or doc page that I am missing




On Fri, Jan 24, 2020, 9:10 AM Nitan Kainth  wrote:

> DSBulk works the same as sstableloader.
>
>
> Regards,
>
> Nitan
>
> Cell: 510 449 9629
>
> On Jan 24, 2020, at 10:40 AM, Sergio  wrote:
>
> 
> I was wondering if that improvement for token allocation would work even
> with just one rack. It should but I am not sure.
>
> Does Dsbulk support migration cluster to cluster without CSV or JSON
> export?
>
> Thanks and Regards
>
> On Fri, Jan 24, 2020, 8:34 AM Nitan Kainth  wrote:
>
>> Instead of sstableloader consider dsbulk by datastax.
>>
>> On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback <
>> rpinchb...@tripadvisor.com> wrote:
>>
>>> Jon Haddad has previously made the case for num_tokens=4.  His
>>> Accelerate 2019 talk is available at:
>>>
>>>
>>>
>>> https://www.youtube.com/watch?v=swL7bCnolkU
>>>
>>>
>>>
>>> You might want to check that out.  Also I think the amount of effort you
>>> put into evening out the token distribution increases as vnode count
>>> shrinks.  The caveats are explored at:
>>>
>>>
>>>
>>>
>>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>>>
>>>
>>>
>>>
>>>
>>> *From: *Voytek Jarnot 
>>> *Reply-To: *"user@cassandra.apache.org" 
>>> *Date: *Friday, January 24, 2020 at 10:39 AM
>>> *To: *"user@cassandra.apache.org" 
>>> *Subject: *sstableloader & num_tokens change
>>>
>>>
>>>
>>>
>>> Running 3.11.x, 4 nodes RF=3, default 256 tokens; moving to a different
>>> 4 node RF=3 cluster.
>>>
>>>
>>>
>>> I've read that 256 is not an optimal default num_tokens value, and that
>>> 32 is likely a better option.
>>>
>>>
>>>
>>> We have the "opportunity" to switch, as we're migrating environments and
>>> will likely be using sstableloader to do so. I'm curious if there are any
>>> gotchas with using sstableloader to restore snapshots taken from 256-token
>>> nodes into a cluster with 32-token nodes (otherwise same # of nodes and
>>> same RF).
>>>
>>>
>>>
>>> Thanks in advance.
>>>
>>


Re: sstableloader & num_tokens change

2020-01-24 Thread Nitan Kainth
DSBulk works the same as sstableloader.


Regards,
Nitan
Cell: 510 449 9629

> On Jan 24, 2020, at 10:40 AM, Sergio  wrote:
> 
> 
> I was wondering if that improvement for token allocation would work even with 
> just one rack. It should but I am not sure.
> 
> Does Dsbulk support migration cluster to cluster without CSV or JSON export?
> 
> Thanks and Regards
> 
>> On Fri, Jan 24, 2020, 8:34 AM Nitan Kainth  wrote:
>> Instead of sstableloader consider dsbulk by datastax. 
>> 
>>> On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback 
>>>  wrote:
>>> Jon Haddad has previously made the case for num_tokens=4.  His Accelerate 
>>> 2019 talk is available at:
>>> 
>>>  
>>> 
>>> https://www.youtube.com/watch?v=swL7bCnolkU
>>> 
>>>  
>>> 
>>> You might want to check that out.  Also I think the amount of effort you 
>>> put into evening out the token distribution increases as vnode count 
>>> shrinks.  The caveats are explored at:
>>> 
>>>  
>>> 
>>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>>> 
>>>  
>>> 
>>>  
>>> 
>>> From: Voytek Jarnot 
>>> Reply-To: "user@cassandra.apache.org" 
>>> Date: Friday, January 24, 2020 at 10:39 AM
>>> To: "user@cassandra.apache.org" 
>>> Subject: sstableloader & num_tokens change
>>> 
>>>  
>>> 
>>> 
>>> Running 3.11.x, 4 nodes RF=3, default 256 tokens; moving to a different 4 
>>> node RF=3 cluster.
>>> 
>>>  
>>> 
>>> I've read that 256 is not an optimal default num_tokens value, and that 32 
>>> is likely a better option.
>>> 
>>>  
>>> 
>>> We have the "opportunity" to switch, as we're migrating environments and 
>>> will likely be using sstableloader to do so. I'm curious if there are any 
>>> gotchas with using sstableloader to restore snapshots taken from 256-token 
>>> nodes into a cluster with 32-token nodes (otherwise same # of nodes and 
>>> same RF).
>>> 
>>>  
>>> 
>>> Thanks in advance.


Re: Cassandra going OOM due to tombstones (heapdump screenshots provided)

2020-01-24 Thread Jeff Jirsa
Ah, I focused too much on the literal meaning of startup. If it's happening
JUST AFTER startup, it's probably getting flooded with hints from the other
hosts when it comes online.

If that's the case, it may simply be overrunning the memtable, or it
may be a deadlock like https://issues.apache.org/jira/browse/CASSANDRA-15367
(which Benedict just updated this morning, good timing)

If it's after the host comes online and it's hint replay from the other
hosts, you probably want to throttle hint replay significantly on the rest
of the cluster. Whatever your hinted handoff throttle is, consider dropping
it by 50-90% to work around whichever of those two problems it is.
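
A minimal sketch of that, assuming the stock hinted_handoff_throttle_in_kb
default of 1024 and picking 128 as an example value; run it on the other nodes
in the cluster (the nodetool form changes it at runtime):

nodetool sethintedhandoffthrottlekb 128

# or, to persist it across restarts, in cassandra.yaml:
hinted_handoff_throttle_in_kb: 128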


On Fri, Jan 24, 2020 at 9:06 AM Jeff Jirsa  wrote:

> 6 GB of mutations on heap
> Startup would replay commitlog, which would re-materialize all of those
> mutations and put them into the memtable. The memtable would flush over
> time to disk, and clear the commitlog.
>
> It looks like PERHAPS the commitlog replay is faster than the memtable
> flush, so you're blowing out the memtable while you're replaying the
> commitlog.
>
> How much memory does the machine have? How much of that is allocated to
> the heap? What are your memtable settings? Do you see log lines about
> flushing memtables to free room (probably something like the slab pool
> cleaner)?
>
>
>
> On Fri, Jan 24, 2020 at 3:16 AM Behroz Sikander 
> wrote:
>
>> We recently had a lot of OOM in C* and it was generally happening during
>> startup.
>> We took some heap dumps but still cannot pin point the exact reason. So,
>> we need some help from experts.
>>
>> Our clients are not explicitly deleting data but they have TTL enabled.
>>
>> C* details:
>> > show version
>> [cqlsh 5.0.1 | Cassandra 2.2.9 | CQL spec 3.3.1 | Native protocol v4]
>>
>> Most of the heap was allocated to object[] arrays of org.apache.cassandra.db.Cell objects.
>>
>> Heap dump images:
>> Heap usage by class: https://pasteboard.co/IRrfu70.png
>> Classes using most heap: https://pasteboard.co/IRrgszZ.png
>> Overall heap usage: https://pasteboard.co/IRrg7t1.png
>>
>> What could be the reason for such OOM? Something that we can tune to
>> improve this?
>> Any help would be much appreciated.
>>
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>


Re: Cassandra going OOM due to tombstones (heapdump screenshots provided)

2020-01-24 Thread Jeff Jirsa
6 GB of mutations on heap
Startup would replay commitlog, which would re-materialize all of those
mutations and put them into the memtable. The memtable would flush over
time to disk, and clear the commitlog.

It looks like PERHAPS the commitlog replay is faster than the memtable
flush, so you're blowing out the memtable while you're replaying the
commitlog.

How much memory does the machine have? How much of that is allocated to the
heap? What are your memtable settings? Do you see log lines about flushing
memtables to free room (probably something like the slab pool cleaner)?
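
For reference, a minimal sketch of where to look for those answers (hypothetical
config and log paths for a package install; adjust to yours):

nodetool info | grep -i heap                                             # heap used vs. configured
grep -E 'memtable_|commitlog_total_space' /etc/cassandra/cassandra.yaml  # current memtable/commitlog settings
grep -iE 'slabpoolcleaner|flushing' /var/log/cassandra/system.log | tail -n 20   # flush-pressure log lines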



On Fri, Jan 24, 2020 at 3:16 AM Behroz Sikander 
wrote:

> We recently had a lot of OOM in C* and it was generally happening during
> startup.
> We took some heap dumps but still cannot pin point the exact reason. So,
> we need some help from experts.
>
> Our clients are not explicitly deleting data but they have TTL enabled.
>
> C* details:
> > show version
> [cqlsh 5.0.1 | Cassandra 2.2.9 | CQL spec 3.3.1 | Native protocol v4]
>
> Most of the heap was allocated to object[] arrays of org.apache.cassandra.db.Cell objects.
>
> Heap dump images:
> Heap usage by class: https://pasteboard.co/IRrfu70.png
> Classes using most heap: https://pasteboard.co/IRrgszZ.png
> Overall heap usage: https://pasteboard.co/IRrg7t1.png
>
> What could be the reason for such OOM? Something that we can tune to
> improve this?
> Any help would be much appreciated.
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: sstableloader & num_tokens change

2020-01-24 Thread Voytek Jarnot
Why? It seems to me that the old Cassandra -> CSV/JSON and CSV/JSON -> new
Cassandra steps are unnecessary in my case.

On Fri, Jan 24, 2020 at 10:34 AM Nitan Kainth  wrote:

> Instead of sstableloader consider dsbulk by datastax.
>
> On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback <
> rpinchb...@tripadvisor.com> wrote:
>
>> Jon Haddad has previously made the case for num_tokens=4.  His Accelerate
>> 2019 talk is available at:
>>
>>
>>
>> https://www.youtube.com/watch?v=swL7bCnolkU
>>
>>
>>
>> You might want to check that out.  Also I think the amount of effort you
>> put into evening out the token distribution increases as vnode count
>> shrinks.  The caveats are explored at:
>>
>>
>>
>>
>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>>
>>
>>
>>
>>
>> *From: *Voytek Jarnot 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Friday, January 24, 2020 at 10:39 AM
>> *To: *"user@cassandra.apache.org" 
>> *Subject: *sstableloader & num_tokens change
>>
>>
>>
>>
>> Running 3.11.x, 4 nodes RF=3, default 256 tokens; moving to a different 4
>> node RF=3 cluster.
>>
>>
>>
>> I've read that 256 is not an optimal default num_tokens value, and that
>> 32 is likely a better option.
>>
>>
>>
>> We have the "opportunity" to switch, as we're migrating environments and
>> will likely be using sstableloader to do so. I'm curious if there are any
>> gotchas with using sstableloader to restore snapshots taken from 256-token
>> nodes into a cluster with 32-token nodes (otherwise same # of nodes and
>> same RF).
>>
>>
>>
>> Thanks in advance.
>>
>


Re: sstableloader & num_tokens change

2020-01-24 Thread Sergio
I was wondering if that token allocation improvement would work even
with just one rack. It should, but I am not sure.

Does DSBulk support cluster-to-cluster migration without a CSV or JSON export?

Thanks and Regards

On Fri, Jan 24, 2020, 8:34 AM Nitan Kainth  wrote:

> Instead of sstableloader consider dsbulk by datastax.
>
> On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback <
> rpinchb...@tripadvisor.com> wrote:
>
>> Jon Haddad has previously made the case for num_tokens=4.  His Accelerate
>> 2019 talk is available at:
>>
>>
>>
>> https://www.youtube.com/watch?v=swL7bCnolkU
>>
>>
>>
>> You might want to check that out.  Also I think the amount of effort you
>> put into evening out the token distribution increases as vnode count
>> shrinks.  The caveats are explored at:
>>
>>
>>
>>
>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>>
>>
>>
>>
>>
>> *From: *Voytek Jarnot 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Friday, January 24, 2020 at 10:39 AM
>> *To: *"user@cassandra.apache.org" 
>> *Subject: *sstableloader & num_tokens change
>>
>>
>>
>>
>> Running 3.11.x, 4 nodes RF=3, default 256 tokens; moving to a different 4
>> node RF=3 cluster.
>>
>>
>>
>> I've read that 256 is not an optimal default num_tokens value, and that
>> 32 is likely a better option.
>>
>>
>>
>> We have the "opportunity" to switch, as we're migrating environments and
>> will likely be using sstableloader to do so. I'm curious if there are any
>> gotchas with using sstableloader to restore snapshots taken from 256-token
>> nodes into a cluster with 32-token nodes (otherwise same # of nodes and
>> same RF).
>>
>>
>>
>> Thanks in advance.
>>
>


Re: Cassandra going OOM due to tombstones (heapdump screenshots provided)

2020-01-24 Thread Michael Shuler
Some environment details like Cassandra version, amount of physical RAM, 
JVM configs (heap and others), and any other non-default cassandra.yaml 
configs would help. The amount of data and the number of keyspaces & tables 
(since you mention "clients", plural) would also be helpful for people to 
suggest tuning improvements.


Michael
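
P.S. A quick sketch of commands that collect most of that (hypothetical paths
for a package install; adjust to your layout):

nodetool version
nodetool info | grep -i heap                              # heap used vs. configured
nodetool status                                           # node count and per-node load
grep -E 'MAX_HEAP_SIZE|HEAP_NEWSIZE' /etc/cassandra/cassandra-env.sh
cqlsh -e 'DESCRIBE KEYSPACES;'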

On 1/24/20 5:16 AM, Behroz Sikander wrote:

We recently had a lot of OOM in C* and it was generally happening during 
startup.
We took some heap dumps but still cannot pin point the exact reason. So, we 
need some help from experts.

Our clients are not explicitly deleting data but they have TTL enabled.

C* details:

show version

[cqlsh 5.0.1 | Cassandra 2.2.9 | CQL spec 3.3.1 | Native protocol v4]

Most of the heap was allocated to object[] arrays of org.apache.cassandra.db.Cell objects.
  
Heap dump images:

Heap usage by class: https://pasteboard.co/IRrfu70.png
Classes using most heap: https://pasteboard.co/IRrgszZ.png
Overall heap usage: https://pasteboard.co/IRrg7t1.png

What could be the reason for such OOM? Something that we can tune to improve 
this?
Any help would be much appreciated.


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: sstableloader & num_tokens change

2020-01-24 Thread Nitan Kainth
Instead of sstableloader consider dsbulk by datastax.

On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback 
wrote:

> Jon Haddad has previously made the case for num_tokens=4.  His Accelerate
> 2019 talk is available at:
>
>
>
> https://www.youtube.com/watch?v=swL7bCnolkU
>
>
>
> You might want to check that out.  Also I think the amount of effort you
> put into evening out the token distribution increases as vnode count
> shrinks.  The caveats are explored at:
>
>
>
>
> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>
>
>
>
>
> *From: *Voytek Jarnot 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Friday, January 24, 2020 at 10:39 AM
> *To: *"user@cassandra.apache.org" 
> *Subject: *sstableloader & num_tokens change
>
>
>
>
> Running 3.11.x, 4 nodes RF=3, default 256 tokens; moving to a different 4
> node RF=3 cluster.
>
>
>
> I've read that 256 is not an optimal default num_tokens value, and that 32
> is likely a better option.
>
>
>
> We have the "opportunity" to switch, as we're migrating environments and
> will likely be using sstableloader to do so. I'm curious if there are any
> gotchas with using sstableloader to restore snapshots taken from 256-token
> nodes into a cluster with 32-token nodes (otherwise same # of nodes and
> same RF).
>
>
>
> Thanks in advance.
>


Re: sstableloader & num_tokens change

2020-01-24 Thread Reid Pinchback
Jon Haddad has previously made the case for num_tokens=4.  His Accelerate 2019 
talk is available at:

https://www.youtube.com/watch?v=swL7bCnolkU

You might want to check that out.  Also I think the amount of effort you put 
into evening out the token distribution increases as vnode count shrinks.  The 
caveats are explored at:

https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
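
For anyone making the same move, the knobs involved live in cassandra.yaml; a
minimal sketch, assuming 3.11 and a hypothetical keyspace name (the second link
above covers the caveats of getting an even distribution with so few tokens):

num_tokens: 4                                       # or 32, per the original question
allocate_tokens_for_keyspace: my_biggest_keyspace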


From: Voytek Jarnot 
Reply-To: "user@cassandra.apache.org" 
Date: Friday, January 24, 2020 at 10:39 AM
To: "user@cassandra.apache.org" 
Subject: sstableloader & num_tokens change

Running 3.11.x, 4 nodes RF=3, default 256 tokens; moving to a different 4 node 
RF=3 cluster.

I've read that 256 is not an optimal default num_tokens value, and that 32 is 
likely a better option.

We have the "opportunity" to switch, as we're migrating environments and will 
likely be using sstableloader to do so. I'm curious if there are any gotchas 
with using sstableloader to restore snapshots taken from 256-token nodes into a 
cluster with 32-token nodes (otherwise same # of nodes and same RF).

Thanks in advance.


sstableloader & num_tokens change

2020-01-24 Thread Voytek Jarnot
Running 3.11.x, 4 nodes RF=3, default 256 tokens; moving to a different 4
node RF=3 cluster.

I've read that 256 is not an optimal default num_tokens value, and that 32
is likely a better option.

We have the "opportunity" to switch, as we're migrating environments and
will likely be using sstableloader to do so. I'm curious if there are any
gotchas with using sstableloader to restore snapshots taken from 256-token
nodes into a cluster with 32-token nodes (otherwise same # of nodes and
same RF).

Thanks in advance.


Re: Uneven token distribution with allocate_tokens_for_keyspace

2020-01-24 Thread Léo FERLIN SUTTON
Hi Anthony!

I have a follow-up question:

Check to make sure that no other node in the cluster is assigned any of the
> four tokens specified above. If there is another node in the cluster that
> is assigned one of the above tokens, increment the conflicting token by
> values of one until no other node in the cluster is assigned that token
> value. The idea is to make sure that these four tokens are unique to the
> node.


I don't understand this part of the process. Why do tokens conflict if the
nodes owning them are in a different datacenter?
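
Also, just to check my understanding of steps 3-5 and 8-9 (quoted below): the
nodes would end up with roughly these cassandra.yaml settings, with the dummy
keyspace name being whatever gets created in step 7? Please correct me if I've
misread.

# seed node (steps 3-5)
num_tokens: 4
initial_token: -9223372036854775808,-4611686018427387904,0,4611686018427387904

# remaining nodes (steps 8-9)
num_tokens: 4
allocate_tokens_for_keyspace: dummy_ks
auto_bootstrap: false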

Regards,

Leo

On Thu, Dec 5, 2019 at 1:00 AM Anthony Grasso 
wrote:

> Hi Enrico,
>
> Glad to hear the problem has been resolved and thank you for the feedback!
>
> Kind regards,
> Anthony
>
> On Mon, 2 Dec 2019 at 22:03, Enrico Cavallin 
> wrote:
>
>> Hi Anthony,
>> thank you for your hints; the new DC is now well balanced, within 2%.
>> I did read your article, but I thought it was needed only for new
>> "clusters", not also for new "DCs"; but RF is per DC so it makes sense.
>>
>> You TLP guys are doing a great job for the Cassandra community.
>>
>> Thank you,
>> Enrico
>>
>>
>> On Fri, 29 Nov 2019 at 05:09, Anthony Grasso 
>> wrote:
>>
>>> Hi Enrico,
>>>
>>> This is a classic chicken and egg problem with the
>>> allocate_tokens_for_keyspace setting.
>>>
>>> The allocate_tokens_for_keyspace setting uses the replication factor of
>>> a DC keyspace to calculate the token allocation when a node is added to the
>>> cluster for the first time.
>>>
>>> Nodes need to be added to the new DC before we can replicate the
>>> keyspace over to it. Herein lies the problem. We are unable to use
>>> allocate_tokens_for_keyspace unless the keyspace is replicated to the
>>> new DC. In addition, as soon as you change the keyspace replication to the
>>> new DC, new data will start to be written to it. To work around this issue
>>> you will need to do the following.
>>>
>>>1. Decommission all the nodes in the *dcNew*, one at a time.
>>>2. Once all the *dcNew* nodes are decommissioned, wipe the contents
>>>in the *commitlog*, *data*, *saved_caches*, and *hints* directories
>>>of these nodes.
>>>3. Make the first node to add into the *dcNew* a seed node. Set the
>>>seed list of the first node with its IP address and the IP addresses of 
>>> the
>>>other seed nodes in the cluster.
>>>4. Set the *initial_token* setting for the first node. You can
>>>calculate the values using the algorithm in my blog post:
>>>
>>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html.
>>>For convenience I have calculated them:
>>>*-9223372036854775808,-4611686018427387904,0,4611686018427387904*.
>>>Note, remove the *allocate_tokens_for_keyspace* setting from the
>>>*cassandra.yaml* file for this (seed) node.
>>>5. Check to make sure that no other node in the cluster is assigned
>>>any of the four tokens specified above. If there is another node in the
>>>cluster that is assigned one of the above tokens, increment the 
>>> conflicting
>>>token by values of one until no other node in the cluster is assigned 
>>> that
>>>token value. The idea is to make sure that these four tokens are unique 
>>> to
>>>the node.
>>>6. Add the seed node to cluster. Make sure it is listed in *dcNew *by
>>>checking nodetool status.
>>>7. Create a dummy keyspace in *dcNew* that has a replication factor
>>>of 2.
>>>8. Set the *allocate_tokens_for_keyspace* value to be the name of
>>>the dummy keyspace for the other two nodes you want to add to *dcNew*.
>>>Note remove the *initial_token* setting for these other nodes.
>>>9. Set *auto_bootstrap* to *false* for the other two nodes you want
>>>to add to *dcNew*.
>>>10. Add the other two nodes to the cluster, one at a time.
>>>11. If you are happy with the distribution, copy the data to *dcNew*
>>>by running a rebuild.
>>>
>>>
>>> Hope this helps.
>>>
>>> Regards,
>>> Anthony
>>>
>>> On Fri, 29 Nov 2019 at 02:08, Enrico Cavallin 
>>> wrote:
>>>
 Hi all,
 I have an old datacenter with 4 nodes and 256 tokens each.
 I am now starting a new datacenter with 3 nodes, with num_tokens=4
 and allocate_tokens_for_keyspace=myBiggestKeyspace on each node.
 Both DCs run Cassandra 3.11.x.

 myBiggestKeyspace has RF=3 in dcOld and RF=2 in dcNew. Now dcNew is
 very unbalanced.
 Also keyspaces with RF=2 in both DCs have the same problem.
 Did I miss something, or do I have strong limitations with a low
 num_tokens even with allocate_tokens_for_keyspace?
 Any suggestions on how to mitigate it?

 # nodetool status myBiggestKeyspace
 Datacenter: dcOld
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address   Load   Tokens   Owns (effective)  Host ID
   Rack
 UN  x.x.x.x  515.83 GiB  256  76.2%

Cassandra going OOM due to tombstones (heapdump screenshots provided)

2020-01-24 Thread Behroz Sikander
We recently had a lot of OOM events in C*, and they were generally happening
during startup.
We took some heap dumps but still cannot pinpoint the exact reason, so we
need some help from experts.

Our clients are not explicitly deleting data but they have TTL enabled.

C* details:
> show version
[cqlsh 5.0.1 | Cassandra 2.2.9 | CQL spec 3.3.1 | Native protocol v4]

Most of the heap was allocated to object[] arrays of org.apache.cassandra.db.Cell objects.
 
Heap dump images:
Heap usage by class: https://pasteboard.co/IRrfu70.png
Classes using most heap: https://pasteboard.co/IRrgszZ.png
Overall heap usage: https://pasteboard.co/IRrg7t1.png

What could be the reason for such OOM? Something that we can tune to improve 
this?
Any help would be much appreciated.


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org