1.2.19: AssertionError when running compactions on a CF with TTLed columns

2018-08-30 Thread Reynald Borer
Hi everyone, I'm running a Cassandra 1.2.19 cluster of 40 nodes and compactions of a specific column family are sporadically raising an AssertionError like this (full stack trace visible under https://gist.github.com/rborer/46862d6d693c0163aa8fe0e74caa2d9a): ERROR [CompactionExecutor:9137]

unsubscribe

2018-08-30 Thread Bharatha, Naveen

Re: Large sstables

2018-08-30 Thread Mohamadreza Rostami
Hi, dear Vitali. The best option for you is to migrate the data to a new table and change the partition key pattern for a better distribution of data, so that your sstables become smaller. But if your data already has a good distribution and is really big, you must add a new server to your datacenter, if

Re: Bootstrap streaming issues

2018-08-30 Thread Jai Bheemsen Rao Dhanwada
thank you On Thu, Aug 30, 2018 at 11:58 AM Jeff Jirsa wrote: > This is the closest JIRA that comes to mind (from memory, I didn't search, > there may be others): https://issues.apache.org/jira/browse/CASSANDRA-8150 > > The best blog that's all in one place on tuning GC in cassandra is >

Re: [EXTERNAL] Re: Nodetool refresh v/s sstableloader

2018-08-30 Thread Rajath Subramanyam
Thank you, everyone, for responding. Rajath Subramanyam On Thu, Aug 30, 2018 at 8:38 AM Carl Mueller wrote: > - Range aware compaction strategy that subdivides data by the token range > could help for this: you only back up data for the primary node and not the >

Re: Bootstrap streaming issues

2018-08-30 Thread Jeff Jirsa
This is the closest JIRA that comes to mind (from memory, I didn't search, there may be others): https://issues.apache.org/jira/browse/CASSANDRA-8150 The best blog that's all in one place on tuning GC in cassandra is actually Amy's 2.1 tuning guide:

Re: Bootstrap streaming issues

2018-08-30 Thread Jai Bheemsen Rao Dhanwada
Hi Jeff, Is there any JIRA that says whether increasing the heap will help? Also, are there any alternatives to increasing the heap size? Last time I tried increasing the heap, the longer GC pauses caused more damage in terms of latency. On Wed, Aug 29, 2018 at 11:07 PM Jai

Re: commitlog content

2018-08-30 Thread Vitaliy Semochkin
Thank you for the excellent response Alain! On Thu, Aug 30, 2018 at 5:25 PM Alain RODRIGUEZ wrote: > > Hello Vitaly. > > This sounds weird to me (unless we are speaking about a small size MB, a few > GB maybe). Then the commit log size is limited, by default (see below) and > the data should

Re: [EXTERNAL] Re: Nodetool refresh v/s sstableloader

2018-08-30 Thread Carl Mueller
- Range aware compaction strategy that subdivides data by the token range could help for this: you only back up data for the primary node and not the replica data - yes, if you want to use nodetool refresh as some sort of recovery solution, MAKE SURE YOU STORE THE TOKEN LIST with the

Re: Large sstables

2018-08-30 Thread Jeff Jirsa
Either of those is an option, but there’s also sstablesplit to break it up a bit. Switching to LCS can be a problem depending on how many sstables/overlaps you have. -- Jeff Jirsa > On Aug 30, 2018, at 8:05 AM, Vitali Dyachuk wrote: > > Hi, > Some of the sstables got too big 100gb and more

Large sstables

2018-08-30 Thread Vitali Dyachuk
Hi, Some of the sstables got too big (100 GB and more), so they are not compacting any more and some of the disks are running out of space. I'm running C* 3.0.17, RF3 with 10 disks/JBOD with STCS. What are my options? Completely delete all data on this node and rejoin it to the cluster, change CS to

Re: commitlog content

2018-08-30 Thread Alain RODRIGUEZ
Hello Vitaly. This sounds weird to me (unless we are speaking about a small size: some MB, a few GB maybe). The commit log size is limited by default (see below), and the data should grow bigger in most cases. According to the documentation (

Re: Recommended num_tokens setting for small cluster

2018-08-30 Thread Oleksandr Shulgin
On Thu, Aug 30, 2018 at 12:05 AM kurt greaves wrote: > For 10 nodes you probably want to use between 32 and 64. Make sure you use > the token allocation algorithm by specifying allocate_tokens_for_keyspace > We are using 16 tokens with 30 nodes on Cassandra 3.0. And yes, we have used

Re: bigger data density with Cassandra 4.0?

2018-08-30 Thread dinesh.jo...@yahoo.com.INVALID
With LCS and CASSANDRA-6696 you can maximize the percentage of SSTables that use the new streaming path. With LCS and relatively small SSTables you should see good gains. Bootstrap is a use-case that should see the maximum benefits. This feature will get better with time. Dinesh On Wednesday, August

Re: Bootstrap streaming issues

2018-08-30 Thread Jai Bheemsen Rao Dhanwada
okay, thank you On Wed, Aug 29, 2018 at 11:04 PM Jeff Jirsa wrote: > You’re seeing an OOM, not a socket error / timeout. > > -- > Jeff Jirsa > > > On Aug 29, 2018, at 10:56 PM, Jai Bheemsen Rao Dhanwada < > jaibheem...@gmail.com> wrote: > > Jeff, > > any idea if this is somehow related to : >

Re: Bootstrap streaming issues

2018-08-30 Thread Jeff Jirsa
You’re seeing an OOM, not a socket error / timeout. -- Jeff Jirsa > On Aug 29, 2018, at 10:56 PM, Jai Bheemsen Rao Dhanwada > wrote: > > Jeff, > > any idea if this is somehow related to : > https://issues.apache.org/jira/browse/CASSANDRA-11840? > does increasing the value of

Re: Bootstrap streaming issues

2018-08-30 Thread Jeff Jirsa
CMS is fine at 12G for sure, likely up to 16G. You’ll want to initiate CMS a bit earlier (55-69%), and you likely want new gen to be larger - perhaps 3-6G. You’ll want to manually set the memtable size - it scales with heap by default. After bootstrap you can lower it again. -- Jeff Jirsa >
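
For readers who want to see what the advice above could look like as JVM options, here is a minimal sketch. It is not taken from the thread: the helper name `cms_jvm_flags` is hypothetical, and the defaults (12G heap, 4G new gen, CMS initiated at 60% occupancy) are just one pick inside the ranges Jeff mentions. The flags themselves are standard HotSpot options; where they go (for example cassandra-env.sh or jvm.options) depends on the Cassandra version and is left out here.

```python
def cms_jvm_flags(heap_gb=12, new_gen_gb=4, cms_occupancy_pct=60):
    """Build HotSpot options for a CMS-collected heap.

    The default values are illustrative assumptions chosen from the ranges
    given in the message above (heap 12-16G, new gen 3-6G, CMS start 55-69%).
    """
    if not 0 < new_gen_gb < heap_gb:
        raise ValueError("new gen must be positive and smaller than the heap")
    if not 0 < cms_occupancy_pct < 100:
        raise ValueError("occupancy must be a percentage between 0 and 100")
    return [
        f"-Xms{heap_gb}G",   # initial heap size, pinned to the maximum
        f"-Xmx{heap_gb}G",   # maximum heap size
        f"-Xmn{new_gen_gb}G",  # young generation size
        "-XX:+UseConcMarkSweepGC",  # select the CMS collector
        # Start a CMS cycle once the old generation is this full ...
        f"-XX:CMSInitiatingOccupancyFraction={cms_occupancy_pct}",
        # ... and use only that threshold, not the JVM's own heuristics.
        "-XX:+UseCMSInitiatingOccupancyOnly",
    ]


if __name__ == "__main__":
    print("\n".join(cms_jvm_flags()))
```

Calling `cms_jvm_flags(heap_gb=16, new_gen_gb=6, cms_occupancy_pct=55)` would give the upper end of the suggested heap range; the memtable size still has to be set separately, as the message notes.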