Re: Time Series schema performance

2018-05-29 Thread Affan Syed
Haris, like all things in Cassandra, you will need to create a down-sample normalized table, i.e. either run a cron job over the raw table or, if you are using a streaming solution like Flink/Storm/Spark, extract aggregate values and write them into your down-sampled table. HTH - Affan On Tue, May 29, 2018
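The cron-style variant described above can be sketched roughly as follows. This is only an illustration in plain Python, not code from the thread: the `downsample` function, the `(series_id, epoch_seconds, value)` row shape and the one-hour bucket are all assumptions, and reading from or writing to Cassandra is left out entirely.

```python
from collections import defaultdict


def downsample(rows, bucket_seconds=3600):
    """Aggregate raw points into fixed-width time buckets.

    rows: iterable of (series_id, epoch_seconds, value) tuples (hypothetical shape).
    Returns {(series_id, bucket_start): {"count", "avg", "min", "max"}}.
    """
    # Per-bucket accumulator: [count, sum, min, max]
    acc = defaultdict(lambda: [0, 0.0, float("inf"), float("-inf")])
    for series_id, ts, value in rows:
        bucket_start = ts - (ts % bucket_seconds)  # floor to the bucket boundary
        a = acc[(series_id, bucket_start)]
        a[0] += 1
        a[1] += value
        a[2] = min(a[2], value)
        a[3] = max(a[3], value)
    return {
        key: {"count": c, "avg": s / c, "min": lo, "max": hi}
        for key, (c, s, lo, hi) in acc.items()
    }
```

Each returned entry would correspond to one row in the down-sampled table; a streaming job would keep the same accumulators per window instead of scanning the raw table.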

Re: nodetool (2.1.18) - Xmx, ParallelGCThreads, High CPU usage

2018-05-29 Thread Chris Lohfink
It might be better to disable explicit GCs so the full GCs don’t occur at all. They likely come from the RMI DGC or direct byte buffers rather than from any actual need to collect; otherwise the concurrent GC threads would be an issue as well. Nodetool also has no excuse to use that big a heap, so it should have max size

Re: nodetool (2.1.18) - Xmx, ParallelGCThreads, High CPU usage

2018-05-29 Thread kurt greaves
Good to know. So that confirms it's just the GC threads causing problems. On Tue., 29 May 2018, 22:02 Steinmaurer, Thomas, <thomas.steinmau...@dynatrace.com> wrote: > Kurt, > > > > in our test it also didn’t make a difference with the default number of GC > Threads (43 on our large machine) and

Re: Time Series schema performance

2018-05-29 Thread Haris Altaf
Hi All, I have a related question. How do you down-sample your timeseries data? regards, Haris On Tue, 29 May 2018 at 22:11 Jonathan Haddad wrote: > I wrote a post on this topic a while ago, might be worth reading over: > >

Re: Time Series schema performance

2018-05-29 Thread Jonathan Haddad
I wrote a post on this topic a while ago, might be worth reading over: http://thelastpickle.com/blog/2017/08/02/time-series-data-modeling-massive-scale.html On Tue, May 29, 2018 at 8:02 AM Jeff Jirsa wrote: > There’s a third option which is doing bucketing by time instead of by hash, which tends

Re: Time Series schema performance

2018-05-29 Thread Jeff Jirsa
There’s a third option which is doing bucketing by time instead of by hash, which tends to perform quite well if you’re using TWCS as it makes it quite likely that a read can be served by a single sstable -- Jeff Jirsa > On May 29, 2018, at 6:49 AM, sujeet jog wrote: > > Folks, > I have
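A minimal sketch of what bucketing by time can look like on the client side, assuming a one-day bucket that becomes part of the partition key. The helper names (`day_bucket`, `partition_key`, `buckets_for_range`) and the day-sized bucket are illustrative assumptions, not something stated in the thread.

```python
from datetime import datetime, timedelta, timezone


def day_bucket(ts: datetime) -> str:
    """Return the UTC day a timestamp falls in, e.g. '2018-05-29'."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def partition_key(series_id: str, ts: datetime) -> tuple:
    """Hypothetical composite partition key: (id, time bucket)."""
    return (series_id, day_bucket(ts))


def buckets_for_range(start: datetime, end: datetime) -> list:
    """List every day bucket a read over [start, end] has to touch."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    buckets = []
    while day <= last:
        buckets.append(day.isoformat())
        day += timedelta(days=1)
    return buckets
```

A read for one id and a narrow time range then touches one partition per bucket; choosing the bucket width is a trade-off between partition size and the number of partitions per query.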

Time Series schema performance

2018-05-29 Thread sujeet jog
Folks, I have two alternatives for my time series schema and wanted to weigh which of the two to use. The query is: given an id and a timestamp, read the metrics associated with the id. The records are inserted every 5 minutes, and the number of ids is 2 million, so every 5 minutes it will be 2

RE: nodetool (2.1.18) - Xmx, ParallelGCThreads, High CPU usage

2018-05-29 Thread Steinmaurer, Thomas
Kurt, in our test it also didn’t make a difference with the default number of GC threads (43 on our large machine) and running with Xmx128M or Xmx31G (derived from $MAX_HEAP_SIZE). For both Xmx settings, we saw the high CPU usage caused by nodetool. Regards, Thomas From: kurt greaves

Re: nodetool (2.1.18) - Xmx, ParallelGCThreads, High CPU usage

2018-05-29 Thread kurt greaves
Thanks Thomas. After a bit more research today I found that the whole $MAX_HEAP_SIZE issue isn't really a problem: because we don't explicitly set -Xms, the minimum heap size will default to 256 MB, which isn't hugely problematic, and it's unlikely that more than that would get allocated. On 29 May

Scaling in Cassandra

2018-05-29 Thread Vishal1.Sharma
Dear Community, I’d appreciate some help with the question below: https://stackoverflow.com/q/50581473/5701173 Regards, Vishal Sharma

RE: nodetool (2.1.18) - Xmx, ParallelGCThreads, High CPU usage

2018-05-29 Thread Steinmaurer, Thomas
Hi Kurt, thanks for pointing me to the Xmx issue. A JIRA ticket and patch (Linux only, based on C* 3.11) for the parallel GC thread issue are available here: https://issues.apache.org/jira/browse/CASSANDRA-14475 Thanks, Thomas From: kurt greaves [mailto:k...@instaclustr.com] Sent: Tuesday, 29 May

Re: Certified Cassandra for Enterprise use

2018-05-29 Thread Ben Slater
Hi Pranay, we (Instaclustr) provide enterprise support for Cassandra (https://www.instaclustr.com/services/cassandra-support/), which may cover what you are looking for. Please get in touch directly if you would like to discuss. Cheers Ben On Tue, 29 May 2018 at 10:11 Pranay akula wrote: > Is

Re: Using K8s to Manage Cassandra in Production

2018-05-29 Thread Hassaan Pasha
Thank you everyone. This thread has been really useful! On Wed, May 23, 2018 at 8:59 PM, Ben Bromhead wrote: > Here are the expectations around compatibility levels https://github.com/kubernetes/community/blob/master/contributors/design- >