Re: Odd CPU utilization spikes on 1 node out of 30 during repair

2018-09-26 Thread Oleksandr Shulgin
On Wed, Sep 26, 2018 at 1:07 PM Anup Shirolkar <anup.shirol...@instaclustr.com> wrote: > > Looking at the information you have provided, the increased CPU utilisation > could be because of repair running on the node. > Repairs are resource-intensive operations. > > Restarting the node should have

Re: Odd CPU utilization spikes on 1 node out of 30 during repair

2018-09-26 Thread Anup Shirolkar
Hi, Looking at the information you have provided, the increased CPU utilisation could be because of repair running on the node. Repairs are resource-intensive operations. Restarting the node should have halted the repair operation, bringing the CPU back to normal. In case you regularly run repairs but

About UDF/UDA

2018-09-26 Thread Riccardo Ferrari
Hi users! Since I'm on Cassandra 3.0.x, I don't have the famous GROUP BY operator available, so looking around I turned to UDAs. I'm aware that all/most of the magic happens on the coordinator, and the plan is to keep the data volume low to avoid too much pressure. Q1: How much is low volume? It's

Re: monitor and alert tool for open source cassandra

2018-09-26 Thread Joaquin Casares
Hello James, This project provides a Docker Compose environment for using Prometheus and Grafana, along with preconfigured files for Prometheus and Graphite consumption. I would, however, prefer Prometheus over Graphite as Grafana's backend.

Re: About UDF/UDA

2018-09-26 Thread DuyHai Doan
A hint to answer your Q3 is to use a final function to perform the flattening or transformation on the result of the aggregation. The syntax of a UDA is: CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS] aggregateName(type1, type2, …) SFUNC accumulatorFunction STYPE stateType [FINALFUNC

Re: Data copy problem

2018-09-26 Thread Kiran mk
Please try the COPY TO command to dump the data in CSV or any other delimited format. Then run COPY FROM on the target cluster after copying the exported file over. Best Regards, Kiran.M.K On Thu, 27 Sep 2018 at 4:05 AM, rajasekhar kommineni wrote: > Hi All, > > I have a requirement to

Re: Data copy problem

2018-09-26 Thread rajasekhar kommineni
I have to copy a whole keyspace which has many tables; some tables are 30 GB. COPY is going to take more time, and the load is also shooting up. I have a 4-node source cluster and a 4-node target cluster. My question is: do I need to execute the command below from all the nodes of the source cluster?

Data copy problem

2018-09-26 Thread rajasekhar kommineni
Hi All, I have a requirement to copy table data from one cluster to another. I tried both sstableloader and copying a snapshot followed by the nodetool refresh command. In both cases the count of records in the new cluster is less than in the original cluster. I ran a select count(*) against the table to count the records.

Re: Odd CPU utilization spikes on 1 node out of 30 during repair

2018-09-26 Thread Anup Shirolkar
Hi, Most things in your setup look OK. You can enable debug logging for the duration of the repair. This will help identify whether you are hitting a bug or some other cause of the unusual behaviour. Just a remote possibility: do you have other things running on the nodes besides Cassandra? Do they consume additional

Re: TWCS + subrange repair = excessive re-compaction?

2018-09-26 Thread kurt greaves
Not any faster, as you'll still have to wait for all the SSTables to age off, as a partition level tombstone will simply go to a new SSTable and likely will not be compacted with the old SSTables. On Tue, 25 Sep 2018 at 17:03, Martin Mačura wrote: > Most partitions in our dataset span one or

Re: About UDF/UDA

2018-09-26 Thread Riccardo Ferrari
Thank you Doan, Indeed, I'm already using a FINALFUNC to compute the average. A bit more context: I'm working on bucketized data, and each bucket already has an 'event count' and an 'average' in it. My functions look as follows: //SFUNC CREATE OR REPLACE FUNCTION summaryState(state map >>, name text,

Odd CPU utilization spikes on 1 node out of 30 during repair

2018-09-26 Thread Oleksandr Shulgin
Hello, On our production cluster of 30 Apache Cassandra 3.0.17 nodes we have observed that only one node started to show about 2 times the CPU utilization as compared to the rest (see screenshot): up to 30% vs. ~15% on average for the other nodes. This started more or less immediately after