Re: Question About Reaper

2018-05-21 Thread Alexander Dejanovski
You won't be able to have fewer segments than vnodes, so just use 256 segments per node, use parallel as repair parallelism, and set intensity to 1. You apparently have more than 3TB per node, and that kind of density is always challenging when it comes to running "fast" repairs. Cheers, On Tue. 22
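
The "more than 3TB per node" estimate and the cluster-wide segment floor can be checked with a little arithmetic. The sketch below uses only the figures quoted in the original question of this thread (144 nodes, 256 vnodes per node, around 532TB); the variable names are illustrative, and the result is a rough average that ignores uneven data distribution.

```python
# Figures quoted in the original question of this thread.
NODES = 144
VNODES_PER_NODE = 256
TOTAL_DATA_TB = 532

# Setting suggested in the reply: one segment per vnode on each node.
SEGMENTS_PER_NODE = VNODES_PER_NODE

# Average density per node (assumes data is spread evenly).
data_per_node_tb = TOTAL_DATA_TB / NODES

# Number of vnodes in the cluster, which is the floor for the segment count
# when segments cannot be fewer than vnodes.
total_segments = NODES * SEGMENTS_PER_NODE

print(f"data per node:  {data_per_node_tb:.2f} TB")
print(f"total segments: {total_segments}")
```

With these inputs the average density comes out a little under 3.7TB per node, consistent with the estimate in the reply.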

Re: Question About Reaper

2018-05-21 Thread Surbhi Gupta
We are on DSE 4.8.15, which is Cassandra 2.1. What is the best configuration to use for Reaper on 144 nodes with 256 vnodes? OpsCenter shows around 532TB of data when we start repairs. We need to finish the repair soon. On Mon, May 21, 2018 at 10:53 AM Alexander Dejanovski <

RE: [EXTERNAL] IN clause of prepared statement

2018-05-21 Thread onmstester onmstester
It seems that there is no way of doing this with Cassandra, and even something like Spark won't help, because I'm going to read from a big Cassandra partition (the bottleneck is reading from Cassandra). Sent using Zoho Mail On Tue, 22 May 2018 09:08:55 +0430 onmstester onmstester

RE: [EXTERNAL] IN clause of prepared statement

2018-05-21 Thread onmstester onmstester
I tried that too, using select ALL_NON_Collection_Columns ..., and encountered the error: IN restrictions are not supported on indexed columns Sent using Zoho Mail On Mon, 21 May 2018 20:10:29 +0430 Durity, Sean R sean_r_dur...@homedepot.com wrote One of the columns you are selecting

How is Token function work in Cassandra

2018-05-21 Thread Goutham reddy
I would like to know how the token function works in Cassandra and in what scenarios it is best used. Secondly, can a range query be performed with the token function on a composite primary key? Any help is highly appreciated. Thanks and Regards, Goutham Reddy Aenugu. -- Regards Goutham Reddy

Re: Client ID logging

2018-05-21 Thread Andy Tolbert
CASSANDRA-13665 adds a 'nodetool clientlist' command which I think would be helpful in this circumstance. That feature is targeted for C* 4.0 however. You could use something like lsof to see what active

Re: Client ID logging

2018-05-21 Thread Hannu Kröger
Hmm, I think not by default, but you can create a hook to log that. Create a wrapper for the PasswordAuthenticator class, for example, and use that. Or if you don’t use authentication you can create your own query handler. Hannu > James Lovato wrote on 21.5.2018 at

Client ID logging

2018-05-21 Thread James Lovato
Hi guys, Can standard OSS Cassandra 3 log who connects to it? We have a cluster in 3 DCs and our devs want to see if the client is crossing DCs (even though they have DCLOCAL set from their DS driver). Thanks, James

Re: Cassandra few nodes having high mem consumption

2018-05-21 Thread Abdul Patel
Additionally, cqlsh was taking a little time to log in, and immediately a message like PERIODIC-COMMIT-LOG-SYNCER popped up in the log. It seems the commit log isn't able to commit to disk. Any ideas? I have run nodetool flush and restarted the nodes, but I want to know the root cause. On Monday, May 21, 2018,

Re: Question About Reaper

2018-05-21 Thread Alexander Dejanovski
Hi Surbhi, Reaper might indeed be your best chance to reduce the overhead of vnodes there. The latest betas include a new feature that groups vnodes sharing the same replicas into the same segment. This allows having fewer segments than vnodes, and is available with Cassandra 2.2 and onwards

RE: [EXTERNAL] IN clause of prepared statement

2018-05-21 Thread Durity, Sean R
One of the columns you are selecting is a list or map or other kind of collection. You can’t do that with an IN clause against a clustering column. Either don’t select the collection column OR don’t use the IN clause. Cassandra is trying to protect itself (and you) from a query that won’t scale
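
One way to follow the "don't use the IN clause" advice is to issue one equality query per clustering value and merge the results on the client. The sketch below only builds the statements and their bind values; it does not talk to a cluster. The function name `split_in_query` and the table and column names (`ks.events`, `pk`, `ck`, `tags`) are hypothetical and do not come from the thread.

```python
from typing import Any, List, Sequence, Tuple


def split_in_query(
    table: str,
    select_columns: Sequence[str],
    partition_col: str,
    clustering_col: str,
    partition_value: Any,
    clustering_values: Sequence[Any],
) -> List[Tuple[str, Tuple[Any, Any]]]:
    """Return one (statement, bind values) pair per clustering value.

    Each statement uses an equality restriction on the clustering column
    instead of IN, so collection columns can stay in the select list.
    """
    cols = ", ".join(select_columns)
    cql = (
        f"SELECT {cols} FROM {table} "
        f"WHERE {partition_col} = ? AND {clustering_col} = ?"
    )
    return [(cql, (partition_value, value)) for value in clustering_values]


# Hypothetical table with a collection column named "tags".
queries = split_in_query("ks.events", ["ck", "tags"], "pk", "ck", 10, [1, 2, 3])
for statement, values in queries:
    print(statement, values)
```

The statement text is identical for every value, so it would only need to be prepared once and then bound repeatedly. Whether several small queries outperform a single IN query depends on the data model and is not something this sketch measures.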

Cassandra few nodes having high mem consumption

2018-05-21 Thread Abdul Patel
Hi, I have a few Cassandra nodes that suddenly show 80% memory usage. This happened 1 week after upgrading from 3.1.0 to 3.11.2, with no errors in the log. Is there a way I can find the processes consuming high CPU or memory in Cassandra?

Re: Question About Reaper

2018-05-21 Thread Surbhi Gupta
Thanks Abdul On Mon, May 21, 2018 at 6:28 AM Abdul Patel wrote: > We have a parameter in the Reaper yaml file called > repairManagerSchedulingIntervalSeconds; the default is 10 seconds. I tested > with 8, 6, and 5 seconds and found 5 seconds optimal for my environment. You can go > down

Re: Question About Reaper

2018-05-21 Thread Abdul Patel
We have a parameter in the Reaper yaml file called repairManagerSchedulingIntervalSeconds; the default is 10 seconds. I tested with 8, 6, and 5 seconds and found 5 seconds optimal for my environment. You can go down further, but it will have cascading effects on CPU and memory consumption, so test well. On

Cassandra insert from Spark slows down when running executors on the same node

2018-05-21 Thread Javier Pareja
Hello, I have a Spark Streaming job reading data from Kafka, processing it and inserting it into Cassandra. The job is running on a cluster with 3 machines. I use Mesos to submit the job with 3 executors using 1 core each. The problem is that when all executors are running on the same node, the

Re: performance on reading only the specific nonPk column

2018-05-21 Thread sujeet jog
Thanks Kurt, that answers my question. @nandan, id, timestamp ensures unique primary-key. On Mon, May 21, 2018 at 2:23 PM, kurt greaves wrote: > Every column will be retrieved (that's populated) from disk and the > requested column will then be sliced out in memory and

Re: performance on reading only the specific nonPk column

2018-05-21 Thread kurt greaves
Every column will be retrieved (that's populated) from disk and the requested column will then be sliced out in memory and sent back. On 21 May 2018 at 08:34, sujeet jog wrote: > Folks, > > consider a table with 100 metrics with (id , timestamp ) as key, > if one wants to
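
The behaviour described in this reply can be pictured with a toy model. The sketch below is not Cassandra code: `read_column` and the sample row are hypothetical, and a plain dictionary stands in for a stored row. It only illustrates that every populated column is read before the requested one is sliced out in memory.

```python
from typing import Any, Dict, List, Sequence, Tuple


def read_column(
    stored_rows: Sequence[Dict[str, Any]],
    requested: Sequence[str],
) -> Tuple[List[Dict[str, Any]], int]:
    """Toy model: read every populated cell, then slice in memory."""
    result = []
    cells_read = 0
    for row in stored_rows:
        # "Disk" read: every populated (non-None) column is fetched.
        populated = {name: value for name, value in row.items() if value is not None}
        cells_read += len(populated)
        # In-memory slice: keep only the requested columns.
        result.append({name: populated.get(name) for name in requested})
    return result, cells_read


# Hypothetical row: m3 was never written, so it is not populated.
rows = [{"id": 10, "ts": 1, "m1": 0.5, "m2": 0.7, "m3": None}]
selected, cells = read_column(rows, ["m1"])
print(selected, cells)
```

In this model, selecting only `m1` still touches all four populated cells of the row, which is the cost the reply points out.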

Re: performance on reading only the specific nonPk column

2018-05-21 Thread @Nandan@
First question (just a concern): how are you making sure that your PK is unique? For example, if 10 users write data at the same time, how is your schema going to handle that? Now, on your question: does the read on the specific node happen first bringing all the

performance on reading only the specific nonPk column

2018-05-21 Thread sujeet jog
Folks, consider a table with 100 metrics and (id, timestamp) as the key. If one wants to do a selective metric read, select m1 from table where id = 10 and timestamp >= '2017-01-02 00:00:00' and timestamp <= '2017-01-02 04:00:00', does the read on the specific node happen first bringing all the