Re: High Bloom filter false ratio

2016-02-23 Thread Jeff Jirsa
"user@cassandra.apache.org" Date: Tuesday, February 23, 2016 at 12:37 AM To: "user@cassandra.apache.org" Subject: Re: High Bloom filter false ratio Looks like that sstablemetadata is available in 2.2, we are on 2.0.x do you know anything that will work on 2.0.x On Tue, Feb 23, 2016

RE: High Bloom filter false ratio

2016-02-23 Thread SEAN_R_DURITY
I see the sstablemetadata tool as far back as 1.2.19 (in tools/bin). Sean Durity From: Anishek Agarwal [mailto:anis...@gmail.com] Sent: Tuesday, February 23, 2016 3:37 AM To: user@cassandra.apache.org Subject: Re: High Bloom filter false ratio Looks like that sstablemetadata is available in 2.2

Re: High Bloom filter false ratio

2016-02-23 Thread Anishek Agarwal
ould, >> very easily, write a script that gives you a list of sstables that you >> could feed to forceUserDefinedCompaction to join together to eliminate >> leftover waste. >> >> Your long ParNew times may be fixable by increasing the new gen size of >> your

Re: High Bloom filter false ratio

2016-02-23 Thread Anishek Agarwal
ong ParNew times may be fixable by increasing the new gen size of > your heap – the general guidance in cassandra-env.sh is out of date, you > may want to reference CASSANDRA-8150 for “newer” advice ( > http://issues.apache.org/jira/browse/CASSANDRA-8150 ) > > - Jeff > > From:

Re: High Bloom filter false ratio

2016-02-22 Thread Jeff Jirsa
-8150 ) - Jeff From: Anishek Agarwal Reply-To: "user@cassandra.apache.org" Date: Monday, February 22, 2016 at 8:33 PM To: "user@cassandra.apache.org" Subject: Re: High Bloom filter false ratio Hey Jeff, Thanks for the clarification, I did not exp

Re: High Bloom filter false ratio

2016-02-22 Thread Anishek Agarwal
ishek Agarwal > Reply-To: "user@cassandra.apache.org" > Date: Sunday, February 21, 2016 at 11:13 PM > To: "user@cassandra.apache.org" > Subject: Re: High Bloom filter false ratio > > Hey guys, > > Just did some more digging ... looks like DTCS is not rem

Re: High Bloom filter false ratio

2016-02-22 Thread Jeff Jirsa
"user@cassandra.apache.org" Date: Sunday, February 21, 2016 at 11:13 PM To: "user@cassandra.apache.org" Subject: Re: High Bloom filter false ratio Hey guys, Just did some more digging ... looks like DTCS is not removing old data completely, I used sstable2json for one such

Re: High Bloom filter false ratio

2016-02-22 Thread Christopher Bradford
Does every record in the SSTable have a "d" column? On Mon, Feb 22, 2016 at 2:14 AM Anishek Agarwal wrote: > Hey guys, > > Just did some more digging ... looks like DTCS is not removing old data > completely, I used sstable2json for one such table and saw old data there. > we

Re: High Bloom filter false ratio

2016-02-21 Thread Anishek Agarwal
Hey guys, Just did some more digging ... looks like DTCS is not removing old data completely, I used sstable2json for one such table and saw old data there. We have a value of 30 for max_sstable_age_days for the table. One of the columns showed data as :["2015-12-10 11\\:03+0530:", "56690ea2",

Re: High Bloom filter false ratio

2016-02-21 Thread Anishek Agarwal
We are using DTCS have a 30 day window for them before they are cleaned up. I don't think with DTCS we can do anything about table sizing. Please do let me know if there are other ideas. On Sat, Feb 20, 2016 at 12:51 AM, Jaydeep Chovatia < chovatia.jayd...@gmail.com> wrote: > To me following

Re: High Bloom filter false ratio

2016-02-19 Thread Jaydeep Chovatia
To me, the following three look on the higher side: SSTable count: 1289 In order to reduce SSTable count, see if you are compacting or not (if using STCS). Is it possible to change this to LCS? Number of keys (estimate): 345137664 (345M partition keys) I don't have any suggestion about reducing this

Re: High Bloom filter false ratio

2016-02-18 Thread Anishek Agarwal
Hey all, @Jaydeep here is the cfstats output from one node.
Read Count: 1721134722
Read Latency: 0.04268825050756254 ms.
Write Count: 56743880
Write Latency: 0.014650376727851532 ms.
Pending Tasks: 0
Table: user_stay_points
SSTable count: 1289
Space used (live), bytes: 122141272262
Space

Re: High Bloom filter false ratio

2016-02-18 Thread daemeon reiydelle
The bloom filter buckets the values in a small number of buckets. I have been surprised by how many cases I see with large cardinality where a few values populate a given bloom leaf, resulting in high false positives, and a surprising impact on latencies! Are you seeing 2:1 ranges between mean

Re: High Bloom filter false ratio

2016-02-18 Thread Tyler Hobbs
You can try slightly lowering the bloom_filter_fp_chance on your table. Otherwise, it's possible that you're repeatedly querying one or two partitions that always trigger a bloom filter false positive. You could try manually tracing a few queries on this table (for non-existent partitions) to

High Bloom filter false ratio

2016-02-17 Thread Anishek Agarwal
Hello, We have a table with a composite partition key with humongous cardinality, it's a combination of (long,long). On the table we have bloom_filter_fp_chance=0.01. On doing "nodetool cfstats" on the 5 nodes we have in the cluster we are seeing "Bloom filter false ratio:" in the range of 0.7
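
For context on the numbers in this thread: a rough sketch of what bloom_filter_fp_chance=0.01 implies for filter size, using the textbook Bloom filter sizing formulas (bits per key = -ln(p) / (ln 2)^2, hash count = -log2(p)). The function names are made up for illustration, and Cassandra's own sizing logic may round or cap these values differently, so treat the result as an estimate rather than what a node actually allocates. The key count is the "Number of keys (estimate)" figure quoted from cfstats earlier in the thread.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Textbook optimum: bits of filter needed per key for a target false-positive chance."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_hash_count(fp_chance: float) -> int:
    """Textbook optimum number of hash functions, rounded to the nearest integer."""
    return max(1, round(-math.log2(fp_chance)))


def bloom_size_bytes(num_keys: int, fp_chance: float) -> int:
    """Estimated total filter size in bytes for num_keys keys."""
    return math.ceil(num_keys * bloom_bits_per_key(fp_chance) / 8)


if __name__ == "__main__":
    keys = 345137664        # "Number of keys (estimate)" from the cfstats output in the thread
    fp_chance = 0.01        # bloom_filter_fp_chance set on the table
    print(f"bits per key : {bloom_bits_per_key(fp_chance):.2f}")
    print(f"hash count   : {bloom_hash_count(fp_chance)}")
    print(f"filter size  : {bloom_size_bytes(keys, fp_chance) / 2**20:.0f} MiB")
```

Lowering bloom_filter_fp_chance, as suggested in the thread, increases bits per key and therefore memory use; the same functions can be used to estimate that trade-off before changing the table setting.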