Re: clarification on 100k tombstone limit in indexes

2014-08-13 Thread DuyHai Doan
add an additional integer column to the partition key (making it a composite partition key if it isn't already). When inserting, randomly pick a value between, say, 0 and 10 to use for this column -- Due to the low cardinality of bucket (only 10), there is no guarantee that the partitions would
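The bucketing idea quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `NUM_BUCKETS`, `write_key`, and `read_keys` are hypothetical, a bucket count of 10 follows the "only 10" figure in the message, and the partition key is modelled as a plain `(minute, bucket)` tuple rather than an actual CQL composite key.

```python
import random

# Assumption: 10 buckets, matching the "only 10" cardinality mentioned in the message.
NUM_BUCKETS = 10


def pick_bucket(rng=random):
    """Pick a random bucket in [0, NUM_BUCKETS) for a new write."""
    return rng.randrange(NUM_BUCKETS)


def write_key(minute, rng=random):
    """Build the (hypothetical) composite partition key used for one insert."""
    return (minute, pick_bucket(rng))


def read_keys(minute):
    """List every partition key a reader must query to see all writes for `minute`."""
    return [(minute, bucket) for bucket in range(NUM_BUCKETS)]
```

The trade-off in this sketch is that writes for a single minute are spread over several partitions, while a reader has to query every bucket for that minute and merge the results.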

Re: clarification on 100k tombstone limit in indexes

2014-08-13 Thread Tyler Hobbs
On Wed, Aug 13, 2014 at 4:35 AM, DuyHai Doan doanduy...@gmail.com wrote: add an additional integer column to the partition key (making it a composite partition key if it isn't already). When inserting, randomly pick a value between, say, 0 and 10 to use for this column -- Due to the low

Re: clarification on 100k tombstone limit in indexes

2014-08-12 Thread DuyHai Doan
Hello Ian So that way each index entry *will* have quite a few entries and the index as a whole won't grow too big. Is my thinking correct here? -- In this case yes. Do not forget that for each date value, there will be 1 corresponding index value + 10 updates. If you have an approximate count

Re: clarification on 100k tombstone limit in indexes

2014-08-12 Thread Ian Rose
Makes sense - thanks again! On Tue, Aug 12, 2014 at 9:45 AM, DuyHai Doan doanduy...@gmail.com wrote: Hello Ian So that way each index entry *will* have quite a few entries and the index as a whole won't grow too big. Is my thinking correct here? -- In this case yes. Do not forget that for

Re: clarification on 100k tombstone limit in indexes

2014-08-12 Thread Tyler Hobbs
On Mon, Aug 11, 2014 at 4:17 PM, Ian Rose ianr...@fullstory.com wrote: You are better off creating a manual reverse index to track modification date, something like this -- I had considered an approach like this but my concern is that for any given minute *all* of the updates will be handled by a

Re: clarification on 100k tombstone limit in indexes

2014-08-11 Thread Ian Rose
Hi DuyHai, Thanks for the detailed response! A few responses below: On a side note, your use of a secondary index is not the best one. Indeed, indexing the update date will lead to a situation where for one date, you'll mostly have one or a few matching items (assuming that the update date

clarification on 100k tombstone limit in indexes

2014-08-10 Thread Ian Rose
Hi - On this page (http://www.datastax.com/documentation/cql/3.0/cql/ddl/ddl_when_use_index_c.html), the docs state: Do not use an index [...] On a frequently updated or deleted column and *Problems using an index on a frequently updated or deleted column*

Re: clarification on 100k tombstone limit in indexes

2014-08-10 Thread Mark Reddy
Hi Ian, The issue here, which relates to both normal and index column families, is that scanning over a large number of tombstones can cause Cassandra to fall over due to increased GC pressure. This pressure arises because tombstones create DeletedColumn objects, which consume heap. Also these

Re: clarification on 100k tombstone limit in indexes

2014-08-10 Thread Ian Rose
Hi Mark - Thanks for the clarification, but as I'm not too familiar with the nuts and bolts of Cassandra, I'm not sure how to apply that info to my current situation. It sounds like this 100k limit is, indeed, a global limit as opposed to a per-row limit. Are these tombstones ever GCed out of the

Re: clarification on 100k tombstone limit in indexes

2014-08-10 Thread Mark Reddy
Hi Ian Are these tombstones ever GCed out of the index? How frequently? Yes, tombstones are removed after the time specified by gc_grace_seconds has elapsed, which by default is 10 days and is configurable. Knowing and understanding how Cassandra handles distributed deletes is key to designing
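As a small worked example of the 10-day default mentioned above: 10 days is 864000 seconds, so a tombstone written on a given date only becomes eligible for removal 10 days later. The helper name `earliest_purge_time` below is hypothetical, and the sketch only computes eligibility; it does not model when the tombstone is actually dropped.

```python
from datetime import datetime, timedelta

# 10 days expressed in seconds, the default quoted in the message above.
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60


def earliest_purge_time(deleted_at, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return the earliest time at which a tombstone written at `deleted_at`
    becomes eligible for removal (hypothetical helper, arithmetic only)."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)
```

For instance, `earliest_purge_time(datetime(2014, 8, 10))` evaluates to `datetime(2014, 8, 20)`.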

Re: clarification on 100k tombstone limit in indexes

2014-08-10 Thread DuyHai Doan
Hello Ian It sounds like this 100k limit is, indeed, a global limit as opposed to a per-row limit --The threshold applies to each REQUEST, not partition or globally. The threshold does not apply to a partition (physical row) simply because in one request you can fetch data from many partitions
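To illustrate the per-request point, here is a toy model (not Cassandra code) in which a single request reads cells from several partitions and keeps one running tombstone count for the whole request. All names are hypothetical, `None` stands in for a tombstone, and the 100,000 default is taken from the limit discussed in this thread.

```python
# Assumption: the 100k limit discussed in the thread, modelled as a per-request cap.
TOMBSTONE_FAILURE_THRESHOLD = 100_000


class TombstoneOverflow(Exception):
    """Raised when one request scans more tombstones than the threshold allows."""


def scan_request(partitions, threshold=TOMBSTONE_FAILURE_THRESHOLD):
    """Scan the cells of every partition touched by ONE request.

    `partitions` is an iterable of cell lists; a cell of None models a tombstone.
    The counter is shared across partitions, so the limit applies to the request
    as a whole rather than to any single partition.
    """
    tombstones = 0
    live_cells = []
    for cells in partitions:
        for cell in cells:
            if cell is None:
                tombstones += 1
                if tombstones > threshold:
                    raise TombstoneOverflow(
                        f"scanned more than {threshold} tombstones in one request"
                    )
            else:
                live_cells.append(cell)
    return live_cells
```

In this model, two partitions holding three tombstones each fail a request with a threshold of 5 even though neither partition exceeds 5 on its own, and each new request starts counting from zero.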