On Mon, Aug 11, 2014 at 4:17 PM, Ian Rose <ianr...@fullstory.com> wrote:

>
> "You're better off creating a manual reverse-index to track modification date,
> something like this"  --> I had considered an approach like this but my
> concern is that for any given minute *all* of the updates will be handled
> by a single node, right?  For example, if the minute_bucket is 2739 then
> for that one minute, every single item update will flow to the node at
> HASH(2739).  Assuming I am thinking about that right, that seemed like a
> potential scaling bottleneck, which scared me off that approach.
>

If you're concerned about bottlenecking on one node (or set of replicas)
during the minute, add an additional integer column to the partition key
(making it a composite partition key if it isn't already).  When inserting,
randomly pick a value between, say, 0 and 9 to use for this column.  When
reading, read all 10 partitions and merge them.  (Alternatively, instead of
using a random number, you could hash the other key components and use the
lowest bits for the value.  This has the advantage of being deterministic.)
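
A rough client-side sketch of both options, in case it helps.  The names here
(NUM_SHARDS, random_shard, deterministic_shard, and so on) are made up for
illustration and are not part of any driver API, and the rows are assumed to
be (timestamp, item_id) tuples that each partition returns already sorted.
The deterministic variant takes the hash modulo the shard count, which is
only the same as "lowest bits" when the shard count is a power of two.

```python
import hashlib
import heapq
import random

# Hypothetical shard count; this is the size of the extra partition-key column's range.
NUM_SHARDS = 10


def random_shard() -> int:
    """Pick a shard uniformly at random in [0, NUM_SHARDS) at insert time."""
    return random.randrange(NUM_SHARDS)


def deterministic_shard(item_id: str) -> int:
    """Derive the shard from a hash of the other key components.

    The same item_id always maps to the same shard, so repeated writes
    for one item within a minute land in the same partition.
    """
    digest = hashlib.md5(item_id.encode("utf-8")).digest()
    return int.from_bytes(digest[-4:], "big") % NUM_SHARDS


def partition_keys_for_minute(minute_bucket: int) -> list:
    """List every (minute_bucket, shard) partition key a reader must query."""
    return [(minute_bucket, shard) for shard in range(NUM_SHARDS)]


def merge_shard_results(per_shard_rows: list) -> list:
    """Merge per-partition results, each already sorted, into one sorted list."""
    return list(heapq.merge(*per_shard_rows))
```

On the write path you would store each update under (minute_bucket, shard);
on the read path you would issue one query per key from
partition_keys_for_minute(2739) and pass the results to merge_shard_results.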


-- 
Tyler Hobbs
DataStax <http://datastax.com/>
