On Mon, Aug 11, 2014 at 4:17 PM, Ian Rose <ianr...@fullstory.com> wrote:
>
> "You're better off creating a manual reverse-index to track modification
> date, something like this" --> I had considered an approach like this but
> my concern is that for any given minute *all* of the updates will be
> handled by a single node, right? For example, if the minute_bucket is 2739
> then for that one minute, every single item update will flow to the node
> at HASH(2739). Assuming I am thinking about that right, that seemed like a
> potential scaling bottleneck, which scared me off that approach.
>

If you're concerned about bottlenecking on one node (or set of replicas)
during the minute, add an additional integer column to the partition key
(making it a composite partition key if it isn't already). When inserting,
randomly pick a value from, say, 0 to 9 to use for this column. When
reading, read all 10 partitions and merge them.

(Alternatively, instead of using a random number, you could hash the other
key components and use the lowest bits for the value. This has the
advantage of being deterministic.)

-- 
Tyler Hobbs
DataStax <http://datastax.com/>
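
For illustration, here is a minimal sketch of the deterministic variant
described above. It is not driver code: an in-memory dict stands in for the
table, and the names NUM_SHARDS, shard_for, record_update and
updates_in_minute are hypothetical. It also assumes MD5 as the hash and
takes the hash modulo 10 rather than the lowest bits, since 10 is not a
power of two.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 10  # assumed shard count; shard values run from 0 to 9


def shard_for(item_id: str) -> int:
    """Derive a deterministic shard number from a hash of the item id."""
    digest = hashlib.md5(item_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Stand-in for the reverse-index table. The composite partition key is
# (minute_bucket, shard), so one minute's writes spread over NUM_SHARDS
# partitions instead of landing on a single one.
table = defaultdict(list)


def record_update(minute_bucket: int, item_id: str) -> None:
    """Write an update into the partition chosen by the item's shard."""
    table[(minute_bucket, shard_for(item_id))].append(item_id)


def updates_in_minute(minute_bucket: int) -> list:
    """Read every shard of the minute bucket and merge the results."""
    merged = []
    for shard in range(NUM_SHARDS):
        merged.extend(table.get((minute_bucket, shard), []))
    return sorted(merged)
```

With a real cluster, the loop in updates_in_minute would correspond to one
query per (minute_bucket, shard) partition, which could be issued
concurrently and merged client-side.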