Thanks.

Basically, there are two access patterns:

1) For the last hour (or more, if the last batch failed for some reason), get the clicks data for all ads. This does not seem possible, though, because the ad ID is part of the partition key.
2) For the last hour (or more, if the last batch failed for some reason), get the clicks data for specific ad IDs (one or more).

How do we support both 1 and 2 with the same data model? (I thought of using ad ID + hour as the partition key to avoid hotspots.)

Thanks
Ajay

On Wed, Jan 7, 2015 at 6:34 PM, Sylvain Lebresne <sylv...@datastax.com> wrote:

> On Wed, Jan 7, 2015 at 10:18 AM, Ajay <ajay.ga...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a column family as below:
>>
>> (Wide row design)
>> CREATE TABLE clicks (hour text,adId int,itemId int,time timeuuid,PRIMARY
>> KEY((adId, hour), time, itemId)) WITH CLUSTERING ORDER BY (time DESC);
>>
>> Now to query for a given Ad Id and specific 3 hours say 2015-01-07 11 to
>> 2015-01-07 14, how do I use the token function in the CQL.
>>
>
> From that description, it doesn't appear to me that you need the token
> function. Just do 3 queries, one for each hour, each query being something
> along the lines of
>   SELECT * FROM clicks WHERE adId=... AND hour='2015-01-07 11' AND ...
>
> For completeness' sake, I should note that you could do that with a single
> query by using an IN on the hour column, but it's actually not a better
> solution in that case (provided you submit the 3 queries in an asynchronous
> fashion at least), for the reasons explained here:
> https://medium.com/@foundev/cassandra-query-patterns-not-using-the-in-query-e8d23f9b17c7
>
> --
> Sylvain
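
For pattern 2, the per-hour queries suggested in the quoted reply could be enumerated along the following lines. This is only a rough sketch in Python: the helper names `hour_buckets` and `per_partition_queries` are made up for illustration, the `?` placeholders assume a prepared statement, and the hour format is assumed to match the 'YYYY-MM-DD HH' text values used in the thread. Actually submitting the queries (asynchronously, as the reply recommends) is left to whichever driver is in use.

```python
from datetime import datetime, timedelta

# Assumed to match the text values stored in the 'hour' column,
# e.g. '2015-01-07 11' as in the quoted example.
HOUR_FORMAT = "%Y-%m-%d %H"


def hour_buckets(start, end):
    """Return one hour-bucket string per hour covering [start, end)."""
    # Truncate to the top of the hour so a partial first hour is included.
    current = start.replace(minute=0, second=0, microsecond=0)
    buckets = []
    while current < end:
        buckets.append(current.strftime(HOUR_FORMAT))
        current += timedelta(hours=1)
    return buckets


def per_partition_queries(ad_ids, start, end):
    """Build one (query, params) pair per (adId, hour) partition.

    Hypothetical helper: it only builds the statements; it does not
    talk to Cassandra.
    """
    query = "SELECT * FROM clicks WHERE adId = ? AND hour = ?"
    hours = hour_buckets(start, end)
    return [(query, (ad_id, hour)) for ad_id in ad_ids for hour in hours]


if __name__ == "__main__":
    # The 3 hours from the original question: 11:00 up to (not including) 14:00.
    window_start = datetime(2015, 1, 7, 11)
    window_end = datetime(2015, 1, 7, 14)
    for statement, params in per_partition_queries([42], window_start, window_end):
        print(statement, params)
```

Each pair targets exactly one (adId, hour) partition, so widening the window after a failed batch only means passing an earlier start time.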