On Jun 13, 2011, at 5:10 AM, aaron morton wrote:

>> I am wondering how to index on the most recent hour as well. (ie show me top
>> 5 URLs type query)..
>
> AFAIK that's not a great application for counters. You would need range
> support in the secondary indexes so you could get the first X rows ordered by
> a column value.
>
> To be honest, depending on scale, I'd consider a sorted set in redis for
> that.
>
> Hope that helps.

It does. Thanks Aaron.

> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 11 Jun 2011, at 00:36, Ian Holsman wrote:
>
>> On Jun 9, 2011, at 10:04 PM, aaron morton wrote:
>>
>>> I may be missing something but could you use a column for each of the last
>>> 48 hours all in the same row for a url ?
>>>
>>> e.g.
>>> {
>>>   "/url.com/hourly" : {
>>>     "20110609T01:00:00" : 456,
>>>     "20110609T02:00:00" : 4567,
>>>   }
>>> }
>>
>> yes.. that would work better... I was storing all the different times in the
>> same row.
>> {
>>   "/url.com" : {
>>     "H-20110609T01:00:00" : 456,
>>     "H-20110609T02:00:00" : 4567,
>>     "D-20110609" : 5678,
>>   }
>> }
>>
>> I am wondering how to index on the most recent hour as well. (ie show me top
>> 5 URLs type query)..
>>
>>> Increment the current hour only. Delete the older columns either when a
>>> read detects there are old values or as a maintenance job. Or as part of
>>> writing values for the first 5 minutes of any hour.
>>
>> yes.. I thought of that. The problem with doing it on read is there may be a
>> case where an old URL never gets read.. so it will just sit there taking up
>> space.. the maintenance job is the route I went down.
>>
>>> The row will get spread out over a lot of sstables which may reduce read
>>> speed. If this is a problem consider a separate CF with more aggressive GC
>>> and compaction settings.
>>
>> Thanks!
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 10 Jun 2011, at 09:28, Ian Holsman wrote:
>>>
>>>> So would doing something like storing it in reverse (so I know what to
>>>> delete) work? Or is storing a million columns in a supercolumn impossible?
>>>>
>>>> I could always use a logfile and run the archiver off that as a worst case
>>>> I guess.
>>>> Would doing so many deletes screw up the db/cause other problems?
>>>>
>>>> ---
>>>> Ian Holsman - 703 879-3128
>>>>
>>>> I saw the angel in the marble and carved until I set him free --
>>>> Michelangelo
>>>>
>>>> On 09/06/2011, at 4:22 PM, Ryan King <r...@twitter.com> wrote:
>>>>
>>>>> On Thu, Jun 9, 2011 at 1:06 PM, Ian Holsman <had...@holsman.net> wrote:
>>>>>> Hi Ryan.
>>>>>> you wouldn't have your version of cassandra up on github would you??
>>>>>
>>>>> No, and the patch isn't in our version yet either. We're still working on
>>>>> it.
>>>>>
>>>>> -ryan
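
For readers following the thread, here is a rough sketch of the scheme discussed above: one row per URL with a column per hour, increments applied to the current hour only, a maintenance job that drops columns older than 48 hours, and a "top 5 URLs for an hour" query. It is an in-memory Python stand-in, not Cassandra or Redis client code. The class name HourlyCounters and its methods are hypothetical names made up for illustration; only the column name format, the "/hourly" row key suffix and the 48 hour window come from the messages.

```python
import heapq
from collections import defaultdict
from datetime import datetime, timedelta

HOUR_FORMAT = "%Y%m%dT%H:00:00"  # column name format used in the thread
RETENTION_HOURS = 48             # "the last 48 hours" from the thread


def hour_column(ts):
    """Return the column name for the hour containing ts."""
    return ts.strftime(HOUR_FORMAT)


class HourlyCounters:
    """In-memory stand-in for one counter row per URL (hypothetical helper)."""

    def __init__(self):
        # row key -> {hour column name -> count}
        self.rows = defaultdict(dict)

    def increment(self, url, ts, amount=1):
        """Increment only the column for the hour containing ts."""
        row = self.rows[url + "/hourly"]
        col = hour_column(ts)
        row[col] = row.get(col, 0) + amount

    def prune(self, now):
        """Maintenance job: drop columns older than the retention window."""
        cutoff = hour_column(now - timedelta(hours=RETENTION_HOURS))
        for key in list(self.rows):
            row = self.rows[key]
            # Column names are fixed width and zero padded, so string
            # comparison matches chronological order.
            for col in [c for c in row if c < cutoff]:
                del row[col]
            if not row:
                del self.rows[key]

    def top_urls(self, ts, n=5):
        """Return the n (row key, count) pairs with the highest count for one hour."""
        col = hour_column(ts)
        hits = ((key, row[col]) for key, row in self.rows.items() if col in row)
        return heapq.nlargest(n, hits, key=lambda pair: pair[1])
```

Note that top_urls scans every row to answer the query. That full scan is the cost the thread is pointing at when it says counters are a poor fit for "top 5" queries and suggests a structure that keeps members ordered by score, such as a sorted set, instead.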