Someone mentioned hotspotting in another post. I guess I could reverse the row keys to prevent this?
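To make the idea concrete, here is a minimal sketch of what reversing the hourly keys might look like. The helper names `hourly_key` and `reversed_key` are made up for illustration and are not part of any HBase API; the only thing taken from the thread is the YYYYMMDDHH layout.

```python
from datetime import datetime


def hourly_key(ts: datetime) -> str:
    """Build an hourly row key in the YYYYMMDDHH layout from the thread."""
    return ts.strftime("%Y%m%d%H")


def reversed_key(key: str) -> str:
    """Hypothetical helper: reverse the key so the fastest-changing digit leads."""
    return key[::-1]


# Three consecutive hours on the same day
hours = [datetime(2014, 4, 29, h) for h in (14, 15, 16)]
plain = [hourly_key(t) for t in hours]
flipped = [reversed_key(k) for k in plain]

print(plain)    # keys share the long prefix "20140429"
print(flipped)  # keys now differ in their first character
```

The catch, as far as I can tell, is that reversed keys no longer sort by time, so a range scan over a time period would stop mapping to one contiguous run of rows.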
On Tue, Apr 29, 2014 at 3:34 PM, Software Dev <[email protected]> wrote:
> Hey all. I have some questions regarding row key and column design.
>
> We want to calculate some metrics based on our page views broken down
> by hour, day, month and year. We also want this broken down by country
> and have the ability to filter by some other attributes such as the
> sex of the user or whether or not the user is logged in... Note
> these will all be increments.
>
> So we have the initial row key design as
>
> YYYY - Row key for yearly totals
> YYYYMM - Row key for monthly totals
> YYYYMMDD - Row key for daily totals
> YYYYMMDDHH - Row key for hourly totals
>
> I think this may make sense as it will be easy to do a range scan over
> a time period.
>
> Now for my column design. We were thinking along these lines.
>
> daily:US - Daily counts for the US
> hourly:CA - Hourly counts for Canada
> ... and so on
>
> Now this seems like it would work but fails when we add in the
> requirement of filtering results based on some other attributes. Say we
> wanted to be able to filter based on sex (M or F) and/or filter based
> on logged in status (Online or Offline) and/or filter based on some
> other attribute, or perform no filtering at all. How would I go about
> accomplishing this?
>
> Thanks for any input/pointers.
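For reference, the four row keys from the quoted message can be derived from a single page-view timestamp roughly as follows. This is only a sketch: `KEY_FORMATS` and `row_keys` are hypothetical names, and it assumes each page view increments a counter on all four rows.

```python
from datetime import datetime

# Hypothetical mapping from granularity to strftime pattern,
# following the key layouts in the quoted message.
KEY_FORMATS = {
    "yearly": "%Y",
    "monthly": "%Y%m",
    "daily": "%Y%m%d",
    "hourly": "%Y%m%d%H",
}


def row_keys(ts: datetime) -> dict:
    """Return the row key for each granularity that a page view at ts touches."""
    return {name: ts.strftime(fmt) for name, fmt in KEY_FORMATS.items()}


keys = row_keys(datetime(2014, 4, 29, 15, 34))
print(keys)

# Row keys sort lexicographically, so a shorter key sorts before
# any longer key that it prefixes.
ordered = sorted(keys.values())
print(ordered)
```

Because of that ordering, the granularities interleave if they all live in one table: a scan starting at the daily key 20140429 and stopping before 20140430 would return the daily row followed by that day's hourly rows, so a scan may need to filter on key length or keep each granularity in its own table.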
