Mohit,

What will your read patterns be later on? Are you going to read per session, for a time period, or for a set of users, or will you process the entire dataset every time? The answer plays an important role in defining your row keys and columns.
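
To make that concrete, here is a minimal sketch in plain Python (no HBase client calls) of how the row key might differ for three of those read patterns. The function names, the `|` separator, the reversed timestamp, and the 16-bucket salt are all illustrative assumptions, not an established schema, and the sketch assumes the IDs never contain `|`.

```python
import hashlib
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit value, used to reverse timestamps


def session_key(session_id: str) -> bytes:
    """Read per session: the random session id alone is the row key."""
    return session_id.encode("utf-8")


def visitor_key(visitor_id: str, ts_ms: int, session_id: str) -> bytes:
    """Read per user: prefix with the visitor id so one user's sessions are
    contiguous; the reversed timestamp makes the newest session sort first."""
    reversed_ts = struct.pack(">q", MAX_TS - ts_ms)  # big-endian keeps byte order
    return (visitor_id.encode("utf-8") + b"|" + reversed_ts + b"|"
            + session_id.encode("utf-8"))


def time_bucket_key(ts_ms: int, session_id: str, buckets: int = 16) -> bytes:
    """Read per time period: lead with a small salt so that monotonically
    increasing timestamps do not all land on one region server."""
    salt = hashlib.md5(session_id.encode("utf-8")).digest()[0] % buckets
    return bytes([salt]) + struct.pack(">q", ts_ms) + session_id.encode("utf-8")


if __name__ == "__main__":
    newer = visitor_key("visitor-1", 2000, "sess-b")
    older = visitor_key("visitor-1", 1000, "sess-a")
    print(newer < older)  # byte-wise comparison, as HBase sorts row keys
```

With the salted key, a time-range read has to issue one scan per bucket and merge the results, which is the price paid for spreading writes. A full pass over the dataset works with any of the three keys.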

-Amandeep

On Tue, Jun 26, 2012 at 1:34 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:

> I am starting out with a new application where I need to store users'
> clickstream data. I'll have a visitor id and a session id along with other
> page-related data. I am wondering if I should just key off a randomly
> generated session id and store all the page-related data as columns inside
> that row, assuming that this would also give good distribution across
> region servers. In a session a user could send 100s of HTML requests and
> get responses. If someone is already doing this in HBase, I would like to
> learn how they have designed the schema.