Re: "keyed by something like [timestamp, action details, session ID]"

Read the part about monotonically increasing keys in the HBase book. There have been lots of other threads in the dist-list about this topic too.

-----Original Message-----
From: Leif Wickland [mailto:leifwickl...@gmail.com]
Sent: Monday, June 13, 2011 1:29 PM
To: user@hbase.apache.org
Subject: Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

> If they have divergent read and write patterns why not put them in
> separate tables?

That's an entirely fair question. I'm new to this. I figured if the data was related to the same thing and could have the same key, then it ought to go into various CFs on that key in a single table. I got the feeling from reading the BigTable paper that the typical design approach was to dump lots of CFs into a table. It seems like that's not the HBase-way, though.

For the most part it's not a big deal to store the data in separate tables. However, I'm curious what you'd recommend for one particular part of it. Specifically I'd like to store actions within a web visit. I've been planning to store individual actions as columns in their own column family, keyed by something like [timestamp, action details, session ID]. In another column family I'd been planning on storing statistics about the actions, such as first time, end time, count, etc. When writing to the actions CF, I'd need to read from and possibly update the stats CF.

Would your recommendation be to store that kind of data in the same CF, two CFs in the same table, or in two separate tables? My thought was that I could use row locking to avoid races to update the stats after inserting into actions if I took the two CF approach.

Thanks for your feedback,

Leif Wickland
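
A footnote on the row-key point at the top of this message. If the [timestamp, action details, session ID] composite were used as a row key, it would start with a steadily increasing value, so new writes would tend to land in the same region. One commonly described workaround is to lead the key with something that spreads writes out, such as a small hash-derived bucket, and to put the timestamp later in the key. The sketch below is only an illustration under that assumption. The function name, the bucket count, and the field order are all hypothetical and do not come from the thread or from any HBase API; it just builds the key bytes.

```python
import hashlib
import struct

# Hypothetical bucket count; a real value would depend on the cluster.
NUM_BUCKETS = 16


def salted_row_key(session_id: str, timestamp_ms: int, action: str) -> bytes:
    """Build a row key that leads with a hash-derived bucket byte.

    Layout (an assumption made for this sketch):
        [1-byte bucket][session id][0x00][8-byte big-endian timestamp][0x00][action]
    """
    session_bytes = session_id.encode("utf-8")
    # Derive the bucket from the session ID so that all rows for one
    # session share a prefix and stay adjacent in sort order.
    bucket = hashlib.md5(session_bytes).digest()[0] % NUM_BUCKETS
    return (
        bytes([bucket])
        + session_bytes
        + b"\x00"
        # Big-endian packing keeps byte order equal to numeric order
        # for non-negative timestamps.
        + struct.pack(">Q", timestamp_ms)
        + b"\x00"
        + action.encode("utf-8")
    )


if __name__ == "__main__":
    earlier = salted_row_key("session-42", 1307996940000, "view:/home")
    later = salted_row_key("session-42", 1307996945000, "click:/buy")
    print(earlier.hex())
    print(later.hex())
```

With this layout, actions for one session still sort by time within that session, while different sessions are scattered across buckets. The cost is that a scan over a pure time range has to visit every bucket.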