I guess you were referring to section 6.3.2.

bq. rowkey is stored and/ or read for every cell value

The above is true.

bq. the event description is a string of 0.1 to 2Kb

You can enable Data Block encoding to reduce storage.

Cheers

On Tue, Sep 17, 2013 at 9:44 AM, Adrian CAPDEFIER <[email protected]> wrote:

> Howdy all,
>
> I'm trying to use hbase for the first time (plenty of other experience with
> RDBMS database though), and I have a couple of questions after reading The
> Book.
>
> I am a bit confused by the advice to reduce "the row size" in the hbase
> book. It states that every cell value is accompanied by the coordinates
> (row, column and timestamp). I'm just trying to be thorough, so am I to
> understand that the rowkey is stored and/ or read for every cell value in a
> record or just once per column family in a record?
>
> I am intrigued by the rows as columns design as described in the book at
> http://hbase.apache.org/book.html#rowkey.design. To make a long story
> short, I will end up with a table to store event types and number of
> occurrences in each day. I would prefer to have the event description as
> the row key and the dates when it happened as columns - up to 7300 for
> roughly 20 years.
> However, the event description is a string of 0.1 to 2Kb and if it is
> stored for each cell value, I will need to use a surrogate (shorter) value.
>
> Is there a built-in functionality to generate (integer) surrogate values in
> hbase that can be used on the rowkey or does it need to be hand coded
> from scratch?
>
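
To put rough numbers on the point in this thread that the rowkey is stored with every cell value, here is a minimal Java sketch. It is only an illustration under stated assumptions: it counts raw rowkey bytes before any Data Block encoding or compression, it ignores the other per-cell fields, and the MD5 digest is just one possible hand-coded surrogate, not a built-in HBase feature. The class and method names (RowKeySketch, rowKeyBytesStored, surrogateKey) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of HBase: back-of-the-envelope arithmetic only.
final class RowKeySketch {

    // Raw bytes spent on the rowkey when it is repeated once per cell.
    // Assumption: no Data Block encoding or compression is applied.
    static long rowKeyBytesStored(int rowKeyLength, int cells) {
        return (long) rowKeyLength * cells;
    }

    // One possible surrogate: a fixed 16-byte MD5 digest of the description.
    // Assumption: the full description is kept elsewhere (for example in a
    // cell value) so it can still be recovered from the surrogate.
    static byte[] surrogateKey(String eventDescription) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return md5.digest(eventDescription.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        int cells = 7300; // one column per day for roughly 20 years
        System.out.println("2 KB rowkey:   " + rowKeyBytesStored(2048, cells) + " bytes");
        int surrogateLength = surrogateKey("example event description").length;
        System.out.println("16-byte digest: " + rowKeyBytesStored(surrogateLength, cells) + " bytes");
    }
}
```

With a 2 KB description as the rowkey and 7300 date columns, the rowkey alone accounts for about 14.9 MB per row before encoding, against roughly 117 KB with a 16-byte surrogate. Data Block encoding narrows that gap on disk, so treat these figures as an upper bound rather than a measurement.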
