Thanks, Ted. So far I haven't found a direct answer in any HBase book or article. All the resources say that values are ordered by rowkey:cf:column, but none of them explains how new columns are stored after compaction. I think the store files should still organize data the same way after compaction. So if a new column needs to be added to every row regularly, compaction might have to do extra I/O work accordingly, because the new cells have to be merged in between the existing ones. Maybe a better schema design would keep old data intact instead of continually adding new columns to it.
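To make my guess concrete, here is a rough sketch in plain Python (not HBase code) of how I picture it. It assumes cells sort by row key, column family, and qualifier ascending, then by timestamp descending, and that compaction is a sorted merge of store files into one new file. The family name "d", the timestamps, and the helper names flush and compact are made up for illustration.

```python
import heapq

# A cell is modeled as (row, family, qualifier, timestamp, value).
# Assumed ordering: row, family, qualifier ascending; newest timestamp first.
def sort_key(cell):
    row, family, qualifier, timestamp, _value = cell
    return (row, family, qualifier, -timestamp)

def flush(cells):
    """Model a memstore flush: write the cells out as one sorted store file."""
    return sorted(cells, key=sort_key)

def compact(store_files):
    """Model a compaction: merge already-sorted store files into one sorted file."""
    return list(heapq.merge(*store_files, key=sort_key))

# Day 1 writes the "09-01-2015" column for every event (hypothetical family "d").
day1 = flush([
    ("eventid2", "d", "09-01-2015", 1, "data21"),
    ("eventid1", "d", "09-01-2015", 1, "data11"),
])

# Day 2 adds a new "09-02-2015" column to the same rows.
day2 = flush([
    ("eventid1", "d", "09-02-2015", 2, "data12"),
    ("eventid2", "d", "09-02-2015", 2, "data22"),
])

compacted = compact([day1, day2])
for cell in compacted:
    print(cell)
```

In this model the new cells end up interleaved with the old ones row by row, and the compacted file contains every old cell again. So the rewrite cost grows with the total data in the files being compacted, not just with the size of the new column.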
On Sat, Oct 10, 2015 at 7:55 PM, Ted Yu <[email protected]> wrote:

> Please take a look at:
>
> http://hbase.apache.org/book.html#_compaction
> http://hbase.apache.org/book.html#exploringcompaction.policy
>
> http://hbase.apache.org/book.html#compaction.ratiobasedcompactionpolicy.algorithm
>
> FYI
>
> On Sat, Oct 10, 2015 at 6:53 PM, Liren Ding <[email protected]>
> wrote:
>
> > Hi,
> >
> > I am trying to design a schema for time series events data. The row key is
> > eventId, and event data is added into new "date" columns daily. So in a
> > query I only need to set filter on columns to find all data for specified
> > events. The table should look like following:
> >
> > rowkey   | 09-01-2015 | 09-02-2015 | ......
> >
> > eventid1   data11       data12
> > eventid2   data21       data22
> > eventid3   ......       ,......
> > .......
> >
> > I know during compaction the data with same row key will be stored
> > together. So with this design, will new columns cause compaction storm? Or
> > any other issues?
> > Appreciate!
> >
>
