Hi Ted,

Thanks a lot for the reply. For #1, the value alone will be around 20 bytes per cell, and there will be hundreds of thousands of timestamps per cell, but not millions. Any suggestions?

Many thanks.

Cao

On Monday, November 10, 2014, Ted Yu <[email protected]> wrote:

> For #1, what's the expected size of data you want to store?
>
> For #2, the new data inserted under column:value with a newer timestamp
> would be stored in a different HFile. Old and new data would be
> consolidated after major compaction.
>
> Cheers
>
> On Mon, Nov 10, 2014 at 6:21 AM, Bill Q <[email protected]> wrote:
>
> > Hi,
> > I am designing a schema to store time series data for each device, and I
> > have a couple of questions that I am not quite sure about.
> >
> > 1. *Is there any downside to storing the data in the same
> > columnfamily:column with a long history of custom timestamps?*
> >
> > For example, I have historical daily data for a device. I would like to
> > use only one column qualifier to store it with a custom timestamp, which
> > is the date the data was collected. So, when I query the data I can
> > easily pull all the time series data for this particular device in one
> > scan.
> >
> > 2. *After a storefile is finalized and becomes immutable, what happens
> > when someone updates the row?*
> >
> > For example, if I insert a new column:value with a newer timestamp into
> > the same row:columnfamily, where is this new key/value pair going to sit
> > in HDFS? Is it close to the previous K/V pairs in the storefile?
> >
> >
> > Many thanks.
> >
> >
> > Bill

--
Many thanks.

Bill
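
To make the answer to #2 concrete, here is a minimal toy model in Python of how a store might behave. It is not the HBase API: `ToyStore`, its method names, the row key `device-1` and the YYYYMMDD integer timestamps are all hypothetical, and the model ignores deletes, version limits and same-timestamp overwrites. It only illustrates the point made above: a write with a newer timestamp never modifies an existing immutable file; it ends up in a new file after a flush, a scan merges all files, and a major compaction rewrites them into one.

```python
import heapq


class ToyStore:
    """Toy model of one column-family store (illustrative only, not HBase code)."""

    def __init__(self):
        self.memstore = {}     # (row, qualifier, timestamp) -> value, mutable
        self.store_files = []  # each entry is an immutable, sorted tuple of cells

    @staticmethod
    def _order(cell):
        (row, qualifier, timestamp), _value = cell
        return (row, qualifier, -timestamp)  # newest version sorts first

    def put(self, row, qualifier, timestamp, value):
        # Writes never touch existing store files; they land in the memstore.
        self.memstore[(row, qualifier, timestamp)] = value

    def flush(self):
        # A flush writes the memstore out as a brand-new immutable file.
        if self.memstore:
            cells = sorted(self.memstore.items(), key=self._order)
            self.store_files.append(tuple(cells))
            self.memstore = {}

    def scan(self, row):
        # A read merges every sorted source, so versions that live in
        # different files still come back together, newest first.
        sources = list(self.store_files)
        sources.append(sorted(self.memstore.items(), key=self._order))
        merged = heapq.merge(*sources, key=self._order)
        return [(key[2], value) for key, value in merged if key[0] == row]

    def major_compact(self):
        # Major compaction rewrites all store files into a single file.
        merged = tuple(heapq.merge(*self.store_files, key=self._order))
        self.store_files = [merged] if merged else []


store = ToyStore()
store.put("device-1", "value", 20141108, b"a")
store.put("device-1", "value", 20141109, b"b")
store.flush()                                   # first immutable file
store.put("device-1", "value", 20141110, b"c")  # newer timestamp, same row and column
store.flush()                                   # lands in a second file

files_before = len(store.store_files)
scan_before = store.scan("device-1")
store.major_compact()
files_after = len(store.store_files)
scan_after = store.scan("device-1")
```

In this sketch the scan result is the same before and after compaction; only the number of files that have to be merged changes. In a real table the column family's maximum versions setting would also have to be large enough to retain every timestamp, whereas the toy model keeps all versions unconditionally.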
