Hi Jonathan,

I am trying to store large time series data. I am using one row as a group for one hour's data. Each row contains 60 timestamps, and each timestamp has various cell values. I am hoping this will produce rows that are not too wide and a table that is somewhat shorter. I am fine with unordered versioning, as long as I get the timestamps back when data is retrieved for a timestamp range. When I scan for the cell, I only get the most recent three versions of the cell.
This was tested on HBase 0.20.5 and Hadoop 0.20.2.

regards,
Eric

On Sat, Jul 3, 2010 at 2:34 PM, Jonathan Gray <[email protected]> wrote:
> What exactly are you trying to do with the timestamp? Currently even
> duplicates are retained and returned, but the order is not guaranteed (though
> we are working on this).
>
> The behavior is related only to time/order of operations, no difference if
> using different clients (not including behavior from write buffering).
>
> JG
>
>> -----Original Message-----
>> From: Eric Yang [mailto:[email protected]]
>> Sent: Saturday, July 03, 2010 2:32 PM
>> To: [email protected]
>> Subject: Re: stargate retrieve multiple version of a cell
>>
>> I think I just found the answer to my own question. It was not
>> stargate's problem. The data was not stored in hbase as I expected it
>> to be. This raised a more basic question:
>>
>> I am storing data like this:
>>
>> Put row1, cf1:c1: 0, timestamp: 10
>> Put row1, cf1:c2: 10, timestamp: 10
>> Put row1, cf1:c2: 15, timestamp: 20
>> Put row1, cf1:c1: 1, timestamp: 20
>>
>> I am updating individual columns by timestamp, and repeat this
>> 60 times for each of the columns. This is all executed by the same
>> client. When I scan for "row1, c2", would I get 60 different values,
>> one for each timestamp?
>>
>> What would happen if this kind of update were applied by different
>> hbase clients?
>>
>> regards,
>> Eric
>>
>> On Sat, Jul 3, 2010 at 1:56 PM, Eric Yang <[email protected]> wrote:
>> > Hi all,
>> >
>> > I am trying to use stargate to get multiple versions of the cell, and
>> > my query looks like this:
>> >
>> > http://localhost:9090/chukwa/1278180000000-Eric-Yangs-iMac.local/Hadoop_dfs_namenode:CreateFileOps/1278183540000/1278189900000
>> >
>> > table name: chukwa
>> > row: 1278187200000-Eric-Yangs-iMac.local
>> > column: Hadoop_dfs_namenode:CreateFileOps
>> > start-timestamp: 1278183540000
>> > end-timestamp: 1278189900000
>> >
>> > It only shows me the most recent 3 versions, but not all the versions
>> > in this time range. Is this the right syntax? What am I doing wrong?
>> > Thanks
>> >
>> > regards,
>> > Eric
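
One possible explanation for seeing only three versions is that the column family was created with its maximum number of versions left at 3, which I believe was the default in the 0.20 line. The sketch below is a toy in-memory model of that retention rule, not HBase code: the class `VersionedTable`, the helper `load_hour`, and the sample timestamps are all hypothetical, and real HBase differs in details such as when older versions are physically removed. It only illustrates how 60 per-minute writes to one cell can come back as 3 values from a time-range read.

```python
# Toy model of per-cell version retention. This is NOT the HBase API;
# every name here is made up for illustration.
from collections import defaultdict

# Assumption: the column family keeps at most 3 versions per cell.
ASSUMED_MAX_VERSIONS = 3


class VersionedTable:
    def __init__(self, max_versions=ASSUMED_MAX_VERSIONS):
        self.max_versions = max_versions
        # (row, column) -> {timestamp: value}
        self.cells = defaultdict(dict)

    def put(self, row, column, value, timestamp):
        # A second put with the same timestamp overwrites in this model.
        self.cells[(row, column)][timestamp] = value

    def retained(self, row, column):
        """Newest-first (timestamp, value) pairs, capped at max_versions."""
        versions = self.cells.get((row, column), {})
        newest_first = sorted(versions.items(), reverse=True)
        return newest_first[: self.max_versions]

    def get_versions(self, row, column, min_ts, max_ts):
        """Retained versions whose timestamp falls in [min_ts, max_ts)."""
        return [
            (ts, value)
            for ts, value in self.retained(row, column)
            if min_ts <= ts < max_ts
        ]


def load_hour(table, row, column, start_ts):
    """Write one value per minute for an hour (60 versions of one cell)."""
    for minute in range(60):
        table.put(row, column, minute, start_ts + minute * 60_000)


if __name__ == "__main__":
    hour_start = 1278183600000  # hypothetical hour-aligned timestamp (ms)
    hour_end = hour_start + 3_600_000

    capped = VersionedTable()
    load_hour(capped, "row1", "cf1:c2", hour_start)
    print(len(capped.get_versions("row1", "cf1:c2", hour_start, hour_end)))

    roomy = VersionedTable(max_versions=60)
    load_hour(roomy, "row1", "cf1:c2", hour_start)
    print(len(roomy.get_versions("row1", "cf1:c2", hour_start, hour_end)))
```

If that assumption holds for the chukwa table, raising the family's version limit to at least 60 would be the setting to look at for this one-row-per-hour layout.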
