This is exactly what I need. Thanks, owe you a beer. :)

regards,
Eric

On Sat, Jul 3, 2010 at 9:34 PM, Jonathan Gray <[email protected]> wrote:
> You should reuse HTable instances, but they are not thread-safe, so use one
> per thread. Check out the HTablePool class.
>
>> -----Original Message-----
>> From: Eric Yang [mailto:[email protected]]
>> Sent: Saturday, July 03, 2010 9:30 PM
>> To: [email protected]
>> Subject: Re: stargate retrieve multiple version of a cell
>>
>> I used the shell to create the table. This explains why it only
>> stored 3 versions. I will switch to using the Java API to create the
>> tables. Another question: I am currently sinking all data into the
>> same table for my prototype. Is there any heavy cost for creating a new
>> instance of HTable?
>>
>> My code may look like this:
>>
>> for (String tableName : tableList) {
>>   List<Put> list = ...;
>>   HTable hbase = new HTable(new HBaseConfiguration(), tableName);
>>   hbase.put(list);
>> }
>>
>> Or should I keep HTable instances in a hash and reuse them later?
>>
>> regards,
>> Eric
>>
>> On Sat, Jul 3, 2010 at 5:43 PM, Jonathan Gray <[email protected]> wrote:
>> > Have you looked at Scan.setMaxVersions(int)? Is that what you're
>> > looking for?
>> >
>> > Also, when you create a table, it has a default max of three
>> > versions. Did you use the Java API or the shell to create your table?
>> >
>> > HColumnDescriptor.setMaxVersions(int) is what you want to set when
>> > you create the table initially. To keep all versions, use
>> > setMaxVersions(Integer.MAX_VALUE).
>> >
>> > JG
>> >
>> >> -----Original Message-----
>> >> From: Eric Yang [mailto:[email protected]]
>> >> Sent: Saturday, July 03, 2010 4:19 PM
>> >> To: [email protected]
>> >> Subject: Re: stargate retrieve multiple version of a cell
>> >>
>> >> Hi Jonathan,
>> >>
>> >> I am trying to store large time series data. I am using a row as a
>> >> group for one hour's data. My row contains 60 timestamps, and each
>> >> timestamp has various cell values. I am hoping this will produce
>> >> rows that are not too thick and a table that is slightly shorter. I am
>> >> fine with unordered versioning, as long as I get the timestamp when
>> >> data is retrieved for the timestamp range. When I scan for the cell,
>> >> I only get the most recent three versions of the cell.
>> >>
>> >> This was tested on HBase 0.20.5 and Hadoop 0.20.2.
>> >>
>> >> regards,
>> >> Eric
>> >>
>> >> On Sat, Jul 3, 2010 at 2:34 PM, Jonathan Gray <[email protected]> wrote:
>> >> > What exactly are you trying to do with the timestamp? Currently
>> >> > even duplicates are retained and returned, but the order is not
>> >> > guaranteed (though we are working on this).
>> >> >
>> >> > The behavior is related only to time/order of operations; there is
>> >> > no difference if using different clients (not including behavior
>> >> > from write buffering).
>> >> >
>> >> > JG
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Eric Yang [mailto:[email protected]]
>> >> >> Sent: Saturday, July 03, 2010 2:32 PM
>> >> >> To: [email protected]
>> >> >> Subject: Re: stargate retrieve multiple version of a cell
>> >> >>
>> >> >> I think I just found the answer to my own question. It was not
>> >> >> stargate's problem. The data was not stored in hbase as I expected
>> >> >> it to be. This raised a more basic question:
>> >> >>
>> >> >> I am storing data like this:
>> >> >>
>> >> >> Put row1, cf1:c1: 0, timestamp: 10
>> >> >> Put row1, cf1:c2: 10, timestamp: 10
>> >> >> Put row1, cf1:c2: 15, timestamp: 20
>> >> >> Put row1, cf1:c1: 1, timestamp: 20
>> >> >>
>> >> >> I am updating individual columns by timestamp, and repeat this
>> >> >> 60 times for each of the columns. This is all executed by the same
>> >> >> client. When I scan for "row1, c2", would I get 60 different values,
>> >> >> one for each of the timestamps?
>> >> >>
>> >> >> What would happen if this kind of update is applied by a different
>> >> >> hbase client?
>> >> >>
>> >> >> regards,
>> >> >> Eric
>> >> >>
>> >> >> On Sat, Jul 3, 2010 at 1:56 PM, Eric Yang <[email protected]> wrote:
>> >> >> > Hi all,
>> >> >> >
>> >> >> > I am trying to use stargate to get multiple versions of the cell,
>> >> >> > and my query looks like this:
>> >> >> >
>> >> >> > http://localhost:9090/chukwa/1278180000000-Eric-Yangs-iMac.local/Hadoop_dfs_namenode:CreateFileOps/1278183540000/1278189900000
>> >> >> >
>> >> >> > table name: chukwa
>> >> >> > row: 1278187200000-Eric-Yangs-iMac.local
>> >> >> > column: Hadoop_dfs_namenode:CreateFileOps
>> >> >> > start-timestamp: 1278183540000
>> >> >> > end-timestamp: 1278189900000
>> >> >> >
>> >> >> > It only shows me the most recent 3 versions, but not all the
>> >> >> > versions in this time range. Is this the right syntax? What am I
>> >> >> > doing wrong? Thanks
>> >> >> >
>> >> >> > regards,
>> >> >> > Eric
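
For anyone reading this thread later: the "only three versions come back" behavior discussed above can be pictured with a small in-memory model. This is only a rough sketch and not HBase code. The class VersionedCell, its methods, the newest-first ordering, and the half-open time range are all assumptions made up for illustration. It shows a cell that keeps at most maxVersions timestamps, so 60 writes with a cap of 3 leave 3 values, while a cap of Integer.MAX_VALUE keeps all 60.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

/** Toy in-memory model of one versioned cell (hypothetical; not the HBase API). */
final class VersionedCell {
    private final int maxVersions;
    // Sorted newest timestamp first in this sketch.
    private final TreeMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());

    VersionedCell(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be >= 1");
        }
        this.maxVersions = maxVersions;
    }

    /** Stores a value at a timestamp and drops the oldest versions beyond the cap. */
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry is the oldest under reverse ordering
        }
    }

    /** Returns retained timestamps in [minStamp, maxStamp), newest first. */
    List<Long> timestampsInRange(long minStamp, long maxStamp) {
        List<Long> out = new ArrayList<>();
        for (Long ts : versions.keySet()) {
            if (ts >= minStamp && ts < maxStamp) {
                out.add(ts);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCell capped = new VersionedCell(3);                    // the default cap mentioned in the thread
        VersionedCell keepAll = new VersionedCell(Integer.MAX_VALUE);   // "keep all versions"
        long start = 1278183540000L;
        for (int minute = 0; minute < 60; minute++) {
            long ts = start + minute * 60000L;
            capped.put(ts, "v" + minute);
            keepAll.put(ts, "v" + minute);
        }
        System.out.println(capped.timestampsInRange(0L, Long.MAX_VALUE).size());  // prints 3
        System.out.println(keepAll.timestampsInRange(0L, Long.MAX_VALUE).size()); // prints 60
    }
}
```

In the real system the cap is set per column family when the table is created, as Jonathan describes with HColumnDescriptor.setMaxVersions(int); the model above only mirrors the visible effect on reads.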

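
On the HTable reuse advice at the top of the thread ("reuse instances, one per thread"), here is a minimal sketch of that pattern using only the JDK. TableHandle is a hypothetical stand-in for a non-thread-safe client object such as HTable, and PerThreadTables is an invented helper name; neither is part of HBase, and this is not how HTablePool is implemented. The idea is just that each thread keeps its own small cache keyed by table name, so a handle is created once per thread and never shared across threads.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a client handle that is not thread-safe. */
final class TableHandle {
    final String tableName;

    TableHandle(String tableName) {
        this.tableName = tableName;
    }
}

/** Keeps one handle per (thread, table name): reused within a thread, never shared. */
final class PerThreadTables {
    // Each thread lazily gets its own private map of table name to handle.
    private static final ThreadLocal<Map<String, TableHandle>> CACHE =
            ThreadLocal.withInitial(HashMap::new);

    private PerThreadTables() {}

    /** Returns the calling thread's handle for the table, creating it on first use. */
    static TableHandle get(String tableName) {
        return CACHE.get().computeIfAbsent(tableName, TableHandle::new);
    }

    /** Demo helper: fetches the handle as seen from a freshly started thread. */
    static TableHandle getFromNewThread(String tableName) {
        TableHandle[] holder = new TableHandle[1];
        Thread worker = new Thread(() -> holder[0] = get(tableName));
        worker.start();
        try {
            worker.join(); // join makes the worker's write to holder visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder[0];
    }

    public static void main(String[] args) {
        TableHandle first = get("chukwa");
        TableHandle second = get("chukwa");
        TableHandle fromOtherThread = getFromNewThread("chukwa");
        System.out.println(first == second);          // prints true: reused in the same thread
        System.out.println(first == fromOtherThread); // prints false: other thread has its own
    }
}
```

Compared with the loop in Eric's mail, which builds a new instance on every iteration, this pays the construction cost once per thread and table. With a real client you would also want to close the handles when a thread finishes, which this sketch leaves out.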