You should reuse HTable instances rather than create a new one for each batch of puts. They are not thread-safe, though, so use one per thread. Check out the HTablePool class.
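
A minimal sketch of the one-instance-per-thread idea, in case it helps. It is not HBase code: `PerThreadTableCache` and its `opener` factory are hypothetical names, and the type parameter `T` stands in for HTable so the sketch compiles without HBase on the classpath. Each thread gets its own map from table name to handle, so a handle is created once per thread and table and never shared across threads.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Per-thread cache of table handles (hypothetical helper, not part of HBase).
 * T stands in for HTable so this compiles without the HBase jars.
 */
final class PerThreadTableCache<T> {
    // Factory that opens a handle for a table name. This is an assumption of
    // the sketch; in real code it would construct an HTable.
    private final Function<String, T> opener;

    // Each thread sees its own map, so no handle is shared between threads.
    private final ThreadLocal<Map<String, T>> handles =
            ThreadLocal.withInitial(HashMap::new);

    PerThreadTableCache(Function<String, T> opener) {
        this.opener = opener;
    }

    /** Returns the calling thread's handle for tableName, opening it on first use. */
    T get(String tableName) {
        return handles.get().computeIfAbsent(tableName, opener);
    }
}
```

With HBase available, the opener would presumably wrap something like `new HTable(conf, tableName)`, using one shared configuration object and converting the checked IOException to an unchecked one. The loop from the question would then call `cache.get(tableName).put(list)` instead of constructing a new HTable each time. HTablePool is the stock alternative if you would rather not maintain a helper like this.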
> -----Original Message-----
> From: Eric Yang [mailto:[email protected]]
> Sent: Saturday, July 03, 2010 9:30 PM
> To: [email protected]
> Subject: Re: stargate retrieve multiple version of a cell
>
> I used the shell to create the table. This explained why it only
> stored 3 versions. I will switch to use java API to create the
> tables. Another question, I am currently sinking all data into the
> same table for my prototype. Is there any heavy cost for creating new
> instance of HTable?
>
> My code may look like this:
>
> for(String tableName : tableList) {
>   List<Put> list = ...;
>   hbase = new HTable(new HBaseConfiguration(), tableName);
>   hbase.put(list);
> }
>
> Or should I keep HTable instances in hash and reuse them later?
>
> regards,
> Eric
>
> On Sat, Jul 3, 2010 at 5:43 PM, Jonathan Gray <[email protected]> wrote:
> > Have you looked at Scan.setMaxVersions(int)? Is that what you're looking for?
> >
> > Also, when you created the table, it has a default max of three versions. Did you use the java API or the shell to create your table?
> >
> > HColumnDescriptor.setMaxVersions(int) is what you want to set when you create the table initially. To keep all versions, use setMaxVersions(Integer.MAX_VALUE).
> >
> > JG
> >
> >> -----Original Message-----
> >> From: Eric Yang [mailto:[email protected]]
> >> Sent: Saturday, July 03, 2010 4:19 PM
> >> To: [email protected]
> >> Subject: Re: stargate retrieve multiple version of a cell
> >>
> >> Hi Jonathan,
> >>
> >> I am trying to store large time series data. I am using a row as a
> >> group for one hour's data. My row contains 60 timestamps, and each
> >> timestamp has various cell values. I am hoping this will produce row
> >> that is not too thick and table that is slightly shorter. I am fine
> >> with non-ordered versioning, as long as I get timestamp when data is
> >> retrieved for the timestamp range. When I scan for the cell, I only
> >> get the most recent three versions of the cell.
> >>
> >> This was tested on hbase 0.20.5, and hadoop 0.20.2.
> >>
> >> regards,
> >> Eric
> >>
> >> On Sat, Jul 3, 2010 at 2:34 PM, Jonathan Gray <[email protected]> wrote:
> >> > What exactly are you trying to do with the timestamp? Currently even duplicates are retained and returned, but the order is not guaranteed (though we are working on this).
> >> >
> >> > The behavior is related only to time/order of operations, no difference if using different clients (not including behavior from write buffering).
> >> >
> >> > JG
> >> >
> >> >> -----Original Message-----
> >> >> From: Eric Yang [mailto:[email protected]]
> >> >> Sent: Saturday, July 03, 2010 2:32 PM
> >> >> To: [email protected]
> >> >> Subject: Re: stargate retrieve multiple version of a cell
> >> >>
> >> >> I think I just found the answer of my own question. It was not
> >> >> stargate's problem. The data was not stored in hbase as I expected it
> >> >> to be. This raised a more basic question:
> >> >>
> >> >> I am storing data like this:
> >> >>
> >> >> Put row1, cf1:c1: 0, timestamp: 10
> >> >> Put row1, cf1:c2: 10, timestamp: 10
> >> >> Put row1, cf1:c2: 15, timestamp: 20
> >> >> Put row1, cf1:c1: 1, timestamp: 20
> >> >>
> >> >> I am updating individual column by timestamp, and repeat this
> >> >> 60 times for each of the columns. This is all executed by the same
> >> >> client. When I scan for "row1, c2", would I get 60 different values
> >> >> for each of the timestamp?
> >> >>
> >> >> What would happen if this kind of updates are applied by different
> >> >> hbase client?
> >> >>
> >> >> regards,
> >> >> Eric
> >> >>
> >> >> On Sat, Jul 3, 2010 at 1:56 PM, Eric Yang <[email protected]> wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > I am trying to use stargate to get multiple versions of the cell, and
> >> >> > my query looks like this:
> >> >> >
> >> >> > http://localhost:9090/chukwa/1278180000000-Eric-Yangs-iMac.local/Hadoop_dfs_namenode:CreateFileOps/1278183540000/1278189900000
> >> >> >
> >> >> > table name: chukwa
> >> >> > row: 1278187200000-Eric-Yangs-iMac.local
> >> >> > column: Hadoop_dfs_namenode:CreateFileOps
> >> >> > start-timestamp: 1278183540000
> >> >> > end-timestamp: 1278189900000
> >> >> >
> >> >> > It only shows me the most recent 3 versions, but not all the versions
> >> >> > in this time range. Is this the right syntax? What am I doing wrong?
> >> >> > Thanks
> >> >> >
> >> >> > regards,
> >> >> > Eric
