Hi Srikanth,

Yes, versions will help me if I have a fixed number of them.
But in my case, I will not know the number of versions beforehand. The table will be populated from feed files by a MapReduce program. Once loaded, all these values will go into the same column family:column. I then want to count the number of times a particular URI was accessed by userid1. For this, I need to scan through all the versions loaded under that rowkey and increment a counter.

How is this possible if I do not know the number of versions being loaded into a rowkey, since that number is dynamic (each feed file may have a different number of records)?

Is the setTimeRange method of Get and Scan meant to do this? If so, why am I not getting all the column values for a particular rowkey?

Regards,
Narayanan

On Mon, Jul 11, 2011 at 12:28 PM, Srikanth P. Shreenivas <[email protected]> wrote:

> Hi Narayanan,
>
> I think you need to create the table with versions enabled.
>
> For example, if you need to store 5 versions, you can use create like this:
>
> hbase> create 'useractivity', {NAME => 'pageviews', VERSIONS => 5}
>
> hbase> put 'useractivity', 'userid1', 'pageviews:uri', 'http://www.allaboutdata.net'
> hbase> put 'useractivity', 'userid1', 'pageviews:uri', 'http://www.yahoo.co.in'
>
> hbase> get "useractivity", "userid1", {COLUMN=>'pageviews',VERSIONS=>2}
> COLUMN                CELL
>  pageviews:uri        timestamp=1310367267049, value=http://www.yahoo.co.in
>  pageviews:uri        timestamp=1310367221129, value=http://www.allaboutdata.net
> 2 row(s) in 0.0440 seconds
>
> One thing you need to watch out for is that VERSIONS is defined on the column
> family, and hence you cannot change it once you have defined your column
> family. This will work if your application only needs to store a fixed
> number of versions. If that is not the case, you need to
> rethink your table design and achieve this some other way.
>
> Regards,
> Srikanth
>
> -----Original Message-----
> From: Narayanan K [mailto:[email protected]]
> Sent: Monday, July 11, 2011 11:07 AM
> To: [email protected]
> Subject: Fetching and iterating through all column values belonging to all Timestamps of a Row
>
> Hi all,
>
> I am using Hadoop 0.20.1 and HBase 0.20.
>
> Currently, I am trying to retrieve and iterate through all the column values
> of a particular rowkey in an HBase table.
> But I am able to retrieve *only* the cell+value having the *latest timestamp*.
>
> Eg:
>
> hbase> create 'useractivity', 'pageviews'
> hbase> put 'useractivity', 'userid1', 'pageviews:uri', 'http://www.allaboutdata.net'
> hbase> put 'useractivity', 'userid1', 'pageviews:uri', 'http://www.yahoo.co.in'
>
> hbase> get 'useractivity', 'userid1'
>
> is fetching only the "http://www.yahoo.co.in" column value as it has the
> latest timestamp.
>
> I wanted to view both the values in the column *uri*.
>
> I tried the same with the Java API, using Get as well as Scan. But both of
> them gave me the same result: only the value that was inserted last.
> I also read through some old archives and found I could call setTimeRange on
> Get/Scan, which is also not solving my problem.
>
> get.setTimeRange(0, Long.MAX_VALUE); as in:
>
> HTable table = new HTable(new HBaseConfiguration(), "useractivity");
> Get get = new Get(Bytes.toBytes("userid1"));
> get.addFamily(Bytes.toBytes("pageviews"));
> get.setTimeRange(0, Long.MAX_VALUE);
> Result result = table.get(get);
> byte[] value = result.getValue(Bytes.toBytes("pageviews"),
>     Bytes.toBytes("uri"));
>
> System.out.println(Bytes.toString(value));
>
> This is fetching me only the column value with the latest timestamp.
>
> I tried the same with the Scan API. But I get the same result.
>
> *Could you please let me know how I can retrieve all column values of all
> timestamps of a particular rowkey?*
>
> Many Thanks,
> Narayanan
>
> ________________________________
>
> http://www.mindtree.com/email/disclaimer.html
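
For reference, here is a minimal sketch of the counting step described at the top of this mail: walk every stored version of one column and tally how often each URI appears. It uses only the JDK and assumes the versions are already available as a timestamp-to-value map. In the HBase client that map would presumably be read from the per-column entry of `Result.getMap()` after asking for all versions with `Get.setMaxVersions()`; that wiring is an assumption and is not shown, and the number of versions actually kept is still limited by the VERSIONS setting of the column family. The class name `UriAccessCounter` and the method `countByUri` are hypothetical names made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical helper, not part of HBase: counts how often each value
// occurs across all versions of a single column.
class UriAccessCounter {

    // versions maps timestamp -> raw cell value for one column of one row
    // (assumed input shape; how it is fetched from HBase is not shown here).
    static Map<String, Integer> countByUri(NavigableMap<Long, byte[]> versions) {
        Map<String, Integer> counts = new HashMap<>();
        for (byte[] value : versions.values()) {
            String uri = new String(value, StandardCharsets.UTF_8);
            counts.merge(uri, 1, Integer::sum); // one increment per stored version
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in data; the third timestamp is invented for the example.
        NavigableMap<Long, byte[]> versions = new TreeMap<>();
        versions.put(1310367221129L,
                "http://www.allaboutdata.net".getBytes(StandardCharsets.UTF_8));
        versions.put(1310367267049L,
                "http://www.yahoo.co.in".getBytes(StandardCharsets.UTF_8));
        versions.put(1310367300000L,
                "http://www.yahoo.co.in".getBytes(StandardCharsets.UTF_8));

        Map<String, Integer> counts = countByUri(versions);
        System.out.println(counts.get("http://www.yahoo.co.in")); // prints 2
    }
}
```

The loop itself does not need to know the number of versions in advance; the open question is only whether the table can be made to return and retain all of them.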
