The title of [3] only mentions storage, but if you read through the comments on [2], Dmitriy mentions that [3] will also cover querying.
Norbert

On Thu, Jul 28, 2011 at 8:53 AM, Vincent Barat <[email protected]> wrote:

> Thanks for the input. [3] is more related to timestamp storage; anyway, I
> added my 2 cents to the issue concerning loading by timestamp.
>
> On 28/07/11 13:19, Norbert Burger wrote:
>
>> You can instruct HBaseStorage to load a subset of the rows using the "-gt"
>> and "-lt" options to HBaseStorage, documented here [1].
>>
>> I don't believe querying by timestamp is currently supported in Pig, based
>> on the comments to [2]. There is a standalone JIRA that's been created
>> [3].
>>
>> Norbert
>>
>> [1] http://ofps.oreilly.com/titles/9781449302641/community.html#hbase_options_table
>> [2] https://issues.apache.org/jira/browse/PIG-1782
>> [3] https://issues.apache.org/jira/browse/PIG-1832
>>
>> On Thu, Jul 28, 2011 at 6:18 AM, Vincent Barat <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I'd like to make Pig load only a subset of an HBase table, based on the
>>> timestamp of the records, or on the key of the rows.
>>>
>>> As an example, I'd like to load only records that have a timestamp > N,
>>> or a key > "something".
>>>
>>> I know that HBase can handle scanners that are highly optimized to
>>> perform this kind of thing, and it would greatly reduce the time needed
>>> to load my data.
>>>
>>> Is there any way to do this?
>>> If not, is it planned to be added to the HBase loader?
>>> If not, is it technically possible to do it?
>>> If yes, can I contribute and propose a patch for that?
>>>
>>> Thanks a lot!
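
For reference, here is a rough sketch of the row-key range load discussed in the quoted reply, using the "-gt" and "-lt" options of HBaseStorage. The table name `mytable`, the column family `cf`, the column names and the key bounds are all made up for illustration, so adjust them to your own schema. Note that this only restricts by row key; it does not filter by timestamp.

```pig
-- Hypothetical table 'mytable' with column family 'cf'; all names are placeholders.
-- '-loadKey true' returns the row key as the first field of each tuple.
-- '-gt' and '-lt' limit the scan to row keys strictly between the two bounds.
raw = LOAD 'hbase://mytable'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
          'cf:col1 cf:col2',
          '-loadKey true -gt key_0100 -lt key_0200')
      AS (rowkey:chararray, col1:chararray, col2:chararray);

DUMP raw;
```

HBase compares row keys as byte strings, so the bounds follow lexicographic order. Numeric keys need fixed-width padding (as in the placeholder keys above) for a range like this to behave as expected.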
