One example is how Trafodion lets you generate a divisioning column, such as a week number derived from a date column, that becomes the leading part of a multi-column HBase key. You can then use that leading column to access only the data for the last week. Trafodion also adds a salt column as a key prefix to spread the data evenly across the regions, but you may not need that in your scenario.
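To make the idea concrete, here is a rough sketch in plain Python of what such a key layout could look like. It is not Trafodion's actual encoding and it does not call any HBase client API. The function names, the 4-byte big-endian week prefix, and the "whole weeks since the Unix epoch" definition (so weeks start on a Thursday, UTC) are all assumptions for illustration.

```python
import struct
from datetime import datetime, timezone
from typing import Tuple

SECONDS_PER_WEEK = 7 * 24 * 3600


def week_number(ts: datetime) -> int:
    """Whole weeks since the Unix epoch (an assumed divisioning scheme)."""
    return int(ts.timestamp()) // SECONDS_PER_WEEK


def make_row_key(ts: datetime, entity_id: str) -> bytes:
    """Hypothetical row key: week-number prefix followed by the entity id."""
    # Big-endian unsigned int, so byte-wise key order matches numeric week order.
    return struct.pack(">I", week_number(ts)) + entity_id.encode("utf-8")


def week_scan_range(ts: datetime) -> Tuple[bytes, bytes]:
    """Start (inclusive) and stop (exclusive) row keys for the week containing ts."""
    week = week_number(ts)
    return struct.pack(">I", week), struct.pack(">I", week + 1)


# Example: keys written in the week of Jan 28, 2017 fall inside that week's range.
ts = datetime(2017, 1, 28, tzinfo=timezone.utc)
start_row, stop_row = week_scan_range(ts)
key = make_row_key(ts, "row1")
in_range = start_row <= key < stop_row
```

A weekly job would pass start_row and stop_row as the start and stop rows of a single range scan instead of scanning the whole table. If you do add a salt prefix in front of the week number, you would need one such range per salt bucket.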
Rohit

> On Jan 28, 2017, at 1:34 PM, Josh Elser <[email protected]> wrote:
>
> (Please stop adding the dev@hbase mailing list. This is a question for the
> user@ list only.)
>
> Unless you have a time component included in your HBase data, there is no way
> to find all "new" data in HBase with the timestamp component aside from
> scanning the entire HBase table. Performing a full table scan is not an ideal
> scenario, as it is not a situation which HBase is optimized for.
>
> You can consider including a leading component of time in your rowKey or
> creating an index table of time loaded to rowKey to efficiently perform these
> lookups.
>
> Chetan Khatri wrote:
>> Sure, There are several applications talks to HBase and populate data, Now
>> I want to load Incrementally data from HBase and do transformations like
>> Data Quality (filters) and save at Hive.
>>
>> Incremental load means - I want to run this job weekly, and making sure
>> should not get duplication at Hive level.
>>
>> Thanks.
>>
>>> On Sat, Jan 28, 2017 at 1:00 AM, Josh Elser<[email protected]> wrote:
>>>
>>> (-cc dev)
>>>
>>> Might you be able to be more specific in the context of your question?
>>>
>>> What kind of requirements do you have?
>>>
>>>
>>> Chetan Khatri wrote:
>>>
>>>> Hello Community,
>>>>
>>>> I am working with HBase 1.2.4 , what would be the best approach to do
>>>> Incremental load from HBase to Hive ?
>>>>
>>>> Thanks.
>>>>
>>>>
>>
