(Please stop adding the dev@hbase mailing list. This is a question for
the user@ list only.)
Unless a time component is included in your HBase data, there is
no way to find all "new" data by cell timestamp alone other than
scanning the entire HBase table. A full table scan is not ideal,
because it is not an access pattern that HBase is optimized for.
You can consider including a leading time component in your rowKey, or
creating an index table that maps load time to rowKey, to perform
these lookups efficiently.
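For illustration only, here is a minimal Python sketch of the two
suggestions above. It uses an in-memory sorted list as a stand-in for
HBase's byte-ordered rowKeys. The key layout (an 8-byte big-endian load
timestamp followed by the original rowKey) and the names
time_leading_key, scan_bounds, and range_scan are assumptions made for
this example. They are not HBase API calls.

```python
import bisect
import struct
from typing import List, Tuple

WEEK_MS = 7 * 24 * 3600 * 1000  # one week in milliseconds


def time_leading_key(load_ts_ms: int, original_key: bytes) -> bytes:
    """Hypothetical layout: 8-byte big-endian timestamp + original rowKey.

    Big-endian packing makes byte-wise (lexicographic) order match
    numeric timestamp order, which is how HBase sorts rowKeys.
    """
    return struct.pack(">Q", load_ts_ms) + original_key


def scan_bounds(start_ts_ms: int, stop_ts_ms: int) -> Tuple[bytes, bytes]:
    """Start row (inclusive) and stop row (exclusive) for a time window."""
    return struct.pack(">Q", start_ts_ms), struct.pack(">Q", stop_ts_ms)


def range_scan(sorted_keys: List[bytes], start: bytes, stop: bytes) -> List[bytes]:
    """Stand-in for a bounded scan: only keys in [start, stop) are touched."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]


# Index-table variant: each write adds one index entry (made-up sample data).
writes = [
    (1_000, b"user42"),
    (WEEK_MS + 5, b"user7"),
    (WEEK_MS + 9, b"user42"),
    (2 * WEEK_MS + 1, b"user99"),
]
index = sorted(time_leading_key(ts, key) for ts, key in writes)

# Weekly job: fetch only the entries loaded during the second week.
start, stop = scan_bounds(WEEK_MS, 2 * WEEK_MS)
hits = range_scan(index, start, stop)

# Strip the 8-byte time prefix and de-duplicate the data rowKeys.
changed = sorted({entry[8:] for entry in hits})
# changed == [b"user42", b"user7"]
```

A weekly job could persist the stop timestamp of its last run and use it
as the next start, so each window is read only once. Note that a
monotonically increasing leading key tends to send all writes to a
single region, so a salt or bucket prefix is often placed in front of
the time component. That is left out here for brevity.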
Chetan Khatri wrote:
Sure. Several applications talk to HBase and populate data. Now
I want to load data incrementally from HBase, apply transformations such
as data quality filters, and save the result to Hive.
By incremental load I mean that I want to run this job weekly and make
sure it does not create duplicates at the Hive level.
Thanks.
On Sat, Jan 28, 2017 at 1:00 AM, Josh Elser wrote:
(-cc dev)
Might you be able to be more specific in the context of your question?
What kind of requirements do you have?
Chetan Khatri wrote:
Hello Community,
I am working with HBase 1.2.4. What would be the best approach for an
incremental load from HBase to Hive?
Thanks.