Hi Shaofeng,

Sorry about the delayed response; I was on vacation last week.

We (Upsight) are also on 0.98, and I have a version of the patch rebased against 0.98.12, which I'll upload to the JIRA ticket. We've had success running our MapReduce jobs with just the patched hbase-server jar, deployed without touching our server installations, so I imagine it should work for you as well, particularly if you already build or maintain an HBase fork.

Let me know if you run into any issues!

Andrew

On 5/22/15 8:11 PM, Shi, Shaofeng wrote:
Hi Andrew, this is what we need, thank you! In which version will this
feature be released? Our HBase is v0.98; is it possible to just apply
this patch to get the feature?

On 5/22/15, 6:06 PM, "Andrew Mains"<[email protected]>  wrote:

In the latest release, no; however I've filed a ticket here
https://issues.apache.org/jira/browse/HBASE-13356  for this feature, and
uploaded a patch for review.

The patch provides a MultiTableSnapshotInputFormat which can run a list
of scans over multiple snapshots. Jobs can be initialized using:

  public static void initMultiTableSnapshotMapperJob(
      Map<String, Collection<Scan>> snapshotScans,
      Class<? extends TableMapper> mapper,
      Class<?> outputKeyClass, Class<?> outputValueClass,
      Job job, boolean addDependencyJars, Path tmpRestoreDir)
      throws IOException
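
As a rough illustration of the snapshotScans argument (snapshot name mapped to the collection of scans to run over that snapshot), here is a minimal sketch. It deliberately avoids HBase dependencies: ScanRange is a hypothetical stand-in for org.apache.hadoop.hbase.client.Scan, and the snapshot names and row keys are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

class SnapshotScansSketch {

    // Hypothetical stand-in for org.apache.hadoop.hbase.client.Scan, so the
    // sketch compiles without HBase on the classpath.
    static final class ScanRange {
        final String startRow; // inclusive
        final String stopRow;  // exclusive

        ScanRange(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + stopRow + ")";
        }
    }

    // Adds one scan to the collection registered under the given snapshot name,
    // creating that collection on first use.
    static void addScan(Map<String, Collection<ScanRange>> snapshotScans,
                        String snapshotName, ScanRange scan) {
        snapshotScans.computeIfAbsent(snapshotName, k -> new ArrayList<>()).add(scan);
    }

    // Builds a map shaped like the snapshotScans parameter: one entry per
    // snapshot, each holding every scan to run over that snapshot.
    static Map<String, Collection<ScanRange>> buildSnapshotScans() {
        Map<String, Collection<ScanRange>> snapshotScans = new LinkedHashMap<>();
        // Two row ranges over the same (made-up) snapshot.
        addScan(snapshotScans, "orders_snapshot", new ScanRange("a", "m"));
        addScan(snapshotScans, "orders_snapshot", new ScanRange("m", "z"));
        // A single scan over a second (made-up) snapshot.
        addScan(snapshotScans, "users_snapshot", new ScanRange("u0", "u9"));
        return snapshotScans;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Collection<ScanRange>> e : buildSnapshotScans().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With the real classes, the same map would hold Scan objects and be passed as the first argument of initMultiTableSnapshotMapperJob, together with the mapper class, output key/value classes, the Job, and a temporary restore directory.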


Hope this helps!

Andrew

On 5/22/15 2:35 AM, Shi, Shaofeng wrote:
Hello,

We have a scenario that needs to merge multiple HBase tables into one
table periodically. To gain better performance and minimize the impact on
the HBase server, we are evaluating TableSnapshotInputFormat
(http://www.slideshare.net/enissoz/mapreduce-over-snapshots). From the
API, however, we see that it only allows one snapshot as input. Is it
possible to change it to allow multiple snapshots?

Thanks in advance for any advice.

Shaofeng Shi
Apache Kylin

