[
https://issues.apache.org/jira/browse/HBASE-21642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730327#comment-16730327
]
Hudson commented on HBASE-21642:
--------------------------------
Results for branch master
[build #685 on
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/685/]: (x)
*{color:red}-1 overall{color}*
----
details (if available):
(/) {color:green}+1 general checks{color}
-- For more information [see general
report|https://builds.apache.org/job/HBase%20Nightly/job/master/685//General_Nightly_Build_Report/]
(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2)
report|https://builds.apache.org/job/HBase%20Nightly/job/master/685//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3)
report|https://builds.apache.org/job/HBase%20Nightly/job/master/685//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}
> CopyTable by reading snapshot and bulkloading will save a lot of time.
> ----------------------------------------------------------------------
>
> Key: HBASE-21642
> URL: https://issues.apache.org/jira/browse/HBASE-21642
> Project: HBase
> Issue Type: Improvement
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21642.v1.patch
>
>
> In our HBase clusters, some users need to merge the data of two different
> tables into one. Currently, CopyTable scans the source table and puts
> mutations into the destination table.
> Although CopyTable with bulkload is much faster than CopyTable with scan
> and put, it still takes a long time to scan the source table. Worse,
> CopyTable with a table scan hurts the cluster's availability: scanning
> consumes a lot of RS resources (CPU, memory, GC stop-the-world pauses, RS
> handlers, disk I/O, network I/O, etc.), and all of these affect
> availability.
> So in our clusters, we have tried to run all scanning jobs against a
> snapshot instead of the table. This at least isolates CPU, memory, and GC
> resources between the online RS and the scanning job. What's more, snapshot
> scanning is much faster than scanning through the RS, and it is more stable.
> So here, I'll make the CopyTable tool support snapshot scanning.
>