[jira] [Commented] (HBASE-21642) CopyTable by reading snapshot and bulkloading will save a lot of time.

Hudson (JIRA) Fri, 28 Dec 2018 07:43:21 -0800


    [ 
https://issues.apache.org/jira/browse/HBASE-21642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730327#comment-16730327
 ]


Hudson commented on HBASE-21642:
--------------------------------

Results for branch master
        [build #685 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/685/]: (x) 
*{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/685//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/685//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/685//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> CopyTable by reading snapshot and bulkloading will save a lot of time.
> ----------------------------------------------------------------------
>
>                 Key: HBASE-21642
>                 URL: https://issues.apache.org/jira/browse/HBASE-21642
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-21642.v1.patch
>
>
> In our HBase clusters, some users need to merge the data of two different
> tables into one. Currently, CopyTable scans the source table and puts
> mutations into the destination table.
> Although CopyTable with bulkload is much faster than CopyTable with scan
> and put, it still takes a long time to scan the source table. Worse,
> scanning the table affects the cluster's availability: the scan consumes
> many RegionServer (RS) resources, including CPU, memory, GC
> stop-the-world pauses, RS handlers, disk I/O, and network I/O.
> So in our clusters, we try to run all scanning jobs against a snapshot
> instead of the table. This at least isolates the CPU, memory, and GC
> resources of the online RS from the scanning job. What's more, snapshot
> scanning is much faster than scanning through the RS, and it is more
> stable.
> So here I'll make the CopyTable tool support snapshot scanning.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
