[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-07-31 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106914#comment-16106914
 ] 

Zheng Hu commented on HBASE-16466:
--

[~sukuna...@gmail.com], thanks for your reply. I created HBASE-18484 to 
address it. 

> HBase snapshots support in VerifyReplication tool to reduce load on live 
> HBase cluster with large tables
> 
>
> Key: HBASE-16466
> URL: https://issues.apache.org/jira/browse/HBASE-16466
> Project: HBase
> Issue Type: Improvement
> Components: hbase
> Affects Versions: 0.98.21
> Reporter: Sukumar Maddineni
> Assignee: Maddineni Sukumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16466.branch-1.3.001.patch, HBASE-16466.v1.patch, 
> HBASE-16466.v2.patch, HBASE-16466.v3.patch, HBASE-16466.v4.patch, 
> HBASE-16466.v5.patch
>
>
> As of now, the VerifyReplication tool runs using normal HBase scanners. If 
> you want to run VerifyReplication multiple times on a live production 
> cluster with large tables, it creates extra load on the HBase layer. If we 
> implement snapshot-based support, then on both source and target we can read 
> data from snapshots, which reduces load on HBase.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-07-29 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106085#comment-16106085
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Hi [~openinx]

I never tested with the YARN layer alone in a different cluster than the source 
or peer, because we always run VerifyReplication within the source cluster.
Please file a separate JIRA if you think it's a valid use case and it's an issue.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-07-28 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16104582#comment-16104582
 ] 

Zheng Hu commented on HBASE-16466:
--

[~sukuna...@gmail.com], I have one question about the test you provided. Did you 
run the MR job with the source HBase cluster, the peer HBase cluster, and the 
YARN cluster all on the same HDFS cluster?

It seems that when the source HBase cluster, the peer HBase cluster, and the 
YARN cluster are located on three different HDFS clusters, there is a problem.

When restoring the snapshot into the tmpdir, we need to create the region with 
the following code (HRegion#createHRegion):
{code}
  public static HRegion createHRegion(final HRegionInfo info, final Path rootDir,
      final Configuration conf, final TableDescriptor hTableDescriptor,
      final WAL wal, final boolean initialize)
      throws IOException {
    LOG.info("creating HRegion " + info.getTable().getNameAsString()
        + " HTD == " + hTableDescriptor + " RootDir = " + rootDir +
        " Table name == " + info.getTable().getNameAsString());
    // <--- Here our code uses the fs.defaultFS configuration to create the region.
    FileSystem fs = FileSystem.get(conf);
    Path tableDir = FSUtils.getTableDir(rootDir, info.getTable());
    HRegionFileSystem.createRegionOnFileSystem(conf, fs, tableDir, info);
    HRegion region = HRegion.newHRegion(tableDir, wal, fs, conf, info, hTableDescriptor, null);
    if (initialize) region.initialize(null);
    return region;
  }
{code}

When the source cluster and the peer cluster are located on two different file 
systems, their fs.defaultFS values differ, so at least one cluster will fail 
when restoring the snapshot into the tmpdir. After I applied the following fix, 
it works fine for me.

{code}
-  FileSystem fs = FileSystem.get(conf);  
+ FileSystem fs = rootDir.getFileSystem(conf);
{code}
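
A minimal sketch of why the one-line change matters, under a deliberately simplified model of the resolution rules: FileSystem.get(conf) always resolves against fs.defaultFS, while Path#getFileSystem(conf) resolves against the path's own scheme and authority and falls back to the default only for unqualified paths. The class FsResolutionSketch and both of its methods are hypothetical stand-ins built on java.net.URI; they are not Hadoop APIs, and the host names are made up.

```java
import java.net.URI;

/** Simplified, hypothetical model of filesystem resolution; not Hadoop code. */
final class FsResolutionSketch {

  /** Models FileSystem.get(conf): always the configured default filesystem. */
  static URI resolveFromDefault(URI defaultFs) {
    return defaultFs;
  }

  /**
   * Models Path#getFileSystem(conf): use the path's own scheme and authority,
   * falling back to the default filesystem when the path has no scheme.
   */
  static URI resolveFromPath(URI path, URI defaultFs) {
    if (path.getScheme() == null) {
      return defaultFs;
    }
    String authority = path.getAuthority() == null ? "" : path.getAuthority();
    return URI.create(path.getScheme() + "://" + authority + "/");
  }

  public static void main(String[] args) {
    // Assumed setup: the job runs on one cluster, the restore dir lives on another.
    URI defaultFs = URI.create("hdfs://yarn-nn:8020");
    URI peerTmpDir = URI.create("hdfs://peer-nn:8020/tmp/snap");

    System.out.println(resolveFromDefault(defaultFs));          // hdfs://yarn-nn:8020
    System.out.println(resolveFromPath(peerTmpDir, defaultFs)); // hdfs://peer-nn:8020/
  }
}
```

In this model the default-based lookup ignores where the restore directory actually lives, which matches the failure described above when the three clusters do not share one HDFS.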

Looking forward to your reply, Thanks. 



[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-05 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15999276#comment-15999276
 ] 

Hudson commented on HBASE-16466:


FAILURE: Integrated in Jenkins build HBase-HBASE-14614 #223 (See 
[https://builds.apache.org/job/HBase-HBASE-14614/223/])
HBASE-16466 Snapshots support in VerifyReplication tool (apurtell: rev 
2de6b051f67b6a55eda8d4e247328fda24484adb)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationSmallTests.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormat.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java



[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-04 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997184#comment-15997184
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Thanks [~apurtell], I will submit a patch for 1.3. I tried and it's failing with 
the same error. It looks like I need to change the unit test for 1.3.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-03 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996162#comment-15996162
 ] 

Hudson commented on HBASE-16466:


SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #2946 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/2946/])
HBASE-16466 Snapshots support in VerifyReplication tool (apurtell: rev 
2de6b051f67b6a55eda8d4e247328fda24484adb)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationSmallTests.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormat.java



[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-03 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995943#comment-15995943
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Hi [~zghaobac] , 

I added a condition in doCommandLine to check for this. 
The user should provide all four peer-snapshot-related params or none. 
If the user misses any one of them, we print a validation error saying that all 
peer-snapshot-related parameters need to be provided. 

The condition is below. 
+  if (peerSnapshotName != null || peerSnapshotTmpDir != null || peerFSAddress != null
+      || peerHBaseRootAddress != null) {
+    if (peerSnapshotName == null || peerSnapshotTmpDir == null || peerFSAddress == null
+        || peerHBaseRootAddress == null) {
+      printUsage(
+        "Peer snapshot name, peer snapshot temp location, Peer HBase root address and  "
+        + "peer FSAddress should be provided to use snapshots in peer cluster");
+      return false;
+    }
+  }
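
The all-or-none rule in the condition above can also be expressed by counting how many of the options were supplied. The following is only a standalone sketch of that idea, not the code in the patch; the class PeerSnapshotArgsSketch, the helper allOrNone, and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.Objects;

/** Hypothetical, standalone illustration of an all-or-none option check. */
final class PeerSnapshotArgsSketch {

  /** Returns true when either every value is set or every value is null. */
  static boolean allOrNone(String... values) {
    long provided = Arrays.stream(values).filter(Objects::nonNull).count();
    return provided == 0 || provided == values.length;
  }

  public static void main(String[] args) {
    // No peer snapshot options set: a plain scanner-based run is still valid.
    System.out.println(allOrNone(null, null, null, null)); // true

    // Only two of the four set: the caller should print usage and stop.
    System.out.println(allOrNone("peerSnap", "/tmp/peer", null, null)); // false

    // All four set: a snapshot-based run against the peer is valid.
    System.out.println(allOrNone("peerSnap", "/tmp/peer",
        "hdfs://peer-nn:8020", "hdfs://peer-nn:8020/hbase")); // true
  }
}
```

Counting keeps the check correct if a fifth related option is ever added, whereas the paired || conditions have to be extended in two places.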


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-03 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995853#comment-15995853
 ] 

Guanghao Zhang commented on HBASE-16466:


What happens if the user forgets to set one of the snapshot options 
(peerFSAddress, peerSnapshotTmpDir...)? Is the default value null? 


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-03 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995748#comment-15995748
 ] 

Andrew Purtell commented on HBASE-16466:


I will commit this to master. 

Would you like to post a patch here for branch-1 [~sukuna...@gmail.com]? I did 
try to pick this back. There were some fixups needed and then 
TestReplicationSmallTests would not pass, something to do with filesystems and 
job configuration. So no auto-backport, sorry. 



[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-03 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995698#comment-15995698
 ] 

Andrew Purtell commented on HBASE-16466:


+1

v5 patch looks good.
Let me check locally then commit. Thanks [~sukuna...@gmail.com]


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-03 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995627#comment-15995627
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Hi [~apurtell] 

I added a new patch, v5. In it I removed the coprocessor-related change, since 
the patch works fine for normal tables. 
I will file a separate JIRA for the coprocessor ClassNotFound issue related to 
Phoenix tables. 

Thanks
Sukumar


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-02 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993776#comment-15993776
 ] 

Andrew Purtell commented on HBASE-16466:


bq. We need TableMapReduceUtil.addTableCoprocessorJarsToClasspath() in the createSubmittableJob() method; otherwise we get a Phoenix coprocessor class-not-found error (I tried with a Phoenix table).

That's what I thought. This doesn't belong in this patch. It belongs on a 
separate JIRA with a clear description about what the change does. Let's get 
that committed first and then come back here, if you need it somehow for 
testing this change.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992648#comment-15992648
 ] 

Hadoop QA commented on HBASE-16466:
---

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 4m 4s | master passed |
| +1 | compile | 0m 40s | master passed |
| +1 | checkstyle | 0m 51s | master passed |
| +1 | mvneclipse | 0m 15s | master passed |
| +1 | findbugs | 1m 59s | master passed |
| +1 | javadoc | 0m 33s | master passed |
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 44s | the patch passed |
| +1 | javac | 0m 44s | the patch passed |
| +1 | checkstyle | 0m 51s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 31m 0s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 2m 10s | the patch passed |
| +1 | javadoc | 0m 28s | the patch passed |
| +1 | unit | 132m 48s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 41s | The patch does not generate ASF License warnings. |
| | | 178m 49s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12865885/HBASE-16466.v4.patch |
| JIRA Issue | HBASE-16466 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 188f0c9288f9 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / c8a7e80 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/6657/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/6657/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-02 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992413#comment-15992413
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Hi [~apurtell] , 

We need TableMapReduceUtil.addTableCoprocessorJarsToClasspath() in the 
createSubmittableJob() method; otherwise we get a Phoenix coprocessor 
class-not-found error (I tried with a Phoenix table). 

Oh, my bad. I fixed 99% of the nit issues in the v2 patch, along with removing 
the unwanted deletedDir() in the createSubmittableJob() method, but I missed 
them when I generated the v3 patch. 
Attaching the v4 patch with all nit issues and unwanted code addressed. 


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-01 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15991793#comment-15991793
 ] 

Andrew Purtell commented on HBASE-16466:


The change to TableMapReduceUtil is unrelated. Please remove it. If you want 
this change in the code, file a new JIRA for it and we can get that 
committed separately. It's a nontrivial behavioral change.

Sorry to pick on nits, but this patch does not conform to our coding standards. 
See https://hbase.apache.org/book.html#common.patch.feedback, which refers to 
https://www.oracle.com/technetwork/java/index-135089.html . That's a lot to 
read, so let me specifically address what bothers the eye:

Braces on the same line, please.

{code}
if () {
} else { 
}
{code}

not 

{code}
if () 
{
}
else
{ 
}
{code}

Spaces between keywords and parentheses please.

Don't comment out code
{code}
  //deleteDirectories(conf, snapshotTempPath);
{code}
If you don't want it in there, remove it. 

If anything, the changes made to a file should look the same as the code above 
and below. 


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-01 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15991344#comment-15991344
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

I also fixed the two minor issues below, found while working on this, in the 
same patch (I hope that's fine; if not, I will put up a separate patch):

# HBASE-17900 - VerifyReplication: input param variables declared as static 
cause issues when running VerifyReplication multiple times in a single JVM 
(mainly for unit tests).
# HBASE-17899 - VerifyReplication does not exit for invalid arguments even 
though we have a check for them.
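
To illustrate the first item: parsed arguments kept in static fields survive from one run to the next inside the same JVM, while instance fields start clean for every run. This is a minimal, hypothetical sketch of that effect; StaticStateSketch, StaticTool, and InstanceTool are illustrative names and do not reflect VerifyReplication's actual fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of static vs. instance state across repeated runs. */
final class StaticStateSketch {

  /** Static field: shared by every run in the JVM, so values accumulate. */
  static final class StaticTool {
    static final List<String> families = new ArrayList<>();

    static int run(String... args) {
      families.addAll(Arrays.asList(args));
      return families.size();
    }
  }

  /** Instance field: each new tool object starts from a clean state. */
  static final class InstanceTool {
    private final List<String> families = new ArrayList<>();

    int run(String... args) {
      families.addAll(Arrays.asList(args));
      return families.size();
    }
  }

  public static void main(String[] args) {
    System.out.println(StaticTool.run("cf1"));         // 1
    System.out.println(StaticTool.run("cf2"));         // 2: "cf1" leaked from the first run
    System.out.println(new InstanceTool().run("cf1")); // 1
    System.out.println(new InstanceTool().run("cf2")); // 1: no leftover state
  }
}
```

A unit test class that invokes the tool several times in one JVM hits exactly the second case, which is why the static declarations mainly showed up as a problem in tests.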



[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-01 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15991274#comment-15991274
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

FYI  [~lhofhansl] / [~churromorales]


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-05-01 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990837#comment-15990837
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

[~apurtell], could you please take a look at this?


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989648#comment-15989648
 ] 

Hadoop QA commented on HBASE-16466:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
53s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
31s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
33s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 17 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
59m 1s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 138m 44s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
43s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 225m 43s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncSnapshotAdminApi |
| Timed out junit tests | org.apache.hadoop.hbase.client.TestFromClientSide |
|   | org.apache.hadoop.hbase.TestMovedRegionsCleaner |
|   | org.apache.hadoop.hbase.TestRegionRebalancing |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12865624/HBASE-16466.v3.patch |
| JIRA Issue | HBASE-16466 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 7ba7efa1a49b 4.8.3-std-1 #1 SMP Fri Oct 21 11:15:43 UTC 2016 
x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 6edb8f8 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6627/artifact/patchprocess/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6627/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/6627/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6627/testReport/ |
| modules | C: hbase-server U: hbase-server |
|

[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989441#comment-15989441
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Added v3 as a single commit. It also removes the unused imports and fixes the 
formatting issues (I think so :) ).


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989439#comment-15989439
 ] 

Ted Yu commented on HBASE-16466:


[~apurtell]:
Can you take a look at the patch ?


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989352#comment-15989352
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Yes, it looks like that created a mess. Please give me some time. I will come up 
with v3 shortly as a single commit and will also double-check the formatting issues. 


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989330#comment-15989330
 ] 

Ted Yu commented on HBASE-16466:


In the middle of patch v2:
{code}
Subject: [PATCH 2/2] #HBASE-16466 Snapshot support in VerifyRep - review
 comments related to format and unused imports
{code}
Judging from the size of v2, looks like you put two patches in one file.
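
A quick way to check this before attaching a patch is to count the per-commit headers in the file. The sketch below is a hypothetical helper (the class name and the sample lines are made up for illustration; it is not part of HBase or its dev-support scripts). It relies only on the fact that git format-patch writes one "Subject: [PATCH ...]" header per commit.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase or its dev-support scripts.
class PatchCommitCounter {
  // Counts the per-commit "Subject: [PATCH" headers written by git format-patch.
  static long countCommits(List<String> patchLines) {
    return patchLines.stream()
        .filter(line -> line.startsWith("Subject: [PATCH"))
        .count();
  }

  public static void main(String[] args) {
    // Made-up sample lines; a real check would read the patch file instead.
    List<String> lines = Arrays.asList(
        "Subject: [PATCH 1/2] first commit",
        "+some added line",
        "Subject: [PATCH 2/2] second commit",
        "+another added line");
    System.out.println(countCommits(lines) + " commit(s) in patch file");
  }
}
```

A count above one means the file carries several commits and should be squashed into a single commit before upload.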


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989325#comment-15989325
 ] 

Ted Yu commented on HBASE-16466:


A few places from v2 where comment was not addressed.
Position of curly and long line:
{code}
+if(peerSnapshotName!=null)
+{
...
+  LOG.info("Using peer snapshot-"+peerSnapshotName +" with temp 
dir:"+peerSnapshotTmpDir +" peer root uri:"+FSUtils.getRootDir(peerConf)
{code}
Unused import:
{code}
+import org.apache.hadoop.hbase.client.Table;
{code}
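
For reference, the formatting being asked for looks roughly like the sketch below: opening brace on the same line as the if, spaces around operators, and the long string concatenation wrapped so that no line runs past the limit (assumed here to be 100 characters). The class and method are hypothetical stand-ins, and plain strings replace LOG and FSUtils.getRootDir(peerConf) so that the example runs on its own.

```java
// Hypothetical stand-in for the reviewed snippet; plain strings replace the HBase types.
class SnapshotLogFormat {
  static String describePeerSnapshot(String peerSnapshotName, String peerSnapshotTmpDir,
      String peerRootUri) {
    // Brace stays on the same line as the condition.
    if (peerSnapshotName != null) {
      // Long message is wrapped at the operators instead of running past the line limit.
      return "Using peer snapshot: " + peerSnapshotName
          + " with temp dir: " + peerSnapshotTmpDir
          + " peer root uri: " + peerRootUri;
    }
    return "No peer snapshot configured";
  }

  public static void main(String[] args) {
    System.out.println(describePeerSnapshot("peerSnap", "/tmp/peer", "hdfs://peer/hbase"));
  }
}
```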


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989316#comment-15989316
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Yes, I did that across all my changes. I used the Eclipse Apache code formatter 
for that. 


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989313#comment-15989313
 ] 

Ted Yu commented on HBASE-16466:


Can you check patch v2 to see if my comments are addressed ?

I didn't comment on every place where curly braces are added - please check all 
the places.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989312#comment-15989312
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Hi Ted, 

Thanks a lot for round-the-clock reviews. 
Added release note. 
Is there anything else I have to do?


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988519#comment-15988519
 ] 

Ted Yu commented on HBASE-16466:


Please fill out release note.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988459#comment-15988459
 ] 

Ted Yu commented on HBASE-16466:


I downloaded 16466.v2.patch and found little difference between v1 and v2.

Please check.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988204#comment-15988204
 ] 

Ted Yu commented on HBASE-16466:


I was looking at:
https://reviews.apache.org/r/58827/diff/1#index_header


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988195#comment-15988195
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

I added a new diff to the review request with the changes you suggested: 
https://reviews.apache.org/r/58827/file/1421/ 
I am using Review Board for the first time, so please let me know if I did 
something wrong. 


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988192#comment-15988192
 ] 

Ted Yu commented on HBASE-16466:


I looked at review board which doesn't seem to address my comments above.

Keeping the new test in TestReplicationSmallTests is fine - the whole test took 
110 seconds in QA run.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988158#comment-15988158
 ] 

Hadoop QA commented on HBASE-16466:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
7s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
46s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
46s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 18 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 51s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 108m 22s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 148m 46s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.snapshot.TestMobSecureExportSnapshot |
|   | hadoop.hbase.snapshot.TestMobExportSnapshot |
|   | hadoop.hbase.snapshot.TestSecureExportSnapshot |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12865447/HBASE-16466.v2.patch |
| JIRA Issue | HBASE-16466 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux eec18e56f4a3 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 4bc0eb3 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6613/artifact/patchprocess/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6613/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/6613/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6613/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output |

[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988027#comment-15988027
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Hi Ted, 

Attached v2 patch with format fixes. 

I ran the tests without the snapshot tests and they took 76 sec; with the snapshot 
unit tests included they took 87 sec, so that is only about 10 sec more. 
I think that if I separate them into their own test class, we will pay the 
cost of setting up two clusters a second time, which will take longer. 

---
 T E S T S
---
Running org.apache.hadoop.hbase.replication.TestReplicationSmallTests
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 76.917 sec - 
in org.apache.hadoop.hbase.replication.TestReplicationSmallTests


---
 T E S T S
---
Running org.apache.hadoop.hbase.replication.TestReplicationSmallTests
Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.247 sec - 
in org.apache.hadoop.hbase.replication.TestReplicationSmallTests
Results :
Tests run: 16, Failures: 0, Errors: 0, Skipped: 0

I ran just the snapshot test and it took 25 sec, including setup.

---
 T E S T S
---
Running org.apache.hadoop.hbase.replication.TestReplicationSmallTests
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 25.409 sec - in 
org.apache.hadoop.hbase.replication.TestReplicationSmallTests
Results :
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

154 sec for one test is too much; it might have been a bad run. Can you please try 
again?

> HBase snapshots support in VerifyReplication tool to reduce load on live 
> HBase cluster with large tables
> 
>
> Key: HBASE-16466
> URL: https://issues.apache.org/jira/browse/HBASE-16466
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase
>Affects Versions: 0.98.21
>Reporter: Sukumar Maddineni
>Assignee: Maddineni Sukumar
> Fix For: 1.3.1
>
> Attachments: HBASE-16466.branch-1.3.001.patch, HBASE-16466.v1.patch, 
> HBASE-16466.v2.patch
>
>
> As of now VerifyReplication tool is running using normal HBase scanners. If 
> you want to run VerifyReplication multiple times on a production live 
> cluster with large tables then it creates extra load on HBase layer. So if we 
> implement snapshot based support then both in source and target we can read 
> data from snapshots which reduces load on HBase



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987963#comment-15987963
 ] 

Ted Yu commented on HBASE-16466:


On Apache Jenkins:
{code}
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 106 sec - in 
org.apache.hadoop.hbase.replication.TestReplicationSmallTests
{code}
On my Mac, TestReplicationSmallTests#testVerifyReplicationWithSnapshotSupport 
took:
{code}
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.749 sec - 
in org.apache.hadoop.hbase.replication.TestReplicationSmallTests
{code}
Can you create a new replication test class to host the new tests ?

I looked at the review board. Other than format suggestions, I don't have much 
- so I will wait for the next patch.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987955#comment-15987955
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Thanks Ted for initial review. 

Added review request - https://reviews.apache.org/r/58827/  (I think it's not 
too late :) )

restoreDefaults() resets all input variables to their defaults before every run, 
because they are static variables. 
If they are static then you cannot run two VerifyReplication jobs at the same 
time (which we are planning to do). 
I made all of them instance-level variables so multiple jobs won't impact each 
other, and restoreDefaults() is no longer needed.

I will update patch with other improvements you suggested.
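
A minimal sketch of the static-versus-instance point above. The class and method names are hypothetical stand-ins written for illustration, not the real VerifyReplication code; only the idea (static option fields are shared by every job in the JVM, instance fields are not) comes from the comment.

```java
// Hypothetical stand-ins for illustration; not the real VerifyReplication class.
final class OptionIsolationDemo {
  public static void main(String[] args) {
    StaticOptionsJob s1 = new StaticOptionsJob();
    StaticOptionsJob s2 = new StaticOptionsJob();
    s1.parseArgs(100L);
    s2.parseArgs(200L);
    // Both jobs report the value parsed last, because the field is shared.
    System.out.println("static:   " + s1.getStartTime() + " / " + s2.getStartTime());

    InstanceOptionsJob i1 = new InstanceOptionsJob();
    InstanceOptionsJob i2 = new InstanceOptionsJob();
    i1.parseArgs(100L);
    i2.parseArgs(200L);
    // Each job keeps its own value, so no reset step is needed between runs.
    System.out.println("instance: " + i1.getStartTime() + " / " + i2.getStartTime());
  }
}

final class StaticOptionsJob {
  // One copy per JVM: every job instance reads and writes the same value.
  static long startTime = 0;

  void parseArgs(long start) {
    startTime = start;
  }

  long getStartTime() {
    return startTime;
  }
}

final class InstanceOptionsJob {
  // One copy per job instance.
  private long startTime = 0;

  void parseArgs(long start) {
    this.startTime = start;
  }

  long getStartTime() {
    return startTime;
  }
}
```

With static fields, a reset such as restoreDefaults() is the only way to avoid leaking options from one run into the next, and it still does not help two jobs that run concurrently.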


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987941#comment-15987941
 ] 

Ted Yu commented on HBASE-16466:


{code}
+import org.apache.hadoop.hbase.client.HTable;
{code}
The above is not used.
{code}
+  TableName tableName) throws IOException
+  {
{code}
Normally the opening curly brace is at the end of the previous line (applies to 
several other places).
{code}
+  long startTime = 0;
+  long endTime = Long.MAX_VALUE;
{code}
Why are the above no longer static ?
{code}
+  String peerFSAddress = null;
+  String peerHBaseRootAddress = null;
{code}
Add comment explaining what the above are for.
{code}
+  endRow = 
((TableSnapshotInputFormat.TableSnapshotRegionSplit)tableSplit).getRegionInfo().getEndKey();
{code}
Wrap the long line (100 chars max).
Enclose the line in a pair of curly braces (for both the if and the else).
{code}
+  scan.setStopRow(endRow);
{code}
The above can be lifted out of if / else since it is common.
{code}
-restoreDefaults();
{code}
Why is the above dropped ?
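
A sketch of what the formatting suggestions above could look like once applied: braces on both branches, the long cast wrapped, and the common setStopRow call lifted out of the if / else. All types here are hypothetical stubs so the example compiles on its own; they only imitate the shape of the calls quoted in the review and are not the HBase classes.

```java
import java.util.Arrays;

// Hypothetical stubs that imitate the shape of the reviewed code; not HBase classes.
final class StopRowDemo {
  static void applyStopRow(ScanStub scan, Object split) {
    byte[] endRow;
    if (split instanceof SnapshotSplitStub) {
      endRow = ((SnapshotSplitStub) split)
          .getRegionInfo().getEndKey();
    } else {
      endRow = ((TableSplitStub) split).getEndRow();
    }
    // Common to both branches, so it is written once after the if / else.
    scan.setStopRow(endRow);
  }

  public static void main(String[] args) {
    ScanStub scan = new ScanStub();
    applyStopRow(scan, new SnapshotSplitStub(new RegionInfoStub(new byte[] {9})));
    System.out.println(Arrays.toString(scan.stopRow));
  }
}

final class RegionInfoStub {
  private final byte[] endKey;

  RegionInfoStub(byte[] endKey) {
    this.endKey = endKey;
  }

  byte[] getEndKey() {
    return endKey;
  }
}

final class SnapshotSplitStub {
  private final RegionInfoStub regionInfo;

  SnapshotSplitStub(RegionInfoStub regionInfo) {
    this.regionInfo = regionInfo;
  }

  RegionInfoStub getRegionInfo() {
    return regionInfo;
  }
}

final class TableSplitStub {
  private final byte[] endRow;

  TableSplitStub(byte[] endRow) {
    this.endRow = endRow;
  }

  byte[] getEndRow() {
    return endRow;
  }
}

final class ScanStub {
  byte[] stopRow;

  void setStopRow(byte[] row) {
    this.stopRow = row;
  }
}
```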


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987892#comment-15987892
 ] 

Ted Yu commented on HBASE-16466:


Can you put patch on review board ?


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987890#comment-15987890
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Hi [~yuzhih...@gmail.com] , 

Added a new patch for the master branch and verified the unit tests as well. 

Added one more validation in the new patch against the master branch: the 
recompare option and the snapshot options cannot be used together. Recompare 
re-reads the actual tables, assuming a mismatch is due to some delay and a retry 
may pass. With snapshots, recomparing against the actual tables is not valid, and 
recomparing against the same snapshot won't change anything because snapshots are 
immutable. 
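
The validation described above can be sketched as a simple mutual-exclusion check. The parameter names below (a recompare sleep in milliseconds, a source and a peer snapshot name) are assumptions made for illustration; the comment does not say how the patch names or stores these options.

```java
// Illustrative only: parameter names are assumptions, not taken from the patch.
final class SnapshotOptionCheck {
  /** Rejects the combination of recompare and snapshot-based reads. */
  static void validate(int recompareSleepMs, String sourceSnapshot, String peerSnapshot) {
    boolean usesSnapshots = sourceSnapshot != null || peerSnapshot != null;
    if (recompareSleepMs > 0 && usesSnapshots) {
      // A retry cannot change the outcome: the snapshot contents are immutable.
      throw new IllegalArgumentException(
          "recompare cannot be combined with snapshot options");
    }
  }

  public static void main(String[] args) {
    validate(0, "srcSnap", "peerSnap");   // snapshots only: accepted
    validate(5000, null, null);           // recompare only: accepted
    try {
      validate(5000, "srcSnap", "peerSnap");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```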



[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987620#comment-15987620
 ] 

Ted Yu commented on HBASE-16466:


Please attach patch for master branch.
As you said, the new test would pass since master branch uses hadoop 2.7.1


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987608#comment-15987608
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Hi Ted, 

In my local setup I tried HBase 0.98 and HBase 1.3 with Hadoop 2.7.3 and the test 
passes (that's our setup :(). 
When I try Hadoop 2.5.1, which is the default configured for HBase 1.3, it 
fails with a jar-not-found error: 
"java.io.FileNotFoundException: File does not exist: 
hdfs://localhost:51085/Users/smaddineni/.m2/repository/org/apache/hbase/hbase-common/1.3.1/hbase-common-1.3.1.jar"

I am looking into it. I tried by not calling 
addTableCoprocessorJarsToClasspath() and still no luck.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987590#comment-15987590
 ] 

Ted Yu commented on HBASE-16466:


testVerifyReplicationWithSnapshotSupport fails locally - please fix the test in 
the next patch.
Check whether addTableCoprocessorJarsToClasspath() works as expected.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986139#comment-15986139
 ] 

Hadoop QA commented on HBASE-16466:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
52s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} branch-1.3 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} branch-1.3 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
6s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} branch-1.3 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 50s 
{color} | {color:red} hbase-server in branch-1.3 has 1 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} branch-1.3 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} branch-1.3 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 51 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
16m 44s {color} | {color:green} The patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 13s 
{color} | {color:red} hbase-server generated 1 new + 1 unchanged - 0 fixed = 2 
total (was 1) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 100m 10s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 148m 55s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  Write to static field 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication.startTime from 
instance method 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication.doCommandLine(String[])
  At VerifyReplication.java:from instance method 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication.doCommandLine(String[])
  At VerifyReplication.java:[line 384] |
| Failed junit tests | hadoop.hbase.replication.TestReplicationSmallTests |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce

[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-27 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986079#comment-15986079
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Ted, here are the perf numbers I have so far. I will get numbers on load impact. 

I ran the tests below on an 8-node cluster using a Phoenix table with 
SALT_BUCKETS=16. 

Rows    NORMAL      WITH_SNAPSHOTS
----------------------------------
1m      1m16s       36s
10m     6m15s       1m13s
500m    5h20m30s    8m40s

With snapshots I am able to complete the VerifyReplication job on 500m rows in 
under 9 minutes instead of more than 5 hours using the normal table scan approach. 
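
To make the ratios explicit, here is a small sketch that converts the durations reported above into seconds and computes the speedup per row count. The helper names are made up for this example; the input values are the ones from the table.

```java
import java.util.Locale;

final class DurationSpeedup {
  /** Parses durations such as "5h20m30s" or "36s" into seconds. */
  static long toSeconds(String duration) {
    long total = 0;
    long value = 0;
    for (char c : duration.toCharArray()) {
      if (Character.isDigit(c)) {
        value = value * 10 + (c - '0');
      } else if (c == 'h') {
        total += value * 3600;
        value = 0;
      } else if (c == 'm') {
        total += value * 60;
        value = 0;
      } else if (c == 's') {
        total += value;
        value = 0;
      } else {
        throw new IllegalArgumentException("unexpected unit: " + c);
      }
    }
    return total;
  }

  /** Ratio of normal-scan time to snapshot-based time. */
  static double speedup(String normal, String withSnapshots) {
    return (double) toSeconds(normal) / toSeconds(withSnapshots);
  }

  public static void main(String[] args) {
    String[][] runs = {
      {"1m", "1m16s", "36s"},
      {"10m", "6m15s", "1m13s"},
      {"500m", "5h20m30s", "8m40s"},
    };
    for (String[] run : runs) {
      System.out.println(String.format(Locale.ROOT, "%s rows: %.1fx faster",
          run[0], speedup(run[1], run[2])));
    }
  }
}
```

On these numbers the gain grows with table size: roughly 2x at 1m rows, 5x at 10m rows and 37x at 500m rows.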


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985969#comment-15985969
 ] 

Ted Yu commented on HBASE-16466:


Can you put the patch on review board ?

Normally we start with patch for master branch.

Can you measure the reduction in load using your patch ?

Thanks


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-10 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963874#comment-15963874
 ] 

Guanghao Zhang commented on HBASE-16466:


bq. For example if you want to compare data upto 11AM then take snapshot in 
both clusters after 11AM and then run VerifyReplication using both snapshots.
[~sukuna...@gmail.com] Do you mean using the same timestamp (11AM) when reading the 
snapshots? Looking forward to your patch.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-10 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963364#comment-15963364
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Hi [~zghaobac], in this tool we are not taking snapshots. We simply use 
existing snapshots, given as input params, to compare the state at that time across 
peers. 
For example, if you want to compare data up to 11AM, take a snapshot in both 
clusters after 11AM and then run VerifyReplication using both snapshots. You 
can run this as many times as you want and get the same result, as snapshots won't 
change with live data. 

This is useful to reduce load on HBase while running on large tables, and also 
to run the same job multiple times to debug data mismatch issues caused by 
replication or something else. 
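
A toy model of the 11AM example, under one stated assumption: the comparison is also bounded by the cutoff timestamp, so writes that landed after 11AM on only one side are ignored. That bound is an assumption of this sketch, not something the comment spells out. Cells are modeled as plain strings and a snapshot as an immutable copy; none of this is HBase API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SnapshotCompareDemo {
  /** Encodes one cell as "row@timestamp=value"; a simplified stand-in for a cell. */
  static String cell(String row, long ts, String value) {
    return row + "@" + ts + "=" + value;
  }

  static long timestampOf(String cell) {
    return Long.parseLong(cell.substring(cell.indexOf('@') + 1, cell.indexOf('=')));
  }

  /** A snapshot is modeled as an immutable copy of the table when it is taken. */
  static List<String> snapshot(List<String> liveTable) {
    return Collections.unmodifiableList(new ArrayList<>(liveTable));
  }

  private static Set<String> upTo(List<String> cells, long endTime) {
    Set<String> kept = new HashSet<>();
    for (String c : cells) {
      if (timestampOf(c) < endTime) {
        kept.add(c);
      }
    }
    return kept;
  }

  /** Counts cells older than endTime that exist on only one side. */
  static int mismatches(List<String> source, List<String> peer, long endTime) {
    Set<String> a = upTo(source, endTime);
    Set<String> b = upTo(peer, endTime);
    Set<String> onlyA = new HashSet<>(a);
    onlyA.removeAll(b);
    Set<String> onlyB = new HashSet<>(b);
    onlyB.removeAll(a);
    return onlyA.size() + onlyB.size();
  }

  public static void main(String[] args) {
    long cutoff = 1100L; // stands in for 11AM
    List<String> sourceLive = new ArrayList<>();
    sourceLive.add(cell("r1", 1000L, "a"));
    sourceLive.add(cell("r2", 1050L, "b"));
    List<String> peerLive = new ArrayList<>(sourceLive); // replicated so far
    sourceLive.add(cell("r3", 1105L, "c"));              // after cutoff, not yet replicated

    List<String> sourceSnap = snapshot(sourceLive);
    List<String> peerSnap = snapshot(peerLive);
    sourceLive.add(cell("r4", 1200L, "d"));              // later live write; snapshots unaffected

    System.out.println("bounded by cutoff: " + mismatches(sourceSnap, peerSnap, cutoff));
    System.out.println("unbounded:         " + mismatches(sourceSnap, peerSnap, Long.MAX_VALUE));
  }
}
```

Because both inputs are immutable, rerunning the comparison gives the same counts every time, which is the property the comment relies on for debugging mismatches.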



[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-10 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963348#comment-15963348
 ] 

Maddineni Sukumar commented on HBASE-16466:
---

Yes I would like to... Will submit patch in a day... 
Sorry for the confusion [~andrew.purt...@gmail.com].
I will delete [~sukumaddineni]  id and keep only [~sukuna...@gmail.com]. 




[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-04-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963339#comment-15963339
 ] 

Andrew Purtell commented on HBASE-16466:


Do you still want to do this, [~sukuna...@gmail.com] / [~sukumaddineni]? Also, 
you should consider deleting one of your JIRA profiles so we don't get 
confused. :-) 


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2016-08-24 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434476#comment-15434476
 ] 

Guanghao Zhang commented on HBASE-16466:


If the table always has new data, how do you make sure the snapshots are the same in 
both clusters?


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2016-08-23 Thread Sukumar Maddineni (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433329#comment-15433329
 ] 

Sukumar Maddineni commented on HBASE-16466:
---

Yes, you are correct [~zghaobac]. Take a snapshot on both sides and read from the 
snapshots. That will save some load on HBase in both clusters. Again, this is an 
option the user can choose or not; it is off by default. 




[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2016-08-22 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431893#comment-15431893
 ] 

Guanghao Zhang commented on HBASE-16466:


You are not in the list of contributors for the project, so you can't see the 
"Assign to me" button.


[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2016-08-22 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431886#comment-15431886
 ] 

Guanghao Zhang commented on HBASE-16466:


Do you mean using TableSnapshotScanner to read data from HDFS directly?

> HBase snapshots support in VerifyReplication tool to reduce load on live 
> HBase cluster with large tables
> 
>
> Key: HBASE-16466
> URL: https://issues.apache.org/jira/browse/HBASE-16466
> Project: HBase
> Issue Type: Improvement
> Components: hbase
> Affects Versions: 0.98.21
> Reporter: Sukumar Maddineni
>
> Currently, the VerifyReplication tool runs using normal HBase scanners. If 
> you want to run VerifyReplication multiple times on a live production 
> cluster with large tables, it creates extra load on the HBase layer. If we 
> implement snapshot-based support, we can read data from snapshots on both 
> the source and the target, which reduces the load on HBase.




[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2016-08-22 Thread Sukumar Maddineni (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431366#comment-15431366
 ] 

Sukumar Maddineni commented on HBASE-16466:
---

I am working on this, but I am not getting the option to assign it to myself.

> HBase snapshots support in VerifyReplication tool to reduce load on live 
> HBase cluster with large tables
> 
>
> Key: HBASE-16466
> URL: https://issues.apache.org/jira/browse/HBASE-16466
> Project: HBase
> Issue Type: Improvement
> Components: hbase
> Affects Versions: 0.98.21
> Reporter: Sukumar Maddineni
>
> Currently, the VerifyReplication tool runs using normal HBase scanners. If 
> you want to run VerifyReplication multiple times on a live production 
> cluster with large tables, it creates extra load on the HBase layer. If we 
> implement snapshot-based support, we can read data from snapshots on both 
> the source and the target, which reduces the load on HBase.



