[jira] [Commented] (HBASE-15557) document SyncTable in ref guide

2018-11-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677952#comment-16677952
 ] 

Hadoop QA commented on HBASE-15557:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
16s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m 
11s{color} | {color:blue} branch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  4m 
47s{color} | {color:blue} patch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-15557 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12947209/HBASE-15557.master.002.patch
 |
| Optional Tests |  dupname  asflicense  refguide  |
| uname | Linux 3ce3fcc68a65 4.4.0-137-generic #163-Ubuntu SMP Mon Sep 24 
13:14:43 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 6d46b8d256 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| refguide | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14980/artifact/patchprocess/branch-site/book.html
 |
| refguide | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14980/artifact/patchprocess/patch-site/book.html
 |
| Max. process+thread count | 93 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14980/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> document SyncTable in ref guide
> ---
>
> Key: HBASE-15557
> URL: https://issues.apache.org/jira/browse/HBASE-15557
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.2.0
>Reporter: Sean Busbey
>Assignee: Wellington Chevreuil
>Priority: Critical
> Attachments: HBASE-15557.master.001.patch, 
> HBASE-15557.master.002.patch
>
>
> The docs for SyncTable are insufficient. Brief description from [~davelatham] 
> HBASE-13639 comment:
> {quote}
> Sorry for the lack of better documentation, Abhishek Soni. Thanks for 
> bringing it up. I'll try to provide a better explanation. You may have 
> already seen it, but if not, the design doc linked in the description above 
> may also give you some better clues as to how it should be used.
> Briefly, the feature is intended to start with a pair of tables in remote 
> clusters that are already substantially similar and make them identical by 
> comparing hashes of the data and copying only the diffs instead of having to 
> copy the entire table. So it is targeted at a very specific use case (with 
> some work it could generalize to cover things like CopyTable and 
> VerifyRepliaction but it's not there yet). To use it, you choose one table to 
> be the "source", and the other table is the "target". After the process is 
> complete the target table should end up being identical to the source table.
> In the source table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.HashTable and pass it the name of the 
> source table and an output directory in HDFS. HashTable will scan the source 
> table, break the data up into row key ranges (default of 8kB 

[jira] [Commented] (HBASE-15557) document SyncTable in ref guide

2018-11-07 Thread Wellington Chevreuil (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677913#comment-16677913
 ] 

Wellington Chevreuil commented on HBASE-15557:
--

Thanks for noticing that, [~busbey], actually I had submitted new patch file 
with email info corrected.

> document SyncTable in ref guide
> ---
>
> Key: HBASE-15557
> URL: https://issues.apache.org/jira/browse/HBASE-15557
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.2.0
>Reporter: Sean Busbey
>Assignee: Wellington Chevreuil
>Priority: Critical
> Attachments: HBASE-15557.master.001.patch, 
> HBASE-15557.master.002.patch
>
>
> The docs for SyncTable are insufficient. Brief description from [~davelatham] 
> HBASE-13639 comment:
> {quote}
> Sorry for the lack of better documentation, Abhishek Soni. Thanks for 
> bringing it up. I'll try to provide a better explanation. You may have 
> already seen it, but if not, the design doc linked in the description above 
> may also give you some better clues as to how it should be used.
> Briefly, the feature is intended to start with a pair of tables in remote 
> clusters that are already substantially similar and make them identical by 
> comparing hashes of the data and copying only the diffs instead of having to 
> copy the entire table. So it is targeted at a very specific use case (with 
> some work it could generalize to cover things like CopyTable and 
> VerifyRepliaction but it's not there yet). To use it, you choose one table to 
> be the "source", and the other table is the "target". After the process is 
> complete the target table should end up being identical to the source table.
> In the source table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.HashTable and pass it the name of the 
> source table and an output directory in HDFS. HashTable will scan the source 
> table, break the data up into row key ranges (default of 8kB per range) and 
> produce a hash of the data for each range.
> Make the hashes available to the target cluster - I'd recommend using DistCp 
> to copy it across.
> In the target table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.SyncTable and pass it the directory where 
> you put the hashes, and the names of the source and destination tables. You 
> will likely also need to specify the source table's ZK quorum via the 
> --sourcezkcluster option. SyncTable will then read the hash information, and 
> compute the hashes of the same row ranges for the target table. For any row 
> range where the hash fails to match, it will open a remote scanner to the 
> source table, read the data for that range, and do Puts and Deletes to the 
> target table to update it to match the source.
> I hope that clarifies it a bit. Let me know if you need a hand. If anyone 
> wants to work on getting some documentation into the book, I can try to write 
> some more but would love a hand on turning it into an actual book patch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15557) document SyncTable in ref guide

2018-11-06 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677001#comment-16677001
 ] 

Sean Busbey commented on HBASE-15557:
-

looks great. I'm all set to push this. [~wchevreuil] is the email address on 
the current patch the one you want attribution to go to?

> document SyncTable in ref guide
> ---
>
> Key: HBASE-15557
> URL: https://issues.apache.org/jira/browse/HBASE-15557
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.2.0
>Reporter: Sean Busbey
>Assignee: Wellington Chevreuil
>Priority: Critical
> Attachments: HBASE-15557.master.001.patch
>
>
> The docs for SyncTable are insufficient. Brief description from [~davelatham] 
> HBASE-13639 comment:
> {quote}
> Sorry for the lack of better documentation, Abhishek Soni. Thanks for 
> bringing it up. I'll try to provide a better explanation. You may have 
> already seen it, but if not, the design doc linked in the description above 
> may also give you some better clues as to how it should be used.
> Briefly, the feature is intended to start with a pair of tables in remote 
> clusters that are already substantially similar and make them identical by 
> comparing hashes of the data and copying only the diffs instead of having to 
> copy the entire table. So it is targeted at a very specific use case (with 
> some work it could generalize to cover things like CopyTable and 
> VerifyRepliaction but it's not there yet). To use it, you choose one table to 
> be the "source", and the other table is the "target". After the process is 
> complete the target table should end up being identical to the source table.
> In the source table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.HashTable and pass it the name of the 
> source table and an output directory in HDFS. HashTable will scan the source 
> table, break the data up into row key ranges (default of 8kB per range) and 
> produce a hash of the data for each range.
> Make the hashes available to the target cluster - I'd recommend using DistCp 
> to copy it across.
> In the target table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.SyncTable and pass it the directory where 
> you put the hashes, and the names of the source and destination tables. You 
> will likely also need to specify the source table's ZK quorum via the 
> --sourcezkcluster option. SyncTable will then read the hash information, and 
> compute the hashes of the same row ranges for the target table. For any row 
> range where the hash fails to match, it will open a remote scanner to the 
> source table, read the data for that range, and do Puts and Deletes to the 
> target table to update it to match the source.
> I hope that clarifies it a bit. Let me know if you need a hand. If anyone 
> wants to work on getting some documentation into the book, I can try to write 
> some more but would love a hand on turning it into an actual book patch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15557) document SyncTable in ref guide

2018-11-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676973#comment-16676973
 ] 

Hadoop QA commented on HBASE-15557:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
 9s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  6m  
0s{color} | {color:blue} branch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m 
44s{color} | {color:blue} patch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-15557 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945650/HBASE-15557.master.001.patch
 |
| Optional Tests |  dupname  asflicense  refguide  |
| uname | Linux f7729664dd37 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 86cbbdea9e |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| refguide | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14962/artifact/patchprocess/branch-site/book.html
 |
| refguide | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14962/artifact/patchprocess/patch-site/book.html
 |
| Max. process+thread count | 93 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14962/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> document SyncTable in ref guide
> ---
>
> Key: HBASE-15557
> URL: https://issues.apache.org/jira/browse/HBASE-15557
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.2.0
>Reporter: Sean Busbey
>Assignee: Wellington Chevreuil
>Priority: Critical
> Attachments: HBASE-15557.master.001.patch
>
>
> The docs for SyncTable are insufficient. Brief description from [~davelatham] 
> HBASE-13639 comment:
> {quote}
> Sorry for the lack of better documentation, Abhishek Soni. Thanks for 
> bringing it up. I'll try to provide a better explanation. You may have 
> already seen it, but if not, the design doc linked in the description above 
> may also give you some better clues as to how it should be used.
> Briefly, the feature is intended to start with a pair of tables in remote 
> clusters that are already substantially similar and make them identical by 
> comparing hashes of the data and copying only the diffs instead of having to 
> copy the entire table. So it is targeted at a very specific use case (with 
> some work it could generalize to cover things like CopyTable and 
> VerifyRepliaction but it's not there yet). To use it, you choose one table to 
> be the "source", and the other table is the "target". After the process is 
> complete the target table should end up being identical to the source table.
> In the source table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.HashTable and pass it the name of the 
> source table and an output directory in HDFS. HashTable will scan the source 
> table, break the data up into row key ranges (default of 8kB per range) and 
> produce a hash of 

[jira] [Commented] (HBASE-15557) document SyncTable in ref guide

2018-10-25 Thread Wellington Chevreuil (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664224#comment-16664224
 ] 

Wellington Chevreuil commented on HBASE-15557:
--

Attached patch with my proposed description for HashTable/SyncTable on the ref 
guide. Please review and let me know on any suggestions.

> document SyncTable in ref guide
> ---
>
> Key: HBASE-15557
> URL: https://issues.apache.org/jira/browse/HBASE-15557
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.2.0
>Reporter: Sean Busbey
>Assignee: Wellington Chevreuil
>Priority: Critical
> Attachments: HBASE-15557.master.001.patch
>
>
> The docs for SyncTable are insufficient. Brief description from [~davelatham] 
> HBASE-13639 comment:
> {quote}
> Sorry for the lack of better documentation, Abhishek Soni. Thanks for 
> bringing it up. I'll try to provide a better explanation. You may have 
> already seen it, but if not, the design doc linked in the description above 
> may also give you some better clues as to how it should be used.
> Briefly, the feature is intended to start with a pair of tables in remote 
> clusters that are already substantially similar and make them identical by 
> comparing hashes of the data and copying only the diffs instead of having to 
> copy the entire table. So it is targeted at a very specific use case (with 
> some work it could generalize to cover things like CopyTable and 
> VerifyRepliaction but it's not there yet). To use it, you choose one table to 
> be the "source", and the other table is the "target". After the process is 
> complete the target table should end up being identical to the source table.
> In the source table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.HashTable and pass it the name of the 
> source table and an output directory in HDFS. HashTable will scan the source 
> table, break the data up into row key ranges (default of 8kB per range) and 
> produce a hash of the data for each range.
> Make the hashes available to the target cluster - I'd recommend using DistCp 
> to copy it across.
> In the target table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.SyncTable and pass it the directory where 
> you put the hashes, and the names of the source and destination tables. You 
> will likely also need to specify the source table's ZK quorum via the 
> --sourcezkcluster option. SyncTable will then read the hash information, and 
> compute the hashes of the same row ranges for the target table. For any row 
> range where the hash fails to match, it will open a remote scanner to the 
> source table, read the data for that range, and do Puts and Deletes to the 
> target table to update it to match the source.
> I hope that clarifies it a bit. Let me know if you need a hand. If anyone 
> wants to work on getting some documentation into the book, I can try to write 
> some more but would love a hand on turning it into an actual book patch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15557) document SyncTable in ref guide

2018-04-13 Thread Roland Teague (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437596#comment-16437596
 ] 

Roland Teague commented on HBASE-15557:
---

[~davelatham] Can you add documentation for this MR tool to the HBase Ref Guide 
on how to use the tool? This have been open for 2 years now.

> document SyncTable in ref guide
> ---
>
> Key: HBASE-15557
> URL: https://issues.apache.org/jira/browse/HBASE-15557
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.2.0
>Reporter: Sean Busbey
>Priority: Critical
>
> The docs for SyncTable are insufficient. Brief description from [~davelatham] 
> HBASE-13639 comment:
> {quote}
> Sorry for the lack of better documentation, Abhishek Soni. Thanks for 
> bringing it up. I'll try to provide a better explanation. You may have 
> already seen it, but if not, the design doc linked in the description above 
> may also give you some better clues as to how it should be used.
> Briefly, the feature is intended to start with a pair of tables in remote 
> clusters that are already substantially similar and make them identical by 
> comparing hashes of the data and copying only the diffs instead of having to 
> copy the entire table. So it is targeted at a very specific use case (with 
> some work it could generalize to cover things like CopyTable and 
> VerifyRepliaction but it's not there yet). To use it, you choose one table to 
> be the "source", and the other table is the "target". After the process is 
> complete the target table should end up being identical to the source table.
> In the source table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.HashTable and pass it the name of the 
> source table and an output directory in HDFS. HashTable will scan the source 
> table, break the data up into row key ranges (default of 8kB per range) and 
> produce a hash of the data for each range.
> Make the hashes available to the target cluster - I'd recommend using DistCp 
> to copy it across.
> In the target table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.SyncTable and pass it the directory where 
> you put the hashes, and the names of the source and destination tables. You 
> will likely also need to specify the source table's ZK quorum via the 
> --sourcezkcluster option. SyncTable will then read the hash information, and 
> compute the hashes of the same row ranges for the target table. For any row 
> range where the hash fails to match, it will open a remote scanner to the 
> source table, read the data for that range, and do Puts and Deletes to the 
> target table to update it to match the source.
> I hope that clarifies it a bit. Let me know if you need a hand. If anyone 
> wants to work on getting some documentation into the book, I can try to write 
> some more but would love a hand on turning it into an actual book patch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)