[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters

2017-01-13 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-15860:
-
Status: Patch Available  (was: In Progress)

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> ---
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: 0001-HBASE-15860.master.002.patch, 
> HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.patch
>
>
> HBASE-14280 introduced a fix for bulkload failures that occur when referring to a 
> remote cluster's name service id while bulkloading from an HA cluster.
> The HBASE-14280 solution in *FSHDFSUtils.getNNAddresses* was to invoke 
> *DFSUtil.getNNServiceRpcAddressesForCluster* instead of 
> *DFSUtil.getNNServiceRpcAddresses*. This works for hadoop 2.6 and above.
> The proposed change here is to use *DFSUtil.getRpcAddressesForNameserviceId* 
> instead, which already returns only the addresses for the specified 
> nameservice. This is available since hadoop 2.4.
> Sample proposal for FSHDFSUtils.getNNAddresses:
> ...
> {noformat}
> String nameServiceId = serviceName.split(":")[1];
> if (dfsUtilClazz == null) {
>   dfsUtilClazz = Class.forName("org.apache.hadoop.hdfs.DFSUtil");
> }
> if (getNNAddressesMethod == null) {
>   // Look the method up once and cache it for later calls.
>   getNNAddressesMethod =
>       dfsUtilClazz.getMethod("getRpcAddressesForNameserviceId",
>           Configuration.class, String.class, String.class);
> }
> // The method is static, so the target object passed to invoke() is null.
> @SuppressWarnings("unchecked")
> Map<String, InetSocketAddress> nnMap =
>     (Map<String, InetSocketAddress>) getNNAddressesMethod
>         .invoke(null, conf, nameServiceId, null);
> for (Map.Entry<String, InetSocketAddress> e2 : nnMap.entrySet()) {
>   InetSocketAddress addr = e2.getValue();
>   addresses.add(addr);
> }
> ...
> {noformat}
> Will also add test conditions for *FSHDFSUtils.isSameHdfs* to verify the 
> scenario where multiple name service ids are defined.
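
To illustrate what "only the addresses for the specified nameservice" means, here is a minimal, self-contained sketch. It is not the Hadoop implementation: a plain `Map<String, String>` stands in for `Configuration`, the class and method names (`NameserviceLookupSketch`, `nameserviceIdOf`, `rpcAddressesFor`) are hypothetical, and the host names are made up. It assumes the service name has the form `ha-hdfs:<nameservice id>` and that the usual HDFS HA keys `dfs.ha.namenodes.<nsId>` and `dfs.namenode.rpc-address.<nsId>.<nnId>` are set.

```java
import java.net.InetSocketAddress;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; a plain Map stands in for Hadoop's Configuration.
class NameserviceLookupSketch {

  /** Extracts the nameservice id from a service name such as "ha-hdfs:ns1". */
  public static String nameserviceIdOf(String serviceName) {
    String[] parts = serviceName.split(":");
    if (parts.length < 2 || parts[1].isEmpty()) {
      throw new IllegalArgumentException("No nameservice id in: " + serviceName);
    }
    return parts[1];
  }

  /** Returns namenode id -> RPC address for the given nameservice only. */
  public static Map<String, InetSocketAddress> rpcAddressesFor(
      Map<String, String> conf, String nsId) {
    Map<String, InetSocketAddress> result = new LinkedHashMap<>();
    String nnIds = conf.get("dfs.ha.namenodes." + nsId);
    if (nnIds == null) {
      return result; // unknown nameservice: nothing to return
    }
    for (String raw : nnIds.split(",")) {
      String nnId = raw.trim();
      String hostPort = conf.get("dfs.namenode.rpc-address." + nsId + "." + nnId);
      if (hostPort == null) {
        continue; // namenode listed but no RPC address configured
      }
      int idx = hostPort.lastIndexOf(':');
      if (idx < 0) {
        continue; // skip entries without an explicit port
      }
      String host = hostPort.substring(0, idx);
      int port = Integer.parseInt(hostPort.substring(idx + 1));
      // Unresolved on purpose: the sketch must not depend on DNS.
      result.put(nnId, InetSocketAddress.createUnresolved(host, port));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("dfs.nameservices", "ns1,ns2");
    conf.put("dfs.ha.namenodes.ns1", "nn1,nn2");
    conf.put("dfs.namenode.rpc-address.ns1.nn1", "host1.example.com:8020");
    conf.put("dfs.namenode.rpc-address.ns1.nn2", "host2.example.com:8020");
    conf.put("dfs.ha.namenodes.ns2", "nn1");
    conf.put("dfs.namenode.rpc-address.ns2.nn1", "remote1.example.com:8020");

    String nsId = nameserviceIdOf("ha-hdfs:ns1");
    // Only the ns1 namenodes are returned; ns2 entries are ignored.
    System.out.println(rpcAddressesFor(conf, nsId));
  }
}
```

With two nameservices defined, the lookup for `ns1` never sees the `ns2` namenodes, which is the property the multi-nameservice test conditions for *FSHDFSUtils.isSameHdfs* are meant to exercise.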



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters

2017-01-13 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-15860:
-
Attachment: 0001-HBASE-15860.master.002.patch

Attaching a rebased version of the patch, since the last one is already 6 
months old.

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> ---
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: 0001-HBASE-15860.master.002.patch, 
> HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.patch
>
>


[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters

2017-01-13 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-15860:
-
Status: In Progress  (was: Patch Available)

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> ---
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: 0001-HBASE-15860.master.002.patch, 
> HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.patch
>
>


[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters

2016-06-22 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez updated HBASE-15860:
--
Assignee: Wellington Chevreuil

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> ---
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.master.002.patch, 
> HBASE-15860.master.002.patch, HBASE-15860.patch
>
>


[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters

2016-06-13 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-15860:
-
Attachment: HBASE-15860.master.002.patch

Hi [~tedyu], these tests are passing locally. I'm re-attaching the patch; 
hopefully it will get a clean run now.

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> ---
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.master.002.patch, 
> HBASE-15860.master.002.patch, HBASE-15860.patch
>
>


[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters

2016-06-13 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-15860:
-
Attachment: HBASE-15860.master.002.patch

Thanks for reviewing it, [~esteban]! Attaching a new patch with the suggested 
changes.

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> ---
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.master.002.patch, HBASE-15860.patch
>
>


[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters

2016-05-19 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-15860:
-
Status: Patch Available  (was: Open)

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> ---
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.patch
>
>


[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters

2016-05-19 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-15860:
-
Attachment: HBASE-15860.patch

Proposed initial changes, using *DFSUtil.getRpcAddressesForNameserviceId*;

I didn't keep the catch for the case where the 
*DFSUtil.getRpcAddressesForNameserviceId* method is not available (hadoop 2.4 
or older), because that wouldn't work properly with HA nameservices anyway;

Added test conditions to validate scenarios where multiple nameservice ids are 
defined;

An additional thought: do we really need to invoke *DFSUtil* methods via 
reflection, instead of simply referring to them directly? Any comments on this?
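
As background for the reflection question, the sketch below shows the general pattern in isolation: a method is looked up by name at runtime so that its absence can be detected without a compile-time dependency on it. Everything here is hypothetical (`ReflectiveLookupSketch`, `StandInUtil`, `findStatic`, `invokeStatic`); `StandInUtil` merely stands in for a library class whose methods vary between versions, and nothing below is taken from the attached patch.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Optional;

// Hypothetical sketch of runtime method lookup; not code from the patch.
class ReflectiveLookupSketch {

  /** Stand-in for a library class whose API differs between versions. */
  public static class StandInUtil {
    public static String describe(String nsId) {
      return "addresses-for-" + nsId;
    }
  }

  /** Finds a public static method by name, or returns empty if it is absent. */
  public static Optional<Method> findStatic(
      String className, String methodName, Class<?>... paramTypes) {
    try {
      return Optional.of(Class.forName(className).getMethod(methodName, paramTypes));
    } catch (ClassNotFoundException | NoSuchMethodException e) {
      // The class or method does not exist in the library version on the classpath.
      return Optional.empty();
    }
  }

  /** Invokes a static method; the target object is null for static calls. */
  public static Object invokeStatic(Method method, Object... args) {
    try {
      return method.invoke(null, args);
    } catch (IllegalAccessException | InvocationTargetException e) {
      throw new IllegalStateException("Reflective call failed", e);
    }
  }

  public static void main(String[] args) {
    String cls = StandInUtil.class.getName();
    Optional<Method> present = findStatic(cls, "describe", String.class);
    Optional<Method> missing = findStatic(cls, "noSuchMethod", String.class);
    System.out.println(present.map(m -> invokeStatic(m, "ns1")).orElse("unavailable"));
    System.out.println(missing.isPresent());
  }
}
```

The trade-off is the one raised above: reflection tolerates a missing method at runtime, but it gives up compile-time checking of the method name and parameter types, which a direct reference would provide.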

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> ---
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.patch
>
>