[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
[ https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil updated HBASE-15860:
-----------------------------------------
    Status: Patch Available  (was: In Progress)

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> -------------------------------------------------------------------
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: 0001-HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.patch
>
>
> HBASE-14280 introduced a fix for bulkload failures that occur when referring
> to a remote cluster's name service id while "bulkloading" from an HA cluster.
> The HBASE-14280 solution in *FSHDFSUtils.getNNAddresses* was to invoke
> *DFSUtil.getNNServiceRpcAddressesForCluster* instead of
> *DFSUtil.getNNServiceRpcAddresses*. This works for Hadoop 2.6 and above.
> The proposed change here is to use *DFSUtil.getRpcAddressesForNameserviceId*
> instead, which already returns only the addresses for the specified
> nameservice. This is available since Hadoop 2.4.
> Sample proposal for FSHDFSUtils.getNNAddresses:
> ...
> {noformat}
> // The nameservice id is the part of the service name after the first ':'
> String nameServiceId = serviceName.split(":")[1];
> if (dfsUtilClazz == null) {
>   dfsUtilClazz = Class.forName("org.apache.hadoop.hdfs.DFSUtil");
> }
> if (getNNAddressesMethod == null) {
>   // Look the method up once and cache it for later calls
>   getNNAddressesMethod =
>       dfsUtilClazz.getMethod("getRpcAddressesForNameserviceId",
>           Configuration.class,
>           String.class, String.class);
> }
> // Static method, so the invocation target is null
> Map<String, InetSocketAddress> nnMap =
>     (Map<String, InetSocketAddress>) getNNAddressesMethod
>         .invoke(null, conf, nameServiceId, null);
> for (Map.Entry<String, InetSocketAddress> e2 : nnMap.entrySet()) {
>   InetSocketAddress addr = e2.getValue();
>   addresses.add(addr);
> }
> ...
> {noformat}
> Will also add test conditions for *FSHDFSUtils.isSameHdfs* to verify the
> scenario where multiple name service ids are defined.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
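The reflective lookup in the sample proposal can be tried in isolation. The following is a minimal sketch, not the HBase patch: `FakeDfsUtil` is a hypothetical stand-in for `org.apache.hadoop.hdfs.DFSUtil` so that the example runs without Hadoop on the classpath, a plain `Map<String, String>` stands in for `Configuration`, and the `ha-hdfs:<nameservice>` service name and host names are assumptions made for illustration.

```java
import java.lang.reflect.Method;
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class NameserviceLookupSketch {

  /** Hypothetical stand-in for DFSUtil; filters configured addresses by nameservice id. */
  public static class FakeDfsUtil {
    public static Map<String, InetSocketAddress> getRpcAddressesForNameserviceId(
        Map<String, String> conf, String nsId, String defaultValue) {
      Map<String, InetSocketAddress> result = new LinkedHashMap<>();
      String prefix = "dfs.namenode.rpc-address." + nsId + ".";
      for (Map.Entry<String, String> e : conf.entrySet()) {
        if (e.getKey().startsWith(prefix)) {
          String[] hostPort = e.getValue().split(":");
          // createUnresolved avoids any DNS lookup in this sketch
          result.put(e.getKey().substring(prefix.length()),
              InetSocketAddress.createUnresolved(hostPort[0], Integer.parseInt(hostPort[1])));
        }
      }
      return result;
    }
  }

  // Cached after the first lookup, as in the proposal
  private static Method getNNAddressesMethod;

  @SuppressWarnings("unchecked")
  public static Set<InetSocketAddress> getNNAddresses(String serviceName,
      Map<String, String> conf) {
    // Assumed service name shape: "ha-hdfs:<nameservice id>"
    String nameServiceId = serviceName.split(":")[1];
    try {
      if (getNNAddressesMethod == null) {
        Class<?> clazz = Class.forName(FakeDfsUtil.class.getName());
        getNNAddressesMethod = clazz.getMethod("getRpcAddressesForNameserviceId",
            Map.class, String.class, String.class);
      }
      // Static method, so the invocation target is null
      Map<String, InetSocketAddress> nnMap = (Map<String, InetSocketAddress>)
          getNNAddressesMethod.invoke(null, conf, nameServiceId, null);
      return new HashSet<>(nnMap.values());
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Reflective lookup failed", e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("dfs.namenode.rpc-address.ns1.nn1", "host1.example:8020");
    conf.put("dfs.namenode.rpc-address.ns1.nn2", "host2.example:8020");
    conf.put("dfs.namenode.rpc-address.ns2.nn1", "host3.example:8020");
    // Only the two ns1 addresses are expected in the result
    System.out.println(getNNAddresses("ha-hdfs:ns1", conf));
  }
}
```

Because the lookup is keyed by nameservice id, addresses configured for other nameservices (here `ns2`) never enter the returned set, which is the property the proposal relies on.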
[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
[ https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil updated HBASE-15860:
-----------------------------------------
    Attachment: 0001-HBASE-15860.master.002.patch

Attaching a rebased version of the patch, since the last one is already 6 months old.

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> -------------------------------------------------------------------
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: 0001-HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.patch
[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
[ https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil updated HBASE-15860:
-----------------------------------------
    Status: In Progress  (was: Patch Available)

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> -------------------------------------------------------------------
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: 0001-HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.patch
[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
[ https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Esteban Gutierrez updated HBASE-15860:
--------------------------------------
    Assignee: Wellington Chevreuil

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> -------------------------------------------------------------------
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.patch
[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
[ https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil updated HBASE-15860:
-----------------------------------------
    Attachment: HBASE-15860.master.002.patch

Hi [~tedyu], these tests are passing locally. I'm re-attaching the patch; hopefully it will get a clean run now.

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> -------------------------------------------------------------------
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.master.002.patch, HBASE-15860.master.002.patch, HBASE-15860.patch
[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
[ https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil updated HBASE-15860:
-----------------------------------------
    Attachment: HBASE-15860.master.002.patch

Thanks for reviewing it, [~esteban]! Attaching a new patch with the suggested changes.

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> -------------------------------------------------------------------
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.master.002.patch, HBASE-15860.patch
[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
[ https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil updated HBASE-15860:
-----------------------------------------
    Status: Patch Available  (was: Open)

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> -------------------------------------------------------------------
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.patch
[jira] [Updated] (HBASE-15860) Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
[ https://issues.apache.org/jira/browse/HBASE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil updated HBASE-15860:
-----------------------------------------
    Attachment: HBASE-15860.patch

Proposed initial changes, using *DFSUtil.getRpcAddressesForNameserviceId*.

I didn't leave a catch for the case where the *DFSUtil.getRpcAddressesForNameserviceId* method is not available (Hadoop releases older than 2.4), because that wouldn't work properly with HA nameservices anyway.

Added test conditions to validate scenarios where multiple nameservice ids are defined.

An additional thought is whether we really need to invoke *DFSUtil* methods via reflection, instead of simply referring to them directly. Any comments on this?

> Improvements for HBASE-14280 - Fixing Bulkload for HDFS HA Clusters
> -------------------------------------------------------------------
>
> Key: HBASE-15860
> URL: https://issues.apache.org/jira/browse/HBASE-15860
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 1.0.0
> Reporter: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-15860.patch
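The multiple-nameservice test scenario mentioned in this thread can be pictured with a small self-contained sketch. It assumes, for illustration only, that two HA file systems count as the same HDFS when their NameNode RPC address sets overlap; the class, the helper names, and the host names are hypothetical and are not taken from the patch or from `FSHDFSUtils`.

```java
import java.net.InetSocketAddress;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SameHdfsSketch {

  /** Returns true when the two address sets have at least one NameNode in common. */
  public static boolean sharesNameNode(Set<InetSocketAddress> src,
      Set<InetSocketAddress> dst) {
    return !Collections.disjoint(src, dst);
  }

  /** Builds a set of unresolved addresses from "host:port" strings (no DNS lookup). */
  public static Set<InetSocketAddress> addrs(String... hostPorts) {
    Set<InetSocketAddress> result = new HashSet<>();
    for (String hostPort : hostPorts) {
      int idx = hostPort.lastIndexOf(':');
      result.add(InetSocketAddress.createUnresolved(
          hostPort.substring(0, idx), Integer.parseInt(hostPort.substring(idx + 1))));
    }
    return result;
  }

  public static void main(String[] args) {
    // Local nameservice "ns1" and a remote nameservice "ns2", both defined in one config
    Set<InetSocketAddress> ns1 = addrs("nn1.local.example:8020", "nn2.local.example:8020");
    Set<InetSocketAddress> ns2 = addrs("nn1.remote.example:8020", "nn2.remote.example:8020");

    // Addresses gathered per nameservice: the two clusters are kept apart
    System.out.println(sharesNameNode(ns1, ns2));

    // Addresses gathered across all nameservices: every lookup returns the union,
    // so any two file systems appear to share a NameNode
    Set<InetSocketAddress> all = new HashSet<>(ns1);
    all.addAll(ns2);
    System.out.println(sharesNameNode(all, ns2));
  }
}
```

The second comparison shows why restricting the lookup to a single nameservice id matters once more than one nameservice is configured.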