[GitHub] [hbase] bitterfox commented on a change in pull request #2988: HBASE-25608 Support HFileOutputFormat locality sensitive even destination cluster is different from source cluster

2021-03-16 Thread GitBox


bitterfox commented on a change in pull request #2988:
URL: https://github.com/apache/hbase/pull/2988#discussion_r595719496



##
File path: 
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
##
@@ -657,6 +683,49 @@ public static void configureIncrementalLoadMap(Job job, 
TableDescriptor tableDes
 LOG.info("Incremental table " + tableDescriptor.getTableName() + " output 
configured.");
   }
 
+  /**
+   * Configure HBase cluster key to load region location for 
locality-sensitive it's enabled.
+   * It's not necessary to call this method explicitly when the cluster key 
for HBase cluster to be
+   * used to load region location is configured in the job configuration.
+   * Call this method when another HBase cluster key is configured in the job 
configuration.
+   * For example, you should call when you load data from HBase cluster A using
+   * {@link TableInputFormat} and generate hfiles for HBase cluster B.
+   * Otherwise, HFileOutputFormat2 fetch location from cluster A and 
locality-sensitive won't
+   * working correctly.
+   * {@link #configureIncrementalLoad(Job, Table, RegionLocator)} calls this 
method using
+   * {@link Table#getConfiguration} as clusterConf.
+   * See HBASE-25608.
+   *
+   * @param job which has configuration to be updated
+   * @param clusterConf which contains cluster key of the HBase cluster to be 
locality-sensitive
+   *
+   * @see #configureIncrementalLoad(Job, Table, RegionLocator)
+   * @see #LOCALITY_SENSITIVE_CONF_KEY
+   * @see #LOCALITY_SENSITIVE_ZOOKEEPER_QUORUM_CONF_KEY
+   * @see #LOCALITY_SENSITIVE_ZOOKEEPER_CLIENT_PORT_CONF_KEY
+   * @see #LOCALITY_SENSITIVE_ZOOKEEPER_ZNODE_PARENT_CONF_KEY
+   */
+  public static void configureLocalitySensitiveCluster(Job job, Configuration 
clusterConf) {

Review comment:
   Thank you! I'll fix and fill release note in the issue





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bitterfox commented on a change in pull request #2988: HBASE-25608 Support HFileOutputFormat locality sensitive even destination cluster is different from source cluster

2021-03-15 Thread GitBox


bitterfox commented on a change in pull request #2988:
URL: https://github.com/apache/hbase/pull/2988#discussion_r594873850



##
File path: 
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
##
@@ -657,6 +683,49 @@ public static void configureIncrementalLoadMap(Job job, 
TableDescriptor tableDes
 LOG.info("Incremental table " + tableDescriptor.getTableName() + " output 
configured.");
   }
 
+  /**
+   * Configure HBase cluster key to load region location for 
locality-sensitive it's enabled.
+   * It's not necessary to call this method explicitly when the cluster key 
for HBase cluster to be
+   * used to load region location is configured in the job configuration.
+   * Call this method when another HBase cluster key is configured in the job 
configuration.
+   * For example, you should call when you load data from HBase cluster A using
+   * {@link TableInputFormat} and generate hfiles for HBase cluster B.
+   * Otherwise, HFileOutputFormat2 fetch location from cluster A and 
locality-sensitive won't
+   * working correctly.
+   * {@link #configureIncrementalLoad(Job, Table, RegionLocator)} calls this 
method using
+   * {@link Table#getConfiguration} as clusterConf.
+   * See HBASE-25608.
+   *
+   * @param job which has configuration to be updated
+   * @param clusterConf which contains cluster key of the HBase cluster to be 
locality-sensitive
+   *
+   * @see #configureIncrementalLoad(Job, Table, RegionLocator)
+   * @see #LOCALITY_SENSITIVE_CONF_KEY
+   * @see #LOCALITY_SENSITIVE_ZOOKEEPER_QUORUM_CONF_KEY
+   * @see #LOCALITY_SENSITIVE_ZOOKEEPER_CLIENT_PORT_CONF_KEY
+   * @see #LOCALITY_SENSITIVE_ZOOKEEPER_ZNODE_PARENT_CONF_KEY
+   */
+  public static void configureLocalitySensitiveCluster(Job job, Configuration 
clusterConf) {

Review comment:
   Could you review the proposal release note?
   ```
   Added configurations to specify the ZK cluster key for the remote cluster in 
HFileOutputFormat2.
   These configurations are used to fetch region location for 
locality-sensitive HFile generation.
   Previously, it fetches from cluster which is configured in Job configuration,
   but it does not work well when HBase clusters for input and output are 
different.
   HFileOutputFormat2#configureIncrementalLoad(Job, Table, RegionLocator) 
configure them using Table's configuration.
   You can also configure them by calling 
HFileOutputFormat2#configureRemoteCluster explicitly.
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bitterfox commented on a change in pull request #2988: HBASE-25608 Support HFileOutputFormat locality sensitive even destination cluster is different from source cluster

2021-03-15 Thread GitBox


bitterfox commented on a change in pull request #2988:
URL: https://github.com/apache/hbase/pull/2988#discussion_r594846183



##
File path: 
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
##
@@ -657,6 +683,49 @@ public static void configureIncrementalLoadMap(Job job, 
TableDescriptor tableDes
 LOG.info("Incremental table " + tableDescriptor.getTableName() + " output 
configured.");
   }
 
+  /**
+   * Configure HBase cluster key to load region location for 
locality-sensitive it's enabled.
+   * It's not necessary to call this method explicitly when the cluster key 
for HBase cluster to be
+   * used to load region location is configured in the job configuration.
+   * Call this method when another HBase cluster key is configured in the job 
configuration.
+   * For example, you should call when you load data from HBase cluster A using
+   * {@link TableInputFormat} and generate hfiles for HBase cluster B.
+   * Otherwise, HFileOutputFormat2 fetch location from cluster A and 
locality-sensitive won't
+   * working correctly.
+   * {@link #configureIncrementalLoad(Job, Table, RegionLocator)} calls this 
method using
+   * {@link Table#getConfiguration} as clusterConf.
+   * See HBASE-25608.
+   *
+   * @param job which has configuration to be updated
+   * @param clusterConf which contains cluster key of the HBase cluster to be 
locality-sensitive
+   *
+   * @see #configureIncrementalLoad(Job, Table, RegionLocator)
+   * @see #LOCALITY_SENSITIVE_CONF_KEY
+   * @see #LOCALITY_SENSITIVE_ZOOKEEPER_QUORUM_CONF_KEY
+   * @see #LOCALITY_SENSITIVE_ZOOKEEPER_CLIENT_PORT_CONF_KEY
+   * @see #LOCALITY_SENSITIVE_ZOOKEEPER_ZNODE_PARENT_CONF_KEY
+   */
+  public static void configureLocalitySensitiveCluster(Job job, Configuration 
clusterConf) {

Review comment:
   Thank you for your review! It seems to me 'remote cluster' is a better 
name than what I had.
   I'll update PR and also prepare a release note soon





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org