bitterfox commented on a change in pull request #2988:
URL: https://github.com/apache/hbase/pull/2988#discussion_r594873850
##########
File path:
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
##########
@@ -657,6 +683,49 @@ public static void configureIncrementalLoadMap(Job job,
TableDescriptor tableDes
LOG.info("Incremental table " + tableDescriptor.getTableName() + " output
configured.");
}
+ /**
+ * Configure HBase cluster key to load region location for
locality-sensitive it's enabled.
+ * It's not necessary to call this method explicitly when the cluster key
for HBase cluster to be
+ * used to load region location is configured in the job configuration.
+ * Call this method when another HBase cluster key is configured in the job
configuration.
+ * For example, you should call when you load data from HBase cluster A using
+ * {@link TableInputFormat} and generate hfiles for HBase cluster B.
+ * Otherwise, HFileOutputFormat2 fetch location from cluster A and
locality-sensitive won't
+ * working correctly.
+ * {@link #configureIncrementalLoad(Job, Table, RegionLocator)} calls this
method using
+ * {@link Table#getConfiguration} as clusterConf.
+ * See HBASE-25608.
+ *
+ * @param job which has configuration to be updated
+ * @param clusterConf which contains cluster key of the HBase cluster to be
locality-sensitive
+ *
+ * @see #configureIncrementalLoad(Job, Table, RegionLocator)
+ * @see #LOCALITY_SENSITIVE_CONF_KEY
+ * @see #LOCALITY_SENSITIVE_ZOOKEEPER_QUORUM_CONF_KEY
+ * @see #LOCALITY_SENSITIVE_ZOOKEEPER_CLIENT_PORT_CONF_KEY
+ * @see #LOCALITY_SENSITIVE_ZOOKEEPER_ZNODE_PARENT_CONF_KEY
+ */
+ public static void configureLocalitySensitiveCluster(Job job, Configuration
clusterConf) {
Review comment:
Could you review the proposal release note?
```
Added configurations to specify the ZK cluster key for the remote cluster in
HFileOutputFormat2.
These configurations are used to fetch region location for
locality-sensitive HFile generation.
Previously, it fetches from cluster which is configured in Job configuration,
but it does not work well when HBase clusters for input and output are
different.
HFileOutputFormat2#configureIncrementalLoad(Job, Table, RegionLocator)
configure them using Table's configuration.
You can also configure them by calling
HFileOutputFormat2#configureRemoteCluster explicitly.
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]