lw309637554 commented on a change in pull request #1810:
URL: https://github.com/apache/hudi/pull/1810#discussion_r465133092
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -267,9 +267,16 @@ public Operation convert(String value) throws ParameterException {
description = "Should duplicate records from source be dropped/filtered out before insert/bulk-insert")
public Boolean filterDupes = false;
+ //will abandon in the future version, recommended use --enable-sync
Review comment:
Agreed, I will do it.
##########
File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -255,6 +262,43 @@ private[hudi] object HoodieSparkSqlWriter {
hiveSyncConfig
}
+ private def metaSync(parameters: Map[String, String],
+ basePath: Path,
+ hadoopConf: Configuration): Boolean = {
+ val hiveSyncEnabled = parameters.get(HIVE_SYNC_ENABLED_OPT_KEY).exists(r => r.toBoolean)
+ var metaSyncEnabled = parameters.get(HUDI_SYNC_ENABLED_OPT_KEY).exists(r => r.toBoolean)
+ var syncClientToolClass = parameters.get(SYNC_CLIENT_TOOL_CLASS).get
+ // for backward compatibility
+ if (hiveSyncEnabled) {
+ metaSyncEnabled = true
+ syncClientToolClass = DEFAULT_SYNC_CLIENT_TOOL_CLASS
Review comment:
Yes. When the user sets both hiveSyncEnabled and --sync-tool-classes, it
makes sense to sync to Hive and to the classes listed in
--sync-tool-classes. I will fix it.
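
The "sync both" behaviour agreed on above could be resolved roughly as in the sketch below. This is only an illustration, not the code in the PR: the class `SyncToolResolver`, the method `resolve`, and the comma-separated format of the tool list are assumptions, and `org.apache.hudi.hive.HiveSyncTool` is assumed to be the value behind `DEFAULT_SYNC_CLIENT_TOOL_CLASS`.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Hudi: decides which sync tools should run.
class SyncToolResolver {
  // Assumed value of DEFAULT_SYNC_CLIENT_TOOL_CLASS; the diff does not show it.
  static final String HIVE_SYNC_TOOL = "org.apache.hudi.hive.HiveSyncTool";

  static Set<String> resolve(boolean hiveSyncEnabled,
                             boolean metaSyncEnabled,
                             String syncToolClasses) {
    // LinkedHashSet keeps insertion order and drops duplicates, so listing
    // the Hive tool explicitly while the legacy flag is set runs it only once.
    Set<String> tools = new LinkedHashSet<>();
    if (hiveSyncEnabled) {
      // Backward compatibility: the legacy flag always implies Hive sync.
      tools.add(HIVE_SYNC_TOOL);
    }
    // The legacy flag also turns meta sync on, as in the diff above.
    boolean enabled = hiveSyncEnabled || metaSyncEnabled;
    if (enabled && syncToolClasses != null) {
      for (String name : syncToolClasses.split(",")) {
        String trimmed = name.trim();
        if (!trimmed.isEmpty()) {
          tools.add(trimmed);
        }
      }
    }
    return tools;
  }
}
```

With this shape, setting only the legacy Hive flag together with a custom class (for example a hypothetical `com.example.CustomSyncTool`) yields both tools, instead of the custom class being overwritten by the default.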
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamerMetrics.java
##########
@@ -67,10 +77,15 @@ String getMetricsName(String action, String metric) {
return config == null ? null : String.format("%s.%s.%s", tableName, action, metric);
}
- public void updateDeltaStreamerMetrics(long durationInNs, long hiveSyncNs) {
+ public void updateDeltaStreamerMetrics(long durationInNs, long syncNs, boolean hiveSync) {
if (config.isMetricsOn()) {
Metrics.registerGauge(getMetricsName("deltastreamer", "duration"), getDurationInMs(durationInNs));
- Metrics.registerGauge(getMetricsName("deltastreamer", "hiveSyncDuration"), getDurationInMs(hiveSyncNs));
+ if (hiveSync) {
+ Metrics.registerGauge(getMetricsName("deltastreamer", "hiveSyncDuration"), getDurationInMs(syncNs));
+ } else {
+ Metrics.registerGauge(getMetricsName("deltastreamer", "metaSyncDuration"), getDurationInMs(syncNs));
Review comment:
I have done this. Each sync tool class now has its own metric, named
after the sync class.
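
One way to derive a per-tool metric name is sketched below. It reuses the `"%s.%s.%s"` pattern from `getMetricsName` in the diff, but the class `SyncMetricNames`, the method `syncDurationMetric`, and the `<SimpleClassName>Duration` suffix are assumptions made for illustration; the naming actually used in the PR may differ.

```java
// Hypothetical helper, not part of Hudi: builds one metric name per sync tool.
class SyncMetricNames {
  // Strips the package prefix, e.g. "a.b.Foo" becomes "Foo".
  static String simpleName(String className) {
    int lastDot = className.lastIndexOf('.');
    return lastDot < 0 ? className : className.substring(lastDot + 1);
  }

  // Mirrors the "%s.%s.%s" (table, action, metric) pattern from getMetricsName,
  // with the metric part derived from the sync tool's class name.
  static String syncDurationMetric(String tableName, String syncToolClass) {
    String metric = simpleName(syncToolClass) + "Duration";
    return String.format("%s.%s.%s", tableName, "deltastreamer", metric);
  }
}
```

Keying the gauge on the class name removes the need for the `boolean hiveSync` parameter, since every sync tool reports under its own name.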
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -442,7 +449,8 @@ private void refreshTimeline() throws IOException {
long overallTimeMs = overallTimerContext != null ? overallTimerContext.stop() : 0;
// Send DeltaStreamer Metrics
- metrics.updateDeltaStreamerMetrics(overallTimeMs, hiveSyncTimeMs);
+ metrics.updateDeltaStreamerMetrics(overallTimeMs, hiveSyncTimeMs, true);
+ metrics.updateDeltaStreamerMetrics(overallTimeMs, metaSyncTimeMs, false);
Review comment:
OK, I have done this in syncMeta.