joha0123 opened a new issue #3941:
URL: https://github.com/apache/hudi/issues/3941
Our team wants to switch all ETL processes to Hudi, but I have the problem
that I cannot find a way to access the Hudi metrics.
What I am trying to achieve is that after a commit I have a
HoodieMetrics object from which I can pick the metrics I am interested
in. In the end I need to log the metrics into a database after each commit.
What is the way to do this? I tried setting `hoodie.metrics.on=true` and
`hoodie.metrics.type=INMEMORY`, but I don't understand how I can access the
metrics then.
The write looks something like this:
```scala
def targetTableWriteOptions: Map[String, String] = Map(
  OPERATION.key() -> writeOperation,
  RECORDKEY_FIELD.key() -> targetTable.recordKeyColumns.mkString(","),
  PRECOMBINE_FIELD.key() -> targetTable.precombineColumn,
  PARTITIONPATH_FIELD.key() -> targetTable.partitionColumns.mkString(","),
  TABLE_NAME.key() -> targetTable.qualifiedName,
  "hoodie.table.name" -> targetTable.qualifiedName,
  HIVE_DATABASE.key() -> targetTable.database,
  HIVE_TABLE.key() -> targetTable.tableName,
  HIVE_PARTITION_FIELDS.key() -> targetTable.partitionColumns.mkString(",")
)

def loadFromSourceToTargetTable(): Unit = {
  sourceDataframe.write.format("org.apache.hudi")
    .options(HudiUtils.basicOptions ++ HudiUtils.basicHiveSyncOptions ++
      targetTableWriteOptions ++ HudiUtils.hudiOptionsFromSparkConf)
    .mode(saveMode)
    .save(targetTable.basePath)
}
```
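One workaround I have considered, since the DataFrame writer does not hand back a HoodieMetrics object: every commit leaves a JSON metadata file under `<basePath>/.hoodie`, which contains per-partition write stats. The sketch below reads the latest completed `.commit` file and sums a numeric stat field from it. This is my own helper, not a Hudi API, and the field names (e.g. `numWrites`, `totalWriteBytes`) are assumptions about the commit-metadata layout; please check them against a real `.commit` file before relying on this.

```scala
import java.nio.file.{Files, Path, Paths}

// Sketch only: pulls a numeric write-stat field out of the latest completed
// commit-metadata file under <basePath>/.hoodie. Field names such as
// "numWrites" are assumptions about the JSON layout of the commit file.
object CommitMetrics {

  // Newest .commit file; Hudi instant times are timestamps, so the file
  // names sort lexicographically in commit order.
  def latestCommitFile(basePath: String): Option[Path] = {
    val hoodieDir = Paths.get(basePath, ".hoodie")
    if (!Files.isDirectory(hoodieDir)) None
    else
      Files.list(hoodieDir).toArray.map(_.asInstanceOf[Path])
        .filter(_.getFileName.toString.endsWith(".commit"))
        .sortBy(_.getFileName.toString)
        .lastOption
  }

  // Sum one numeric field across all write stats in the latest commit,
  // using a simple regex so no JSON library is required.
  def sumField(basePath: String, field: String): Long =
    latestCommitFile(basePath).map { p =>
      val json = new String(Files.readAllBytes(p), "UTF-8")
      ("\"" + field + "\"\\s*:\\s*(\\d+)").r
        .findAllMatchIn(json)
        .map(_.group(1).toLong)
        .sum
    }.getOrElse(0L)
}
```

After `loadFromSourceToTargetTable()` one could then call something like `CommitMetrics.sumField(targetTable.basePath, "numWrites")` and write the result to the database. If Hudi has a supported way to get at the INMEMORY metrics registry instead, I would prefer that.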
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]