SrinivasMote opened a new issue, #12854:
URL: https://github.com/apache/hudi/issues/12854
**Describe the problem you faced**
We are seeing a large number of messages in the Spark log file, one for every Parquet file the Spark job writes to S3 with our Hudi configuration.

We are currently on Hudi 0.11.0, where these messages only appear if we enable `hoodie.write.commit.callback.on=true`. In Hudi 0.15.0, however, this logging appears to happen by default, and the callback feature seems to have been moved to different classes.
```
INFO internal.DataSourceInternalWriterHelper: Received commit of a data writer = HoodieWriterCommitMessage{writeStatuses=[WriteStatus {fileId=, writeStat=HoodieWriteStat{fileId='', path='.parquet', prevCommit='null', numWrites= , numDeletes=0, numUpdateWrites=0, totalWriteBytes= , totalWriteErrors=0, tempPath='null', cdcStats='null', partitionPath='', totalLogRecords=0, totalLogFilesCompacted=0, totalLogSizeCompacted=0, totalUpdatedRecordsCompacted=0, totalLogBlocks=0, totalCorruptLogBlock=0, totalRollbackBlocks=0}, globalError='null', hasErrors='false', errorCount='0', errorPct='0.0'}]}
```
**To Reproduce**
Steps to reproduce the behavior:
1. Run any Spark job with Hudi 0.15.0 and you will see these INFO messages for every Parquet file written.
**Environment Description**
* Hudi version : 0.15.0
* Spark version : 3.5.1
* EMR Version : 7.3
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no
**Additional context**
I have tried disabling the following parameters, but this did not help:
```
hoodie.cleaner.incremental.mode=false
hoodie.embed.timeline.server=false
```

My Hudi configs:
```
class org.apache.hudi.spark3.internal.HoodieDataSourceInternalTable,
hoodie.cleaner.incremental.mode=true,
hoodie.payload.ordering.field=trepprealid,
hoodie.sensitive.config.keys=ssl,tls,sasl,auth,credentials,
hoodie.datasource.write.insert.drop.duplicates=false,
hoodie.index.hbase.qps.allocator.class=org.apache.hudi.index.hbase.DefaultHBaseQPSResourceAllocator,
hoodie.clustering.plan.strategy.single.group.clustering.enabled=true,
hoodie.memory.merge.fraction=0.6,
hoodie.client.init.callback.classes=,
hoodie.bucket.index.num.buckets=256,
hoodie.datasource.hive_sync.database=presentation_dev,
hoodie.filesystem.view.remote.port=26754,
hoodie.metrics.lock.enable=false,
hoodie.metadata.record.index.max.filegroup.count=10000,
hoodie.global.simple.index.parallelism=100,
hoodie.clustering.schedule.inline=false,
hoodie.combine.before.insert=false,
hoodie.clustering.inline.max.commits=4,
hoodie.fail.writes.on.inline.table.service.exception=true,
hoodie.write.lock.zookeeper.connection_timeout_ms=15000,
hoodie.bloom.index.keys.per.bucket=10000000,
hoodie.write.concurrency.async.conflict.detector.period_ms=30000,
hoodie.datasource.write.row.writer.enable=true,
hoodie.embed.timeline.server=true,
hoodie.parquet.small.file.limit=104857600
```
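If the goal is simply to silence these messages rather than change write behavior, raising the log level of the emitting logger may help. A minimal sketch of a log4j2 properties fragment, assuming the logger's fully qualified name is `org.apache.hudi.internal.DataSourceInternalWriterHelper` (inferred from the `internal.DataSourceInternalWriterHelper` prefix in the log line; verify the exact class name against your Hudi build):

```properties
# Hypothetical log4j2.properties fragment: raise the level of the logger
# that emits the per-file "Received commit of a data writer" messages.
# The fully qualified logger name below is an assumption based on the
# shortened name shown in the log output.
logger.hudiWriterHelper.name = org.apache.hudi.internal.DataSourceInternalWriterHelper
logger.hudiWriterHelper.level = warn
```

On EMR this would typically be applied through the Spark log4j2 configuration classification or a custom `log4j2.properties` shipped with the job.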
**Stacktrace**
```Add the stacktrace of the error.```