SrinivasMote opened a new issue, #12854:
URL: https://github.com/apache/hudi/issues/12854

   **Describe the problem you faced** 
   We are seeing a large number of messages in the Spark log file for every 
Parquet file the Spark job writes to S3 with our Hudi configuration.
   We are currently on Hudi 0.11.0, where these log messages only appear when 
we enable `hoodie.write.commit.callback.on=true`. In Hudi 0.15.0, however, this 
appears to be the default behavior, and the callback feature has been moved to 
different objects.
   
   ```
   INFO internal.DataSourceInternalWriterHelper: Received commit of a data writer = HoodieWriterCommitMessage{writeStatuses=[WriteStatus {fileId=, writeStat=HoodieWriteStat{fileId='', path='.parquet', prevCommit='null', numWrites= , numDeletes=0, numUpdateWrites=0, totalWriteBytes= , totalWriteErrors=0, tempPath='null', cdcStats='null', partitionPath='', totalLogRecords=0, totalLogFilesCompacted=0, totalLogSizeCompacted=0, totalUpdatedRecordsCompacted=0, totalLogBlocks=0, totalCorruptLogBlock=0, totalRollbackBlocks=0}, globalError='null', hasErrors='false', errorCount='0', errorPct='0.0'}]}
   ```
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Run any Spark job with Hudi 0.15.0 and you will see these log messages.
   
   
   **Environment Description**
   
   * Hudi version : 0.15.0
   
   * Spark version : 3.5.1
   
   * EMR Version : 7.3
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) :no
   
   
   **Additional context**
   I have tried disabling the following parameters, but this did not help:
   hoodie.cleaner.incremental.mode=false
   hoodie.embed.timeline.server=false
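
   Since the noisy lines are INFO-level messages from `DataSourceInternalWriterHelper`, one possible workaround (a sketch, assuming EMR 7.x's Log4j2-based Spark logging and assuming the logger name matches the full class name `org.apache.hudi.internal.DataSourceInternalWriterHelper`) is to raise that logger's threshold in the `log4j2.properties` shipped with the Spark job:

   ```properties
   # Hypothetical log4j2.properties fragment: suppress the per-file
   # "Received commit of a data writer" INFO messages from the Hudi
   # writer helper while leaving other Hudi logging at its default level.
   logger.hudiWriterHelper.name = org.apache.hudi.internal.DataSourceInternalWriterHelper
   logger.hudiWriterHelper.level = warn
   ```

   Note this only suppresses the messages in the log output; it does not change the commit-callback behavior itself.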
   
   My Hudi configs:
   ```
   class org.apache.hudi.spark3.internal.HoodieDataSourceInternalTable,
   hoodie.cleaner.incremental.mode=true,
   hoodie.payload.ordering.field=trepprealid,
   hoodie.sensitive.config.keys=ssl,tls,sasl,auth,credentials,
   hoodie.datasource.write.insert.drop.duplicates=false,
   hoodie.index.hbase.qps.allocator.class=org.apache.hudi.index.hbase.DefaultHBaseQPSResourceAllocator,
   hoodie.clustering.plan.strategy.single.group.clustering.enabled=true,
   hoodie.memory.merge.fraction=0.6,
   hoodie.client.init.callback.classes=,
   hoodie.bucket.index.num.buckets=256,
   hoodie.datasource.hive_sync.database=presentation_dev,
   hoodie.filesystem.view.remote.port=26754,
   hoodie.metrics.lock.enable=false,
   hoodie.metadata.record.index.max.filegroup.count=10000,
   hoodie.global.simple.index.parallelism=100,
   hoodie.clustering.schedule.inline=false,
   hoodie.combine.before.insert=false,
   hoodie.clustering.inline.max.commits=4,
   hoodie.fail.writes.on.inline.table.service.exception=true,
   hoodie.write.lock.zookeeper.connection_timeout_ms=15000,
   hoodie.bloom.index.keys.per.bucket=10000000,
   hoodie.write.concurrency.async.conflict.detector.period_ms=30000,
   hoodie.datasource.write.row.writer.enable=true,
   hoodie.embed.timeline.server=true,
   hoodie.parquet.small.file.limit=104857600
   ```
   
   **Stacktrace**
   
   N/A. These are INFO-level log messages; there is no stacktrace.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
