yhyyz commented on issue #6900:
URL: https://github.com/apache/hudi/issues/6900#issuecomment-1286415788

   @nsivabalan  Thanks for your help.
   1. Using Structured Streaming with multiple streaming queries instead of `forEachBatch`, together with the properties below: the application has been running for 19 hours without any errors. But if I set `hoodie.embed.timeline.server=true`, this error occurs: `UpsertPartitioner: Error trying to compute average bytes/record,... Caused by: java.io.FileNotFoundException: No such file or directory ..../.hoodie/....commit`.
   ```
   hoodie.datasource.hive_sync.enable=false
   hoodie.upsert.shuffle.parallelism=20
   hoodie.insert.shuffle.parallelism=20
   hoodie.keep.min.commits=6
   hoodie.keep.max.commits=7
   hoodie.parquet.small.file.limit=52428800
   hoodie.index.type=GLOBAL_BLOOM
   
hoodie.datasource.write.payload.class=org.apache.hudi.common.model.DefaultHoodieRecordPayload
   
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.NonpartitionedKeyGenerator
   hoodie.metadata.enable=true
   hoodie.cleaner.commits.retained=3
   hoodie.clean.async=false
   hoodie.clean.automatic=true
   hoodie.archive.async=false
   hoodie.datasource.compaction.async.enable=true
   hoodie.write.markers.type=DIRECT
   hoodie.embed.timeline.server=false
   hoodie.embed.timeline.server.async=false
   ```
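
   As I understand it, the clean/archive settings above need to satisfy the Hudi rule that `hoodie.cleaner.commits.retained` is smaller than `hoodie.keep.min.commits`, which in turn must not exceed `hoodie.keep.max.commits`. A minimal sketch (plain Python; `check_retention` is a hypothetical helper, not a Hudi API) checking the values I use:

   ```python
   # Hypothetical sanity check (not part of Hudi) for the relationship Hudi
   # expects between cleaning and archival settings:
   #   hoodie.cleaner.commits.retained < hoodie.keep.min.commits <= hoodie.keep.max.commits
   def check_retention(props: dict) -> bool:
       retained = int(props["hoodie.cleaner.commits.retained"])
       keep_min = int(props["hoodie.keep.min.commits"])
       keep_max = int(props["hoodie.keep.max.commits"])
       return retained < keep_min <= keep_max

   props = {
       "hoodie.cleaner.commits.retained": "3",
       "hoodie.keep.min.commits": "6",
       "hoodie.keep.max.commits": "7",
   }
   print(check_retention(props))  # -> True
   ```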
   2. Using `forEachBatch` with multiple threads and inline compaction enabled instead of offline compaction, with the properties below: the same error occurs (`UpsertPartitioner: Error trying to compute average bytes/record,... Caused by: java.io.FileNotFoundException: No such file or directory ..../.hoodie/....commit`), but the application keeps running. I will set `hoodie.embed.timeline.server=false` and test again; I will sync any new findings here.
   ```
   hoodie.datasource.hive_sync.enable=false
   hoodie.upsert.shuffle.parallelism=20
   hoodie.insert.shuffle.parallelism=20
   hoodie.keep.min.commits=6
   hoodie.keep.max.commits=7
   hoodie.parquet.small.file.limit=52428800
   hoodie.index.type=GLOBAL_BLOOM
   
hoodie.datasource.write.payload.class=org.apache.hudi.common.model.DefaultHoodieRecordPayload
   
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.NonpartitionedKeyGenerator
   hoodie.metadata.enable=true
   hoodie.cleaner.commits.retained=3
   hoodie.clean.max.commits=5
   hoodie.clean.async=false
   hoodie.clean.automatic=true
   hoodie.archive.async=false
   hoodie.compact.inline=true
   hoodie.datasource.compaction.async.enable=false
   hoodie.write.markers.type=DIRECT
   hoodie.embed.timeline.server=true
   hoodie.embed.timeline.server.async=false
   hoodie.compact.schedule.inline=false
   hoodie.compact.inline.max.delta.commits=2
   ```
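
   For completeness, I pass these properties to the writer as plain key=value options. A minimal sketch (plain Python; `parse_props` is a hypothetical helper, not a Hudi or Spark API) of turning such a block into the options dict that would be handed to `df.write.format("hudi").options(**opts)`:

   ```python
   # Hypothetical helper: parse a key=value properties block (like the ones
   # pasted above) into a dict of writer options.
   def parse_props(text: str) -> dict:
       opts = {}
       for line in text.splitlines():
           line = line.strip()
           if not line or line.startswith("#"):
               continue
           key, _, value = line.partition("=")
           opts[key.strip()] = value.strip()
       return opts

   sample = """
   hoodie.compact.inline=true
   hoodie.compact.inline.max.delta.commits=2
   """
   opts = parse_props(sample)
   print(opts["hoodie.compact.inline.max.delta.commits"])  # -> 2
   ```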

