hudi-bot opened a new issue, #16458:
URL: https://github.com/apache/hudi/issues/16458

   The set of configuration parameters for the compaction service is confusing.
   
   In HoodieCompactionConfig:
   * hoodie.compact.inline
   * hoodie.compact.schedule.inline
   * hoodie.log.compaction.enable
   * hoodie.log.compaction.inline
   * hoodie.compact.inline.max.delta.commits
   * hoodie.compact.inline.max.delta.seconds
   * hoodie.compact.inline.trigger.strategy
   * hoodie.parquet.small.file.limit
   * hoodie.record.size.estimation.threshold
   * hoodie.compaction.target.io
   * hoodie.compaction.logfile.size.threshold
   * hoodie.compaction.logfile.num.threshold
   * hoodie.compaction.strategy
   * hoodie.compaction.daybased.target.partitions
   * hoodie.copyonwrite.insert.split.size
   * hoodie.copyonwrite.insert.auto.split
   * hoodie.copyonwrite.record.size.estimate
   * hoodie.log.compaction.blocks.threshold
   
   In FlinkOptions:
   * compaction.async.enabled
   * compaction.schedule.enabled
   * compaction.delta_commits
   * compaction.delta_seconds
   * compaction.trigger.strategy
   * compaction.target_io
   * compaction.max_memory
   * compaction.tasks
   * compaction.timeout.seconds
   
   The naming needs to be refactored while preserving backward compatibility.
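
One common way to rename options without breaking existing jobs is to keep the old keys as deprecated aliases of the new ones. The sketch below illustrates that idea only; it is not Hudi's implementation. The canonical key names on the right-hand side and the `resolve` helper are hypothetical and are not proposed anywhere in this issue.

```python
import warnings

# Deprecated key -> canonical key.
# ASSUMPTION: the canonical names below are invented for illustration only.
DEPRECATED_ALIASES = {
    "hoodie.compact.inline": "hoodie.compaction.inline.enabled",
    "hoodie.compact.inline.max.delta.commits": "hoodie.compaction.delta.commits",
}


def resolve(props, key, default=None):
    """Look up canonical `key`; fall back to its deprecated aliases, then to `default`."""
    # An explicitly set canonical key always wins over any old alias.
    if key in props:
        return props[key]
    for old_key, new_key in DEPRECATED_ALIASES.items():
        if new_key == key and old_key in props:
            warnings.warn(
                f"'{old_key}' is deprecated, use '{new_key}' instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return props[old_key]
    return default


# An existing job that still sets the old key keeps working.
legacy_props = {"hoodie.compact.inline": "true"}
inline_enabled = resolve(legacy_props, "hoodie.compaction.inline.enabled", "false")
```

With this shape, a job that sets only the old key behaves as before and gets a deprecation warning, while a job that sets both keys uses the new one.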
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-7646
   - Type: Improvement
   - Fix version(s):
     - 1.1.0
   
   
   ---
   
   
   ## Comments
   
   22/Apr/24 08:11;geserdugarov;The main question is which naming is 
preferable for these options: ".inline" or ".async". The current distribution 
is as follows.
   
   Using ".inline":
   * hoodie.compact.inline
   * hoodie.compact.schedule.inline
   * hoodie.log.compaction.inline
   * hoodie.clustering.inline
   * hoodie.clustering.schedule.inline
   * hoodie.partition.ttl.inline
   
   Using ".async":
   * hoodie.clean.async.enabled
   * clean.async.enabled
   * compaction.async.enabled
   * hoodie.kafka.compaction.async.enable
   * hoodie.clustering.async.enabled
   * clustering.async.enabled
   * hoodie.archive.async
   * hoodie.embed.timeline.server.async
   * hoodie.metadata.index.async
   * hoodie.datasource.compaction.async.enable
   
   It looks preferable to move toward the ".async" naming.
   
   Also, from a user's point of view, ".async" is more self-explanatory than 
".inline": understanding ".inline" requires knowing how the Hudi write process 
works.;;;
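
The distribution above can be reproduced with a small script, which makes it easy to re-check as options are added or renamed. This is a minimal sketch: the key list is copied from the comment above, and the `naming_style` helper is a hypothetical name introduced here.

```python
# Keys copied from the two lists in the comment above.
KEYS = [
    "hoodie.compact.inline",
    "hoodie.compact.schedule.inline",
    "hoodie.log.compaction.inline",
    "hoodie.clustering.inline",
    "hoodie.clustering.schedule.inline",
    "hoodie.partition.ttl.inline",
    "hoodie.clean.async.enabled",
    "clean.async.enabled",
    "compaction.async.enabled",
    "hoodie.kafka.compaction.async.enable",
    "hoodie.clustering.async.enabled",
    "clustering.async.enabled",
    "hoodie.archive.async",
    "hoodie.embed.timeline.server.async",
    "hoodie.metadata.index.async",
    "hoodie.datasource.compaction.async.enable",
]


def naming_style(key):
    """Classify a key by whether one of its segments is 'inline' or 'async'."""
    # Treat '_' like '.' so underscore-separated keys are tokenized the same way.
    tokens = key.replace("_", ".").split(".")
    if "inline" in tokens:
        return "inline"
    if "async" in tokens:
        return "async"
    return "other"


# Group the keys by naming style.
groups = {}
for key in KEYS:
    groups.setdefault(naming_style(key), []).append(key)

for style, keys in groups.items():
    print(f"{style}: {len(keys)} keys")
```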
   
   ---
   
   09/May/24 17:32;geserdugarov;Prepared a local environment for running the 
TPC-H benchmark. I will investigate the configuration of compaction parameters 
from a user's point of view.;;;

