Gatsby-Lee commented on issue #4747:
URL: https://github.com/apache/hudi/issues/4747#issuecomment-1032030308


   @nsivabalan 
   
   Again, thank you for your reply.
   To make my question more clear, I posted the configuration I am using. ( 
Hudi 0.9 )
   I disabled all Hudi Table Services Async. ( since I am not sure how Hudi 
Table Services Async works with Hudi Metadata )
   
   Other than "hoodie.clean.automatic", all Hudi Table Services are disabled. ( 
set to false )
   I guess with these configuration, all Hudi Table services either disabled or 
inline.
   
   Q: Even in this case, what you said can happen?
   """
   for eg, if your cleaner is aggressive and cleans up the data files which 
your long running query is still running, it could lead to FileNotFoundIssue.
   """
   
   Thank you
   Gatsby
   
   ```
   {
        "hoodie.datasource.clustering.async.enable": "false",  # default: false
        "hoodie.datasource.clustering.inline.enable": "false",  # default: false
        "hoodie.datasource.compaction.async.enable": "false",  # default: true
        ## compaction
        ## Compaction action merges logs and base files to produce new file 
slices
        ##  and cleaning action gets rid of unused/older file slices to reclaim 
space on DFS
        # @ref: https://hudi.apache.org/docs/concepts/#file-management
        "hoodie.clean.automatic": "true",  # default: true
        "hoodie.clean.async": "false",  # default: false
        "hoodie.cleaner.commits.retained": 5,  # default: 10
        "hoodie.cleaner.policy": "KEEP_LATEST_COMMITS",  # default: 
KEEP_LATEST_COMMITS
        "hoodie.compact.inline": "false",  # default: false
        ## Clustering
        "hoodie.clustering.async.enabled": "false",  # default: false
        "hoodie.clustering.async.max.commits": 4,  # default: 4
        "hoodie.clustering.inline": "false",  # default: false
        ## Metadata
        "hoodie.metadata.clean.async": "false",  # default: false
           ## Multi Writing
           "hoodie.cleaner.policy.failed.writes": "EAGER",  # default: EAGER. 
LAZY is required for Multi Writing
           "hoodie.write.concurrency.mode": "SINGLE_WRITER",  # default: 
SINGLE_WRITER, OPTIMISTIC_CONCURRENCY_CONTROL
           "hoodie.write.lock.provider": 
"org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider",
           "hoodie.write.lock.zookeeper.port": lock_provider_zookeeper_port,
           "hoodie.write.lock.zookeeper.url": lock_provider_zookeeper_url,
           "hoodie.write.lock.zookeeper.base_path": 
lock_provider_zookeeper_lock_base_path,
           "hoodie.write.lock.zookeeper.lock_key": 
lock_provider_zookeeper_lock_key,
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to