rmahindra123 opened a new pull request #3329:
URL: https://github.com/apache/hudi/pull/3329


   ## What is the purpose of the pull request
   
   The external disk maps, BitCaskMap and RocksDbMap, create their file(s) 
directly within the base path folder provided as input. Jobs that share a base 
path can therefore interfere with one another, and it is hard for users to keep 
track of the folders or files created by Hudi/DeltaStreamer.
   
   This PR ensures that each BitCaskMap and RocksDbMap instance creates its own 
unique subfolder and cleans it up once done. The folder names also carry a 
"hudi" prefix and a disk map type prefix to make debugging easier.
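   
   A minimal sketch of the per-instance subfolder idea, for illustration only. 
The class name `ScopedSpillDir`, its methods, and the `hudi-<type>-<uuid>` 
naming scheme are assumptions made for this example; they are not the classes 
or the exact folder names introduced by this PR.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;

/** Hypothetical helper: owns one uniquely named subfolder under an existing base path. */
final class ScopedSpillDir implements AutoCloseable {
  private final Path dir;

  ScopedSpillDir(Path basePath, String diskMapType) throws IOException {
    // Assumed naming scheme: "hudi" prefix, then the disk map type, then a random suffix.
    String name = "hudi-" + diskMapType + "-" + UUID.randomUUID();
    // createDirectory fails if basePath does not exist, so the base path must be an existing folder.
    this.dir = Files.createDirectory(basePath.resolve(name));
  }

  Path path() {
    return dir;
  }

  @Override
  public void close() throws IOException {
    if (!Files.exists(dir)) {
      return; // already cleaned up
    }
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order visits children before their parent, so each delete sees an empty directory.
      walk.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }
}
```

   Because each instance only creates and deletes its own subfolder, two 
instances given the same base path (for example /tmp) do not touch each 
other's files, and the base path itself is left in place.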
   
   With this fix, I have reverted the default config in 
FileSystemViewStorageConfig.java back to /tmp. The reason is that we need to 
provide a base path (an existing folder) to the External Spillable Map, which 
internally creates sub-folders and cleans them up after use. This avoids the 
situation described in HUDI-2090, where differing access controls across users 
may cause some users' jobs to fail.
   
   ## Brief change log
   
   - Changed BitCaskMap and RocksDbDiskMap so that each creates a prefixed 
subfolder within the base path and deletes that subfolder on close.
   
   ## Verify this pull request
   
   - Added a test to ensure the folder is cleaned up.



