CTTY commented on issue #17549:
URL: https://github.com/apache/hudi/issues/17549#issuecomment-3667540950

   Hi @KiteSoar, thanks for jumping on this! We do need some help on this 
issue. I don't have exact steps in mind right now, but here is what we want to 
achieve:
   - Forbid calling the constructors of concrete Storage implementations (e.g. 
`HoodieHadoopStorage`) to create storage instances anywhere in the Hudi code base.
   - As a replacement, add a `HoodieStorageFactory` that lets 
users/developers construct a Storage instance when needed, based on 
`StoragePath`, `StorageConfiguration`, and the `HOODIE_STORAGE_CLASS` config.
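
   A minimal sketch of what such a factory could look like, assuming the 
implementation class is loaded reflectively from the config. Everything here is 
a simplified stand-in, not the real Hudi API: the `Storage` interface, the 
`DefaultStorage`/`CustomStorage` classes, the `(String, Map)` constructor shape, 
and the `hoodie.storage.class` key string are all assumptions for illustration.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

public class StorageFactorySketch {

  // Hypothetical, simplified stand-in for the HoodieStorage interface.
  public interface Storage {
    String describe();
  }

  // Stand-in for the default implementation (HoodieHadoopStorage in Hudi).
  public static class DefaultStorage implements Storage {
    private final String path;

    public DefaultStorage(String path, Map<String, String> conf) {
      this.path = path;
    }

    @Override
    public String describe() {
      return "default:" + path;
    }
  }

  // Stand-in for a storage implementation injected by a downstream user.
  public static class CustomStorage implements Storage {
    private final String path;

    public CustomStorage(String path, Map<String, String> conf) {
      this.path = path;
    }

    @Override
    public String describe() {
      return "custom:" + path;
    }
  }

  // Assumed config key that plays the role of HOODIE_STORAGE_CLASS.
  public static final String STORAGE_CLASS_KEY = "hoodie.storage.class";

  // Single entry point: callers never invoke a concrete constructor directly.
  public static Storage create(String path, Map<String, String> conf) {
    String className =
        conf.getOrDefault(STORAGE_CLASS_KEY, DefaultStorage.class.getName());
    try {
      Class<?> clazz = Class.forName(className);
      Constructor<?> ctor = clazz.getConstructor(String.class, Map.class);
      return (Storage) ctor.newInstance(path, conf);
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException(
          "Cannot instantiate storage class " + className, e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // No override configured: the default implementation is used.
    System.out.println(create("/tmp/table", conf).describe());

    // A downstream user points the config at their own implementation.
    conf.put(STORAGE_CLASS_KEY, CustomStorage.class.getName());
    System.out.println(create("/tmp/table", conf).describe());
  }
}
```

   With a single entry point like this, a call site that previously did 
`new HoodieHadoopStorage(...)` would go through the factory instead, so a 
configured custom class is picked up without touching the call site.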
   
   The motivation is to avoid usages like this: 
https://github.com/CTTY/hudi/blob/0c0c402a9a03061155b5c43f1f4cdd37726eacfd/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala#L236
 
   ### Why? 
   - `HoodieStorage` is supposed to be an interface that allows downstream 
users to inject a customized storage implementation, so they have more control 
over IO operations. 
   - Using `HoodieHadoopStorage` directly across the code base causes Hudi 
to bypass downstream customized storage implementations.
   
   Please let me know if you want to work on this or have further questions. 
I'm available on Slack as well. Thanks again!


