CTTY commented on issue #17549: URL: https://github.com/apache/hudi/issues/17549#issuecomment-3667540950
Hi @KiteSoar, thanks for jumping on this! We do need some help on this issue. I don't have exact steps in mind right now, but here is what we want to achieve:

- Forbid using the constructors of concrete Storage implementations (e.g. `HoodieHadoopStorage`) to create storage instances across the Hudi code base.
- As a replacement, we can add a `HoodieStorageFactory` to help users/developers construct a Storage instance when needed, based on `StoragePath`, `StorageConfiguration`, and the `HOODIE_STORAGE_CLASS` config.

The motive is to avoid usages like this: https://github.com/CTTY/hudi/blob/0c0c402a9a03061155b5c43f1f4cdd37726eacfd/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala#L236

### Why?

- `HoodieStorage` is supposed to be an interface that allows downstream users to inject customized storage, so they have more control over the IO operations.
- Using `HoodieHadoopStorage` directly across the code base will cause Hudi to bypass downstream customized storage implementations.

Please let me know if you want to work on this or have further questions. I'm available on Slack as well. Thanks again!
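
To make the factory idea a bit more concrete, here is a minimal, self-contained sketch of how a storage class could be resolved from config instead of being constructed directly. Everything in it is a stand-in for illustration: `Storage`, `DefaultStorage`, `CustomStorage`, the `example.storage.class` key, and the `(String, Map)` constructor convention are hypothetical and are not Hudi's actual classes, config key, or signatures. A real `HoodieStorageFactory` would presumably work with `StoragePath`, `StorageConfiguration`, and `HOODIE_STORAGE_CLASS` instead.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

public class StorageFactorySketch {

  // Hypothetical stand-in for the storage abstraction (the role HoodieStorage plays).
  public interface Storage {
    String name();
  }

  // Hypothetical default implementation (the role HoodieHadoopStorage plays).
  public static class DefaultStorage implements Storage {
    public DefaultStorage(String path, Map<String, String> conf) {}
    public String name() { return "default"; }
  }

  // Hypothetical downstream implementation that a user wants injected.
  public static class CustomStorage implements Storage {
    public CustomStorage(String path, Map<String, String> conf) {}
    public String name() { return "custom"; }
  }

  // Assumed config key for this sketch only; not the real Hudi key.
  public static final String STORAGE_CLASS_KEY = "example.storage.class";

  // Single entry point: callers never invoke a concrete constructor themselves.
  public static Storage create(String path, Map<String, String> conf) {
    String className =
        conf.getOrDefault(STORAGE_CLASS_KEY, DefaultStorage.class.getName());
    try {
      Class<?> clazz = Class.forName(className);
      // Sketch convention: every implementation exposes a (String, Map) constructor.
      Constructor<?> ctor = clazz.getConstructor(String.class, Map.class);
      return (Storage) ctor.newInstance(path, conf);
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException(
          "Cannot instantiate storage class " + className, e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // No override configured: the factory falls back to the default implementation.
    System.out.println(create("/tmp/table", conf).name());

    // Override configured: the factory honors the downstream implementation.
    conf.put(STORAGE_CLASS_KEY, CustomStorage.class.getName());
    System.out.println(create("/tmp/table", conf).name());
  }
}
```

The point of routing every call site through one `create` method is that a configured downstream implementation is always honored, whereas a direct `new DefaultStorage(...)` at any call site would silently ignore it.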
