jerryshao commented on PR #4232: URL: https://github.com/apache/gravitino/pull/4232#issuecomment-2286492098
Hi @xiaozcy, thanks a lot for your work. I think most of us are debating how to define a meaningful property. That is indeed a problem that needs careful thought, but I was thinking of a more fundamental one: how do we support different storages?

As we continue to add support for more storages, such as ADLS, GCS, and OSS, adding all of them as dependencies of the Hadoop catalog will turn this catalog into a dependency hell. Meanwhile, for most companies, one or two storages should be enough. So I was thinking of introducing a pluggable framework to support different cloud storages: users configure the storage they want, and there is no need to package all of them together.

Can you guys please investigate a bit how other projects, like Iceberg, handle this problem, and come up with an elegant framework? Basically, what I want is a pluggable framework for different storages that users enable through configuration. The storages can also live in separate packages, and a storage should not work automatically unless you explicitly enable it.

Meanwhile, I think that to support S3 we should also make it work in both the Java and Python GVFS implementations; otherwise it cannot be used end to end.

So @xiaozcy @yuqi1129 @FANNG1, can you guys please think about my questions and propose a better solution?
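To make the request above concrete, here is a minimal sketch of what "pluggable and off unless configured" could look like. It is only an illustration under stated assumptions: `StorageProvider`, `StorageRegistry`, `LocalProvider`, `S3Provider`, and the comma-separated list of enabled schemes are all hypothetical names, not existing Gravitino or Hadoop APIs. In a real design the providers would probably live in separate jars and be discovered at runtime (for example with `java.util.ServiceLoader`) rather than being constructed by hand.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these types exist in Gravitino.
class PluggableStorageSketch {

  /** Assumed plug-in contract; each storage would ship in its own package. */
  interface StorageProvider {
    String scheme();            // URI scheme the provider handles, e.g. "s3a"
    String open(String path);   // stand-in for creating a real file system client
  }

  static final class LocalProvider implements StorageProvider {
    public String scheme() { return "file"; }
    public String open(String path) { return "local:" + path; }
  }

  static final class S3Provider implements StorageProvider {
    public String scheme() { return "s3a"; }
    public String open(String path) { return "s3:" + path; }
  }

  /** Keeps only the providers that the user explicitly enabled. */
  static final class StorageRegistry {
    private final Map<String, StorageProvider> active = new HashMap<>();

    StorageRegistry(List<StorageProvider> discovered, String enabledSchemes) {
      // Parse the (assumed) comma-separated configuration value.
      Set<String> enabled = new HashSet<>();
      for (String s : enabledSchemes.split(",")) {
        if (!s.trim().isEmpty()) {
          enabled.add(s.trim());
        }
      }
      // A provider on the classpath stays inactive unless it is listed.
      for (StorageProvider p : discovered) {
        if (enabled.contains(p.scheme())) {
          active.put(p.scheme(), p);
        }
      }
    }

    boolean isEnabled(String scheme) {
      return active.containsKey(scheme);
    }

    StorageProvider forScheme(String scheme) {
      StorageProvider p = active.get(scheme);
      if (p == null) {
        throw new IllegalArgumentException("Storage not enabled: " + scheme);
      }
      return p;
    }
  }

  public static void main(String[] args) {
    List<StorageProvider> discovered =
        Arrays.asList(new LocalProvider(), new S3Provider());
    // Only "file" is enabled, so the S3 provider is present but unusable.
    StorageRegistry registry = new StorageRegistry(discovered, "file");
    System.out.println(registry.forScheme("file").open("/tmp/data"));
    System.out.println(registry.isEnabled("s3a"));
  }
}
```

The point of the sketch is the separation: the core only depends on the small interface, while each cloud storage brings its own dependencies and is ignored until the user lists it in the configuration.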
