xloya commented on PR #6619: URL: https://github.com/apache/gravitino/pull/6619#issuecomment-2719886580
> > If I understand correctly, the current version of FileSystem Provider will also be used for Hadoop GVFS. I think using Hadoop's own cache here will cause security issues, because the authenticated FileSystem is no longer encapsulated by GVFS, and others can arbitrarily obtain the authenticated FileSystem through `FileSystem.get()` and do some unauthorized behavior. If you just want to add a filesystem cache on the server side, I think you can provide a cache class. Otherwise, we need to ensure that the client cannot get the authenticated filesystem at will.
>
> On the Gravitino server side, the fs cache is only used by the Gravitino server. Why can't we use the file system cache? @xloya , can you help to clarify it more clearly.
>
> In GVFS client, indeed, there will be a security vulnerability if the fs cache is enabled.

What I mean is that we could enable the Hadoop FileSystem cache on the server side, but the current `FileSystem Provider` is designed for both the server and the client. I think we need to consider how to enable the Hadoop FileSystem cache only on the server, and not on the client.

In addition, the Hadoop GVFS client already has an internal FileSystem cache that is independent of the Hadoop FileSystem cache, so there is no need to use `FileSystem.get()` to cache the FileSystem for performance. Using `FileSystem.get()` in the Hadoop GVFS client would introduce security issues instead.

Given this, I think there are two ways to solve the problem:

1. Keep the current way of using FileSystem Provider on both the client and the server, and still use `FileSystem.newInstance()` to always create a new FileSystem. At the same time, add an internal FileSystem cache on the server, just like the implementation of Hadoop GVFS.
2. Consider splitting or modifying the logic of FileSystem Provider to support different behavior on the client and the server, to avoid the security issues of the client's FileSystem cache.
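
To make option 1 more concrete, here is a minimal sketch of what a server-internal cache could look like. It is only an illustration under assumptions: `InternalFsCache`, its key type, and the factory argument are hypothetical names, not existing Gravitino classes. The factory stands in for a call such as `FileSystem.newInstance(uri, conf)`, so the cached instances never enter Hadoop's global cache and cannot be fetched through `FileSystem.get()`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical server-side cache (illustrative names, not Gravitino's API).
 * Instances are created by the supplied factory and are only reachable
 * through this object, never through a process-wide static cache.
 */
final class InternalFsCache<K, V extends AutoCloseable> implements AutoCloseable {
  private final Map<K, V> cache = new ConcurrentHashMap<>();
  // Assumption: in real code this would wrap FileSystem.newInstance(uri, conf).
  private final Function<K, V> factory;

  InternalFsCache(Function<K, V> factory) {
    this.factory = factory;
  }

  /** Returns the cached instance for the key, creating it at most once. */
  V get(K key) {
    return cache.computeIfAbsent(key, factory);
  }

  /** Number of instances currently held. */
  int size() {
    return cache.size();
  }

  /** Closes every cached instance and empties the cache. */
  @Override
  public void close() throws Exception {
    for (V fs : cache.values()) {
      fs.close();
    }
    cache.clear();
  }
}
```

The key would need to include whatever distinguishes one authenticated FileSystem from another (for example scheme, authority, and credentials), otherwise two callers with different permissions could end up sharing one instance. Eviction and expiry are left out of the sketch.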
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
