coolderli opened a new issue, #4558:
URL: https://github.com/apache/gravitino/issues/4558

   ### Describe the feature
   
   Implement FUSE support for gvfs so that a fileset can be mounted to a local 
directory. By default, 
`fileset://fileset/fileset_catalog/schema/fileset_name` is mounted at 
`/fileset/fileset_catalog/schema/fileset_name`, so it can be accessed through 
the POSIX file interface. In addition, we can support mounting to user-defined 
directories, so that users do not need to modify any code.
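
   The default mapping described above could be sketched as follows. This is 
only an illustration: `resolve_mount_point` and its `user_dir` parameter are 
hypothetical names, not part of gvfs, and the four-segment check is an 
assumption based on the example path.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse


def resolve_mount_point(fileset_uri: str, user_dir: Optional[str] = None) -> str:
    """Return the local directory a fileset would be mounted at.

    Hypothetical helper: a user-defined directory wins, otherwise the
    URI segments are reused as an absolute local path.
    """
    if user_dir is not None:
        return user_dir

    parsed = urlparse(fileset_uri)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"not a fileset URI: {fileset_uri!r}")

    # netloc holds the leading "fileset" segment; the path holds
    # "/fileset_catalog/schema/fileset_name".
    segments = [parsed.netloc] + [p for p in parsed.path.split("/") if p]
    if len(segments) != 4:
        # Assumption: fileset/<catalog>/<schema>/<fileset_name>
        raise ValueError(f"expected 4 path segments, got {len(segments)}")

    return str(PurePosixPath("/", *segments))
```

   With the example URI from this issue, the helper returns 
`/fileset/fileset_catalog/schema/fileset_name`; passing `user_dir` returns that 
directory unchanged.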
   
   ### Motivation
   
   In AI scenarios, users often access data through the POSIX file interface. 
The data is stored in systems such as JuiceFS, NAS, or CPFS, and then mounted 
to a local directory.
   Using these storage systems directly has the following disadvantages:
   1. Authentication issues: Each product has its own authentication system, 
so the cost of onboarding users is high
   2. Authorization issues: When users share data, they need to share the 
permissions of the entire volume, and there is no way to perform fine-grained 
permission control
   3. Audit issues: Some NAS products lack audit logs, making it impossible to 
perform effective audits and data deletion
   
   ### Describe the solution
   
   1. Use the FUSE mount of the underlying storage directly.  
   This means that the fileset needs to manage a local directory. I don't 
think this is a good solution: users would bypass gvfs and get none of its 
benefits.
   
   2. Use fsspec FUSE to implement gvfs FUSE
   fsspec provides a FUSE feature that forwards FUSE operations to fsspec 
filesystem operations: 
https://filesystem-spec.readthedocs.io/en/latest/features.html#mount-anything-with-fuse
   We could build gvfs FUSE on top of fsspec FUSE, with some optimizations.
   
   3. Implement gvfs FUSE using JNI to call GravitinoVirtualFileSystem
   
![image](https://github.com/user-attachments/assets/ce303bdc-623a-4845-9b5b-d6f8d56ff06a)
   
   At present, Solution 2 and Solution 3 are similar: Solution 2 is implemented 
by calling the Python gvfs, and Solution 3 by calling the Java gvfs.
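
   As a rough sketch of Solution 2, fsspec's `fsspec.fuse.run` can mount any 
fsspec-compatible filesystem object. The wrapper below is hypothetical 
(`mount_with_fsspec` is not an existing gvfs function), and it assumes the 
optional `fusepy` dependency and a working FUSE installation on the host. The 
Python gvfs filesystem would be passed in as `fs`.

```python
def mount_with_fsspec(fs, remote_path: str, mount_point: str) -> None:
    """Expose `remote_path` of an fsspec filesystem at `mount_point`.

    Hypothetical wrapper: `fs` can be any fsspec filesystem instance;
    for this proposal it would be the Python gvfs filesystem.
    """
    # Imported lazily because fsspec.fuse needs the optional fusepy package.
    from fsspec.fuse import run

    # Blocks in the foreground and forwards FUSE operations
    # (getattr, readdir, read, write, ...) to the fsspec filesystem.
    run(fs, remote_path, mount_point, foreground=True)
```

   For a quick local trial without Gravitino, an in-memory filesystem from 
`fsspec.filesystem("memory")` could be passed as `fs`, with an existing empty 
directory as `mount_point`.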
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
