lizardoluis opened a new issue, #312:
URL: https://github.com/apache/arrow-rs-object-store/issues/312

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   With `LocalFileSystem`, a (malicious) user could access any file in the 
system without restriction, giving that symlinks that resolve to paths outside 
the root are followed.
   
   **Describe the solution you'd like**
   
   I propose enhancing `LocalFileSystem` by introducing an allow-list mechanism 
to restrict file operations to specified path prefixes. The idea is to define a 
list of allowed paths when constructing `LocalFileSystem`, ensuring that 
operations such as `put`, `delete`, `copy` etc, are only executed within these 
paths.
   
   #### Implementation Proposal
   
   1. Allow-List for Paths
   
   - Introduce an allow-list of path prefixes. If a directory is allowed, all 
its subdirectories should also be allowed.
   - To prevent bypassing this restriction via symbolic links, paths must be 
verified against their canonical form.
   
   2. Two Possible Approaches:
   
   - Modify `LocalFileSystem` to accept an allow-list as a parameter. (Breaking 
change)
   - Alternatively, introduce a separate `SecureLocalFileSystem` object that 
implements this security mechanism while keeping `LocalFileSystem` unchanged.
   
   3. Path Verification Logic
   
   - Introduce a `LocalPathPrefixVerification` enum to handle verification:
   
   ```
   pub enum LocalPathPrefixVerification {
       Unverified, // Verification will be skipped
       Verify(Vec<String>),  // Allowed list
   }
   ``` 
   
   - Implement a validation function that ensures the user-specified path 
belongs to an allowed prefix. The function canonicalizes paths to prevent 
symlink-based escapes.
   
   ```
   fn validate_path_within_root(
       path: &PathBuf,
       verification: &LocalPathPrefixVerification,
   ) -> Result<(), object_store::Error> {
       match verification {
           LocalPathPrefixVerification::Unverified => Ok(()),
           LocalPathPrefixVerification::Verify(allowed_paths) => {
               let canonical_prefix = get_longest_canonical_prefix(path);
               for path in allowed_paths {
                   let allowed_root_canonical = canonicalize(path)?;
                   if canonical_prefix.starts_with(&allowed_root_canonical) {
                       return Ok(());
                   }
               }
   
               Err(object_store::Error::Generic {
                   store: "path_not_allowed",
                   source: format!(
                       "path is not allowed: {:?}. Must be under one of {:?}",
                       path, allowed_paths
                   )
                   .into(),
               })
           }
       }
   }
   
   fn get_longest_canonical_prefix(path: &PathBuf) -> PathBuf {
       let mut current_path = path.clone();
       
       // could be implemented with a binary search too. 
       loop { 
           if let Ok(canonical_path) = 
canonicalize(current_path.to_str().unwrap()) {
               return canonical_path;
           }
           if !current_path.pop() {
               break;
           }
       }
       current_path
   }
   ```
   
   #### Alternative Proposal
   
   An alternative approach is to introduce a `disable_symlink` flag. When 
enabled, all subdirectories within the `LocalFileSystem` `prefix` would be 
allowed, but any symlink resolving to a path outside the `prefix` would trigger 
an error.
   
   **Describe alternatives you've considered**
   
   As a workaround, we currently wrap `LocalFileSystem` in our own 
implementation, adding the necessary security checks.
   
   **Additional context**
   
   I am happy to contribute with a pull request implementing this enhancement 
if accepted.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to