hesampakdaman commented on issue #5592:
URL: https://github.com/apache/arrow-rs/issues/5592#issuecomment-2142943951
> It is the same set of chars across all platforms, although I suppose we
could just add some special logic for windows within LocalFilesystem...
I was going to suggest having several platform-specific `INVALID` consts
:joy:
```diff
modified object_store/src/path/parts.rs
@@ -73,6 +73,7 @@ impl<'a> PathPart<'a> {
 }
 /// Characters we want to encode.
+#[cfg(target_family="windows")]
 const INVALID: &AsciiSet = &CONTROLS
     // The delimiter we are reserving for internal hierarchy
     .add(DELIMITER_BYTE)
@@ -97,8 +98,38 @@ const INVALID: &AsciiSet = &CONTROLS
     .add(b'\r')
     .add(b'\n')
     .add(b'*')
+    .add(b':')
     .add(b'?');
```
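To make the "several platform-specific consts" idea concrete, here is a minimal, std-only sketch. It does not use `percent_encoding::AsciiSet` as the real `parts.rs` does; `COMMON_INVALID`, `PLATFORM_INVALID` and `must_encode` are hypothetical names, and the byte lists are only an illustrative subset of the real `INVALID` set.

```rust
/// Bytes encoded on every platform (illustrative subset only,
/// not the actual `INVALID` set from `object_store`).
const COMMON_INVALID: &[u8] = b"\r\n*?";

/// Hypothetical extra bytes that are only encoded when targeting Windows.
#[cfg(target_family = "windows")]
const PLATFORM_INVALID: &[u8] = b":";

/// On every other target family there are no extra bytes.
#[cfg(not(target_family = "windows"))]
const PLATFORM_INVALID: &[u8] = b"";

/// Returns true if `byte` would need encoding on the current target.
fn must_encode(byte: u8) -> bool {
    COMMON_INVALID.contains(&byte) || PLATFORM_INVALID.contains(&byte)
}
```

With this layout `must_encode(b':')` is only true on Windows targets, so the same logical path would be encoded differently depending on where the code was compiled.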
> I think probably we should add specific handling within LocalFilesystem on
Windows. Just because Windows filesystems are a complete mess, I don't think
that should leak out. This would also avoid a breaking change
I haven't looked at this thoroughly, but would it suffice to put this special
logic in `path_to_filesystem`?
```rust
/// Return an absolute filesystem path of the given file location
pub fn path_to_filesystem(&self, location: &Path) -> Result<PathBuf> {
    ensure!(
        is_valid_file_path(location),
        InvalidPathSnafu {
            path: location.as_ref()
        }
    );
    self.config.prefix_to_filesystem(location)
}
```
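As a rough illustration of what Windows-only handling at this point could look like, here is a standalone sketch. It is not the `object_store` API: `WINDOWS_RESERVED`, `has_windows_reserved_char` and `to_filesystem_path` are hypothetical names, plain strings stand in for `Path`, and rejecting the path is only one option (an actual fix might escape the characters instead).

```rust
use std::path::PathBuf;

/// Characters Windows rejects in file names. `/` and `\` are left out
/// here because they act as separators; treat the list as illustrative.
const WINDOWS_RESERVED: &[char] = &['<', '>', ':', '"', '|', '?', '*'];

/// Platform-independent check, so it can be tested on any target.
fn has_windows_reserved_char(location: &str) -> bool {
    location.chars().any(|c| WINDOWS_RESERVED.contains(&c))
}

/// Hypothetical stand-in for `path_to_filesystem`: the extra validation
/// only applies when compiled for Windows, so other targets are unchanged.
fn to_filesystem_path(root: &str, location: &str) -> Result<PathBuf, String> {
    if cfg!(target_family = "windows") && has_windows_reserved_char(location) {
        return Err(format!("path is not valid on Windows: {location}"));
    }
    Ok(PathBuf::from(root).join(location))
}
```

Keeping the check inside the local filesystem conversion means the shared path encoding stays identical across platforms, which is what the quoted reply asks for.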