plusplusjiajia commented on code in PR #703:
URL: https://github.com/apache/iceberg-cpp/pull/703#discussion_r3371613944
##########
src/iceberg/arrow/arrow_io.cc:
##########
@@ -473,9 +473,14 @@ class ArrowOutputFile : public OutputFile {
} // namespace
Result<std::string> ArrowFileSystemFileIO::ResolvePath(const std::string&
file_location) {
- if (file_location.find("://") != std::string::npos) {
- ICEBERG_ARROW_ASSIGN_OR_RETURN(auto path,
arrow_fs_->PathFromUri(file_location));
- return path;
+ if (auto pos = file_location.find("://"); pos != std::string::npos) {
+ auto path = arrow_fs_->PathFromUri(file_location);
+ if (path.ok()) {
+ return path.ValueOrDie();
+ }
+ // PathFromUri rejects S3-compatible schemes (s3a/s3n, gs://, oss://);
+ // fall back to the scheme-less bucket/key.
+ return file_location.substr(pos + 3);
Review Comment:
Agreed on the allowlist. I'd suggest it belongs at the selection layer, not
in ResolvePath. Java does the same: S3URI (≈ ResolvePath) is scheme-permissive
("any valid URI scheme … s3a/s3n … GCS"), and the allowlist lives in
ResolvingFileIO.SCHEME_TO_FILE_IO. I've added the C++ equivalent in this PR:
DetectBuiltinFileIO now accepts s3a/s3n (commit 3556e4d), while ResolvePath
stays permissive (and now also strips query/fragment), matching S3URI. oss/gs
are left out for separate discussion, as you suggested. Let me know what you
think.
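
To make the split concrete, here is a rough standalone sketch of the two layers. It is not the code in this PR: SchemeOf, IsS3Scheme, and ResolvePermissive are hypothetical names used only for illustration, and the real ResolvePath still tries arrow_fs_->PathFromUri first. The sketch only shows the allowlist living in scheme detection while path resolution accepts any scheme and drops the query/fragment.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical helper: returns the text before "://", if present.
std::optional<std::string_view> SchemeOf(std::string_view location) {
  auto pos = location.find("://");
  if (pos == std::string_view::npos) return std::nullopt;
  return location.substr(0, pos);
}

// Selection layer (illustrative): the allowlist decides which FileIO
// handles a location. Only s3/s3a/s3n are listed; oss/gs are left out.
bool IsS3Scheme(std::string_view location) {
  static const std::unordered_set<std::string_view> kS3Schemes = {
      "s3", "s3a", "s3n"};
  auto scheme = SchemeOf(location);
  return scheme.has_value() && kS3Schemes.count(*scheme) > 0;
}

// Resolution layer (illustrative): scheme-permissive. Any "<scheme>://"
// prefix is stripped, then everything from the first '?' or '#' onward
// is dropped. Scheme-less locations are returned unchanged.
std::string ResolvePermissive(std::string_view location) {
  auto pos = location.find("://");
  if (pos == std::string_view::npos) return std::string(location);
  std::string_view rest = location.substr(pos + 3);
  // find_first_of returns npos when neither character is present,
  // in which case substr keeps the whole remainder.
  rest = rest.substr(0, rest.find_first_of("?#"));
  return std::string(rest);
}
```

With this shape, a gs:// location would still resolve to bucket/key if it ever reached the resolver, but it would not be routed to the S3 FileIO by scheme detection, which is where the allowlist applies.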
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]