Copilot commented on code in PR #50044:
URL: https://github.com/apache/arrow/pull/50044#discussion_r3380014783


##########
cpp/src/arrow/filesystem/filesystem.h:
##########
@@ -547,6 +572,31 @@ ARROW_EXPORT
 Result<std::shared_ptr<FileSystem>> FileSystemFromUri(const std::string& uri,
                                                       std::string* out_path = 
NULLPTR);
 
+/// \brief Create a new FileSystem by URI with extended backend-specific 
filesystem
+/// options
+///
+/// Recognized schemes are "file", "mock", "hdfs", "viewfs", "s3",
+/// "gs" and "gcs".
+///
+/// Support for other schemes can be added using RegisterFileSystemFactory.
+///
+/// \param[in] uri the URI to give access to
+/// \param[in] options a list of backend-specific filesystem options
+///            Each option is a (name, value) pair.
+///            The expected type is specific to the backend and
+///            option name.
+///            Options are forwarded to schemes dispatched through a registered
+///            FileSystemFactory. A registered factory that does not support 
options
+///            (currently "s3") returns NotImplemented for non-empty options.
+///            Schemes not handled by a registered factory will also return
+///            NotImplemented for non-empty options.
+/// \param[out] out_path (optional) Path inside the filesystem.
+/// \return out_fs FileSystem instance.
+ARROW_EXPORT
+Result<std::shared_ptr<FileSystem>> FileSystemFromUri(
+    const std::string& uri, const std::vector<std::pair<std::string, 
std::any>>& options,
+    std::string* out_path = NULLPTR);

Review Comment:
   The new 3-arg overload `FileSystemFromUri(uri, options, out_path)` makes 
calls like `FileSystemFromUri(uri, {}, &path)` ambiguous (previously `{}` could 
unambiguously mean a default `io::IOContext`). This is a source-breaking API 
change for downstream code that used `{}` to select the IOContext overload. To 
avoid the ambiguity, consider renaming/removing the 3-arg options overload 
(e.g. require passing an explicit `io::IOContext`, or provide a differently 
named helper such as `FileSystemFromUriAndOptions`).



##########
cpp/src/arrow/testing/examplefs.cc:
##########
@@ -26,12 +30,40 @@ namespace arrow::fs {
 
 auto kExampleFileSystemModule = ARROW_REGISTER_FILESYSTEM(
     "example",
-    [](const Uri& uri, const io::IOContext& io_context,
+    [](const Uri& uri, const std::vector<std::pair<std::string, std::any>>& 
options,
+       const io::IOContext& io_context,
        std::string* out_path) -> Result<std::shared_ptr<FileSystem>> {
       constexpr std::string_view kScheme = "example";
       EXPECT_EQ(uri.scheme(), kScheme);
       auto local_uri = "file" + uri.ToString().substr(kScheme.size());
-      return FileSystemFromUri(local_uri, io_context, out_path);
+      ARROW_ASSIGN_OR_RAISE(auto fs,
+                            FileSystemFromUri(local_uri, options, io_context, 
out_path));

Review Comment:
   `examplefs` forwards its own scheme-specific options to the underlying 
`file` filesystem (`FileSystemFromUri(local_uri, options, ...)`). Since options 
are backend-specific, passing `example_*` options to `file` is at best ignored 
and at worst will start failing if/when `file` begins rejecting unsupported 
options (as intended by this PR’s semantics). The delegating factory should 
typically pass empty options to the nested filesystem call.



##########
cpp/src/arrow/filesystem/filesystem.h:
##########
@@ -359,11 +361,34 @@ class ARROW_EXPORT FileSystem
 
 struct FileSystemFactory {
   std::function<Result<std::shared_ptr<FileSystem>>(
-      const Uri& uri, const io::IOContext& io_context, std::string* out_path)>
+      const Uri& uri, const std::vector<std::pair<std::string, std::any>>& 
options,
+      const io::IOContext& io_context, std::string* out_path)>
       function;
   std::string_view file;
   int line;
 
+  /// Construct from an options-aware factory function.
+  FileSystemFactory(std::function<Result<std::shared_ptr<FileSystem>>(
+                        const Uri&, const std::vector<std::pair<std::string, 
std::any>>&,
+                        const io::IOContext&, std::string*)>
+                        fn,
+                    std::string_view file, int line)
+      : function(std::move(fn)), file(file), line(line) {}
+
+  /// Construct from a non-options aware factory function maintaining source 
compatibility
+  /// with existing factories.
+  FileSystemFactory(std::function<Result<std::shared_ptr<FileSystem>>(
+                        const Uri&, const io::IOContext&, std::string*)>
+                        fn,
+                    std::string_view file, int line)
+      : function([fn = std::move(fn)](
+                     const Uri& uri,
+                     const std::vector<std::pair<std::string, std::any>>& 
/*ignored*/,
+                     const io::IOContext& ctx,
+                     std::string* out_path) { return fn(uri, ctx, out_path); 
}),
+        file(file),
+        line(line) {}

Review Comment:
   The source-compatibility `FileSystemFactory` ctor wraps legacy 3-arg 
factories by unconditionally ignoring the new `options` argument. This silently 
drops user-provided options for any factory still using the old signature 
(including built-in registrars that haven’t been migrated yet), which 
contradicts the documented behavior of rejecting unsupported options and can 
mask configuration/credential mistakes. Consider rejecting non-empty `options` 
here (and only delegating to the legacy factory when `options` is empty).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to