Copilot commented on code in PR #50151:
URL: https://github.com/apache/arrow/pull/50151#discussion_r3389221892
##########
cpp/src/arrow/filesystem/filesystem.h:
##########
@@ -357,13 +359,56 @@ class ARROW_EXPORT FileSystem
bool default_async_is_sync_ = true;
};
+using FileSystemFactoryOptions = std::vector<std::pair<std::string, std::any>>;
+
+class ARROW_EXPORT ExampleOption {
+ public:
+ explicit ExampleOption(int value) : value_(value) {}
Review Comment:
`FileSystemFactoryOptions` (based on `std::any`) and the `ExampleOption` /
`ExampleAcceptOptions` showcase types are introduced in the public filesystem
API surface. This is a significant API/ABI commitment for libarrow (including
exposing `std::any` in a cross-DSO plugin interface) and the `Example*` symbols
look like test/demo-only helpers rather than stable public API. Consider moving
these to an internal/testing header (or gating behind an explicit experimental
API marker) before merging so downstream users don’t accidentally depend on
them.
##########
python/pyarrow/_fs.pyx:
##########
@@ -1651,3 +1651,21 @@ def _copy_files_selector(FileSystem source_fs,
FileSelector source_sel,
destination_fs.unwrap(), c_destination_base_dir,
c_default_io_context(), chunk_size, use_threads,
))
+
+
+def _example_accept_options(str value, int value_int, int value_typed):
+ cdef:
+ CFSFileSystemFactoryOptions options
+ pair[c_string, c_any] entry
+ c_string result
+ entry.first = tobytes("example_option_string")
+ entry.second = <c_string>tobytes(value)
+ options.push_back(entry)
+ entry.first = tobytes("example_option_int")
+ entry.second = <int>value_int
+ options.push_back(entry)
+ entry.first = tobytes("example_option_typed_var")
+ entry.second = CExampleOption(value_typed)
+ options.push_back(entry)
+ result = GetResultValue(CExampleAcceptOptions(options))
+ return frombytes(result)
Review Comment:
The new Cython helper calls into C++ without releasing the GIL, unlike most
other filesystem helpers in this module (which typically wrap Arrow C++ calls
in `with nogil:`). Even though this is a small demo function, it’s better to
follow the existing pattern and avoid unnecessary GIL contention.
##########
cpp/src/arrow/filesystem/filesystem.h:
##########
@@ -563,6 +632,31 @@ Result<std::shared_ptr<FileSystem>>
FileSystemFromUri(const std::string& uri,
const io::IOContext&
io_context,
std::string* out_path =
NULLPTR);
+/// \brief Create a new FileSystem by URI with a custom IO context with
backend-specific
+/// filesystem options
+///
+/// Recognized schemes are "file", "mock", "hdfs", "viewfs", "s3",
+/// "gs" and "gcs".
Review Comment:
Same issue for the overload taking an `io::IOContext`: the “Recognized
schemes” list omits supported schemes such as `local`, `abfs`, and `abfss`,
which can make the new API seem more limited than it is.
##########
cpp/src/arrow/filesystem/filesystem.h:
##########
@@ -547,6 +592,30 @@ ARROW_EXPORT
Result<std::shared_ptr<FileSystem>> FileSystemFromUri(const std::string& uri,
std::string* out_path =
NULLPTR);
+/// \brief Create a new FileSystem by URI with extended backend-specific
filesystem
+/// options
+///
+/// Recognized schemes are "file", "mock", "hdfs", "viewfs", "s3",
+/// "gs" and "gcs".
Review Comment:
The new `FileSystemFromUriAndOptions` docstring’s “Recognized schemes” list
omits schemes that are actually supported (e.g., `local` is registered in
`localfs.cc`, and `abfs`/`abfss` are handled in `FileSystemFromUriReal`). This
can mislead API consumers about what URIs/options they can pass.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]