Copilot commented on code in PR #50081:
URL: https://github.com/apache/arrow/pull/50081#discussion_r3340514283


##########
r/src/filesystem.cpp:
##########
@@ -292,8 +292,17 @@ std::shared_ptr<fs::S3FileSystem> fs___S3FileSystem__create(
     bool check_directory_existence_before_creation = false, double connect_timeout = -1,
     double request_timeout = -1) {
   // We need to ensure that S3 is initialized before we start messing with the
-  // options
-  StopIfNotOk(fs::EnsureS3Initialized());
+  // options. We use InitializeS3() rather than EnsureS3Initialized() so we can
+  // enable the SIGPIPE handler - without it, stale connections in the SDK's
+  // connection pool can trigger SIGPIPE during Aws::ShutdownAPI(), which causes
+  // R's signal handler to longjmp out of the teardown and segfault (GH-50009).
+  fs::S3GlobalOptions options = fs::S3GlobalOptions::Defaults();
+  options.install_sigpipe_handler = true;
+  auto status = fs::InitializeS3(options);
+  // InitializeS3 returns Invalid if already initialized - that's fine
+  if (!status.ok() && !fs::IsS3Initialized()) {
+    StopIfNotOk(status);
+  }

Review Comment:
   This initializes S3 with `install_sigpipe_handler = true`, but
`InitializeS3()` returns `Invalid` when S3 is already initialized and, in that
case, *ignores the passed options*. In R, many common S3 entry points
initialize via `fs::FileSystemFromUri()`, which calls `EnsureS3Initialized()`
with default `S3GlobalOptions`, i.e., without the SIGPIPE handler. If one of
those paths runs first, the handler may still not be installed by the time
`fs___S3FileSystem__create()` runs, and GH-50009 / #32026 will remain
reproducible (e.g., `read_parquet("s3://...")`, or `s3_bucket("bucket")` with
no extra args). Consider ensuring the SIGPIPE handler is installed in *all* R
S3 initialization paths: for example, pre-initialize with
`InitializeS3(S3GlobalOptions{..., install_sigpipe_handler=true})` before
calling `FileSystemFromUri` for `s3://` URIs, or otherwise make the
default/`EnsureS3Initialized` path install it for the R build.
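
   To illustrate the ordering problem described above, here is a minimal,
self-contained sketch. It does not use Arrow: `mock_s3::Registry`,
`mock_s3::Status`, `mock_s3::GlobalOptions` and `InitializeForR` are all
hypothetical stand-ins that only mimic the "first initializer wins" behavior
this comment attributes to `InitializeS3()` / `EnsureS3Initialized()`. The real
functions may differ in detail.

```cpp
#include <cassert>
#include <mutex>
#include <string>

namespace mock_s3 {

// Hypothetical stand-in for fs::S3GlobalOptions (only the field we care about).
struct GlobalOptions {
  bool install_sigpipe_handler = false;
};

// Hypothetical stand-in for arrow::Status.
struct Status {
  bool ok_flag;
  std::string message;
  bool ok() const { return ok_flag; }
};

// Mimics process-wide S3 state: options are fixed by whoever initializes first.
class Registry {
 public:
  // Like InitializeS3(): fails, and ignores `options`, if already initialized.
  Status Initialize(const GlobalOptions& options) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (initialized_) return Status{false, "already initialized"};
    options_ = options;
    initialized_ = true;
    return Status{true, ""};
  }

  // Like EnsureS3Initialized(): initializes with defaults if nobody has yet.
  Status EnsureInitialized() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!initialized_) {
      options_ = GlobalOptions{};
      initialized_ = true;
    }
    return Status{true, ""};
  }

  bool IsInitialized() {
    std::lock_guard<std::mutex> lock(mutex_);
    return initialized_;
  }

  bool SigpipeHandlerInstalled() {
    std::lock_guard<std::mutex> lock(mutex_);
    return initialized_ && options_.install_sigpipe_handler;
  }

 private:
  std::mutex mutex_;
  bool initialized_ = false;
  GlobalOptions options_;
};

// Hypothetical shared helper: every R entry point would call this *before*
// any code that might fall back to the default initialization path.
inline Status InitializeForR(Registry& registry) {
  GlobalOptions options;
  options.install_sigpipe_handler = true;
  Status status = registry.Initialize(options);
  // Tolerate "already initialized"; the handler state is then whatever the
  // first initializer chose, which is why call order matters.
  if (!status.ok() && registry.IsInitialized()) return Status{true, ""};
  return status;
}

}  // namespace mock_s3
```

   In this sketch, calling `EnsureInitialized()` first and `InitializeForR()`
afterwards leaves `SigpipeHandlerInstalled()` false, while the reverse order
leaves it true. That is the reason for routing every S3 entry point through
one helper rather than patching a single constructor.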


