amoeba commented on PR #35398:
URL: https://github.com/apache/arrow/pull/35398#issuecomment-1535532893

   > ...would we still run into the same problem?
   
   I think we would still have this issue:
   
   ```r
   s3_bucket("my-buk", log_level = "fatal")
   # Doesn't do what the user expects because S3 is already init'd
   s3_bucket("my-buk", log_level = "debug")
   ```
   
   We could issue a helpful warning here, though we could do the same with the 
proposed `s3_init` function.
   
   @paleolimbot thank you for spitballing! I think I see where you're going, but I 
think we could avoid keeping state on the R side and just wrap 
[`arrow::fs::IsS3Initialized()`](https://github.com/apache/arrow/blob/main/cpp/src/arrow/filesystem/s3fs.cc#L2709).
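   
   A rough sketch of how that check could drive the warning, not a proposed 
implementation. `s3_is_initialized()` and `s3_init_once()` are hypothetical names; 
the former stands in for an R binding to `arrow::fs::IsS3Initialized()` and is 
simulated with a local flag here so the snippet runs without arrow or S3:
   
   ```r
   # Simulated stand-in for the C++ side's "is S3 initialized?" state.
   sim_state <- new.env()
   sim_state$initialized <- FALSE

   # Hypothetical binding: in arrow this would call arrow::fs::IsS3Initialized().
   s3_is_initialized <- function() sim_state$initialized

   # Hypothetical helper: initialize on first use; on later calls, warn if the
   # caller passed a log level, since it can no longer take effect.
   s3_init_once <- function(log_level = NULL) {
     if (s3_is_initialized()) {
       if (!is.null(log_level)) {
         warning(
           "S3 is already initialized; ignoring log_level = \"", log_level, "\".",
           call. = FALSE
         )
       }
       return(invisible(FALSE))
     }
     sim_state$initialized <- TRUE
     invisible(TRUE)
   }

   s3_init_once(log_level = "fatal") # first call initializes
   s3_init_once(log_level = "debug") # second call warns; nothing changes
   ```
   
   The only state consulted is the initialized flag, so nothing extra would need 
to be tracked on the R side.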
   
   I think the crux of the problem is that the AWS C++ SDK forces us to 
have immutable state, and IMO that isn't a common pattern in R. It might be 
possible to make this state mutable by re-initializing the SDK on the user's 
behalf when we detect we need to (e.g., the user changes the log level after 
`s3_bucket()` but before `bucket$ls()`), but that feels clunky and I'm not sure 
how well it'd work. I think shutting down the API (`arrow::fs::FinalizeS3()`) 
involves blocking until all SDK resources (e.g., threads) are freed.
   
   I might do a bit of searching around to see if there are other packages that 
deal with this kind of situation. If we ultimately don't find something we 
like, I'd be fine closing this PR and saving it for another time.

