[ 
https://issues.apache.org/jira/browse/ARROW-17173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607930#comment-17607930
 ] 

Dewey Dunnington commented on ARROW-17173:
------------------------------------------

Thanks to you both for explaining...it's an exciting feature and I hope we can 
get it to work!

I also wonder if it wouldn't be safer to warn (instead of error) when Python 
tries but fails to set up a {{StopSource}}. It sounds like Python is careful 
not to nest cancellable operations within the Python library; however, GDAL 
or an R package linking to the same .so might do this by accident (or for a 
good reason that we haven't considered). It sounds like this would be possible 
(the operation just wouldn't be cancellable from Python)?
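
To make the warn-instead-of-error idea concrete, here is a minimal sketch in 
plain Python. It is a toy model, not the Arrow or pyarrow API: the module-level 
slot stands in for the single process-wide signal stop source, and every name 
in it ({{set_signal_stop_source}}, {{reset_signal_stop_source}}, 
{{run_cancellable}}) is hypothetical. It assumes that a failed setup can be 
detected and that the operation can still run safely without a stop source.

```python
import warnings

# Toy stand-in for the single process-wide signal stop source.
# All names here are hypothetical; this is not the Arrow or pyarrow API.
_signal_stop_source = None


class StopSource:
    """Minimal flag object that an operation could poll for cancellation."""

    def __init__(self):
        self._stop_requested = False

    def request_stop(self):
        self._stop_requested = True

    def stop_requested(self):
        return self._stop_requested


def set_signal_stop_source():
    """Install the process-wide stop source, or raise if one already exists."""
    global _signal_stop_source
    if _signal_stop_source is not None:
        raise RuntimeError("StopSource already set up")
    _signal_stop_source = StopSource()
    return _signal_stop_source


def reset_signal_stop_source():
    """Release the process-wide stop source."""
    global _signal_stop_source
    _signal_stop_source = None


def run_cancellable(operation):
    """Run operation(source); if setup fails, warn and run it with None."""
    try:
        source = set_signal_stop_source()
    except RuntimeError as exc:
        # Another library already owns the stop source: keep going, but the
        # operation cannot be interrupted from here.
        warnings.warn(f"operation will not be cancellable: {exc}", RuntimeWarning)
        return operation(None)
    try:
        return operation(source)
    finally:
        # Only the caller that installed the stop source resets it.
        reset_signal_stop_source()
```

In this sketch the fallback path never touches the stop source that some 
other library installed, so the owner's lifecycle is left alone.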

> [C++] Clarify lifecycle of a StopSource/StopToken
> -------------------------------------------------
>
>                 Key: ARROW-17173
>                 URL: https://issues.apache.org/jira/browse/ARROW-17173
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Dewey Dunnington
>            Priority: Major
>
> In ARROW-11841 we ran into an issue where a single cancellable operation 
> (i.e., {{SetSignalStopSource()}}/{{ResetSignalStopSource()}}) was a poor fit: 
> the {{StopToken}} must be assigned to an {{IOContext}} when a filesystem is 
> created; however, the filesystem may be reused for more than one cancellable 
> operation (e.g., reading a CSV). Following the instructions in the current 
> API (in util/cancel.h) results in a situation where the lifecycle of the 
> filesystem must match the lifecycle of the {{StopSource}}, which can be 
> difficult to program around.
> A related problem arises when we load Python and R Arrow libraries that 
> link to the same .so. After ARROW-11841, R will have the ability to register 
> signal handlers to interrupt Arrow operations, and users that load pyarrow 
> via reticulate must be careful to disable it or they will get an error along 
> the lines of "StopSource already set up".
> From a purely R-centric point of view, we could provide our own {{StopToken}} 
> implementation if we were allowed to, since R already implements the proper 
> signal handler and the arrow R package implements the proper event loop to 
> make this thread safe. Currently the {{StopToken}} is passed by value and 
> thus a subclass is not an option. For R, anyway, this would eliminate any 
> need to consider the lifecycle of another object.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
