[
https://issues.apache.org/jira/browse/ARROW-17173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607930#comment-17607930
]
Dewey Dunnington commented on ARROW-17173:
------------------------------------------
Thanks to you both for explaining...it's an exciting feature and I hope we can
get it to work!
I also wonder whether it would be safer to warn (instead of erroring) when
Python tries but fails to set up a {{StopSource}}. It sounds like Python is
careful not to nest cancellable operations within the Python library; however,
GDAL or an R package linking to the same .so might do this by accident (or
with a good reason that we haven't considered). It sounds like this would be
possible (the operation just wouldn't be cancellable in Python)?
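To make the warn-instead-of-error idea concrete, here is a minimal toy sketch in plain Python. It does not use the real Arrow API: {{ToyStopSource}}, {{set_signal_stop_source()}}, {{reset_signal_stop_source()}} and {{cancellable_or_warn()}} are hypothetical names, and the module-level slot only stands in for the single process-wide signal stop source shared by everything linking to the same .so.

```python
import warnings
from contextlib import contextmanager

# Hypothetical stand-in for the single process-wide slot shared by every
# binding that links to the same .so; this is not the real Arrow API.
_signal_stop_source = None


class ToyStopSource:
    """Toy model of a stop source: a flag that an operation can poll."""

    def __init__(self):
        self.stop_requested = False

    def request_stop(self):
        self.stop_requested = True


def set_signal_stop_source():
    """Claim the shared slot, failing if another caller already holds it."""
    global _signal_stop_source
    if _signal_stop_source is not None:
        raise RuntimeError("StopSource already set up")
    _signal_stop_source = ToyStopSource()
    return _signal_stop_source


def reset_signal_stop_source():
    """Release the shared slot so the next operation can claim it."""
    global _signal_stop_source
    _signal_stop_source = None


@contextmanager
def cancellable_or_warn():
    """Yield a stop source if the slot is free; otherwise warn and yield None."""
    try:
        source = set_signal_stop_source()
    except RuntimeError as exc:
        # Someone else (another binding, a nested call) owns the slot:
        # run the operation anyway, just without cancellation.
        warnings.warn(f"operation will not be cancellable: {exc}", RuntimeWarning)
        source = None
    try:
        yield source
    finally:
        # Only release the slot if this call was the one that claimed it.
        if source is not None:
            reset_signal_stop_source()
```

With this shape, a nested {{with cancellable_or_warn():}} receives {{None}} and a {{RuntimeWarning}} rather than failing, and the outer owner's slot is left untouched when the inner block exits.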
> [C++] Clarify lifecycle of a StopSource/StopToken
> -------------------------------------------------
>
> Key: ARROW-17173
> URL: https://issues.apache.org/jira/browse/ARROW-17173
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Dewey Dunnington
> Priority: Major
>
> In ARROW-11841 we ran into an issue where a single cancellable operation
> (i.e., {{SetSignalStopSource()}}/{{ResetSignalStopSource()}}) was a poor fit:
> the {{StopToken}} must be assigned to an {{IOContext}} when a filesystem is
> created; however, the filesystem may be reused for more than one cancellable
> operation (e.g., reading a CSV). Following the instructions in the current
> API (in util/cancel.h) results in a situation where the lifecycle of the
> filesystem must match the lifecycle of the {{StopSource}}, which can be
> difficult to program around.
> A related problem arises when we load Python and R Arrow libraries that
> link to the same .so. After ARROW-11841, R will have the ability to register
> signal handlers to interrupt Arrow operations, and users who load pyarrow
> via reticulate must be careful to disable it or they will get an error along
> the lines of "StopSource already set up".
> From a purely R-centric point of view, we could provide our own {{StopToken}}
> implementation if we were allowed to, since R already implements the proper
> signal handler and the arrow R package implements the proper event loop to
> make this thread safe. Currently the {{StopToken}} is passed by value and
> thus a subclass is not an option. For R, anyway, this would eliminate any
> need to consider the lifecycle of another object.
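The lifecycle mismatch described in the issue can be illustrated with a small toy model in plain Python. This is a sketch under assumptions, not the Arrow implementation: {{ToySource}}, {{ToyToken}}, {{ToyFileSystem}} and {{Cancelled}} are hypothetical names, and the only behaviour modelled is that a token stays tied to the source that produced it and that the filesystem captures its token once, at creation.

```python
class Cancelled(Exception):
    """Raised by the toy filesystem when its token reports a stop request."""


class ToySource:
    """Toy stop source: owns the flag that its tokens observe."""

    def __init__(self):
        self._stopped = False

    def request_stop(self):
        self._stopped = True

    def token(self):
        return ToyToken(self)


class ToyToken:
    """Toy stop token: a read-only view tied to exactly one source."""

    def __init__(self, source):
        self._source = source

    def is_stop_requested(self):
        return self._source._stopped


class ToyFileSystem:
    """Toy filesystem whose token is fixed when the filesystem is created."""

    def __init__(self, token):
        self.token = token

    def read_chunks(self, n_chunks):
        """Pretend to read n_chunks, polling the token before each chunk."""
        done = 0
        for _ in range(n_chunks):
            if self.token.is_stop_requested():
                raise Cancelled(f"stopped after {done} chunk(s)")
            done += 1
        return done
```

In this model, a filesystem built with {{first.token()}} keeps raising {{Cancelled}} after {{first.request_stop()}}, even if a fresh {{ToySource}} is created for the next operation; only a new filesystem built from the new source's token can read again. That is the sense in which the filesystem's lifecycle ends up tied to the source's.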
--
This message was sent by Atlassian Jira
(v8.20.10#820010)