[
https://issues.apache.org/jira/browse/ARROW-17173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607714#comment-17607714
]
David Li commented on ARROW-17173:
----------------------------------
I don't think you can avoid plumbing the StopToken around, either explicitly
as we do now, or implicitly by maintaining some sort of thread-local or
task-local state (which we would have to be careful to propagate, save, and
restore). I like this series of posts, which discusses the same issue in
Python: https://vorpus.org/blog/timeouts-and-cancellation-for-humans/ but
there's no easy answer there (Trio does a lot of work to maintain the
task-local state to make cancellation work).
I also agree that, at least for the immediate issue, filesystems shouldn't be
getting a StopToken from the IOContext; rather, each individual operation
should take its own StopToken.
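
To make the per-operation idea concrete, here is a minimal sketch. It does not
use Arrow's real API: the StopSource and StopToken classes below are simplified
stand-ins written for this example, and FakeFileSystem and ReadChunks are
hypothetical names. The only point is that the long-lived filesystem stores no
token, so its lifetime is decoupled from any one StopSource.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <utility>

// Simplified stand-ins for illustration only; NOT Arrow's util/cancel.h API.
class StopToken {
 public:
  StopToken() = default;  // a default-constructed token can never be stopped
  explicit StopToken(std::shared_ptr<std::atomic<bool>> flag)
      : flag_(std::move(flag)) {}
  bool IsStopRequested() const { return flag_ && flag_->load(); }

 private:
  std::shared_ptr<std::atomic<bool>> flag_;
};

class StopSource {
 public:
  StopSource() : flag_(std::make_shared<std::atomic<bool>>(false)) {}
  void RequestStop() { flag_->store(true); }
  StopToken token() const { return StopToken(flag_); }

 private:
  std::shared_ptr<std::atomic<bool>> flag_;
};

// Hypothetical long-lived filesystem: it holds no StopToken of its own.
class FakeFileSystem {
 public:
  // Each call receives the token for that one operation. Returns the number
  // of chunks "read" before the work finished or a stop was requested.
  std::size_t ReadChunks(std::size_t n_chunks, const StopToken& token) const {
    std::size_t done = 0;
    while (done < n_chunks && !token.IsStopRequested()) {
      ++done;  // stand-in for reading one chunk
    }
    return done;
  }
};
```

With this shape, one FakeFileSystem can serve many cancellable operations:
each caller creates a StopSource, passes source.token() to the call, and lets
the source go out of scope afterwards. Cancelling one operation does not
affect later calls on the same filesystem.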
> [C++] Clarify lifecycle of a StopSource/StopToken
> -------------------------------------------------
>
> Key: ARROW-17173
> URL: https://issues.apache.org/jira/browse/ARROW-17173
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Dewey Dunnington
> Priority: Major
>
> In ARROW-11841 we ran into an issue where a single cancellable operation
> (i.e., {{SetSignalStopSource()}}/{{ResetSignalStopSource()}}) was a poor fit:
> the {{StopToken}} must be assigned to an {{IOContext}} when a filesystem is
> created; however, the filesystem may be reused for more than one cancellable
> operation (e.g., reading a CSV). Following the instructions in the current
> API (in util/cancel.h) results in a situation where the lifecycle of the
> filesystem must match the lifecycle of the {{StopSource}}, which can be
> difficult to program around.
> A related problem arises when we load Python and R Arrow libraries that
> link to the same .so. After ARROW-11841, R will have the ability to register
> signal handlers to interrupt Arrow operations, and users that load pyarrow
> via reticulate must be careful to disable it or they will get an error along
> the lines of "StopSource already set up".
> From a purely R-centric point of view, we could provide our own {{StopToken}}
> implementation if we were allowed to, since R already implements the proper
> signal handler and the arrow R package implements the proper event loop to
> make this thread safe. Currently the {{StopToken}} is passed by value and
> thus a subclass is not an option. For R, anyway, this would eliminate any
> need to consider the lifecycle of another object.