paleolimbot commented on PR #14250: URL: https://github.com/apache/arrow/pull/14250#issuecomment-1275475932
I'll investigate more when I'm back from PTO on Monday, but I did a quick check and (1) cancellation still works interactively and (2) this does seem to break forked process behaviour when using R + arrow:

<details>

``` r
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.

tf <- tempfile()
readr::write_csv(vctrs::vec_rep(mtcars, 5e5), tf)

# try to slow down CSV reading
set_cpu_count(1)
set_io_thread_count(2)

# make sure we can cancel
read_csv_arrow(tf)
#> # A tibble: 16,000,000 × 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <int> <dbl> <int> <dbl> <dbl> <dbl> <int> <int> <int> <int>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> # … with 15,999,990 more rows

# see if arrow works in a forked R process
parallel::mclapply(1:2, function(...) read_csv_arrow(tf))
#> Warning in parallel::mclapply(1:2, function(...) read_csv_arrow(tf)): scheduled
#> cores 1, 2 did not deliver results, all values of the jobs will be affected
#> [[1]]
#> NULL
#> 
#> [[2]]
#> NULL
```

<sup>Created on 2022-10-11 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>

</details>

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
