paleolimbot commented on PR #13635:
URL: https://github.com/apache/arrow/pull/13635#issuecomment-1249400470

   Reprex for testing this, since cancellation can really only be tested interactively:
   
   ```r
   library(arrow, warn.conflicts = FALSE)
   
   tf <- tempfile()
   readr::write_csv(vctrs::vec_rep(mtcars, 5e5), tf)
   
   # try to slow down CSV reading
   set_cpu_count(1)
   set_io_thread_count(2)
   
   # hit Control-C while this line runs!
   # (for me this takes about 3 seconds to run without cancelling)
   system.time(read_csv_arrow(tf))
   
   # ExecPlans don't cancel as snappily as CSV reading because the stop token
   # is only checked at the end of the plan (i.e., we have to wait for a batch
   # to be ready before the check happens). To observe meaningful cancellation
   # we need a bunch of files in a dataset.
   even_more_files <- purrr::map_chr(1:10, function(i) {
     another_tf_copy <- tempfile()
     file.copy(tf, another_tf_copy)
     another_tf_copy
   })
   
   # hit Control-C while this line runs!
   # (for me this takes about 30 seconds to run without cancelling,
   # but with hitting the cancel button I can get it down to 10 seconds)
   system.time(
     open_dataset(c(tf, even_more_files), format = "csv") %>%
       dplyr::collect()
   )
   ```
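
   The reprex above leaves the session with lowered thread counts. A possible way to scope that change, sketched here under assumptions: `with_thread_counts()` is a hypothetical helper (not part of arrow) that relies on arrow's `cpu_count()` / `io_thread_count()` getters and their setters, and on `on.exit()` handlers running when the call is interrupted.

   ```r
   library(arrow, warn.conflicts = FALSE)

   # Hypothetical helper (not part of arrow): evaluate `expr` with lowered
   # thread counts, then restore whatever was configured before.
   with_thread_counts <- function(expr, cpu = 1, io = 2) {
     old_cpu <- cpu_count()
     old_io <- io_thread_count()
     # on.exit() handlers also run if the call is interrupted (Control-C)
     on.exit({
       set_cpu_count(old_cpu)
       set_io_thread_count(old_io)
     }, add = TRUE)
     set_cpu_count(cpu)
     set_io_thread_count(io)
     # `expr` is a lazy promise, so it is only evaluated here,
     # after the counts have been lowered
     force(expr)
   }
   ```

   With that in place, the first timing line could be written as `system.time(with_thread_counts(read_csv_arrow(tf)))`, and the previous settings should come back whether or not the read is cancelled.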


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
