[
https://issues.apache.org/jira/browse/ARROW-9903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190359#comment-17190359
]
Sean Clement commented on ARROW-9903:
-------------------------------------
To clarify what I mean: the freeze might occur at file 1, 8, or 17, with no
error having occurred while processing the previous files. It simply stops
proceeding, without any error message, for an indefinite amount of time when
the dataset is called to produce the data frame needed.
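
A minimal sketch of the kind of loop described, with a progress message before each step so that the last line printed shows which file and which call (opening the dataset or collecting it into a data frame) stalls. The temporary directory, the file names, and the stand-in data are hypothetical, and calling open_dataset() once per file is an assumption about the workflow in the report, not a confirmed detail.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in data: write a few small Feather files to a temp directory.
feather_dir <- file.path(tempdir(), "feather_repro")
dir.create(feather_dir, showWarnings = FALSE)
for (i in 1:3) {
  write_feather(
    data.frame(id = i, x = c(0.1, 0.2, 0.3)),
    file.path(feather_dir, sprintf("part_%02d.feather", i))
  )
}

files <- list.files(feather_dir, pattern = "\\.feather$", full.names = TRUE)

row_counts <- integer(length(files))
for (i in seq_along(files)) {
  # Log before each step so the last message identifies where a hang occurs.
  message(sprintf("[%s] opening file %d of %d: %s",
                  format(Sys.time(), "%H:%M:%S"), i, length(files),
                  basename(files[i])))
  ds <- open_dataset(files[i], format = "feather")

  message("  collecting into a data frame")
  df <- collect(ds)
  row_counts[i] <- nrow(df)
}
```

If the loop stalls, the last message shows whether the hang is in open_dataset() or in collect(), and at which file index.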
> [R] open_dataset freezes opening feather files
> ----------------------------------------------
>
> Key: ARROW-9903
> URL: https://issues.apache.org/jira/browse/ARROW-9903
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Environment: Rstudio
> Reporter: Sean Clement
> Priority: Major
>
> Session info:
> {code:java}
> R version 4.0.2 (2020-06-22)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 19041)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] forcats_0.5.0 stringr_1.4.0 dplyr_1.0.1 purrr_0.3.4 readr_1.3.1 tidyr_1.1.1
> [7] tibble_3.0.3 ggplot2_3.3.2 tidyverse_1.3.0 arrow_1.0.1
>
> loaded via a namespace (and not attached):
> [1] Rcpp_1.0.5 cellranger_1.1.0 pillar_1.4.6 compiler_4.0.2 dbplyr_1.4.4 tools_4.0.2
> [7] bit_1.1-15.2 lubridate_1.7.9 jsonlite_1.7.0 lifecycle_0.2.0 gtable_0.3.0 pkgconfig_2.0.3
> [13] rlang_0.4.7 reprex_0.3.0 cli_2.0.2 DBI_1.1.0 rstudioapi_0.11 haven_2.3.1
> [19] withr_2.2.0 xml2_1.3.2 httr_1.4.2 fs_1.4.1 generics_0.0.2 vctrs_0.3.2
> [25] hms_0.5.3 bit64_0.9-7 grid_4.0.2 tidyselect_1.1.0 glue_1.4.1 R6_2.4.1
> [31] fansi_0.4.1 readxl_1.3.1 modelr_0.1.8 blob_1.2.1 magrittr_1.5 backports_1.1.7
> [37] scales_1.1.1 ellipsis_0.3.1 rvest_0.3.5 assertthat_0.2.1 colorspace_1.4-1 stringi_1.4.6
> [43] munsell_0.5.0 broom_0.7.0 crayon_1.3.4
> {code}
> While cycling through and processing files using open_dataset(..., format =
> "feather") in R, the function hangs randomly and will not proceed to the next
> file. The freeze does not occur at the same file each time. Additionally,
> the same function occasionally freezes when used just once.
> When open_dataset hangs, the only way to get R to stop is to use Task Manager,
> as RStudio becomes totally unresponsive.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)