jsicherman opened a new issue, #43241:
URL: https://github.com/apache/arrow/issues/43241

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I have some code that, in a loop, reads a Parquet file, does a bit of 
processing, and returns the result. It sporadically crashes my RStudio 
instance. The R code looks something like this:
   
   ```
   library(arrow)
   library(dplyr)
   library(data.table)

   foo <- lapply(1:1000, function(i) {
       # Read as an Arrow Table (not a data.frame) so the dplyr verbs run in Arrow
       read_parquet(
           file.path('data', i, 'bar.parquet'),
           as_data_frame = FALSE
       ) %>%
       select(A, B, C, D) %>%
       mutate(E = D + 1) %>%
       left_join(tibble(A = 1:3, F = 6:8), by = 'A') %>%
       group_by(A, B) %>%
       summarise(
           G = n()
       ) %>% collect %>% as.data.table
   }) %>% rbindlist
   ```
   
   It eventually crashes, seemingly at random, either early or late in the 
loop. The following is logged under `systemctl status rstudio.service -l`:
   
   ```
   ERROR The previous R session terminated abnormally; LOGGED FROM: 
rstudio::core::Error {anonymous}::rInit(const rstudio::r::session::RInitInfo&) 
src/cpp/session/SessionMain.cpp:725
   ```
   
   ```
   R version 4.2.1 (2022-06-23)
   Platform: x86_64-conda-linux-gnu (64-bit)
   Running under: Amazon Linux 2
   
   > package.version('arrow')
   [1] "9.0.0"
   ```
   
   ### Component(s)
   
   R


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

