[ 
https://issues.apache.org/jira/browse/ARROW-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van den Bossche updated ARROW-15729:
------------------------------------------
    Fix Version/s:     (was: 6.0.1)

> [R] Reading large files randomly freezes
> ----------------------------------------
>
>                 Key: ARROW-15729
>                 URL: https://issues.apache.org/jira/browse/ARROW-15729
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>            Reporter: Christian
>            Priority: Critical
>
> Hi -
> I recently upgraded to Arrow 6.0.1 and am using it in R.
> When I read a large file (~10 GB) on Windows, R intermittently freezes. 
> I can see memory being allocated in the first 10-20 seconds, but then 
> nothing happens and R stops responding (the R process becomes idle too).
> I'm using the option options(arrow.use_threads=FALSE).
> I didn't have this issue with the version I was using previously (0.15.1), 
> and the file reads fine under Linux.
> I would post a reproducible example but it happens randomly. I even thought I 
> would just read large files in pieces by first getting all the distinct 
> sections of a specific column (with compute>collect) but that hangs too.
> Any ideas would be appreciated.
> *Edit*
> Not sure if this makes sense to anyone, but after a few tries it seems that 
> the issue only happens in RStudio. In the R console the file loads fine. 
> All I'm executing is the below.
> options(arrow.use_threads=FALSE)
> aa <- arrow::read_arrow('.../file.arrow5')
> One thing I want to point out is that the underlying Rscript process under 
> RStudio definitely seems to use more than one core when executing the above.
> *Edit2*
> Using arrow::set_cpu_count(1) seems to solve the issue.
>  
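
For reference, the workaround from Edit2 can be sketched as below. This is a minimal sketch, not a confirmed fix: the temporary file and its contents are hypothetical stand-ins for the reporter's large file, and read_feather() is used here on the assumption that the file is in Arrow IPC (Feather V2) format.

```r
library(arrow)

# Workaround reported in Edit2: limit Arrow's CPU thread pool to one thread.
set_cpu_count(1)
# Option the reporter was already setting.
options(arrow.use_threads = FALSE)

# Hypothetical stand-in for the large file: a tiny Arrow IPC file in a temp dir.
path <- tempfile(fileext = ".arrow")
write_feather(data.frame(x = 1:5, y = letters[1:5]), path)

# Read the file back with the single-thread settings in effect.
aa <- read_feather(path)

# Confirm the thread pool size and that the data was read.
stopifnot(cpu_count() == 1L, nrow(aa) == 5L)
```

Calling set_cpu_count(1) before the read matters because, per the report, setting options(arrow.use_threads=FALSE) alone did not prevent the freeze.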


