Navid created ARROW-12700:
-----------------------------

             Summary: Read/Write_feather stuck forever after bad write, R, Win32
                 Key: ARROW-12700
                 URL: https://issues.apache.org/jira/browse/ARROW-12700
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 4.0.0
         Environment: RStudio (latest version), Windows 10
            Reporter: Navid


I currently have to switch R kernels to be able to work with feather files due 
to the following bug:

 

I imported 11 million rows from a CSV file using data.table. I then tried to 
write them to a file with write_feather, which hung without making any visible 
progress. I killed RStudio and then saw the 0-byte feather file grow by a 
couple of MB; I repeated this and it seems that the write is working, just 
very, very slowly. I gave up on writing the feather file, but something seems 
to have happened to wherever arrow caches its files, because read_feather now 
hangs as well (it appears stuck in an endless loop with no progress being 
made).
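
One way to check whether an interrupted write left a truncated file behind is 
to look at the magic bytes at both ends of the file. The following is only a 
minimal sketch in Python, not something I ran as part of this report. It 
assumes the file is either Feather V2 (the Arrow IPC file format, which starts 
and ends with "ARROW1") or Feather V1 (which starts and ends with "FEA1"). The 
function name looks_complete is hypothetical, and a passing check only means 
the writer reached the footer, not that the contents are valid.

```python
import os

# Magic markers (assumption: the file is one of these two formats).
# Feather V2 is the Arrow IPC file format ("ARROW1"); Feather V1 uses "FEA1".
# A complete file carries the same marker at its start and at its end.
MAGICS = (b"ARROW1", b"FEA1")


def looks_complete(path):
    """Return True if the file starts and ends with the same known marker.

    A 0-byte or partially written file fails this check because the
    trailing marker is only written when the writer finishes.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        for magic in MAGICS:
            # Too small to hold both the leading and trailing marker.
            if size < 2 * len(magic):
                continue
            f.seek(0)
            head = f.read(len(magic))
            f.seek(size - len(magic))
            tail = f.read(len(magic))
            if head == magic and tail == magic:
                return True
    return False
```

Calling looks_complete() on the file left behind after killing RStudio would 
be expected to return False if the write never finished.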

I was able to work around this by switching the R kernel from 4.0.4 to 4.0.3, 
which allowed read_feather to work again. After I accidentally executed 
write_feather again and it got stuck in the same loop, read_feather stopped 
working once more.


My theory is that whatever bug is occurring in write_feather creates some file 
on disk that blocks the rest of the feather code from working, even after 
restarting the computer and cleaning all temporary files. The only solution 
seems to be to change R kernels. I have so far not been able to 'reset' the 
kernel, and removing and reinstalling the arrow package does not seem to solve 
it.

 

Importing the CSV file and writing it to a feather file in Python did not 
cause the same problem; the file was written in seconds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
