Navid created ARROW-12700:
-----------------------------
Summary: Read/Write_feather stuck forever after bad write, R, Win32
Key: ARROW-12700
URL: https://issues.apache.org/jira/browse/ARROW-12700
Project: Apache Arrow
Issue Type: Bug
Components: R
Affects Versions: 4.0.0
Environment: RStudio (latest version), Windows 10
Reporter: Navid
I currently have to switch R kernels to be able to work with feather files due
to the following bug:
I imported 11 million rows from a csv file using data.table. I then called
write_feather to write them to a file, which hung indefinitely without making
any progress. I killed RStudio, then saw the 0-byte feather file grow by a
couple of MB; I repeated this process, and it seems to be producing the file,
just very, very slowly. I gave up on writing the feather file, but something
happened to wherever it caches its files, because the arrow package no longer
lets read_feather run (it hangs indefinitely with no progress being made).
I was able to work around this by switching the R kernel from 4.0.4 to 4.0.3,
which allowed read_feather to work again. After I accidentally ran
write_feather again and it hung as before, read_feather stopped working once
more.
My theory is that whatever bug is occurring in write_feather creates some
physical file that blocks the rest of the feather code from working, even
after restarting the computer and cleaning all temporary files. The only
solution seems to be to change R kernels; so far I have not been able to
'reset' the kernel, and removing and reinstalling the arrow package does not
seem to solve it.
Importing the CSV file and writing it in Python did not cause the same
problem; the file was written in seconds.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)