eitsupi commented on issue #34291:
URL: https://github.com/apache/arrow/issues/34291#issuecomment-1441794146
> So far, I could not save the data into another Arrow format to facilitate
> further analyses. I tried to write the dataset into a parquet or feather
> objects but RStudio always crashes because of memory issue. I have 8 core
> computer and 32 GB ram. Is there a way to write the data into e.g. feather
> format in a more efficient way without crashing?

Since you seem to be able to read all the data from the CSV as a data frame,
how about setting `as_data_frame = FALSE` to read it as an Arrow Table instead?
I think that will use less memory.
For example, we can convert to an Arrow IPC file (Feather V2) dataset
without going through a data frame as follows.
```r
arrow::read_delim_arrow(
  "https://github.com/apache/arrow/files/10804095/Arrow_parse_Example4.txt",
  delim = "\t",
  quote = "",
  as_data_frame = FALSE
) |>
  arrow::write_dataset("test", format = "arrow")
```
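
Once the dataset is written, it can probably be queried without loading everything into R. The following is only a minimal sketch of that idea: the small tab-separated file, the temporary paths, and the `id`/`value` columns are hypothetical stand-ins for the real data, and it assumes dplyr is installed alongside arrow.

```r
library(arrow)
library(dplyr)

# Hypothetical small tab-separated file standing in for the real data.
tsv_path <- tempfile(fileext = ".txt")
writeLines(c("id\tvalue", "1\ta", "2\tb", "3\tc"), tsv_path)

# Hypothetical output directory for the Arrow IPC (Feather V2) dataset.
out_dir <- tempfile("arrow_dataset_")

# Read as an Arrow Table (not a data frame) and write it out as a dataset.
read_delim_arrow(
  tsv_path,
  delim = "\t",
  quote = "",
  as_data_frame = FALSE
) |>
  write_dataset(out_dir, format = "arrow")

# Open the dataset lazily; no rows are pulled into R at this point.
ds <- open_dataset(out_dir, format = "arrow")

# Only the rows that pass the filter are materialized by collect().
result <- ds |>
  filter(id >= 2) |>
  collect()
```

With this approach, only the result of `collect()` becomes an R data frame, so filtering or selecting columns first should keep memory use down.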