ZekeMarshall commented on issue #34291:
URL: https://github.com/apache/arrow/issues/34291#issuecomment-2597454269

   > > So far, I could not save the data into another Arrow format to
   > > facilitate further analyses. I tried to write the dataset into a parquet or
   > > feather objects but RStudio always crashes because of memory issue. I have 8
   > > core computer and 32 GB ram. Is there a way to write the data into e.g. feather
   > > format in a more efficient way without crashing?
   > 
   > Since you seem to be able to read all the data from CSV as a data frame,
   > how about setting `as_data_frame = FALSE` to read as Arrow Table? I think it
   > will work with less memory.
   > 
   > For example, we can convert to an Arrow IPC file (Feather V2) dataset
   > without going through a data frame as follows.
   > 
   > arrow::read_delim_arrow(
   >   "https://github.com/apache/arrow/files/10804095/Arrow_parse_Example4.txt",
   >   delim = "\t",
   >   quote = "",
   >   as_data_frame = FALSE
   > ) |>
   >   arrow::write_dataset("test", format = "arrow")
   
   Hi @eitsupi and all,
   
   The above solution, in which the argument `delim = "\t"` is added, worked
   for me when I hit an almost identical error to the one @elgabbas reported.
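
For anyone who wants to try the same pattern without the original attachment, here is a minimal, self-contained sketch. The tiny tab-separated file, its column names, and the temporary paths are made up for illustration and are not from the issue; only the `delim = "\t"` and `as_data_frame = FALSE` arguments and the `write_dataset(..., format = "arrow")` call follow the quoted suggestion.

```r
library(arrow)

# Hypothetical tiny tab-separated file standing in for the real data.
tsv_path <- tempfile(fileext = ".txt")
writeLines(c("id\tname", "1\talpha", "2\tbeta"), tsv_path)

# Read as an Arrow Table instead of an R data frame.
tbl <- read_delim_arrow(tsv_path, delim = "\t", as_data_frame = FALSE)

# Write the Table out as an Arrow IPC (Feather V2) dataset.
out_dir <- tempfile("arrow_ds_")
write_dataset(tbl, out_dir, format = "arrow")

# Reopen the dataset to check that the rows and columns survived.
ds <- open_dataset(out_dir, format = "arrow")
n_rows <- nrow(ds)
col_names <- names(ds)
```

Reopening with `open_dataset()` is only a sanity check here; whether this approach avoids the memory problem on a much larger file would depend on the data and the machine.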
   
   Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
