elgabbas commented on issue #34291:
URL: https://github.com/apache/arrow/issues/34291#issuecomment-1442191856
Thanks @eitsupi ... This did not help in my case. Loading the data consumed
too much memory and crashed my PC.
One possible solution is to loop through the values of one of the columns,
filter the data on each value, and save the result to disk manually, one
file per value. I will try this and see.
```
library(arrow)
library(dplyr)  # needed for the %>% pipe

arrow::open_dataset(sources = Path, format = "csv", delim = "\t", quote = "") %>%
  # some filtering with dplyr verbs, e.g. filter(), goes here %>%
  arrow::write_dataset(
    path = OutPath,
    max_open_files = 100L,
    max_rows_per_file = 1000L,
    format = "arrow"
  )
```
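For reference, a minimal sketch of that loop, assuming a hypothetical
splitting column named `group_col` (`Path` and `OutPath` are as above):
```
library(arrow)
library(dplyr)

ds <- arrow::open_dataset(sources = Path, format = "csv", delim = "\t", quote = "")

# Read only the splitting column to get its distinct values;
# selecting a single column keeps the memory footprint small.
vals <- ds %>%
  select(group_col) %>%
  collect() %>%
  distinct() %>%
  pull(group_col)

# Write one Arrow dataset per value, so only one filtered subset
# is processed at a time instead of the whole table.
for (v in vals) {
  ds %>%
    filter(group_col == v) %>%
    arrow::write_dataset(
      path = file.path(OutPath, paste0("group_col=", v)),
      format = "arrow"
    )
}
```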