Re: [I] [R] Arrow write_parquet() breaks data.table ability to set columns by reference [arrow]

via GitHub Wed, 22 Jan 2025 15:49:22 -0800


nicki-dese commented on issue #45300:
URL: https://github.com/apache/arrow/issues/45300#issuecomment-2608496717


   Thank you @amoeba, both for your info and for the offer of a chat. I have
explored arrow a lot, and more recently duckdb. For the majority of our work,
targets plus data.table, with interim outputs saved as parquet via targets, has
worked really well. We often start our targets pipeline with
arrow::open_dataset to filter our data before bringing it into memory, which
has been a game changer. However, open_dataset's schema inference from CSVs is
much worse than both fread's and duckdb's, which has stopped us from adopting
it wholeheartedly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

