Zsolt Kegyes-Brassai created ARROW-12036:
--------------------------------------------
Summary: [R] dataset by a single parquet file
Key: ARROW-12036
URL: https://issues.apache.org/jira/browse/ARROW-12036
Project: Apache Arrow
Issue Type: Wish
Reporter: Zsolt Kegyes-Brassai
I like using {{dplyr}} in conjunction with
[datasets|https://arrow.apache.org/docs/r/articles/dataset.html]; it results in
clean code.
There are times when I would like to use the same workflow for a single
(larger) parquet file, and in most of those cases it doesn’t make sense to
create a separate folder for just one file.
({{read_parquet()}} only provides options for selecting columns; it offers no
filtering or grouping.)
Is it possible, and does it make sense, to extend {{open_dataset()}} with an
option to specify just a single file?
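
A minimal sketch of the workflow I have in mind, assuming {{open_dataset()}} can be pointed directly at a file path (this may depend on the installed arrow version). The temporary file and the {{mtcars}} data are only placeholders for a real, larger parquet file.

```r
library(arrow)
library(dplyr)

# Placeholder data: write a small data frame to a single parquet file
path <- tempfile(fileext = ".parquet")
write_parquet(mtcars, path)

# Assumption: the file path is passed directly instead of a folder
ds <- open_dataset(path)

# The usual dataset workflow: filter, select, group and summarise,
# then pull the result into R with collect()
result <- ds %>%
  filter(cyl > 4) %>%
  select(cyl, mpg) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  collect()

print(result)
```

With this, no separate folder would be needed just to get filtering and grouping on one file.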
--
This message was sent by Atlassian Jira
(v8.3.4#803005)