Zsolt Kegyes-Brassai created ARROW-12036:
--------------------------------------------

             Summary: [R] dataset by a single parquet file
                 Key: ARROW-12036
                 URL: https://issues.apache.org/jira/browse/ARROW-12036
             Project: Apache Arrow
          Issue Type: Wish
            Reporter: Zsolt Kegyes-Brassai


I like using {{dplyr}} in conjunction with 
[datasets|https://arrow.apache.org/docs/r/articles/dataset.html]; it results in 
clean code.

There are times when I would like to use the same workflow for a single 
(larger) parquet file, and in most of those cases it doesn’t make sense to 
create a separate folder for just one file.

({{read_parquet()}} only provides an option for selecting columns; it offers 
no filtering or grouping.)

Is it possible, and does it make sense, to extend {{open_dataset()}} with an 
option to specify just a single file?


