[ 
https://issues.apache.org/jira/browse/ARROW-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517044#comment-17517044
 ] 

Weston Pace commented on ARROW-15260:
-------------------------------------

Oh, also, one might think that this query would push the filter down:

{noformat}
  filter = Expression$create(
    "match_substring",
    Expression$field_ref("__filename"),
    options = list(pattern = "cyl=8")
  )
{noformat}

In other words, you might think we would get the hint and only read files 
matching that pattern.  This is not the case.  We will read the entire dataset 
and apply the "cyl=8" filter in memory.

If we want to push down filters on the filename column, we will need to add 
some special logic.  Feel free to create a JIRA.
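
Until such logic exists, one possible workaround is to pick the files yourself 
and open only those.  The following is a minimal sketch, not a built-in 
feature: the mtcars data, the temporary directory, and the variable names are 
illustrative assumptions.  It relies only on ds$files (used in the issue 
description) and on open_dataset() accepting a character vector of file paths.

```r
library(arrow)
library(dplyr)

# Illustrative data: write mtcars as a hive-partitioned Parquet dataset
# (directories named cyl=4, cyl=6, cyl=8) in a temporary directory.
dataset_dir <- tempfile("mtcars_ds")
write_dataset(mtcars, dataset_dir, partitioning = "cyl")

ds <- open_dataset(dataset_dir)

# Select the files by name ourselves, because a filter on the filename
# column is not pushed down to the file listing.
wanted <- ds$files[grepl("cyl=8", ds$files, fixed = TRUE)]

# Open only the selected files. No partitioning is supplied here, so the
# cyl column (stored only in the directory names) is absent from the result.
ds_cyl8 <- open_dataset(wanted, format = "parquet")
result <- collect(ds_cyl8)
```

With this approach only the matching files are scanned, at the cost of 
maintaining the file selection outside the query.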

> [R] open_dataset - add file_name as column
> ------------------------------------------
>
>                 Key: ARROW-15260
>                 URL: https://issues.apache.org/jira/browse/ARROW-15260
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: R
>            Reporter: Martin du Toit
>            Priority: Minor
>
> Hi. Is it possible to add the file_name as a column to a dataset?
> {code:r}
> ds <- open_dataset(.....)
> list_of_files <- ds$files
> {code}
> This works, but I need the file_name as a column.
> Thanks
>  
