[ 
https://issues.apache.org/jira/browse/ARROW-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516837#comment-17516837
 ] 

Neal Richardson commented on ARROW-15260:
-----------------------------------------

> although C++ gives us errors if we try to insert a field reference to 
> {{__filename}} in an {{Expression}}

This is tricky because we (and apparently the C++ code elsewhere) have logic 
that filters out secret internal columns such as this one: 
https://github.com/apache/arrow/blob/master/r/R/query-engine.R#L159-L163. 
Sounds like we need to find a safe way to loosen that filter, or otherwise 
rethink the implementation.

In terms of UX in R, a special helper like {{add_filenames <- function() 
Expression$field_ref("__filename")}} that you could call like {{mutate(ds, 
file_col = add_filenames())}} might be a reasonable interface to this. 
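
Until something like that exists, here is a rough workaround sketch that stays 
outside the query engine and builds on the {{ds$files}} listing from the issue 
description: read each file separately and tag its rows with the path. The 
temporary dataset and the {{read_with_name}} helper are made up for 
illustration. It assumes Parquet files, pulls everything into memory, and loses 
hive partition columns such as {{cyl}}, since those live in the path rather 
than in the file.

{code:r}
library(arrow)
library(dplyr)

# Hypothetical example data: a small partitioned Parquet dataset in a temp dir
tmp <- tempfile()
write_dataset(mtcars, tmp, partitioning = "cyl")

ds <- open_dataset(tmp)

# Illustrative helper (not part of arrow): read one file, tag rows with its path
read_with_name <- function(f) {
  read_parquet(f) %>% mutate(file_name = f)
}

# Bypasses the Dataset scanner entirely, so no lazy evaluation or pushdown
df <- bind_rows(lapply(ds$files, read_with_name))
{code}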

> [R] open_dataset - add file_name as column
> ------------------------------------------
>
>                 Key: ARROW-15260
>                 URL: https://issues.apache.org/jira/browse/ARROW-15260
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: R
>            Reporter: Martin du Toit
>            Priority: Minor
>
> Hi. Is it possible to add the file_name as a column to a dataset?
> {code:r}
> ds <- open_dataset(.....)
> list_of_files <- ds$files
> {code}
> This works, but I need the file_name as a column.
> Thanks
>  



