nealrichardson opened a new pull request, #33770:
URL: https://github.com/apache/arrow/pull/33770

   ### Rationale for this change
   
   Follow-up to https://github.com/apache/arrow/pull/19706/files#r1073391100, 
with the goal of deleting and simplifying some code. 
   
   Unfortunately, it does not work with nested field refs:
   
   ```
   ── 1. Error (test-dplyr-query.R:745): Can use nested field refs ────────────────────────────────────
   Error in `compute.arrow_dplyr_query(x)`: Invalid: No projected schema was supplied and we could not infer the projected schema from the projection expression.
   /Users/enpiar/Documents/ursa/arrow/cpp/src/arrow/dataset/scanner.cc:201  GetProjectedSchemaFromExpression(scan_options->projection, dataset_schema)
   /Users/enpiar/Documents/ursa/arrow/cpp/src/arrow/dataset/scanner.cc:979  NormalizeScanOptions(scan_options, dataset->schema())
   Backtrace:
    1. arrow:::expect_equal(...)
         at test-dplyr-query.R:745:2
    5. arrow:::collect.arrow_dplyr_query(.)
    6. arrow:::compute.arrow_dplyr_query(x)
   ```
   
   The error comes from 
https://github.com/apache/arrow/blob/master/cpp/src/arrow/dataset/scanner.cc#L143-L156,
 which was added in https://github.com/apache/arrow/pull/14264 (cc @vibhatha). 
`field_ref->IsName()` is false for nested FieldRefs, so that code never 
extracts their fields and the projected schema cannot be inferred. Handling 
nested field refs probably requires something along the lines of the code this 
PR deletes from the R bindings 
(https://github.com/apache/arrow/compare/master...nealrichardson:arrow:project-expr?expand=1#diff-679c2a33131f0b0c37c8de00984511a42c834fadbf5ed39ea0b92277404499cdL63),
 though I don't know enough about the contexts in which 
`GetProjectedSchemaFromExpression` is called to say for sure.
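
To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use Arrow's classes: `ToyFieldRef`, `NamesOnly`, and `NestedAware` are hypothetical stand-ins invented for illustration, and the assumption that a nested ref should contribute its top-level column is mine, not something confirmed by `scanner.cc` or the R bindings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a field reference (NOT arrow::FieldRef).
// {"x"} is a plain name; {"s", "a"} means field `a` inside struct column `s`.
struct ToyFieldRef {
  std::vector<std::string> path;
  bool IsName() const { return path.size() == 1; }
};

// Models the behaviour described above: only plain-name refs are
// extracted, so nested refs contribute nothing.
std::vector<std::string> NamesOnly(const std::vector<ToyFieldRef>& refs) {
  std::vector<std::string> columns;
  for (const ToyFieldRef& ref : refs) {
    if (ref.IsName()) columns.push_back(ref.path.front());
  }
  return columns;
}

// One possible fix (an assumption): a nested ref contributes the
// top-level column it starts from, without duplicates.
std::vector<std::string> NestedAware(const std::vector<ToyFieldRef>& refs) {
  std::vector<std::string> columns;
  for (const ToyFieldRef& ref : refs) {
    if (ref.path.empty()) continue;  // nothing to extract
    const std::string& top = ref.path.front();
    if (std::find(columns.begin(), columns.end(), top) == columns.end()) {
      columns.push_back(top);
    }
  }
  return columns;
}

// Usage: project a plain column `x` and a nested ref `s.a`.
void Example() {
  const std::vector<ToyFieldRef> refs = {ToyFieldRef{{"x"}},
                                         ToyFieldRef{{"s", "a"}}};
  // The names-only walk silently drops `s`.
  assert((NamesOnly(refs) == std::vector<std::string>{"x"}));
  // The nested-aware walk keeps the column that `s.a` lives in.
  assert((NestedAware(refs) == std::vector<std::string>{"x", "s"}));
}
```

In this toy model the names-only walk loses column `s` entirely, which is consistent with the schema inference failing once a projection contains a nested ref. Whether the real fix should record the top-level column or the full nested path depends on how the scanner uses the result.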
   
   ### Are there any user-facing changes?
   
   No, this is purely an internal refactor.


