vbarua opened a new pull request, #12495:
URL: https://github.com/apache/datafusion/pull/12495

   ## Which issue does this PR close?
   Continues #12347 
   
   This PR sets the [output_mapping](https://github.com/substrait-io/substrait/blob/bc4d6fb9bc0435c3db24172566c343e119fc50a9/proto/substrait/algebra.proto#L30-L33) field on ProjectRels for plans _produced_ by DataFusion.
   
   ## Rationale for this change
   Given a plan
   ```
   Projection: data.b, data.a + data.a, data.a
     TableScan: data projection=[a, b]
   ```
   
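   When creating the Substrait ProjectRel for this plan, DataFusion will now set the output mapping to [2, 3, 4]. The two input columns (`a`, `b`) occupy indices 0 and 1 and the three expressions occupy indices 2 through 4, so the Substrait ProjectRel will match the DataFusion behaviour of outputting _only_ the expressions.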
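
   The index arithmetic can be sketched as follows. This is a minimal illustration, not the code in this PR: `project_output_mapping` is a hypothetical helper name, and it assumes the mapping is simply the contiguous range of indices that follows the input columns.

```rust
/// Hypothetical helper (not part of the DataFusion API).
/// Returns the indices of the projection expressions within the default
/// ProjectRel output, which lists every input column first and every
/// expression after them.
fn project_output_mapping(input_field_count: usize, expr_count: usize) -> Vec<i32> {
    // Expressions start right after the last input column.
    (input_field_count..input_field_count + expr_count)
        .map(|index| index as i32)
        .collect()
}
```

   For the plan above, with two input columns and three expressions, `project_output_mapping(2, 3)` gives `[2, 3, 4]`.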
   
   By default, if the `emit_kind` field is not set with an `output_mapping`, [a Substrait ProjectRel emits all input columns followed by all expressions](https://substrait.io/relations/logical_relations/#project-operation). This does not match the DataFusion behaviour.
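
   To make the difference concrete, here is a small model of the two emit behaviours. It is only a sketch: `direct_output` and `apply_emit` are hypothetical names, columns are modelled as plain strings, and no Substrait or DataFusion types are used.

```rust
/// Hypothetical model of what a ProjectRel yields before any emit is
/// applied: all input columns, followed by all expressions.
fn direct_output(input: &[&str], exprs: &[&str]) -> Vec<String> {
    input
        .iter()
        .chain(exprs.iter())
        .map(|name| name.to_string())
        .collect()
}

/// Hypothetical model of the emit step. `None` stands for the default
/// (no output mapping), which passes every column through unchanged.
/// `Some(mapping)` keeps only the listed indices, in the listed order.
/// Panics if an index is negative or out of range.
fn apply_emit(columns: &[String], output_mapping: Option<&[i32]>) -> Vec<String> {
    match output_mapping {
        None => columns.to_vec(),
        Some(mapping) => mapping
            .iter()
            .map(|&index| columns[index as usize].clone())
            .collect(),
    }
}
```

   With inputs `["a", "b"]` and expressions `["b", "a + a", "a"]`, the default yields five columns, while the mapping `[2, 3, 4]` yields only the three expressions, which is what the DataFusion `Projection` outputs.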
   
   This change only includes _producer_ changes in order to ease the migration 
for users. Because DataFusion ignores the `emit_kind` field when consuming 
plans, DataFusion will be able to consume plans that set the output mapping 
correctly and plans that don't.
   
   Once the _consumer_ is updated to respect `emit_kind`, plans that don't set 
the `output_mapping` will not work as before. By shipping the producer change 
by itself, we allow users who serialize plans between versions the opportunity 
to update all of their plans before making a consumer change that would break 
them.
   
   ## Are these changes tested?
   Yes
   
   ## Are there any user-facing changes?
   Substrait plans with ProjectRels will now set the output_mapping field. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

