EpsilonPrime commented on issue #34484:
URL: https://github.com/apache/arrow/issues/34484#issuecomment-2111577116
Here's how I would implement augmented fields in Substrait:
A consumer can define some number of extra fields that aren't strictly part of
the data. These hidden fields are known as "augmented" fields. To access them,
a Substrait plan references them through a new FieldReference reference_type,
augmented_reference. Since there is no enum or registry for augmented types, we
access them by name.
```
oneof reference_type {
  ReferenceSegment direct_reference = 1;
  MaskExpression masked_reference = 2;
  AugmentedReference augmented_reference = 3;
}

message AugmentedReference {
  string reference = 1;
}
```
Augmented fields are only valid in the read relation and in the short chain of
non-join relations that immediately follows it (including FilterRel and, most
importantly, ProjectRel), unless we want to define a new steps-out option. If
these fields need to be accessed after that point, a ProjectRel should expose
them as normal fields so they are preserved for the rest of the computation.
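
To make the scoping rule concrete, here is a rough Python sketch of how a consumer might check it. Everything in it is an assumption for illustration: the `Rel` class, the `check_chain` helper, the `PASS_THROUGH` set (only FilterRel and ProjectRel are named above), and the `__filename` field name are hypothetical and not part of Substrait or Arrow.

```python
from dataclasses import dataclass, field

# Relations through which augmented fields stay visible. Assumption: only the
# two relations named in the comment are listed; a real consumer may allow more.
PASS_THROUGH = {"FilterRel", "ProjectRel"}


@dataclass
class Rel:
    """Hypothetical stand-in for one relation in a linear plan."""
    kind: str                                            # e.g. "ReadRel", "JoinRel"
    augmented_refs: list = field(default_factory=list)   # augmented names used here


def check_chain(chain):
    """Return (index, name) pairs for augmented references used out of scope.

    `chain` is ordered from the read relation upward.
    """
    errors = []
    # Augmented fields only exist if the chain starts at a read relation.
    in_scope = bool(chain) and chain[0].kind == "ReadRel"
    for i, rel in enumerate(chain):
        # The first relation that is not a pass-through ends the scope for good.
        if i > 0 and rel.kind not in PASS_THROUGH:
            in_scope = False
        if not in_scope:
            errors.extend((i, name) for name in rel.augmented_refs)
    return errors


chain = [
    Rel("ReadRel", ["__filename"]),
    Rel("FilterRel", ["__filename"]),
    Rel("ProjectRel", ["__filename"]),   # should expose it as a normal field here
    Rel("JoinRel"),
    Rel("ProjectRel", ["__filename"]),   # too late: augmented scope has ended
]
print(check_chain(chain))
```

In this model, the ProjectRel at index 2 is the last place the field can be referenced as augmented; anything above the join has to use the normal field that projection emitted.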
A consumer should reject any plan that references augmented fields it does not
support.
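
A minimal Python sketch of that rejection step, assuming the consumer has already collected the names from every augmented_reference in the plan. The supported set, the exception class, and the function name are all hypothetical.

```python
# Hypothetical set of augmented field names this consumer knows how to produce.
SUPPORTED_AUGMENTED_FIELDS = frozenset({"__filename", "__fragment_index"})


class UnsupportedAugmentedField(ValueError):
    """Raised when a plan references an augmented field the consumer lacks."""


def validate_augmented_references(names, supported=SUPPORTED_AUGMENTED_FIELDS):
    """Reject the plan if any referenced augmented field name is unsupported."""
    unknown = sorted(set(names) - set(supported))
    if unknown:
        raise UnsupportedAugmentedField(
            "plan references unsupported augmented fields: " + ", ".join(unknown)
        )
```

Because fields are matched by name rather than through an enum or registry, a set-membership check like this is the only validation available to the consumer.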