jacques-n opened a new pull request #10979:
URL: https://github.com/apache/arrow/pull/10979


   This is an alternative serialized representation that I created a few 
months ago for a project I was working on, converted from Protobuf to 
FlatBuffers for discussion here.
   
   Related to #10856 and #10934.
   
   Key points of focus: 
   - All field references are ordinals. Column names are not part of any of the 
core processing (beyond mapping from schema at input and output). This is 
really a concern of the presentation tool, not the execution/analysis layer. In 
Calcite, for example, the validator returns the field names of the actual 
return expression. The internal field names are meaningless (and actually cause 
user confusion).
   - Leans heavily on concepts that Calcite has presented.
   - Projection is only used for calculation, not for column addition/removal. 
For the latter, each relational operation expresses whether it has a direct 
emit or a remapped emit.
   - My thinking is this should support both logical and physical plans.
   - The idea with function signatures is that public ones are organizationally 
namespaced and managed via a formal asset that the Arrow project contains. 
People can also make their own with a subsection of the namespace.
   - As opposed to other representations, this one focuses more on row-wise 
literals, since 99% of the time that form will be easier to work with (e.g. a < 5).
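
To make the ordinal-reference and emit points above concrete, here is a minimal Python sketch. It is not the Flatbuffers schema in this PR; every name in it (`FieldRef`, `LessThan`, `Filter`, `emit`, `bind`) is hypothetical and chosen only for illustration. It assumes a relation is a list of tuples, that names are resolved to ordinals once at the input boundary, and that an operator either passes its columns through unchanged (direct emit) or reorders/drops them via a list of ordinals (remapped emit).

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Row = Tuple[object, ...]


@dataclass(frozen=True)
class FieldRef:
    """Hypothetical reference to an input column by position, never by name."""
    ordinal: int


@dataclass(frozen=True)
class LessThan:
    """Hypothetical row-wise comparison of a field with a literal, e.g. a < 5."""
    field: FieldRef
    literal: object

    def evaluate(self, row: Row) -> bool:
        return row[self.field.ordinal] < self.literal


@dataclass(frozen=True)
class Filter:
    """Hypothetical relational operator carrying its own emit description."""
    condition: LessThan
    # None models a direct emit (columns pass through unchanged);
    # a list of input ordinals models a remapped emit.
    emit: Optional[List[int]] = None

    def execute(self, rows: Sequence[Row]) -> List[Row]:
        out: List[Row] = []
        for row in rows:
            if not self.condition.evaluate(row):
                continue
            if self.emit is None:
                out.append(row)
            else:
                out.append(tuple(row[i] for i in self.emit))
        return out


def bind(schema_names: Sequence[str], name: str) -> FieldRef:
    """Resolve a column name to an ordinal once, at the input boundary."""
    return FieldRef(schema_names.index(name))
```

In this sketch a caller would resolve `a` once with `bind(["a", "b", "c"], "a")` and build `Filter(LessThan(ref, 5), emit=[2, 0])`. The operator itself never sees a column name, and no projection is needed to drop or reorder columns. Output names, if a presentation layer wants them, can be derived from the input names and the emit list.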
   
   


