jacques-n opened a new pull request #10979: URL: https://github.com/apache/arrow/pull/10979
An alternative serialized representation that I created a few months ago for a project I was working on, converted from Protobuf to Flatbuf for discussion here. Related to #10856 and #10934.

Key points of focus:

- All field references are ordinals. Column names are not part of any of the core processing (beyond mapping from schema at input and output). This is really a concern of the presentation tool, not the execution/analysis layer. In Calcite, for example, the validator returns the field names of the actual return expression. The internal field names are meaningless (and actually cause user confusion).
- Leans heavily on concepts that Calcite has presented.
- Projection is only used for calculation, not for column addition/removal. For the latter, each relational operation expresses whether it has a direct emit or a remapped emit.
- My thinking is this should support both logical and physical plans.
- The idea with function signatures is that public ones are organizationally namespaced and managed via a formal asset that the Arrow project contains. People can also make their own with a subsection of the namespace.
- As opposed to other representations, focus more on row-wise literals, as 99% of the time this representation will be easier to work with (e.g. a < 5).
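
To make the ordinal field references and the direct versus remapped emit from the points above concrete, here is a minimal Python sketch. It is not taken from the PR's Flatbuffers schema: the names `FieldRef`, `Literal`, `LessThan`, `Filter`, and `emit` are hypothetical and chosen only for illustration, and rows are modelled as plain tuples.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Row = Tuple[object, ...]


@dataclass(frozen=True)
class FieldRef:
    """Hypothetical expression node: refers to an input column by position only."""
    ordinal: int

    def eval(self, row: Row) -> object:
        return row[self.ordinal]


@dataclass(frozen=True)
class Literal:
    """Hypothetical expression node: a single row-wise literal value."""
    value: object

    def eval(self, row: Row) -> object:
        return self.value


@dataclass(frozen=True)
class LessThan:
    """Hypothetical binary comparison, e.g. `a < 5`."""
    left: object
    right: object

    def eval(self, row: Row) -> bool:
        return self.left.eval(row) < self.right.eval(row)


@dataclass(frozen=True)
class Filter:
    """Hypothetical relational operation.

    emit=None   -> direct emit: output columns are the input columns, unchanged.
    emit=(i, j) -> remapped emit: output column k is input column emit[k].
    """
    condition: object
    emit: Optional[Tuple[int, ...]] = None

    def run(self, rows: Sequence[Row]) -> list:
        out = []
        for row in rows:
            if self.condition.eval(row):
                if self.emit is None:
                    out.append(row)
                else:
                    out.append(tuple(row[i] for i in self.emit))
        return out


# Names exist only at the boundary: they are resolved to ordinals once, on input.
input_names = ["a", "b", "c"]
rows = [(1, "x", 10.0), (7, "y", 20.0), (3, "z", 30.0)]

a_lt_5 = LessThan(FieldRef(input_names.index("a")), Literal(5))

# Direct emit: all three input columns pass through in their original order.
direct = Filter(a_lt_5).run(rows)

# Remapped emit: drop column 1 and reorder, with no projection involved.
remapped = Filter(a_lt_5, emit=(2, 0)).run(rows)
```

In this sketch column removal and reordering are expressed entirely by `emit`, so a projection would be needed only when a new value has to be computed. The plan itself never stores `"a"`, `"b"`, or `"c"`; those names are used only to look up ordinals before the plan is built.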
