yyanyy opened a new pull request #1975:
URL: https://github.com/apache/iceberg/pull/1975
This change adds `sort_order_id` to content files and allows reading and
writing this attribute in manifest entries. The logic for populating the sort
order id in writers from the table's attributes is not included here; it will be
the main focus of the next PR, since it will likely require a lot of signature changes.
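For illustration only (this is not code from the PR, and it is not the Iceberg API), the shape of the change can be sketched as an optional field that survives a round trip through a manifest entry. `ContentFileSketch`, `to_manifest_entry`, and `from_manifest_entry` are hypothetical names, and a plain dict stands in for the real manifest entry encoding.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ContentFileSketch:
    """Hypothetical stand-in for a content file; not the Iceberg class."""
    file_path: str
    record_count: int
    # None means "no sort order recorded for this file".
    sort_order_id: Optional[int] = None


def to_manifest_entry(f: ContentFileSketch) -> Dict[str, Any]:
    """Write the file as a dict, omitting sort_order_id when it is unset."""
    entry: Dict[str, Any] = {
        "file_path": f.file_path,
        "record_count": f.record_count,
    }
    if f.sort_order_id is not None:
        entry["sort_order_id"] = f.sort_order_id
    return entry


def from_manifest_entry(entry: Dict[str, Any]) -> ContentFileSketch:
    """Read the file back; entries without the field yield None."""
    return ContentFileSketch(
        file_path=entry["file_path"],
        record_count=entry["record_count"],
        sort_order_id=entry.get("sort_order_id"),
    )
```

Treating the field as optional on read means entries written before the field existed can still be read, which is one argument for the nullable choice.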
**Questions**
- Not sure whether the sort order id should default to null or to 0
(from `unsorted_order`): I went with nullable for now, since I think positional
delete files shouldn't have a sort order (per the spec, they should be sorted by
file_path and position).
- Do we want only the sort order id, or the actual sort order struct? The id
alone is good enough for comparison, but when merging data/delete files I think
we need the actual sort order struct. Without it in manifest entries, we may
need an additional lookup against the table to fetch it, which could slow
down the query.
- For the next PR, do we assume the table's current sort order id is the
authoritative source of sort order information when adding a new file? I
wonder whether an engine is ever able/allowed to override the default sort
order to unsorted and pass that back to the data/delete writer.
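
To make the first two questions concrete, here is a minimal sketch (not part of this PR) of what an id-only entry implies. It assumes the table metadata exposes a map from sort order id to sort order; `resolve_sort_order`, `comparable_by_id`, and the tuple-of-column-names representation are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Id 0 is the unsorted order; None means no sort order id was recorded at all.
UNSORTED_ORDER_ID = 0

# Hypothetical representation: a sort order is just a tuple of column names.
SortFields = Tuple[str, ...]


def resolve_sort_order(
    sort_order_id: Optional[int],
    table_sort_orders: Dict[int, SortFields],
) -> Optional[SortFields]:
    """Look up the full sort order for a file's id.

    This is the extra table lookup that storing only the id would require.
    """
    if sort_order_id is None:
        return None
    return table_sort_orders[sort_order_id]


def comparable_by_id(a: Optional[int], b: Optional[int]) -> bool:
    """True when two files share a real sort order, judged by id alone."""
    return a is not None and a == b and a != UNSORTED_ORDER_ID
```

With the id alone, `comparable_by_id` needs no table access, while anything that needs the sort columns has to go through something like `resolve_sort_order`.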