yyanyy opened a new pull request #1975:
URL: https://github.com/apache/iceberg/pull/1975


   This change adds sort_order_id to content files and allows reading and writing 
this attribute in manifest entries. The logic for populating the sort order id in 
writers from the table attribute is not included here and will be the main focus 
of the next PR, since doing so will likely require many signature changes. 
   
   **Questions**
   - Not sure whether the sort order id should be nullable by default or 0 
(from `unsorted_order`). I decided on nullable for now, since I think positional 
delete files shouldn't have a sort order (they should be sorted by 
file_path and position per the spec).
   - Do we want only the sort order id, or the actual sort order struct? The id alone is 
good for comparison, but when merging data/delete files I think we do need the 
actual sort order struct. Without including it in manifest entries, I think we 
may need an additional lookup on the table to fetch it, which could slow 
down the query. 
   - For the next PR, do we assume the table's current sort order id is the 
authoritative place to get sort order information when adding a new file? I 
wonder whether an engine is ever able or allowed to override the default sort order to 
unsorted and pass it back to the data/delete writer. 
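
To make the first two questions concrete, here is a minimal sketch of what a nullable sort order id plus a table-side lookup could look like. It is not the code in this PR: `SortOrderLookupSketch`, `FileEntry`, `resolveSortOrder`, and `sameKnownOrder` are hypothetical names, and the sort order "struct" is stood in for by a plain string.

```java
import java.util.Map;
import java.util.Optional;

public class SortOrderLookupSketch {
  // Id 0 denotes the unsorted order, as in the question above.
  public static final int UNSORTED_ORDER_ID = 0;

  // Hypothetical stand-in for a manifest entry's content file.
  // sortOrderId is nullable: null means "no sort order recorded",
  // e.g. for positional delete files.
  public static final class FileEntry {
    public final String path;
    public final Integer sortOrderId;

    public FileEntry(String path, Integer sortOrderId) {
      this.path = path;
      this.sortOrderId = sortOrderId;
    }
  }

  // Resolves the id to the full sort order via a table-level map.
  // This is the extra lookup needed when only the id is stored in the entry.
  public static Optional<String> resolveSortOrder(
      FileEntry file, Map<Integer, String> ordersById) {
    if (file.sortOrderId == null) {
      return Optional.empty();
    }
    return Optional.ofNullable(ordersById.get(file.sortOrderId));
  }

  // Comparing ids is enough to tell whether two files share an order;
  // a missing id is treated as "unknown", so it never matches.
  public static boolean sameKnownOrder(FileEntry a, FileEntry b) {
    return a.sortOrderId != null && a.sortOrderId.equals(b.sortOrderId);
  }
}
```

With a nullable id, a positional delete file simply carries `null` and resolves to an empty result, whereas defaulting to 0 would make it indistinguishable from a data file written as unsorted.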
    




