zratkai commented on code in PR #5541: URL: https://github.com/apache/hive/pull/5541#discussion_r1875869433
########## parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g: ########## @@ -1840,6 +1841,14 @@ tableImplBuckets -> ^(TOK_ALTERTABLE_BUCKETS $num) ; +tableWriteOrdered +@init { pushMsg("table sorted specification", state); } +@after { popMsg(state); } + : + KW_WRITE KW_ORDERED KW_BY sortCols=columnNameOrderList Review Comment: @zhangbutao , @okumin , @deniskuzZ First of all, thanks for your comments! This syntax comes from the official Iceberg recommendation, which is for alter table, and to be consistend I modified the ALTER to CREATE in the syntax, so at CREATE table it is the same as ALTER TABLE: [https://iceberg.apache.org/docs/1.7.0/spark-ddl/#alter-table-write-ordered-by] My plan is to work as a next step on ALTER TABLE after this PR is done. I already created the ticket for that at the time this ticket was born: [https://issues.apache.org/jira/browse/HIVE-28587] About table property idea and comparisms: The iceberg syntax defines multiple keywords in the above page, not just the colum names: "ALTER TABLE prod.db.sample WRITE ORDERED BY category ASC NULLS LAST, id DESC NULLS FIRST" The mentioned examples do not cover all of that. E.g. I could not find in the linked Trino documentation how the ordering is defined (ASC/DESC) or the null ordering (NULLS FIRST/LAST). With that the table property solution would look like ugly and hard to remember and write correctly for the user. Since it is passedas a table property deep in the code here, it works with table property as well with this commit : e.g.: create table ice_orc_sorted (id int, text string) stored by iceberg stored as orc TBLPROPERTIES('iceberg.write-order'='{"order-id":1,"fields":[{"transform":"identity","source-id":1,"direction":"asc","null-order":"nulls-last"}]}'); The syntax comes from Iceberg built in format for sorted order, but I do not find it user friendly and I do not recommend to go that way. About SORTED BY/ SORT BY/LOCALSORT BY: as you already mentioned it would be confusing, and the "ALTER/CREATE TABLE prod.db.sample WRITE ORDERED BY category ASC NULLS LAST, id DESC NULLS FIRST" is recommended by Iceberg, and easy to understand and remember. So every DB engine has it's own syntax for this purpose and **neither is following Iceberg recommendation except this one.** As far as I found Spark does not even support yet sort order at alter table: https://spark.apache.org/docs/3.5.3/sql-ref-syntax-ddl-alter-table.html#content https://spark.apache.org/docs/3.5.3/sql-ref-syntax-ddl-create-table.html At the linked iceberg discussion even Ryan Blue said this: "No, I don't think so. These aren't table properties... " And someone else said: " I don't think using TBLPROPERTIES is a good idea..." I hope I answered all the questions! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org