kasakrisz commented on code in PR #5627: URL: https://github.com/apache/hive/pull/5627#discussion_r1954304603
########## iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java: ########## @@ -857,7 +858,12 @@ public DynamicPartitionCtx createDPContext( if (sortField.sourceId() == field.fieldId()) { customSortPositions.add(pos); customSortOrder.add(sortField.direction() == SortDirection.ASC ? 1 : 0); - customSortNullOrder.add(sortField.nullOrder() == NullOrder.NULLS_FIRST ? 0 : 1); + + NullOrdering nullOrdering = NullOrdering.NULLS_LAST; + if (sortField.nullOrder() == NullOrder.NULLS_FIRST) { + nullOrdering = NullOrdering.NULLS_FIRST; + } + customSortNullOrder.add(nullOrdering.getCode()); Review Comment: There were two failing test without this change. ``` TestHiveIcebergInserts.testSortedInsert TestHiveIcebergInserts.testSortedAndTransformedInsertIntoPartitionedTable ``` When I analyzed the cause of the failure of these tests I bumped into this bug: ``` sortField.nullOrder() == NullOrder.NULLS_FIRST ? 0 : 1 ``` maps `NULLS_FIRST` to `0` which actually means nulls last in Hive https://github.com/apache/hive/blob/fa6813f725542bae127f9bec0921ebd01c946947/ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java#L30-L31 > We added a feature to configure the sort order of Iceberg tables. I wondered if we can keep the semantics of WRITE LOCALLY ORDERED BY a I found that null sort order is set here when it was not specified explicitly at create table: https://github.com/apache/hive/blob/fa6813f725542bae127f9bec0921ebd01c946947/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java#L292 and the defaults are defined at the parser level https://github.com/apache/hive/blob/fa6813f725542bae127f9bec0921ebd01c946947/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g#L2205-L2206 This is not affected by the setting `hive.default.nulls.last`. > sortField.nullOrder() is non-null and this change never takes effect in the real world. I don't understand this part. Could you please elaborate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org