kasakrisz commented on code in PR #5627:
URL: https://github.com/apache/hive/pull/5627#discussion_r1954304603


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -857,7 +858,12 @@ public DynamicPartitionCtx createDPContext(
           if (sortField.sourceId() == field.fieldId()) {
             customSortPositions.add(pos);
             customSortOrder.add(sortField.direction() == SortDirection.ASC ? 1 
: 0);
-            customSortNullOrder.add(sortField.nullOrder() == 
NullOrder.NULLS_FIRST ? 0 : 1);
+
+            NullOrdering nullOrdering = NullOrdering.NULLS_LAST;
+            if (sortField.nullOrder() == NullOrder.NULLS_FIRST) {
+              nullOrdering = NullOrdering.NULLS_FIRST;
+            }
+            customSortNullOrder.add(nullOrdering.getCode());

Review Comment:
   There were two failing test without this change.
   ```
   TestHiveIcebergInserts.testSortedInsert
   TestHiveIcebergInserts.testSortedAndTransformedInsertIntoPartitionedTable
   ```
   When I analyzed the cause of the failure of these tests I bumped into this 
bug: 
   ```
   sortField.nullOrder() == NullOrder.NULLS_FIRST ? 0 : 1
   ```
   maps `NULLS_FIRST` to `0` which actually means nulls last in Hive
   
https://github.com/apache/hive/blob/fa6813f725542bae127f9bec0921ebd01c946947/ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java#L30-L31
   
   > We added a feature to configure the sort order of Iceberg tables. I 
wondered if we can keep the semantics of WRITE LOCALLY ORDERED BY a
   
   I found that null sort order is set here when it was not specified 
explicitly at create table:
   
https://github.com/apache/hive/blob/fa6813f725542bae127f9bec0921ebd01c946947/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java#L292
   and the defaults are defined at the parser level
   
https://github.com/apache/hive/blob/fa6813f725542bae127f9bec0921ebd01c946947/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g#L2205-L2206
   
   This is not affected by the setting `hive.default.nulls.last`.
   
   >  sortField.nullOrder() is non-null and this change never takes effect in 
the real world.
   
   I don't understand this part. Could you please elaborate.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org

Reply via email to