nsivabalan commented on code in PR #12452:
URL: https://github.com/apache/hudi/pull/12452#discussion_r1883296568


##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java:
##########
@@ -226,26 +227,22 @@ public String getRecordKey(T record, Schema schema) {
    * @param metadataMap  A map containing the record metadata.
    * @param schema       The Avro schema of the record.
    * @param orderingFieldName name of the ordering field
-   * @param orderingFieldTypeOpt type of the ordering field
-   * @param orderingFieldDefault default value for ordering
    * @return The ordering value.
    */
   public Comparable getOrderingValue(Option<T> recordOption,
                                      Map<String, Object> metadataMap,
                                      Schema schema,
-                                     String orderingFieldName,
-                                     Option<Schema.Type> orderingFieldTypeOpt,
-                                     Comparable orderingFieldDefault) {
+                                     Option<String> orderingFieldName) {
     if (metadataMap.containsKey(INTERNAL_META_ORDERING_FIELD)) {
       return (Comparable) metadataMap.get(INTERNAL_META_ORDERING_FIELD);
     }
 
-    if (!recordOption.isPresent() || !orderingFieldTypeOpt.isPresent()) {
-      return orderingFieldDefault;
+    if (!recordOption.isPresent() || orderingFieldName.isEmpty()) {

Review Comment:
   whats the case where recordOption could be empty? 
   are there chances that ordering col is set in options and is of type String 
(non integer) and recordOption is empty? won't we return 0 (integer) as the 
default from here. 
   



##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java:
##########
@@ -116,9 +115,16 @@ public 
HoodieBaseFileGroupRecordBuffer(HoodieReaderContext<T> readerContext,
     } else {
       this.payloadClass = Option.empty();
     }
-    this.orderingFieldName = 
Option.ofNullable(ConfigUtils.getOrderingField(props)).orElseGet(() -> 
hoodieTableMetaClient.getTableConfig().getPreCombineField());
-    this.orderingFieldTypeOpt = recordMergeMode == 
RecordMergeMode.COMMIT_TIME_ORDERING ? Option.empty() : 
AvroSchemaUtils.findNestedFieldType(readerSchema, this.orderingFieldName);
-    this.orderingFieldDefault = orderingFieldTypeOpt.map(type -> 
readerContext.castValue(0, type)).orElse(0);
+    this.orderingFieldName = recordMergeMode == 
RecordMergeMode.COMMIT_TIME_ORDERING
+        ? Option.empty()
+        : Option.ofNullable(ConfigUtils.getOrderingField(props))
+        .or(() -> {
+          String preCombineField = 
hoodieTableMetaClient.getTableConfig().getPreCombineField();
+          if (StringUtils.isNullOrEmpty(preCombineField)) {

Review Comment:
   Option.ofNullable() could simplify L 123 to 126.



##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/TestMergeIntoTable.scala:
##########
@@ -543,7 +543,7 @@ class TestMergeIntoTable extends HoodieSparkSqlTestBase 
with ScalaAssertionSuppo
         s"""
            | merge into $tableName t0
            | using (
-           |  select 2 as s_id, 'a2' as s_name, 15 as s_price, 1001 as ts, 
'2021-03-21' as dt
+           |  select 2 as s_id, 'a2' as s_name, 15 as s_price, 1001L as ts, 
'2021-03-21' as dt

Review Comment:
   why do we need this fixes (1001 -> 1001L) ? 
   can you throw some light please



##########
hudi-common/src/main/java/org/apache/hudi/common/model/EventTimeAvroPayload.java:
##########
@@ -41,7 +42,7 @@ public EventTimeAvroPayload(GenericRecord record, Comparable 
orderingVal) {
   }
 
   public EventTimeAvroPayload(Option<GenericRecord> record) {
-    this(record.isPresent() ? record.get() : null, 0); // natural order
+    this(record.isPresent() ? record.get() : null, 
COMMIT_TIME_ORDERING_VALUE); // natural order

Review Comment:
   so, in general we assume integer as default even if ordering col chosen is 
string or other data types? 
   and we never broke anything so far is it? 



##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecord.java:
##########
@@ -56,6 +56,7 @@ public abstract class HoodieRecord<T> implements 
HoodieRecordCompatibilityInterf
   public static final String FILENAME_METADATA_FIELD = 
HoodieMetadataField.FILENAME_METADATA_FIELD.getFieldName();
   public static final String OPERATION_METADATA_FIELD = 
HoodieMetadataField.OPERATION_METADATA_FIELD.getFieldName();
   public static final String HOODIE_IS_DELETED_FIELD = "_hoodie_is_deleted";
+  public static final int COMMIT_TIME_ORDERING_VALUE = 0;

Review Comment:
   why can't we name this as DEFAULT_ORDERING_VALUE or 
NON_EXISTANT_ORDERING_VALUE
   



##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HiveHoodieReaderContext.java:
##########
@@ -264,12 +270,27 @@ public UnaryOperator<ArrayWritable> projectRecord(Schema 
from, Schema to, Map<St
   }
 
   @Override
-  public Comparable castValue(Comparable value, Schema.Type newType) {
-    //TODO: [HUDI-8261] actually do casting here
+  public Comparable convertValueToEngineType(Comparable value) {
     if (value instanceof WritableComparable) {
       return value;
     }
-    return (WritableComparable) 
HoodieRealtimeRecordReaderUtils.avroToArrayWritable(value, 
Schema.create(newType));
+    //TODO: [HUDI-8261] cover more types
+    if (value == null) {
+      return null;
+    } else if (value instanceof String) {
+      return new Text((String) value);
+    } else if (value instanceof Integer) {
+      return new IntWritable((int) value);
+    } else if (value instanceof Long) {
+      return new LongWritable((long) value);
+    } else if (value instanceof Float) {
+      return new FloatWritable((float) value);
+    } else if (value instanceof Double) {
+      return new DoubleWritable((double) value);
+    } else if (value instanceof Boolean) {

Review Comment:
   I don't even know if these datatypes will work for precombine only. since 
default value is picked as 0. 
   can we test it out and updates docs on known limitations if applicable



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to