danny0405 commented on code in PR #8955:
URL: https://github.com/apache/hudi/pull/8955#discussion_r1229351990


##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieColumnProjectionUtils.java:
##########
@@ -112,22 +120,53 @@ public static List<Pair<String,String>> getIOColumnNameAndTypes(Configuration co
   /**
    * If schema contains timestamp columns, this method is used for compatibility when there is no timestamp fields.
    *
-   * <p>We expect 3 cases to use parquet-avro reader {@link org.apache.hudi.hadoop.avro.HoodieAvroParquetReader} to read timestamp column:
+   * <p>We expect 2 cases to use parquet-avro reader {@link org.apache.hudi.hadoop.avro.HoodieAvroParquetReader} to read timestamp column:
    *
    * <ol>
    *   <li>Read columns contain timestamp type;</li>
    *   <li>Empty original columns;</li>
-   *   <li>Empty read columns but existing original columns contain timestamp type.</li>
    * </ol>
    */
   public static boolean supportTimestamp(Configuration conf) {
     List<String> readCols = Arrays.asList(getReadColumnNames(conf));
     if (readCols.isEmpty()) {
-      return getIOColumnTypes(conf).contains("timestamp");
+      return false;
+    }
+
+    String colTypes = conf.get(IOConstants.COLUMNS_TYPES, "");
+    if (colTypes == null || colTypes.isEmpty()) {
+      return true;

Review Comment:
   What does `colTypes == null || colTypes.isEmpty()` mean, and why does it 
return true?
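
   For context on the question above, the branch order in this hunk can be 
sketched with plain Java collections. This is a simplified stand-in, not the 
PR's code: `SupportTimestampSketch` and its parameters are hypothetical, and 
the substring check is a crude substitute for `typeContainsTimestamp`. It 
assumes an empty type list corresponds to the Javadoc's second case ("Empty 
original columns").

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified model of the branch order in supportTimestamp.
// It uses plain lists instead of Hadoop Configuration and Hive TypeInfo.
class SupportTimestampSketch {

  static boolean supportTimestamp(List<String> readCols, List<String> names, List<String> types) {
    // Branch 1: nothing is projected, so no timestamp column is read.
    if (readCols.isEmpty()) {
      return false;
    }
    // Branch 2: no column types are known ("Empty original columns" in the Javadoc).
    if (types.isEmpty()) {
      return true;
    }
    // Branch 3: some projected column has a type that mentions timestamp.
    // Assumption: a substring match stands in for the recursive type check.
    for (int i = 0; i < names.size() && i < types.size(); i++) {
      if (readCols.contains(names.get(i)) && types.get(i).contains("timestamp")) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("id", "ts");
    List<String> types = Arrays.asList("bigint", "array<timestamp>");
    List<String> none = Collections.emptyList();
    System.out.println(supportTimestamp(none, names, types));
    System.out.println(supportTimestamp(Arrays.asList("id"), names, none));
    System.out.println(supportTimestamp(Arrays.asList("ts"), names, types));
  }
}
```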



##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieColumnProjectionUtils.java:
##########
@@ -112,22 +120,53 @@ public static List<Pair<String,String>> getIOColumnNameAndTypes(Configuration co
   /**
    * If schema contains timestamp columns, this method is used for compatibility when there is no timestamp fields.
    *
-   * <p>We expect 3 cases to use parquet-avro reader {@link org.apache.hudi.hadoop.avro.HoodieAvroParquetReader} to read timestamp column:
+   * <p>We expect 2 cases to use parquet-avro reader {@link org.apache.hudi.hadoop.avro.HoodieAvroParquetReader} to read timestamp column:
    *
    * <ol>
    *   <li>Read columns contain timestamp type;</li>
    *   <li>Empty original columns;</li>
-   *   <li>Empty read columns but existing original columns contain timestamp type.</li>
    * </ol>
    */
   public static boolean supportTimestamp(Configuration conf) {
     List<String> readCols = Arrays.asList(getReadColumnNames(conf));
     if (readCols.isEmpty()) {
-      return getIOColumnTypes(conf).contains("timestamp");
+      return false;
+    }
+
+    String colTypes = conf.get(IOConstants.COLUMNS_TYPES, "");
+    if (colTypes == null || colTypes.isEmpty()) {
+      return true;
     }
+
+    ArrayList<TypeInfo> types = TypeInfoUtils.getTypeInfosFromTypeString(colTypes);
     List<String> names = getIOColumns(conf);
-    List<String> types = getIOColumnTypes(conf);
-    return types.isEmpty() || IntStream.range(0, names.size()).filter(i -> readCols.contains(names.get(i)))
-        .anyMatch(i -> types.get(i).equals("timestamp"));
+    return IntStream.range(0, names.size()).filter(i -> readCols.contains(names.get(i)))
+        .anyMatch(i -> typeContainsTimestamp(types.get(i)));
+  }
+
+  public static boolean typeContainsTimestamp(TypeInfo type) {
+    Category category = type.getCategory();

Review Comment:
   Can we move this method to `TypeInfoUtils`?
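
   As an illustration of what a standalone utility version could look like, 
here is a minimal sketch of a recursive timestamp check over nested types. The 
body of `typeContainsTimestamp` is not shown in this hunk, so the recursion is 
an assumption, and `TimestampTypeSketch`, `Type`, and `Category` are 
hypothetical stand-ins for Hive's type classes so that the example runs 
without Hive on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical utility sketch; Type and Category are toy stand-ins, not Hive classes.
class TimestampTypeSketch {

  enum Category { PRIMITIVE, LIST, MAP, STRUCT, UNION }

  static final class Type {
    final Category category;
    final String primitiveName; // set only for PRIMITIVE types
    final List<Type> children;  // element, key/value, field, or member types

    private Type(Category category, String primitiveName, List<Type> children) {
      this.category = category;
      this.primitiveName = primitiveName;
      this.children = children;
    }

    static Type primitive(String name) {
      return new Type(Category.PRIMITIVE, name, new ArrayList<>());
    }

    static Type nested(Category category, Type... children) {
      return new Type(category, null, Arrays.asList(children));
    }
  }

  // Returns true if the type is a timestamp or nests one at any depth.
  static boolean containsTimestamp(Type type) {
    if (type.category == Category.PRIMITIVE) {
      return "timestamp".equals(type.primitiveName);
    }
    for (Type child : type.children) {
      if (containsTimestamp(child)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Type listOfStruct = Type.nested(Category.LIST,
        Type.nested(Category.STRUCT, Type.primitive("string"), Type.primitive("timestamp")));
    System.out.println(containsTimestamp(listOfStruct));
    System.out.println(containsTimestamp(Type.primitive("bigint")));
  }
}
```

   Keeping the check as a static method with no dependency on the projection 
config is what makes it easy to relocate to a shared utility class.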



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
