Re: [PR] [HUDI-8187] Hudi 1.0 reader should be able to read both 1.0 and 0.x tables with custom key generator [hudi]

via GitHub Thu, 19 Sep 2024 10:36:27 -0700


jonvex commented on code in PR #11926:
URL: https://github.com/apache/hudi/pull/11926#discussion_r1767324493



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/CustomAvroKeyGenerator.java:
##########
@@ -76,8 +76,11 @@ private static List<BaseKeyGenerator> 
getPartitionKeyGenerators(List<String> par
       return Collections.emptyList(); // Corresponds to no partition case
     } else {
       return partitionPathFields.stream().map(field -> {
-        Pair<String, CustomAvroKeyGenerator.PartitionKeyType> partitionAndType 
= getPartitionFieldAndKeyType(field);
-        CustomAvroKeyGenerator.PartitionKeyType keyType = 
partitionAndType.getRight();
+        Pair<String, Option<CustomAvroKeyGenerator.PartitionKeyType>> 
partitionAndType = getPartitionFieldAndKeyType(field);
+        if (partitionAndType.getRight().isEmpty()) {
+          throw new HoodieKeyException("Unable to find field names for 
partition path in proper format");

Review Comment:
   We can make this more clear to the user. You should explain what the proper 
format is key:type,...



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/CustomAvroKeyGenerator.java:
##########
@@ -99,31 +102,42 @@ public static List<PartitionKeyType> 
getPartitionTypes(List<String> partitionPat
     if (partitionPathFields.size() == 1 && 
partitionPathFields.get(0).isEmpty()) {
       return Collections.emptyList(); // Corresponds to no partition case
     } else {
-      return partitionPathFields.stream().map(field -> {
-        Pair<String, CustomAvroKeyGenerator.PartitionKeyType> partitionAndType 
= getPartitionFieldAndKeyType(field);
-        return partitionAndType.getRight();
-      }).collect(Collectors.toList());
+      return partitionPathFields.stream().map(field -> 
getPartitionFieldAndKeyType(field).getRight())
+          .filter(Option::isPresent)

Review Comment:
   We shouldn't have missing types right?. Should we be throwing an exception 
like above?



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieHadoopFsRelationFactory.scala:
##########
@@ -94,7 +94,15 @@ abstract class HoodieBaseHadoopFsRelationFactory(val 
sqlContext: SQLContext,
     tableConfig.getPartitionFields.orElse(Array.empty).toSeq
   } else {
     //it's custom keygen
-    
CustomAvroKeyGenerator.getTimestampFields(HoodieTableConfig.getPartitionFieldsForKeyGenerator(tableConfig).orElse(java.util.Collections.emptyList[String]())).asScala.toSeq
+    val timestampFieldsOpt = CustomAvroKeyGenerator.getTimestampFields(
+      
HoodieTableConfig.getPartitionFieldsForKeyGenerator(tableConfig).orElse(java.util.Collections.emptyList[String]()))

Review Comment:
   I think there is a case where they use custom keygen but only use string 
fields. If it's not too hard we should make it so we don't need to read them



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/CustomAvroKeyGenerator.java:
##########
@@ -99,31 +102,42 @@ public static List<PartitionKeyType> 
getPartitionTypes(List<String> partitionPat
     if (partitionPathFields.size() == 1 && 
partitionPathFields.get(0).isEmpty()) {
       return Collections.emptyList(); // Corresponds to no partition case
     } else {
-      return partitionPathFields.stream().map(field -> {
-        Pair<String, CustomAvroKeyGenerator.PartitionKeyType> partitionAndType 
= getPartitionFieldAndKeyType(field);
-        return partitionAndType.getRight();
-      }).collect(Collectors.toList());
+      return partitionPathFields.stream().map(field -> 
getPartitionFieldAndKeyType(field).getRight())
+          .filter(Option::isPresent)

Review Comment:
   Either all should be missing or none I think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] [HUDI-8187] Hudi 1.0 reader should be able to read both 1.0 and 0.x tables with custom key generator [hudi]

Reply via email to