yihua commented on code in PR #12380:
URL: https://github.com/apache/hudi/pull/12380#discussion_r1864138219


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieHadoopFsRelationFactory.scala:
##########
@@ -79,6 +79,9 @@ abstract class HoodieBaseHadoopFsRelationFactory(val 
sqlContext: SQLContext,
   protected lazy val basePath: StoragePath = metaClient.getBasePath
   protected lazy val partitionColumns: Array[String] = 
tableConfig.getPartitionFields.orElse(Array.empty)
 
+  // very much not recommended to use a partition column as the precombine

Review Comment:
   Could we print a warning message if partition field and precombine field is 
the same?



##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/ddl/TestCreateTable.scala:
##########
@@ -1555,4 +1555,34 @@ class TestCreateTable extends HoodieSparkSqlTestBase {
       )("Failed to create catalog table in metastore")
     }
   }
+
+  test("Test Create Table with Same Value for Partition and Precombine") {

Review Comment:
   Could we add more tests covering different branches / cases introduced in 
`partitionColumnsToRead`?



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieHadoopFsRelationFactory.scala:
##########
@@ -103,6 +103,21 @@ abstract class HoodieBaseHadoopFsRelationFactory(val 
sqlContext: SQLContext,
       // For older tables the partition type may not be available so falling 
back to partition fields in those cases
       tableConfig.getPartitionFields.orElse(Array.empty).toSeq
     }
+  } else {
+    Seq.empty
+  }
+
+  protected lazy val partitionColumnsToRead: Seq[String] = (
+    shouldExtractPartitionValuesFromPartitionPath,
+    keygenTypeHasVariablePartitionCols,
+    partitionColumnsHasPrecombine
+  ) match {
+    case (true, _, _) => Seq.empty
+    case (false, true, false) => variableTimestampKeygenPartitionCols
+    case (false, true, true) if 
variableTimestampKeygenPartitionCols.contains(preCombineFieldOpt.get) => 
variableTimestampKeygenPartitionCols
+    case (false, true, true) => variableTimestampKeygenPartitionCols :+ 
preCombineFieldOpt.get

Review Comment:
   Could we rewrite this to be legible, sth like
   ```
   case (false, true, true) => if 
(variableTimestampKeygenPartitionCols.contains(preCombineFieldOpt.get)) { 
     variableTimestampKeygenPartitionCols
   } else {
     variableTimestampKeygenPartitionCols :+ preCombineFieldOpt.get
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to