xushiyan commented on code in PR #5378:
URL: https://github.com/apache/hudi/pull/5378#discussion_r854721036
##########
hudi-spark-datasource/hudi-spark3.1.x/src/main/scala/org/apache/spark/sql/adapter/Spark3_1Adapter.scala:
##########
@@ -23,7 +23,7 @@ import org.apache.spark.SPARK_VERSION
import org.apache.spark.sql.avro.{HoodieAvroDeserializer, HoodieAvroSerializer, HoodieSpark3_1AvroDeserializer, HoodieSpark3_1AvroSerializer}
import org.apache.spark.sql.catalyst.plans.logical._
import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.execution.datasources.parquet.{ParquetFileFormat, Spark312HoodieParquetFileFormat}
+import org.apache.spark.sql.execution.datasources.parquet.{ParquetFileFormat, Spark31HoodieParquetFileFormat}
Review Comment:
This renaming seems fine. @xiaorixiaoyao, is there any particular reason it was only meant for Spark 3.1.2?
##########
hudi-common/src/main/java/org/apache/hudi/common/util/ReflectionUtils.java:
##########
@@ -85,11 +87,14 @@ public static <T extends HoodieRecordPayload> T loadPayload(String recordPayload
   * Creates an instance of the given class. Use this version when dealing with interface types as constructor args.
   */
  public static Object loadClass(String clazz, Class<?>[] constructorArgTypes, Object... constructorArgs) {
-    try {
-      return getClass(clazz).getConstructor(constructorArgTypes).newInstance(constructorArgs);
-    } catch (InstantiationException | IllegalAccessException | InvocationTargetException | NoSuchMethodException e) {
-      throw new HoodieException("Unable to instantiate class " + clazz, e);
-    }
+    return newInstanceUnchecked(getClass(clazz), constructorArgTypes, constructorArgs);
+  }
+
+  /**
+   * Creates an instance of the given class. Constructor arg types are inferred.
+   */
+  public static Object loadClass(String clazz, Object... constructorArgs) {
+    return newInstanceUnchecked(getClass(clazz), constructorArgs);
Review Comment:
This affects a lot of code paths. Can't we make a specialized one just for this patch and leave the original as is? We should mitigate the risk here.
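
For illustration, a patch-scoped helper could provide the inferred-types behavior without touching the existing overloads; a minimal sketch, assuming we keep `HoodieException` for the wrapped error (the name `newInstanceWithInferredTypes` is hypothetical, not from this patch):

```java
import java.lang.reflect.InvocationTargetException;

import org.apache.hudi.exception.HoodieException;

public final class ReflectionSketch {

  // Hypothetical patch-local helper: infers constructor parameter types from
  // the runtime classes of the supplied arguments, leaving the existing
  // loadClass overloads byte-for-byte unchanged.
  public static Object newInstanceWithInferredTypes(Class<?> klass, Object... args) {
    Class<?>[] argTypes = new Class<?>[args.length];
    for (int i = 0; i < args.length; i++) {
      argTypes[i] = args[i].getClass();
    }
    try {
      return klass.getConstructor(argTypes).newInstance(args);
    } catch (InstantiationException | IllegalAccessException
        | InvocationTargetException | NoSuchMethodException e) {
      throw new HoodieException("Unable to instantiate class " + klass.getName(), e);
    }
  }
}
```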
##########
hudi-common/src/main/java/org/apache/hudi/common/util/ReflectionUtils.java:
##########
@@ -85,11 +87,14 @@ public static <T extends HoodieRecordPayload> T loadPayload(String recordPayload
   * Creates an instance of the given class. Use this version when dealing with interface types as constructor args.
   */
  public static Object loadClass(String clazz, Class<?>[] constructorArgTypes, Object... constructorArgs) {
-    try {
-      return getClass(clazz).getConstructor(constructorArgTypes).newInstance(constructorArgs);
-    } catch (InstantiationException | IllegalAccessException | InvocationTargetException | NoSuchMethodException e) {
-      throw new HoodieException("Unable to instantiate class " + clazz, e);
-    }
+    return newInstanceUnchecked(getClass(clazz), constructorArgTypes, constructorArgs);
+  }
+
+  /**
+   * Creates an instance of the given class. Constructor arg types are inferred.
+   */
+  public static Object loadClass(String clazz, Object... constructorArgs) {
+    return newInstanceUnchecked(getClass(clazz), constructorArgs);
Review Comment:
I don't even see `newInstanceUnchecked` being used in this patch. Can we revert the change, or create a special one if you need it here?
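
To make the risk concrete: `Class.getConstructor` matches declared parameter types exactly, so inferring types from the arguments' runtime classes fails whenever a constructor declares an interface or supertype as a parameter (which is exactly why the existing overload's javadoc says to pass explicit arg types). A standalone sketch of the failure mode, with illustrative class names:

```java
import java.util.ArrayList;
import java.util.List;

public class InferenceGap {

  // The constructor declares the interface type List, not ArrayList.
  public InferenceGap(List<String> names) {
  }

  public static void main(String[] args) throws Exception {
    Object arg = new ArrayList<String>();
    // The inferred type is ArrayList, but no constructor takes ArrayList, so
    // this throws NoSuchMethodException even though `new InferenceGap(...)`
    // would compile fine with the same argument.
    InferenceGap.class.getConstructor(arg.getClass()).newInstance(arg);
  }
}
```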
##########
hudi-spark-datasource/hudi-spark3/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/Spark32HoodieParquetFileFormat.scala:
##########
@@ -225,6 +239,10 @@ class Spark32HoodieParquetFileFormat(private val shouldAppendPartitionValues: Bo
if (enableVectorizedReader) {
val vectorizedReader =
if (shouldUseInternalSchema) {
+            val int96RebaseSpec =
+              DataSourceUtils.int96RebaseSpec(footerFileMetaData.getKeyValueMetaData.get, int96RebaseModeInRead)
+            val datetimeRebaseSpec =
+              DataSourceUtils.datetimeRebaseSpec(footerFileMetaData.getKeyValueMetaData.get, datetimeRebaseModeInRead)
Review Comment:
This implies schema evolution only works with Spark 3.2.1? cc @xiaorixiaoyao
##########
hudi-spark-datasource/hudi-spark3.1.x/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/Spark31HoodieParquetFileFormat.scala:
##########
@@ -61,7 +61,7 @@ import java.net.URI
* <li>Schema on-read</li>
* </ol>
*/
-class Spark312HoodieParquetFileFormat(private val shouldAppendPartitionValues: Boolean) extends ParquetFileFormat {
+class Spark31HoodieParquetFileFormat(private val shouldAppendPartitionValues: Boolean) extends ParquetFileFormat {
Review Comment:
The changes to the Spark 3.1 parquet format are purely refactoring, right? Looks OK, but I would rather be super paranoid at this stage about introducing more diffs than needed.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]