[GitHub] [hudi] TeRS-K commented on a change in pull request #2740: [HUDI-1055] Remove hardcoded parquet in tests

GitBox Mon, 12 Apr 2021 15:15:40 -0700


TeRS-K commented on a change in pull request #2740:
URL: https://github.com/apache/hudi/pull/2740#discussion_r611989076




##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/bootstrap/HoodieSparkBootstrapSchemaProvider.java
##########
@@ -47,7 +47,7 @@ protected Schema getBootstrapSourceSchema(HoodieEngineContext context, List<Pair
     MessageType parquetSchema = partitions.stream().flatMap(p -> p.getValue().stream()).map(fs -> {
       try {
         Path filePath = FileStatusUtils.toPath(fs.getPath());
-        return ParquetUtils.readSchema(context.getHadoopConf().get(), filePath);
+        return new ParquetUtils().readSchema(context.getHadoopConf().get(), filePath);

Review comment:
       There is actually another function in DataFileUtils, `readAvroSchema`,
that always returns the schema in Avro format regardless of the file format.
In this scenario, however, the code runs a `ParquetToSparkSchemaConverter` on
a Parquet schema, which must be of type `MessageType`, so the Parquet-specific
`readSchema` is still needed here.
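
To illustrate the constraint, here is a minimal sketch using hypothetical
stand-in types. Nothing below is the real Hudi, Parquet, Avro, or Spark API:
`SchemaSketch`, `ParquetMessageType`, `AvroSchema`, and `toSparkSchema` are
made-up names, and the shape of `DataFileUtils` (an interface with a single
method taking a path string) is an assumption. It only shows why a
format-agnostic `readAvroSchema` cannot feed a converter that expects the
Parquet-native type.

```java
import java.util.List;

// Hypothetical stand-ins only: these are NOT the real Parquet, Avro, Spark or Hudi classes.
final class SchemaSketch {

    /** Stand-in for Parquet's native schema type (MessageType). */
    static final class ParquetMessageType {
        final List<String> fields;
        ParquetMessageType(List<String> fields) { this.fields = fields; }
    }

    /** Stand-in for an Avro schema. */
    static final class AvroSchema {
        final List<String> fields;
        AvroSchema(List<String> fields) { this.fields = fields; }
    }

    /** Assumed format-agnostic contract: any file format can hand back an Avro schema. */
    interface DataFileUtils {
        AvroSchema readAvroSchema(String filePath);
    }

    static final class ParquetUtils implements DataFileUtils {
        /** Parquet-only: returns the native schema. Uses fixed data instead of reading a file. */
        ParquetMessageType readSchema(String filePath) {
            return new ParquetMessageType(List.of("id", "name"));
        }

        /** Derives the Avro view from the native schema. */
        @Override
        public AvroSchema readAvroSchema(String filePath) {
            return new AvroSchema(readSchema(filePath).fields);
        }
    }

    /** Stand-in for a converter whose input must be the Parquet-native type. */
    static String toSparkSchema(ParquetMessageType schema) {
        return "struct<" + String.join(",", schema.fields) + ">";
    }
}
```

In this sketch, `SchemaSketch.toSparkSchema(utils.readSchema(path))` compiles,
while passing the result of `utils.readAvroSchema(path)` does not, which
mirrors the type requirement described above.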




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
