TeRS-K commented on a change in pull request #2740:
URL: https://github.com/apache/hudi/pull/2740#discussion_r611989076
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/bootstrap/HoodieSparkBootstrapSchemaProvider.java
##########
@@ -47,7 +47,7 @@ protected Schema getBootstrapSourceSchema(HoodieEngineContext context, List<Pair
MessageType parquetSchema = partitions.stream().flatMap(p -> p.getValue().stream()).map(fs -> {
try {
Path filePath = FileStatusUtils.toPath(fs.getPath());
- return ParquetUtils.readSchema(context.getHadoopConf().get(), filePath);
+ return new ParquetUtils().readSchema(context.getHadoopConf().get(), filePath);
Review comment:
There is actually another function, `readAvroSchema` in `DataFileUtils`, that
always returns a schema in Avro format regardless of the file format. However,
in this scenario the code runs a `ParquetToSparkSchemaConverter` on a Parquet
schema, which must be of type `MessageType`, so the Parquet-specific
`readSchema` is still needed here.
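To illustrate the point, here is a minimal, self-contained sketch of the type constraint. It does not use the real Hudi or Parquet classes: every `*StandIn` type, the `String` file path, and the `convertToSparkSchema` helper are hypothetical simplifications, and the real signatures differ (for example, `readSchema` in the diff above also takes a Hadoop configuration).

```java
public class SchemaTypeSketch {

    // Hypothetical stand-in for an Avro schema (not org.apache.avro.Schema).
    static final class AvroSchemaStandIn {
        final String description;
        AvroSchemaStandIn(String description) { this.description = description; }
    }

    // Hypothetical stand-in for a Parquet MessageType.
    static final class MessageTypeStandIn {
        final String description;
        MessageTypeStandIn(String description) { this.description = description; }
    }

    // Hypothetical stand-in for the format-agnostic base class:
    // it only promises an Avro-format schema, whatever the file format.
    abstract static class DataFileUtilsStandIn {
        abstract AvroSchemaStandIn readAvroSchema(String filePath);
    }

    // Hypothetical stand-in for the Parquet-specific subclass:
    // it additionally exposes the native Parquet schema.
    static final class ParquetUtilsStandIn extends DataFileUtilsStandIn {
        @Override
        AvroSchemaStandIn readAvroSchema(String filePath) {
            return new AvroSchemaStandIn("avro:" + filePath);
        }

        MessageTypeStandIn readSchema(String filePath) {
            return new MessageTypeStandIn("parquet:" + filePath);
        }
    }

    // Hypothetical stand-in for the converter step: it accepts only the
    // Parquet schema type, so an Avro schema cannot be passed to it.
    static String convertToSparkSchema(MessageTypeStandIn parquetSchema) {
        return "spark(" + parquetSchema.description + ")";
    }

    public static String bootstrapSchema(String filePath) {
        ParquetUtilsStandIn utils = new ParquetUtilsStandIn();
        // convertToSparkSchema(utils.readAvroSchema(filePath)) would not compile:
        // the converter needs the Parquet-specific type.
        return convertToSparkSchema(utils.readSchema(filePath));
    }

    public static void main(String[] args) {
        System.out.println(bootstrapSchema("part-0001.parquet"));
    }
}
```

In this sketch the Avro-returning method is available on the base type, but the conversion step only type-checks against the Parquet-specific return value, which is why the call site goes through the concrete Parquet utility.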