Re: [PR] [HUDI-7874] Fix Hudi being able to read 2-level structure [hudi]

via GitHub Thu, 13 Jun 2024 15:53:11 -0700


VitoMakarevich commented on PR #11450:
URL: https://github.com/apache/hudi/pull/11450#issuecomment-2166919410


   Also, I see [PR #7512](https://github.com/apache/hudi/pull/7512), which 
introduced the custom class 
`hudi-hadoop-common/src/main/java/org/apache/parquet/avro/HoodieAvroReadSupport.java`.
   
   As I understand it, that PR came after the developers decided to stop using 
the schema the file was written with and to use the deduced writer schema 
instead. So the purpose of the previous PR was this:
   
   If for some reason a Parquet file is written in the new style (3-level 
nesting), likely by a tool other than Spark or with 
`spark.hadoop.parquet.avro.write-old-list-structure` set to `false`, then, as 
long as there is no explicit override (safety first), ask the reader to read it 
in the new style.
   Without that, reading such files likely led to the same issue we are facing 
now. The current change is essentially the reverse case: if the file was 
written with 2-level lists, use the 2-level reader regardless of the runtime 
setting.
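
A minimal sketch of the two rules described above, to make the difference concrete. This is not the actual `HoodieAvroReadSupport` code: the class and method names are hypothetical, a plain `Map` stands in for the Hadoop configuration, and the `fileHasTwoLevelLists` flag is assumed to come from inspecting the Parquet file schema. It also assumes the proposed change keeps the previous behavior for 3-level files.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; names are not taken from the Hudi code base.
class ListStructureSketch {
    // The key from the comment above, without the "spark.hadoop." prefix.
    static final String WRITE_OLD_LIST_STRUCTURE = "parquet.avro.write-old-list-structure";

    // Previous PR: a 3-level file is read as new style unless the user set an override.
    static Map<String, String> previousRule(Map<String, String> conf, boolean fileHasTwoLevelLists) {
        Map<String, String> out = new HashMap<>(conf);
        if (!fileHasTwoLevelLists && !out.containsKey(WRITE_OLD_LIST_STRUCTURE)) {
            out.put(WRITE_OLD_LIST_STRUCTURE, "false");
        }
        return out;
    }

    // Current change (as described): a 2-level file is always read as 2-level,
    // whatever the runtime setting says. Other files fall back to the previous rule
    // (assumption, see the lead-in).
    static Map<String, String> proposedRule(Map<String, String> conf, boolean fileHasTwoLevelLists) {
        if (fileHasTwoLevelLists) {
            Map<String, String> out = new HashMap<>(conf);
            out.put(WRITE_OLD_LIST_STRUCTURE, "true");
            return out;
        }
        return previousRule(conf, false);
    }
}
```

The asymmetry is the point: the previous rule yields to an explicit setting, while the 2-level rule in this sketch overrides it because the file layout is treated as the source of truth.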

