eadwright commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-846615701


   Update: I've written some Java code that writes a Parquet file nearly big 
enough to trigger the issue, and I can read that file back successfully. 
However, there are two problems:
   
   * If I bump the record count to make the file big enough to reproduce this 
bug (1633), a different bug causes a 32-bit integer overflow. As it stands, the 
code cannot reproduce the file that was created in Python.
   * The code I'm using to read the data (the potential unit test) does not 
work against the Python-produced file (which has an Avro schema). I get this 
error, which I suspect is a classpath issue: `java.lang.NoSuchMethodError: 
org.apache.parquet.format.LogicalType.getSetField()Lshaded/parquet/org/apache/thrift/TFieldIdEnum;`
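
   For anyone unfamiliar with the kind of 32-bit overflow mentioned in the 
first point, here is a minimal, generic sketch of how a 64-bit size wraps when 
narrowed to an `int`. It is not taken from parquet-mr: the class name, the 
method names, and the 3,000,000,000 value are all made up for illustration. 
`Math.toIntExact` is a standard JDK call that fails loudly instead of wrapping.

```java
public class OverflowSketch {
    // Hypothetical helper (not parquet-mr code): unchecked narrowing of a
    // 64-bit size to int. Values above Integer.MAX_VALUE silently wrap.
    public static int narrowUnchecked(long size) {
        return (int) size;
    }

    // Hypothetical helper: the same narrowing, but Math.toIntExact throws
    // ArithmeticException when the value does not fit in an int.
    public static int narrowChecked(long size) {
        return Math.toIntExact(size);
    }

    public static void main(String[] args) {
        // Illustrative size above Integer.MAX_VALUE (2_147_483_647).
        long size = 3_000_000_000L;

        // The unchecked cast yields a negative number.
        System.out.println(narrowUnchecked(size));

        try {
            narrowChecked(size);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

   A checked conversion like this at least turns silent corruption into an 
immediate exception, which might make the second overflow easier to locate.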
   
   So... if any of you can reproduce the original error with the Parquet file I 
posted above and can validate that this 7-line fix addresses it, that'd be 
great. I'm open to other ideas, of course.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

