eadwright commented on pull request #902: URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-846615701
Update: I've written some Java code that writes a parquet file nearly big enough to trigger the issue, and it can read that file back successfully. However, there are two problems:

* If I bump the record count so that the file is big enough to reproduce this bug (1633), another bug causes a 32-bit integer overflow. The code as it stands cannot reproduce the file that was created in Python.
* The code I'm using to read the data, the potential unit test, does not work against the Python-produced file (which has an Avro schema). I get this error, which I suspect is a classpath issue: `java.lang.NoSuchMethodError: org.apache.parquet.format.LogicalType.getSetField()Lshaded/parquet/org/apache/thrift/TFieldIdEnum;`

So, if any of you can reproduce the original error with the parquet file I posted above and can validate that this 7-line fix addresses it, that'd be great. Open to ideas, of course.
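For anyone unfamiliar with the 32-bit overflow failure mode mentioned above, here is a minimal, standalone Java sketch. It is not the parquet-mr code path and does not reproduce either bug; the class and method names are hypothetical and chosen only for illustration. It shows how a byte count just past `Integer.MAX_VALUE` wraps to a negative number when narrowed with a plain cast, and how `Math.toIntExact` makes the same condition fail loudly instead.

```java
public class OverflowSketch {
    // Hypothetical helper: narrows a byte count with a plain cast.
    // Values above Integer.MAX_VALUE wrap around silently.
    static int narrowUnchecked(long byteCount) {
        return (int) byteCount;
    }

    // Hypothetical helper: same narrowing, but throws
    // ArithmeticException instead of wrapping.
    static int narrowChecked(long byteCount) {
        return Math.toIntExact(byteCount);
    }

    public static void main(String[] args) {
        // One byte past the signed 32-bit limit.
        long size = Integer.MAX_VALUE + 1L;

        // The unchecked cast yields a negative value.
        System.out.println("unchecked: " + narrowUnchecked(size));

        // The checked conversion reports the overflow.
        try {
            narrowChecked(size);
        } catch (ArithmeticException e) {
            System.out.println("checked: overflow detected");
        }
    }
}
```

Keeping sizes and offsets in `long` until the last possible moment, and using a checked conversion where an `int` is unavoidable, is one common way to surface this kind of problem early.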
