mickjermsurawong-stripe commented on issue #1654: URL: https://github.com/apache/iceberg/issues/1654#issuecomment-729232208
Hi ryan, it's really not a great patch we made. From the original static method [`openReader`](https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/file/DataFileReader.java#L54) which originally does two things: - header checking (where the bug is) - using public constructors for corresponding avro versions found We simply change any invocation in Iceberg for `DataFileReader.openReader` to our new class `AvroPatch2944DataFileReader.openReader` ``` public class AvroPatch2944DataFileReader { public static <D> FileReader<D> openReader(SeekableInput in, DatumReader<D> reader) throws IOException { // read header with corrected pointer // instantiate DataFileReader } } ``` The alternative is to simply skip checking header. Since the avro files read here is always written by the Iceberg framework, i think it is safe to simply call constructor of the file reader for new version (instead of 1.2). I'd prefer this for upstream Iceberg. For our internal fork, we don't want to make the same assumption, so we took a more conservative approach, maintaining the same logic with re-implementation ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
