Atul Mohan created PARQUET-2178:
-----------------------------------

             Summary: ParquetReader constructed using builder fails to read encrypted files
                 Key: PARQUET-2178
                 URL: https://issues.apache.org/jira/browse/PARQUET-2178
             Project: Parquet
          Issue Type: Bug
          Components: parquet-mr
    Affects Versions: 1.12.2
            Reporter: Atul Mohan

ParquetReader objects can be constructed using the builder as follows:
{code:java}
ParquetReader<Group> builderReader = ParquetReader.builder(
        new GroupReadSupport(), new Path("path/to/c000.snappy.parquet"))
    .withConf(conf)
    .build();
{code}
This ParquetReader object cannot be used to read encrypted files, as
{noformat}
builderReader.read()
{noformat}
fails with the following exception:
{code:java}
java.lang.NullPointerException
    at org.apache.parquet.crypto.keytools.FileKeyUnwrapper.getKey(FileKeyUnwrapper.java:87)
{code}
It seems like the reason is that the _withConf_ method within the ParquetReader builder [clears the optionsBuilder set earlier|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetReader.java#L231].

My proposal for a solution would be to un-deprecate the constructor:
{code:java}
ParquetReader(Configuration conf, Path file, ReadSupport<T> readSupport)
{code}
so that applications can read encrypted Parquet files using the ParquetReader. If approved, I can do a PR to make this change.

Here is a sample test showcasing the issue: [https://gist.github.com/a2l007/3d813cc5e44c45100dda169dc6245ae4]
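
For readers unfamiliar with the pattern suspected above, here is a minimal, self-contained sketch of a builder whose configuration setter replaces its internal options builder. It is a toy model only: BuilderResetSketch, ToyOptionsBuilder, ToyReaderBuilder and the string keys are hypothetical names made up for illustration, not parquet-mr classes, and the sketch does not reproduce the actual code path that leads to the NullPointerException.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these classes belong to parquet-mr.
public class BuilderResetSketch {

    // Hypothetical stand-in for an options builder that accumulates read settings.
    static class ToyOptionsBuilder {
        final Map<String, String> settings = new HashMap<>();

        ToyOptionsBuilder set(String key, String value) {
            settings.put(key, value);
            return this;
        }
    }

    // Hypothetical stand-in for a reader builder that owns an options builder.
    static class ToyReaderBuilder {
        private ToyOptionsBuilder optionsBuilder = new ToyOptionsBuilder();

        ToyReaderBuilder withOption(String key, String value) {
            optionsBuilder.set(key, value);
            return this;
        }

        // Models the suspected pattern: a new configuration replaces the
        // options builder, so anything set earlier is silently dropped.
        ToyReaderBuilder withConf(String confName) {
            optionsBuilder = new ToyOptionsBuilder().set("conf", confName);
            return this;
        }

        Map<String, String> options() {
            return optionsBuilder.settings;
        }
    }

    public static void main(String[] args) {
        // Option set before withConf: lost when the options builder is replaced.
        ToyReaderBuilder lost = new ToyReaderBuilder()
                .withOption("decryption", "enabled")
                .withConf("myConf");
        System.out.println(lost.options().containsKey("decryption")); // false

        // Option set after withConf: survives in this toy model.
        ToyReaderBuilder kept = new ToyReaderBuilder()
                .withConf("myConf")
                .withOption("decryption", "enabled");
        System.out.println(kept.options().containsKey("decryption")); // true
    }
}
```

In this toy, only the order of the calls decides whether the earlier setting survives. Whether reordering calls would help with the real ParquetReader builder is not established here.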