[
https://issues.apache.org/jira/browse/PARQUET-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500502#comment-16500502
]
ASF GitHub Bot commented on PARQUET-1317:
-----------------------------------------
zivanfi closed pull request #491: PARQUET-1317: Fix ParquetMetadataConverter throw NPE
URL: https://github.com/apache/parquet-mr/pull/491
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:
diff --git a/parquet-hadoop/src/test/java/org/apache/parquet/format/converter/TestParquetMetadataConverter.java b/parquet-hadoop/src/test/java/org/apache/parquet/format/converter/TestParquetMetadataConverter.java
index 4fc4035f7..b3eebd6ae 100644
--- a/parquet-hadoop/src/test/java/org/apache/parquet/format/converter/TestParquetMetadataConverter.java
+++ b/parquet-hadoop/src/test/java/org/apache/parquet/format/converter/TestParquetMetadataConverter.java
@@ -146,6 +146,22 @@ public void testSchemaConverterDecimal() {
Assert.assertEquals(expected, schemaElements);
}
+ @Test
+ public void testLogicalTypesBackwardCompatibleWithConvertedTypes() {
+ ParquetMetadataConverter parquetMetadataConverter = new ParquetMetadataConverter();
+ MessageType expected = Types.buildMessage()
+ .required(PrimitiveTypeName.BINARY)
+ .as(OriginalType.DECIMAL).precision(9).scale(2)
+ .named("aBinaryDecimal")
+ .named("Message");
+ List<SchemaElement> parquetSchema = parquetMetadataConverter.toParquetSchema(expected);
+ // Set logical type field to null to test backward compatibility with files written by older API,
+ // where converted_types are written to the metadata, but logicalType is missing
+ parquetSchema.get(1).setLogicalType(null);
+ MessageType schema = parquetMetadataConverter.fromParquetSchema(parquetSchema, null);
+ assertEquals(expected, schema);
+ }
+
@Test
public void testEnumEquivalence() {
ParquetMetadataConverter parquetMetadataConverter = new ParquetMetadataConverter();
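
The diff shown in this message covers only the test; the converter change itself is not reproduced here. As a rough illustration of the behavior the test checks (a schema element whose logical type is null should still resolve through its converted type instead of throwing), here is a minimal sketch. Everything in it is a hypothetical stand-in: `LogicalTypeFallbackSketch`, `SchemaElement`, `ConvertedType`, `LogicalKind`, `OriginalType` and `resolveOriginalType` are simplified names invented for this example, not the parquet-mr or parquet-format classes, and the real fix may be structured differently.

```java
import java.util.Optional;

// Hypothetical stand-ins for illustration only; these are NOT the
// parquet-mr or parquet-format classes.
final class LogicalTypeFallbackSketch {

  // Simplified annotation that the reader wants to end up with.
  enum OriginalType { UTF8, DECIMAL }

  // Legacy annotation; per the test comment, older writers set only this one.
  enum ConvertedType {
    UTF8(OriginalType.UTF8), DECIMAL(OriginalType.DECIMAL);

    final OriginalType original;

    ConvertedType(OriginalType original) { this.original = original; }
  }

  // Newer annotation; null when the file was written without it.
  enum LogicalKind {
    STRING(OriginalType.UTF8), DECIMAL(OriginalType.DECIMAL);

    final OriginalType original;

    LogicalKind(OriginalType original) { this.original = original; }
  }

  static final class SchemaElement {
    final String name;
    final ConvertedType convertedType; // may be null
    final LogicalKind logicalType;     // may be null

    SchemaElement(String name, ConvertedType convertedType, LogicalKind logicalType) {
      this.name = name;
      this.convertedType = convertedType;
      this.logicalType = logicalType;
    }
  }

  // Prefer the newer logical type, fall back to the legacy converted type,
  // and report "no annotation" instead of dereferencing a null field.
  static Optional<OriginalType> resolveOriginalType(SchemaElement element) {
    if (element.logicalType != null) {
      return Optional.of(element.logicalType.original);
    }
    if (element.convertedType != null) {
      return Optional.of(element.convertedType.original);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Mirrors the test's setup: converted type present, logical type missing.
    SchemaElement legacy = new SchemaElement("aBinaryDecimal", ConvertedType.DECIMAL, null);
    System.out.println(resolveOriginalType(legacy));
  }
}
```

The point of the sketch is the order of the null checks: an unguarded access to the logical type would fail on exactly the kind of element the test constructs with setLogicalType(null).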
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> ParquetMetadataConverter throw NPE
> ----------------------------------
>
> Key: PARQUET-1317
> URL: https://issues.apache.org/jira/browse/PARQUET-1317
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.10.1
> Reporter: Yuming Wang
> Assignee: Yuming Wang
> Priority: Major
>
> How to reproduce:
> {code:scala}
> $ bin/spark-shell
> scala> spark.range(10).selectExpr("cast(id as string) as id").coalesce(1).write.parquet("/tmp/parquet-1317")
> scala>
> java -jar ./parquet-tools/target/parquet-tools-1.10.1-SNAPSHOT.jar head --debug file:///tmp/parquet-1317/part-00000-6cfafbdd-fdeb-4861-8499-8583852ba437-c000.snappy.parquet
> {code}
> {noformat}
> java.io.IOException: Could not read footer: java.lang.NullPointerException
> at org.apache.parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:271)
> at org.apache.parquet.hadoop.ParquetFileReader.readAllFootersInParallelUsingSummaryFiles(ParquetFileReader.java:202)
> at org.apache.parquet.hadoop.ParquetFileReader.readFooters(ParquetFileReader.java:354)
> at org.apache.parquet.tools.command.RowCountCommand.execute(RowCountCommand.java:88)
> at org.apache.parquet.tools.Main.main(Main.java:223)
> Caused by: java.lang.NullPointerException
> at org.apache.parquet.format.converter.ParquetMetadataConverter.getOriginalType(ParquetMetadataConverter.java:828)
> at org.apache.parquet.format.converter.ParquetMetadataConverter.buildChildren(ParquetMetadataConverter.java:1173)
> at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetSchema(ParquetMetadataConverter.java:1124)
> at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:1058)
> at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:1052)
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:532)
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:505)
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:499)
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:476)
> at org.apache.parquet.hadoop.ParquetFileReader$2.call(ParquetFileReader.java:261)
> at org.apache.parquet.hadoop.ParquetFileReader$2.call(ParquetFileReader.java:257)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> java.io.IOException: Could not read footer: java.lang.NullPointerException
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)