Kengo Seki created PARQUET-1596:
-----------------------------------

             Summary: PARQUET-1375 broke parquet-cli's to-avro command
                 Key: PARQUET-1596
                 URL: https://issues.apache.org/jira/browse/PARQUET-1596
             Project: Parquet
          Issue Type: Bug
          Components: parquet-cli
            Reporter: Kengo Seki


Given the following JSON file:

{code}
$ cat /tmp/sample.json 
{ "id": 1, "name": "Alice" }
{ "id": 2, "name": "Bob" }
{ "id": 3, "name": "Carol" }
{ "id": 4, "name": "Dave" }
{code}

running {{to-avro}} on the master branch to convert this file to Avro fails with a NullPointerException:

{code}
$ git branch -v
* master 47398be7 PARQUET-1375: Upgrade to Jackson 2.9.9 (#616)
$ mvn clean install -DskipTests

(snip)

[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ parquet-cli ---
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.jar
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/pom.xml to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.pom
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-tests.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-tests.jar
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-runtime.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-runtime.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  14.769 s
[INFO] Finished at: 2019-06-12T23:52:57+09:00
[INFO] ------------------------------------------------------------------------
$ mvn dependency:copy-dependencies

(snip)

$ java -cp 'target/*:target/dependency/*' org.apache.parquet.cli.Main to-avro /tmp/sample.json -o /tmp/sample.avro
Unknown error
java.lang.RuntimeException: Failed on record 0
        at org.apache.parquet.cli.commands.ToAvroCommand.run(ToAvroCommand.java:120)
        at org.apache.parquet.cli.Main.run(Main.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.parquet.cli.Main.main(Main.java:177)
Caused by: java.lang.NullPointerException
        at org.apache.avro.file.DataFileWriter.create(DataFileWriter.java:153)
        at org.apache.avro.file.DataFileWriter.create(DataFileWriter.java:145)
        at org.apache.parquet.cli.commands.ToAvroCommand.run(ToAvroCommand.java:112)
        ... 3 more
$ echo $?
1
{code}

With the previous revision, the same command succeeds:

{code}
$ git checkout HEAD^
HEAD is now at 9d6fb45e PARQUET-1576 Bump Apache Avro to 1.9.0 (#638)
$ mvn clean install -DskipTests

(snip)

[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ parquet-cli ---
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.jar
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/pom.xml to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.pom
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-tests.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-tests.jar
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-runtime.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-runtime.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  15.822 s
[INFO] Finished at: 2019-06-12T23:57:04+09:00
[INFO] ------------------------------------------------------------------------
$ mvn dependency:copy-dependencies

(snip)

$ java -cp 'target/*:target/dependency/*' org.apache.parquet.cli.Main to-avro /tmp/sample.json -o /tmp/sample.avro
$ echo $?
0
$ java -cp 'target/*:target/dependency/*' org.apache.parquet.cli.Main head /tmp/sample.avro
{"id": 1, "name": "Alice"}
{"id": 2, "name": "Bob"}
{"id": 3, "name": "Carol"}
{"id": 4, "name": "Dave"}
{code}

Reverting the following code in {{AvroJson.java}}

{code:title=AvroJson.java}
  public static Iterator<JsonNode> parser(final InputStream stream) {
    try (JsonParser parser = FACTORY.createParser(stream)) {
{code}

to

{code:title=AvroJson.java}
  public static Iterator<JsonNode> parser(final InputStream stream) {
    try {
      JsonParser parser = FACTORY.createParser(stream);
{code}

seems to fix the problem.
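
A possible explanation, which is my assumption and not something verified in this report: the try-with-resources form closes the {{JsonParser}} as soon as {{parser()}} returns, before the caller has consumed the returned iterator. The sketch below reproduces that pattern with JDK classes only. {{LazyIteratorDemo}}, {{lines}}, {{closedEarly}}, {{leftOpen}} and {{count}} are hypothetical names, and {{BufferedReader}} merely stands in for the Jackson parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical demo class; not part of parquet-cli.
class LazyIteratorDemo {

  // Lazy iterator: nothing is read until the caller asks for an element.
  static Iterator<String> lines(final BufferedReader reader) {
    return new Iterator<String>() {
      private String buffered;

      @Override
      public boolean hasNext() {
        if (buffered == null) {
          try {
            // Throws IOException if the reader has already been closed.
            buffered = reader.readLine();
          } catch (IOException e) {
            throw new UncheckedIOException(e);
          }
        }
        return buffered != null;
      }

      @Override
      public String next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        String line = buffered;
        buffered = null;
        return line;
      }
    };
  }

  // Mirrors the try-with-resources shape: the reader is closed
  // when the method returns, before the iterator is used.
  static Iterator<String> closedEarly(String text) throws IOException {
    try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
      return lines(reader);
    }
  }

  // Mirrors the reverted shape: the reader stays open for the iterator.
  static Iterator<String> leftOpen(String text) {
    BufferedReader reader = new BufferedReader(new StringReader(text));
    return lines(reader);
  }

  // Drains the iterator and returns the number of elements seen.
  static int count(Iterator<String> it) {
    int n = 0;
    while (it.hasNext()) {
      it.next();
      n++;
    }
    return n;
  }
}
```

Draining the iterator from {{leftOpen}} yields every line, while draining the one from {{closedEarly}} fails on the first read. With Jackson the symptom may look different (for example, an iterator that yields no records), so this only illustrates the general hazard of returning a lazy iterator from inside a try-with-resources block.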

cc [~Fokko] :)


