Nicholas Rushton created PARQUET-1408:
-----------------------------------------

             Summary: parquet-tools SimpleRecord does not display null columns
                 Key: PARQUET-1408
                 URL: https://issues.apache.org/jira/browse/PARQUET-1408
             Project: Parquet
          Issue Type: Bug
    Affects Versions: 1.9.0
            Reporter: Nicholas Rushton
             Fix For: 1.10.1


When running parquet-tools cat -j on a Parquet file whose records contain null 
values, the null columns are omitted from the JSON output.

 

Example:
{code:java}
scala> case class Foo(a: Int, b: String)
defined class Foo

scala> org.apache.spark.sql.SparkSession.builder.getOrCreate.createDataset((0 to 1000).map(x => Foo(1,null))).write.parquet("/tmp/foobar/"){code}
Actual:
{code:java}
☁  parquet-tools [master] ⚡  java -jar target/parquet-tools-1.10.1-SNAPSHOT.jar cat -j /tmp/foobar/part-00000-436a4d37-d82a-4771-8e7e-e4d428464675-c000.snappy.parquet | head -n5
{"a":1}
{"a":1}
{"a":1}
{"a":1}
{"a":1}{code}
Expected:
{code:java}
☁  parquet-tools [master] ⚡  java -jar target/parquet-tools-1.10.1-SNAPSHOT.jar cat -j /tmp/foobar/part-00000-436a4d37-d82a-4771-8e7e-e4d428464675-c000.snappy.parquet | head -n5
{"a":1,"b":null}
{"a":1,"b":null}
{"a":1,"b":null}
{"a":1,"b":null}
{"a":1,"b":null}{code}
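
The difference between the actual and expected output can be illustrated with a small standalone sketch. This is not the parquet-tools SimpleRecord code. It assumes a record is a flat, ordered map of column names to values, and the class name NullColumnSketch and the includeNulls flag are hypothetical names introduced only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullColumnSketch {

    // Serializes a flat record to a JSON object string.
    // includeNulls = false mimics the reported (actual) output;
    // includeNulls = true mimics the expected output.
    // Hypothetical helper, not part of parquet-tools.
    static String toJson(Map<String, Object> record, boolean includeNulls) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : record.entrySet()) {
            Object value = e.getValue();
            if (value == null && !includeNulls) {
                continue; // the null column is silently dropped
            }
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(e.getKey()).append("\":");
            if (value == null) {
                sb.append("null");
            } else if (value instanceof String) {
                // No string escaping here: this is a sketch only.
                sb.append('"').append(value).append('"');
            } else {
                sb.append(value);
            }
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        // Same shape as Foo(1, null) in the example above.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("a", 1);
        row.put("b", null);
        System.out.println(toJson(row, false)); // {"a":1}
        System.out.println(toJson(row, true));  // {"a":1,"b":null}
    }
}
```

Emitting an explicit null keeps every record's key set identical to the file schema, so a downstream JSON consumer does not have to guess whether a column is absent or merely null.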
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
