Nicholas Rushton created PARQUET-1408:
-----------------------------------------
Summary: parquet-tools SimpleRecord does not display null columns
Key: PARQUET-1408
URL: https://issues.apache.org/jira/browse/PARQUET-1408
Project: Parquet
Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Nicholas Rushton
Fix For: 1.10.1
When parquet-tools is run on a Parquet file that contains null values, the
columns whose value is null are omitted from the output.
Example:
{code:java}
scala> case class Foo(a: Int, b: String)
defined class Foo
scala> org.apache.spark.sql.SparkSession.builder.getOrCreate.createDataset((0 to 1000).map(x => Foo(1,null))).write.parquet("/tmp/foobar/"){code}
Actual:
{code:java}
☁ parquet-tools [master] ⚡ java -jar target/parquet-tools-1.10.1-SNAPSHOT.jar cat -j /tmp/foobar/part-00000-436a4d37-d82a-4771-8e7e-e4d428464675-c000.snappy.parquet | head -n5
{"a":1}
{"a":1}
{"a":1}
{"a":1}
{"a":1}{code}
Expected:
{code:java}
☁ parquet-tools [master] ⚡ java -jar target/parquet-tools-1.10.1-SNAPSHOT.jar cat -j /tmp/foobar/part-00000-436a4d37-d82a-4771-8e7e-e4d428464675-c000.snappy.parquet | head -n5
{"a":1,"b":null}
{"a":1,"b":null}
{"a":1,"b":null}
{"a":1,"b":null}
{"a":1,"b":null}{code}