Jorge Leitão created ARROW-12963:
------------------------------------
Summary: [Parquet] Can't roundtrip required/non-nullable strings
Key: ARROW-12963
URL: https://issues.apache.org/jira/browse/ARROW-12963
Project: Apache Arrow
Issue Type: Bug
Components: Parquet
Affects Versions: 4.0.0
Reporter: Jorge Leitão
The snippet further below fails with the following error:
{code:java}
File "bla.parquet", line 1
PAR1<<\
^
SyntaxError: invalid syntax
{code}
{code:python}
import pyarrow as pa
import pyarrow.parquet as pq  # the parquet submodule must be imported explicitly

string_required = ["Hello", "bbb", "aa", "", "bbb", "abc", "bbb", "bbb", "def", "aaa"]

# A single required (non-nullable) UTF-8 column.
fields = [
    pa.field("string1", pa.utf8(), nullable=False),
]
schema = pa.schema(fields)

data = {
    "string1": string_required,
}
t = pa.table(data, schema=schema)

# Write with version 1.0 data pages, then reopen the file.
pq.write_table(
    t,
    "bla.parquet",
    data_page_version="1.0",
)
parquet_file = pq.ParquetFile("bla.parquet")
{code}