Hi,
I created a table with schema metadata and am trying to write it to an ORC
file. When I inspect the file afterwards with the orc-metadata tool, I cannot
see the metadata I set in pyarrow: the 'user metadata' section is empty.
$ python3.10
>>> import pyarrow as pa
>>> from pyarrow import orc
>>> n_legs = pa.array([2, 4, 5, 100])
>>> animals = pa.array(["Flamingo", "Horse", "Brittle stars", "Centipede"])
>>> names = ["n_legs", "animals"]
>>> my_metadata={"n_legs": "Number of legs per animal"}
>>> table1 = pa.table([n_legs, animals], names=names, metadata = my_metadata)
>>> table1.schema
n_legs: int64
animals: string
-- schema metadata --
n_legs: 'Number of legs per animal'
>>> table1.schema.metadata
{b'n_legs': b'Number of legs per animal'}
>>> orc.write_table(table1, "table1.orc")
$ orc-metadata table1.orc
{ "name": "table1.orc",
"type": "struct<n_legs:bigint,animals:string>",
"attributes": {},
"rows": 4,
"stripe count": 1,
"format": "0.12", "writer version": "ORC-135", "software version": "ORC C++ 1.9.0",
"compression": "none",
"file length": 405,
"content": 185, "stripe stats": 56, "footer": 136, "postscript": 24,
"row index stride": 10000,
"user metadata": {
},
"stripes": [
{ "stripe": 0, "rows": 4,
"offset": 3, "length": 185,
"index": 69, "data": 45, "footer": 71
}
]
}
Am I doing something wrong, or is this expected?
Thanks,
Hinko