Hi,

I created a table with schema metadata and am trying to write it to an ORC file.
When I inspect the file afterwards with the orc-metadata tool, the metadata I set
in pyarrow is not there: the 'user metadata' section is empty.

$ python3.10
>>> import pyarrow as pa
>>> from pyarrow import orc
>>> n_legs = pa.array([2, 4, 5, 100])
>>> animals = pa.array(["Flamingo", "Horse", "Brittle stars", "Centipede"])
>>> names = ["n_legs", "animals"]
>>> my_metadata={"n_legs": "Number of legs per animal"}
>>> table1 = pa.table([n_legs, animals], names=names, metadata = my_metadata)
>>> table1.schema
n_legs: int64
animals: string
-- schema metadata --
n_legs: 'Number of legs per animal'
>>> table1.schema.metadata
{b'n_legs': b'Number of legs per animal'}
>>> orc.write_table(table1, "table1.orc")


$ orc-metadata table1.orc 
{ "name": "table1.orc",
  "type": "struct<n_legs:bigint,animals:string>",
  "attributes": {},
  "rows": 4,
  "stripe count": 1,
  "format": "0.12", "writer version": "ORC-135", "software version": "ORC C++ 1.9.0",
  "compression": "none",
  "file length": 405,
  "content": 185, "stripe stats": 56, "footer": 136, "postscript": 24,
  "row index stride": 10000,
  "user metadata": {
  },
  "stripes": [
    { "stripe": 0, "rows": 4,
      "offset": 3, "length": 185,
      "index": 69, "data": 45, "footer": 71
    }
  ]
}


Am I doing something wrong, or is this expected?

Thanks,
Hinko
