Hi,
I'm trying to write out a very long column of strings for which dictionary
encoding makes sense, but I'm running into an ArrowNotImplementedError. I'm
wondering whether I'm doing something wrong or whether this is simply a TODO.
For example:
>>> import numpy as np
>>> import pyarrow
>>> import pyarrow.parquet
>>> xs = pyarrow.DictionaryArray.from_arrays(np.tile(0, 1000),
...     np.array(["TEST"], dtype=object))
>>> pyarrow.parquet.write_table(pyarrow.Table.from_arrays(arrays=[xs],
...     names=["COL"]), "/tmp/TEST.pq")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data01/home/michaelk/src/core/external/lib/python2.7/site-packages/pyarrow/parquet.py", line 772, in write_table
    writer = ParquetWriter(where, table.schema, **options)
  File "pyarrow/_parquet.pyx", line 582, in pyarrow._parquet.ParquetWriter.__cinit__ (/data01/home/michaelk/build/arrow/python/build/temp.linux-x86_64-2.7/_parquet.cxx:9915)
  File "pyarrow/error.pxi", line 66, in pyarrow.lib.check_status (/data01/home/michaelk/build/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:7369)
pyarrow.lib.ArrowNotImplementedError: NotImplemented: unhandled type
Is there something that I'm doing wrong here? Thanks.
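For context, here is roughly what I mean by dictionary encoding, as a
plain-Python sketch. The helper names dictionary_encode and dictionary_decode
are made up for illustration; this is not the pyarrow API and says nothing
about how Arrow or Parquet implement it internally. The point is only that a
long column with few distinct strings reduces to a small dictionary plus a
list of integer indices.

```python
# Illustrative only: hypothetical helpers, not part of pyarrow.

def dictionary_encode(values):
    """Return (indices, dictionary) for a list of hashable values."""
    dictionary = []   # distinct values, in first-seen order
    positions = {}    # value -> its index in `dictionary`
    indices = []      # one small integer per input value
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


def dictionary_decode(indices, dictionary):
    """Rebuild the original values from the indices and the dictionary."""
    return [dictionary[i] for i in indices]


# Same shape as the column in my example: 1000 copies of one string.
column = ["TEST"] * 1000
indices, dictionary = dictionary_encode(column)
```

In the example above, the column is stored as a single dictionary entry and
1000 zero indices, which is the layout I was hoping to write to Parquet.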
-Mike