We have this marked to be done during the Arrow 0.6.0 release cycle, though most of the work is probably in the parquet-cpp codebase. We would appreciate any help, as there aren't many of us.
Thanks

On Sun, Jul 30, 2017 at 7:49 PM, Katelman, Michael <[email protected]> wrote:
> Thanks, Wes. Not sure if it's helpful at all for prioritization, but for
> related reasons I'm also interested in ARROW-232.
>
> -----Original Message-----
> From: Wes McKinney [mailto:[email protected]]
> Sent: Sunday, July 30, 2017 11:03 AM
> To: [email protected]
> Subject: Re: writing tables with dictionary arrays
>
> hi Mike
>
> No, it's a TODO:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_PARQUET-2D929&d=DwIFaQ&c=f5Q7ov8zryUUIGT55zpGgw&r=p7uiAfJkXEwbVhZPqB-VxtsgxuGNpO5tGgnMUX3wqrPAIvdxhcKmn9kvZiXDziBQ&m=DFEP-w7VtFUtB3G4o9vcRlkprTEMcj8iTSNhG07SmF8&s=qqD169RNtHcOaM2RlsjRf-9rXmnGimEyV1pCQeaISDo&e=
>
> - Wes
>
> On Sun, Jul 30, 2017 at 11:00 AM, Katelman, Michael
> <[email protected]> wrote:
>> Hi,
>>
>> I was trying to write out a really long column of strings where it makes
>> sense to use a dictionary encoding. But, I'm running into a
>> NotImplementedError and wondering if I'm doing something wrong or if it's
>> really a TODO:
>>
>> e.g.
>>
>> >>> xs = pyarrow.DictionaryArray.from_arrays(np.tile(0, 1000), np.array(["TEST"],dtype=object))
>> >>> pyarrow.parquet.write_table(pyarrow.Table.from_arrays(arrays=[xs], names=["COL"]), "/tmp/TEST.pq")
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/data01/home/michaelk/src/core/external/lib/python2.7/site-packages/pyarrow/parquet.py", line 772, in write_table
>>     writer = ParquetWriter(where, table.schema, **options)
>>   File "pyarrow/_parquet.pyx", line 582, in pyarrow._parquet.ParquetWriter.__cinit__ (/data01/home/michaelk/build/arrow/python/build/temp.linux-x86_64-2.7/_parquet.cxx:9915)
>>   File "pyarrow/error.pxi", line 66, in pyarrow.lib.check_status (/data01/home/michaelk/build/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:7369)
>> pyarrow.lib.ArrowNotImplementedError: NotImplemented: unhandled type
>>
>> Is there something that I'm doing wrong here? Thanks.
>>
>> -Mike
