[ https://issues.apache.org/jira/browse/ARROW-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney updated ARROW-3325:
--------------------------------
    Fix Version/s:     (was: 0.13.0)
                   0.14.0

> [Python] Support reading Parquet binary/string columns as pandas Categorical
> ----------------------------------------------------------------------------
>
>                 Key: ARROW-3325
>                 URL: https://issues.apache.org/jira/browse/ARROW-3325
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Assignee: Wes McKinney
>            Priority: Major
>              Labels: parquet
>             Fix For: 0.14.0
>
>
> Requires PARQUET-1324 and probably quite a bit of extra work
> Properly implementing this will require dictionary normalization across row
> groups. When reading a new row group, a fast path that compares the current
> dictionary with the prior dictionary should be used. This also needs to
> handle the case where a column chunk "fell back" to PLAIN encoding mid-stream

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
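
The dictionary normalization described in the issue can be illustrated with a minimal pure-Python sketch. This is not Arrow's or parquet-cpp's implementation: the function name `normalize_pieces` and the `(dictionary, data)` input shape are hypothetical, nulls are ignored, and a column chunk that fell back to PLAIN mid-stream is assumed to be modelled as a dictionary-encoded piece followed by a piece whose dictionary is `None`.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical input shape: (dictionary, data). When dictionary is None the
# piece is PLAIN-encoded and data holds the values themselves; otherwise data
# holds integer indices into dictionary.
Piece = Tuple[Optional[List[str]], list]


def normalize_pieces(pieces: Sequence[Piece]) -> Tuple[List[str], List[int]]:
    """Merge per-piece dictionaries into one dictionary plus one code list."""
    unified: List[str] = []         # unified dictionary, first-seen order
    position: Dict[str, int] = {}   # value -> index in unified
    codes: List[int] = []           # indices into unified, for every value
    prior: Optional[List[str]] = None  # last dictionary that was remapped
    remap: List[int] = []           # local index -> unified index for prior

    def intern(value: str) -> int:
        idx = position.get(value)
        if idx is None:
            idx = len(unified)
            position[value] = idx
            unified.append(value)
        return idx

    for dictionary, data in pieces:
        if dictionary is None:
            # PLAIN fallback: no dictionary, so intern each value directly.
            codes.extend(intern(v) for v in data)
            continue
        if dictionary != prior:
            # Slow path: the dictionary changed, rebuild the index mapping.
            remap = [intern(v) for v in dictionary]
            prior = dictionary
        # Fast path: an unchanged dictionary reuses the existing mapping.
        codes.extend(remap[i] for i in data)
    return unified, codes


pieces = [
    (["a", "b"], [0, 1, 1]),       # first row group
    (["a", "b"], [1, 0]),          # same dictionary: fast path
    (None, ["c", "a"]),            # PLAIN fallback
    (["b", "c", "d"], [0, 2, 1]),  # different dictionary: slow path
]
unified, codes = normalize_pieces(pieces)
print(unified)  # ['a', 'b', 'c', 'd']
print(codes)    # [0, 1, 1, 1, 0, 2, 0, 1, 3, 2]
```

On the pandas side, a unified dictionary and code list of this shape could be passed to `pandas.Categorical.from_codes(codes, categories=unified)`. The equality check used for the fast path is a full comparison here; a real reader would presumably want something cheaper.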