[
https://issues.apache.org/jira/browse/ARROW-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223499#comment-16223499
]
ASF GitHub Bot commented on ARROW-1743:
---------------------------------------
Licht-T commented on a change in pull request #1260: ARROW-1743: [Python] Avoid
non-array writeable-flag check
URL: https://github.com/apache/arrow/pull/1260#discussion_r147553977
##########
File path: python/pyarrow/pandas_compat.py
##########
@@ -404,7 +404,7 @@ def table_to_blockmanager(options, table, memory_pool,
nthreads=1):
index_name = None if is_unnamed_index_level(name) else name
col_pandas = col.to_pandas()
values = col_pandas.values
- if not values.flags.writeable:
+ if hasattr(values, 'flags') and not values.flags.writeable:
Review comment:
@xhochy Do we need to check the writability of `Categorical.codes` and
`Categorical.categories.values` even though neither can be assigned directly?
```python
>>> import pandas as pd
>>> import numpy as np
>>> import pyarrow as pa
>>> values = pd.Categorical(['A', 'A'])
>>>
>>> arr = pa.DictionaryArray.from_arrays(
... np.array([0, 0]),
... np.array([100]))
>>>
>>> result = arr.to_pandas()
>>>
>>> result.codes = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pandas/core/categorical.py", line 506, in _set_codes
    raise ValueError("cannot set Categorical codes directly")
ValueError: cannot set Categorical codes directly
>>>
>>> result.categories.values = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
```
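For reference, a minimal sketch of what the `hasattr` guard in the diff does, isolated from pyarrow. The helper name `needs_copy` and the `SimpleNamespace` stand-ins are hypothetical and only imitate the two relevant shapes: an object with NumPy-style `flags.writeable`, and a Categorical-like object with no `flags` attribute at all. It assumes the surrounding code only copies `values` when this check is true.

```python
from types import SimpleNamespace


def needs_copy(values):
    """Return True only if `values` has NumPy-style flags and is read-only.

    Hypothetical helper, not part of pyarrow; it mirrors the condition
    `hasattr(values, 'flags') and not values.flags.writeable` from the diff.
    """
    return hasattr(values, 'flags') and not values.flags.writeable


# Stand-ins for illustration only (assumptions, not real library objects).
read_only_array = SimpleNamespace(flags=SimpleNamespace(writeable=False))
writeable_array = SimpleNamespace(flags=SimpleNamespace(writeable=True))
categorical_like = SimpleNamespace(codes=[0, 0], categories=['A'])  # no `flags`

print(needs_copy(read_only_array))   # True: flags present and not writeable
print(needs_copy(writeable_array))   # False: flags present and writeable
print(needs_copy(categorical_like))  # False: no `flags`, so the check is skipped
```

With the guard in place, a Categorical-like object short-circuits on `hasattr` instead of raising `AttributeError`; the open question above is whether its `codes` and `categories.values` should additionally be inspected.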
----------------------------------------------------------------
> Table to_pandas fails when index contains categorical column
> ------------------------------------------------------------
>
> Key: ARROW-1743
> URL: https://issues.apache.org/jira/browse/ARROW-1743
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.7.1
> Reporter: Brian Pendleton
> Assignee: Licht Takeuchi
> Labels: pull-request-available
>
> Categorical columns in the index of a dataframe are causing a roundtrip
> failure.
> {code}
> >>> import pandas as pd
> >>> import pyarrow as pa
> >>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})
> >>> df['a'] = df.a.astype('category')
> >>> df = df.set_index('a')
> >>> tbl = pa.Table.from_pandas(df)
> >>> tbl.to_pandas()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "table.pxi", line 881, in pyarrow.lib.Table.to_pandas
>   File "C:\Users\bpendlet\Miniconda3\envs\panpy3\lib\site-packages\pyarrow\pandas_compat.py", line 303, in table_to_blockmanager
>     if not values.flags.writeable:
> AttributeError: 'Categorical' object has no attribute 'flags'
> {code}
> Works as expected when the index column is not categorical:
> {code}
> >>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})
> >>> df = df.set_index('a')
> >>> tbl = pa.Table.from_pandas(df)
> >>> tbl.to_pandas()
>    b
> a
> 1  1
> 2  2
> 3  3
> {code}