[jira] [Comment Edited] (ARROW-2273) Cannot deserialize pandas SparseDataFrame
[ https://issues.apache.org/jira/browse/ARROW-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463622#comment-16463622 ]

Licht Takeuchi edited comment on ARROW-2273 at 5/4/18 9:39 AM:
---

[~mitar], Yes, it is still there. {{SparseDataFrame}} is naive implementation and has many bugs. I've spent a lot of time to fix these, but it is hard to fix all. IMO, this is not the right time to support {{SparseDataFrame}} in pyarrow.

was (Author: licht-t):
[~mitar], Yes, it is still there. But, {{SparseDataFrame}} is naive implementation and has many bugs. I've spent a lot of time to fix these, but it is hard to fix all. IMO, this is not the right time to support {{SparseDataFrame}} in pyarrow.

> Cannot deserialize pandas SparseDataFrame
> -
>
> Key: ARROW-2273
> URL: https://issues.apache.org/jira/browse/ARROW-2273
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.9.0
> Reporter: Mitar
> Assignee: Licht Takeuchi
> Priority: Major
>
> >>> import pyarrow
> >>> import pandas
> >>> a = pandas.SparseDataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
> >>> pyarrow.deserialize(pyarrow.serialize(a).to_buffer())
> Traceback (most recent call last):
>   File "", line 1, in
>   File "serialization.pxi", line 441, in pyarrow.lib.deserialize
>   File "serialization.pxi", line 404, in pyarrow.lib.deserialize_from
>   File "serialization.pxi", line 257, in pyarrow.lib.SerializedPyObject.deserialize
>   File "serialization.pxi", line 174, in pyarrow.lib.SerializationContext._deserialize_callback
>   File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/serialization.py", line 77, in _deserialize_pandas_dataframe
>     return pdcompat.serialized_dict_to_dataframe(data)
>   File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 450, in serialized_dict_to_dataframe
>     for block in data['blocks']]
>   File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 450, in
>     for block in data['blocks']]
>   File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 478, in _reconstruct_block
>     block = _int.make_block(block_arr, placement=placement)
>   File ".../.virtualenv/arrow/lib/python3.6/site-packages/pandas/core/internals.py", line 2957, in make_block
>     return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
>   File ".../.virtualenv/arrow/lib/python3.6/site-packages/pandas/core/internals.py", line 120, in __init__
>     len(self.mgr_locs)))
> ValueError: Wrong number of items passed 3, placement implies 1

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
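
The last frame of the traceback above rejects a block whose number of items does not match the length of its placement. A minimal pure-Python sketch of that kind of length check follows. It is an illustration only: `check_block` and its arguments are hypothetical stand-ins, not pandas or pyarrow API, and the real constructor does considerably more.

```python
def check_block(values, placement):
    """Hypothetical stand-in for a block constructor's consistency check.

    values: sequence of items handed to the block
    placement: sequence of positions the block is declared to occupy
    """
    if len(values) != len(placement):
        # Same wording as the ValueError quoted in the issue.
        raise ValueError(
            "Wrong number of items passed %d, placement implies %d"
            % (len(values), len(placement))
        )


# Three items passed against a placement of length one, as in the report.
try:
    check_block([1, 2, 3], placement=[0])
except ValueError as exc:
    message = str(exc)
    print(message)
```

Under this reading, the error appears when three values arrive with a placement that names a single position. The sketch does not establish where in pyarrow the mismatch originates.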
[jira] [Comment Edited] (ARROW-2273) Cannot deserialize pandas SparseDataFrame
[ https://issues.apache.org/jira/browse/ARROW-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463622#comment-16463622 ]

Licht Takeuchi edited comment on ARROW-2273 at 5/4/18 9:38 AM:
---

[~mitar], Yes, it is still there. But, {{SparseDataFrame}} is naive implementation and has many bugs. I've spent a lot of time to fix these, but it is hard to fix all. IMO, it is not the right time to support this in pyarrow.

was (Author: licht-t):
[~mitar], Yes, it is still there. But, {{SparseDataFrame}} is naive implementation and has many bugs. I've spent a lot of time to fix these, but it is hard to fix all. IMO, it is not the right time yet to support this in pyarrow.
[jira] [Comment Edited] (ARROW-2273) Cannot deserialize pandas SparseDataFrame
[ https://issues.apache.org/jira/browse/ARROW-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463622#comment-16463622 ]

Licht Takeuchi edited comment on ARROW-2273 at 5/4/18 9:38 AM:
---

[~mitar], Yes, it is still there. But, {{SparseDataFrame}} is naive implementation and has many bugs. I've spent a lot of time to fix these, but it is hard to fix all. IMO, this is not the right time to support {{SparseDataFrame}} in pyarrow.

was (Author: licht-t):
[~mitar], Yes, it is still there. But, {{SparseDataFrame}} is naive implementation and has many bugs. I've spent a lot of time to fix these, but it is hard to fix all. IMO, it is not the right time to support this in pyarrow.
[jira] [Comment Edited] (ARROW-2273) Cannot deserialize pandas SparseDataFrame
[ https://issues.apache.org/jira/browse/ARROW-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463622#comment-16463622 ]

Licht Takeuchi edited comment on ARROW-2273 at 5/4/18 9:37 AM:
---

[~mitar], Yes, it is still there. But, {{SparseDataFrame}} is naive implementation and has many bugs. I've spent a lot of time to fix these, but it is hard to fix all. IMO, it is not the right time yet to support this in pyarrow.

was (Author: licht-t):
[~mitar], Yes, it is still there. But, {{SparseDataFrame}} is naive implementation and has many bugs. I've spent a lot of time to fix these but it is hard to fix all. IMO, it is not the right time yet to support this in pyarrow.
[jira] [Comment Edited] (ARROW-2273) Cannot deserialize pandas SparseDataFrame
[ https://issues.apache.org/jira/browse/ARROW-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463576#comment-16463576 ]

Licht Takeuchi edited comment on ARROW-2273 at 5/4/18 8:54 AM:
---

Yes, I'll do that.

was (Author: licht-t):
Okay, I'll do that.
[jira] [Comment Edited] (ARROW-2273) Cannot deserialize pandas SparseDataFrame
[ https://issues.apache.org/jira/browse/ARROW-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463370#comment-16463370 ]

Licht Takeuchi edited comment on ARROW-2273 at 5/4/18 5:09 AM:
---

SparseDataFrame is planned to be deprecated in pandas.
[https://github.com/pandas-dev/pandas/issues/19239]

was (Author: licht-t):
SparseDataFrame is planned to be deprecated.
[https://github.com/pandas-dev/pandas/issues/19239]

> Cannot deserialize pandas SparseDataFrame
> -
>
> Key: ARROW-2273
> URL: https://issues.apache.org/jira/browse/ARROW-2273
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.9.0
> Reporter: Mitar
> Priority: Major