[
https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joris Van den Bossche updated ARROW-6222:
-----------------------------------------
Component/s: Python
> [Python] Serialising numpy array yields
> `pyarrow.lib.ArrowNotImplementedError: list<item: float>`
> -------------------------------------------------------------------------------------------------
>
> Key: ARROW-6222
> URL: https://issues.apache.org/jira/browse/ARROW-6222
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.14.1
> Reporter: Marcel Ackermann
> Priority: Major
>
> I want to serialize PyTorch tensors, but since they are not implemented in Arrow
> yet, I convert them to a numpy array with {{t.numpy()}}
> ([https://pytorch.org/docs/stable/tensors.html?highlight=numpy#torch.Tensor.numpy]),
> which returns an {{ndarray}}. My tensors are 1-dimensional, so the result is a
> 1-dimensional ndarray.
> Calling {{df.to_feather("fname.feather")}} yields
> {{pyarrow.lib.ArrowNotImplementedError: list<item: float>}}.
> Next I tried {{pyarrow.array(t.numpy())}}, which results in
> {{pyarrow.lib.ArrowInvalid: ('Could not convert [\n 0.00500498,\n
> -0.00732583,\n... with type pyarrow.lib.FloatArray: did not recognize Python
> value type when inferring an Arrow data type', 'Conversion failed for column
> 0 with type object')}}.
> I would appreciate it if this worked out of the box.
> Upon request, a full example:
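> {code:python}
> import torch
> import pyarrow
> import pandas as pd
> # Each call below raises, so run them one at a time.
> # 1) Tensor stored directly in a cell
> pd.DataFrame([[torch.ones(2)]], columns=["0"]).to_feather("fname.feather")
> # 2) Tensor converted to a numpy ndarray
> pd.DataFrame([[torch.ones(2).numpy()]],
>              columns=["0"]).to_feather("fname.feather")
> # 3) ndarray wrapped in a pyarrow array
> pd.DataFrame([[pyarrow.array(torch.ones(2).numpy())]],
>              columns=["0"]).to_feather("fname.feather")
> {code}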
> {code:python}
> ArrowInvalid: ('Could not convert tensor([1., 1.]) with type Tensor: did not
> recognize Python value type when inferring an Arrow data type', 'Conversion
> failed for column 0 with type object')
> ArrowNotImplementedError: list<item: float>
> ArrowInvalid: ('Could not convert [\n 1,\n 1\n] with type
> pyarrow.lib.FloatArray: did not recognize Python value type when inferring an
> Arrow data type', 'Conversion failed for column 0 with type object')
> {code}