[ 
https://issues.apache.org/jira/browse/ARROW-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250439#comment-17250439
 ] 

slatebit commented on ARROW-10935:
----------------------------------

Hi Joris,

Thanks for the reply. Apologies on my end for not finding that duplicate 
report. Regarding it blocking my use case, I'm using the Python library Vaex. 
Under the hood, it relies on PyArrow as one of its supported backends. One of 
its key features is lazy evaluation: columns/features are saved as 
expressions and only evaluated when necessary, such as when printing out the 
dataframe.

Currently, Vaex is built in such a way that, if a column is a PyArrow 
array or chunked array, it does not do any type conversions and assumes that 
PyArrow will handle everything. This assumption led to this bug report. 
However, given your proposed solution, I'll bring this up with the Vaex team 
via a GitHub issue. Appreciate your help!

> [Python] pa.array() doesn't support pa.lib.TimestampScalar objects
> ------------------------------------------------------------------
>
>                 Key: ARROW-10935
>                 URL: https://issues.apache.org/jira/browse/ARROW-10935
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 2.0.0
>         Environment: Windows 10, Python 3.7.4, PyArrow 2.0.0
>            Reporter: slatebit
>            Priority: Blocker
>
> I encountered this edge case bug in PyArrow v2.0.0. For some reason, 
> pa.array() does not know how to support pa.lib.TimestampScalar objects. This 
> bug completely blocks my specific use case, although I do recognize that this 
> edge case seems kind of wonky. Nonetheless, I don't see any reason why 
> PyArrow would not understand one of its own object types.
>  
> Stacktrace:
> {code:java}
> ArrowInvalid: Could not convert 2020-11-04 22:50:16.276892 with type 
> pyarrow.lib.TimestampScalar: did not recognize Python value type when 
> inferring an Arrow data type
> {code}
>  
> Reproducible Code:
> {code:python}
> import pandas as pd
> import pyarrow as pa
> pa.array([pa.scalar(pd.to_datetime('2020-11-04 22:50:16.276892000'))])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
