[
https://issues.apache.org/jira/browse/ARROW-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608447#comment-16608447
]
Krisztian Szucs commented on ARROW-1989:
----------------------------------------
{code:python}
In [45]: pa.array([datetime.date(2018, 12, 12)], type=pa.timestamp('s'))
---------------------------------------------------------------------------
ArrowTypeError Traceback (most recent call last)
<ipython-input-45-f6eb2418d6b7> in <module>()
----> 1 pa.array([datetime.date(2018, 12, 12)], type=pa.timestamp('s'))
~/Workspace/arrow/python/pyarrow/array.pxi in pyarrow.lib.array()
169 else:
170 # ConvertPySequence does strict conversion if type is explicitly passed
--> 171 return _sequence_to_array(obj, mask, size, type, pool, from_pandas)
172
173
~/Workspace/arrow/python/pyarrow/array.pxi in pyarrow.lib._sequence_to_array()
33 cdef shared_ptr[CChunkedArray] out
34 with nogil:
---> 35 check_status(ConvertPySequence(sequence, mask, options, &out))
36
37 if out.get().num_chunks() == 1:
~/Workspace/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
89 raise ArrowNotImplementedError(message)
90 elif status.IsTypeError():
---> 91 raise ArrowTypeError(message)
92 elif status.IsCapacityError():
93 raise ArrowCapacityError(message)
ArrowTypeError: an integer is required (got type datetime.date)
{code}
However, with a datetime.datetime value it works:
{code:python}
In [46]: pa.array([datetime.datetime(2018, 12, 12)], type=pa.timestamp('s'))
Out[46]:
<pyarrow.lib.TimestampArray object at 0x11d243638>
[
1544572800
]
{code}
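Given that difference, one possible workaround on the user side is to promote dates to datetimes before building the array. This is a minimal sketch using only the standard library; the helper name `to_datetime` is hypothetical, and mapping a date to midnight is an assumption about the intended semantics.

```python
import datetime


def to_datetime(value):
    """Promote a datetime.date to a datetime.datetime at midnight.

    datetime.datetime is a subclass of datetime.date, so datetimes are
    checked first and returned unchanged.
    """
    if isinstance(value, datetime.datetime):
        return value
    if isinstance(value, datetime.date):
        # Assumption: a plain date stands for 00:00:00 on that day.
        return datetime.datetime.combine(value, datetime.time.min)
    raise TypeError(f"expected date or datetime, got {type(value).__name__}")


values = [datetime.date(2018, 12, 12)]
converted = [to_datetime(v) for v in values]
```

Passing `converted` to `pa.array(..., type=pa.timestamp('s'))` should then take the same path as the datetime example above.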
I think we should have a general solution for extending the low-level errors with
extra, Python-related context.
The current error handling in Cython seems really lightweight:
https://github.com/apache/arrow/blob/master/python/pyarrow/error.pxi#L71
Would it be OK to extend it with error-rewriting logic?
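To make the idea concrete, here is a rough pure-Python sketch of what such rewriting could look like. It is not the existing `check_status` code: `ArrowTypeError` is redefined locally as a stand-in so the snippet runs without pyarrow, and `REWRITE_RULES`, `rewrite_message`, and `python_context` are hypothetical names.

```python
import re
from contextlib import contextmanager


class ArrowTypeError(TypeError):
    """Local stand-in for pyarrow's ArrowTypeError, so the sketch runs alone."""


# Hypothetical rule table: (regex on the low-level message, Python-level hint).
REWRITE_RULES = [
    (
        re.compile(r"an integer is required \(got type datetime\.date\)"),
        "datetime.date values could not be converted to a timestamp type; "
        "try passing datetime.datetime objects instead.",
    ),
]


def rewrite_message(message):
    """Append the first matching hint; return the message unchanged otherwise."""
    for pattern, hint in REWRITE_RULES:
        if pattern.search(message):
            return f"{message}. {hint}"
    return message


@contextmanager
def python_context():
    """Re-raise ArrowTypeError with the rewritten message, keeping the cause."""
    try:
        yield
    except ArrowTypeError as exc:
        raise type(exc)(rewrite_message(str(exc))) from exc
```

Wrapping a conversion call in `with python_context():` would keep the original low-level message and append the hint, so nothing is lost for debugging.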
> [Python] Better UX on timestamp conversion to Pandas
> ----------------------------------------------------
>
> Key: ARROW-1989
> URL: https://issues.apache.org/jira/browse/ARROW-1989
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Uwe L. Korn
> Priority: Major
> Fix For: 0.11.0
>
>
> When converting timestamp columns to Pandas, users often run into dates that
> fall outside the range Pandas can represent with its nanosecond
> resolution. Currently they simply see an Arrow exception and think that the
> problem is caused by Arrow. We should try to change the error
> from
> {code}
> ArrowInvalid: Casting from timestamp[ns] to timestamp[us] would lose data: XX
> {code}
> to something along the lines of
> {code}
> ArrowInvalid: Casting from timestamp[ns] to timestamp[us] would lose data:
> XX. This conversion is needed because Pandas only supports nanosecond
> timestamps. Your data is likely out of the range that can be represented with
> nanosecond resolution.
> {code}
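For reference, the range limit mentioned in the description can be worked out with plain integer arithmetic. The sketch below assumes nanoseconds since 1970-01-01 stored in a signed 64-bit integer, with the most negative value reserved (as Pandas does for NaT); the function name `fits_in_int64_nanoseconds` is hypothetical.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1


def fits_in_int64_nanoseconds(dt):
    """Return True if a naive datetime, counted in ns since the epoch, fits."""
    delta = dt - EPOCH
    # timedelta keeps days, seconds, and microseconds as exact integers,
    # so no floating-point rounding is involved here.
    micros = (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds
    # Assumption: the bound is symmetric because int64 min is reserved.
    return abs(micros * 1000) <= INT64_MAX
```

Under these assumptions the representable window is roughly 292 years on either side of the epoch, so dates such as year 9999 placeholders fall well outside it.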