[
https://issues.apache.org/jira/browse/ARROW-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joris Van den Bossche updated ARROW-13040:
------------------------------------------
Description:
In contrast to {{to_pandas_dtype}}, which coerces datetime-like types to
nanosecond resolution, a {{to_numpy_dtype}} could preserve the resolution
of those types (timestamp, date, duration).
Original issue report below:
----
For most date/time types, {{to_pandas_dtype}} mistakenly assumes nanosecond
resolution, and for the time32/time64 types it is not implemented at all.
Here's the complete run-down:
{{date32/date64/time32/time64}}
{{---------------------------}}
{{>>> pyarrow.date32()}}
{{DataType(date32[day])}}
{{>>> pyarrow.date32().to_pandas_dtype()}}
{{dtype('<M8[ns]')}}
{{>>> pyarrow.date64()}}
{{DataType(date64[ms])}}
{{>>> pyarrow.date64().to_pandas_dtype()}}
{{dtype('<M8[ns]')}}
{{>>> pyarrow.time32("s")}}
{{Time32Type(time32[s])}}
{{>>> pyarrow.time32("s").to_pandas_dtype()}}
{{Traceback (most recent call last):}}
{{ File "<stdin>", line 1, in <module>}}
{{ File "pyarrow/types.pxi", line 200, in pyarrow.lib.DataType.to_pandas_dtype}}
{{NotImplementedError: time32[s]}}
{{>>> pyarrow.time32("ms")}}
{{Time32Type(time32[ms])}}
{{>>> pyarrow.time32("ms").to_pandas_dtype()}}
{{Traceback (most recent call last):}}
{{ File "<stdin>", line 1, in <module>}}
{{ File "pyarrow/types.pxi", line 200, in pyarrow.lib.DataType.to_pandas_dtype}}
{{NotImplementedError: time32[ms]}}
{{>>> pyarrow.time64("us")}}
{{Time64Type(time64[us])}}
{{>>> pyarrow.time64("us").to_pandas_dtype()}}
{{Traceback (most recent call last):}}
{{ File "<stdin>", line 1, in <module>}}
{{ File "pyarrow/types.pxi", line 200, in pyarrow.lib.DataType.to_pandas_dtype}}
{{NotImplementedError: time64[us]}}
{{>>> pyarrow.time64("ns")}}
{{Time64Type(time64[ns])}}
{{>>> pyarrow.time64("ns").to_pandas_dtype()}}
{{Traceback (most recent call last):}}
{{ File "<stdin>", line 1, in <module>}}
{{ File "pyarrow/types.pxi", line 200, in pyarrow.lib.DataType.to_pandas_dtype}}
{{NotImplementedError: time64[ns]}}
{{timestamp}}
{{---------}}
{{>>> pyarrow.timestamp("s")}}
{{TimestampType(timestamp[s])}}
{{>>> pyarrow.timestamp("s").to_pandas_dtype()}}
{{dtype('<M8[ns]')}}
{{>>> pyarrow.timestamp("ms")}}
{{TimestampType(timestamp[ms])}}
{{>>> pyarrow.timestamp("ms").to_pandas_dtype()}}
{{dtype('<M8[ns]')}}
{{>>> pyarrow.timestamp("us")}}
{{TimestampType(timestamp[us])}}
{{>>> pyarrow.timestamp("us").to_pandas_dtype()}}
{{dtype('<M8[ns]')}}
{{>>> pyarrow.timestamp("ns")}}
{{TimestampType(timestamp[ns])}}
{{>>> pyarrow.timestamp("ns").to_pandas_dtype()}}
{{dtype('<M8[ns]')}}
{{duration}}
{{--------}}
{{>>> pyarrow.duration("s")}}
{{DurationType(duration[s])}}
{{>>> pyarrow.duration("s").to_pandas_dtype()}}
{{dtype('<m8[ns]')}}
{{>>> pyarrow.duration("ms")}}
{{DurationType(duration[ms])}}
{{>>> pyarrow.duration("ms").to_pandas_dtype()}}
{{dtype('<m8[ns]')}}
{{>>> pyarrow.duration("us")}}
{{DurationType(duration[us])}}
{{>>> pyarrow.duration("us").to_pandas_dtype()}}
{{dtype('<m8[ns]')}}
{{>>> pyarrow.duration("ns")}}
{{DurationType(duration[ns])}}
{{>>> pyarrow.duration("ns").to_pandas_dtype()}}
{{dtype('<m8[ns]')}}
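A resolution-preserving mapping of the kind proposed above might look roughly like the sketch below. It is only an illustration: {{to_numpy_dtype_str}} is a hypothetical helper, not a pyarrow API, and it works on the string form of an Arrow type (for example {{str(pyarrow.timestamp("ms"))}}, which is {{timestamp[ms]}}) so that it runs without pyarrow or NumPy installed. The choice to keep raising {{NotImplementedError}} for time32/time64 and for timezone-aware timestamps is an assumption, made because NumPy has no time-of-day dtype and {{datetime64}} carries no timezone.

```python
import re

# Units shared by Arrow timestamp/duration types and NumPy datetime64/timedelta64.
_UNITS = {"s", "ms", "us", "ns"}


def to_numpy_dtype_str(arrow_type: str) -> str:
    """Hypothetical helper: map an Arrow type string to a NumPy dtype string.

    The input is the string form of an Arrow type, e.g. "timestamp[ms]".
    The result can be passed to numpy.dtype(...) if NumPy is available.
    """
    match = re.fullmatch(r"(\w+)\[(\w+)\]", arrow_type)
    if match is None:
        # Includes timezone-aware timestamps such as "timestamp[s, tz=UTC]".
        raise NotImplementedError(arrow_type)
    kind, unit = match.groups()
    if kind == "timestamp" and unit in _UNITS:
        return f"datetime64[{unit}]"   # keep the unit instead of forcing ns
    if kind == "duration" and unit in _UNITS:
        return f"timedelta64[{unit}]"
    if kind == "date32" and unit == "day":
        return "datetime64[D]"         # days since the epoch
    if kind == "date64" and unit == "ms":
        return "datetime64[ms]"        # milliseconds since the epoch
    # time32/time64: assumed unsupported, NumPy has no time-of-day dtype.
    raise NotImplementedError(arrow_type)


print(to_numpy_dtype_str("timestamp[s]"))   # datetime64[s]
print(to_numpy_dtype_str("date32[day]"))    # datetime64[D]
print(to_numpy_dtype_str("duration[us]"))   # timedelta64[us]
```

With this mapping, {{timestamp[s]}} stays at second resolution rather than becoming {{<M8[ns]}} as in the run-down above.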
> [Python] Add DataType.to_numpy_dtype (equivalent of to_pandas_dtype, but for
> numpy)
> -----------------------------------------------------------------------------------
>
> Key: ARROW-13040
> URL: https://issues.apache.org/jira/browse/ARROW-13040
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 4.0.0
> Reporter: Jim Pivarski
> Priority: Major