[ 
https://issues.apache.org/jira/browse/ARROW-15748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497471#comment-17497471
 ] 

Joris Van den Bossche commented on ARROW-15748:
-----------------------------------------------

The link you provide for the actual behaviour points to the C++ docs, and while 
that indeed uses "day", the bindings in Python _do_ use "second": 
https://github.com/apache/arrow/blob/094c5ba186cddd69d4aa83de5ed2b62d4ed07081/python/pyarrow/_compute.pyx#L892

Now, the confusing part is that this class is not instantiated (I assume) if no 
options are used at all, and in that case it uses the defaults from C++.  You 
can see this in the following example:

{code:python}
>>> import pandas as pd
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> arr = pa.array([pd.Timestamp("2012-01-01 09:01:02.123456")])
>>> pc.round_temporal(arr)    # <--- indeed uses "day" by default
<pyarrow.lib.TimestampArray object at 0x7f5d7b56a040>
[
  2012-01-01 00:00:00.000000
]
>>> pc.round_temporal(arr, unit="second")    # <--- manually specifying "second" still works
<pyarrow.lib.TimestampArray object at 0x7f5d7a67fd00>
[
  2012-01-01 09:01:02.000000
]
>>> pc.round_temporal(arr, multiple=5)    # <--- but when specifying a different option, it now actually defaults to "second" ...
<pyarrow.lib.TimestampArray object at 0x7f5d7b548b80>
[
  2012-01-01 09:01:00.000000
]
{code}
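
The mechanism I am assuming here can be modelled in plain Python. This is only a rough sketch and not the actual pyarrow code: `RoundOptions`, `backend_round`, `BACKEND_DEFAULT_UNIT` and this `round_temporal` are hypothetical stand-ins for the Python options class, the C++ kernel and the binding wrapper. It assumes the wrapper only builds an options object when at least one keyword is passed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-in for the default unit baked into the C++ kernel.
BACKEND_DEFAULT_UNIT = "day"


@dataclass
class RoundOptions:
    """Hypothetical binding-side options class with its own defaults."""
    multiple: int = 1
    unit: str = "second"  # differs from the backend default above


def backend_round(options: Optional[RoundOptions]) -> Tuple[int, str]:
    """Model of the backend: without an options object, use its own defaults."""
    if options is None:
        return (1, BACKEND_DEFAULT_UNIT)
    return (options.multiple, options.unit)


def round_temporal(**kwargs) -> Tuple[int, str]:
    """Model of the wrapper: build options only if some keyword was passed."""
    options = RoundOptions(**kwargs) if kwargs else None
    return backend_round(options)


# Each call returns the effective (multiple, unit) pair.
print(round_temporal())                # (1, 'day')
print(round_temporal(unit="second"))   # (1, 'second')
print(round_temporal(multiple=5))      # (5, 'second')
```

The three calls mirror the three cases in the session above: no options falls through to the backend default, while passing any keyword instantiates the options class and picks up its "second" default.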

Now, long story short, the simple conclusion is of course still that we should 
align the defaults in C++ and Python.

> [Python] Round temporal options default unit is `day` but documented as 
> `second`.
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-15748
>                 URL: https://issues.apache.org/jira/browse/ARROW-15748
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 8.0.0
>            Reporter: A. Coady
>            Priority: Minor
>
> The [python documentation for round temporal options 
> |https://arrow.apache.org/docs/dev/python/generated/pyarrow.compute.RoundTemporalOptions.html]
>  says the default unit is `second`, but the [actual 
> behavior|https://arrow.apache.org/docs/dev/cpp/api/compute.html#classarrow_1_1compute_1_1_round_temporal_options]
>  is a default of `day`.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
