rok commented on issue #46301:
URL: https://github.com/apache/arrow/issues/46301#issuecomment-2853891271

   Thinking about this some more: `1970-01-01` is a very arbitrary origin 
point for most applications. There is nothing meaningful about it except that 
it's an internal implementation detail. (The same could be said of year 0, 
but year 0 should generally feel more intuitive.) But the pyarrow docs do 
indeed state "[By default, the origin is 
1970-01-01T00:00:00](https://arrow.apache.org/docs/python/generated/pyarrow.compute.floor_temporal.html)"
 for "not `calendar_based_origin`", so it would be best to change this.
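
   To make the origin question concrete, here is a minimal pure-Python sketch 
of flooring a year to a multiple of N years relative to a chosen origin. It is 
only an illustration of the arithmetic, not Arrow's implementation, and 
`floor_year` is a hypothetical helper.

```python
def floor_year(year: int, multiple: int, origin: int) -> int:
    """Floor `year` to the nearest lower multiple of `multiple`, counted from `origin`.

    Hypothetical helper for illustration only; not part of pyarrow.
    """
    # Python's // is floor division, so years before the origin also round down.
    return origin + ((year - origin) // multiple) * multiple


# With a multiple that divides 1970, both origins agree.
print(floor_year(2024, 5, origin=1970))  # 2020
print(floor_year(2024, 5, origin=0))     # 2020

# With a multiple that does not divide 1970, the buckets shift.
print(floor_year(2024, 7, origin=1970))  # 2019
print(floor_year(2024, 7, origin=0))     # 2023
```

   The two origins only disagree when the multiple does not divide 1970, which 
is probably why this is easy to miss.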
   
   As an aside, it would be very valuable to have inter-library tests for 
ceil/round/floor behavior that compare results on the "full" time 
continuum (say from 2000 BC to 3000 AD, uniformly sampled with a known seed, 
at a resolution finer than the unit being rounded to). As this is currently 
hard (e.g. pandas not flooring to years, some libraries not rounding in 
local time), the next best thing would be a comprehensive test dataset.
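
   A rough sketch of how such a seeded sample could be generated with only the 
standard library. The names `days_from_civil` and `sample_timestamps` are 
hypothetical, the range uses astronomical year numbering (so 2000 BC is year 
-1999), and values are seconds since 1970-01-01 in the proleptic Gregorian 
calendar. The day-count formula follows the well-known civil-date algorithm; 
treat the whole thing as a starting point rather than a finished test harness.

```python
import random

SECONDS_PER_DAY = 86_400


def days_from_civil(year: int, month: int, day: int) -> int:
    """Days since 1970-01-01 in the proleptic Gregorian calendar."""
    y = year - (month <= 2)          # treat Jan/Feb as part of the previous year
    era = y // 400                   # floor division handles negative years
    yoe = y - era * 400              # year of era, in [0, 399]
    mp = (month + 9) % 12            # March = 0, ..., February = 11
    doy = (153 * mp + 2) // 5 + day - 1
    doe = yoe * 365 + yoe // 4 - yoe // 100 + doy
    return era * 146_097 + doe - 719_468


def sample_timestamps(n: int, seed: int, start_year: int = -1999, end_year: int = 3000):
    """Return n uniformly sampled epoch seconds in [start_year, end_year)."""
    rng = random.Random(seed)        # fixed seed makes the dataset reproducible
    lo = days_from_civil(start_year, 1, 1) * SECONDS_PER_DAY
    hi = days_from_civil(end_year, 1, 1) * SECONDS_PER_DAY
    return [rng.randrange(lo, hi) for _ in range(n)]


timestamps = sample_timestamps(1_000, seed=42)
# Reference result for flooring to days: plain integer arithmetic on epoch seconds.
floored_to_day = [(t // SECONDS_PER_DAY) * SECONDS_PER_DAY for t in timestamps]
```

   The integer-arithmetic floor gives a library-independent reference for 
fixed-length units; calendar units (month, year) and local-time rounding would 
need their own reference logic.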
   
   @AlenkaF I'd be happy to help if you want to do this.
   
   > Note that the week unit behaves similarly (there may be other units as 
well; a parameterized test could help reveal this):
   >
   > ```python
   > >>> s = pa.scalar(datetime(1970, 1, 1), type=pa.timestamp('ns'))
   > >>> print(pc.floor_temporal(s, 1, 'week'))
   > 1969-12-29 00:00:00
   > ```
   
   Well, 1969-12-29 was a Monday according to Google, so this is correct (if 
painful to think about).
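
   The weekday can also be checked with the standard library. This is a small 
sketch assuming Monday-based weeks; `floor_to_monday` is a hypothetical helper, 
not a pyarrow function.

```python
from datetime import date, timedelta


def floor_to_monday(d: date) -> date:
    """Return the Monday on or before `d` (date.weekday() is 0 for Monday)."""
    return d - timedelta(days=d.weekday())


epoch = date(1970, 1, 1)
print(epoch.weekday())         # 3, i.e. Thursday
print(floor_to_monday(epoch))  # 1969-12-29
```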
   
   @MarcoGorelli do you intend to support rounding to multiple time units in 
local time in narwhals?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

