rok commented on issue #46301: URL: https://github.com/apache/arrow/issues/46301#issuecomment-2853891271
Thinking about this some more: `1970-01-01` is a very arbitrary origin point for most applications. There is nothing meaningful about it except that it's an internal implementation detail. (The same could be said of year 0, but year 0 should generally feel more intuitive.) But the pyarrow docs do indeed state "[By default, the origin is 1970-01-01T00:00:00](https://arrow.apache.org/docs/python/generated/pyarrow.compute.floor_temporal.html)" for "not `calendar_based_origin`", so it would be best to change this.

As an aside, it would be very valuable to have cross-library tests for ceil/round/floor behavior that compare results on the "full" time continuum (from, say, 2000 BC to 3000 AD, uniformly sampled, with a known seed, at a resolution finer than the unit being rounded to). As this is currently hard (e.g. pandas not flooring to years, some libraries not rounding in local time), the next best thing would be to have a comprehensive test dataset. @AlenkaF I'd be happy to help if you want to do this.

> Note that the week unit behaves similarly (there may be other units as well; a parameterized test could help reveal this):
>
> ```python
> >>> s = pa.scalar(datetime(1970, 1, 1), type=pa.timestamp('ns'))
> >>> print(pc.floor_temporal(s, 1, 'week'))
> 1969-12-29 00:00:00
> ```

Well, 1969-12-29 was a Monday according to Google, so this is correct (if painful to think about).

@MarcoGorelli do you intend to support rounding to multiple time units in local time in narwhals?
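
To make the week-floor check and the seeded-sampling idea from this comment concrete, here is a minimal stdlib-only sketch. It is not Arrow's implementation: `floor_to_week` and `sample_timestamps` are hypothetical helper names, the Monday week start is assumed from the quoted output, and the sampled range is limited to 1600-3000 AD because Python's `datetime` cannot represent BC dates.

```python
import random
from datetime import datetime, timedelta


def floor_to_week(ts: datetime) -> datetime:
    """Floor a naive timestamp to the preceding Monday at 00:00.

    Assumption: weeks start on Monday, as in the quoted pyarrow output.
    """
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    # datetime.weekday() returns 0 for Monday, so this steps back to Monday.
    return midnight - timedelta(days=midnight.weekday())


def sample_timestamps(n, seed=46301,
                      start=datetime(1600, 1, 1),
                      end=datetime(3000, 1, 1)):
    """Uniformly sample n timestamps at second resolution with a fixed seed.

    Assumption: the seed and the date range are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    span_seconds = int((end - start).total_seconds())
    return [start + timedelta(seconds=rng.randrange(span_seconds))
            for _ in range(n)]


# The case from the quoted example: 1970-01-01 was a Thursday.
assert floor_to_week(datetime(1970, 1, 1)) == datetime(1969, 12, 29)

# Reference pairs (input, expected floor) that another library could be
# checked against.
reference = [(ts, floor_to_week(ts)) for ts in sample_timestamps(1000)]
for ts, floored in reference:
    assert floored.weekday() == 0                      # always a Monday
    assert timedelta(0) <= ts - floored < timedelta(days=7)
```

Each `(input, expected)` pair in `reference` could then be compared against the result of `pc.floor_temporal(..., 1, 'week')` or the equivalent call in another library; a fixed seed keeps the dataset reproducible across runs.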
