rok commented on pull request #10176: URL: https://github.com/apache/arrow/pull/10176#issuecomment-854790630
> I did a quick naive check of the performance of this branch, and comparing the simple components, they are consistently 6 to 25x faster compared to the pandas implementation (6x for the year/month/day, 25-30x for hour/minute/second). That aspect seems to be good!
>
> (I don't know how good the pandas implementation is, so it might not be the most relevant comparison (comparing with eg clickhouse might be more interesting), but it at least says something)

Nice! Thanks for measuring that. That's a dramatic improvement! Maybe we could still optimize somewhat, but that would require a bit of a study of `date.h` and benchmarking as changes are tried.

I've fixed the `iso_year` function and added some tests [borrowed from pandas](https://github.com/pandas-dev/pandas/blob/059c8bac51e47d6eaaa3e36d6a293a22312925e6/pandas/tests/tslibs/test_ccalendar.py#L39).

I think the last thing left is matching to the proper time unit.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
