jayzhan211 opened a new issue, #6746:
URL: https://github.com/apache/arrow-rs/issues/6746
**Is your feature request related to a problem or challenge? Please describe
what you are trying to do.**
**Describe the solution you'd like**
In arrow-rs, to get the minute of a timestamp, `timestamp_s_to_datetime` is
called to build a `NaiveDateTime`, and then `minute()` is called on it to get
the final answer.
For example, a timestamp such as `1599563412` (seconds since the Unix epoch) is
converted to `2020-09-08T11:10:12` in chrono's `NaiveDateTime` format. Then
`minute()` is called and returns `10`.
However, looking at the function in chrono, even though we only need
`NaiveTime::from_num_seconds_from_midnight_opt` to compute the `minute`,
`NaiveDate::from_num_days_from_ce_opt` is still called, which ideally we would
skip.
```rust
#[inline]
#[must_use]
pub const fn from_timestamp(secs: i64, nsecs: u32) -> Option<Self> {
    let days = secs.div_euclid(86_400) + UNIX_EPOCH_DAY;
    let secs = secs.rem_euclid(86_400);
    if days < i32::MIN as i64 || days > i32::MAX as i64 {
        return None;
    }
    let date = try_opt!(NaiveDate::from_num_days_from_ce_opt(days as i32));
    let time = try_opt!(NaiveTime::from_num_seconds_from_midnight_opt(secs as u32, nsecs));
    Some(date.and_time(time).and_utc())
}
```
I propose we contribute a change upstream to the chrono crate that provides a
way to get the `date` and the `time` separately given `secs: i64, nsecs: u32`.
Then we can call the corresponding smaller function based on what we need, to
minimize the cost.
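To illustrate the idea, here is a minimal sketch that uses only the standard library. It assumes a timestamp in seconds that is interpreted as UTC with no time zone offset. The helper names `secs_of_day` and `minute_of_timestamp_s` are hypothetical and exist in neither chrono nor arrow-rs. It reuses the `rem_euclid(86_400)` step from `from_timestamp` and skips the date computation entirely.

```rust
/// Number of seconds in one day.
const SECS_PER_DAY: i64 = 86_400;

/// Hypothetical helper: seconds elapsed since midnight for a Unix timestamp
/// given in seconds. `rem_euclid` keeps the result in `0..86_400`, including
/// for timestamps before 1970.
fn secs_of_day(secs: i64) -> u32 {
    secs.rem_euclid(SECS_PER_DAY) as u32
}

/// Hypothetical helper: minute of the hour, computed without building a date.
fn minute_of_timestamp_s(secs: i64) -> u32 {
    secs_of_day(secs) / 60 % 60
}

fn main() {
    // The timestamp from the example above.
    println!("minute = {}", minute_of_timestamp_s(1_599_563_412)); // prints "minute = 10"
}
```

A real implementation would presumably still go through `NaiveTime::from_num_seconds_from_midnight_opt` so that `hour`, `second`, and `nanosecond` share the same path. The point of the sketch is only that none of them depend on `days`.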
**Describe alternatives you've considered**
**Additional context**
This is the flamegraph for ClickBench Q18 in DataFusion. It shows that
`from_num_days_from_ce_opt` accounts for a large share of the time spent in the
`date_part` function. But the query only cares about the `minute`, so the work
done in `from_num_days_from_ce_opt` is wasted.
<img width="1687" alt="Screenshot 2024-11-18 at 5 40 45 PM"
src="https://github.com/user-attachments/assets/aa3f07d2-0bd8-4e34-b746-cc93e8e2107b">