Jefffrey commented on code in PR #9961:
URL: https://github.com/apache/arrow-rs/pull/9961#discussion_r3278074517
##########
arrow-cast/src/parse.rs:
##########
@@ -1797,6 +1831,32 @@ mod tests {
}
}
+ #[test]
+ fn parse_date32_extended_year() {
+ // `Date32` covers any i32 days-from-epoch, verify we can parse it
+ let cases: &[(&str, i32)] = &[
+ ("+1970-01-01", 0),
+ ("+2024-01-01", 19_723),
+ ("-0001-01-01", -719_893),
+ ("+29349-01-26", 10_000_000),
+ ("+2739877-01-03", 1_000_000_000),
+ // Extremes of the Date32 representable range.
+ ("+5881580-07-11", i32::MAX),
+ ("-5877641-06-23", i32::MIN),
+ ];
Review Comment:
Checked against DuckDB and looks good:
```sql
memory D select date '1970-01-01' + 19723;
┌────────────────────────────────────────┐
│ (CAST('1970-01-01' AS "DATE") + 19723) │
│ date │
├────────────────────────────────────────┤
│ 2024-01-01 │
└────────────────────────────────────────┘
memory D select date '1970-01-01' - 719893;
┌─────────────────────────────────────────┐
│ (CAST('1970-01-01' AS "DATE") - 719893) │
│ date │
├─────────────────────────────────────────┤
│ 0002-01-01 (BC) │
└─────────────────────────────────────────┘
memory D select date '1970-01-01' + 10000000;
┌───────────────────────────────────────────┐
│ (CAST('1970-01-01' AS "DATE") + 10000000) │
│ date │
├───────────────────────────────────────────┤
│ 29349-01-26 │
└───────────────────────────────────────────┘
memory D select date '1970-01-01' + 1000000000;
┌─────────────────────────────────────────────┐
│ (CAST('1970-01-01' AS "DATE") + 1000000000) │
│ date │
├─────────────────────────────────────────────┤
│ 2739877-01-03 │
└─────────────────────────────────────────────┘
memory D select date '1970-01-01' + 2147483646;
┌─────────────────────────────────────────────┐
│ (CAST('1970-01-01' AS "DATE") + 2147483646) │
│ date │
├─────────────────────────────────────────────┤
│ 5881580-07-10 │
└─────────────────────────────────────────────┘
memory D select date '1970-01-01' - 2147483646;
┌─────────────────────────────────────────────┐
│ (CAST('1970-01-01' AS "DATE") - 2147483646) │
│ date │
├─────────────────────────────────────────────┤
│ 5877642-06-25 (BC) │
└─────────────────────────────────────────────┘
```
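- For the last two, DuckDB rejected `i32::MIN` and `i32::MAX` as out of
bounds, so I used `i32::MIN + 2` and `i32::MAX - 1` instead. The results are
consistent with the test's extremes: `5881580-07-10` is one day before
`+5881580-07-11`, and `5877642-06-25 (BC)` is two days after `-5877641-06-23`.
DuckDB's BC years are offset by one from ISO 8601 astronomical years, which is
also why `-0001-01-01` shows up as `0002-01-01 (BC)`.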
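The expected values can also be cross-checked without DuckDB, including the two
extremes it rejects. Below is a minimal sketch using the well-known
days-from-civil algorithm for the proleptic Gregorian calendar. The function
name `days_from_civil` is illustrative and is not part of this PR or of
arrow-rs.

```rust
/// Days between 1970-01-01 and the given proleptic Gregorian date.
/// `year` uses astronomical numbering (year 0 is 1 BC), as ISO 8601 does.
/// Illustrative helper only; not the implementation used in the PR.
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Shift the year so that it starts in March; the leap day then
    // falls at the very end of the shifted year.
    let y = if month <= 2 { year - 1 } else { year };
    // A 400-year era always contains exactly 146_097 days.
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400); // 0..=399
    // Month index counted from March (March = 0, ..., February = 11).
    let shifted_month = if month > 2 { month - 3 } else { month + 9 };
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era =
        year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    // 719_468 is the number of days from 0000-03-01 to 1970-01-01.
    era * 146_097 + day_of_era - 719_468
}
```

With this helper, `days_from_civil(5881580, 7, 11)` can be compared against
`i32::MAX` and `days_from_civil(-5877641, 6, 23)` against `i32::MIN`, alongside
the neighbouring dates DuckDB printed for `i32::MAX - 1` and `i32::MIN + 2`.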
##########
arrow-cast/src/parse.rs:
##########
@@ -585,6 +585,34 @@ const EPOCH_DAYS_FROM_CE: i32 = 719_163;
/// Error message if nanosecond conversion request beyond supported interval
const ERR_NANOSECONDS_NOT_SUPPORTED: &str = "The dates that can be represented
as nanoseconds have to be between 1677-09-21T00:12:44.0 and
2262-04-11T23:47:16.854775804";
+/// Parse the ISO 8601 signed extended-year form (`±YYYY[Y...]-MM-DD`) into
+/// raw `(year, month, day)` components, without validating the calendar date.
+///
+/// The leading sign is required and the year must have at least 4 digits.
+/// Returns `None` if the prefix isn't `+`/`-`, the shape is malformed, or any
+/// component fails to parse numerically.
+fn parse_extended_ymd(string: &str) -> Option<(i32, u32, u32)> {
+ if !(string.starts_with('+') || string.starts_with('-')) {
Review Comment:
Should this check be a `debug_assert!` instead? Both call sites already verify
the sign before calling this function. Or is it cheap enough not to be a
concern?
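To make the trade-off concrete, here is a rough sketch of the `debug_assert!`
variant. The function name `parse_extended_ymd_sketch` and its body are
hypothetical, written only from the doc comment above; this is not the PR's
actual implementation. The point it illustrates is that the sign check
disappears in release builds, so the rest of the function still has to return
`None` (rather than panic) on unexpected input.

```rust
/// Hypothetical sketch of a `±YYYY[Y...]-MM-DD` parser that relies on the
/// caller having checked the sign. Not the code from the PR.
fn parse_extended_ymd_sketch(string: &str) -> Option<(i32, u32, u32)> {
    // Checked in debug builds only; compiled out in release builds.
    debug_assert!(
        string.starts_with('+') || string.starts_with('-'),
        "caller must verify the leading sign"
    );
    let negative = string.starts_with('-');
    // `get` returns None for an empty string or a non-ASCII first
    // character, so this never panics even without the check above.
    let rest = string.get(1..)?;

    let mut parts = rest.splitn(3, '-');
    let (year, month, day) = (parts.next()?, parts.next()?, parts.next()?);

    // Only plain ASCII digits are allowed in each component.
    let all_digits = |s: &str| !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
    if year.len() < 4 || month.len() != 2 || day.len() != 2 {
        return None;
    }
    if !(all_digits(year) && all_digits(month) && all_digits(day)) {
        return None;
    }

    let year: i32 = year.parse().ok()?;
    let month: u32 = month.parse().ok()?;
    let day: u32 = day.parse().ok()?;
    Some((if negative { -year } else { year }, month, day))
}
```

A plain runtime check costs a single byte comparison per call, so keeping it
as written is likely cheap enough; the `debug_assert!` form mainly documents
the precondition.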