martin-g commented on code in PR #21212:
URL: https://github.com/apache/datafusion/pull/21212#discussion_r3006765902
##########
datafusion/spark/src/function/conversion/cast.rs:
##########
Review Comment:
Update the list here.
##########
datafusion/spark/src/function/conversion/cast.rs:
##########
Review Comment:
This is no longer valid.
If the input is NaN/Infinity, the new implementation returns None in
non-ANSI mode (ansi=false).
##########
datafusion/spark/src/function/conversion/cast.rs:
##########
@@ -34,12 +35,36 @@ use std::sync::Arc;
const MICROS_PER_SECOND: i64 = 1_000_000;
-/// Convert seconds to microseconds with saturating overflow behavior (matches spark spec)
+/// Convert integer seconds to microseconds with saturating overflow behavior
#[inline]
fn secs_to_micros(secs: i64) -> i64 {
secs.saturating_mul(MICROS_PER_SECOND)
}
+/// Convert float seconds to microseconds
+/// Returns None for NaN/Infinity/Overflow in non-ANSI mode, error in ANSI mode
+#[inline]
+fn float_secs_to_micros(val: f64, enable_ansi_mode: bool) -> Result<Option<i64>> {
+ if val.is_nan() || val.is_infinite() {
+ if enable_ansi_mode {
+ return exec_err!(
+ "Cannot cast {} to TIMESTAMP",
+ if val.is_nan() { "NaN" } else { "Infinity" }
Review Comment:
f64::NAN's Display is `NaN`.
f64::INFINITY prints as `inf`.
f64::NEG_INFINITY prints as `-inf`.
Your approach would print `Infinity` even for NEG_INFINITY, so the sign is lost.
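A small sketch to make this concrete. `non_finite_label` is a hypothetical helper name, not something from the PR; it only shows one way to keep the sign when naming the value in the error message. Whether the message should spell out `Infinity` or just use Rust's own `inf` is a separate choice.

```rust
/// Hypothetical helper (not part of the PR): names a non-finite f64 for an
/// error message while preserving the sign of infinities.
/// Returns None for finite inputs.
fn non_finite_label(val: f64) -> Option<&'static str> {
    if val.is_nan() {
        Some("NaN")
    } else if val == f64::INFINITY {
        Some("Infinity")
    } else if val == f64::NEG_INFINITY {
        Some("-Infinity")
    } else {
        None
    }
}

/// What Rust's Display prints for the same values, for comparison.
fn display_label(val: f64) -> String {
    format!("{}", val)
}
```

With this, `non_finite_label(f64::NEG_INFINITY)` gives `-Infinity`, while `display_label` gives `-inf` for the same input.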
##########
datafusion/spark/src/function/conversion/cast.rs:
##########
@@ -34,12 +35,36 @@ use std::sync::Arc;
const MICROS_PER_SECOND: i64 = 1_000_000;
-/// Convert seconds to microseconds with saturating overflow behavior (matches spark spec)
+/// Convert integer seconds to microseconds with saturating overflow behavior
#[inline]
fn secs_to_micros(secs: i64) -> i64 {
secs.saturating_mul(MICROS_PER_SECOND)
}
+/// Convert float seconds to microseconds
+/// Returns None for NaN/Infinity/Overflow in non-ANSI mode, error in ANSI mode
+#[inline]
+fn float_secs_to_micros(val: f64, enable_ansi_mode: bool) -> Result<Option<i64>> {
+ if val.is_nan() || val.is_infinite() {
+ if enable_ansi_mode {
+ return exec_err!(
+ "Cannot cast {} to TIMESTAMP",
+ if val.is_nan() { "NaN" } else { "Infinity" }
+ );
+ }
+ return Ok(None);
+ }
+ let micros = val * MICROS_PER_SECOND as f64;
+ if micros >= i64::MIN as f64 && micros <= i64::MAX as f64 {
+ Ok(Some(micros as i64))
+ } else {
+ if enable_ansi_mode {
+ return exec_err!("Overflow casting {} to TIMESTAMP", val);
+ }
+ Ok(None)
Review Comment:
```
spark-sql (default)> SELECT cast(9223372036854776001.0 as timestamp);
1970-01-01 02:03:13
Time taken: 0.066 seconds, Fetched 1 row(s)
spark-sql (default)> SELECT cast(9223372036854778001.0 as timestamp);
1970-01-01 02:36:33
Time taken: 0.063 seconds, Fetched 1 row(s)
```
This is Spark 4.0.2. It does not return NULL.
##########
datafusion/spark/src/function/conversion/cast.rs:
##########
@@ -34,12 +35,36 @@ use std::sync::Arc;
const MICROS_PER_SECOND: i64 = 1_000_000;
-/// Convert seconds to microseconds with saturating overflow behavior (matches spark spec)
+/// Convert integer seconds to microseconds with saturating overflow behavior
#[inline]
fn secs_to_micros(secs: i64) -> i64 {
secs.saturating_mul(MICROS_PER_SECOND)
}
+/// Convert float seconds to microseconds
+/// Returns None for NaN/Infinity/Overflow in non-ANSI mode, error in ANSI mode
+#[inline]
+fn float_secs_to_micros(val: f64, enable_ansi_mode: bool) -> Result<Option<i64>> {
+ if val.is_nan() || val.is_infinite() {
+ if enable_ansi_mode {
+ return exec_err!(
+ "Cannot cast {} to TIMESTAMP",
+ if val.is_nan() { "NaN" } else { "Infinity" }
+ );
+ }
+ return Ok(None);
+ }
+ let micros = val * MICROS_PER_SECOND as f64;
+ if micros >= i64::MIN as f64 && micros <= i64::MAX as f64 {
Review Comment:
```rust
println!("i64::MAX : {}", i64::MAX); // 9223372036854775807
println!("i64::MAX as f64: {}", i64::MAX as f64); // 9223372036854776000
```
I.e. `i64::MAX as f64` rounds up to 2^63, which is one more than i64::MAX.
A `micros` value equal to that will pass the `<=` condition, but casting it
to i64 will then silently saturate it to i64::MAX.
If `enable_ansi_mode == true`, you probably don't want this behavior.
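One possible way to close the gap, as a sketch only: compare against 2^63 with a strict `<` instead of `<= i64::MAX as f64`. `checked_f64_to_i64` is a hypothetical name, not from the PR, and the ANSI/non-ANSI handling of the `None` case is left out.

```rust
/// Hypothetical helper (not part of the PR): converts an f64 to i64 only
/// when the truncated value really fits, instead of relying on the
/// saturating behavior of `as`.
fn checked_f64_to_i64(val: f64) -> Option<i64> {
    // i64::MIN is -2^63, which f64 represents exactly.
    const LOWER_INCLUSIVE: f64 = i64::MIN as f64;
    // 2^63, also exact in f64. `i64::MAX as f64` rounds up to this same
    // value, so the upper bound has to be exclusive.
    const UPPER_EXCLUSIVE: f64 = -(i64::MIN as f64);

    if val >= LOWER_INCLUSIVE && val < UPPER_EXCLUSIVE {
        // In range, so `as` only truncates the fractional part.
        Some(val as i64)
    } else {
        // Out of range. NaN also lands here, because every comparison
        // with NaN is false.
        None
    }
}
```

With this check, `checked_f64_to_i64(i64::MAX as f64)` is `None`, whereas `(i64::MAX as f64) as i64` quietly yields `i64::MAX`.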
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]