Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-07-01 Thread via GitHub


findepi commented on PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#issuecomment-3022853488

   @jatin510 as mentioned 
[above](https://github.com/apache/datafusion/pull/16539#issuecomment-3014136564),
 the Float64 to timestamp cast is broken too. I filed separate issue for this: 
https://github.com/apache/datafusion/issues/16636


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-06-30 Thread via GitHub


jatin510 commented on PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#issuecomment-3019974705

   hey !
   
   When i was working with https://github.com/apache/datafusion/issues/14612 I 
found out that, after changing the default `parse_float_as_decimal` to `true`. 
The test cases for the timestamp started failing.
   
   My initial impression was, not to change the behaviour of the existing test 
cases with this parse option.
   So, I worked on https://github.com/apache/datafusion/pull/15486, and passed 
the tests successfully.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-06-30 Thread via GitHub


alamb commented on PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#issuecomment-3018897937

   > > The #15486 was merged fairly recently and didn't change the non-scalar 
code path. I think we should restore the decimal cast behavior to as it was 
before #15486.
   > 
   > @alamb do we have an agreement on the direction proposed above?
   
   
   > The https://github.com/apache/datafusion/pull/15486 was merged fairly 
recently and didn't change the non-scalar code path. I think we should restore 
the decimal cast behavior to as it was before 
https://github.com/apache/datafusion/pull/15486. For those who -- like me -- 
don't want decimal fraction to get ignored, the to_timestamp seems like the 
alternative to use. Further, we should remove any other type-special handling 
in ScalarValue::cast_to_with_options, because it's likely other discrepancies.
   
   I think this makes sense to me -- ideally we can document this somewhere so 
we don't churn / rework the semantics again in the future


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-06-30 Thread via GitHub


findepi commented on PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#issuecomment-3018609552

   > The #15486 was merged fairly recently and didn't change the non-scalar 
code path. I think we should restore the decimal cast behavior to as it was 
before #15486.
   
   
   @alamb do we have an agreement on the direction proposed above?
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-06-25 Thread via GitHub


findepi commented on code in PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#discussion_r2166181494


##
datafusion/common/src/scalar/mod.rs:
##
@@ -3069,7 +3069,7 @@ impl ScalarValue {
 ScalarValue::Decimal128(Some(decimal_value), _, scale),
 DataType::Timestamp(time_unit, None),
 ) => {
-let scale_factor = 10_i128.pow(*scale as u32);
+let scale_factor = 10_i128.pow(*scale as u32 + 3);

Review Comment:
   > I think the underlying arrow kernel may not support casting decimal --> 
timestamp
   
   Can be!
   
   So how does this work when the value is not constant-folded?
   What code ends up executing in the 'normal query' case?
   
   Can the same code be executed also in constant-folding case?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-06-24 Thread via GitHub


chenkovsky commented on code in PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#discussion_r2165199807


##
datafusion/common/src/scalar/mod.rs:
##
@@ -3069,7 +3069,7 @@ impl ScalarValue {
 ScalarValue::Decimal128(Some(decimal_value), _, scale),
 DataType::Timestamp(time_unit, None),
 ) => {
-let scale_factor = 10_i128.pow(*scale as u32);
+let scale_factor = 10_i128.pow(*scale as u32 + 3);

Review Comment:
   > Does this depend on the timestamp precision (time unit)? Can you please 
add tests for other units?
   > 
   > Do we need this special casing at all for casting decimal to timestamp? or 
can we end up calling `cast_with_options`, just like we do for virtually all 
other type pairs?
   
   @findepi you are right, it's time unit related. I updated code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-06-24 Thread via GitHub


chenkovsky commented on PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#issuecomment-3002194281

   should I also change 
https://github.com/apache/datafusion/blob/334d449ff778af590a7d8421544ac60222e5f69b/datafusion/functions/src/datetime/to_timestamp.rs#L339
 ? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-06-24 Thread via GitHub


chenkovsky commented on code in PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#discussion_r2165155973


##
datafusion/common/src/scalar/mod.rs:
##
@@ -3069,7 +3069,7 @@ impl ScalarValue {
 ScalarValue::Decimal128(Some(decimal_value), _, scale),
 DataType::Timestamp(time_unit, None),
 ) => {
-let scale_factor = 10_i128.pow(*scale as u32);
+let scale_factor = 10_i128.pow(*scale as u32 + 3);

Review Comment:
   > I think the underlying arrow kernel may not support casting decimal --> 
timestamp (I bet this is something spark related) 🤔
   @alamb here implements casting decimal --> timestamp
   
https://github.com/apache/arrow-rs/blob/b6240b32e235d4ca330372e3be31f784ba133252/arrow-cast/src/cast/mod.rs#L4032



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-06-24 Thread via GitHub


alamb commented on code in PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#discussion_r2164790029


##
datafusion/common/src/scalar/mod.rs:
##
@@ -3069,7 +3069,7 @@ impl ScalarValue {
 ScalarValue::Decimal128(Some(decimal_value), _, scale),
 DataType::Timestamp(time_unit, None),
 ) => {
-let scale_factor = 10_i128.pow(*scale as u32);
+let scale_factor = 10_i128.pow(*scale as u32 + 3);

Review Comment:
   I think the underlying arrow kernel may not support casting decimal --> 
timestamp (I bet this is something spark related) 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: The inconsistency between scalar and array on the cast of timestamp [datafusion]

2025-06-24 Thread via GitHub


findepi commented on code in PR #16539:
URL: https://github.com/apache/datafusion/pull/16539#discussion_r2164433469


##
datafusion/common/src/scalar/mod.rs:
##
@@ -3069,7 +3069,7 @@ impl ScalarValue {
 ScalarValue::Decimal128(Some(decimal_value), _, scale),
 DataType::Timestamp(time_unit, None),
 ) => {
-let scale_factor = 10_i128.pow(*scale as u32);
+let scale_factor = 10_i128.pow(*scale as u32 + 3);

Review Comment:
   Does this depend on the timestamp precision (time unit)? Can you please add 
tests for other units?
   
   Do we need this special casing at all for casting decimal to timestamp?
   or can we end up calling `cast_with_options`, just like we do for virtually 
all other type pairs?
   



##
datafusion/sqllogictest/test_files/timestamps.slt:
##
@@ -3420,3 +3420,8 @@ select to_timestamp('-1');
 
 query error DataFusion error: Arrow error: Parser error: Error parsing 
timestamp from '\-1': timestamp must contain at least 10 characters
 select to_timestamp(arrow_cast('-1', 'Utf8'));
+
+query B
+SELECT CAST(CAST(x AS decimal(17,2)) AS timestamp(3)) = CAST(CAST(1 AS 
decimal(17,2)) AS timestamp(3)) from (values (1)) t(x);
+

Review Comment:
   Formatting like this will help see why these should be equal. Also it's 
enough to project the values. (They should be equal, but also they should be 
particular values).
   
   ```
   SELECT CAST(CAST(1   AS decimal(17,2)) AS timestamp(3)) AS a UNION ALL
   SELECT CAST(CAST(one AS decimal(17,2)) AS timestamp(3)) AS a FROM (VALUES 
(1)) t(one);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]