alamb commented on code in PR #6832:
URL: https://github.com/apache/arrow-datafusion/pull/6832#discussion_r1267268245


##########
datafusion/core/tests/sqllogictests/test_files/decimal.slt:
##########
@@ -365,7 +365,7 @@ select c1*c5 from decimal_simple;
 query T
 select arrow_typeof(c1/cast(0.00001 as decimal(5,5))) from decimal_simple 
limit 1;
 ----
-Decimal128(21, 12)
+Decimal128(19, 10)

Review Comment:
   cc @liukun4515 and @viirya -- I assume you have reviewed the decimal changes 
upstream



##########
datafusion/core/tests/sqllogictests/test_files/arrow_typeof.slt:
##########
@@ -303,9 +303,11 @@ select arrow_cast('30 minutes', 'Interval(MonthDayNano)');
 
 ## Duration
 
-query error DataFusion error: Optimizer rule 'simplify_expressions' 
failed\ncaused by\nThis feature is not implemented: Can't create a scalar from 
array of type "Duration\(Second\)"
+query ?

Review Comment:
   🎉 



##########
datafusion/core/tests/sql/timestamp.rs:
##########
@@ -1035,13 +1035,13 @@ async fn timestamp_sub_with_tz() -> Result<()> {
     let sql = "SELECT val, ts1 - ts2 AS ts_diff FROM table_a ORDER BY ts2 - 
ts1";
     let actual = execute_to_batches(&ctx, sql).await;
     let expected = vec![
-        "+-----+---------------------------------------------------+",
-        "| val | ts_diff                                           |",
-        "+-----+---------------------------------------------------+",
-        "| 3   | 0 years 0 mons 0 days 10 hours 0 mins 30.000 secs |",
-        "| 1   | 0 years 0 mons 0 days 10 hours 0 mins 20.000 secs |",
-        "| 2   | 0 years 0 mons 0 days 10 hours 0 mins 10.000 secs |",
-        "+-----+---------------------------------------------------+",
+        "+-----+---------+",
+        "| val | ts_diff |",
+        "+-----+---------+",
+        "| 3   | PT30S   |",

Review Comment:
   I remember the `PT30S`  format is some sort of standard but I think it is a 
pretty terrible UX and very uncommon. 
   
   I know there is a technical difference between duration and interval in the 
arrow spec but as a user, I find this very confusing:
   
   ```
   ❯ select (now() - '2023-01-01'::timestamp);
   +----------------------------+
   | now() - Utf8("2023-01-01") |
   +----------------------------+
   | P198DT72932.972880S        |
   +----------------------------+
   ```
   
   On the other hand the original formatting is not consistent with Postgres 
either.
   
   
   ```
   postgres=# select interval '5 minutes 2  days';
       interval
   -----------------
    2 days 00:05:00
   (1 row)
   
   postgres=# select interval '5 minutes';
    interval
   ----------
    00:05:00
   (1 row)
   ```
   
   I played around with this locally and  we can still get an `interval` with 
an explicit cast 
   
   ```sql
   ❯ select (now() - '2023-01-01'::timestamp);
   +----------------------------+
   | now() - Utf8("2023-01-01") |
   +----------------------------+
   | P198DT72932.972880S        |
   +----------------------------+
   1 row in set. Query took 0.002 seconds.
   ❯ select (now() - '2023-01-01'::timestamp)::interval;
   +------------------------------------------------------------+
   | now() - Utf8("2023-01-01")                                 |
   +------------------------------------------------------------+
   | 0 years 0 mons 0 days 4772 hours 15 mins 35.571791000 secs |
   +------------------------------------------------------------+
   1 row in set. Query took 0.002 seconds.
   ```
   
   So we can perhaps add the appropriate coercion rules if necessary
   
   Or perhaps we can change the formatting of Durations in arrow-rs to be more 
interval like
   
   @waitingkuo do you have any thoughts
   



##########
datafusion/physical-expr/src/expressions/binary.rs:
##########
@@ -2609,14 +2441,14 @@ mod tests {
             Operator::Divide,
             create_decimal_array(
                 &[
-                    Some(99193548387), // 0.99193548387

Review Comment:
   why these changes?



##########
datafusion/physical-expr/src/expressions/datetime.rs:
##########


Review Comment:
   👏 



##########
datafusion/core/tests/sql/timestamp.rs:
##########
@@ -1035,13 +1035,13 @@ async fn timestamp_sub_with_tz() -> Result<()> {
     let sql = "SELECT val, ts1 - ts2 AS ts_diff FROM table_a ORDER BY ts2 - 
ts1";
     let actual = execute_to_batches(&ctx, sql).await;
     let expected = vec![
-        "+-----+---------------------------------------------------+",
-        "| val | ts_diff                                           |",
-        "+-----+---------------------------------------------------+",
-        "| 3   | 0 years 0 mons 0 days 10 hours 0 mins 30.000 secs |",
-        "| 1   | 0 years 0 mons 0 days 10 hours 0 mins 20.000 secs |",
-        "| 2   | 0 years 0 mons 0 days 10 hours 0 mins 10.000 secs |",
-        "+-----+---------------------------------------------------+",
+        "+-----+---------+",
+        "| val | ts_diff |",
+        "+-----+---------+",
+        "| 3   | PT30S   |",

Review Comment:
   I remember the `PT30S`  format is some sort of standard but I think it is a 
pretty terrible UX and very uncommon. 
   
   I know there is a technical difference between duration and interval in the 
arrow spec but as a user, I find this very confusing:
   
   ```
   ❯ select (now() - '2023-01-01'::timestamp);
   +----------------------------+
   | now() - Utf8("2023-01-01") |
   +----------------------------+
   | P198DT72932.972880S        |
   +----------------------------+
   ```
   
   On the other hand the original formatting is not consistent with Postgres 
either.
   
   
   ```
   postgres=# select interval '5 minutes 2  days';
       interval
   -----------------
    2 days 00:05:00
   (1 row)
   
   postgres=# select interval '5 minutes';
    interval
   ----------
    00:05:00
   (1 row)
   ```
   
   I played around with this locally and  we can still get an `interval` with 
an explicit cast 
   
   ```sql
   ❯ select (now() - '2023-01-01'::timestamp);
   +----------------------------+
   | now() - Utf8("2023-01-01") |
   +----------------------------+
   | P198DT72932.972880S        |
   +----------------------------+
   1 row in set. Query took 0.002 seconds.
   ❯ select (now() - '2023-01-01'::timestamp)::interval;
   +------------------------------------------------------------+
   | now() - Utf8("2023-01-01")                                 |
   +------------------------------------------------------------+
   | 0 years 0 mons 0 days 4772 hours 15 mins 35.571791000 secs |
   +------------------------------------------------------------+
   1 row in set. Query took 0.002 seconds.
   ```
   
   So we can perhaps add the appropriate coercion rules if necessary
   
   Or perhaps we can change the formatting of Durations in arrow-rs to be more 
interval like
   
   @waitingkuo do you have any thoughts
   



##########
datafusion/physical-expr/src/expressions/datetime.rs:
##########


Review Comment:
   👏 



##########
datafusion/core/tests/sqllogictests/test_files/interval.slt:
##########
@@ -430,15 +436,11 @@ select '1 month'::interval + 
'1980-01-01T12:00:00'::timestamp;
 ----
 1980-02-01T12:00:00
 
-query D
+query error DataFusion error: Error during planning: Cannot coerce arithmetic 
expression Interval\(MonthDayNano\) \- Date32 to valid types
 select '1 month'::interval - '1980-01-01'::date;
-----
-1979-12-01
 
-query P
+query error DataFusion error: Error during planning: Cannot coerce arithmetic 
expression Interval\(MonthDayNano\) \- Timestamp\(Nanosecond, None\) to valid 
types
 select '1 month'::interval - '1980-01-01T12:00:00'::timestamp;
-----

Review Comment:
   I agree an interval minus a timestamp doesn't make a lot of sense



##########
datafusion/core/tests/sqllogictests/test_files/binary.slt:
##########
@@ -110,14 +110,19 @@ ff0102 FixedSizeBinary(3)
 000102 FixedSizeBinary(3)
 
 # Comparison
-# https://github.com/apache/arrow-rs/pull/4492
-query error DataFusion error: Arrow error: Compute error: eq_dyn_binary_scalar 
only supports Binary or LargeBinary arrays

Review Comment:
   FYI @maxburke 



##########
datafusion/core/tests/sqllogictests/test_files/arrow_typeof.slt:
##########
@@ -303,9 +303,11 @@ select arrow_cast('30 minutes', 'Interval(MonthDayNano)');
 
 ## Duration
 
-query error DataFusion error: Optimizer rule 'simplify_expressions' 
failed\ncaused by\nThis feature is not implemented: Can't create a scalar from 
array of type "Duration\(Second\)"
+query ?

Review Comment:
   🎉 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to