[GitHub] [arrow-datafusion] liukun4515 commented on issue #6876: support: Date +/plus Int or date_add function

via GitHub Mon, 10 Jul 2023 22:51:18 -0700


liukun4515 commented on issue #6876:
URL: https://github.com/apache/arrow-datafusion/issues/6876#issuecomment-1630182777


   
   > 👍 -- I believe @tustvold is cleaning up the arithmetic logic in arrow-rs / datafusion now
   
   OK, I will take a look at this work and track its progress.
   
   > What types can the `value_expr` be in `spark`?
   
   In Spark:
   ```
   spark-sql> select version();
   3.2.0 5d45a415f3a29898d92380380cfd82bfc7f579ea
   Time taken: 0.084 seconds, Fetched 1 row(s)
   
   spark-sql> desc test;
   a                    date
   b                    int
   ```
   
   `date` + integer constant
   
   ```
   spark-sql> explain extended select a+10 from test;
   == Parsed Logical Plan ==
   'Project [unresolvedalias(('a + 10), None)]
   +- 'UnresolvedRelation [test], [], false
   
   == Analyzed Logical Plan ==
   date_add(a, 10): date
   Project [date_add(a#49, 10) AS date_add(a, 10)#51]
   +- SubqueryAlias spark_catalog.default.test
      +- HiveTableRelation [`default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#49, b#50], Partition Cols: []]
   
   == Optimized Logical Plan ==
   Project [date_add(a#49, 10) AS date_add(a, 10)#51]
   +- HiveTableRelation [`default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#49, b#50], Partition Cols: []]
   
   == Physical Plan ==
   *(1) Project [date_add(a#49, 10) AS date_add(a, 10)#51]
   +- Scan hive default.test [a#49], HiveTableRelation [`default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#49, b#50], Partition Cols: []]
   
   Time taken: 0.04 seconds, Fetched 1 row(s)
   ```
   
   Spark has a specific analyzer rule for arithmetic on date/time types: as the analyzed plan shows, `a + 10` is resolved to `date_add(a, 10)`.
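The rewrite visible in the analyzed plan can be sketched as a small expression-tree transformation. This is only an illustration of the idea, not Spark's or DataFusion's actual code: the classes `Column`, `Literal`, `Add`, `DateAdd` and the function `rewrite_date_arithmetic` are hypothetical names, and the sketch only handles the case shown above, where the date is on the left-hand side.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical, minimal expression nodes (not a real Spark/DataFusion API).
@dataclass(frozen=True)
class Column:
    name: str
    dtype: str  # "date" or "int" in this sketch


@dataclass(frozen=True)
class Literal:
    value: int
    dtype: str = "int"


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class DateAdd:
    start: "Expr"
    days: "Expr"
    dtype: str = "date"


Expr = Union[Column, Literal, Add, DateAdd]


def rewrite_date_arithmetic(expr: Expr) -> Expr:
    """Rewrite `date + int` into `date_add(date, int)`, bottom-up."""
    if not isinstance(expr, Add):
        return expr
    left = rewrite_date_arithmetic(expr.left)
    right = rewrite_date_arithmetic(expr.right)
    # After rewriting, a child is either a typed leaf, a DateAdd, or a plain Add.
    left_type = getattr(left, "dtype", None)
    right_type = getattr(right, "dtype", None)
    if left_type == "date" and right_type == "int":
        return DateAdd(left, right)
    return Add(left, right)


a = Column("a", "date")
b = Column("b", "int")
print(rewrite_date_arithmetic(Add(a, Literal(10))))  # DateAdd over a and 10
print(rewrite_date_arithmetic(Add(a, b)))            # DateAdd over a and b
```

Doing the rewrite at analysis time means the physical layer only needs a `date_add` kernel, and plain integer addition is left untouched.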
   
   
   `date` + integer column/expression (also resolved to `date_add`)
   ```
   spark-sql> explain extended select a+b from test;
   == Parsed Logical Plan ==
   'Project [unresolvedalias(('a + 'b), None)]
   +- 'UnresolvedRelation [test], [], false
   
   == Analyzed Logical Plan ==
   date_add(a, b): date
   Project [date_add(a#88, b#89) AS date_add(a, b)#90]
   +- SubqueryAlias spark_catalog.default.test
      +- HiveTableRelation [`default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#88, b#89], Partition Cols: []]
   
   == Optimized Logical Plan ==
   Project [date_add(a#88, b#89) AS date_add(a, b)#90]
   +- HiveTableRelation [`default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#88, b#89], Partition Cols: []]
   
   == Physical Plan ==
   *(1) Project [date_add(a#88, b#89) AS date_add(a, b)#90]
   +- Scan hive default.test [a#88, b#89], HiveTableRelation [`default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#88, b#89], Partition Cols: []]
   
   Time taken: 0.035 seconds, Fetched 1 row(s)
   ```
   
   PostgreSQL also supports the operation `date +/- integer`, as described in the docs: 
https://www.postgresql.org/docs/current/functions-datetime.html
   For example:
   ```
   date + integer → date
   
   Add a number of days to a date
   
   date '2001-09-28' + 7 → 2001-10-05
   ```
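The `date + integer` semantics above reduce to integer addition when a date is stored as a count of days since the Unix epoch, which is how Arrow's `Date32` type represents dates. The following is a minimal sketch of that idea using Python's standard library, not arrow-rs or DataFusion code; the helper names `to_date32`, `from_date32`, and `date_add` are hypothetical, and overflow of the 32-bit range is not modeled.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def to_date32(d: date) -> int:
    """Days since 1970-01-01, mirroring a Date32-style encoding."""
    return (d - EPOCH).days


def from_date32(days: int) -> date:
    """Inverse of to_date32."""
    return EPOCH + timedelta(days=days)


def date_add(d: date, n: int) -> date:
    """`date + integer`: add n days by plain integer addition on the encoding."""
    return from_date32(to_date32(d) + n)


# Same inputs as the PostgreSQL example: date '2001-09-28' + 7
print(date_add(date(2001, 9, 28), 7))  # 2001-10-05
```

Subtraction (`date - integer`) follows from passing a negative `n`.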
   
   So I want to support more arithmetic operations for date/time/timestamp/interval in DataFusion (maybe we can implement them in arrow-rs).
   
   
   Date arithmetic is a requirement of a SQL system or query engine, so I don't know whether the arrow-rs kernels are a suitable place to implement the operations above.
   
   
   
   
   
   
   
   

