liukun4515 commented on issue #6876:
URL:
https://github.com/apache/arrow-datafusion/issues/6876#issuecomment-1630182777
> 👍 -- I believe @tustvold is cleaning up the arithmetic logic in arrow-rs / datafusion now
OK, I will take a look at that work and track its progress.
> What types can the `value_expr` be in `spark`?
In Spark:
```
spark-sql> select version();
3.2.0 5d45a415f3a29898d92380380cfd82bfc7f579ea
Time taken: 0.084 seconds, Fetched 1 row(s)
spark-sql> desc test;
a date
b int
```
`date` + integer constant
```
spark-sql> explain extended select a+10 from test;
== Parsed Logical Plan ==
'Project [unresolvedalias(('a + 10), None)]
+- 'UnresolvedRelation [test], [], false
== Analyzed Logical Plan ==
date_add(a, 10): date
Project [date_add(a#49, 10) AS date_add(a, 10)#51]
+- SubqueryAlias spark_catalog.default.test
+- HiveTableRelation [`default`.`test`,
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#49, b#50],
Partition Cols: []]
== Optimized Logical Plan ==
Project [date_add(a#49, 10) AS date_add(a, 10)#51]
+- HiveTableRelation [`default`.`test`,
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#49, b#50],
Partition Cols: []]
== Physical Plan ==
*(1) Project [date_add(a#49, 10) AS date_add(a, 10)#51]
+- Scan hive default.test [a#49], HiveTableRelation [`default`.`test`,
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#49, b#50],
Partition Cols: []]
Time taken: 0.04 seconds, Fetched 1 row(s)
```
Spark has a dedicated analyzer rule that handles arithmetic on date/time
types: as the plan above shows, `a + 10` is rewritten to `date_add(a, 10)`.
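To make the idea of such a rule concrete, here is a minimal Rust sketch of a type-driven rewrite. The `Expr` and `DataType` enums and the `rewrite_date_arithmetic` function are hypothetical, simplified stand-ins; they are not Spark's, DataFusion's, or arrow-rs's actual types or APIs.

```rust
// Hypothetical, simplified expression tree (an assumption for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Date,
    Int,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { name: String, data_type: DataType },
    IntLiteral(i32),
    Add(Box<Expr>, Box<Expr>),
    DateAdd(Box<Expr>, Box<Expr>),
}

// Resolve the output type of an expression, or None if it does not type-check.
fn data_type(expr: &Expr) -> Option<DataType> {
    match expr {
        Expr::Column { data_type, .. } => Some(data_type.clone()),
        Expr::IntLiteral(_) => Some(DataType::Int),
        Expr::DateAdd(_, _) => Some(DataType::Date),
        Expr::Add(l, r) => match (data_type(l)?, data_type(r)?) {
            (DataType::Int, DataType::Int) => Some(DataType::Int),
            (DataType::Date, DataType::Int) => Some(DataType::Date),
            _ => None,
        },
    }
}

// Rewrite `date + int` into `date_add(date, int)`, bottom-up.
// Any other `Add` is left unchanged.
fn rewrite_date_arithmetic(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => {
            let l = rewrite_date_arithmetic(*l);
            let r = rewrite_date_arithmetic(*r);
            match (data_type(&l), data_type(&r)) {
                (Some(DataType::Date), Some(DataType::Int)) => {
                    Expr::DateAdd(Box::new(l), Box::new(r))
                }
                _ => Expr::Add(Box::new(l), Box::new(r)),
            }
        }
        Expr::DateAdd(l, r) => Expr::DateAdd(
            Box::new(rewrite_date_arithmetic(*l)),
            Box::new(rewrite_date_arithmetic(*r)),
        ),
        other => other,
    }
}
```

Because the rewrite only looks at the resolved types of the operands, the same rule covers both an integer literal and an integer column on the right-hand side.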
`date` + integer column/expression (equivalently, `date_add`):
```
spark-sql> explain extended select a+b from test;
== Parsed Logical Plan ==
'Project [unresolvedalias(('a + 'b), None)]
+- 'UnresolvedRelation [test], [], false
== Analyzed Logical Plan ==
date_add(a, b): date
Project [date_add(a#88, b#89) AS date_add(a, b)#90]
+- SubqueryAlias spark_catalog.default.test
+- HiveTableRelation [`default`.`test`,
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#88, b#89],
Partition Cols: []]
== Optimized Logical Plan ==
Project [date_add(a#88, b#89) AS date_add(a, b)#90]
+- HiveTableRelation [`default`.`test`,
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#88, b#89],
Partition Cols: []]
== Physical Plan ==
*(1) Project [date_add(a#88, b#89) AS date_add(a, b)#90]
+- Scan hive default.test [a#88, b#89], HiveTableRelation [`default`.`test`,
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [a#88, b#89],
Partition Cols: []]
Time taken: 0.035 seconds, Fetched 1 row(s)
```
PostgreSQL also supports the operation `date +/- integer`, as described in the
documentation: https://www.postgresql.org/docs/current/functions-datetime.html
For example:
```
date + integer → date
Add a number of days to a date
date '2001-09-28' + 7 → 2001-10-05
```
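As a rough illustration of what `date + integer` amounts to when a date is stored as a day count (Arrow's `Date32` is the number of days since 1970-01-01 as an `i32`), the sketch below adds days with overflow checking. The function names are hypothetical and this is not an arrow-rs kernel. The calendar conversions use the well-known days-from-civil algorithm and are included only so the PostgreSQL example can be reproduced.

```rust
// Days since 1970-01-01 for a proleptic Gregorian date.
fn days_from_civil(y: i32, m: u32, d: u32) -> i32 {
    // Treat January and February as months of the previous year.
    let y = i64::from(if m <= 2 { y - 1 } else { y });
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, [0, 399]
    let mp = (i64::from(m) + 9) % 12; // March = 0
    let doy = (153 * mp + 2) / 5 + i64::from(d) - 1; // day of year, [0, 365]
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era, [0, 146096]
    (era * 146_097 + doe - 719_468) as i32
}

// Inverse of `days_from_civil`: returns (year, month, day).
fn civil_from_days(z: i32) -> (i32, u32, u32) {
    let z = i64::from(z) + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097);
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    let mp = (5 * doy + 2) / 153;
    let d = doy - (153 * mp + 2) / 5 + 1;
    let m = if mp < 10 { mp + 3 } else { mp - 9 };
    let y = yoe + era * 400 + i64::from(m <= 2);
    (y as i32, m as u32, d as u32)
}

// `date + integer`: add a number of days to a day-count date.
// Returns None on i32 overflow instead of wrapping.
fn date_add_days(date32: i32, days: i32) -> Option<i32> {
    date32.checked_add(days)
}
```

For instance, `date_add_days(days_from_civil(2001, 9, 28), 7)` yields a day count that `civil_from_days` maps back to `(2001, 10, 5)`, matching the PostgreSQL example. Subtraction follows the same pattern with `checked_sub`.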
So I would like to support more arithmetic operations on
date/time/timestamp/interval types in DataFusion (maybe we can implement them
in arrow-rs).
Date arithmetic is a requirement of SQL systems and query engines, so I am
not sure whether the arrow-rs kernels are a suitable place to implement the
operations above.