MaxGekk opened a new pull request #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark
### What changes were proposed in this pull request?
In this PR, I propose to extend `ExtractBenchmark` and add new benchmarks for:
- `EXTRACT` with a `DATE` input column
- the `DATE_PART` function with `DATE` and `TIMESTAMP` input columns

The proposed benchmarks for the `DATE` type require casting longs to dates, so I extended the `CAST` expression to support casting integral types to dates. `CAST` interprets the input as the number of days since the epoch `1970-01-01`, which can be negative.
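The days-since-epoch interpretation described above can be sketched as follows. This is a minimal illustration of the semantics, not Spark's implementation; the `int_to_date` helper name is hypothetical.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)

def int_to_date(days: int) -> date:
    """Interpret an integral value as the number of days since 1970-01-01.
    Negative values map to dates before the epoch, matching the cast
    semantics described above (hypothetical helper, for illustration only)."""
    return EPOCH + timedelta(days=days)

print(int_to_date(1))     # 1970-01-02
print(int_to_date(-365))  # 1969-01-01
```

With this interpretation, `cast(1 as date)` yields the day after the epoch, and negative inputs reach back before 1970.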
### Why are the changes needed?
The `EXTRACT` expression was rebased onto the `DATE_PART` expression in a previous PR. Some of its sub-expressions take a `DATE` column as input (`Millennium`, `Year`, etc.), while others require a `TIMESTAMP` column (`Hour`, `Minute`). Separate benchmarks for `DATE` should exclude the overhead of implicit `DATE` <-> `TIMESTAMP` conversions.
### Does this PR introduce any user-facing change?
After the changes, users can convert `long`, `int`, `short`, and `byte` columns to dates:
```sql
spark-sql> select cast(1 as date);
```
Before the changes, the same SQL statement failed with the error:
```
spark-sql> select cast(1 as date);
Error in query: cannot resolve 'CAST(1 AS DATE)' due to data type mismatch: cannot cast int to date; line 1 pos 7;
'Project [unresolvedalias(cast(1 as date), None)]
+- OneRowRelation
```
### How was this patch tested?
- Added new tests to `CastSuite` to check casting of integral types to dates
- Regenerated the results of `ExtractBenchmark`
