MaxGekk opened a new pull request #25772: [SPARK-29065][SQL][TEST] Extend 
`EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772
 
 
   ### What changes were proposed in this pull request?
   
   In this PR, I propose to extend `ExtractBenchmark` with new benchmarks for:
   - `EXTRACT` with a `DATE` input column
   - the `DATE_PART` function with a `DATE`/`TIMESTAMP` input column
   
   The proposed benchmarks for the `DATE` type require casting longs to dates, 
so I extended the `CAST` expression to support casting integral types to dates. 
`CAST` interprets the input as the number of days since the epoch `1970-01-01`; 
the number can be negative.
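
To illustrate the "days since epoch" interpretation described above, here is a minimal Python sketch. It is not Spark's implementation: the helper name `days_to_date` is hypothetical, and Python's `datetime.date` supports a narrower range of years than Spark's `DATE` type, so the sketch only mirrors the arithmetic for small inputs.

```python
from datetime import date, timedelta

# The epoch that the integral input is counted from.
EPOCH = date(1970, 1, 1)


def days_to_date(days: int) -> date:
    """Interpret an integer as a number of days since 1970-01-01.

    Hypothetical helper for illustration only; negative values
    yield dates before the epoch.
    """
    return EPOCH + timedelta(days=days)


print(days_to_date(1))   # 1970-01-02
print(days_to_date(0))   # 1970-01-01
print(days_to_date(-1))  # 1969-12-31
```

The first call matches the `select cast(1 as date)` result shown in this PR description; the negative case shows a date that falls before the epoch.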
   
   ### Why are the changes needed?
   
   The `EXTRACT` expression was rebased on the `DATE_PART` expression by the PR 
https://github.com/apache/spark/pull/25410, where some of the sub-expressions 
take a `DATE` column as input (`Millennium`, `Year`, etc.) but others require a 
`TIMESTAMP` column (`Hour`, `Minute`). Separate benchmarks for `DATE` should 
exclude the overhead of implicit `DATE` <-> `TIMESTAMP` conversions.
   
   ### Does this PR introduce any user-facing change?
   
   After the changes, users can convert `long`, `int`, `short` and `byte` 
columns to dates:
   ```sql
   spark-sql> select cast(1 as date);
   1970-01-02
   ```
   Before the changes, the same SQL statement failed with the error:
   ```sql
   spark-sql> select cast(1 as date);
   Error in query: cannot resolve 'CAST(1 AS DATE)' due to data type mismatch: 
cannot cast int to date; line 1 pos 7;
   'Project [unresolvedalias(cast(1 as date), None)]
   +- OneRowRelation
   ```
   
   ### How was this patch tested?
   - Added new tests to `CastSuite` to check casting integral types to dates
   - Regenerated results of `ExtractBenchmark`
   

