bersprockets opened a new pull request, #38635:
URL: https://github.com/apache/spark/pull/38635
### What changes were proposed in this pull request?
When a user specifies a null format in `to_number`/`try_to_number`, return
`null`, with a data type of `DecimalType.USER_DEFAULT`, rather than throwing a
`NullPointerException`.
Also, since the code for `ToNumber` and `TryToNumber` is virtually
identical, move the common code into a new abstract class, `ToNumberBase`,
so the bug is fixed in one place rather than two.
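The shape of the refactoring can be sketched in plain Scala, without any Spark dependency. This is a simplified illustration, not the code in this PR: only the class names `ToNumberBase`, `ToNumber`, and `TryToNumber` come from the description above, while `errorOnFail`, `parse`, and the use of `Option` in place of a nullable `Decimal` are assumptions made for the example.

```scala
// Simplified stand-in for the refactoring described above; NOT Spark's actual code.
// `errorOnFail`, `parse`, and the Option-based result are hypothetical.
abstract class ToNumberBase(input: String, format: String) {
  // Hypothetical switch: true means a parse failure raises an error,
  // false means a parse failure yields None (standing in for SQL NULL).
  protected def errorOnFail: Boolean

  // Toy parser standing in for the real format-driven one: the format may
  // contain only '9' placeholders and the input only digits.
  private def parse(in: String, fmt: String): Option[BigDecimal] =
    if (in.nonEmpty && in.length <= fmt.length &&
        in.forall(_.isDigit) && fmt.forall(_ == '9')) Some(BigDecimal(in))
    else None

  // The null check lives in the base class, so both subclasses get the fix.
  def eval(): Option[BigDecimal] =
    if (input == null || format == null) None
    else parse(input, format) match {
      case None if errorOnFail =>
        throw new IllegalArgumentException(s"'$input' does not match format '$format'")
      case result => result
    }
}

final class ToNumber(input: String, format: String)
    extends ToNumberBase(input, format) {
  protected def errorOnFail: Boolean = true
}

final class TryToNumber(input: String, format: String)
    extends ToNumberBase(input, format) {
  protected def errorOnFail: Boolean = false
}
```

With this layout, `new ToNumber("454", null).eval()` and `new TryToNumber("454", null).eval()` both yield `None`, and the two classes differ only in how they treat input that does not match the format.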
### Why are the changes needed?
`to_number`/`try_to_number` currently fail with a `NullPointerException`
(surfaced as an `INTERNAL_ERROR` during analysis) when the format is `null`:
```
spark-sql> SELECT to_number('454', null);
[INTERNAL_ERROR] The Spark SQL phase analysis failed with an internal error. Please, fill a bug report in, and provide the full stack trace.
org.apache.spark.SparkException: [INTERNAL_ERROR] The Spark SQL phase analysis failed with an internal error. Please, fill a bug report in, and provide the full stack trace.
    at org.apache.spark.SparkException$.internalError(SparkException.scala:88)
    at org.apache.spark.sql.execution.QueryExecution$.toInternalError(QueryExecution.scala:498)
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
    ...
Caused by: java.lang.NullPointerException
    at org.apache.spark.sql.catalyst.expressions.ToNumber.numberFormat$lzycompute(numberFormatExpressions.scala:72)
    at org.apache.spark.sql.catalyst.expressions.ToNumber.numberFormat(numberFormatExpressions.scala:72)
    at org.apache.spark.sql.catalyst.expressions.ToNumber.numberFormatter$lzycompute(numberFormatExpressions.scala:73)
    at org.apache.spark.sql.catalyst.expressions.ToNumber.numberFormatter(numberFormatExpressions.scala:73)
    at org.apache.spark.sql.catalyst.expressions.ToNumber.checkInputDataTypes(numberFormatExpressions.scala:81)
```
Also:
```
spark-sql> SELECT try_to_number('454', null);
[INTERNAL_ERROR] The Spark SQL phase analysis failed with an internal error. Please, fill a bug report in, and provide the full stack trace.
org.apache.spark.SparkException: [INTERNAL_ERROR] The Spark SQL phase analysis failed with an internal error. Please, fill a bug report in, and provide the full stack trace.
    at org.apache.spark.SparkException$.internalError(SparkException.scala:88)
    at org.apache.spark.sql.execution.QueryExecution$.toInternalError(QueryExecution.scala:498)
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
    ...
Caused by: java.lang.NullPointerException
    at org.apache.spark.sql.catalyst.expressions.ToNumber.numberFormat$lzycompute(numberFormatExpressions.scala:72)
    at org.apache.spark.sql.catalyst.expressions.ToNumber.numberFormat(numberFormatExpressions.scala:72)
    at org.apache.spark.sql.catalyst.expressions.ToNumber.numberFormatter$lzycompute(numberFormatExpressions.scala:73)
    at org.apache.spark.sql.catalyst.expressions.ToNumber.numberFormatter(numberFormatExpressions.scala:73)
    at org.apache.spark.sql.catalyst.expressions.ToNumber.checkInputDataTypes(numberFormatExpressions.scala:81)
    at org.apache.spark.sql.catalyst.expressions.TryToNumber.checkInputDataTypes(numberFormatExpressions.scala:146)
```
Compare to `to_binary` and `try_to_binary`:
```
spark-sql> SELECT to_binary('abc', null);
NULL
Time taken: 3.111 seconds, Fetched 1 row(s)
spark-sql> SELECT try_to_binary('abc', null);
NULL
Time taken: 0.06 seconds, Fetched 1 row(s)
spark-sql>
```
Also compare to `to_number` in PostgreSQL 11.18:
```
SELECT to_number('454', null) is null as a;
a
true
```
### Does this PR introduce _any_ user-facing change?
Yes. `to_number`/`try_to_number` with a null format now return `null` with a
data type of `DecimalType.USER_DEFAULT` instead of failing with an internal error.
### How was this patch tested?
New unit test.