GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/21598
[SPARK-24605][SQL] size(null) returns null instead of -1
## What changes were proposed in this pull request?
This PR proposes new behavior for `size(null)`, controlled by the config flag
`spark.sql.legacy.sizeOfNull`. When the flag is disabled, the `size()`
function returns `null` for `null` input. By default,
`spark.sql.legacy.sizeOfNull` is enabled to keep backward compatibility with
previous versions; in that case, `size(null)` returns `-1`.
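
The two modes can be illustrated with a small standalone model. This is only a sketch of the semantics described above, not the Spark implementation: the function `size_of` and its parameter `legacy_size_of_null` are hypothetical names that stand in for `size()` and the `spark.sql.legacy.sizeOfNull` flag.

```python
from typing import Optional, Sized


def size_of(value: Optional[Sized], legacy_size_of_null: bool = True) -> Optional[int]:
    """Model the size() semantics described in this PR (hypothetical helper).

    legacy_size_of_null stands in for spark.sql.legacy.sizeOfNull and
    defaults to True, mirroring the backward-compatible default.
    """
    if value is None:
        # Legacy mode keeps the old result (-1); otherwise null propagates.
        return -1 if legacy_size_of_null else None
    # Non-null input is unaffected by the flag.
    return len(value)


if __name__ == "__main__":
    print(size_of(None))                             # legacy default
    print(size_of(None, legacy_size_of_null=False))  # new behavior
    print(size_of([1, 2, 3]))                        # unaffected by the flag
```

Only the `null` input case differs between the two modes; arrays and maps that are not `null` report their element count either way.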
## How was this patch tested?
Modified the existing tests for the `size()` function to check both the new
behavior (`null`) and the old one (`-1`).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 legacy-size-of-null
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21598.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21598
----
commit 2338b787c55d100ac7457e9576054044597acd50
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-06-20T05:59:49Z
New implementation of size returns null for null input
commit e18568f2fc182e41d043c6713d5df7186513fe05
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-06-20T06:00:24Z
Test for legacy and new implementation of size()
----