GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/22367
[SPARK-17916][SPARK-25241][SQL][FOLLOWUP] Fix empty string being parsed as
null when nullValue is set.
## What changes were proposed in this pull request?
In this PR, I propose a new CSV option, `emptyValue`, and an update to the SQL
Migration Guide describing how to restore the previous behavior, in which
empty strings were not written at all. Since Spark 2.4, empty strings are
saved as `""` to distinguish them from saved `null`s.
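As a rough illustration of the distinction described above, the sketch below models how a CSV writer could render `null` cells and empty-string cells differently. It is not Spark's implementation: `render_cell`, `render_row`, and the parameters `null_value` and `empty_value` are hypothetical names chosen to mirror the `nullValue` and `emptyValue` options, the defaults are assumptions made for the example, and quoting/escaping of ordinary values is deliberately left out.

```python
from typing import Optional, Sequence


def render_cell(value: Optional[str],
                null_value: str = "",
                empty_value: str = '""') -> str:
    """Render one cell of a CSV row (hypothetical model, not Spark code).

    None            -> null_value
    "" (empty str)  -> empty_value
    anything else   -> returned unchanged (no quoting or escaping here)
    """
    if value is None:
        return null_value
    if value == "":
        return empty_value
    return value


def render_row(row: Sequence[Optional[str]],
               sep: str = ",",
               null_value: str = "",
               empty_value: str = '""') -> str:
    """Join the rendered cells of a row with the separator."""
    return sep.join(render_cell(v, null_value, empty_value) for v in row)


row = ["a", "", None]

# Empty string and null are distinguishable in the output.
print(render_row(row))                  # a,"",

# With an empty empty_value, both cells render as nothing at all.
print(render_row(row, empty_value=""))  # a,,
```

In this model, passing an empty string as `empty_value` makes empty strings and `null`s indistinguishable in the written output, which corresponds to the previous behavior that the description says the migration guide explains how to restore.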
## How was this patch tested?
It was tested by `CSVSuite` and by the new tests added in PR #22234.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 csv-empty-value-2.4
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22367.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22367
----
commit 465ed7a6011bd0437c7f88cb4c18ecea68cb60ac
Author: Mario Molina <mmolimar@...>
Date: 2018-08-25T17:42:03Z
Configurable empty values when reading/writing CSV files
commit 48e143d43a876afc4f0099bf7079130d74ebe855
Author: Mario Molina <mmolimar@...>
Date: 2018-08-26T23:29:32Z
Adding tests
commit 70e217146962186a391227f1417cf79c5e81c380
Author: Mario Molina <mmolimar@...>
Date: 2018-08-26T23:33:55Z
Changing emptyValue order arg in streaming.py
commit 8665f93c442915dc23a40ffb3c958a097dec34c5
Author: Mario Molina <mmolimar@...>
Date: 2018-08-27T02:03:41Z
Changing emptyValue order arg in set_opts
commit 867c6de34673bbc877e0e26e8c0d662e038e2946
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-09-08T20:40:41Z
Added comments for parameters
commit e0cb879f3bc28f66e19d049ed0ee6dc33fc5922c
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-09-08T21:02:21Z
Updating the migration guide
commit e23098c5a6322ab3cff851b37889163c9bd09491
Author: Mario Molina <mmolimar@...>
Date: 2018-08-26T23:28:34Z
Changing order in args for emptyValue
commit 732ec78c8d376bad0cc8897b1da48a56448590fb
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-09-08T21:11:56Z
Revert some checking
commit 7eac385568c78735bb7743cfcfa234c4bea97fb0
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-09-08T21:14:13Z
Revert unneeded changes
----