GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/22389
[SPARK-17916][SPARK-25241][SQL][FOLLOW-UP] Fix empty string being parsed as
null when nullValue is set.
## What changes were proposed in this pull request?
This PR proposes a new CSV option, `emptyValue`, and updates the SQL Migration
Guide to describe how to restore the previous behavior, in which empty strings
were not written at all. Since Spark 2.4, empty strings are saved as `""` to
distinguish them from saved `null`s.
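
As an illustration of the behavior change, here is a minimal Python sketch of
the write-side semantics described above. It is not Spark's implementation:
`render_row` and its parameters `null_value` and `empty_value` are hypothetical
names that only mimic the roles of the `nullValue` and `emptyValue` options,
and quoting of fields that contain the separator is ignored.

```python
from typing import Optional, Sequence


def render_row(
    row: Sequence[Optional[object]],
    null_value: str = "",      # hypothetical stand-in for the nullValue option
    empty_value: str = '""',   # hypothetical stand-in for the emptyValue option
    sep: str = ",",
) -> str:
    """Render one row as a CSV line, keeping null and empty string apart."""
    fields = []
    for value in row:
        if value is None:
            # A null is written as the configured null marker.
            fields.append(null_value)
        elif value == "":
            # An empty string is written as the configured empty marker.
            fields.append(empty_value)
        else:
            fields.append(str(value))
    return sep.join(fields)


row = ("a", None, "", 1)

# Behavior described for Spark 2.4: the empty string is saved as "".
new_style = render_row(row)
print(new_style)  # a,,"",1

# Previous behavior: the empty string is not written at all.
old_style = render_row(row, empty_value="")
print(old_style)  # a,,,1
```

Passing an empty, unquoted string for `empty_value` mirrors the earlier output,
in which empty strings and `null`s cannot be told apart in the saved file.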
Closes #22234
Closes #22367
## How was this patch tested?
It was tested by `CSVSuite` and by the new tests added in PR #22234.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 csv-empty-value-master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22389.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22389
----
commit 458c09787b1325f95896c7f79015f041c53770dc
Author: Mario Molina <mmolimar@...>
Date: 2018-08-25T17:42:03Z
Configurable empty values when reading/writing CSV files
commit 471b8ba593388b0e53bbc95d71c032e40e058ae5
Author: Mario Molina <mmolimar@...>
Date: 2018-08-26T23:29:32Z
Adding tests
commit a89bc673a2f50147a52a38d00075687529246651
Author: Mario Molina <mmolimar@...>
Date: 2018-08-26T23:28:34Z
Changing order in args for emptyValue
commit 8e91d5dbc7d91ca9e6439c24248c1d79f06e4b4d
Author: Mario Molina <mmolimar@...>
Date: 2018-08-26T23:33:55Z
Changing emptyValue order arg in streaming.py
commit ddbac3e995dc613fc9ef854a3da05648105c208b
Author: Mario Molina <mmolimar@...>
Date: 2018-08-27T02:03:41Z
Changing emptyValue order arg in set_opts
commit 4cb2be7d09cf02c14ae010402620c025d389a5b9
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-09-08T20:40:41Z
Added comments for parameters
commit 8385c1149037cdc209a326b6945a1fdb1e0e64aa
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-09-08T21:02:21Z
Updating the migration guide
commit 75208a478d4ee8ecd1ed94ff6281b437c8c233c1
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-09-10T20:25:34Z
Revert "Adding tests"
This reverts commit 48e143d43a876afc4f0099bf7079130d74ebe855.
commit 9a04d87fb4fc52490dafe5dcd0dea85c4f1b3c5d
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-09-10T20:30:47Z
Addressing Hyukjin Kwon's concerns
----