GitHub user MaxGekk opened a pull request:

    https://github.com/apache/spark/pull/22389

    [SPARK-17916][SPARK-25241][SQL][FOLLOW-UP] Fix empty string being parsed as 
null when nullValue is set.

    ## What changes were proposed in this pull request?
    
    In this PR, I propose a new CSV option, `emptyValue`, and an update to the SQL 
Migration Guide that describes how to restore the previous behavior, in which empty 
strings were not written at all. Since Spark 2.4, empty strings are saved as 
`""` to distinguish them from saved `null`s.
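
    The distinction described above can be sketched in plain Python. This is only an 
illustration of the serialization rule, not Spark's CSV writer: the function 
`format_csv_row` and its parameters `null_value` and `empty_value` are hypothetical 
names chosen here to mirror the `nullValue` and `emptyValue` options, and the sketch 
does no quoting or escaping of ordinary values.

```python
def format_csv_row(values, null_value="", empty_value='""', sep=","):
    """Render one row so that None and "" stay distinguishable.

    Hypothetical helper: `null_value` is the text written for None,
    `empty_value` is the text written for an empty string.
    """
    fields = []
    for v in values:
        if v is None:
            fields.append(null_value)    # nulls use the null marker
        elif v == "":
            fields.append(empty_value)   # empty strings use the empty marker
        else:
            fields.append(str(v))        # no quoting/escaping in this sketch
    return sep.join(fields)


row = ["a", None, "", 1]

# Default markers: the empty string is written as "" and the null as nothing.
print(format_csv_row(row))                  # a,,"",1

# Setting the empty marker to an unquoted empty string makes the two
# cases indistinguishable again, as in the previous behavior.
print(format_csv_row(row, empty_value=""))  # a,,,1
```

    With the default markers a reader can tell the third column (empty string) from 
the second (null); with an unquoted empty marker both columns serialize identically.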
    
    Closes #22234
    Closes #22367
    
    ## How was this patch tested?
    
    It was tested by `CSVSuite` and by the new tests added in PR #22234.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 csv-empty-value-master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22389.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22389
    
----
commit 458c09787b1325f95896c7f79015f041c53770dc
Author: Mario Molina <mmolimar@...>
Date:   2018-08-25T17:42:03Z

    Configurable empty values when reading/writing CSV files

commit 471b8ba593388b0e53bbc95d71c032e40e058ae5
Author: Mario Molina <mmolimar@...>
Date:   2018-08-26T23:29:32Z

    Adding tests

commit a89bc673a2f50147a52a38d00075687529246651
Author: Mario Molina <mmolimar@...>
Date:   2018-08-26T23:28:34Z

    Changing order in args for emptyValue

commit 8e91d5dbc7d91ca9e6439c24248c1d79f06e4b4d
Author: Mario Molina <mmolimar@...>
Date:   2018-08-26T23:33:55Z

    Changing emptyValue order arg in streaming.py

commit ddbac3e995dc613fc9ef854a3da05648105c208b
Author: Mario Molina <mmolimar@...>
Date:   2018-08-27T02:03:41Z

    Changing emptyValue order arg in set_opts

commit 4cb2be7d09cf02c14ae010402620c025d389a5b9
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-08T20:40:41Z

    Added comments for parameters

commit 8385c1149037cdc209a326b6945a1fdb1e0e64aa
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-08T21:02:21Z

    Updating the migration guide

commit 75208a478d4ee8ecd1ed94ff6281b437c8c233c1
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-10T20:25:34Z

    Revert "Adding tests"
    
    This reverts commit 48e143d43a876afc4f0099bf7079130d74ebe855.

commit 9a04d87fb4fc52490dafe5dcd0dea85c4f1b3c5d
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-10T20:30:47Z

    Addressing Hyukjin Kwon's concerns

----


