GitHub user MaxGekk opened a pull request:

    https://github.com/apache/spark/pull/22374

    [SPARK-25387][SQL] Fix for NPE caused by bad CSV input

    ## What changes were proposed in this pull request?
    
    The PR fixes an NPE in `UnivocityParser` caused by malformed CSV input. In
    some cases, the `uniVocity` parser can return `null` for bad input. In this
    PR, I propose checking the result of parsing so that an NPE is not
    propagated to upper layers.
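
    The null check described above can be sketched as follows. This is only an
    illustration, not the code from this PR: `parseLine`, `safeParse`, and the
    rule that an unbalanced quote yields `null` are hypothetical stand-ins for
    a tokenizer that may return `null` on bad input.

    ```scala
    object NullSafeParsing {
      // Hypothetical stand-in for a tokenizer that may return null on bad input.
      // Assumption: a line with an odd number of double quotes counts as malformed.
      def parseLine(line: String): Array[String] =
        if (line == null || line.count(_ == '"') % 2 != 0) null
        else line.split(",", -1)

      // Wrap the possibly-null result so callers never dereference null.
      def safeParse(line: String): Option[Array[String]] =
        Option(parseLine(line))

      def main(args: Array[String]): Unit = {
        val lines = Seq("a,b,c", "x,\"unterminated", "1,2,3")
        // Malformed lines become None and are dropped instead of raising an NPE.
        val rows = lines.flatMap(l => safeParse(l))
        rows.foreach(r => println(r.mkString("|")))
      }
    }
    ```

    Wrapping the result in `Option` makes the "no tokens" case explicit at the
    call site, so the caller decides how to treat the bad record rather than
    failing on a `null` dereference.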
    
    ## How was this patch tested?
    
    I added a test that reproduces the issue and verified the change with `CSVSuite`.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 npe-on-bad-csv

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22374.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22374
    
----
commit 6f9aba58927525f48bfdca00f76bf4890fe4cf30
Author: Maxim Gekk <max.gekk@...>
Date:   2018-09-09T15:20:54Z

    Fix NPE in read with specified schema

commit 9284527d1c473facd589ea8195c017c37d076df5
Author: Maxim Gekk <max.gekk@...>
Date:   2018-09-09T15:33:41Z

    Fix NPE in read on schema inferring

commit 05fe5faf191209366a37d5531e68b51364993dee
Author: Maxim Gekk <max.gekk@...>
Date:   2018-09-09T15:39:32Z

    Checking multiLine mode

commit b20c12d7720aeba4c4e03f2a6c18ef076d5b894a
Author: Maxim Gekk <max.gekk@...>
Date:   2018-09-09T15:40:51Z

    Adding ticket number to test's title

commit c9ccbee5e15bfa4ee67e12b256eefa544ce01f74
Author: Maxim Gekk <max.gekk@...>
Date:   2018-09-09T15:44:31Z

    Fix imports

----


