GitHub user kubatyszko opened a pull request:

    https://github.com/apache/spark/pull/16319

    SPARK-18906 SQL - parsing CSV should return null for certain numeric …

    ## What changes were proposed in this pull request?
    
    This changes the CSV parser so that a numeric field that fails to parse 
    returns null.
    In conjunction with the "nullValue" option, which may be used for other 
    columns, this allows handling of CSV sources that use an empty string to 
    indicate null in one column and a different specific value to indicate 
    null in another.
    
    Currently the "nullValue" option can be provided only once, and we can't 
    assume that a data source uses only a single "null" indicator.
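
    The intended behavior can be illustrated with a minimal sketch in plain 
    Python rather than Spark's own Scala code. It is not the patch itself: 
    `cast_int_field` and its parameters are hypothetical names chosen for 
    illustration, and only the "nullValue" idea comes from the description above.

```python
from typing import Optional


def cast_int_field(raw: str, null_value: str = "") -> Optional[int]:
    """Hypothetical helper: cast one CSV field to an int, or return None."""
    # A field equal to the single configured null marker is treated as null.
    if raw == null_value:
        return None
    try:
        return int(raw)
    except ValueError:
        # Behavior described in this pull request: a numeric field that
        # cannot be parsed becomes null. An empty string is one such case
        # when the null marker is set to something else, such as "NA".
        return None
```

    With `null_value="NA"`, both `"NA"` and `""` map to `None`, while `"42"` 
    still parses as an integer.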
    
    This problem is very similar to the one discussed here: 
https://github.com/databricks/spark-csv/issues/239
    
    Sargis Dudaklayan and Kuba Tyszko - Zest Finance
    
    ## How was this patch tested?
    
    The patch was tested with a freshly compiled build of Spark 2.0.1 on a 
    sample data source that has "null" values in two columns: one marked as 
    "NA" and configured through nullValue, and another that uses "" to 
    indicate a missing integer value.
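
    A data source of the shape described above might look like the following 
    sketch. The sample rows, the column names, and the `read_rows` helper are 
    all made up for illustration (the actual test data is not included in this 
    pull request), and the code uses Python's standard `csv` module rather 
    than Spark.

```python
import csv
import io

# Made-up sample: "grade" marks null as "NA", "score" marks null as "".
SAMPLE = "name,grade,score\nalice,A,10\nbob,NA,\ncarol,B,7\n"


def read_rows(text, null_value):
    """Hypothetical reader: returns (name, grade, score) tuples."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        # String column: only the configured marker means null.
        grade = None if rec["grade"] == null_value else rec["grade"]
        # Integer column: a value that fails to parse becomes null.
        try:
            score = int(rec["score"])
        except ValueError:
            score = None
        rows.append((rec["name"], grade, score))
    return rows


rows = read_rows(SAMPLE, null_value="NA")
```

    Here the row for `bob` ends up with null in both columns even though only 
    "NA" was configured as the null marker.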
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Katlean/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16319.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16319
    
----
commit 0e6b41e01f8c9ac7c8a262bd58f2564a64a638e6
Author: Kuba Tyszko <[email protected]>
Date:   2016-12-16T21:46:43Z

    SPARK-18906 SQL - parsing CSV should return null for certain numeric fields 
- Sargis Dudaklayan and Kuba Tyszko - Zest Finance

----


