[GitHub] spark pull request #22374: [SPARK-25387][SQL] Fix for NPE caused by bad CSV ...

MaxGekk Wed, 12 Sep 2018 08:41:24 -0700

GitHub user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22374#discussion_r217082220
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala ---
    @@ -216,7 +216,12 @@ class UnivocityParser(
       }
     
       private def convert(tokens: Array[String]): InternalRow = {
    -    if (tokens.length != parsedSchema.length) {
    +    if (tokens == null) {
    --- End diff --
    
    I hit this on CSV files with some marks at the beginning, but the
    `uniVocity` parser returns `null` in many cases when it cannot read or
    parse its input, for example:
    https://github.com/uniVocity/univocity-parsers/blob/f616d151b48150bc9cb98943f9b6f8353b704359/src/main/java/com/univocity/parsers/common/AbstractParser.java#L663
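
To illustrate why the `tokens == null` check has to come before the length comparison, here is a minimal, self-contained sketch. It is not the Spark code: `NullTokensSketch`, `parseLine`, and the `Either`-based return type are hypothetical stand-ins, and `parseLine` only assumes, for illustration, that a parser may hand back `null` for input it cannot read.

```scala
// Hypothetical, simplified stand-in for the null guard discussed above.
// None of these names come from Spark or uniVocity.
object NullTokensSketch {

  // Stand-in for a parser call that returns null when it cannot
  // parse the input (assumption made for this illustration only).
  def parseLine(line: String): Array[String] =
    if (line == null || line.isEmpty) null else line.split(",", -1)

  // Check for null first: evaluating tokens.length on a null array
  // would throw a NullPointerException.
  def convert(tokens: Array[String], schemaLength: Int): Either[String, Seq[String]] =
    if (tokens == null) {
      Left("malformed record: parser returned null")
    } else if (tokens.length != schemaLength) {
      Left(s"expected $schemaLength tokens, got ${tokens.length}")
    } else {
      Right(tokens.toSeq)
    }
}
```

With this ordering, a `null` result from the parser is reported as a malformed record instead of failing inside the length check.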



---

