Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22374#discussion_r216509913
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -1700,4 +1700,13 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils with Te
         checkCount(2)
         countForMalformedCSV(0, Seq(""))
       }
    +
    +  test("SPARK-25387: bad input should not cause NPE") {
    +    val schema = StructType(StructField("a", IntegerType) :: Nil)
    +    val input = spark.createDataset(Seq("\u0000\u0000\u0001234"))
    --- End diff ---
    
btw, what does "bad input" mean in this test title (bad unicode?)? In this case the CSV parser returns null, and in other cases it throws `com.univocity.parsers.common.TextParsingException`? I just want to understand the behaviour in the parser.


