[GitHub] spark pull request #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_...

MaxGekk Mon, 19 Nov 2018 13:45:35 -0800

Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22237#discussion_r234792965
  
    --- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
    @@ -1694,7 +1694,7 @@ test_that("column functions", {
       df <- as.DataFrame(list(list("col" = "{\"date\":\"21/10/2014\"}")))
       schema2 <- structType(structField("date", "date"))
       s <- collect(select(df, from_json(df$col, schema2)))
    -  expect_equal(s[[1]][[1]], NA)
    +  expect_equal(s[[1]][[1]]$date, NA)
    --- End diff --
    
    Do you mean this particular line or in general?
    
    This line was changed because in the `PERMISSIVE` mode we usually return a 
`Row` with null fields for the values we weren't able to parse, instead of just 
`null` for the whole row.
    
    In general, the goal is to fully support the `PERMISSIVE` mode without 
exceptions, including the case when the underlying JSON parser cannot detect 
any JSON tokens at the root level. In #22237 we switched `from_json` to 
`FailureSafeParser`, with `PERMISSIVE` as the default mode. Previously 
`from_json` didn't support any modes, unlike the JSON datasource.
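
To make the changed expectation in the R test concrete, here is a minimal, Spark-free sketch in Python. It is only a toy model of the behavior described above, not Spark's implementation: the function names `from_json_whole_null` and `from_json_permissive`, the single-field signature, and the use of `%Y-%m-%d` as a stand-in for the default date pattern are all assumptions made for illustration. It covers only the single-field case from the test and does not model malformed JSON or multi-field rows.

```python
import json
from datetime import datetime

DATE_PATTERN = "%Y-%m-%d"  # assumed stand-in for the default date pattern


def parse_date(value):
    """Parse a date string; raise ValueError if it does not match DATE_PATTERN."""
    return datetime.strptime(value, DATE_PATTERN).date()


def from_json_whole_null(text, field):
    """Toy model of the old result: a field that fails to parse nulls the whole row."""
    record = json.loads(text)  # input is assumed to be a valid JSON object
    try:
        return {field: parse_date(record[field])}
    except ValueError:
        return None


def from_json_permissive(text, field):
    """Toy model of the PERMISSIVE result: the row is kept, the bad field is None."""
    record = json.loads(text)  # input is assumed to be a valid JSON object
    try:
        value = parse_date(record[field])
    except ValueError:
        value = None
    return {field: value}


col = '{"date":"21/10/2014"}'  # same input as the R test in the diff
old = from_json_whole_null(col, "date")
new = from_json_permissive(col, "date")
print(old)  # whole row is None, matching the old expect_equal(s[[1]][[1]], NA)
print(new)  # row exists, its date field is None, matching s[[1]][[1]]$date
```

In this model the old assertion checks the row itself, while the new one has to reach into the row and check the `date` field, which mirrors the one-line change in `test_sparkSQL.R`.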



