GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/15792

    [SPARK-18295][SQL] Make to_json function null safe

    ## What changes were proposed in this pull request?
    
    This PR proposes to make the `to_json` function null-safe, matching the behaviour of `from_json`.
    
    Currently, it throws a `NullPointerException` on null input; this PR fixes it to produce `null` instead.
    
    With the data below:
    
    ```scala
    import spark.implicits._
    
    val df = Seq(Some(Tuple1(Tuple1(1))), None).toDF("a")
    df.show()
    ```
    
    ```
    +----+
    |   a|
    +----+
    | [1]|
    |null|
    +----+
    ```
    
    the code below
    
    ```scala
    import org.apache.spark.sql.functions._
    
    df.select(to_json($"a")).show()
    ```
    
    produces:
    
    **Before**
    
    Throws `NullPointerException` as below:
    
    ```
    java.lang.NullPointerException
      at org.apache.spark.sql.catalyst.json.JacksonGenerator.org$apache$spark$sql$catalyst$json$JacksonGenerator$$writeFields(JacksonGenerator.scala:138)
      at org.apache.spark.sql.catalyst.json.JacksonGenerator$$anonfun$write$1.apply$mcV$sp(JacksonGenerator.scala:194)
      at org.apache.spark.sql.catalyst.json.JacksonGenerator.org$apache$spark$sql$catalyst$json$JacksonGenerator$$writeObject(JacksonGenerator.scala:131)
      at org.apache.spark.sql.catalyst.json.JacksonGenerator.write(JacksonGenerator.scala:193)
      at org.apache.spark.sql.catalyst.expressions.StructToJson.eval(jsonExpressions.scala:544)
      at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:142)
      at org.apache.spark.sql.catalyst.expressions.InterpretedProjection.apply(Projection.scala:48)
      at org.apache.spark.sql.catalyst.expressions.InterpretedProjection.apply(Projection.scala:30)
      at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    ```
    
    **After**
    
    ```
    +---------------+
    |structtojson(a)|
    +---------------+
    |       {"_1":1}|
    |           null|
    +---------------+
    ```
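    
    For illustration only, the null-safe pattern can be sketched in plain Scala without Spark. `unsafeToJson` and `nullSafeToJson` are hypothetical names and are not part of this patch or of Spark; the sketch only mirrors the idea of returning `null` for a null input rather than passing it on to the writer that the stack trace shows failing.
    
    ```scala
    object NullSafeToJson {
      // Hypothetical converter: like the pre-fix code path, it dereferences
      // its input and therefore throws NullPointerException on null.
      def unsafeToJson(row: Tuple1[Int]): String = s"""{"_1":${row._1}}"""

      // Null-safe wrapper: short-circuit to null before touching the input.
      def nullSafeToJson(row: Tuple1[Int]): String =
        if (row == null) null else unsafeToJson(row)

      def main(args: Array[String]): Unit = {
        // Same shape as the example data: one struct value and one null.
        val rows = Seq(Tuple1(1), null)
        rows.map(nullSafeToJson).foreach(println)
      }
    }
    ```
    
    Running `main` prints `{"_1":1}` followed by `null`, the same two values as the **After** table.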
    
    
    ## How was this patch tested?
    
    Unit tests in `JsonExpressionsSuite.scala` and `JsonFunctionsSuite.scala`.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-18295

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15792.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15792
    
----
commit 5c534727b0d72015104c242e369d7edc5b0fe910
Author: hyukjinkwon <[email protected]>
Date:   2016-11-07T02:34:28Z

    Make to_json expression/function null safe

commit ce0eddae4ee03002642c60cd21cc858ab4ae12a2
Author: hyukjinkwon <[email protected]>
Date:   2016-11-07T02:53:51Z

    Clean up the test

----