GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/19664

    [SPARK-22442][SQL] ScalaReflection should produce correct field names for special characters

    ## What changes were proposed in this pull request?
    
    For a case class whose field names contain special characters, e.g.:
    ```scala
    case class MyType(`field.1`: String, `field 2`: String)
    ```
    
    Although we can still manipulate the DataFrame/Dataset, the field names are encoded:
    ```scala
    scala> val df = Seq(MyType("a", "b"), MyType("c", "d")).toDF
    df: org.apache.spark.sql.DataFrame = [field$u002E1: string, field$u00202: string]
    scala> df.as[MyType].collect
    res7: Array[MyType] = Array(MyType(a,b), MyType(c,d))
    ```
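
    `$u002E` and `$u0020` are the standard Scala name encodings of `.` and the space character, and `scala.reflect.NameTransformer` can map them back. A quick REPL check of that assumption (not part of this patch):
    ```scala
    scala> import scala.reflect.NameTransformer
    import scala.reflect.NameTransformer

    scala> NameTransformer.decode("field$u002E1")
    res0: String = field.1

    scala> NameTransformer.decode("field$u00202")
    res1: String = field 2
    ```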
    
    This causes a resolution problem when we try to convert data whose field names are not encoded:
    ```scala
    spark.read.json(path).as[MyType]
    ```
    
    We should use the decoded field names instead.
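
    As a minimal sketch of the idea (using Scala's runtime-reflection API; `constructorFieldNames` is a hypothetical helper, not necessarily the exact change in the patch):
    ```scala
    import scala.reflect.runtime.universe._

    // Sketch: list a case class's constructor parameter names in their
    // decoded (source) form, so `field.1` comes back as "field.1"
    // rather than "field$u002E1".
    def constructorFieldNames(tpe: Type): Seq[String] =
      tpe.decl(termNames.CONSTRUCTOR).asMethod.paramLists.flatten
        .map(_.name.decodedName.toString)

    constructorFieldNames(typeOf[MyType])  // expected: List(field.1, field 2)
    ```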
    
    ## How was this patch tested?
    
    Added a unit test. May add an end-to-end test later.
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-22442

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19664.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19664
    
----
commit 319c80447c4fc1baa3167c889d1d8c072ee5b31c
Author: Liang-Chi Hsieh <[email protected]>
Date:   2017-11-06T09:04:52Z

    ScalaReflection should produce correct field names for special characters.

----

