GitHub user maropu opened a pull request:

    https://github.com/apache/spark/pull/20246

    [SPARK-23054][SQL] Fix incorrect results of casting UserDefinedType to String

    ## What changes were proposed in this pull request?
    This PR fixes the incorrect output produced when casting `UserDefinedType` values to strings. For example:
    ```
    >>> from pyspark.ml.classification import MultilayerPerceptronClassifier
    >>> from pyspark.ml.linalg import Vectors
    >>> df = spark.createDataFrame([(0.0, Vectors.dense([0.0, 0.0])), (1.0, Vectors.dense([0.0, 1.0]))], ["label", "features"])
    >>> df.selectExpr("CAST(features AS STRING)").show(truncate = False)
    +-------------------------------------------+
    |features                                   |
    +-------------------------------------------+
    |[6,1,0,0,2800000020,2,0,0,0]               |
    |[6,1,0,0,2800000020,2,0,0,3ff0000000000000]|
    +-------------------------------------------+
    ```
    With this PR, the result becomes:
    ```
    +---------+
    |features |
    +---------+
    |[0.0,0.0]|
    |[0.0,1.0]|
    +---------+
    ```
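
The description above does not show how the cast is implemented. As a rough sketch only, assuming the cast deserializes the internal representation back into the user class and then uses that class's own string form, the idea can be illustrated in plain Python. `DenseVector`, `DenseVectorUDT`, and `cast_to_string` here are hypothetical stand-ins written for this example, not Spark or PySpark APIs:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DenseVector:
    """Hypothetical stand-in for a user-defined vector class."""
    values: List[float]

    def __str__(self) -> str:
        # User-facing text form, e.g. [0.0,1.0]
        return "[" + ",".join(str(v) for v in self.values) + "]"


class DenseVectorUDT:
    """Hypothetical UDT: maps the user class to an internal tuple and back."""

    def serialize(self, vec: DenseVector) -> Tuple[int, List[float]]:
        # Internal form (type tag, values); not meant for display.
        return (1, list(vec.values))

    def deserialize(self, datum: Tuple[int, List[float]]) -> DenseVector:
        return DenseVector(list(datum[1]))


def cast_to_string(udt: DenseVectorUDT, internal) -> str:
    # Assumption: deserialize first, so the user class's __str__ produces
    # the text instead of the internal representation being printed.
    return str(udt.deserialize(internal))


udt = DenseVectorUDT()
rows = [
    udt.serialize(DenseVector([0.0, 0.0])),
    udt.serialize(DenseVector([0.0, 1.0])),
]
rendered = [cast_to_string(udt, r) for r in rows]
print(rendered)
```

Printing the serialized tuples directly would expose the internal layout, which is the kind of output the first table shows; going through `deserialize` yields the values a user expects to see.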
    
    ## How was this patch tested?
    Added tests in `UserDefinedTypeSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SPARK-23054

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20246.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20246
    
----
commit 137d85f23fa8d0e45144db89666f4c9083d14100
Author: Takeshi Yamamuro <yamamuro@...>
Date:   2018-01-12T02:45:42Z

    Cast user-defined data into strings

----

