GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/10391

    [SPARK-12439][SQL] Fix toCatalystArray and MapObjects

    JIRA: https://issues.apache.org/jira/browse/SPARK-12439
    
    In toCatalystArray, we should use the data type returned by dataTypeFor, 
rather than silentSchemaFor, to determine whether the element is a native type. An 
obvious problem arises when the element is Option[Int]: silentSchemaFor 
returns Int, so we wrongly treat the element as a native type.
    
    There is another problem when Option is used as the array element type. When we 
encode data like Seq(Some(1), Some(2), None) with an encoder, MapObjects is later 
used to construct an array for it. But MapObjects does not check whether the 
return value of lambdaFunction is null. As a result, the decoded data for 
Seq(Some(1), Some(2), None) is Seq(1, 2, -1) instead of 
Seq(1, 2, null).
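
    The second problem can be illustrated with a minimal, Spark-free sketch. 
Everything in it (Result, unwrap, mapIgnoringNull, mapCheckingNull) is a 
hypothetical stand-in and not the actual MapObjects code. It assumes that an 
unwrapped None is carried as a null flag plus a placeholder value of -1, which 
matches the -1 reported above.

```scala
// Hypothetical model of an evaluated expression: a primitive value plus a null flag.
final case class Result(isNull: Boolean, value: Int)

object MapObjectsSketch {
  // Unwrap Option[Int]. None sets the null flag; -1 is only a placeholder (assumption).
  def unwrap(o: Option[Int]): Result = o match {
    case Some(v) => Result(isNull = false, value = v)
    case None    => Result(isNull = true, value = -1)
  }

  // Behaviour described as the bug: the null flag is ignored, so the
  // placeholder leaks into the output array.
  def mapIgnoringNull(in: Seq[Option[Int]]): Seq[Any] =
    in.map(o => unwrap(o).value)

  // Behaviour described as the fix: check the null flag before storing the value.
  def mapCheckingNull(in: Seq[Option[Int]]): Seq[Any] =
    in.map { o =>
      val r = unwrap(o)
      val out: Any = if (r.isNull) null else r.value
      out
    }
}
```

    With the input Seq(Some(1), Some(2), None), mapIgnoringNull yields 
Seq(1, 2, -1) while mapCheckingNull yields Seq(1, 2, null).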


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 fix-catalystarray

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10391.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10391
    
----
commit 788a7c6b2156d43be9c389d6a67b70e8ff9bbbb2
Author: Liang-Chi Hsieh <[email protected]>
Date:   2015-12-19T12:45:48Z

    Fix toCatalystArray and MapObjects.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
