GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/13019

    [SPARK-15241][SPARK-15242][SQL] fix 2 decimal-related issues in RowEncoder

    ## What changes were proposed in this pull request?
    
    SPARK-15241: External rows already support Java decimals and Catalyst 
decimals, so it makes sense to support Scala decimals as well.
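
    As a rough illustration of accepting more than one external decimal 
representation, the sketch below normalizes Java and Scala decimals to a single 
type. `ExternalDecimal` and `normalize` are hypothetical names used only for this 
example and are not part of RowEncoder; the Catalyst `Decimal` case is left out 
so the snippet runs without Spark on the classpath.

    ```scala
    import java.math.{BigDecimal => JavaBigDecimal}

    // Hypothetical helper for illustration only; not Spark's API.
    object ExternalDecimal {
      // Accept either decimal flavor and return a java.math.BigDecimal.
      def normalize(value: Any): JavaBigDecimal = value match {
        case null =>
          throw new IllegalArgumentException("Decimal input must not be null")
        case d: JavaBigDecimal =>
          d
        case d: scala.math.BigDecimal =>
          // scala.math.BigDecimal wraps a java.math.BigDecimal.
          d.bigDecimal
        case other =>
          throw new IllegalArgumentException(
            s"Unsupported decimal input: ${other.getClass.getName}")
      }
    }
    ```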
    
    SPARK-15242: This is a long-standing bug that was exposed by 
https://github.com/apache/spark/pull/12364, which eliminates the `If` expression 
when the field is not nullable:
    ```
    val fieldValue = serializerFor(
      GetExternalRowField(inputObject, i, externalDataTypeForInput(f.dataType)),
      f.dataType)
    if (f.nullable) {
      If(
        Invoke(inputObject, "isNullAt", BooleanType, Literal(i) :: Nil),
        Literal.create(null, f.dataType),
        fieldValue)
    } else {
      fieldValue
    }
    ```
    
    Previously, we always used `DecimalType.SYSTEM_DEFAULT` as the output type 
of a converted decimal field, which is wrong because it doesn't match the real 
decimal type. It happened to work because we always wrapped the converted field 
in an `If` expression to do the null check, and `If` uses its `trueValue`'s data 
type as its output type.
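
    The masking effect can be sketched with a toy expression model. The types 
below only mimic the shape of Catalyst expressions and are not Spark classes; 
`ConvertedField` is a hypothetical stand-in for the converted decimal field, and 
`DecimalType(38, 18)` mirrors `DecimalType.SYSTEM_DEFAULT`.

    ```scala
    // Toy model for illustration only; none of these are Spark classes.
    sealed trait DataType
    case object BooleanType extends DataType
    final case class DecimalType(precision: Int, scale: Int) extends DataType

    sealed trait Expr { def dataType: DataType }
    final case class Literal(value: Any, dataType: DataType) extends Expr
    // Hypothetical stand-in for the converted decimal field.
    final case class ConvertedField(dataType: DataType) extends Expr
    final case class If(predicate: Expr, trueValue: Expr, falseValue: Expr) extends Expr {
      // As described above, the output type is taken from trueValue.
      def dataType: DataType = trueValue.dataType
    }

    object IfMasksTypeDemo {
      val systemDefault = DecimalType(38, 18)
      val declared = DecimalType(10, 2)

      // Pre-fix behavior: the conversion reports the system default type.
      val converted = ConvertedField(systemDefault)

      // Nullable field: the null literal carries the declared type, so the
      // surrounding If reports the declared type and hides the mismatch.
      val nullableField: Expr =
        If(Literal(false, BooleanType), Literal(null, declared), converted)

      // Non-nullable field: no If wrapper, so the wrong type is visible.
      val nonNullableField: Expr = converted
    }
    ```
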
    Now, for a non-nullable decimal field, the converted field is no longer 
wrapped in `If`, so its output type is `DecimalType.SYSTEM_DEFAULT` and we write 
wrong data into the unsafe row.
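
    One way a type mismatch like this can corrupt a value is sketched below. 
This is an illustrative assumption, not a trace of Spark's unsafe row writer: it 
only shows that an unscaled value produced under one scale and interpreted under 
another no longer represents the original number. All names are hypothetical.

    ```scala
    import java.math.{BigDecimal => JavaBigDecimal}

    // Illustration only; this does not reproduce Spark's unsafe row format.
    object ScaleMismatchDemo {
      val declaredScale = 2   // scale of the real field type, e.g. decimal(10, 2)
      val assumedScale = 18   // scale of the (38, 18) system default type

      val original = new JavaBigDecimal("12.34")

      // Writer side: the unscaled digits are produced under the assumed scale.
      val storedUnscaled = original.setScale(assumedScale).unscaledValue()

      // Reader side: the same digits are interpreted under the declared scale.
      val readBack = new JavaBigDecimal(storedUnscaled, declaredScale)
    }
    ```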
    
    The fix is simple: use the given decimal type as the output type of the 
converted decimal field.
    
    These 2 issues were found in https://github.com/apache/spark/pull/13008
    
    ## How was this patch tested?
    
    new tests in RowEncoderSuite

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark encoder-decimal

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13019.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13019
    
----
commit f49e6d463ac316e82f6118829cb2e68eb34fe358
Author: Wenchen Fan <[email protected]>
Date:   2016-05-10T02:22:08Z

    support scala decimal in external row

commit 9621fc627e41b025b50af75c19feecc691e24a1a
Author: Wenchen Fan <[email protected]>
Date:   2016-05-10T03:21:47Z

    preserve decimal type info

----


