GitHub user mn-mikke opened a pull request:

    https://github.com/apache/spark/pull/21215

    [SPARK-24148][SQL] Overloading array function to support typed empty arrays

    ## What changes were proposed in this pull request?
    The PR proposes overloading the `array` function so that users can specify
    the element type of an empty array. Currently, empty arrays produced by the
    `array` function have `StringType` elements, and there is no way to cast
    them to a different type.
    
    A typical use case is `when(cond, trueExp).otherwise(falseExp)`, which
    expects `trueExp` and `falseExp` to be of the same type. When one of the
    two branches should produce an empty array, there is currently no option
    other than creating a `UDF`.
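    
    The UDF workaround mentioned above could look roughly like the sketch
    below. It only illustrates the current situation and does not show the
    overload proposed in this PR. The DataFrame, its column names (`id`,
    `values`) and the helper `emptyIntArray` are hypothetical and exist only
    for this example.
    
    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, udf, when}
    
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("typed-empty-array-sketch")
      .getOrCreate()
    import spark.implicits._
    
    // Hypothetical input: an id column and an array<int> column.
    val df = Seq((1, Seq(1, 2)), (2, Seq(3))).toDF("id", "values")
    
    // Workaround: a zero-argument UDF whose Scala return type (Seq[Int])
    // gives the resulting column the type array<int>.
    val emptyIntArray = udf(() => Seq.empty[Int])
    
    // Both branches are now array<int>, so when/otherwise type-checks.
    val result = df.withColumn(
      "values",
      when(col("id") === 1, col("values")).otherwise(emptyIntArray())
    )
    ```
    
    The UDF is needed here only to carry the element type; it computes
    nothing, which is the overhead the proposed overload aims to avoid.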
    
    ## How was this patch tested?
    Added test cases to `DataFrameComplexTypeSuite`.
    
    ## Note
    Eventually, I will add a wrapper for PySpark, but I would like to discuss
    the idea first.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/AbsaOSS/spark feature/array-api-empty-array-to-master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21215.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21215
    
----
commit 44b18520dcf8e3e3639756cd8a12f75ea1080bee
Author: Marek Novotny <mn.mikke@...>
Date:   2018-05-02T13:42:42Z

    [SPARK-24148][SQL] Overloading array function to support typed empty arrays.

----


