[GitHub] spark pull request: [SPARK-6367][SQL] Use the proper data type for...

yhuai Thu, 19 Mar 2015 15:05:19 -0700

GitHub user yhuai opened a pull request:

    https://github.com/apache/spark/pull/5094


    [SPARK-6367][SQL] Use the proper data type for those expressions that are
    hijacking existing data types.

    This PR adds internal UDTs for expressions that are hijacking existing data
    types.
    The following UDTs are added:
    * `HyperLogLogUDT` (`BinaryType` as the SQL type) for
      `ApproxCountDistinctPartition`
    * `OpenHashSetUDT` (`ArrayType` as the SQL type) for `CollectHashSet`,
      `NewSet`, `AddItemToSet`, and `CombineSets`.
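
The idea behind the list above, a dedicated type that names the built-in SQL type used for storage and converts to and from it, can be sketched roughly as follows. This is a minimal, self-contained illustration and not code from the PR or Spark's API: the names `StorageBackedType` and `HashSetAsArray` are hypothetical, and the plain string `"array"` stands in for a real SQL type such as `ArrayType`.

```python
from abc import ABC, abstractmethod


class StorageBackedType(ABC):
    """Hypothetical stand-in for a UDT: pairs a user-facing class
    with the built-in SQL type used to store its values."""

    # Name of the built-in storage type (an assumption for illustration).
    sql_type: str

    @abstractmethod
    def serialize(self, obj):
        """Convert the user-facing object into its storage form."""

    @abstractmethod
    def deserialize(self, datum):
        """Rebuild the user-facing object from its storage form."""


class HashSetAsArray(StorageBackedType):
    """Illustrative analogue of a set-valued type stored as an array."""

    sql_type = "array"

    def serialize(self, obj):
        # Sort so that the stored array is deterministic.
        return sorted(obj)

    def deserialize(self, datum):
        return set(datum)


udt = HashSetAsArray()
stored = udt.serialize({3, 1, 2})   # storage form: a list
restored = udt.deserialize(stored)  # user-facing form: a set
```

With a type like this, an expression can report the dedicated type as its result type instead of claiming an unrelated built-in type for a value that is really a set.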
    
    I am also adding more unit tests for aggregation with code generation enabled.
    
    JIRA: https://issues.apache.org/jira/browse/SPARK-6367

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yhuai/spark expressionType

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5094.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5094
    
----
commit a384b510c0cbfbf44855d2939aae737c26c20c85
Author: Yin Huai <[email protected]>
Date:   2015-03-19T21:59:04Z

    Add UDTs for expressions that return HyperLogLog and OpenHashSet.

----


