GitHub user maropu opened a pull request:

    https://github.com/apache/spark/pull/15928

    [SPARK-18478][SQL] Support codegen'd Hive UDFs

    ## What changes were proposed in this pull request?
    This PR adds codegen support for Hive UDFs (`HiveSimpleUDF` and `HiveGenericUDF`).
    
    ## How was this patch tested?
    Added tests to `HiveUDFsSuite` that check whether query plans contain codegen'd Hive UDFs.
    
    The performance gains are as follows.
    For `HiveSimpleUDF`:
    ```
    Call Hive UDF:                         Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ----------------------------------------------------------------------------------------------
    Call Hive UDF wholestage off                   3 /    3      43794.0           0.0       1.0X
    Call Hive UDF wholestage on                    1 /    2     101551.3           0.0       2.3X
    ```
    
    For `HiveGenericUDF`:
    ```
    Call Hive generic UDF:                 Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ----------------------------------------------------------------------------------------------
    Call Hive generic UDF wholestage off           2 /    2      86919.9           0.0       1.0X
    Call Hive generic UDF wholestage on            1 /    1     143314.2           0.0       1.6X
    ```
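
As a quick sanity check on the reported numbers, the `Relative` column can be reproduced from the `Rate(M/s)` column. This is only a sketch: it assumes `Relative` is each row's rate divided by the rate of the first (wholestage off) row, and the dictionary layout and the `relative_speedup` helper are illustrative names, not part of the patch.

```python
# Rates (M rows/s) copied from the two benchmark tables above.
rates = {
    "HiveSimpleUDF": {"off": 43794.0, "on": 101551.3},
    "HiveGenericUDF": {"off": 86919.9, "on": 143314.2},
}


def relative_speedup(off_rate, on_rate):
    """Throughput ratio (on / off), rounded to one decimal like the table.

    Assumption: the Relative column is a ratio against the baseline row.
    """
    return round(on_rate / off_rate, 1)


# Compute the speedup for each UDF flavour and print it in the table's style.
speedups = {
    name: relative_speedup(r["off"], r["on"]) for name, r in rates.items()
}
for name, speedup in speedups.items():
    print(f"{name}: {speedup}X")
```

Under that assumption the computed ratios agree with the `2.3X` and `1.6X` figures in the tables. The best times are only a few milliseconds, so the ratios are probably sensitive to measurement noise.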

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SupportHiveUdfCodegen

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15928.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15928
    
----
commit 4b2a804a41d00bac24ba67aefd6ed8d5293ca0cd
Author: Takeshi YAMAMURO <[email protected]>
Date:   2016-11-17T00:44:02Z

    Support codegen for HiveUdf

commit b89c835367760faacbcda92caa1aab886d96e598
Author: Takeshi YAMAMURO <[email protected]>
Date:   2016-11-17T07:12:12Z

    Add benchmark results

commit 0490984c1f4c4bfbeb6b6978cdcedcc57279f97e
Author: Takeshi YAMAMURO <[email protected]>
Date:   2016-11-17T16:55:30Z

    Support codegen for HiveGenericUdf

commit 9f98bcacae329f3a06a0a4f62c1a508b91707ffc
Author: Takeshi YAMAMURO <[email protected]>
Date:   2016-11-18T05:29:37Z

    Brush up code

commit 54775b2e0c0395379c9df7f7baa3221cf8e9585f
Author: Takeshi YAMAMURO <[email protected]>
Date:   2016-11-18T07:02:43Z

    Add tests

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

