GitHub user rolmovel opened a pull request:

    https://github.com/apache/incubator-zeppelin/pull/320

    ZEPPELIN-289: User can now enter custom expressions in notebooks' input fields

    Currently, Spark SQL UDFs work perfectly fine in Zeppelin.
    
    We developed a custom UDF library that parses absolute and relative dates. Calling this library from Spark SQL through the standard UDF mechanism is suboptimal, because the UDF call is repeated for every row of the queried table.
    
    Example:
    ```
    select * from my_table where agg_date >= parseDate("-5d")
    ```
    This repeats the call to parseDate(...) for every single row of 'my_table'.
    
    Even worse, if we filter on a date range, as in:
    ```
    select * from my_table where agg_date >= parseDate("-5d") and agg_date <= parseDate("now")
    ```
    the call to parseDate(...) is performed twice for each row in the table.
    
    Since Spark's UDFs have no concept of an 'execution context', we were not able to overcome the problem with UDFs alone.
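
The cost described above can be illustrated with a small simulation. This is only a sketch: `parse_date` is a hypothetical stand-in for the `parseDate` UDF, the ten-row list stands in for `my_table`, and the call counter mirrors the per-row behaviour described in this PR rather than measuring Spark itself.

```python
from datetime import datetime, timedelta

calls = 0  # counts how often the hypothetical parser runs


def parse_date(expr, now):
    """Hypothetical stand-in for the parseDate UDF: handles "now" and "-Nd"."""
    global calls
    calls += 1
    if expr == "now":
        return now
    if expr.startswith("-") and expr.endswith("d"):
        return now - timedelta(days=int(expr[1:-1]))
    raise ValueError(f"unsupported expression: {expr}")


now = datetime(2015, 9, 7)  # fixed clock so the example is reproducible
rows = [now - timedelta(days=d) for d in range(10)]  # toy agg_date column

# Per-row evaluation: the parser runs once for every row in the filter.
calls = 0
per_row = [r for r in rows if r >= parse_date("-5d", now)]
per_row_calls = calls

# Evaluate once up front, then compare each row against the constant.
calls = 0
cutoff = parse_date("-5d", now)
hoisted = [r for r in rows if r >= cutoff]
hoisted_calls = calls
```

Both filters select the same rows, but the first one invokes the parser once per row while the second invokes it once in total.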
    
    We implemented a mechanism in Zeppelin that evaluates UDF expressions before the query parameters are sent to the interpreter. With queries parametrized as usual in Zeppelin, you can now enter expressions like the following in Zeppelin's input forms:
    ```
    eval:parseDate("-5d")
    ```
    or:
    ```
    eval:com.company.custom.udf.UDFUtility.parseDate("-5d")
    ```
    This is similar to how standard SQL works, where parameters are evaluated before being sent to the execution engine.
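
A minimal sketch of the idea, not the code in this PR: form values that start with `eval:` are evaluated once and the result is substituted into the query text before it would reach the interpreter. The names `resolve`, `bind` and `FUNCTIONS`, the fixed clock, the ISO date output and the `${start}` / `${end}` form names are all assumptions made for illustration. The PR's expressions name Java methods, whereas this sketch simply looks the last name segment up in a small registry.

```python
import re
from datetime import datetime, timedelta

EVAL_PREFIX = "eval:"  # prefix used in the input-form examples above
NOW = datetime(2015, 9, 7)  # fixed clock so the example is reproducible


def parse_date(expr):
    """Hypothetical parser: "now" or a relative day offset such as "-5d"."""
    if expr == "now":
        return NOW.date().isoformat()
    match = re.fullmatch(r"-(\d+)d", expr)
    if match is None:
        raise ValueError(f"unsupported date expression: {expr!r}")
    return (NOW - timedelta(days=int(match.group(1)))).date().isoformat()


# Hypothetical registry of callable names (an assumption of this sketch).
FUNCTIONS = {"parseDate": parse_date}

# Matches name("argument"), where name may be fully qualified.
CALL = re.compile(r'([\w.]+)\("([^"]*)"\)')


def resolve(value):
    """Evaluate a form value once if it carries the eval: prefix."""
    if not value.startswith(EVAL_PREFIX):
        return value  # plain values pass through unchanged
    match = CALL.fullmatch(value[len(EVAL_PREFIX):].strip())
    if match is None:
        raise ValueError(f"cannot parse expression: {value!r}")
    name = match.group(1).rsplit(".", 1)[-1]  # keep only the method name
    return FUNCTIONS[name](match.group(2))


def bind(template, params):
    """Replace ${name} placeholders with resolved form values."""
    return re.sub(r"\$\{(\w+)\}", lambda m: resolve(params[m.group(1)]), template)


query = bind(
    "select * from my_table where agg_date >= '${start}' and agg_date <= '${end}'",
    {"start": 'eval:parseDate("-5d")', "end": 'eval:parseDate("now")'},
)
```

With this approach the query that reaches the engine contains only literal dates, so each expression is evaluated once per query instead of once per row.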
    
    You can find more info in the org.apache.zeppelin.display.Evaluator javadoc.
    
    The date-range query above takes about 1 minute on a table of 1 million records. With this PR applied, the execution time drops to about 15 seconds.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/keedio/incubator-zeppelin eval-notebook-expression

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-zeppelin/pull/320.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #320
    
----
commit 1c8a97f38061808ed5221b1de568f7ef7487a34d
Author: Rodrigo Olmo Velasco <rolmo@macbook-pro-de-rodrigo.local>
Date:   2015-09-07T09:44:38Z

    ZEPPELIN-289: User can now enter custom expressions in notebooks' input fields. Expression will be evaluated server-side by Zeppelin before being sent to the interpreter.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
