GitHub user rolmovel opened a pull request: https://github.com/apache/incubator-zeppelin/pull/320
ZEPPELIN-289: User can now enter custom expressions in notebooks' input fields

Currently, Spark SQL UDFs work fine in Zeppelin. We developed a custom UDF library that parses absolute and relative dates. Feeding this library into Spark SQL through the standard UDF mechanism is suboptimal, since each UDF call is repeated for every row of the queried table. Example:

```
select * from my_table where agg_date >= parseDate("-5d")
```

This repeats the call to parseDate(...) for every single row of 'my_table'. Even worse, if we filter on a date range, as in:

```
select * from my_table where agg_date >= parseDate("-5d") and agg_date <= parseDate("now")
```

the call to parseDate(...) is performed twice for each row in the table. Since Spark's UDFs have no concept of an 'execution context', we were not able to work around the problem inside Spark.

We therefore implemented a UDF evaluation mechanism in Zeppelin that runs before the query parameters are sent to the interpreter. When parametrizing queries as usual, you can now enter expressions like the following in Zeppelin's input forms:

```
eval:parseDate("-5d")
```

or:

```
eval:com.company.custom.udf.UDFUtility.parseDate("-5d")
```

This is similar to how standard SQL works, where parameters are evaluated before being sent to the execution engine. You can find more information in the org.apache.zeppelin.display.Evaluator javadoc.

The date-range query above, run over a table of 1 million records, takes about 1 minute. With this PR applied, the execution time drops to 15 seconds.
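To illustrate the idea of evaluating form values once before the query reaches the interpreter, here is a minimal sketch in Python. It is not the code in this PR and does not mirror org.apache.zeppelin.display.Evaluator: the functions `parse_date`, `resolve_param` and `render_query`, the one-argument call grammar, and the choice to quote substituted values as SQL string literals are all assumptions made for the example.

```python
import re
from datetime import date, timedelta

EVAL_PREFIX = "eval:"
# Hypothetical grammar: a single call with one double-quoted argument,
# e.g. parseDate("-5d") or com.company.custom.udf.UDFUtility.parseDate("-5d").
CALL_RE = re.compile(r'^\s*([\w.]+)\(\s*"([^"]*)"\s*\)\s*$')


def parse_date(spec, today):
    """Hypothetical stand-in for parseDate: accepts 'now' or '<+/-N>d'."""
    if spec == "now":
        return today
    match = re.fullmatch(r"([+-]?\d+)d", spec)
    if match is None:
        raise ValueError(f"unsupported date spec: {spec!r}")
    return today + timedelta(days=int(match.group(1)))


def resolve_param(value, functions, today):
    """Evaluate an 'eval:' form value once; return any other value unchanged."""
    if not value.startswith(EVAL_PREFIX):
        return value
    call = CALL_RE.match(value[len(EVAL_PREFIX):])
    if call is None:
        raise ValueError(f"cannot parse expression: {value!r}")
    name, arg = call.groups()
    # A fully qualified name is looked up by its last component only.
    func = functions[name.rsplit(".", 1)[-1]]
    return func(arg, today).isoformat()


def render_query(template, params, functions, today):
    """Replace ${name} placeholders with resolved values as SQL string literals."""
    def substitute(match):
        resolved = resolve_param(params[match.group(1)], functions, today)
        return "'" + resolved.replace("'", "''") + "'"
    return re.sub(r"\$\{(\w+)\}", substitute, template)


functions = {"parseDate": parse_date}
query = render_query(
    "select * from my_table where agg_date >= ${from} and agg_date <= ${to}",
    {
        "from": 'eval:parseDate("-5d")',
        "to": 'eval:com.company.custom.udf.UDFUtility.parseDate("now")',
    },
    functions,
    today=date(2015, 9, 7),
)
print(query)
# select * from my_table where agg_date >= '2015-09-02' and agg_date <= '2015-09-07'
```

In this sketch each expression is evaluated exactly once per parameter, so the query that reaches the engine contains only constants and the per-row cost of the UDF call disappears.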
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/keedio/incubator-zeppelin eval-notebook-expression

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-zeppelin/pull/320.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #320

----
commit 1c8a97f38061808ed5221b1de568f7ef7487a34d
Author: Rodrigo Olmo Velasco <rolmo@macbook-pro-de-rodrigo.local>
Date:   2015-09-07T09:44:38Z

    ZEPPELIN-289: User can now enter custom expressions in notebooks' input fields.
    Expression will be evaluated server-side by Zeppelin before being sent to the interpreter.

----