GitHub user zjffdu opened a pull request: https://github.com/apache/zeppelin/pull/1452
ZEPPELIN-1442. UDF can not be found due to 2 instances of SparkSession is created

### What is this PR for?
The issue is that two SparkSession instances are created in zeppelin_pyspark.py, because SQLContext is created first and it creates a SparkSession underneath. This results in two SparkSession instances on the JVM side, and therefore two Catalog instances as well. As a result, a UDF registered through SQLContext cannot be used in SparkSession. This PR creates the SparkSession first and then assigns its internal SQLContext to sqlContext in pyspark.

### What type of PR is it?
[Bug Fix]

### Todos
* [ ] - Task

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-1442

### How should this be tested?
Integration test is added.

### Screenshots (if appropriate)

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zjffdu/zeppelin ZEPPELIN-1442

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/1452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1452

----
commit 948e8657634686f16a405b3938884a5fe48dfc1c
Author: Jeff Zhang <zjf...@apache.org>
Date:   2016-09-23T05:08:49Z

    ZEPPELIN-1442. UDF can not be found due to 2 instances of SparkSession is created

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
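
The duplicate-session problem described in this pull request can be illustrated without Spark by a toy model in plain Python. This is only a sketch: `ToySession`, `ToySQLContext`, `buggy_setup`, `fixed_setup` and `udf_visible` are hypothetical names invented for illustration, not PySpark or Zeppelin APIs. The only thing they mimic is that each session owns its own catalog, so a function registered through one session is invisible to another.

```python
class ToySession:
    """Hypothetical stand-in for a SparkSession: owns one function catalog."""

    def __init__(self):
        self.catalog = {}  # UDF name -> callable

    def call(self, name, *args):
        if name not in self.catalog:
            raise LookupError("Undefined function: " + name)
        return self.catalog[name](*args)


class ToySQLContext:
    """Hypothetical stand-in for a SQLContext: wraps a session.

    If no session is passed in, it creates its own, which mirrors the
    behaviour the PR description blames for the duplicate session.
    """

    def __init__(self, session=None):
        self.session = session if session is not None else ToySession()

    def register_function(self, name, func):
        # Registration goes into the catalog of the wrapped session only.
        self.session.catalog[name] = func


def buggy_setup():
    # SQLContext first (creates a hidden session), then a separate session.
    sql_context = ToySQLContext()
    session = ToySession()
    return session, sql_context


def fixed_setup():
    # Session first, then a SQLContext that shares that same session.
    session = ToySession()
    sql_context = ToySQLContext(session)
    return session, sql_context


def udf_visible(setup):
    """Register a UDF via the SQLContext and look it up via the session."""
    session, sql_context = setup()
    sql_context.register_function("plus_one", lambda x: x + 1)
    try:
        return session.call("plus_one", 1) == 2
    except LookupError:
        return False
```

Under this model, `udf_visible(buggy_setup)` is `False` because the two objects hold different catalogs, while `udf_visible(fixed_setup)` is `True` because both share one session. The actual change in zeppelin_pyspark.py may differ in detail; the sketch only shows why creation order matters.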