Egor Pahomov created ZEPPELIN-1442:
--------------------------------------

             Summary: Pyspark in Zeppelin does not support UDF
                 Key: ZEPPELIN-1442
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1442
             Project: Zeppelin
          Issue Type: Bug
          Components: pySpark, python-interpreter
    Affects Versions: 0.6.1
            Reporter: Egor Pahomov


This worked in Zeppelin 0.5.6 with Spark 1.6.2.
On Spark 2.0, I've checked: when I run plain pySpark, everything works fine.

I built Zeppelin with:
{code}
mvn clean package -Pspark-2.0 -Phadoop-2.6 -Pyarn -Ppyspark -DskipTests
{code}

{code}
%pyspark
from pyspark.sql import SQLContext, Row
from pyspark.sql.types import StringType, IntegerType, StructType, StructField, MapType, FloatType, ArrayType
from pyspark.sql.functions import udf

sqlContext.registerFunction("stringLengthString", lambda x: len(x))
{code}

Returns:
{code}
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-3201872850141735060.py", line 266, in <module>
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-3201872850141735060.py", line 264, in <module>
    exec(code)
  File "<stdin>", line 7, in <module>
  File "/home/egor/spark/python/pyspark/sql/context.py", line 203, in registerFunction
    self.sparkSession.catalog.registerFunction(name, f, returnType)
AttributeError: 'JavaMember' object has no attribute 'registerFunction'
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
