Does this happen in local mode as well, or only on an external cluster?
With regard to the repro - %sql select getNum() from filteredNc limit 1 -
I guess filteredNc is some table you have? When I tried it on my
local machine I got:
no such table filteredNc; line 1 pos 21
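For anyone else trying to reproduce locally, one can register a stand-in
table first (a sketch for a Zeppelin/spark-shell session against the Spark
1.3 API; hc is the HiveContext, and the contents of filteredNc here are
placeholders, since I don't know the real schema):

case class Nc(id: Int)
// Stand-in for the real filteredNc table, just so the query can run
val ncDF = hc.createDataFrame((1 to 5).map(Nc(_)))
ncDF.registerTempTable("filteredNc")

// Register and use the UDF exactly as in the original report
def getNum(): Int = 100
hc.udf.register("getNum", getNum _)
hc.sql("select getNum() from filteredNc limit 1").show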
Eran

On Thu, Jul 2, 2015 at 12:44 PM Ophir Cohen <oph...@gmail.com> wrote:

> Thank you Moon.
> Here is the link:
> https://issues.apache.org/jira/browse/ZEPPELIN-150
>
> Please let me know how I can help further.
>
> On Thu, Jul 2, 2015 at 2:35 AM, moon soo Lee <m...@apache.org> wrote:
>
>> Really appreciate you sharing the problem.
>> Very interesting. Do you mind filing an issue on JIRA?
>>
>> Best,
>> moon
>>
>> On Tue, Jun 30, 2015 at 4:32 AM Ophir Cohen <oph...@gmail.com> wrote:
>>
>>> BTW, this isn't working either:
>>>
>>>
>>>
>>> val sidNameDF = hc.sql("select sid, name from hive_table limit 10")
>>> val sidNameDF2 = hc.createDataFrame(sidNameDF.rdd, sidNameDF.schema)
>>> sidNameDF2.registerTempTable("tmp_sid_name2")
>>>
>>>
>>> On Tue, Jun 30, 2015 at 1:45 PM, Ophir Cohen <oph...@gmail.com> wrote:
>>>
>>>> I've made some progress on this issue and I think it's a bug...
>>>>
>>>> Apparently, when trying to use registered UDFs on tables that come
>>>> from Hive, it returns the above exception (ClassNotFoundException:
>>>> org.apache.zeppelin.spark.ZeppelinContext).
>>>> When I create a new table and register it, the UDFs work as expected.
>>>> See below for the full details and an example.
>>>>
>>>> Can someone tell me whether this is the expected behavior or a bug?
>>>> BTW
>>>> I don't mind working on the bug - if you can give me a pointer to the
>>>> right places.
>>>>
>>>> BTW2
>>>> Trying to register the SAME DataFrame as a tempTable does not solve the
>>>> problem - only creating a new table out of a new DataFrame does (see below).
>>>>
>>>>
>>>> *Detailed example*
>>>> 1. I have a table in Hive called '*hive_table*' with a string field
>>>> called *'name'* and an int field called *'sid'*
>>>>
>>>> 2. I registered a udf:
>>>> def getStr(str: String) = str + "_str"
>>>> hc.udf.register("getStr", getStr _)
>>>>
>>>> 3. Running the following in Zeppelin:
>>>> %sql select getStr(name), * from hive_table
>>>> yields the exception:
>>>> ClassNotFoundException: org.apache.zeppelin.spark.ZeppelinContext
>>>>
>>>> 4. Creating a new table, as follows:
>>>> case class SidName(sid: Int, name: String)
>>>> val sidNameList = hc.sql("select sid, name from hive_table limit 10")
>>>>   .collect().map(row => SidName(row.getInt(0), row.getString(1)))
>>>> val sidNameDF = hc.createDataFrame(sidNameList)
>>>> sidNameDF.registerTempTable("tmp_sid_name")
>>>>
>>>> 5. Querying the new table in the same fashion:
>>>> %sql select getStr(name), * from tmp_sid_name
>>>>
>>>> This time I get the expected results!
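>>>>
>>>> Putting the failing and the working paths together in one cell (a
>>>> sketch for a Zeppelin Spark paragraph against the Spark 1.3 API, using
>>>> the same hive_table/sid/name names as above):
>>>>
>>>> // Fails with ClassNotFoundException: org.apache.zeppelin.spark.ZeppelinContext:
>>>> //   %sql select getStr(name), * from hive_table
>>>>
>>>> // Works: round-trip the rows through a locally built DataFrame first
>>>> case class SidName(sid: Int, name: String)
>>>> val rows = hc.sql("select sid, name from hive_table limit 10").collect()
>>>> val df = hc.createDataFrame(rows.map(r => SidName(r.getInt(0), r.getString(1))))
>>>> df.registerTempTable("tmp_sid_name")
>>>>
>>>> // Now the UDF resolves:
>>>> //   %sql select getStr(name), * from tmp_sid_name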
>>>>
>>>>
>>>> On Mon, Jun 29, 2015 at 5:16 PM, Ophir Cohen <oph...@gmail.com> wrote:
>>>>
>>>>> BTW
>>>>> The same query, on the same cluster, but in the Spark shell returns
>>>>> the expected results.
>>>>>
>>>>> On Mon, Jun 29, 2015 at 3:24 PM, Ophir Cohen <oph...@gmail.com> wrote:
>>>>>
>>>>>> It looks like the Zeppelin jar is not distributed to the Spark nodes,
>>>>>> though I can't understand why it is needed for the UDF.
>>>>>>
>>>>>> On Mon, Jun 29, 2015 at 3:23 PM, Ophir Cohen <oph...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for the response.
>>>>>>> I'm not sure what you mean; that is exactly what I tried, and it failed.
>>>>>>> As I wrote above, 'hc' is just a different name for sqlc (which is a
>>>>>>> different name for z.sqlContext).
>>>>>>>
>>>>>>> I get the same results.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 29, 2015 at 2:12 PM, Mina Lee <mina...@nflabs.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Ophir,
>>>>>>>>
>>>>>>>> Can you try below?
>>>>>>>>
>>>>>>>> def getNum(): Int = {
>>>>>>>>     100
>>>>>>>> }
>>>>>>>> sqlc.udf.register("getNum", getNum _)
>>>>>>>> sqlc.sql("select getNum() from filteredNc limit 1").show
>>>>>>>>
>>>>>>>> FYI, sqlContext (== sqlc) is created internally by Zeppelin
>>>>>>>> and uses a HiveContext as the sqlContext by default
>>>>>>>> (unless you changed useHiveContext to "false" in the interpreter
>>>>>>>> menu).
>>>>>>>>
>>>>>>>> Hope it helps.
>>>>>>>>
>>>>>>>> On Mon, Jun 29, 2015 at 7:55 PM, Ophir Cohen <oph...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Guys?
>>>>>>>>> Somebody?
>>>>>>>>> Can it be that Zeppelin does not support UDFs?
>>>>>>>>>
>>>>>>>>> On Sun, Jun 28, 2015 at 11:53 AM, Ophir Cohen <oph...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Guys,
>>>>>>>>>> One more problem I have encountered using Zeppelin.
>>>>>>>>>> I'm using Spark 1.3.1 on YARN, Hadoop 2.4.
>>>>>>>>>>
>>>>>>>>>> I'm trying to create and use UDF (hc == z.sqlContext ==
>>>>>>>>>> HiveContext):
>>>>>>>>>> 1. Create and register the UDF:
>>>>>>>>>> def getNum(): Int = {
>>>>>>>>>>     100
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> hc.udf.register("getNum",getNum _)
>>>>>>>>>> 2. And I try to use it on an existing table:
>>>>>>>>>> %sql select getNum() from filteredNc limit 1
>>>>>>>>>>
>>>>>>>>>> Or:
>>>>>>>>>> 3. Trying using direct hc:
>>>>>>>>>> hc.sql("select getNum() from filteredNc limit 1").collect
>>>>>>>>>>
>>>>>>>>>> Both of them fail with
>>>>>>>>>> "java.lang.ClassNotFoundException:
>>>>>>>>>> org.apache.zeppelin.spark.ZeppelinContext"
>>>>>>>>>> (see the full exception below).
>>>>>>>>>>
>>>>>>>>>> And my questions are:
>>>>>>>>>> 1. Can it be that ZeppelinContext is not available on the Spark nodes?
>>>>>>>>>> 2. Why does it need ZeppelinContext anyway? Why is it relevant?
>>>>>>>>>>
>>>>>>>>>> The exception:
>>>>>>>>>>  WARN [2015-06-28 08:43:53,850] ({task-result-getter-0}
>>>>>>>>>> Logging.scala[logWarning]:71) - Lost task 0.2 in stage 23.0 (TID 
>>>>>>>>>> 1626,
>>>>>>>>>> ip-10-216-204-246.ec2.internal): java.lang.NoClassDefFoundError:
>>>>>>>>>> Lorg/apache/zeppelin/spark/ZeppelinContext;
>>>>>>>>>>     at java.lang.Class.getDeclaredFields0(Native Method)
>>>>>>>>>>     at java.lang.Class.privateGetDeclaredFields(Class.java:2499)
>>>>>>>>>>     at java.lang.Class.getDeclaredField(Class.java:1951)
>>>>>>>>>>     at
>>>>>>>>>> java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1659)
>>>>>>>>>>
>>>>>>>>>> <Many more of ObjectStreamClass lines of exception>
>>>>>>>>>>
>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>> org.apache.zeppelin.spark.ZeppelinContext
>>>>>>>>>>     at
>>>>>>>>>> org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:69)
>>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>>>>>>>     ... 103 more
>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>> org.apache.zeppelin.spark.ZeppelinContext
>>>>>>>>>>     at java.lang.ClassLoader.findClass(ClassLoader.java:531)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
>>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
>>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:64)
>>>>>>>>>>     ... 105 more
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>
