fxoSa created TOREE-457:
---------------------------

             Summary: Spark context appears corrupted after loading Kafka libraries
                 Key: TOREE-457
                 URL: https://issues.apache.org/jira/browse/TOREE-457
             Project: TOREE
          Issue Type: Bug
          Components: Kernel
            Reporter: fxoSa
            Priority: Minor


I am trying to set up a Jupyter notebook (Apache Toree, Scala) to access Kafka 
logs from Spark Streaming.
First I add the dependencies using %AddDeps:
 
{code:java}
%AddDeps org.apache.spark spark-streaming-kafka-0-10_2.11 2.2.0
Marking org.apache.spark:spark-streaming-kafka-0-10_2.11:2.2.0 for download 
Preparing to fetch from:
-> file:/tmp/toree_add_deps8235567186565695423/
-> https://repo1.maven.org/maven2
-> New file at 
/tmp/toree_add_deps8235567186565695423/https/repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka-0-10_2.11/2.2.0/spark-streaming-kafka-0-10_2.11-2.2.0.jar
{code}
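One thing I notice in the fetch log above is that only the single artifact is downloaded. The magic also has a --transitive option (per the %AddDeps help) that would pull the artifact's own dependencies, e.g. kafka-clients, as well; a variant of the command would be:

{code:java}
%AddDeps org.apache.spark spark-streaming-kafka-0-10_2.11 2.2.0 --transitive
{code}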
After that I am able to successfully import the necessary libraries:

{code:java}

import org.apache.spark.SparkConf
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka010._
{code}
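Before creating the streaming context, a quick diagnostic (a sketch, not from my original session) can show what the kernel actually has bound to sc:

{code:java}
// Diagnostic sketch: the pre-defined context should report the plain class name
println(sc.getClass.getName) // expected: org.apache.spark.SparkContext
println(sc.version)          // Spark version the kernel is running
{code}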


However, the code fails when I try to create the streaming context:


{code:java}
val ssc = new StreamingContext(sc, Seconds(2))

    Name: Compile Error
Message: <console>:38: error: overloaded method constructor StreamingContext 
with alternatives:
  (path: String,sparkContext: 
org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext)org.apache.spark.streaming.StreamingContext
 <and>
  (path: String,hadoopConf: 
org.apache.hadoop.conf.Configuration)org.apache.spark.streaming.StreamingContext
 <and>
  (conf: org.apache.spark.SparkConf,batchDuration: 
org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext 
<and>
  (sparkContext: 
org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext,batchDuration:
 org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext
 cannot be applied to 
(org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext,
 org.apache.spark.streaming.Duration)
       val ssc = new StreamingContext(sc, Seconds(2))
                 ^
StackTrace: 

{code}
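For context, what I ultimately want to run is the standard spark-streaming-kafka-0-10 direct stream (as in the Spark integration docs); the broker address, group id, and topic below are placeholders:

{code:java}
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

// Placeholder Kafka connection settings -- adjust to the real cluster
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "toree-457-test",
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val topics = Array("test-topic") // placeholder topic

// Standard 0-10 direct stream; this is the call that needs the
// StreamingContext that fails to build above
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](topics, kafkaParams)
)

stream.map(_.value).print()
{code}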

I have tried this both in the Jupyter Docker image 
https://github.com/jupyter/docker-stacks/tree/master/all-spark-notebook
and on a Spark cluster set up on Google Cloud Platform, with the same results.
Thanks




