It seems that the Cassandra JARs are not loaded; they need to be on
the classpath.
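
As far as I know, py4j reports "Trying to call a package" when a name
such as CassandraJavaUtil cannot be resolved to a class in the JVM. One
way to narrow this down is to check whether the JAR you pass to
spark-submit actually contains that class. Below is a rough sketch using
only the Python standard library; the helper name find_class_entries is
made up for illustration, and it only inspects the JAR file, not the
running JVM.

```python
import zipfile


def find_class_entries(jar_path, simple_name):
    """Return the JAR entries whose class file is named <simple_name>.class.

    A JAR is a zip archive, so a compiled class shows up as an entry
    such as some/package/<simple_name>.class.
    """
    top_level = simple_name + ".class"
    nested_suffix = "/" + top_level
    with zipfile.ZipFile(jar_path) as jar:
        return [
            name
            for name in jar.namelist()
            if name == top_level or name.endswith(nested_suffix)
        ]
```

For example, find_class_entries("/spark/pyspark-cassandra-0.1-SNAPSHOT.jar",
"CassandraJavaUtil") returning an empty list would suggest that the JAR
was built without the connector classes, in which case the connector JAR
would also have to be passed via --jars and --driver-class-path.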

On Mon, Feb 16, 2015 at 12:08 PM, Mohamed Lrhazi
<mohamed.lrh...@georgetown.edu> wrote:
> Hello all,
>
> Trying the example code from this package
> (https://github.com/Parsely/pyspark-cassandra), I always get this error...
>
> Can you see what I am doing wrong? From googling around, it seems that
> the JAR is not found somehow... The Spark log shows the JAR was processed,
> at least.
>
> Thank you so much.
>
> I am using spark-1.2.1-bin-hadoop2.4.tgz
>
> test2.py is simply:
>
> from pyspark.context import SparkConf
> from pyspark_cassandra import CassandraSparkContext, saveToCassandra
> conf = SparkConf().setAppName("PySpark Cassandra Sample Driver")
> conf.set("spark.cassandra.connection.host", "devzero")
> sc = CassandraSparkContext(conf=conf)
>
> [root@devzero spark]# /usr/local/bin/docker-enter  spark-master bash -c
> "/spark/bin/spark-submit --py-files /spark/pyspark_cassandra.py --jars
> /spark/pyspark-cassandra-0.1-SNAPSHOT.jar --driver-class-path
> /spark/pyspark-cassandra-0.1-SNAPSHOT.jar /spark/test2.py"
> ...
> 15/02/16 05:58:45 INFO Slf4jLogger: Slf4jLogger started
> 15/02/16 05:58:45 INFO Remoting: Starting remoting
> 15/02/16 05:58:45 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://sparkDriver@devzero:38917]
> 15/02/16 05:58:45 INFO Utils: Successfully started service 'sparkDriver' on
> port 38917.
> 15/02/16 05:58:45 INFO SparkEnv: Registering MapOutputTracker
> 15/02/16 05:58:45 INFO SparkEnv: Registering BlockManagerMaster
> 15/02/16 05:58:45 INFO DiskBlockManager: Created local directory at
> /tmp/spark-6cdca68b-edec-4a31-b3c1-a7e9d60191e7/spark-0e977468-6e31-4bba-959a-135d9ebda193
> 15/02/16 05:58:45 INFO MemoryStore: MemoryStore started with capacity 265.4
> MB
> 15/02/16 05:58:45 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 15/02/16 05:58:46 INFO HttpFileServer: HTTP File server directory is
> /tmp/spark-af61f7f5-7c0e-412c-8352-263338335fa5/spark-10b3891f-0321-44fe-ba60-1a8c102fd647
> 15/02/16 05:58:46 INFO HttpServer: Starting HTTP Server
> 15/02/16 05:58:46 INFO Utils: Successfully started service 'HTTP file
> server' on port 56642.
> 15/02/16 05:58:46 INFO Utils: Successfully started service 'SparkUI' on port
> 4040.
> 15/02/16 05:58:46 INFO SparkUI: Started SparkUI at http://devzero:4040
> 15/02/16 05:58:46 INFO SparkContext: Added JAR
> file:/spark/pyspark-cassandra-0.1-SNAPSHOT.jar at
> http://10.212.55.42:56642/jars/pyspark-cassandra-0.1-SNAPSHOT.jar with
> timestamp 1424066326632
> 15/02/16 05:58:46 INFO Utils: Copying /spark/test2.py to
> /tmp/spark-e8cc013e-faae-4208-8bcd-0bb6c00b1b6c/spark-54f2c41d-ae35-4efd-860c-2e5c60979b4c/test2.py
> 15/02/16 05:58:46 INFO SparkContext: Added file file:/spark/test2.py at
> http://10.212.55.42:56642/files/test2.py with timestamp 1424066326633
> 15/02/16 05:58:46 INFO Utils: Copying /spark/pyspark_cassandra.py to
> /tmp/spark-e8cc013e-faae-4208-8bcd-0bb6c00b1b6c/spark-54f2c41d-ae35-4efd-860c-2e5c60979b4c/pyspark_cassandra.py
> 15/02/16 05:58:46 INFO SparkContext: Added file
> file:/spark/pyspark_cassandra.py at
> http://10.212.55.42:56642/files/pyspark_cassandra.py with timestamp
> 1424066326642
> 15/02/16 05:58:46 INFO Executor: Starting executor ID <driver> on host
> localhost
> 15/02/16 05:58:46 INFO AkkaUtils: Connecting to HeartbeatReceiver:
> akka.tcp://sparkDriver@devzero:38917/user/HeartbeatReceiver
> 15/02/16 05:58:46 INFO NettyBlockTransferService: Server created on 32895
> 15/02/16 05:58:46 INFO BlockManagerMaster: Trying to register BlockManager
> 15/02/16 05:58:46 INFO BlockManagerMasterActor: Registering block manager
> localhost:32895 with 265.4 MB RAM, BlockManagerId(<driver>, localhost,
> 32895)
> 15/02/16 05:58:46 INFO BlockManagerMaster: Registered BlockManager
> 15/02/16 05:58:47 INFO SparkUI: Stopped Spark web UI at http://devzero:4040
> 15/02/16 05:58:47 INFO DAGScheduler: Stopping DAGScheduler
> 15/02/16 05:58:48 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor
> stopped!
> 15/02/16 05:58:48 INFO MemoryStore: MemoryStore cleared
> 15/02/16 05:58:48 INFO BlockManager: BlockManager stopped
> 15/02/16 05:58:48 INFO BlockManagerMaster: BlockManagerMaster stopped
> 15/02/16 05:58:48 INFO SparkContext: Successfully stopped SparkContext
> 15/02/16 05:58:48 INFO RemoteActorRefProvider$RemotingTerminator: Shutting
> down remote daemon.
> 15/02/16 05:58:48 INFO RemoteActorRefProvider$RemotingTerminator: Remote
> daemon shut down; proceeding with flushing remote transports.
> 15/02/16 05:58:48 INFO RemoteActorRefProvider$RemotingTerminator: Remoting
> shut down.
> Traceback (most recent call last):
>   File "/spark/test2.py", line 5, in <module>
>     sc = CassandraSparkContext(conf=conf)
>   File "/spark/python/pyspark/context.py", line 105, in __init__
>     conf, jsc)
>   File "/spark/pyspark_cassandra.py", line 17, in _do_init
>     self._jcsc = self._jvm.CassandraJavaUtil.javaFunctions(self._jsc)
>   File "/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line
> 726, in __getattr__
> py4j.protocol.Py4JError: Trying to call a package.
>
>
