Did you ever receive a response on this? I am trying to load HBase classes and am getting the same error, "py4j.protocol.Py4JError: Trying to call a package.", even though $HBASE_HOME/lib/* has already been added to compute-classpath.sh.
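For anyone else debugging this: as far as I understand Py4J, a name that it cannot resolve to a class comes back as a JavaPackage placeholder, and calling that placeholder is what produces "Trying to call a package". Below is a minimal sketch of a check that walks a dotted name and reports whether it ended on a class. The helper name resolves_to_class is hypothetical, and comparing type names (rather than importing Py4J's classes) is my own shortcut, not anything Py4J or Spark provides.

```python
def resolves_to_class(jvm_view, dotted_name):
    """Return True if dotted_name ends on a Java class when looked up on jvm_view.

    jvm_view is expected to be something like sc._jvm. This helper is a
    hypothetical diagnostic, not part of Py4J or PySpark.
    """
    target = jvm_view
    for part in dotted_name.split("."):
        # Each attribute lookup asks the gateway what this name refers to.
        target = getattr(target, part)
    # Assumption: Py4J returns a JavaClass for a resolvable class and a
    # JavaPackage placeholder for anything it cannot resolve.
    return type(target).__name__ == "JavaClass"
```

In my case I would call it as resolves_to_class(sc._jvm, "org.apache.hadoop.hbase.HBaseConfiguration"). A False result would suggest the class is not visible to the JVM behind the gateway, whatever compute-classpath.sh says.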
2014-10-21 16:02 GMT-07:00 Mike Sukmanowsky <mike.sukmanow...@gmail.com>:

> Hi there,
>
> I'm using Spark 1.1.0 and experimenting with trying to use the DataStax
> Cassandra Connector (https://github.com/datastax/spark-cassandra-connector)
> from within PySpark.
>
> As a baby step, I'm simply trying to validate that I have access to
> classes that I'd need via Py4J. Sample python program:
>
>     from py4j.java_gateway import java_import
>
>     from pyspark.conf import SparkConf
>     from pyspark import SparkContext
>
>     conf = SparkConf().set("spark.cassandra.connection.host", "127.0.0.1")
>     sc = SparkContext(appName="Spark + Cassandra Example", conf=conf)
>     java_import(sc._gateway.jvm, "com.datastax.spark.connector.*")
>     print sc._jvm.CassandraRow()
>
> CassandraRow corresponds to
> https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/CassandraRow.scala
> which is included in the JAR I submit. Feel free to download the JAR here
> https://dl.dropboxusercontent.com/u/4385786/pyspark-cassandra-0.1.0-SNAPSHOT-standalone.jar
>
> I'm currently running this Python example with:
>
>     spark-submit \
>       --driver-class-path="/path/to/pyspark-cassandra-0.1.0-SNAPSHOT-standalone.jar" \
>       --verbose src/python/cassandara_example.py
>
> But continually get the following error indicating that the classes aren't
> in fact on the classpath of the GatewayServer:
>
>     Traceback (most recent call last):
>       File "/Users/mikesukmanowsky/Development/parsely/pyspark-cassandra/src/python/cassandara_example.py", line 37, in <module>
>         main()
>       File "/Users/mikesukmanowsky/Development/parsely/pyspark-cassandra/src/python/cassandara_example.py", line 25, in main
>         print sc._jvm.CassandraRow()
>       File "/Users/mikesukmanowsky/.opt/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 726, in __getattr__
>     py4j.protocol.Py4JError: Trying to call a package.
>
> The correct response from the GatewayServer should be:
>
>     In [22]: gateway.jvm.CassandraRow()
>     Out[22]: JavaObject id=o0
>
> Also tried using --jars option instead and that doesn't seem to work
> either. Is there something I'm missing as to why the classes aren't
> available?
>
> --
> Mike Sukmanowsky
> Aspiring Digital Carpenter
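One thing that may be worth ruling out first is whether the class file is really packaged in the JAR being passed to spark-submit. The sketch below uses only the Python standard library and treats the JAR as a zip archive. The function name jar_contains_class is hypothetical, and a True result only shows the class is in the archive, not that the GatewayServer's classloader can see it.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the JAR at jar_path has a .class entry for class_name.

    class_name is a fully qualified Java name, for example
    "com.datastax.spark.connector.CassandraRow".
    """
    # Inside a JAR, com.foo.Bar is stored as the entry com/foo/Bar.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For the JAR in the quoted message, the call would be jar_contains_class("/path/to/pyspark-cassandra-0.1.0-SNAPSHOT-standalone.jar", "com.datastax.spark.connector.CassandraRow"). If that returns True and the Py4J error persists, the problem is more likely in how the classpath reaches the gateway JVM than in the JAR itself.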