Hi Tommer, I'm working on updating and improving the PR, and will work on getting an HBase example working with it (roughly along the lines of the sketch below). I'll report back as soon as I've had the chance to work on this a bit more.
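For reference, here's a rough sketch of how I'd expect the HBase read to look once the PR is in. Since TableInputFormat takes the table name from the job conf rather than a file path, I'd expect the RDD-based entry point (newAPIHadoopRDD) rather than newAPIHadoopFile, plus converter classes to turn the HBase key/value types into something picklable on the Python side. The converter class names below are placeholders for whatever ends up shipping with the Spark examples, so please treat this as a sketch rather than a working snippet:

# Sketch only -- assumes the PR is merged and that converter classes along
# these lines are available on the classpath (names are placeholders).
from pyspark import SparkContext

sc = SparkContext("local", "My App")

conf = {"hbase.zookeeper.quorum": "my-host",
        "hbase.rootdir": "hdfs://my-host:8020/hbase",
        "hbase.mapreduce.inputtable": "data"}

rdd = sc.newAPIHadoopRDD(
    "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
    "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
    "org.apache.hadoop.hbase.client.Result",
    keyConverter="org.apache.spark.examples.pythonconverters.ImmutableBytesWritableToStringConverter",
    valueConverter="org.apache.spark.examples.pythonconverters.HBaseResultToStringConverter",
    conf=conf)

print(rdd.first())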
N

On Thu, May 29, 2014 at 3:27 AM, twizansk <twiza...@gmail.com> wrote:
> The code which causes the error is:
>
>     sc = SparkContext("local", "My App")
>     rdd = sc.newAPIHadoopFile(
>         name,
>         'org.apache.hadoop.hbase.mapreduce.TableInputFormat',
>         'org.apache.hadoop.hbase.io.ImmutableBytesWritable',
>         'org.apache.hadoop.hbase.client.Result',
>         conf={"hbase.zookeeper.quorum": "my-host",
>               "hbase.rootdir": "hdfs://my-host:8020/hbase",
>               "hbase.mapreduce.inputtable": "data"})
>
> The full stack trace is:
>
> Py4JError                                Traceback (most recent call last)
> <ipython-input-8-3b9a4ea2f659> in <module>()
>       7             conf={"hbase.zookeeper.quorum": "my-host",
>       8                   "hbase.rootdir": "hdfs://my-host:8020/hbase",
> ----> 9                   "hbase.mapreduce.inputtable": "data"})
>      10
>      11
>
> /opt/cloudera/parcels/CDH/lib/spark/python/pyspark/context.pyc in
> newAPIHadoopFile(self, name, inputformat_class, key_class, value_class,
> key_wrapper, value_wrapper, conf)
>     281         for k, v in conf.iteritems():
>     282             jconf[k] = v
> --> 283         jrdd = self._jvm.PythonRDD.newAPIHadoopFile(self._jsc, name,
>                     inputformat_class, key_class, value_class,
>     284             key_wrapper, value_wrapper, jconf)
>     285         return RDD(jrdd, self, PickleSerializer())
>
> /opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py
> in __getattr__(self, name)
>     657         else:
>     658             raise Py4JError('{0} does not exist in the JVM'.
> --> 659                             format(self._fqn + name))
>     660
>     661     def __call__(self, *args):
>
> Py4JError: org.apache.spark.api.python.PythonRDDnewAPIHadoopFile does not
> exist in the JVM
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p6507.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.