The code which causes the error is:

from pyspark import SparkContext

sc = SparkContext("local", "My App")
# `name` holds the input path argument expected by newAPIHadoopFile
rdd = sc.newAPIHadoopFile(
    name,
    'org.apache.hadoop.hbase.mapreduce.TableInputFormat',
    'org.apache.hadoop.hbase.io.ImmutableBytesWritable',
    'org.apache.hadoop.hbase.client.Result',
    conf={"hbase.zookeeper.quorum": "my-host",
          "hbase.rootdir": "hdfs://my-host:8020/hbase",
          "hbase.mapreduce.inputtable": "data"})

The full stack trace is:

Py4JError                                 Traceback (most recent call last)
<ipython-input-8-3b9a4ea2f659> in <module>()
      7 conf={"hbase.zookeeper.quorum": "my-host",
      8       "hbase.rootdir": "hdfs://my-host:8020/hbase",
----> 9       "hbase.mapreduce.inputtable": "data"})
     10
     11

/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/context.pyc in newAPIHadoopFile(self, name, inputformat_class, key_class, value_class, key_wrapper, value_wrapper, conf)
    281         for k, v in conf.iteritems():
    282             jconf[k] = v
--> 283         jrdd = self._jvm.PythonRDD.newAPIHadoopFile(self._jsc, name, inputformat_class, key_class, value_class,
    284                                                     key_wrapper, value_wrapper, jconf)
    285         return RDD(jrdd, self, PickleSerializer())

/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py in __getattr__(self, name)
    657         else:
    658             raise Py4JError('{0} does not exist in the JVM'.
--> 659                     format(self._fqn + name))
    660
    661     def __call__(self, *args):

Py4JError: org.apache.spark.api.python.PythonRDDnewAPIHadoopFile does not exist in the JVM
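
Note that the missing dot in "PythonRDDnewAPIHadoopFile" appears to be cosmetic: frame 659 above builds the message by concatenating self._fqn + name. The real complaint is that the JVM-side PythonRDD class has no newAPIHadoopFile method, which usually points to the pyspark Python files being newer than the Spark assembly jar they are talking to.

For comparison, here is a minimal sketch of the path-less variant that fits TableInputFormat, which reads the table named by hbase.mapreduce.inputtable rather than a file. This assumes a Spark version that ships newAPIHadoopRDD (1.1+); "my-host" and "data" are the same placeholders as above:

from pyspark import SparkContext

sc = SparkContext("local", "My App")
conf = {"hbase.zookeeper.quorum": "my-host",
        "hbase.rootdir": "hdfs://my-host:8020/hbase",
        "hbase.mapreduce.inputtable": "data"}
rdd = sc.newAPIHadoopRDD(
    "org.apache.hadoop.hbase.mapreduce.TableInputFormat",   # reads an HBase table, not a path
    "org.apache.hadoop.hbase.io.ImmutableBytesWritable",    # key class
    "org.apache.hadoop.hbase.client.Result",                # value class
    conf=conf)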


