Try this.

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Wed, May 28, 2014 at 7:40 PM, Vibhor Banga <vibhorba...@gmail.com> wrote:

> Any one who has used spark this way or has faced similar issue, please
> help.
>
> Thanks,
> -Vibhor
>
>
> On Wed, May 28, 2014 at 6:03 PM, Vibhor Banga <vibhorba...@gmail.com> wrote:
>
>> Hi all,
>>
>> I am facing issues while using spark with HBase. I am getting
>> NullPointerException at org.apache.hadoop.hbase.TableName.valueOf
>> (TableName.java:288)
>>
>> Can someone please help to resolve this issue. What am I missing ?
>>
>>
>> I am using following snippet of code -
>>
>>     Configuration config = HBaseConfiguration.create();
>>
>>     config.set("hbase.zookeeper.znode.parent", "hostname1");
>>     config.set("hbase.zookeeper.quorum","hostname1");
>>     config.set("hbase.zookeeper.property.clientPort","2181");
>>     config.set("hbase.master", "hostname1:
>>     config.set("fs.defaultFS","hdfs://hostname1/");
>>     config.set("dfs.namenode.rpc-address","hostname1:8020");
>>
>>     config.set(TableInputFormat.INPUT_TABLE, "tableName");
>>
>>     JavaSparkContext ctx = new JavaSparkContext(args[0], "Simple",
>>         System.getenv(sparkHome),
>>         JavaSparkContext.jarOfClass(Simple.class));
>>
>>     JavaPairRDD<ImmutableBytesWritable, Result> hBaseRDD
>>         = ctx.newAPIHadoopRDD( config, TableInputFormat.class,
>>         ImmutableBytesWritable.class, Result.class);
>>
>>     Map<ImmutableBytesWritable, Result> rddMap =
>>         hBaseRDD.collectAsMap();
>>
>>
>> But when I go to the spark cluster and check the logs, I see following
>> error -
>>
>> INFO NewHadoopRDD: Input split: w3-target1.nm.flipkart.com:,
>> 14/05/28 16:48:51 ERROR TableInputFormat: java.lang.NullPointerException
>>     at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:288)
>>     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:154)
>>     at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:99)
>>     at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:92)
>>     at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:84)
>>     at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:48)
>>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
>>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:109)
>>     at org.apache.spark.scheduler.Task.run(Task.scala:53)
>>     at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
>>     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
>>     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>     at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> Thanks,
>>
>> -Vibhor

Attachment: SparkHBaseMain.java
Description: Binary data