Re: Reading from HBase is too slow

2014-09-30 Thread Ted Yu
Can you launch a job which exercises TableInputFormat on the same table without using Spark ? This would show whether the slowdown is in HBase code or somewhere else. Cheers On Mon, Sep 29, 2014 at 11:40 PM, Tao Xiao wrote: > I checked HBase UI. Well, this table is not completely eve

Re: Reading from HBase is too slow

2014-09-29 Thread Tao Xiao
I checked HBase UI. Well, this table is not completely evenly spread across the nodes, but I think to some extent it can be seen as nearly evenly spread - at least there is not a single node which has too many regions. Here is a screenshot of HBase UI <http://imgbin.org/index.php?page=image

Re: Reading from HBase is too slow

2014-09-29 Thread Vladimir Rodionov
HBase TableInputFormat creates input splits one per each region. You can not achieve high level of parallelism unless you have 5-10 regions per RS at least. What does it mean? You probably have too few regions. You can verify that in HBase Web UI. -Vladimir Rodionov On Mon, Sep 29, 2014 at 7:21
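[Editor's note] The region-count advice above can be acted on at table-creation time. A hypothetical `hbase shell` snippet (table name, family, and region count are placeholders, not from this thread) that pre-splits a table so `TableInputFormat` yields one input split per region:

```shell
# Pre-split 'mytable' into 24 regions so a full scan produces 24 Spark
# partitions (one per region). Names and counts are illustrative only.
echo "create 'mytable', 'cf', {NUMREGIONS => 24, SPLITALGO => 'HexStringSplit'}" | hbase shell
```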

Re: Reading from HBase is too slow

2014-09-29 Thread Russ Weeks
Hi, Tao, When I used newAPIHadoopRDD (Accumulo not HBase) I found that I had to specify executor-memory and num-executors explicitly on the command line or else I didn't get any parallelism across the cluster. I used --executor-memory 3G --num-executors 24 but obviously other parameters wi
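[Editor's note] A hypothetical submit line in the spirit of the flags Russ mentions (class and jar names are placeholders; the right values depend on the cluster):

```shell
# Without explicit executor settings, YARN-era Spark could allocate too few
# executors to spread the scan across the cluster.
spark-submit \
  --master yarn-client \
  --num-executors 24 \
  --executor-memory 3G \
  --class com.example.HBaseCount \
  my-app.jar
```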

Re: Reading from HBase is too slow

2014-09-29 Thread Nan Zhu
can you look at your HBase UI to check whether your job is just reading from a single region server? Best, -- Nan Zhu On Monday, September 29, 2014 at 10:21 PM, Tao Xiao wrote: > I submitted a job in Yarn-Client mode, which simply reads from a HBase table > containing tens of milli

Re: Reading from HBase is too slow

2014-09-29 Thread Ted Yu
Are the regions for this table evenly spread across nodes in your cluster ? Were region servers under (heavy) load when your job ran ? Cheers On Mon, Sep 29, 2014 at 7:21 PM, Tao Xiao wrote: > I submitted a job in Yarn-Client mode, which simply reads from a HBase > table containing t

Re: Reading from HBase is too slow

2014-09-29 Thread Tao Xiao
I submitted the job in Yarn-Client mode using the following script: export SPARK_JAR=/usr/games/spark/xt/spark-assembly_2.10-0.9.0-cdh5.0.1-hadoop2.3.0-cdh5.0.1.jar export HADOOP_CLASSPATH=$(hbase classpath) export CLASSPATH=$CLASSPATH:/usr/games/spark/xt/SparkDemo-0.0.1-SNAPSHOT.jar:/usr/games

Reading from HBase is too slow

2014-09-29 Thread Tao Xiao
I submitted a job in Yarn-Client mode, which simply reads from a HBase table containing tens of millions of records and then does a *count *action. The job runs for a much longer time than I expected, so I wonder whether it was because the data to read was too much. Actually, there are 20 nodes in

Re: Spark Hbase

2014-09-24 Thread Madabhattula Rajesh Kumar
/examples/pythonconverters/HBaseConverters.scala > > Cheers > > On Wed, Sep 24, 2014 at 9:39 AM, Madabhattula Rajesh Kumar < > mrajaf...@gmail.com> wrote: > >> Hi Team, >> >> Could you please point me the example program for Spark HBase to read >> columns and values >> >> Regards, >> Rajesh >> > >

Re: Spark Hbase

2014-09-24 Thread Ted Yu
Cheers On Wed, Sep 24, 2014 at 9:39 AM, Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wrote: > Hi Team, > > Could you please point me the example program for Spark HBase to read > columns and values > > Regards, > Rajesh >

Spark Hbase

2014-09-24 Thread Madabhattula Rajesh Kumar
Hi Team, Could you please point me the example program for Spark HBase to read columns and values Regards, Rajesh

RE: Bulk-load to HBase

2014-09-22 Thread innowireless TaeYun Kim
(For that time, my program did not include the HBase export task.) BTW, I use Spark 1.0.0. Thank you. -Original Message- From: Sean Owen [mailto:so...@cloudera.com] Sent: Monday, September 22, 2014 6:26 PM To: innowireless TaeYun Kim Cc: user Subject: Re: Bulk-load to HBase On Mon, S

Re: Bulk-load to HBase

2014-09-22 Thread Sean Owen
On Mon, Sep 22, 2014 at 10:21 AM, innowireless TaeYun Kim wrote: > I have to merge the byte[]s that have the same key. > If merging is done with reduceByKey(), a lot of intermediate byte[] > allocation and System.arraycopy() is executed, and it is too slow. So I had > to resort to groupByKey(),
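[Editor's note] The copying cost under discussion can be shown without Spark or HBase at all. A minimal plain-Scala sketch of the two merge strategies: pairwise merging (what `reduceByKey` applies) copies earlier bytes over and over, while a single pass over a grouped collection (what follows `groupByKey`) copies each byte once.

```scala
import java.io.ByteArrayOutputStream

object ByteMerge {
  // Pairwise merge as reduceByKey would apply it: allocates a new array
  // and copies both inputs on every invocation, so bytes merged early
  // get re-copied at every subsequent step.
  def mergePair(a: Array[Byte], b: Array[Byte]): Array[Byte] = {
    val out = new Array[Byte](a.length + b.length)
    System.arraycopy(a, 0, out, 0, a.length)
    System.arraycopy(b, 0, out, a.length, b.length)
    out
  }

  // One-pass merge over the whole group, as after groupByKey: each input
  // byte is written exactly once into a growable buffer.
  def mergeAll(chunks: Iterable[Array[Byte]]): Array[Byte] = {
    val buf = new ByteArrayOutputStream()
    chunks.foreach(c => buf.write(c))
    buf.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val chunks = Seq(Array[Byte](1, 2), Array[Byte](3), Array[Byte](4, 5))
    // Both strategies produce the same bytes; only the copy count differs.
    println(chunks.reduce(mergePair).toSeq == mergeAll(chunks).toSeq)
  }
}
```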

RE: Bulk-load to HBase

2014-09-22 Thread innowireless TaeYun Kim
. And in fact it actually worked well when I implemented the same process with HBase Put class. So, I assume that it is not the problem. WithIndex is for excluding the record for the first partition. I could remove the record after collect()and sort(), but it was easier. I think that the problem is

Re: Bulk-load to HBase

2014-09-22 Thread Sean Owen
, the number of regions is fairly small for the RDD, and the size > of a region is big. > This is intentional since the reasonable size of a HBase region is several GB. > But, for Spark, it is too big for a partition that can be handled for an > executor. > I thought mapPartitionsWith

RE: Bulk-load to HBase

2014-09-22 Thread innowireless TaeYun Kim
first record of each partitions. // First partition's first record is excluded, since it's not needed. .collect(); Collections.sort(splitKeys); // Now we have the split keys // Create a HBase table createHBaseTableWithSplitKeys(tableName, splitKey
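[Editor's note] The split-key step described above (collect the first key of each partition, sort, drop the first) is pure logic and can be sketched independently of Spark and HBase. N split keys define N+1 regions, and the first region implicitly starts at the beginning of the keyspace, which is why the first partition's first record is not needed.

```scala
object SplitKeys {
  // Given the first key of each would-be partition, sort and drop the
  // lowest: HBase derives region boundaries from the remaining keys.
  def toSplitKeys(firstKeysPerPartition: Seq[String]): Seq[String] =
    firstKeysPerPartition.sorted.drop(1)

  def main(args: Array[String]): Unit = {
    val firstKeys = Seq("t", "a", "m", "g") // unsorted, as collect() returns them
    // Three split keys -> four regions, matching the four partitions.
    println(toSplitKeys(firstKeys))
  }
}
```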

Re: Bulk-load to HBase

2014-09-19 Thread Soumitra Kumar
new HTable(conf, "output") HFileOutputFormat.configureIncrementalLoad (job, table); saveAsNewAPIHadoopFile("hdfs://localhost.localdomain:8020/user/cloudera/spark", classOf[ImmutableBytesWritable], classOf[Put], classOf[HFileOutputF

Re: Bulk-load to HBase

2014-09-19 Thread Ted Yu
lot of data to be uploaded to HBase and also, I didn't want to > take the pain of importing generated HFiles into HBase. Is there a way to > invoke HBase HFile import batch script programmatically? > > On 19 September 2014 17:58, innowireless TaeYun Kim < > taeyun@innowirel

Re: Bulk-load to HBase

2014-09-19 Thread Aniket Bhatnagar
Agreed that the bulk import would be faster. In my case, I wasn't expecting a lot of data to be uploaded to HBase and also, I didn't want to take the pain of importing generated HFiles into HBase. Is there a way to invoke HBase HFile import batch script programmatically? On 19 Septemb
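[Editor's note] On the programmatic-import question: the `completebulkload` shell tool is backed by the `LoadIncrementalHFiles` class, so it can be invoked in-process. A non-runnable sketch (assumes HBase 0.98 jars and a live cluster; the HDFS path and table name are placeholders):

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles

object BulkLoad {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()       // picks up hbase-site.xml
    val table = new HTable(conf, "mytable")
    // Moves the HFiles previously written by HFileOutputFormat into the
    // table's regions, same as the completebulkload command would.
    new LoadIncrementalHFiles(conf).doBulkLoad(
      new Path("hdfs:///user/example/hfiles"), table)
    table.close()
  }
}
```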

RE: Bulk-load to HBase

2014-09-19 Thread innowireless TaeYun Kim
: innowireless TaeYun Kim [mailto:taeyun@innowireless.co.kr] Sent: Friday, September 19, 2014 9:20 PM To: user@spark.apache.org Subject: RE: Bulk-load to HBase Thank you for the example code. Currently I use foreachPartition() + Put(), but your example code can be used to clean up my code

RE: Bulk-load to HBase

2014-09-19 Thread innowireless TaeYun Kim
Thank you for the example code. Currently I use foreachPartition() + Put(), but your example code can be used to clean up my code. BTW, since the data uploaded by Put() goes through normal HBase write path, it can be slow. So, it would be nice if bulk-load could be used, since it

Re: Bulk-load to HBase

2014-09-19 Thread Aniket Bhatnagar
t found saveAsNewAPIHadoopDataset. > > Then, Can I use HFileOutputFormat with saveAsNewAPIHadoopDataset? Is there > any example code for that? > > > > Thanks. > > > > *From:* innowireless TaeYun Kim [mailto:taeyun@innowireless.co.kr] > *Sent:* Friday, September 19,

RE: Bulk-load to HBase

2014-09-19 Thread innowireless TaeYun Kim
@spark.apache.org Subject: RE: Bulk-load to HBase Hi, After reading several documents, it seems that saveAsHadoopDataset cannot use HFileOutputFormat. It's because saveAsHadoopDataset method uses JobConf, so it belongs to the old Hadoop API, while HFileOutputFormat is a member of mapr

RE: Bulk-load to HBase

2014-09-19 Thread innowireless TaeYun Kim
Am I right? If so, is there another method to bulk-load to HBase from RDD? Thanks. From: innowireless TaeYun Kim [mailto:taeyun@innowireless.co.kr] Sent: Friday, September 19, 2014 7:17 PM To: user@spark.apache.org Subject: Bulk-load to HBase Hi, Is there a way to bulk-load to

Bulk-load to HBase

2014-09-19 Thread innowireless TaeYun Kim
Hi, Is there a way to bulk-load to HBase from RDD? HBase offers HFileOutputFormat class for bulk loading by MapReduce job, but I cannot figure out how to use it with saveAsHadoopDataset. Thanks.

Re: HBase 0.96+ with Spark 1.0+

2014-09-18 Thread Ted Yu
On 14.09.2014 19:21, Reinis Vicups wrote: >> I did actually try Seans suggestion just before I posted for the first time >> in this thread. I got an error when doing this and thought that I am not >> understanding what Sean was suggesting. >> >> Now I re-attempted yo

Re: HBase 0.96+ with Spark 1.0+

2014-09-18 Thread Reinis Vicups
m not understanding what Sean was suggesting. Now I re-attempted your suggestions with spark 1.0.0-cdh5.1.0, hbase 0.98.1-cdh5.1.0 and hadoop 2.3.0-cdh5.1.0 I am currently using. I used following: val mortbayEnforce = "org.mortbay.jetty" % "servlet-api" % "3.0.20
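[Editor's note] A hedged sbt sketch of the exclusion approach being described (coordinates and versions below are illustrative, not the exact ones from this thread; the truncated `mortbayEnforce` version is deliberately not reconstructed):

```scala
// build.sbt sketch: pin a single servlet-api and strip the conflicting
// org.mortbay.jetty copy that HBase's server/test jars pull in, which is
// one way to avoid the "signer information does not match" clash.
libraryDependencies ++= Seq(
  ("org.apache.hbase" % "hbase-server" % "0.98.1-hadoop2" % "test")
    .exclude("org.mortbay.jetty", "servlet-api-2.5"),
  ("org.apache.spark" %% "spark-core" % "1.0.0")
    .exclude("org.mortbay.jetty", "servlet-api")
)
```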

RE: HBase and non-existent TableInputFormat

2014-09-16 Thread abraham.jacob
Yes that was very helpful… ☺ Here are a few more I found on my quest to get HBase working with Spark – This one details about Hbase dependencies and spark classpaths http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html This one has a code overview – http://www.abcn.net/2014

Re: HBase and non-existent TableInputFormat

2014-09-16 Thread Nicholas Chammas
Btw, there are some examples in the Spark GitHub repo that you may find helpful. Here's one <https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala> related to HBase. On Tue, Sep 16, 2014 at 1:22 PM, wrote: > *Hi, * &g

RE: HBase and non-existent TableInputFormat

2014-09-16 Thread abraham.jacob
Hi, I had a similar situation in which I needed to read data from HBase and work with the data inside of a spark context. After much googling, I finally got mine to work. There are a bunch of steps that you need to do to get this working - The problem is that the spark context does not know
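[Editor's note] A common classpath recipe from that era, sketched with placeholder names (the exact jars depend on the distribution; `hbase classpath` emits the distribution's own list):

```shell
# Make the HBase client jars and hbase-site.xml visible to both the driver
# and the executors. Class and jar names are placeholders.
spark-submit \
  --driver-class-path "$(hbase classpath)" \
  --conf spark.executor.extraClassPath="$(hbase classpath)" \
  --class com.example.HBaseRead \
  my-app.jar
```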

Re: HBase and non-existent TableInputFormat

2014-09-16 Thread Ted Yu
hbase-client module serves client facing APIs. hbase-server module is supposed to host classes used on server side. There is still some work to be done so that the above goal is achieved. On Tue, Sep 16, 2014 at 9:06 AM, Y. Dong wrote: > Thanks Ted. It is indeed in hbase-server. Just curi

Re: HBase and non-existent TableInputFormat

2014-09-16 Thread Ted Yu
bq. TableInputFormat does not even exist in hbase-client API It is in hbase-server module. Take a look at http://hbase.apache.org/book.html#mapreduce.example.read On Tue, Sep 16, 2014 at 8:18 AM, Y. Dong wrote: > Hello, > > I’m currently using spark-core 1.1 and hbase 0.98.5 and
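[Editor's note] For Maven users, illustrative coordinates matching Ted's answer (the version mirrors the 0.98.5 mentioned in the question; verify against your distribution):

```xml
<!-- TableInputFormat lives in the hbase-server artifact in 0.98.x,
     not in hbase-client. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-server</artifactId>
  <version>0.98.5-hadoop2</version>
</dependency>
```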

HBase and non-existent TableInputFormat

2014-09-16 Thread Y. Dong
Hello, I’m currently using spark-core 1.1 and hbase 0.98.5 and I want to simply read from hbase. The Java code is attached. However the problem is TableInputFormat does not even exist in hbase-client API, is there any other way I can read from hbase? Thanks SparkConf sconf = new SparkConf

Re: HBase 0.96+ with Spark 1.0+

2014-09-14 Thread Reinis Vicups
I did actually try Seans suggestion just before I posted for the first time in this thread. I got an error when doing this and thought that I am not understanding what Sean was suggesting. Now I re-attempted your suggestions with spark 1.0.0-cdh5.1.0, hbase 0.98.1-cdh5.1.0 and hadoop 2.3.0

Re: object hbase is not a member of package org.apache.hadoop

2014-09-14 Thread Ted Yu
sparkConf = new SparkConf().setAppName("HBaseTest") > | val sc = new SparkContext(sparkConf) > | val conf = HBaseConfiguration.create() > | // Other options for configuring scan behavior are available. More > information available at > | // > http://

Re: object hbase is not a member of package org.apache.hadoop

2014-09-14 Thread arthur.hk.c...@gmail.com
hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html | conf.set(TableInputFormat.INPUT_TABLE, args(0)) | // Initialize hBase table if necessary | val admin = new HBaseAdmin(conf) | if (!admin.isTableAvailable(args(0))) { | val tableDesc =

Re: object hbase is not a member of package org.apache.hadoop

2014-09-14 Thread Ted Yu
building-with-maven.md >> |+++ docs/building-with-maven.md >> -- >> File to patch: >> >> >> >> >> >> >> Please advise. >> Regards >> Arthur >> >> >> >> On 14 Sep, 2014, at 10:48 p

Re: object hbase is not a member of package org.apache.hadoop

2014-09-14 Thread arthur.hk.c...@gmail.com
;> |--- docs/building-with-maven.md >> |+++ docs/building-with-maven.md >> -- >> File to patch: >> >> >> >> >> >> >> Please advise. >> Regards >> Arthur >> >> >> >> On

Re: object hbase is not a member of package org.apache.hadoop

2014-09-14 Thread arthur.hk.c...@gmail.com
patch: > > > > > > > Please advise. > Regards > Arthur > > > > On 14 Sep, 2014, at 10:48 pm, Ted Yu wrote: > >> Spark examples builds against hbase 0.94 by default. >> >> If you want to run against 0.98, see: >> SPARK-1297

Re: object hbase is not a member of package org.apache.hadoop

2014-09-14 Thread Ted Yu
t; > > Please advise. > Regards > Arthur > > > > On 14 Sep, 2014, at 10:48 pm, Ted Yu wrote: > > Spark examples builds against hbase 0.94 by default. > > If you want to run against 0.98, see: > SPARK-1297 https://issues.apache.org/jira/browse/SPARK-1297 > &

Re: object hbase is not a member of package org.apache.hadoop

2014-09-14 Thread arthur.hk.c...@gmail.com
\ +To build against HBase 0.98.x releases, "hbase-hadoop1" is the default profile. This means hbase-0.98.x-hadoop1 would be used.\ +When building against hadoop-2, "hbase-hadoop2" profile should be specified.\ +\ Examples:\ \ \{% highlight bash %\}\ diff --git examples/pom.xm

Re: object hbase is not a member of package org.apache.hadoop

2014-09-14 Thread Ted Yu
Spark examples builds against hbase 0.94 by default. If you want to run against 0.98, see: SPARK-1297 https://issues.apache.org/jira/browse/SPARK-1297 Cheers On Sun, Sep 14, 2014 at 7:36 AM, arthur.hk.c...@gmail.com < arthur.hk.c...@gmail.com> wrote: > Hi, > > I have

object hbase is not a member of package org.apache.hadoop

2014-09-14 Thread arthur.hk.c...@gmail.com
import org.apache.hadoop.hbase.mapreduce.TableInputFormat :31: error: object hbase is not a member of package org.apache.hadoop import org.apache.hadoop.hbase.mapreduce.TableInputFormat Regards Arthur

Re: Re[2]: HBase 0.96+ with Spark 1.0+

2014-09-12 Thread Aniket Bhatnagar
some other jar, you would have to exclude it from your build. Hope it helps. Thanks, Aniket On 12 September 2014 02:21, wrote: > Thank you, Aniket for your hint! > > Alas, I am facing really "hellish" situation as it seems, because I have > integration tests using BOTH

Re: Re[2]: HBase 0.96+ with Spark 1.0+

2014-09-11 Thread Sean Owen
This was already answered at the bottom of this same thread -- read below. On Thu, Sep 11, 2014 at 9:51 PM, wrote: > class "javax.servlet.ServletRegistration"'s signer information does not > match signer information of other classes in the same package > java.lang.SecurityException: class "javax

Re[2]: HBase 0.96+ with Spark 1.0+

2014-09-11 Thread spark
Thank you, Aniket for your hint! Alas, I am facing really "hellish" situation as it seems, because I have integration tests using BOTH spark and HBase (Minicluster). Thus I get either: class "javax.servlet.ServletRegistration"'s signer information does not match si

Re: Re[2]: HBase 0.96+ with Spark 1.0+

2014-09-11 Thread Aniket Bhatnagar
Dependency hell... My fav problem :). I had run into a similar issue with hbase and jetty. I cant remember thw exact fix, but is are excerpts from my dependencies that may be relevant: val hadoop2Common = "org.apache.hadoop" % "hadoop-common" % hadoo

Re[2]: HBase 0.96+ with Spark 1.0+

2014-09-11 Thread spark
Hi guys, any luck with this issue, anyone? I aswell tried all the possible exclusion combos to a no avail. thanks for your ideas reinis -Original-Nachricht- > Von: "Stephen Boesch" > An: user > Datum: 28-06-2014 15:12 > Betreff: Re: HBase 0.96+ with Spar

Re: Spark Streaming into HBase

2014-09-05 Thread Tathagata Das
kka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) >>> >>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) >>> >>> at >>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(Fo

Re: Spark Streaming into HBase

2014-09-04 Thread kpeng1
sk(ForkJoinPool.java:1339) >> >> at >> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) >> >> at >> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) >> >> >> Basically, what I am doing on

spark streaming - saving DStream into HBASE doesn't work

2014-09-04 Thread salemi
Hi, I am using the following code to write data to hbase. I see the jobs are send off but I never get anything in my hbase database. Spark doesn't throw any error? How can such a problem be debugged. Is the code below correct for writing data to hbase? val conf = HBaseConfiguration.c
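[Editor's note] One way to structure such a write, as a non-runnable sketch (assumes Spark Streaming and HBase 0.98 on the classpath; table, family, and qualifier names are placeholders). Opening the `HTable` inside `foreachPartition` keeps the non-serializable client off the driver, a frequent cause of silent failures in this pattern:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.streaming.dstream.DStream

object HBaseSink {
  def save(words: DStream[String]): Unit = {
    words.foreachRDD { rdd =>
      rdd.foreachPartition { part =>
        // Created per partition, on the executor -- never serialized.
        val conf = HBaseConfiguration.create()   // reads hbase-site.xml
        val table = new HTable(conf, "stream_words")
        part.foreach { w =>
          val put = new Put(Bytes.toBytes(w))
          put.add(Bytes.toBytes("cf"), Bytes.toBytes("word"), Bytes.toBytes(w))
          table.put(put)
        }
        table.flushCommits()                     // push buffered Puts
        table.close()
      }
    }
  }
}
```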

Re: Spark Streaming into HBase

2014-09-03 Thread Tathagata Das
run nc -lk, I type in > words separated by commas and hit enter i.e. "bill,ted". > > > On Wed, Sep 3, 2014 at 2:36 PM, Ted Yu wrote: > >> Adding back user@ >> >> I am not familiar with the NotSerializableException. Can you show the >> full stack trace ? >> >>

Re: Spark Streaming into HBase

2014-09-03 Thread Kevin Peng
hit enter i.e. "bill,ted". On Wed, Sep 3, 2014 at 2:36 PM, Ted Yu wrote: > Adding back user@ > > I am not familiar with the NotSerializableException. Can you show the > full stack trace ? > > See SPARK-1297 for changes you need to make so that Spark works with >

Re: Spark Streaming into HBase

2014-09-03 Thread kpeng1
approach I should be using? Thanks for the help. On Wed, Sep 3, 2014 at 2:43 PM, Sean Owen-2 [via Apache Spark User List] < ml-node+s1001560n13385...@n3.nabble.com> wrote: > This doesn't seem to have to do with HBase per se. Some function is > getting the StreamingContext into the

Re: Spark Streaming into HBase

2014-09-03 Thread Sean Owen
This doesn't seem to have to do with HBase per se. Some function is getting the StreamingContext into the closure and that won't work. Is this exactly the code? since it doesn't reference a StreamingContext, but is there maybe a different version in reality that tries to use S
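[Editor's note] The capture problem Sean describes can be reproduced with plain JVM serialization, no Spark required. In the sketch below the `Context` class is a stand-in for a StreamingContext: any closure that references it drags it into the serialized task, while copying the needed value into a local first keeps the closure serializable.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureCapture {
  // Stand-in for a StreamingContext: deliberately NOT Serializable.
  class Context { val tableName = "demo" }

  def serializable(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch { case _: NotSerializableException => false }

  def main(args: Array[String]): Unit = {
    val ctx = new Context
    // Bad: the closure captures ctx, so serializing it fails -- the same
    // failure Spark reports when a closure references the StreamingContext.
    val bad: String => String = s => s + ctx.tableName
    // Good: only the String is captured.
    val name = ctx.tableName
    val good: String => String = s => s + name
    println((serializable(bad), serializable(good)))
  }
}
```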

Re: Spark Streaming into HBase

2014-09-03 Thread Ted Yu
Adding back user@ I am not familiar with the NotSerializableException. Can you show the full stack trace ? See SPARK-1297 for changes you need to make so that Spark works with hbase 0.98 Cheers On Wed, Sep 3, 2014 at 2:33 PM, Kevin Peng wrote: > Ted, > > The hbase-site.xml

Spark Streaming into HBase

2014-09-03 Thread kpeng1
I have been trying to understand how spark streaming and hbase connect, but have not been successful. What I am trying to do is given a spark stream, process that stream and store the results in an hbase table. So far this is what I have: import org.apache.spark.SparkConf import

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-28 Thread Ted Yu
t;> Hi, >> >> tried >> mvn -Phbase-hadoop2,hadoop-2.4,yarn -Dhadoop.version=2.4.1 -DskipTests >> dependency:tree > dep.txt >> >> Attached the dep. txt for your information. >> >> >> Regards >> Arthur >> >> On 28 Aug, 2014

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-28 Thread Sean Owen
"0.98.2" is not an HBase version, but "0.98.2-hadoop2" is: http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.apache.hbase%22%20AND%20a%3A%22hbase%22 On Thu, Aug 28, 2014 at 2:54 AM, arthur.hk.c...@gmail.com < arthur.hk.c...@gmail.com> wrote: > Hi, > >

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-28 Thread arthur.hk.c...@gmail.com
> dep.txt > > Attached the dep. txt for your information. > > > Regards > Arthur > > On 28 Aug, 2014, at 12:22 pm, Ted Yu wrote: > >> I forgot to include '-Dhadoop.version=2.4.1' in the command below. >> >> The modified command passe

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-28 Thread Ted Yu
Attached the dep. txt for your information. > > > Regards > Arthur > > On 28 Aug, 2014, at 12:22 pm, Ted Yu wrote: > > I forgot to include '-Dhadoop.version=2.4.1' in the command below. > > The modified command passed. > > You can verify the dependence

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-28 Thread arthur.hk.c...@gmail.com
rsion=2.4.1' in the command below.The modified command passed.You can verify the dependence on hbase 0.98 through this command: mvn -Phbase-hadoop2,hadoop-2.4,yarn -Dhadoop.version=2.4.1 -DskipTests dependency:tree > dep.txtCheersOn Wed, Aug 27, 2014 at 8:58 PM, Ted Yu <yuzhih...@gmail.com>

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread Ted Yu
I forgot to include '-Dhadoop.version=2.4.1' in the command below. The modified command passed. You can verify the dependence on hbase 0.98 through this command: mvn -Phbase-hadoop2,hadoop-2.4,yarn -Dhadoop.version=2.4.1 -DskipTests dependency:tree > dep.txt Cheers On Wed, Aug

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread Ted Yu
e pom.xml.rej >> patching file pom.xml >> Hunk #1 FAILED at 54. >> Hunk #2 FAILED at 72. >> Hunk #3 FAILED at 171. >> 3 out of 3 hunks FAILED -- saving rejects to file pom.xml.rej >> can't find file to patch at input line 267 >> Perhaps you should hav

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread arthur.hk.c...@gmail.com
s was: > -- > | > |From cd58437897bf02b644c2171404ccffae5d12a2be Mon Sep 17 00:00:00 2001 > |From: tedyu > |Date: Mon, 11 Aug 2014 15:57:46 -0700 > |Subject: [PATCH 3/4] SPARK-1297 Upgrade HBase dependency to 0.98 - add > | description to building-with-

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread Ted Yu
aving rejects to file pom.xml.rej > can't find file to patch at input line 267 > Perhaps you should have used the -p or --strip option? > The text leading up to this was: > -- > | > |From cd58437897bf02b644c2171404ccffae5d12a2be Mon Sep 17 00:00:00 2001 >

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread arthur.hk.c...@gmail.com
m: tedyu |Date: Mon, 11 Aug 2014 15:57:46 -0700 |Subject: [PATCH 3/4] SPARK-1297 Upgrade HBase dependency to 0.98 - add | description to building-with-maven.md | |--- | docs/building-with-maven.md | 3 +++ | 1 file changed, 3 insertions(+) | |diff --git a/docs/building-with-maven.md b/docs/build

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread Ted Yu
: > https://github.com/apache/spark/pull/1893 > > > On Wed, Aug 27, 2014 at 6:57 PM, arthur.hk.c...@gmail.com < > arthur.hk.c...@gmail.com> wrote: > >> (correction: "Compilation Error: Spark 1.0.2 with HBase 0.98” , please >> ignore if duplicated) >> &g

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread arthur.hk.c...@gmail.com
e/spark/pull/1893 > > > On Wed, Aug 27, 2014 at 6:57 PM, arthur.hk.c...@gmail.com > wrote: > (correction: "Compilation Error: Spark 1.0.2 with HBase 0.98” , please > ignore if duplicated) > > > Hi, > > I need to use Spark with HBase 0.98 and tried

Re: Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread Ted Yu
See SPARK-1297 The pull request is here: https://github.com/apache/spark/pull/1893 On Wed, Aug 27, 2014 at 6:57 PM, arthur.hk.c...@gmail.com < arthur.hk.c...@gmail.com> wrote: > (correction: "Compilation Error: Spark 1.0.2 with HBase 0.98” , please > ignore if duplicated) >

Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread arthur.hk.c...@gmail.com
(correction: "Compilation Error: Spark 1.0.2 with HBase 0.98” , please ignore if duplicated) Hi, I need to use Spark with HBase 0.98 and tried to compile Spark 1.0.2 with HBase 0.98, My steps: wget http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2.tgz tar -vxf spark-1.0.2.tgz cd

Compilation Error: Spark 1.0.2 with HBase 0.98

2014-08-27 Thread arthur.hk.c...@gmail.com
Hi, I need to use Spark with HBase 0.98 and tried to compile Spark 1.0.2 with HBase 0.98, My steps: wget http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2.tgz tar -vxf spark-1.0.2.tgz cd spark-1.0.2 edit project/SparkBuild.scala, set HBASE_VERSION // HBase version; set as appropriate. val

Re: Issue Connecting to HBase in spark shell

2014-08-27 Thread kpeng1
It looks like the issue I had is that I didn't pull in htrace-core jar into the spark class path. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Issue-Connecting-to-HBase-in-spark-shell-tp12855p12924.html Sent from the Apache Spark User List mailing

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-19 Thread Cesar Arevalo
gt;> But, do you know how I am supposed to set that table name on the jobConf? >> I >> don't have access to that object from my client driver? >> >> >> >> -- >> View this message in context: >> http://apache-spark-user-list.1001560.n3.nabble.

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-19 Thread Yin Huai
that object from my client driver? > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/NullPointerException-when-connecting-from-Spark-to-a-Hive-table-backed-by-HBase-tp12284p12331.html > Sent from the Apache Spark User L

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread cesararevalo
-from-Spark-to-a-Hive-table-backed-by-HBase-tp12284p12331.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: u

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Zhan Zhang
Looks like hbaseTableName is null, probably caused by incorrect configuration. String hbaseTableName = jobConf.get(HBaseSerDe.HBASE_TABLE_NAME); setHTable(new HTable(HBaseConfiguration.create(jobConf), Bytes.toBytes(hbaseTableName))); Here is the definition. public static final Strin
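[Editor's note] For context on Zhan's diagnosis: with the Hive HBase storage handler, the `hbase.table.name` value that ends up in the jobConf normally comes from the Hive table definition. An illustrative DDL (table and column names are placeholders):

```sql
-- The storage handler records "hbase.table.name" in TBLPROPERTIES, which
-- is what the input format later reads from the jobConf.
CREATE EXTERNAL TABLE dataset_records(key STRING, value STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:value")
TBLPROPERTIES ("hbase.table.name" = "dataset_records");
```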

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Cesar Arevalo
014 at 12:00 AM, Akhil Das >> wrote: >> >>> Looks like your hiveContext is null. Have a look at this documentation. >>> <https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables> >>> >>> Thanks >>> Best Regards >>

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Akhil Das
egards >> >> >> On Mon, Aug 18, 2014 at 12:09 PM, Cesar Arevalo < >> ce...@zephyrhealthinc.com> wrote: >> >>> Hello: >>> >>> I am trying to setup Spark to connect to a Hive table which is backed by >>> HBase, but I am running

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Cesar Arevalo
PM, Cesar Arevalo > wrote: > >> Hello: >> >> I am trying to setup Spark to connect to a Hive table which is backed by >> HBase, but I am running into the following NullPointerException: >> >> scala> val hiveCount = hiveContext.sql("select count(*) f

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Akhil Das
to a Hive table which is backed by > HBase, but I am running into the following NullPointerException: > > scala> val hiveCount = hiveContext.sql("select count(*) from > dataset_records").collect().head.getLong(0) > 14/08/18 06:34:29 INFO ParseDriver: Parsing command: select

NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-17 Thread Cesar Arevalo
Hello: I am trying to setup Spark to connect to a Hive table which is backed by HBase, but I am running into the following NullPointerException: scala> val hiveCount = hiveContext.sql("select count(*) from dataset_records").collect().head.getLong(0) 14/08/18 06:34:29 INFO ParseDr

Re: Spark Hbase job taking long time

2014-08-12 Thread Amit Singh Hora
ad of >> TableInputFormat.class. >> >> Cheers >> >> >> On Wed, Aug 6, 2014 at 5:54 AM, Amit Singh Hora <[hidden email] >> <http://user/SendEmail.jtp?type=node&node=11651&i=1>> wrote: >> >>> Hi All, >>> >>>

Re: Spark Streaming- Input from Kafka, output to HBase

2014-08-07 Thread JiajiaJing
erything worked. Best Regards, Jiajia -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Input-from-Kafka-output-to-HBase-tp8686p11732.html Sent from the Apache Spark User List mailing list archiv

Re: Spark Streaming- Input from Kafka, output to HBase

2014-08-07 Thread Tathagata Das
per? did > Kafka and HBase share the same zookeeper and port? If not, did u set a > right config for Hbase job? -- Khanderao > > > On Wed, Jul 2, 2014 at 4:12 PM, JiajiaJing wrote: > >> Hi, >> >> I am trying to write a program that take input from kafka topics, do

Re: Spark Hbase job taking long time

2014-08-07 Thread Ted Yu
es classOf[TableInputFormat] instead of > TableInputFormat.class. > > Cheers > > > On Wed, Aug 6, 2014 at 5:54 AM, Amit Singh Hora > wrote: > >> Hi All, >> >> I am trying to run a SQL query on HBase using spark job ,till now i am >> able >> to get t

Re: Spark with HBase

2014-08-07 Thread chutium
this two posts should be good for setting up spark+hbase environment and use the results of hbase table scan as RDD settings http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html some samples: http://www.abcn.net/2014/07/spark-hbase-result-keyvalue-bytearray.html -- View

Re: Spark with HBase

2014-08-07 Thread Akhil Das
( the version is quiet old) Attached is a piece of Code (Spark Java API) to connect to HBase. Thanks Best Regards On Thu, Aug 7, 2014 at 1:48 PM, Deepa Jayaveer wrote: > Hi > I read your white paper about " " . We wanted to do a Proof of Concept on > Spark with HBase.

Re: Spark Streaming- Input from Kafka, output to HBase

2014-08-07 Thread Khanderao Kand
I hope this has been resolved, were u connected to right zookeeper? did Kafka and HBase share the same zookeeper and port? If not, did u set a right config for Hbase job? -- Khanderao On Wed, Jul 2, 2014 at 4:12 PM, JiajiaJing wrote: > Hi, > > I am trying to write a program that t

Spark with HBase

2014-08-07 Thread Deepa Jayaveer
Hi I read your white paper about " " . We wanted to do a Proof of Concept on Spark with HBase. Documents are not much available to set up the spark cluster in Hadoop 2 environment. If you have any, can you please give us some reference URLs Also, some sample program to connect to H

Spark Hbase job taking long time

2014-08-06 Thread Amit Singh Hora
Hi All, I am trying to run a SQL query on HBase using spark job ,till now i am able to get the desierd results but as the data set size increases Spark job is taking a long time I believe i am doing something wrong,as after going through documentation and videos discussing on spark performance

Re: Hbase

2014-08-01 Thread Madabhattula Rajesh Kumar
n.Seq; > > import scala.collection.JavaConverters.*; > > import scala.reflect.ClassTag; > > public class SparkHBaseMain { > > @SuppressWarnings("deprecation") > > public static void main(String[] arg){ > > try{ > > List jars = >>

Re: Hbase

2014-08-01 Thread Akhil Das
ot;, "/home/akhld/Downloads/sparkhbasecode/hbase-server-0.96.0-hadoop2.jar", "/home/akhld/Downloads/sparkhbasecode/hbase-protocol-0.96.0-hadoop2.jar", "/home/akhld/Downloads/sparkhbasecode/hbase-hadoop2-compat-0.96.0-hadoop2.jar", "/home/akhld/Downloads/sparkhbas

Re: Hbase

2014-08-01 Thread Madabhattula Rajesh Kumar
;post".getBytes(), >> "title".getBytes());* >> for(KeyValue kl:kvl){ >> String sb = new String(kl.getValue()); >> System.out.println(sb); >> } > > > > > Thanks > Best Regards > > > On Thu, Jul 31, 2014 at 10:19 PM, Madabhattula

Re: Hbase

2014-07-31 Thread Akhil Das
tle".getBytes());* > for(KeyValue kl:kvl){ > String sb = new String(kl.getValue()); > System.out.println(sb); > } Thanks Best Regards On Thu, Jul 31, 2014 at 10:19 PM, Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wrote: > Hi Team, > > I&#

Hbase

2014-07-31 Thread Madabhattula Rajesh Kumar
Hi Team, I'm using below code to read table from hbase Configuration conf = HBaseConfiguration.create(); conf.set(TableInputFormat.INPUT_TABLE, "table1"); JavaPairRDD hBaseRDD = sc.newAPIHadoopRDD( conf, TableInputFormat.class, ImmutableBytesWritable.class,

javasparksql Hbase

2014-07-28 Thread Madabhattula Rajesh Kumar
Hi Team, Could you please let me know example program/link for JavaSparkSql to join 2 Hbase tables. Regards, Rajesh

Use Spark with HBase' HFileOutputFormat

2014-07-16 Thread Jianshi Huang
Hi, I want to use Spark with HBase and I'm confused about how to ingest my data using HBase' HFileOutputFormat. It recommends calling configureIncrementalLoad which does the following: - Inspects the table to configure a total order partitioner - Uploads the partitions file to t

Re: Need help on spark Hbase

2014-07-16 Thread Jerry Lam
Hi Rajesh, I saw : Warning: Local jar /home/rajesh/hbase-0.96.1.1-hadoop2/lib/hbase -client-0.96.1.1-hadoop2.jar, does not exist, skipping. in your log. I believe this jar contains the HBaseConfiguration. I'm not sure what went wrong in your case but can you try without spaces in --jars

Re: Need help on spark Hbase

2014-07-16 Thread Madabhattula Rajesh Kumar
Hi Team, Now i've changed my code and reading configuration from hbase-site.xml file(this file is in classpath). When i run this program using : mvn exec:java -Dexec.mainClass="com.cisco.ana.accessavailability.AccessAvailability". It is working fine. But when i run this program fr
