Hi Philip,
I got this bit of code to work in the spark-shell using Scala against our dev HBase cluster.
-bash-4.1$ export SPARK_CLASSPATH=$SPARK_CLASSPATH:/opt/cloudera/parcels/CDH/lib/hbase/hbase.jar:/opt/cloudera/parcels/CDH/lib/hbase/conf:/opt/cloudera/parcels/CDH/lib/hadoop/conf
-bash-4.1$ ./spark-shell

scala> import org.apache.hadoop.hbase.HBaseConfiguration
scala> import org.apache.hadoop.hbase.client._
scala> import org.apache.hadoop.hbase.util.Bytes

scala> val conf = HBaseConfiguration.create()
scala> val table = new HTable(conf, "my_items")

scala> val p = new Put(Bytes.toBytes("strawberry-fruit"))
scala> p.add(Bytes.toBytes("item"), Bytes.toBytes("item"), Bytes.toBytes("strawberry"))
scala> p.add(Bytes.toBytes("item"), Bytes.toBytes("category"), Bytes.toBytes("fruit"))
scala> p.add(Bytes.toBytes("item"), Bytes.toBytes("price"), Bytes.toBytes("0.35"))
scala> table.put(p)
It put the new row "strawberry-fruit" into an HBase table.
Sorry, but I have another newbie question: how do I add those CLASSPATH dependencies when compiling a streaming jar with sbt, so that the HBase configs are picked up automatically?
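(For reference, my best guess at the build.sbt is below. All of the version numbers and the repository URL are placeholders I haven't verified against what our CDH parcels actually ship.)

// build.sbt -- versions below are guesses/placeholders; they need to
// match whatever the CDH parcels actually provide.
name := "streaming-hbase-job"

scalaVersion := "2.9.3"

resolvers += "Cloudera repo" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming" % "0.8.1-incubating",
  "org.apache.hadoop" % "hadoop-client" % "2.0.0-cdh4.5.0",
  "org.apache.hbase" % "hbase" % "0.94.6-cdh4.5.0"
)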
Thanks,
Ben
Date: Thu, 5 Dec 2013 10:24:02 -0700
From: philip.og...@oracle.com
To: user@spark.incubator.apache.org
Subject: Re: Writing to HBase
Here's a good place to start:
http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3ccacyzca3askwd-tujhqi1805bn7sctguaoruhd5xtxcsul1a...@mail.gmail.com%3E
On 12/5/2013 10:18 AM, Benjamin Kim wrote:
Does anyone have an example or some sort of starting-point code for writing from Spark Streaming into HBase?

We currently stream ad server event log data using Flume-NG to tail log entries, collect them, and put them directly into an HBase table. We would like to do the same with Spark Streaming, but with the data massaging and simple data analysis done beforehand. This would cut down the steps in prepping data and the number of tables for our data scientists and real-time feedback systems.
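Very roughly, the shape we have in mind is something like the untested sketch below. The socket source, the tab-separated field layout, the toy "my_items" table, and the batch interval are all made up for illustration, and method names may differ a bit between Spark/HBase versions.

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamToHBase {
  def main(args: Array[String]) {
    // 10-second micro-batches; master and app name are placeholders.
    val ssc = new StreamingContext("local[2]", "StreamToHBase", Seconds(10))
    val lines = ssc.socketTextStream("localhost", 9999)

    lines.foreachRDD { rdd =>
      rdd.foreachPartition { events =>
        // HTable is not serializable, so it has to be created on the
        // workers (once per partition), not on the driver.
        val conf = HBaseConfiguration.create()
        val table = new HTable(conf, "my_items")
        events.foreach { line =>
          // Made-up massaging step: "strawberry<TAB>fruit<TAB>0.35"
          // becomes row key "strawberry-fruit" plus three columns.
          val fields = line.split("\t")
          val p = new Put(Bytes.toBytes(fields(0) + "-" + fields(1)))
          p.add(Bytes.toBytes("item"), Bytes.toBytes("item"), Bytes.toBytes(fields(0)))
          p.add(Bytes.toBytes("item"), Bytes.toBytes("category"), Bytes.toBytes(fields(1)))
          p.add(Bytes.toBytes("item"), Bytes.toBytes("price"), Bytes.toBytes(fields(2)))
          table.put(p)
        }
        table.close()
      }
    }

    ssc.start()
  }
}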
Thanks,
Ben