Re: Accessing phoenix tables in Spark 2

2016-10-07 Thread Mich Talebzadeh
Thanks again all. My primary objective was to write to HBase directly from Spark Streaming, and Phoenix was really the catalyst here. My point is: if I manage to write directly from Spark Streaming to HBase, would that be a better option? FYI, I can read from the Phoenix table on HBase

Re: Accessing phoenix tables in Spark 2

2016-10-07 Thread James Taylor
Hi Mich, I'd encourage you to use the mechanism mentioned by Josh: another option is to use Phoenix JDBC from within Spark Streaming. I've got a toy example of using Spark Streaming with Phoenix DataFrames, but it could just as easily be a batched JDBC upsert. Trying to write directly to HBase
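
For reference, a minimal sketch of the batched JDBC upsert approach James describes, run from foreachRDD/foreachPartition in Spark Streaming. The table MY_TABLE, its columns, and the ZooKeeper quorum zk1:2181 are placeholders for illustration, not values taken from the thread.

import java.sql.DriverManager
import org.apache.spark.streaming.dstream.DStream

// Assumes a DStream[(String, Int)] of (id, value) pairs and a Phoenix table
// created as: CREATE TABLE MY_TABLE (ID VARCHAR PRIMARY KEY, VAL INTEGER)
def saveToPhoenix(stream: DStream[(String, Int)]): Unit = {
  stream.foreachRDD { rdd =>
    rdd.foreachPartition { partition =>
      // One connection per partition; Phoenix buffers the UPSERTs and
      // flushes them to HBase on commit()
      val conn = DriverManager.getConnection("jdbc:phoenix:zk1:2181")
      conn.setAutoCommit(false)
      val stmt = conn.prepareStatement("UPSERT INTO MY_TABLE (ID, VAL) VALUES (?, ?)")
      partition.foreach { case (id, value) =>
        stmt.setString(1, id)
        stmt.setInt(2, value)
        stmt.executeUpdate()
      }
      conn.commit()
      stmt.close()
      conn.close()
    }
  }
}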

Re: Accessing phoenix tables in Spark 2

2016-10-07 Thread Mich Talebzadeh
Thanks Josh, I will try your code as well. I wrote this simple program, based on some existing code, that directly creates and populates an HBase table called "new" from Spark 2:

import org.apache.spark._
import org.apache.spark.rdd.NewHadoopRDD
import org.apache.hadoop.hbase.{HBaseConfiguration,
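
The preview cuts off, so here is a minimal sketch of that kind of direct write to an HBase table named "new" from Spark 2, using the plain HBase 1.x client API rather than whatever Mich's truncated code actually does. The column family "cf", the qualifier "col1", and the ZooKeeper host are assumptions.

import org.apache.spark.sql.SparkSession
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

object HBaseDirectWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("HBaseDirectWrite").getOrCreate()
    val rdd = spark.sparkContext.parallelize(Seq(("row1", "a"), ("row2", "b")))

    rdd.foreachPartition { partition =>
      // Build the configuration on the executor; HBaseConfiguration is not serializable
      val conf = HBaseConfiguration.create()
      conf.set("hbase.zookeeper.quorum", "zk1")   // assumed ZooKeeper host
      val connection = ConnectionFactory.createConnection(conf)
      val table = connection.getTable(TableName.valueOf("new"))
      partition.foreach { case (rowKey, value) =>
        val put = new Put(Bytes.toBytes(rowKey))
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col1"), Bytes.toBytes(value))
        table.put(put)
      }
      table.close()
      connection.close()
    }
    spark.stop()
  }
}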

Re: Accessing phoenix tables in Spark 2

2016-10-07 Thread Josh Mahonin
Hi Mich, You're correct that the rowkey is the primary key, but if you're writing to HBase directly and bypassing Phoenix, you'll have to be careful about the construction of your row keys to adhere to the Phoenix data types and row format. I don't think it's very well documented, but you might
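
As a rough illustration of Josh's point about matching the Phoenix row format, the sketch below encodes a composite row key with Phoenix's own type classes (class and constant names as in Phoenix 4.x; the table definition is hypothetical). In practice it is usually safer to let Phoenix write the rows itself.

import org.apache.phoenix.schema.types.{PInteger, PVarchar}
import org.apache.phoenix.util.ByteUtil
import org.apache.phoenix.query.QueryConstants

// Hypothetical table:
//   CREATE TABLE T (K1 VARCHAR NOT NULL, K2 INTEGER NOT NULL
//     CONSTRAINT pk PRIMARY KEY (K1, K2))
// Phoenix concatenates the serialized PK columns; a variable-length type such as
// VARCHAR is followed by a zero-byte separator when it is not the last PK column.
val k1Bytes = PVarchar.INSTANCE.toBytes("abc")
val k2Bytes = PInteger.INSTANCE.toBytes(42)
val rowKey  = ByteUtil.concat(k1Bytes, QueryConstants.SEPARATOR_BYTE_ARRAY, k2Bytes)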

Re: Accessing phoenix tables in Spark 2

2016-10-07 Thread Mich Talebzadeh
Thank you all, very helpful. I have not tried the method Ciureanu suggested but will do so. Now I will be using Spark Streaming to populate the HBase table. I was hoping to do this through Phoenix, but managed to write a script that writes to the HBase table from Spark 2 itself. Having worked with HBase I

Re: Accessing phoenix tables in Spark 2

2016-10-07 Thread Ciureanu Constantin
In Spark 1.4 it worked via JDBC - I'm sure it would work in 1.6 / 2.0 without issues. Here's some sample code I used (it was fetching data in parallel across 24 partitions):

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.rdd.JdbcRDD
import java.sql.{Connection,
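
Since the preview is cut off, here is a hedged reconstruction of that kind of JdbcRDD read against Phoenix over JDBC. The table name, bounds column, bound values, and ZooKeeper quorum are placeholders, not Ciureanu's actual values; only the 24-partition parallelism comes from his message.

import java.sql.{Connection, DriverManager, ResultSet}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.JdbcRDD

object PhoenixJdbcRead {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PhoenixJdbcRead"))

    // JdbcRDD requires exactly two '?' bind parameters; it splits the numeric
    // range [lowerBound, upperBound] into numPartitions sub-ranges on them.
    val rdd = new JdbcRDD(
      sc,
      () => DriverManager.getConnection("jdbc:phoenix:zk1:2181"),
      "SELECT ID, NAME FROM MY_TABLE WHERE ID >= ? AND ID <= ?",
      1L,          // lower bound (assumed)
      1000000L,    // upper bound (assumed)
      24,          // partitions, as in Ciureanu's example
      (rs: ResultSet) => (rs.getLong(1), rs.getString(2))
    )
    rdd.take(10).foreach(println)
    sc.stop()
  }
}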

Re: Accessing phoenix tables in Spark 2

2016-10-07 Thread Ted Yu
JIRA on the HBase side: HBASE-16179. FYI

Re: Accessing phoenix tables in Spark 2

2016-10-07 Thread Josh Mahonin
Hi Mich, There's an open ticket about this issue here: https://issues.apache.org/jira/browse/PHOENIX- Long story short, Spark changed their API (again), breaking the existing integration. I'm not sure of the level of effort to get it working with Spark 2.0, but based on examples from other