Re: Spark-Phoenix Plugin

2018-08-06 Thread James Taylor
For the UPSERTs on a PreparedStatement that are done by Phoenix for writing
in the Spark adapter, note that these are *not* doing RPCs to the HBase
server to write data (i.e. they are never committed). Instead, the UPSERTs
are used to ensure that the correct serialization is performed given the
Phoenix schema. We use a PhoenixRuntime API to get the List of KeyValues from
the uncommitted data and then perform a rollback. Using this technique,
features like salting, column encoding, row timestamp, etc. will continue
to work with the Spark integration.
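
A minimal sketch of that technique, assuming the
PhoenixRuntime.getUncommittedDataIterator API and the DATAX table from the
example further down this thread (check the exact method signature against
your Phoenix version):

import java.sql.DriverManager

import org.apache.phoenix.util.PhoenixRuntime

// Keep auto-commit off so the UPSERT stays client-side (no RPCs to HBase).
val conn = DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3:2181")
conn.setAutoCommit(false)

// The UPSERT only serializes the row per the Phoenix schema (salting, column
// encoding, row timestamp, ...); nothing is sent to the server yet.
val stmt = conn.prepareStatement(
  "UPSERT INTO DATAX (O_ORDERKEY, O_CUSTKEY, O_ORDERSTATUS) VALUES (?, ?, ?)")
stmt.setString(1, "1")
stmt.setString(2, "42")
stmt.setString(3, "F")
stmt.executeUpdate()

// Pull out the uncommitted, fully-encoded cells, then roll back so the
// UPSERT is never committed.
val uncommitted = PhoenixRuntime.getUncommittedDataIterator(conn)
while (uncommitted.hasNext) {
  val pair = uncommitted.next() // (table name bytes, encoded KeyValues)
  println(s"table=${new String(pair.getFirst)} cells=${pair.getSecond.size}")
}
conn.rollback()
conn.close()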

Thanks,
James

On Mon, Aug 6, 2018 at 7:44 AM, Jaanai Zhang  wrote:

> You can get better performance if you read/write HBase directly. You can also use
> spark-phoenix; this is an example of reading data from a CSV file and writing
> into a Phoenix table:
>
> def main(args: Array[String]): Unit = {
>
>   val sc = new SparkContext("local", "phoenix-test")
>   val path = "/tmp/data"
>   val hbaseConnectionString = "host1,host2,host3"
>   val customSchema = StructType(Array(
> StructField("O_ORDERKEY", StringType, true),
> StructField("O_CUSTKEY", StringType, true),
> StructField("O_ORDERSTATUS", StringType, true),
> StructField("O_TOTALPRICE", StringType, true),
> StructField("O_ORDERDATE", StringType, true),
> StructField("O_ORDERPRIORITY", StringType, true),
> StructField("O_CLERK", StringType, true),
> StructField("O_SHIPPRIORITY", StringType, true),
> StructField("O_COMMENT", StringType, true)))
>
>   //import com.databricks.spark.csv._
>   val sqlContext = new SQLContext(sc)
>
>   val df = sqlContext.read
> .format("com.databricks.spark.csv")
> .option("delimiter", "|")
> .option("header", "false")
> .schema(customSchema)
> .load(path)
>
>   val start = System.currentTimeMillis()
>   df.write.format("org.apache.phoenix.spark")
> .mode("overwrite")
> .option("table", "DATAX")
> .option("zkUrl", hbaseConnectionString)
> .save()
>
>   val end = System.currentTimeMillis()
>   print("taken time:" + ((end - start) / 1000) + "s")
> }
>
>
>
>
> 
>Yun Zhang
>Best regards!
>
>
> 2018-08-06 20:10 GMT+08:00 Brandon Geise :
>
>> Thanks for the reply Yun.
>>
>>
>>
>> I’m not quite clear how this would exactly help on the upsert side?  Are
>> you suggesting deriving the type from Phoenix then doing the
>> encoding/decoding and writing/reading directly from HBase?
>>
>>
>>
>> Thanks,
>>
>> Brandon
>>
>>
>>
>> *From: *Jaanai Zhang 
>> *Reply-To: *
>> *Date: *Sunday, August 5, 2018 at 9:34 PM
>> *To: *
>> *Subject: *Re: Spark-Phoenix Plugin
>>
>>
>>
>> You can get data type from Phoenix meta, then encode/decode data to
>> write/read data. I think this way is effective, FYI :)
>>
>>
>>
>>
>> 
>>
>>Yun Zhang
>>
>>Best regards!
>>
>>
>>
>>
>>
>> 2018-08-04 21:43 GMT+08:00 Brandon Geise :
>>
>> Good morning,
>>
>>
>>
>> I’m looking at using a combination of Hbase, Phoenix and Spark for a
>> project and read that using the Spark-Phoenix plugin directly is more
>> efficient than JDBC, however it wasn’t entirely clear from examples when
>> writing a dataframe if an upsert is performed and how much fine-grained
>> options there are for executing the upsert.  Any information someone can
>> share would be greatly appreciated!
>>
>>
>>
>>
>>
>> Thanks,
>>
>> Brandon
>>
>>
>>
>
>


Re: Spark-Phoenix Plugin

2018-08-06 Thread Jaanai Zhang
You can get better performance if you read/write HBase directly. You can also use
spark-phoenix; this is an example of reading data from a CSV file and writing
into a Phoenix table:

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.types.{StringType, StructField, StructType}

def main(args: Array[String]): Unit = {

  val sc = new SparkContext("local", "phoenix-test")
  val path = "/tmp/data"
  val hbaseConnectionString = "host1,host2,host3"

  // Schema for the pipe-delimited CSV order data.
  val customSchema = StructType(Array(
    StructField("O_ORDERKEY", StringType, true),
    StructField("O_CUSTKEY", StringType, true),
    StructField("O_ORDERSTATUS", StringType, true),
    StructField("O_TOTALPRICE", StringType, true),
    StructField("O_ORDERDATE", StringType, true),
    StructField("O_ORDERPRIORITY", StringType, true),
    StructField("O_CLERK", StringType, true),
    StructField("O_SHIPPRIORITY", StringType, true),
    StructField("O_COMMENT", StringType, true)))

  // Requires the com.databricks:spark-csv package on Spark 1.x.
  val sqlContext = new SQLContext(sc)

  val df = sqlContext.read
    .format("com.databricks.spark.csv")
    .option("delimiter", "|")
    .option("header", "false")
    .schema(customSchema)
    .load(path)

  val start = System.currentTimeMillis()
  // The connector writes each row as a Phoenix UPSERT (see James's note
  // earlier in this thread); "overwrite" is the mode used in this example.
  df.write.format("org.apache.phoenix.spark")
    .mode("overwrite")
    .option("table", "DATAX")
    .option("zkUrl", hbaseConnectionString)
    .save()

  val end = System.currentTimeMillis()
  print("taken time:" + ((end - start) / 1000) + "s")
}
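
For completeness, a minimal sketch of the read path through the same data
source, reusing sqlContext and hbaseConnectionString from the example above;
the option keys match those used elsewhere in this thread:

val readDf = sqlContext.read
  .format("org.apache.phoenix.spark")
  .option("table", "DATAX")
  .option("zkUrl", hbaseConnectionString)
  .load()

readDf.printSchema()
readDf.show(10)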





   Yun Zhang
   Best regards!


2018-08-06 20:10 GMT+08:00 Brandon Geise :

> Thanks for the reply Yun.
>
>
>
> I’m not quite clear how this would exactly help on the upsert side?  Are
> you suggesting deriving the type from Phoenix then doing the
> encoding/decoding and writing/reading directly from HBase?
>
>
>
> Thanks,
>
> Brandon
>
>
>
> *From: *Jaanai Zhang 
> *Reply-To: *
> *Date: *Sunday, August 5, 2018 at 9:34 PM
> *To: *
> *Subject: *Re: Spark-Phoenix Plugin
>
>
>
> You can get data type from Phoenix meta, then encode/decode data to
> write/read data. I think this way is effective, FYI :)
>
>
>
>
> 
>
>Yun Zhang
>
>Best regards!
>
>
>
>
>
> 2018-08-04 21:43 GMT+08:00 Brandon Geise :
>
> Good morning,
>
>
>
> I’m looking at using a combination of Hbase, Phoenix and Spark for a
> project and read that using the Spark-Phoenix plugin directly is more
> efficient than JDBC, however it wasn’t entirely clear from examples when
> writing a dataframe if an upsert is performed and how much fine-grained
> options there are for executing the upsert.  Any information someone can
> share would be greatly appreciated!
>
>
>
>
>
> Thanks,
>
> Brandon
>
>
>


Re: Spark-Phoenix Plugin

2018-08-06 Thread Josh Elser
Besides the distribution and parallelism of Spark as a distributed 
execution framework, I can't really see how phoenix-spark would be 
faster than the JDBC driver :). Phoenix-spark and the JDBC driver are 
using the same code under the hood.


Phoenix-spark is using the PhoenixOutputFormat (and thus, 
PhoenixRecordWriter) to write data to Phoenix. Maybe look at 
PhoenixRecordWritable, too. These ultimately are executing UPSERTs on a 
PreparedStatement.
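
As a rough illustration of that shared code path, a minimal sketch of the same
kind of UPSERT issued through the plain JDBC driver (connection string, table,
and columns borrowed from the examples elsewhere in this thread):

import java.sql.DriverManager

val conn = DriverManager.getConnection("jdbc:phoenix:host1,host2,host3")
val ps = conn.prepareStatement(
  "UPSERT INTO DATAX (O_ORDERKEY, O_CUSTKEY, O_ORDERSTATUS) VALUES (?, ?, ?)")
ps.setString(1, "1")
ps.setString(2, "42")
ps.setString(3, "F")
ps.executeUpdate()
conn.commit()  // committed here, unlike the adapter's serialization-only UPSERTs
conn.close()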


There is also the CsvBulkLoadTool, which can create HFiles to bulk load 
data into Phoenix. I'm not sure if phoenix-spark has something wired up 
that you can use to do this out of the box (certainly, you could do it 
yourself).
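
If you go the bulk-load route, a sketch of driving that tool programmatically;
the class and flag names here are assumptions based on the standard Phoenix
bulk-load tooling rather than anything shown in this thread, so verify them
against your version's documentation:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.util.ToolRunner
import org.apache.phoenix.mapreduce.CsvBulkLoadTool

val conf = new Configuration()
val exitCode = ToolRunner.run(conf, new CsvBulkLoadTool(), Array(
  "--table", "DATAX",                 // target Phoenix table (assumed flag name)
  "--input", "/tmp/data",             // HDFS directory of CSV files (assumed flag name)
  "--zookeeper", "host1,host2,host3", // ZooKeeper quorum (assumed flag name)
  "--delimiter", "|"))                // field delimiter (assumed flag name)
println(s"bulk load exit code: $exitCode")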


On 8/6/18 8:10 AM, Brandon Geise wrote:

Thanks for the reply Yun.

I’m not quite clear how this would exactly help on the upsert side?  Are 
you suggesting deriving the type from Phoenix then doing the 
encoding/decoding and writing/reading directly from HBase?


Thanks,

Brandon

*From: *Jaanai Zhang 
*Reply-To: *
*Date: *Sunday, August 5, 2018 at 9:34 PM
*To: *
*Subject: *Re: Spark-Phoenix Plugin

You can get data type from Phoenix meta, then encode/decode data to 
write/read data. I think this way is effective, FYI :)





    Yun Zhang

    Best regards!

2018-08-04 21:43 GMT+08:00 Brandon Geise :


Good morning,

I’m looking at using a combination of Hbase, Phoenix and Spark for a
project and read that using the Spark-Phoenix plugin directly is
more efficient than JDBC, however it wasn’t entirely clear from
examples when writing a dataframe if an upsert is performed and how
much fine-grained options there are for executing the upsert.  Any
information someone can share would be greatly appreciated!

Thanks,

Brandon



Re: Spark-Phoenix Plugin

2018-08-06 Thread Brandon Geise
Thanks for the reply Yun.  

 

I’m not quite clear how this would exactly help on the upsert side. Are you 
suggesting deriving the types from Phoenix and then doing the encoding/decoding and 
writing/reading directly from HBase?

 

Thanks,

Brandon

 

From: Jaanai Zhang 
Reply-To: 
Date: Sunday, August 5, 2018 at 9:34 PM
To: 
Subject: Re: Spark-Phoenix Plugin

 

You can get data type from Phoenix meta, then encode/decode data to write/read 
data. I think this way is effective, FYI :)


 



   Yun Zhang

   Best regards!

 

 

2018-08-04 21:43 GMT+08:00 Brandon Geise :

Good morning,

 

I’m looking at using a combination of Hbase, Phoenix and Spark for a project 
and read that using the Spark-Phoenix plugin directly is more efficient than 
JDBC, however it wasn’t entirely clear from examples when writing a dataframe 
if an upsert is performed and how much fine-grained options there are for 
executing the upsert.  Any information someone can share would be greatly 
appreciated!

 

 

Thanks,

Brandon

 



Re: Spark-Phoenix Plugin

2018-08-05 Thread Jaanai Zhang
You can get the data types from the Phoenix metadata, then encode/decode the
data to write/read it. I think this way is effective, FYI :)
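
A minimal sketch of that approach using standard JDBC metadata against Phoenix
(table name and connection string reuse the examples elsewhere in this thread):

import java.sql.DriverManager

val conn = DriverManager.getConnection("jdbc:phoenix:host1,host2,host3")
// Column names and types straight from the Phoenix metadata; with these in
// hand you know how to encode/decode the underlying HBase bytes yourself.
val cols = conn.getMetaData.getColumns(null, null, "DATAX", null)
while (cols.next()) {
  val name     = cols.getString("COLUMN_NAME")
  val typeName = cols.getString("TYPE_NAME")  // e.g. VARCHAR, INTEGER
  val sqlType  = cols.getInt("DATA_TYPE")     // java.sql.Types code
  println(s"$name -> $typeName ($sqlType)")
}
conn.close()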



   Yun Zhang
   Best regards!


2018-08-04 21:43 GMT+08:00 Brandon Geise :

> Good morning,
>
>
>
> I’m looking at using a combination of Hbase, Phoenix and Spark for a
> project and read that using the Spark-Phoenix plugin directly is more
> efficient than JDBC, however it wasn’t entirely clear from examples when
> writing a dataframe if an upsert is performed and how much fine-grained
> options there are for executing the upsert.  Any information someone can
> share would be greatly appreciated!
>
>
>
>
>
> Thanks,
>
> Brandon
>


Spark-Phoenix Plugin

2018-08-04 Thread Brandon Geise
Good morning,

 

I’m looking at using a combination of HBase, Phoenix, and Spark for a project 
and read that using the Spark-Phoenix plugin directly is more efficient than 
JDBC; however, it wasn’t entirely clear from the examples whether an upsert is 
performed when writing a dataframe, or how many fine-grained options there are 
for executing the upsert. Any information someone can share would be greatly 
appreciated!

 

 

Thanks,

Brandon



Re: Spark Phoenix Plugin

2016-02-20 Thread Benjamin Kim
Josh,

My production environment at our company is:

CDH 5.4.8:
Hadoop 2.6.0-cdh5.4.8
YARN 2.6.0-cdh5.4.8
HBase 1.0.0-cdh5.4.8

Apache:
HBase 1.1.3
Spark 1.6.0
Phoenix 4.7.0

I tried to use the Phoenix Spark Plugin against both versions of HBase.

I hope this helps.

Thanks,
Ben


> On Feb 20, 2016, at 7:37 AM, Josh Mahonin  wrote:
> 
> Hi Ben,
> 
> Can you describe in more detail what your environment is? Are you using stock 
> installs of HBase, Spark and Phoenix? Are you using the hadoop2.4 pre-built 
> Spark distribution as per the documentation [1]?
> 
> The unread block data error is commonly traced back to this issue [2] which 
> indicates some sort of mismatched version problem..
> 
> Thanks,
> 
> Josh
> 
> [1] https://phoenix.apache.org/phoenix_spark.html 
> 
> [2] https://issues.apache.org/jira/browse/SPARK-1867 
> 
> 
> On Fri, Feb 19, 2016 at 2:18 PM, Benjamin Kim  > wrote:
> Hi Josh,
> 
> When I run the following code in spark-shell for spark 1.6:
> 
> import org.apache.phoenix.spark._
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
> df.select(df("ID")).show()
> 
> I get this error:
> 
> java.lang.IllegalStateException: unread block data
> 
> Thanks,
> Ben
> 
> 
>> On Feb 19, 2016, at 11:12 AM, Josh Mahonin > > wrote:
>> 
>> What specifically doesn't work for you?
>> 
>> I have a Docker image that I used to do some basic testing on it with and 
>> haven't run into any problems:
>> https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark 
>> 
>> 
>> On Fri, Feb 19, 2016 at 12:40 PM, Benjamin Kim > > wrote:
>> All,
>> 
>> Thanks for the help. I have switched out Cloudera’s HBase 1.0.0 with the 
>> current Apache HBase 1.1.3. Also, I installed Phoenix 4.7.0, and everything 
>> works fine except for the Phoenix Spark Plugin. I wonder if it’s a version 
>> incompatibility issue with Spark 1.6. Has anyone tried compiling 4.7.0 using 
>> Spark 1.6?
>> 
>> Thanks,
>> Ben
>> 
>>> On Feb 12, 2016, at 6:33 AM, Benjamin Kim >> > wrote:
>>> 
>>> Anyone know when Phoenix 4.7 will be officially released? And what Cloudera 
>>> distribution versions will it be compatible with?
>>> 
>>> Thanks,
>>> Ben
>>> 
 On Feb 10, 2016, at 11:03 AM, Benjamin Kim > wrote:
 
 Hi Pierre,
 
 I am getting this error now.
 
 Error: org.apache.phoenix.exception.PhoenixIOException: 
 org.apache.hadoop.hbase.DoNotRetryIOException: 
 SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.: 
 org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;
 
 I even tried to use sqlline.py to do some queries too. It resulted in the 
 same error. I followed the installation instructions. Is there something 
 missing?
 
 Thanks,
 Ben
 
 
> On Feb 9, 2016, at 10:20 AM, Ravi Kiran  > wrote:
> 
> Hi Pierre,
> 
>   Try your luck for building the artifacts from 
> https://github.com/chiastic-security/phoenix-for-cloudera 
> . Hopefully it 
> helps.
> 
> Regards
> Ravi .
> 
> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim  > wrote:
> Hi Pierre,
> 
> I found this article about how Cloudera’s version of HBase is very 
> different than Apache HBase so it must be compiled using Cloudera’s repo 
> and versions. But, I’m not having any success with it.
> 
> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>  
> 
> 
> There’s also a Chinese site that does the same thing.
> 
> https://www.zybuluo.com/xtccc/note/205739 
> 
> 
> I keep getting errors like the one’s below.
> 
> [ERROR] 
> /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
>  cannot find symbol
> [ERROR] symbol:   class Region
> [ERROR] location: class 
> org.apache.hadoop.hbase.regionserver.LocalIndexMerger
> …
> 
> Have you tried this also?
> 
> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s 
> HBase.
> 
> Thanks,
> Ben
> 
> 

Re: Spark Phoenix Plugin

2016-02-20 Thread Josh Mahonin
Hi Ben,

Can you describe in more detail what your environment is? Are you using
stock installs of HBase, Spark and Phoenix? Are you using the hadoop2.4
pre-built Spark distribution as per the documentation [1]?

The unread block data error is commonly traced back to this issue [2], which
indicates some sort of mismatched version problem.

Thanks,

Josh

[1] https://phoenix.apache.org/phoenix_spark.html
[2] https://issues.apache.org/jira/browse/SPARK-1867

On Fri, Feb 19, 2016 at 2:18 PM, Benjamin Kim  wrote:

> Hi Josh,
>
> When I run the following code in spark-shell for spark 1.6:
>
> import org.apache.phoenix.spark._
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
> df.select(df("ID")).show()
>
> I get this error:
>
> java.lang.IllegalStateException: unread block data
>
> Thanks,
> Ben
>
>
> On Feb 19, 2016, at 11:12 AM, Josh Mahonin  wrote:
>
> What specifically doesn't work for you?
>
> I have a Docker image that I used to do some basic testing on it with and
> haven't run into any problems:
> https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>
> On Fri, Feb 19, 2016 at 12:40 PM, Benjamin Kim  wrote:
>
>> All,
>>
>> Thanks for the help. I have switched out Cloudera’s HBase 1.0.0 with the
>> current Apache HBase 1.1.3. Also, I installed Phoenix 4.7.0, and everything
>> works fine except for the Phoenix Spark Plugin. I wonder if it’s a version
>> incompatibility issue with Spark 1.6. Has anyone tried compiling 4.7.0
>> using Spark 1.6?
>>
>> Thanks,
>> Ben
>>
>> On Feb 12, 2016, at 6:33 AM, Benjamin Kim  wrote:
>>
>> Anyone know when Phoenix 4.7 will be officially released? And what
>> Cloudera distribution versions will it be compatible with?
>>
>> Thanks,
>> Ben
>>
>> On Feb 10, 2016, at 11:03 AM, Benjamin Kim  wrote:
>>
>> Hi Pierre,
>>
>> I am getting this error now.
>>
>> Error: org.apache.phoenix.exception.PhoenixIOException:
>> org.apache.hadoop.hbase.DoNotRetryIOException:
>> SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.:
>> org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;
>>
>> I even tried to use sqlline.py to do some queries too. It resulted in the
>> same error. I followed the installation instructions. Is there something
>> missing?
>>
>> Thanks,
>> Ben
>>
>>
>> On Feb 9, 2016, at 10:20 AM, Ravi Kiran 
>> wrote:
>>
>> Hi Pierre,
>>
>>   Try your luck for building the artifacts from
>> https://github.com/chiastic-security/phoenix-for-cloudera. Hopefully it
>> helps.
>>
>> Regards
>> Ravi .
>>
>> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim  wrote:
>>
>>> Hi Pierre,
>>>
>>> I found this article about how Cloudera’s version of HBase is very
>>> different than Apache HBase so it must be compiled using Cloudera’s repo
>>> and versions. But, I’m not having any success with it.
>>>
>>>
>>> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>>>
>>> There’s also a Chinese site that does the same thing.
>>>
>>> https://www.zybuluo.com/xtccc/note/205739
>>>
>>> I keep getting errors like the one’s below.
>>>
>>> [ERROR]
>>> /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
>>> cannot find symbol
>>> [ERROR] symbol:   class Region
>>> [ERROR] location: class
>>> org.apache.hadoop.hbase.regionserver.LocalIndexMerger
>>> …
>>>
>>> Have you tried this also?
>>>
>>> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s
>>> HBase.
>>>
>>> Thanks,
>>> Ben
>>>
>>>
>>> On Feb 8, 2016, at 11:04 PM, pierre lacave  wrote:
>>>
>>> Havent met that one.
>>>
>>> According to SPARK-1867, the real issue is hidden.
>>>
>>> I d process by elimination, maybe try in local[*] mode first
>>>
>>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867
>>>
>>> On Tue, 9 Feb 2016, 04:58 Benjamin Kim  wrote:
>>>
 Pierre,

 I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But,
 now, I get this error:

 org.apache.spark.SparkException: Job aborted due to stage failure: Task
 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage
 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com):
 java.lang.IllegalStateException: unread block data

 It happens when I do:

 df.show()

 Getting closer…

 Thanks,
 Ben



 On Feb 8, 2016, at 2:57 PM, pierre lacave  wrote:

 This is the wrong client jar try with the one named
 phoenix-4.7.0-HBase-1.1-client-spark.jar

 On Mon, 8 Feb 2016, 22:29 Benjamin Kim  wrote:

> Hi Josh,
>
> I tried again by putting the 

Re: Spark Phoenix Plugin

2016-02-19 Thread Benjamin Kim
Hi Josh,

When I run the following code in spark-shell for spark 1.6:

import org.apache.phoenix.spark._
val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
  "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
df.select(df("ID")).show()

I get this error:

java.lang.IllegalStateException: unread block data

Thanks,
Ben


> On Feb 19, 2016, at 11:12 AM, Josh Mahonin  wrote:
> 
> What specifically doesn't work for you?
> 
> I have a Docker image that I used to do some basic testing on it with and 
> haven't run into any problems:
> https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark 
> 
> 
> On Fri, Feb 19, 2016 at 12:40 PM, Benjamin Kim  > wrote:
> All,
> 
> Thanks for the help. I have switched out Cloudera’s HBase 1.0.0 with the 
> current Apache HBase 1.1.3. Also, I installed Phoenix 4.7.0, and everything 
> works fine except for the Phoenix Spark Plugin. I wonder if it’s a version 
> incompatibility issue with Spark 1.6. Has anyone tried compiling 4.7.0 using 
> Spark 1.6?
> 
> Thanks,
> Ben
> 
>> On Feb 12, 2016, at 6:33 AM, Benjamin Kim > > wrote:
>> 
>> Anyone know when Phoenix 4.7 will be officially released? And what Cloudera 
>> distribution versions will it be compatible with?
>> 
>> Thanks,
>> Ben
>> 
>>> On Feb 10, 2016, at 11:03 AM, Benjamin Kim >> > wrote:
>>> 
>>> Hi Pierre,
>>> 
>>> I am getting this error now.
>>> 
>>> Error: org.apache.phoenix.exception.PhoenixIOException: 
>>> org.apache.hadoop.hbase.DoNotRetryIOException: 
>>> SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.: 
>>> org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;
>>> 
>>> I even tried to use sqlline.py to do some queries too. It resulted in the 
>>> same error. I followed the installation instructions. Is there something 
>>> missing?
>>> 
>>> Thanks,
>>> Ben
>>> 
>>> 
 On Feb 9, 2016, at 10:20 AM, Ravi Kiran > wrote:
 
 Hi Pierre,
 
   Try your luck for building the artifacts from 
 https://github.com/chiastic-security/phoenix-for-cloudera 
 . Hopefully it 
 helps.
 
 Regards
 Ravi .
 
 On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim > wrote:
 Hi Pierre,
 
 I found this article about how Cloudera’s version of HBase is very 
 different than Apache HBase so it must be compiled using Cloudera’s repo 
 and versions. But, I’m not having any success with it.
 
 http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
  
 
 
 There’s also a Chinese site that does the same thing.
 
 https://www.zybuluo.com/xtccc/note/205739 
 
 
 I keep getting errors like the one’s below.
 
 [ERROR] 
 /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
  cannot find symbol
 [ERROR] symbol:   class Region
 [ERROR] location: class 
 org.apache.hadoop.hbase.regionserver.LocalIndexMerger
 …
 
 Have you tried this also?
 
 As a last resort, we will have to abandon Cloudera’s HBase for Apache’s 
 HBase.
 
 Thanks,
 Ben
 
 
> On Feb 8, 2016, at 11:04 PM, pierre lacave  > wrote:
> 
> Havent met that one.
> 
> According to SPARK-1867, the real issue is hidden.
> 
> I d process by elimination, maybe try in local[*] mode first
> 
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867 
> 
> On Tue, 9 Feb 2016, 04:58 Benjamin Kim  > wrote:
> Pierre,
> 
> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But, 
> now, I get this error:
> 
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
> in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 
> 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com 
> ): 
> java.lang.IllegalStateException: unread block data
> 
> It happens when I do:
> 
> df.show()
> 
> Getting closer…
> 
> Thanks,
> Ben
> 
> 
> 
>> On Feb 8, 2016, at 2:57 PM, pierre lacave 

Re: Spark Phoenix Plugin

2016-02-19 Thread Josh Mahonin
What specifically doesn't work for you?

I have a Docker image that I used to do some basic testing on it with and
haven't run into any problems:
https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark

On Fri, Feb 19, 2016 at 12:40 PM, Benjamin Kim  wrote:

> All,
>
> Thanks for the help. I have switched out Cloudera’s HBase 1.0.0 with the
> current Apache HBase 1.1.3. Also, I installed Phoenix 4.7.0, and everything
> works fine except for the Phoenix Spark Plugin. I wonder if it’s a version
> incompatibility issue with Spark 1.6. Has anyone tried compiling 4.7.0
> using Spark 1.6?
>
> Thanks,
> Ben
>
> On Feb 12, 2016, at 6:33 AM, Benjamin Kim  wrote:
>
> Anyone know when Phoenix 4.7 will be officially released? And what
> Cloudera distribution versions will it be compatible with?
>
> Thanks,
> Ben
>
> On Feb 10, 2016, at 11:03 AM, Benjamin Kim  wrote:
>
> Hi Pierre,
>
> I am getting this error now.
>
> Error: org.apache.phoenix.exception.PhoenixIOException:
> org.apache.hadoop.hbase.DoNotRetryIOException:
> SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.:
> org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;
>
> I even tried to use sqlline.py to do some queries too. It resulted in the
> same error. I followed the installation instructions. Is there something
> missing?
>
> Thanks,
> Ben
>
>
> On Feb 9, 2016, at 10:20 AM, Ravi Kiran  wrote:
>
> Hi Pierre,
>
>   Try your luck for building the artifacts from
> https://github.com/chiastic-security/phoenix-for-cloudera. Hopefully it
> helps.
>
> Regards
> Ravi .
>
> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim  wrote:
>
>> Hi Pierre,
>>
>> I found this article about how Cloudera’s version of HBase is very
>> different than Apache HBase so it must be compiled using Cloudera’s repo
>> and versions. But, I’m not having any success with it.
>>
>>
>> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>>
>> There’s also a Chinese site that does the same thing.
>>
>> https://www.zybuluo.com/xtccc/note/205739
>>
>> I keep getting errors like the one’s below.
>>
>> [ERROR]
>> /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
>> cannot find symbol
>> [ERROR] symbol:   class Region
>> [ERROR] location: class
>> org.apache.hadoop.hbase.regionserver.LocalIndexMerger
>> …
>>
>> Have you tried this also?
>>
>> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s
>> HBase.
>>
>> Thanks,
>> Ben
>>
>>
>> On Feb 8, 2016, at 11:04 PM, pierre lacave  wrote:
>>
>> Havent met that one.
>>
>> According to SPARK-1867, the real issue is hidden.
>>
>> I d process by elimination, maybe try in local[*] mode first
>>
>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867
>>
>> On Tue, 9 Feb 2016, 04:58 Benjamin Kim  wrote:
>>
>>> Pierre,
>>>
>>> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But,
>>> now, I get this error:
>>>
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>>> 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage
>>> 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com):
>>> java.lang.IllegalStateException: unread block data
>>>
>>> It happens when I do:
>>>
>>> df.show()
>>>
>>> Getting closer…
>>>
>>> Thanks,
>>> Ben
>>>
>>>
>>>
>>> On Feb 8, 2016, at 2:57 PM, pierre lacave  wrote:
>>>
>>> This is the wrong client jar try with the one named
>>> phoenix-4.7.0-HBase-1.1-client-spark.jar
>>>
>>> On Mon, 8 Feb 2016, 22:29 Benjamin Kim  wrote:
>>>
 Hi Josh,

 I tried again by putting the settings within the spark-default.conf.


 spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar

 spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar

 I still get the same error using the code below.

 import org.apache.phoenix.spark._
 val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
 "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))

 Can you tell me what else you’re doing?

 Thanks,
 Ben


 On Feb 8, 2016, at 1:44 PM, Josh Mahonin  wrote:

 Hi Ben,

 I'm not sure about the format of those command line options you're
 passing. I've had success with spark-shell just by setting the
 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options
 on the spark config, as per the docs [1].

 I'm not sure if there's anything special needed for CDH or not though.
 I also have a docker image I've been toying with which has a working
 Spark/Phoenix setup using the Phoenix 4.7.0 RC and 

Re: Spark Phoenix Plugin

2016-02-19 Thread Benjamin Kim
All,

Thanks for the help. I have swapped out Cloudera’s HBase 1.0.0 for the 
current Apache HBase 1.1.3. Also, I installed Phoenix 4.7.0, and everything 
works fine except for the Phoenix Spark Plugin. I wonder if it’s a version 
incompatibility issue with Spark 1.6. Has anyone tried compiling 4.7.0 using 
Spark 1.6?

Thanks,
Ben

> On Feb 12, 2016, at 6:33 AM, Benjamin Kim  wrote:
> 
> Anyone know when Phoenix 4.7 will be officially released? And what Cloudera 
> distribution versions will it be compatible with?
> 
> Thanks,
> Ben
> 
>> On Feb 10, 2016, at 11:03 AM, Benjamin Kim > > wrote:
>> 
>> Hi Pierre,
>> 
>> I am getting this error now.
>> 
>> Error: org.apache.phoenix.exception.PhoenixIOException: 
>> org.apache.hadoop.hbase.DoNotRetryIOException: 
>> SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.: 
>> org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;
>> 
>> I even tried to use sqlline.py to do some queries too. It resulted in the 
>> same error. I followed the installation instructions. Is there something 
>> missing?
>> 
>> Thanks,
>> Ben
>> 
>> 
>>> On Feb 9, 2016, at 10:20 AM, Ravi Kiran >> > wrote:
>>> 
>>> Hi Pierre,
>>> 
>>>   Try your luck for building the artifacts from 
>>> https://github.com/chiastic-security/phoenix-for-cloudera 
>>> . Hopefully it 
>>> helps.
>>> 
>>> Regards
>>> Ravi .
>>> 
>>> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim >> > wrote:
>>> Hi Pierre,
>>> 
>>> I found this article about how Cloudera’s version of HBase is very 
>>> different than Apache HBase so it must be compiled using Cloudera’s repo 
>>> and versions. But, I’m not having any success with it.
>>> 
>>> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>>>  
>>> 
>>> 
>>> There’s also a Chinese site that does the same thing.
>>> 
>>> https://www.zybuluo.com/xtccc/note/205739 
>>> 
>>> 
>>> I keep getting errors like the one’s below.
>>> 
>>> [ERROR] 
>>> /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
>>>  cannot find symbol
>>> [ERROR] symbol:   class Region
>>> [ERROR] location: class 
>>> org.apache.hadoop.hbase.regionserver.LocalIndexMerger
>>> …
>>> 
>>> Have you tried this also?
>>> 
>>> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s 
>>> HBase.
>>> 
>>> Thanks,
>>> Ben
>>> 
>>> 
 On Feb 8, 2016, at 11:04 PM, pierre lacave > wrote:
 
 Havent met that one.
 
 According to SPARK-1867, the real issue is hidden.
 
 I d process by elimination, maybe try in local[*] mode first
 
 https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867 
 
 On Tue, 9 Feb 2016, 04:58 Benjamin Kim > wrote:
 Pierre,
 
 I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But, now, 
 I get this error:
 
 org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 
 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com 
 ): 
 java.lang.IllegalStateException: unread block data
 
 It happens when I do:
 
 df.show()
 
 Getting closer…
 
 Thanks,
 Ben
 
 
 
> On Feb 8, 2016, at 2:57 PM, pierre lacave  > wrote:
> 
> This is the wrong client jar try with the one named 
> phoenix-4.7.0-HBase-1.1-client-spark.jar 
> 
> 
> On Mon, 8 Feb 2016, 22:29 Benjamin Kim  > wrote:
> Hi Josh,
> 
> I tried again by putting the settings within the spark-default.conf.
> 
> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
> 
> I still get the same error using the code below.
> 
> import org.apache.phoenix.spark._
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
> 
> Can you tell me what else you’re doing?
> 
> Thanks,
> Ben
> 
> 
>> On Feb 8, 2016, at 1:44 PM, Josh Mahonin > 

Re: Spark Phoenix Plugin

2016-02-12 Thread Benjamin Kim
Anyone know when Phoenix 4.7 will be officially released? And what Cloudera 
distribution versions will it be compatible with?

Thanks,
Ben

> On Feb 10, 2016, at 11:03 AM, Benjamin Kim  wrote:
> 
> Hi Pierre,
> 
> I am getting this error now.
> 
> Error: org.apache.phoenix.exception.PhoenixIOException: 
> org.apache.hadoop.hbase.DoNotRetryIOException: 
> SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.: 
> org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;
> 
> I even tried to use sqlline.py to do some queries too. It resulted in the 
> same error. I followed the installation instructions. Is there something 
> missing?
> 
> Thanks,
> Ben
> 
> 
>> On Feb 9, 2016, at 10:20 AM, Ravi Kiran > > wrote:
>> 
>> Hi Pierre,
>> 
>>   Try your luck for building the artifacts from 
>> https://github.com/chiastic-security/phoenix-for-cloudera 
>> . Hopefully it 
>> helps.
>> 
>> Regards
>> Ravi .
>> 
>> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim > > wrote:
>> Hi Pierre,
>> 
>> I found this article about how Cloudera’s version of HBase is very different 
>> than Apache HBase so it must be compiled using Cloudera’s repo and versions. 
>> But, I’m not having any success with it.
>> 
>> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>>  
>> 
>> 
>> There’s also a Chinese site that does the same thing.
>> 
>> https://www.zybuluo.com/xtccc/note/205739 
>> 
>> 
>> I keep getting errors like the one’s below.
>> 
>> [ERROR] 
>> /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
>>  cannot find symbol
>> [ERROR] symbol:   class Region
>> [ERROR] location: class org.apache.hadoop.hbase.regionserver.LocalIndexMerger
>> …
>> 
>> Have you tried this also?
>> 
>> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s 
>> HBase.
>> 
>> Thanks,
>> Ben
>> 
>> 
>>> On Feb 8, 2016, at 11:04 PM, pierre lacave >> > wrote:
>>> 
>>> Havent met that one.
>>> 
>>> According to SPARK-1867, the real issue is hidden.
>>> 
>>> I d process by elimination, maybe try in local[*] mode first
>>> 
>>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867 
>>> 
>>> On Tue, 9 Feb 2016, 04:58 Benjamin Kim >> > wrote:
>>> Pierre,
>>> 
>>> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But, now, 
>>> I get this error:
>>> 
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
>>> in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 
>>> 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com 
>>> ): 
>>> java.lang.IllegalStateException: unread block data
>>> 
>>> It happens when I do:
>>> 
>>> df.show()
>>> 
>>> Getting closer…
>>> 
>>> Thanks,
>>> Ben
>>> 
>>> 
>>> 
 On Feb 8, 2016, at 2:57 PM, pierre lacave > wrote:
 
 This is the wrong client jar try with the one named 
 phoenix-4.7.0-HBase-1.1-client-spark.jar 
 
 
 On Mon, 8 Feb 2016, 22:29 Benjamin Kim > wrote:
 Hi Josh,
 
 I tried again by putting the settings within the spark-default.conf.
 
 spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
 spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
 
 I still get the same error using the code below.
 
 import org.apache.phoenix.spark._
 val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
 "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
 
 Can you tell me what else you’re doing?
 
 Thanks,
 Ben
 
 
> On Feb 8, 2016, at 1:44 PM, Josh Mahonin  > wrote:
> 
> Hi Ben,
> 
> I'm not sure about the format of those command line options you're 
> passing. I've had success with spark-shell just by setting the 
> 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options 
> on the spark config, as per the docs [1].
> 
> I'm not sure if there's anything special needed for CDH or not though. I 
> also have a docker image I've been toying with which has a working 
> Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might 
> be a 

Re: Spark Phoenix Plugin

2016-02-10 Thread Benjamin Kim
Hi Pierre,

I am getting this error now.

Error: org.apache.phoenix.exception.PhoenixIOException: 
org.apache.hadoop.hbase.DoNotRetryIOException: 
SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.: 
org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;

I even tried to use sqlline.py to do some queries too. It resulted in the same 
error. I followed the installation instructions. Is there something missing?

Thanks,
Ben


> On Feb 9, 2016, at 10:20 AM, Ravi Kiran  wrote:
> 
> Hi Pierre,
> 
>   Try your luck for building the artifacts from 
> https://github.com/chiastic-security/phoenix-for-cloudera 
> . Hopefully it 
> helps.
> 
> Regards
> Ravi .
> 
> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim  > wrote:
> Hi Pierre,
> 
> I found this article about how Cloudera’s version of HBase is very different 
> than Apache HBase so it must be compiled using Cloudera’s repo and versions. 
> But, I’m not having any success with it.
> 
> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>  
> 
> 
> There’s also a Chinese site that does the same thing.
> 
> https://www.zybuluo.com/xtccc/note/205739 
> 
> 
> I keep getting errors like the one’s below.
> 
> [ERROR] 
> /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
>  cannot find symbol
> [ERROR] symbol:   class Region
> [ERROR] location: class org.apache.hadoop.hbase.regionserver.LocalIndexMerger
> …
> 
> Have you tried this also?
> 
> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s HBase.
> 
> Thanks,
> Ben
> 
> 
>> On Feb 8, 2016, at 11:04 PM, pierre lacave > > wrote:
>> 
>> Havent met that one.
>> 
>> According to SPARK-1867, the real issue is hidden.
>> 
>> I d process by elimination, maybe try in local[*] mode first
>> 
>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867 
>> 
>> On Tue, 9 Feb 2016, 04:58 Benjamin Kim > > wrote:
>> Pierre,
>> 
>> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But, now, I 
>> get this error:
>> 
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
>> stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 
>> (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com 
>> ): 
>> java.lang.IllegalStateException: unread block data
>> 
>> It happens when I do:
>> 
>> df.show()
>> 
>> Getting closer…
>> 
>> Thanks,
>> Ben
>> 
>> 
>> 
>>> On Feb 8, 2016, at 2:57 PM, pierre lacave >> > wrote:
>>> 
>>> This is the wrong client jar try with the one named 
>>> phoenix-4.7.0-HBase-1.1-client-spark.jar 
>>> 
>>> 
>>> On Mon, 8 Feb 2016, 22:29 Benjamin Kim >> > wrote:
>>> Hi Josh,
>>> 
>>> I tried again by putting the settings within the spark-default.conf.
>>> 
>>> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>> 
>>> I still get the same error using the code below.
>>> 
>>> import org.apache.phoenix.spark._
>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
>>> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
>>> 
>>> Can you tell me what else you’re doing?
>>> 
>>> Thanks,
>>> Ben
>>> 
>>> 
 On Feb 8, 2016, at 1:44 PM, Josh Mahonin > wrote:
 
 Hi Ben,
 
 I'm not sure about the format of those command line options you're 
 passing. I've had success with spark-shell just by setting the 
 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options 
 on the spark config, as per the docs [1].
 
 I'm not sure if there's anything special needed for CDH or not though. I 
 also have a docker image I've been toying with which has a working 
 Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might 
 be a useful reference for you as well [2].
 
 Good luck,
 
 Josh
 
 [1] https://phoenix.apache.org/phoenix_spark.html 
 
 [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark 
 
 
 On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim 

Re: Spark Phoenix Plugin

2016-02-09 Thread Benjamin Kim
Hi Ravi,

I see that the version is still 4.6. Does it include the fix for the Spark 
plugin? https://issues.apache.org/jira/browse/PHOENIX-2503 


This is the main reason I need it.

Thanks,
Ben

> On Feb 9, 2016, at 10:20 AM, Ravi Kiran  wrote:
> 
> Hi Pierre,
> 
>   Try your luck for building the artifacts from 
> https://github.com/chiastic-security/phoenix-for-cloudera 
> . Hopefully it 
> helps.
> 
> Regards
> Ravi .
> 
> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim  > wrote:
> Hi Pierre,
> 
> I found this article about how Cloudera’s version of HBase is very different 
> than Apache HBase so it must be compiled using Cloudera’s repo and versions. 
> But, I’m not having any success with it.
> 
> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>  
> 
> 
> There’s also a Chinese site that does the same thing.
> 
> https://www.zybuluo.com/xtccc/note/205739 
> 
> 
> I keep getting errors like the one’s below.
> 
> [ERROR] 
> /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
>  cannot find symbol
> [ERROR] symbol:   class Region
> [ERROR] location: class org.apache.hadoop.hbase.regionserver.LocalIndexMerger
> …
> 
> Have you tried this also?
> 
> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s HBase.
> 
> Thanks,
> Ben
> 
> 
>> On Feb 8, 2016, at 11:04 PM, pierre lacave > > wrote:
>> 
>> Havent met that one.
>> 
>> According to SPARK-1867, the real issue is hidden.
>> 
>> I d process by elimination, maybe try in local[*] mode first
>> 
>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867 
>> 
>> On Tue, 9 Feb 2016, 04:58 Benjamin Kim > > wrote:
>> Pierre,
>> 
>> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But, now, I 
>> get this error:
>> 
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
>> stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 
>> (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com 
>> ): 
>> java.lang.IllegalStateException: unread block data
>> 
>> It happens when I do:
>> 
>> df.show()
>> 
>> Getting closer…
>> 
>> Thanks,
>> Ben
>> 
>> 
>> 
>>> On Feb 8, 2016, at 2:57 PM, pierre lacave >> > wrote:
>>> 
>>> This is the wrong client jar try with the one named 
>>> phoenix-4.7.0-HBase-1.1-client-spark.jar 
>>> 
>>> 
>>> On Mon, 8 Feb 2016, 22:29 Benjamin Kim >> > wrote:
>>> Hi Josh,
>>> 
>>> I tried again by putting the settings within the spark-default.conf.
>>> 
>>> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>> 
>>> I still get the same error using the code below.
>>> 
>>> import org.apache.phoenix.spark._
>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
>>> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
>>> 
>>> Can you tell me what else you’re doing?
>>> 
>>> Thanks,
>>> Ben
>>> 
>>> 
 On Feb 8, 2016, at 1:44 PM, Josh Mahonin > wrote:
 
 Hi Ben,
 
 I'm not sure about the format of those command line options you're 
 passing. I've had success with spark-shell just by setting the 
 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options 
 on the spark config, as per the docs [1].
 
 I'm not sure if there's anything special needed for CDH or not though. I 
 also have a docker image I've been toying with which has a working 
 Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might 
 be a useful reference for you as well [2].
 
 Good luck,
 
 Josh
 
 [1] https://phoenix.apache.org/phoenix_spark.html 
 
 [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark 
 
 
 On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim > wrote:
 Hi Pierre,
 
 I tried to run in spark-shell using spark 1.6.0 by running this:
 
 spark-shell --master yarn-client --driver-class-path 
 /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar 
 

Re: Spark Phoenix Plugin

2016-02-09 Thread Ravi Kiran
Hi Pierre,

  Try your luck building the artifacts from
https://github.com/chiastic-security/phoenix-for-cloudera. Hopefully it
helps.

Regards
Ravi .

On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim  wrote:

> Hi Pierre,
>
> I found this article about how Cloudera’s version of HBase is very
> different than Apache HBase so it must be compiled using Cloudera’s repo
> and versions. But, I’m not having any success with it.
>
>
> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>
> There’s also a Chinese site that does the same thing.
>
> https://www.zybuluo.com/xtccc/note/205739
>
> I keep getting errors like the one’s below.
>
> [ERROR]
> /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
> cannot find symbol
> [ERROR] symbol:   class Region
> [ERROR] location: class
> org.apache.hadoop.hbase.regionserver.LocalIndexMerger
> …
>
> Have you tried this also?
>
> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s
> HBase.
>
> Thanks,
> Ben
>
>
> On Feb 8, 2016, at 11:04 PM, pierre lacave  wrote:
>
> Havent met that one.
>
> According to SPARK-1867, the real issue is hidden.
>
> I d process by elimination, maybe try in local[*] mode first
>
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867
>
> On Tue, 9 Feb 2016, 04:58 Benjamin Kim  wrote:
>
>> Pierre,
>>
>> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But,
>> now, I get this error:
>>
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
>> in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage
>> 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com):
>> java.lang.IllegalStateException: unread block data
>>
>> It happens when I do:
>>
>> df.show()
>>
>> Getting closer…
>>
>> Thanks,
>> Ben
>>
>>
>>
>> On Feb 8, 2016, at 2:57 PM, pierre lacave  wrote:
>>
>> This is the wrong client jar try with the one named
>> phoenix-4.7.0-HBase-1.1-client-spark.jar
>>
>> On Mon, 8 Feb 2016, 22:29 Benjamin Kim  wrote:
>>
>>> Hi Josh,
>>>
>>> I tried again by putting the settings within the spark-default.conf.
>>>
>>>
>>> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>>
>>> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>>
>>> I still get the same error using the code below.
>>>
>>> import org.apache.phoenix.spark._
>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
>>> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
>>>
>>> Can you tell me what else you’re doing?
>>>
>>> Thanks,
>>> Ben
>>>
>>>
>>> On Feb 8, 2016, at 1:44 PM, Josh Mahonin  wrote:
>>>
>>> Hi Ben,
>>>
>>> I'm not sure about the format of those command line options you're
>>> passing. I've had success with spark-shell just by setting the
>>> 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options
>>> on the spark config, as per the docs [1].
>>>
>>> I'm not sure if there's anything special needed for CDH or not though. I
>>> also have a docker image I've been toying with which has a working
>>> Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be
>>> a useful reference for you as well [2].
>>>
>>> Good luck,
>>>
>>> Josh
>>>
>>> [1] https://phoenix.apache.org/phoenix_spark.html
>>> [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>>>
>>> On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim  wrote:
>>>
 Hi Pierre,

 I tried to run in spark-shell using spark 1.6.0 by running this:

 spark-shell --master yarn-client --driver-class-path
 /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options
 "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar”

 The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.

 When I get to the line:

 val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
 “TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181”))

 I get this error:

 java.lang.NoClassDefFoundError: Could not initialize class
 org.apache.spark.rdd.RDDOperationScope$

 Any ideas?

 Thanks,
 Ben


 On Feb 5, 2016, at 1:36 PM, pierre lacave  wrote:

 I don't know when the full release will be, RC1 just got pulled out,
 and expecting RC2 soon

 you can find them here

 https://dist.apache.org/repos/dist/dev/phoenix/


 there is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you
 need to have in spark classpath


 *Pierre Lacave*
 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
 Phone :   +353879128708

 On Fri, Feb 5, 2016 at 9:28 

Re: Spark Phoenix Plugin

2016-02-09 Thread Benjamin Kim
Hi Pierre,

I found this article about how Cloudera’s version of HBase is very different 
from Apache HBase, so Phoenix must be compiled using Cloudera’s repo and versions. 
But I’m not having any success with it.

http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo

There’s also a Chinese site that does the same thing.

https://www.zybuluo.com/xtccc/note/205739

I keep getting errors like the ones below.

[ERROR] 
/opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
 cannot find symbol
[ERROR] symbol:   class Region
[ERROR] location: class org.apache.hadoop.hbase.regionserver.LocalIndexMerger
…

Have you tried this also?

As a last resort, we will have to abandon Cloudera’s HBase for Apache’s HBase.

Thanks,
Ben


> On Feb 8, 2016, at 11:04 PM, pierre lacave  wrote:
> 
> Havent met that one.
> 
> According to SPARK-1867, the real issue is hidden.
> 
> I d process by elimination, maybe try in local[*] mode first
> 
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867 
> 
> On Tue, 9 Feb 2016, 04:58 Benjamin Kim  > wrote:
> Pierre,
> 
> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But, now, I 
> get this error:
> 
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
> stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 
> (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com 
> ): 
> java.lang.IllegalStateException: unread block data
> 
> It happens when I do:
> 
> df.show()
> 
> Getting closer…
> 
> Thanks,
> Ben
> 
> 
> 
>> On Feb 8, 2016, at 2:57 PM, pierre lacave > > wrote:
>> 
>> This is the wrong client jar try with the one named 
>> phoenix-4.7.0-HBase-1.1-client-spark.jar 
>> 
>> 
>> On Mon, 8 Feb 2016, 22:29 Benjamin Kim > > wrote:
>> Hi Josh,
>> 
>> I tried again by putting the settings within the spark-default.conf.
>> 
>> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>> 
>> I still get the same error using the code below.
>> 
>> import org.apache.phoenix.spark._
>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
>> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
>> 
>> Can you tell me what else you’re doing?
>> 
>> Thanks,
>> Ben
>> 
>> 
>>> On Feb 8, 2016, at 1:44 PM, Josh Mahonin >> > wrote:
>>> 
>>> Hi Ben,
>>> 
>>> I'm not sure about the format of those command line options you're passing. 
>>> I've had success with spark-shell just by setting the 
>>> 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options 
>>> on the spark config, as per the docs [1].
>>> 
>>> I'm not sure if there's anything special needed for CDH or not though. I 
>>> also have a docker image I've been toying with which has a working 
>>> Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be 
>>> a useful reference for you as well [2].
>>> 
>>> Good luck,
>>> 
>>> Josh
>>> 
>>> [1] https://phoenix.apache.org/phoenix_spark.html 
>>> 
>>> [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark 
>>> 
>>> 
>>> On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim >> > wrote:
>>> Hi Pierre,
>>> 
>>> I tried to run in spark-shell using spark 1.6.0 by running this:
>>> 
>>> spark-shell --master yarn-client --driver-class-path 
>>> /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options 
>>> "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar”
>>> 
>>> The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.
>>> 
>>> When I get to the line:
>>> 
>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
>>> “TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181”))
>>> 
>>> I get this error:
>>> 
>>> java.lang.NoClassDefFoundError: Could not initialize class 
>>> org.apache.spark.rdd.RDDOperationScope$
>>> 
>>> Any ideas?
>>> 
>>> Thanks,
>>> Ben
>>> 
>>> 
 On Feb 5, 2016, at 1:36 PM, pierre lacave > wrote:
 
 I don't know when the full release will be, RC1 just got pulled out, and 
 expecting RC2 soon
 
 you can find them here 
 
 https://dist.apache.org/repos/dist/dev/phoenix/ 
 
 
 
 there is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you 
 need 

Re: Spark Phoenix Plugin

2016-02-08 Thread pierre lacave
Haven't met that one.

According to SPARK-1867, the real issue is hidden.

I'd proceed by elimination; maybe try in local[*] mode first.

https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867

On Tue, 9 Feb 2016, 04:58 Benjamin Kim  wrote:

> Pierre,
>
> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But, now,
> I get this error:
>
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
> in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage
> 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com):
> java.lang.IllegalStateException: unread block data
>
> It happens when I do:
>
> df.show()
>
> Getting closer…
>
> Thanks,
> Ben
>
>
>
> On Feb 8, 2016, at 2:57 PM, pierre lacave  wrote:
>
> This is the wrong client jar try with the one named
> phoenix-4.7.0-HBase-1.1-client-spark.jar
>
> On Mon, 8 Feb 2016, 22:29 Benjamin Kim  wrote:
>
>> Hi Josh,
>>
>> I tried again by putting the settings within the spark-default.conf.
>>
>>
>> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>
>> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>
>> I still get the same error using the code below.
>>
>> import org.apache.phoenix.spark._
>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
>> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
>>
>> Can you tell me what else you’re doing?
>>
>> Thanks,
>> Ben
>>
>>
>> On Feb 8, 2016, at 1:44 PM, Josh Mahonin  wrote:
>>
>> Hi Ben,
>>
>> I'm not sure about the format of those command line options you're
>> passing. I've had success with spark-shell just by setting the
>> 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options
>> on the spark config, as per the docs [1].
>>
>> I'm not sure if there's anything special needed for CDH or not though. I
>> also have a docker image I've been toying with which has a working
>> Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be
>> a useful reference for you as well [2].
>>
>> Good luck,
>>
>> Josh
>>
>> [1] https://phoenix.apache.org/phoenix_spark.html
>> [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>>
>> On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim  wrote:
>>
>>> Hi Pierre,
>>>
>>> I tried to run in spark-shell using spark 1.6.0 by running this:
>>>
>>> spark-shell --master yarn-client --driver-class-path
>>> /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options
>>> "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar”
>>>
>>> The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.
>>>
>>> When I get to the line:
>>>
>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
>>> “TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181”))
>>>
>>> I get this error:
>>>
>>> java.lang.NoClassDefFoundError: Could not initialize class
>>> org.apache.spark.rdd.RDDOperationScope$
>>>
>>> Any ideas?
>>>
>>> Thanks,
>>> Ben
>>>
>>>
>>> On Feb 5, 2016, at 1:36 PM, pierre lacave  wrote:
>>>
>>> I don't know when the full release will be, RC1 just got pulled out, and
>>> expecting RC2 soon
>>>
>>> you can find them here
>>>
>>> https://dist.apache.org/repos/dist/dev/phoenix/
>>>
>>>
>>> there is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you
>>> need to have in spark classpath
>>>
>>>
>>> *Pierre Lacave*
>>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
>>> Phone :   +353879128708
>>>
>>> On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim  wrote:
>>>
 Hi Pierre,

 When will I be able to download this version?

 Thanks,
 Ben


 On Friday, February 5, 2016, pierre lacave  wrote:

> This was addressed in Phoenix 4.7 (currently in RC)
> https://issues.apache.org/jira/browse/PHOENIX-2503
>
>
>
>
> *Pierre Lacave*
> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
> Phone :   +353879128708
>
> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim 
> wrote:
>
>> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and
>> Spark 1.6. When I try to launch spark-shell, I get:
>>
>> java.lang.RuntimeException: java.lang.RuntimeException:
>> Unable to instantiate
>> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>
>> I continue on and run the example code. When I get to the line below:
>>
>> val df = sqlContext.load("org.apache.phoenix.spark",
>> Map("table" -> "TEST.MY_TEST", "zkUrl" ->
>> "zookeeper1,zookeeper2,zookeeper3:2181"))
>>
>> I get this error:
>>
>> java.lang.NoSuchMethodError:
>> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;

Re: Spark Phoenix Plugin

2016-02-08 Thread Benjamin Kim
Pierre,

I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But, now, I 
get this error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 
3, prod-dc1-datanode151.pdc1i.gradientx.com): java.lang.IllegalStateException: 
unread block data

It happens when I do:

df.show()

Getting closer…

Thanks,
Ben
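
For readers hitting the same wall: "java.lang.IllegalStateException: unread block data" during an action such as df.show() is often a sign that the driver and the executors are not loading the same Phoenix client jar (or the same version of it), so the executors fail to deserialize task data. A minimal sketch of a launch that keeps the two classpaths in sync, assuming the jar exists at the same path on every node (the path itself is illustrative):

spark-shell --master yarn-client \
  --conf spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client-spark.jar \
  --conf spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client-spark.jar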



> On Feb 8, 2016, at 2:57 PM, pierre lacave  wrote:
> 
> This is the wrong client jar try with the one named 
> phoenix-4.7.0-HBase-1.1-client-spark.jar 
> 
> 
> On Mon, 8 Feb 2016, 22:29 Benjamin Kim  wrote:
> Hi Josh,
> 
> I tried again by putting the settings within the spark-default.conf.
> 
> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
> 
> I still get the same error using the code below.
> 
> import org.apache.phoenix.spark._
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
> 
> Can you tell me what else you’re doing?
> 
> Thanks,
> Ben
> 
> 
>> On Feb 8, 2016, at 1:44 PM, Josh Mahonin  wrote:
>> 
>> Hi Ben,
>> 
>> I'm not sure about the format of those command line options you're passing. 
>> I've had success with spark-shell just by setting the 
>> 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options on 
>> the spark config, as per the docs [1].
>> 
>> I'm not sure if there's anything special needed for CDH or not though. I 
>> also have a docker image I've been toying with which has a working 
>> Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be 
>> a useful reference for you as well [2].
>> 
>> Good luck,
>> 
>> Josh
>> 
>> [1] https://phoenix.apache.org/phoenix_spark.html 
>> 
>> [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark 
>> 
>> 
>> On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim  wrote:
>> Hi Pierre,
>> 
>> I tried to run in spark-shell using spark 1.6.0 by running this:
>> 
>> spark-shell --master yarn-client --driver-class-path 
>> /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options 
>> "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar”
>> 
>> The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.
>> 
>> When I get to the line:
>> 
>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
>> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
>> 
>> I get this error:
>> 
>> java.lang.NoClassDefFoundError: Could not initialize class 
>> org.apache.spark.rdd.RDDOperationScope$
>> 
>> Any ideas?
>> 
>> Thanks,
>> Ben
>> 
>> 
>>> On Feb 5, 2016, at 1:36 PM, pierre lacave  wrote:
>>> 
>>> I don't know when the full release will be, RC1 just got pulled out, and 
>>> expecting RC2 soon
>>> 
>>> you can find them here 
>>> 
>>> https://dist.apache.org/repos/dist/dev/phoenix/ 
>>> 
>>> 
>>> 
>>> there is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you 
>>> need to have in spark classpath
>>> 
>>> 
>>> Pierre Lacave
>>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
>>> Phone :   +353879128708 
>>> 
>>> On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim  wrote:
>>> Hi Pierre,
>>> 
>>> When will I be able to download this version?
>>> 
>>> Thanks,
>>> Ben
>>> 
>>> 
>>> On Friday, February 5, 2016, pierre lacave  wrote:
>>> This was addressed in Phoenix 4.7 (currently in RC) 
>>> https://issues.apache.org/jira/browse/PHOENIX-2503 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Pierre Lacave
>>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
>>> Phone :   +353879128708 
>>> 
>>> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim  wrote:
>>> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and Spark 
>>> 1.6. When I try to launch spark-shell, I get:
>>> 
>>> java.lang.RuntimeException: java.lang.RuntimeException: Unable to 
>>> instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>> 
>>> I continue on and run the example code. When I get to the line below:
>>>
>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
>>> "TEST.MY_TEST", "zkUrl" -> "zookeeper1,zookeeper2,zookeeper3:2181"))
>>> 
>>> I get this error:
>>>
>>> java.lang.NoSuchMethodError:
>>> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;

Re: Spark Phoenix Plugin

2016-02-08 Thread Benjamin Kim
Hi Pierre,

I tried to run in spark-shell using spark 1.6.0 by running this:

spark-shell --master yarn-client --driver-class-path 
/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options 
"-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar”

The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.

When I get to the line:

val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
"TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))

I get this error:

java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.spark.rdd.RDDOperationScope$

Any ideas?

Thanks,
Ben
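
A note for anyone replaying this: "NoClassDefFoundError: Could not initialize class org.apache.spark.rdd.RDDOperationScope$" means that class's static initializer has already failed once, and with the monolithic Phoenix client jar on the classpath the usual culprit is a Jackson clash between the jar's bundled dependencies and the ones Spark ships (RDDOperationScope sets up a Jackson ObjectMapper when it is first loaded). A quick, purely illustrative way to see whether a given client jar bundles its own Jackson classes (the path is a placeholder):

jar tf /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar | grep 'com/fasterxml/jackson' | head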


> On Feb 5, 2016, at 1:36 PM, pierre lacave  wrote:
> 
> I don't know when the full release will be, RC1 just got pulled out, and 
> expecting RC2 soon
> 
> you can find them here 
> 
> https://dist.apache.org/repos/dist/dev/phoenix/ 
> 
> 
> 
> there is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you need 
> to have in spark classpath
> 
> 
> Pierre Lacave
> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
> Phone :   +353879128708
> 
> On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim  wrote:
> Hi Pierre,
> 
> When will I be able to download this version?
> 
> Thanks,
> Ben
> 
> 
> On Friday, February 5, 2016, pierre lacave  wrote:
> This was addressed in Phoenix 4.7 (currently in RC) 
> https://issues.apache.org/jira/browse/PHOENIX-2503 
> 
> 
> 
> 
> 
> Pierre Lacave
> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
> Phone :   +353879128708 
> 
> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim  wrote:
> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and Spark 
> 1.6. When I try to launch spark-shell, I get:
> 
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to 
> instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> 
> I continue on and run the example code. When I get to the line below:
>
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
> "TEST.MY_TEST", "zkUrl" -> "zookeeper1,zookeeper2,zookeeper3:2181"))
> 
> I get this error:
> 
> java.lang.NoSuchMethodError: 
> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
> 
> Can someone help?
> 
> Thanks,
> Ben
> 
> 



Re: Spark Phoenix Plugin

2016-02-08 Thread Josh Mahonin
Hi Ben,

I'm not sure about the format of those command line options you're passing.
I've had success with spark-shell just by setting the
'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options
on the spark config, as per the docs [1].
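
For reference, once those two properties point at the spark-enabled client jar, a minimal read in the 1.6 shell looks roughly like this (the table name and ZooKeeper quorum are placeholders, not values from this thread):

val df = sqlContext.read
  .format("org.apache.phoenix.spark")
  .option("table", "TEST.MY_TEST")
  .option("zkUrl", "zk1,zk2,zk3:2181")
  .load()
df.show()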

I'm not sure if there's anything special needed for CDH or not though. I
also have a docker image I've been toying with which has a working
Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be
a useful reference for you as well [2].

Good luck,

Josh

[1] https://phoenix.apache.org/phoenix_spark.html
[2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark

On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim  wrote:

> Hi Pierre,
>
> I tried to run in spark-shell using spark 1.6.0 by running this:
>
> spark-shell --master yarn-client --driver-class-path
> /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options
> "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar”
>
> The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.
>
> When I get to the line:
>
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
>
> I get this error:
>
> java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.spark.rdd.RDDOperationScope$
>
> Any ideas?
>
> Thanks,
> Ben
>
>
> On Feb 5, 2016, at 1:36 PM, pierre lacave  wrote:
>
> I don't know when the full release will be, RC1 just got pulled out, and
> expecting RC2 soon
>
> you can find them here
>
> https://dist.apache.org/repos/dist/dev/phoenix/
>
>
> there is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you
> need to have in spark classpath
>
>
> *Pierre Lacave*
> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
> Phone :   +353879128708
>
> On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim  wrote:
>
>> Hi Pierre,
>>
>> When will I be able to download this version?
>>
>> Thanks,
>> Ben
>>
>>
>> On Friday, February 5, 2016, pierre lacave  wrote:
>>
>>> This was addressed in Phoenix 4.7 (currently in RC)
>>> https://issues.apache.org/jira/browse/PHOENIX-2503
>>>
>>>
>>>
>>>
>>> *Pierre Lacave*
>>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
>>> Phone :   +353879128708
>>>
>>> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim  wrote:
>>>
 I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and
 Spark 1.6. When I try to launch spark-shell, I get:

 java.lang.RuntimeException: java.lang.RuntimeException: Unable
 to instantiate 
 org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

 I continue on and run the example code. When I get to the line below:

 val df = sqlContext.load("org.apache.phoenix.spark",
 Map("table" -> "TEST.MY_TEST", "zkUrl" ->
 "zookeeper1,zookeeper2,zookeeper3:2181"))

 I get this error:

 java.lang.NoSuchMethodError:
 com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;

 Can someone help?

 Thanks,
 Ben
>>>
>>>
>>>
>
>


Re: Spark Phoenix Plugin

2016-02-08 Thread Benjamin Kim
Hi Josh,

I tried again by putting the settings within the spark-default.conf.

spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar

I still get the same error using the code below.

import org.apache.phoenix.spark._
val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
"TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))

Can you tell me what else you’re doing?

Thanks,
Ben


> On Feb 8, 2016, at 1:44 PM, Josh Mahonin  wrote:
> 
> Hi Ben,
> 
> I'm not sure about the format of those command line options you're passing. 
> I've had success with spark-shell just by setting the 
> 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options on 
> the spark config, as per the docs [1].
> 
> I'm not sure if there's anything special needed for CDH or not though. I also 
> have a docker image I've been toying with which has a working Spark/Phoenix 
> setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be a useful 
> reference for you as well [2].
> 
> Good luck,
> 
> Josh
> 
> [1] https://phoenix.apache.org/phoenix_spark.html 
> 
> [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark 
> 
> 
> On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim  wrote:
> Hi Pierre,
> 
> I tried to run in spark-shell using spark 1.6.0 by running this:
> 
> spark-shell --master yarn-client --driver-class-path 
> /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options 
> "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar”
> 
> The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.
> 
> When I get to the line:
> 
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> 
> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
> 
> I get this error:
> 
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.spark.rdd.RDDOperationScope$
> 
> Any ideas?
> 
> Thanks,
> Ben
> 
> 
>> On Feb 5, 2016, at 1:36 PM, pierre lacave  wrote:
>> 
>> I don't know when the full release will be, RC1 just got pulled out, and 
>> expecting RC2 soon
>> 
>> you can find them here 
>> 
>> https://dist.apache.org/repos/dist/dev/phoenix/ 
>> 
>> 
>> 
>> there is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you need 
>> to have in spark classpath
>> 
>> 
>> Pierre Lacave
>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
>> Phone :   +353879128708 
>> 
>> On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim  wrote:
>> Hi Pierre,
>> 
>> When will I be able to download this version?
>> 
>> Thanks,
>> Ben
>> 
>> 
>> On Friday, February 5, 2016, pierre lacave  wrote:
>> This was addressed in Phoenix 4.7 (currently in RC) 
>> https://issues.apache.org/jira/browse/PHOENIX-2503 
>> 
>> 
>> 
>> 
>> 
>> Pierre Lacave
>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
>> Phone :   +353879128708 
>> 
>> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim  wrote:
>> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and Spark 
>> 1.6. When I try to launch spark-shell, I get:
>> 
>> java.lang.RuntimeException: java.lang.RuntimeException: Unable to 
>> instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>> 
>> I continue on and run the example code. When I get to the line below:
>>
>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
>> "TEST.MY_TEST", "zkUrl" -> "zookeeper1,zookeeper2,zookeeper3:2181"))
>> 
>> I get this error:
>> 
>> java.lang.NoSuchMethodError: 
>> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>> 
>> Can someone help?
>> 
>> Thanks,
>> Ben
>> 
>> 
> 
> 



Re: Spark Phoenix Plugin

2016-02-08 Thread pierre lacave
This is the wrong client jar try with the one named
phoenix-4.7.0-HBase-1.1-client-spark.jar
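
Concretely, that means the spark-defaults.conf entries quoted below should name the spark-specific client jar rather than the plain fat client jar, e.g. (the path is illustrative, and the HBase-1.0 build of the same jar would be the one to use if that matches the cluster, as Ben reports elsewhere in the thread):

spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.1-client-spark.jar
spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.1-client-spark.jar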

On Mon, 8 Feb 2016, 22:29 Benjamin Kim  wrote:

> Hi Josh,
>
> I tried again by putting the settings within the spark-default.conf.
>
>
> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>
> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>
> I still get the same error using the code below.
>
> import org.apache.phoenix.spark._
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
> "TEST.MY_TEST", "zkUrl" -> “zk1,zk2,zk3:2181"))
>
> Can you tell me what else you’re doing?
>
> Thanks,
> Ben
>
>
> On Feb 8, 2016, at 1:44 PM, Josh Mahonin  wrote:
>
> Hi Ben,
>
> I'm not sure about the format of those command line options you're
> passing. I've had success with spark-shell just by setting the
> 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options
> on the spark config, as per the docs [1].
>
> I'm not sure if there's anything special needed for CDH or not though. I
> also have a docker image I've been toying with which has a working
> Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be
> a useful reference for you as well [2].
>
> Good luck,
>
> Josh
>
> [1] https://phoenix.apache.org/phoenix_spark.html
> [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>
> On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim  wrote:
>
>> Hi Pierre,
>>
>> I tried to run in spark-shell using spark 1.6.0 by running this:
>>
>> spark-shell --master yarn-client --driver-class-path
>> /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options
>> "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar”
>>
>> The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.
>>
>> When I get to the line:
>>
>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
>> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
>>
>> I get this error:
>>
>> java.lang.NoClassDefFoundError: Could not initialize class
>> org.apache.spark.rdd.RDDOperationScope$
>>
>> Any ideas?
>>
>> Thanks,
>> Ben
>>
>>
>> On Feb 5, 2016, at 1:36 PM, pierre lacave  wrote:
>>
>> I don't know when the full release will be, RC1 just got pulled out, and
>> expecting RC2 soon
>>
>> you can find them here
>>
>> https://dist.apache.org/repos/dist/dev/phoenix/
>>
>>
>> there is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you
>> need to have in spark classpath
>>
>>
>> *Pierre Lacave*
>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
>> Phone :   +353879128708
>>
>> On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim  wrote:
>>
>>> Hi Pierre,
>>>
>>> When will I be able to download this version?
>>>
>>> Thanks,
>>> Ben
>>>
>>>
>>> On Friday, February 5, 2016, pierre lacave  wrote:
>>>
 This was addressed in Phoenix 4.7 (currently in RC)
 https://issues.apache.org/jira/browse/PHOENIX-2503




 *Pierre Lacave*
 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
 Phone :   +353879128708

 On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim 
 wrote:

> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and
> Spark 1.6. When I try to launch spark-shell, I get:
>
> java.lang.RuntimeException: java.lang.RuntimeException: Unable
> to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>
> I continue on and run the example code. When I get to the line below:
>
> val df = sqlContext.load("org.apache.phoenix.spark",
> Map("table" -> "TEST.MY_TEST", "zkUrl" ->
> "zookeeper1,zookeeper2,zookeeper3:2181"))
>
> I get this error:
>
> java.lang.NoSuchMethodError:
> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>
> Can someone help?
>
> Thanks,
> Ben



>>
>>
>
>


Re: Spark Phoenix Plugin

2016-02-05 Thread pierre lacave
This was addressed in Phoenix 4.7 (currently in RC)
https://issues.apache.org/jira/browse/PHOENIX-2503
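
For context, the NoSuchMethodError on BigDecimalDeserializer quoted below is the kind of classpath clash PHOENIX-2503 deals with: the monolithic client jar bundles dependencies (Jackson among them) that can shadow the versions Spark expects. One illustrative way to compare what each jar ships (both paths are placeholders):

jar tf /opt/tools/phoenix/phoenix-4.5.2-HBase-1.0-client.jar | grep 'com/fasterxml/jackson' | head
jar tf $SPARK_HOME/lib/spark-assembly-*.jar | grep 'com/fasterxml/jackson/databind/ObjectMapper.class'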




*Pierre Lacave*
171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
Phone :   +353879128708

On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim  wrote:

> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and
> Spark 1.6. When I try to launch spark-shell, I get:
>
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to
> instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>
> I continue on and run the example code. When I get to the line below:
>
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table"
> -> "TEST.MY_TEST", "zkUrl" -> "zookeeper1,zookeeper2,zookeeper3:2181"))
>
> I get this error:
>
> java.lang.NoSuchMethodError:
> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>
> Can someone help?
>
> Thanks,
> Ben


Re: Spark Phoenix Plugin

2016-02-05 Thread Benjamin Kim
Hi Pierre,

When will I be able to download this version?

Thanks,
Ben

On Friday, February 5, 2016, pierre lacave  wrote:

> This was addressed in Phoenix 4.7 (currently in RC)
> https://issues.apache.org/jira/browse/PHOENIX-2503
>
>
>
>
> *Pierre Lacave*
> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
> Phone :   +353879128708
>
> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim  wrote:
>
>> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and
>> Spark 1.6. When I try to launch spark-shell, I get:
>>
>> java.lang.RuntimeException: java.lang.RuntimeException: Unable to
>> instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>
>> I continue on and run the example code. When I get to the line below:
>>
>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table"
>> -> "TEST.MY_TEST", "zkUrl" -> "zookeeper1,zookeeper2,zookeeper3:2181"))
>>
>> I get this error:
>>
>> java.lang.NoSuchMethodError:
>> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>>
>> Can someone help?
>>
>> Thanks,
>> Ben
>
>
>


Re: Spark Phoenix Plugin

2016-02-05 Thread pierre lacave
I don't know when the full release will be, RC1 just got pulled out, and
expecting RC2 soon

you can find them here

https://dist.apache.org/repos/dist/dev/phoenix/


there is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you
need to have in spark classpath
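
For completeness, with that spark client jar on the classpath, the Phoenix Spark docs linked upthread also describe an RDD-level read; a rough sketch (table, columns and quorum are placeholders):

import org.apache.phoenix.spark._

val rdd = sc.phoenixTableAsRDD(
  "TEST.MY_TEST", Seq("ID", "COL1"), zkUrl = Some("zk1,zk2,zk3:2181"))
println(rdd.count())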


*Pierre Lacave*
171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
Phone :   +353879128708

On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim  wrote:

> Hi Pierre,
>
> When will I be able to download this version?
>
> Thanks,
> Ben
>
>
> On Friday, February 5, 2016, pierre lacave  wrote:
>
>> This was addressed in Phoenix 4.7 (currently in RC)
>> https://issues.apache.org/jira/browse/PHOENIX-2503
>>
>>
>>
>>
>> *Pierre Lacave*
>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
>> Phone :   +353879128708
>>
>> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim  wrote:
>>
>>> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and
>>> Spark 1.6. When I try to launch spark-shell, I get:
>>>
>>> java.lang.RuntimeException: java.lang.RuntimeException: Unable
>>> to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>>
>>> I continue on and run the example code. When I get to the line below:
>>>
>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table"
>>> -> "TEST.MY_TEST", "zkUrl" -> "zookeeper1,zookeeper2,zookeeper3:2181"))
>>>
>>> I get this error:
>>>
>>> java.lang.NoSuchMethodError:
>>> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>>>
>>> Can someone help?
>>>
>>> Thanks,
>>> Ben
>>
>>
>>