Re: HBase Connection not serializable in Spark -> foreachRDD

2016-09-22 Thread KhajaAsmath Mohammed
Thanks Das and Ayan.

Do you have any references on how to create a connection pool for HBase
inside foreachPartition, as mentioned in the guide? In my case, I have to
connect to a Kerberos-secured HBase cluster.



Re: HBase Connection not serializable in Spark -> foreachRDD

2016-09-21 Thread Tathagata Das
http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
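The foreachPartition pattern from that section of the guide can be sketched
roughly as follows. This is only an illustration under assumptions: the
object name HBaseSink, the table name "events", the column family "d", the
qualifier "v" and the (rowKey, value) record shape are all hypothetical, and
Kerberos login is not covered here. The point is that the connection is
created inside foreachPartition, on the executor, so it is never serialized
from the driver.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.streaming.dstream.DStream

object HBaseSink {
  // Hypothetical table, column family and qualifier names.
  private val tableName = "events"
  private val family = Bytes.toBytes("d")
  private val qualifier = Bytes.toBytes("v")

  // Builds a single-cell Put for one record.
  def toPut(rowKey: String, value: String): Put =
    new Put(Bytes.toBytes(rowKey)).addColumn(family, qualifier, Bytes.toBytes(value))

  def save(stream: DStream[(String, String)]): Unit =
    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        // This block runs on an executor: one connection per partition,
        // created locally instead of being captured from the driver.
        val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
        try {
          val table = connection.getTable(TableName.valueOf(tableName))
          try records.foreach { case (k, v) => table.put(toPut(k, v)) }
          finally table.close()
        } finally connection.close()
      }
    }
}
```

Opening one connection per partition is simpler than a pool but pays the
setup cost on every batch; reusing a connection across batches needs a
per-executor holder instead.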



Re: HBase Connection not serializable in Spark -> foreachRDD

2016-09-21 Thread ayan guha
The Connection object is not serializable. You need to implement a
getOrCreate function that runs on each executor and creates the HBase
connection locally.
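One possible shape for such a getOrCreate function is sketched below. It is
a generic, hypothetical helper (ExecutorLocal is not part of Spark or HBase)
that caches one instance per key in the current JVM. It relies on the fact
that a Scala object is initialized separately in each executor JVM, so the
cached instance is never shipped from the driver.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

// Hypothetical helper: holds at most one instance per key in this JVM.
object ExecutorLocal {
  private val instances = new ConcurrentHashMap[String, AnyRef]()

  // Returns the cached instance for `key`; `create` is evaluated only
  // when no instance has been stored for that key yet.
  def getOrCreate[T <: AnyRef](key: String)(create: => T): T = {
    val factory = new JFunction[String, AnyRef] {
      override def apply(k: String): AnyRef = create
    }
    // computeIfAbsent is atomic, so concurrent tasks share one instance.
    instances.computeIfAbsent(key, factory).asInstanceOf[T]
  }
}
```

Inside foreachPartition, the create block would be where the HBase
connection is opened, for example with ConnectionFactory.createConnection,
so that every task on the same executor reuses that one connection.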


HBase Connection not serializable in Spark -> foreachRDD

2016-09-21 Thread KhajaAsmath Mohammed
Hello everyone,

I am running a Spark application that pushes data from Kafka to HBase. I
can successfully obtain a Kerberos-authenticated HBase connection outside
the function, before calling foreachRDD on the DStream.

The job fails inside foreachRDD, stating that the HBase connection object
is not serializable. Could you please let me know how to resolve this?

// Created on the driver, outside foreachRDD.
@transient val hbaseConnection = hBaseEntityManager.getConnection()

appEventDStream.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {
    rdd.foreach { entity =>
      // hbaseConnection is captured by this closure, which runs on executors.
      generatePut(hBaseEntityManager, hbaseConnection,
        entity.getClass.getSimpleName, entity.asInstanceOf[DataPoint])
    }
  }
}


The error is thrown exactly at the connection object inside foreachRDD,
saying that it is not serializable. Could anyone provide a solution?

Asmath