Thanks Das and Ayan.

Do you have any references on how to create a connection pool for HBase inside
foreachPartition, as mentioned in the guide? In my case, I have to use a
Kerberos-secured HBase cluster.
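
For reference, this is roughly the shape I have in mind from the design-patterns
section, with the Kerberos login and connection done once per partition on the
executor (the principal, keytab path and my generatePut/DataPoint helpers are
placeholders from my own code, so please treat this as a sketch rather than
something tested):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.security.UserGroupInformation

appEventDStream.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {
    rdd.foreachPartition { entities =>
      // runs on the executor: log in and open the connection locally
      // instead of shipping it from the driver
      val conf = HBaseConfiguration.create()
      UserGroupInformation.setConfiguration(conf)
      UserGroupInformation.loginUserFromKeytab(
        "hbaseuser@EXAMPLE.COM", "/etc/security/keytabs/hbaseuser.keytab")
      val connection = ConnectionFactory.createConnection(conf)
      try {
        entities.foreach { entity =>
          generatePut(hBaseEntityManager, connection,
            entity.getClass.getSimpleName, entity.asInstanceOf[DataPoint])
        }
      } finally {
        connection.close()
      }
    }
  }
}

(hBaseEntityManager would also have to be serializable, or constructed inside the
partition, for the generatePut call above to work. I have also put my reading of
Ayan's getOrCreate suggestion at the bottom of this mail.)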

On Wed, Sep 21, 2016 at 6:39 PM, Tathagata Das <tathagata.das1...@gmail.com>
wrote:

> http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
>
> On Wed, Sep 21, 2016 at 4:26 PM, ayan guha <guha.a...@gmail.com> wrote:
>
>> The connection object is not serialisable. You need to implement a
>> getOrCreate function which runs on each executor to create the HBase
>> connection locally.
>> On 22 Sep 2016 08:34, "KhajaAsmath Mohammed" <mdkhajaasm...@gmail.com>
>> wrote:
>>
>>> Hello Everyone,
>>>
>>> I am running a Spark application to push data from Kafka. I am able to get
>>> the HBase Kerberos connection successfully outside of the function, before
>>> calling foreachRDD on the DStream.
>>>
>>> The job fails inside foreachRDD, stating that the HBase connection object
>>> is not serializable. Could you please let me know how to resolve this?
>>>
>>> @transient val hbaseConnection = hBaseEntityManager.getConnection()
>>>
>>> appEventDStream.foreachRDD(rdd => {
>>>   if (!rdd.isEmpty()) {
>>>     rdd.foreach { entity =>
>>>       generatePut(hBaseEntityManager, hbaseConnection,
>>>         entity.getClass.getSimpleName, entity.asInstanceOf[DataPoint])
>>>     }
>>>   }
>>> })
>>>
>>>
>>> The error is thrown exactly at the connection object inside foreachRDD,
>>> saying it is not serializable. Could anyone provide a solution for this?
>>>
>>> Asmath
>>>
>>>
>
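
And this is my reading of Ayan's getOrCreate idea: keep the connection in a
singleton object so that each executor JVM creates it lazily once and then
reuses it across batches (again only a sketch; the principal and keytab path
are placeholders):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory}
import org.apache.hadoop.security.UserGroupInformation

// lives on each executor; the lazy val is initialised at most once per JVM
object HBaseConnectionHolder {
  lazy val connection: Connection = {
    val conf = HBaseConfiguration.create()
    UserGroupInformation.setConfiguration(conf)
    UserGroupInformation.loginUserFromKeytab(
      "hbaseuser@EXAMPLE.COM", "/etc/security/keytabs/hbaseuser.keytab")
    ConnectionFactory.createConnection(conf)
  }
}

Inside foreachPartition I would then use HBaseConnectionHolder.connection
instead of creating a new connection per partition. Does that match what you
meant?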
