Thanks Das and Ayan. Do you have any references on how to create a connection pool for HBase inside foreachPartition, as mentioned in the guide? In my case, I have to use a Kerberos-secured HBase cluster.
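For reference, a minimal sketch of the foreachRDD/foreachPartition pattern from that guide, adapted to a kerberized HBase cluster. The principal, keytab path, table name, and row-key/column values are placeholders, and it assumes the keytab has been shipped to the executors (for example with --files):

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.hadoop.security.UserGroupInformation

    appEventDStream.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        rdd.foreachPartition { entities =>
          // This block runs on the executor, so nothing here has to be serialized
          // from the driver. Log in to Kerberos and open one connection per partition.
          val conf = HBaseConfiguration.create()
          conf.set("hbase.security.authentication", "kerberos")
          UserGroupInformation.setConfiguration(conf)
          UserGroupInformation.loginUserFromKeytab("user@EXAMPLE.COM", "/path/to/user.keytab") // placeholders
          val connection = ConnectionFactory.createConnection(conf)
          val table = connection.getTable(TableName.valueOf("app_events")) // placeholder table
          try {
            entities.foreach { entity =>
              // Row key and column derivation are illustrative only.
              val put = new Put(Bytes.toBytes(entity.toString))
              put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes(entity.toString))
              table.put(put)
            }
          } finally {
            table.close()
            connection.close()
          }
        }
      }
    }

Opening and closing the connection once per partition (rather than per record) keeps the overhead reasonable; a real connection pool would keep the connection alive across batches instead.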
On Wed, Sep 21, 2016 at 6:39 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:

> http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
>
> On Wed, Sep 21, 2016 at 4:26 PM, ayan guha <guha.a...@gmail.com> wrote:
>
>> The connection object is not serialisable. You need to implement a
>> getOrCreate function which would run on each executor to create the HBase
>> connection locally.
>>
>> On 22 Sep 2016 08:34, "KhajaAsmath Mohammed" <mdkhajaasm...@gmail.com> wrote:
>>
>>> Hello Everyone,
>>>
>>> I am running a Spark application to push data from Kafka. I am able to get
>>> an HBase Kerberos connection successfully outside of the function, before
>>> calling foreachRDD on the DStream.
>>>
>>> The job fails inside foreachRDD, stating that the HBase connection object
>>> is not serializable. Could you please let me know how to resolve this?
>>>
>>>     @transient val hbaseConnection = hBaseEntityManager.getConnection()
>>>
>>>     appEventDStream.foreachRDD(rdd => {
>>>       if (!rdd.isEmpty()) {
>>>         rdd.foreach { entity =>
>>>           generatePut(hBaseEntityManager, hbaseConnection,
>>>             entity.getClass.getSimpleName, entity.asInstanceOf[DataPoint])
>>>         }
>>>       }
>>>     })
>>>
>>> The error is thrown exactly at the connection object inside foreachRDD,
>>> saying it is not serializable. Could anyone provide a solution for it?
>>>
>>> Asmath
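A sketch of the getOrCreate idea mentioned above: keep the connection in an executor-local singleton so it is created lazily on each executor JVM instead of being captured from the driver (which is what triggers the serialization error). The quorum, principal, and keytab values are placeholders, and it assumes hBaseEntityManager and generatePut are themselves usable on the executors:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory}
    import org.apache.hadoop.security.UserGroupInformation

    object HBaseConnectionHolder {
      private var connection: Connection = _

      def getOrCreate(): Connection = synchronized {
        if (connection == null || connection.isClosed) {
          val conf = HBaseConfiguration.create()
          conf.set("hbase.zookeeper.quorum", "zk1,zk2,zk3")          // placeholder
          conf.set("hbase.security.authentication", "kerberos")
          UserGroupInformation.setConfiguration(conf)
          UserGroupInformation.loginUserFromKeytab("user@EXAMPLE.COM", "/path/to/user.keytab") // placeholders
          connection = ConnectionFactory.createConnection(conf)
          sys.addShutdownHook(if (connection != null) connection.close())
        }
        connection
      }
    }

    appEventDStream.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        rdd.foreachPartition { entities =>
          // getOrCreate() runs on the executor, so the connection is never serialized.
          val connection = HBaseConnectionHolder.getOrCreate()
          entities.foreach { entity =>
            generatePut(hBaseEntityManager, connection,
              entity.getClass.getSimpleName, entity.asInstanceOf[DataPoint])
          }
        }
      }
    }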