Re: Spark Streaming + Kudu

2018-03-06 Thread Ravi Kanth
Mike- I actually got hold of the pids for the Spark executors but am facing issues running jstack; there are some VM exceptions. I will figure it out and will attach the jstack. Thanks for your patience. On 6 March 2018 at 20:42, Mike Percy wrote: > Hmm, could you try in

Re: Spark Streaming + Kudu

2018-03-06 Thread Mike Percy
Hmm, could you try in spark local mode? i.e. https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-local.html Mike On Tue, Mar 6, 2018 at 7:14 PM, Ravi Kanth wrote: > Mike, > > Can you clarify a bit on grabbing the jstack for the process? I launched

Re: Spark Streaming + Kudu

2018-03-06 Thread Ravi Kanth
Mike, Can you clarify a bit on grabbing the jstack for the process? I launched my Spark application and tried to get the pid, with which I thought I could grab a jstack trace during the hang. Unfortunately, I am not able to figure out how to grab the pid for the Spark application. Thanks, Ravi On 6 March 2018 at
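[For reference, one way to find a JVM's pid from inside the process itself (a sketch; on the executor hosts it is usually easier to run the JDK's `jps -lm` to list Java pids, then `jstack <pid>` on the one you want):]

```java
import java.lang.management.ManagementFactory;

public class ShowPid {
    public static void main(String[] args) {
        // On HotSpot JVMs, RuntimeMXBean.getName() conventionally returns
        // "<pid>@<hostname>", so the pid is the part before the '@'.
        String jvmName = ManagementFactory.getRuntimeMXBean().getName();
        System.out.println("pid=" + jvmName.split("@")[0]);
    }
}
```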

Re: Spark Streaming + Kudu

2018-03-06 Thread Ravi Kanth
Yes, I have debugged to find the root cause. Every logger before "table = client.openTable(tableName);" executes fine, and exactly at the point of opening the table it throws the below exception; nothing is executed after that. Still, the Spark batches are being processed and at
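[One knob relevant to a blocking openTable() call, as a sketch rather than Ravi's actual code: the Kudu Java client's builder lets you bound how long admin calls like openTable() and regular operations can block when a server is unreachable. The master address here is a placeholder.]

```java
import org.apache.kudu.client.KuduClient;

public class TimeoutExample {
    public static void main(String[] args) {
        // Bound how long calls can block when a Kudu server is unreachable.
        KuduClient client = new KuduClient.KuduClientBuilder("master1:7051")
            .defaultAdminOperationTimeoutMs(30000)  // openTable() and DDL calls
            .defaultOperationTimeoutMs(30000)       // write/scan operations
            .build();
    }
}
```

With these set, a lost connection surfaces as a timeout exception after the configured interval rather than an indefinite hang.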

Re: Spark Streaming + Kudu

2018-03-05 Thread Mike Percy
Have you considered checking your session error count or pending errors in your while loop every so often? Can you identify where your code is hanging when the connection is lost (what line)? Mike On Mon, Mar 5, 2018 at 9:08 PM, Ravi Kanth wrote: > In addition to my
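[A sketch of the periodic check Mike is suggesting, using the Kudu Java client's session error APIs (org.apache.kudu.client); the helper name is made up:]

```java
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.RowError;
import org.apache.kudu.client.RowErrorsAndOverflowStatus;

public class ErrorCheck {
    // Hypothetical helper: call this every so often from the write loop to
    // surface per-row errors that AUTO_FLUSH_BACKGROUND otherwise buffers
    // silently instead of throwing to the caller.
    static void checkForRowErrors(KuduSession session) {
        if (session.countPendingErrors() > 0) {
            RowErrorsAndOverflowStatus status = session.getPendingErrors();
            if (status.isOverflowed()) {
                System.err.println("Error buffer overflowed; some row errors were dropped");
            }
            for (RowError error : status.getRowErrors()) {
                // RowError.toString() includes the status and the failed operation
                System.err.println("Row error: " + error);
            }
        }
    }
}
```

Note that getPendingErrors() also clears the buffer, so each call returns only the errors accumulated since the previous check.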

Re: Spark Streaming + Kudu

2018-03-05 Thread Ravi Kanth
In addition to my previous comment, I raised a support ticket for this issue with Cloudera, and one of the support engineers mentioned the following: *"Thank you for clarifying. The exceptions are logged but not re-thrown to an upper layer, so that explains why the Spark application is not aware of the

Re: Spark Streaming + Kudu

2018-03-05 Thread Ravi Kanth
Mike, Thanks for the information. But once the connection to any of the Kudu servers is lost, there is no way I can control the KuduSession object, and the same goes for getPendingErrors(). The KuduClient in this case becomes a zombie and never returns until the connection is

Re: Spark Streaming + Kudu

2018-03-05 Thread Mike Percy
Hi Ravi, it would be helpful if you could attach what you are getting back from getPendingErrors() -- perhaps from dumping RowError.toString() from items in the returned array -- and indicate what you were hoping to get back. Note that a RowError can also return to you the Operation

Re: Spark Streaming + Kudu

2018-03-05 Thread Ravi Kanth
Hi Mike, Thanks for the reply. Yes, I am using AUTO_FLUSH_BACKGROUND. I am trying to use the Kudu client API to perform UPSERTs into Kudu, and I integrated this with Spark. I am trying to test the case where one of the Kudu servers fails. So, in this case, if there is any problem in writing,

Re: Spark Streaming + Kudu

2018-03-05 Thread Mike Percy
Hi Ravi, are you using AUTO_FLUSH_BACKGROUND ? You mention that you are trying to use getPendingErrors()
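[For context, a minimal sketch of an AUTO_FLUSH_BACKGROUND upsert path with the Kudu Java client; the master address, table name, and column name are placeholders, not taken from the thread:]

```java
import org.apache.kudu.client.*;

public class UpsertExample {
    public static void main(String[] args) throws KuduException {
        KuduClient client = new KuduClient.KuduClientBuilder("master1:7051").build();
        try {
            KuduTable table = client.openTable("metrics");
            KuduSession session = client.newSession();
            // In this mode apply() returns immediately; per-row failures are
            // buffered and only visible later via getPendingErrors().
            session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);

            Upsert upsert = table.newUpsert();
            upsert.getRow().addString("host", "host-01");
            session.apply(upsert);

            // Force outstanding writes out, then drain any buffered errors.
            session.flush();
            RowErrorsAndOverflowStatus errors = session.getPendingErrors();
            for (RowError e : errors.getRowErrors()) {
                System.err.println("failed: " + e);
            }
        } finally {
            client.close();
        }
    }
}
```

The key consequence for this thread: nothing in this path throws on a per-row write failure, which is why the Spark job keeps processing batches while writes silently fail.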

Re: Spark Streaming + Kudu

2018-02-26 Thread Ravi Kanth
Thanks, Clifford. We are running Kudu version 1.4. To date we haven't seen any issues in production and we are not losing tablet servers. But, as part of testing, I have to generate a few unforeseen cases to analyse the application performance. One among those is bringing down the tablet server or