Re: Apache Kafka / Spark Integration - Exception - The server disconnected before a response was received.

2018-04-10 Thread M Singh
 
Hi Daniel:
Yes, I am working with Spark Structured Streaming.
The exception is emanating from the Spark Kafka connector, but I was wondering if
someone has encountered this issue and resolved it via a configuration
parameter in the Kafka client/broker or OS settings.
Thanks
Mans
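
In case it helps frame the question: with the spark-sql-kafka-0-10 sink, Kafka producer settings are supplied as writer options prefixed with "kafka.", so client-side tweaks end up on the connector's producer. A rough sketch under that assumption, with an existing streaming DataFrame df and placeholder broker/topic/checkpoint values (not the original job):

val query = df  // df: an existing streaming DataFrame with key/value columns (assumed)
  .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")  // placeholder brokers
  .option("topic", "output-topic")                                 // placeholder topic
  // Producer configs are forwarded to the connector's KafkaProducer via the "kafka." prefix.
  .option("kafka.acks", "0")
  .option("kafka.request.timeout.ms", "45000")
  .option("kafka.receive.buffer.bytes", "1024000")
  .option("checkpointLocation", "/tmp/kafka-sink-checkpoint")      // placeholder path
  .start()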
On Tuesday, April 10, 2018, 7:49:42 AM PDT, Daniel Hinojosa 
 wrote:  
 
 This looks more like a Spark issue than a Kafka one, judging by the
stack trace. Are you using Spark Structured Streaming with Kafka
integration, by chance?

On Mon, Apr 9, 2018 at 8:47 AM, M Singh 
wrote:

> Hi Folks:
> Just wanted to see if anyone has any suggestions on this issue.
> Thanks
>
>
>    On Monday, March 26, 2018, 11:04:02 AM PDT, M Singh
>  wrote:
>
>  Hi Ted:
> Here is the exception trace (note: the exception is occurring in the Kafka
> Spark writer class).
>
> I will try to check the broker logs.  Is there anything specific I should
> look for?
> Driver stacktrace:
>  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1708)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1696)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1695)
>  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1695)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
>  at scala.Option.foreach(Option.scala:257)
>  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:855)
>  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1923)
>  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1878)
>  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1867)
>  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:671)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2029)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2050)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2069)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
>  at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:926)
>  at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:924)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>  at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
>  at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:924)
>  at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply$mcV$sp(KafkaWriter.scala:89)
>  at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply(KafkaWriter.scala:89)
>  at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply(KafkaWriter.scala:89)
>  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
>  at org.apache.spark.sql.kafka010.KafkaWriter$.write(KafkaWriter.scala:88)
>  at org.apache.spark.sql.kafka010.KafkaSink.addBatch(KafkaSink.scala:38)
>
>
>    On Monday, March 26, 2018, 9:34:06 AM PDT, Ted Yu 
> wrote:
>
>  Can you post the stack trace for NetworkException (pastebin)?
>
> Please also check the broker logs to see if there was some clue around the
> time this happened.
>
> Thanks
>
> On Mon, Mar 26, 2018 at 9:30 AM, M Singh 
> wrote:
>
> > Hi:
> > I am working with Spark 2.2.1 and the Spark Kafka 0.10 client integration
> > with Kafka brokers running 0.11.
> > I get the exception - org.apache.kafka.common.errors.NetworkException:
> > The server disconnected before a response was received - when the
> > application is trying to write to a topic. This exception kills the Spark
> > application.
> > Based on some similar issues I saw on the web, I've added the following
> > Kafka configuration, but it has not helped.
> > acks = 0
> > request.timeout.ms = 45000
> > receive.buffer.bytes = 1024000
> > I've posted this question to the Apache Spark users list but have not
> > received any response.  If anyone has any suggestions/pointers, please
> > let me know.
> > Thanks
> >
>
>
  

Re: Apache Kafka / Spark Integration - Exception - The server disconnected before a response was received.

2018-04-10 Thread Daniel Hinojosa
This looks more like a Spark issue than a Kafka one, judging by the
stack trace. Are you using Spark Structured Streaming with Kafka
integration, by chance?

On Mon, Apr 9, 2018 at 8:47 AM, M Singh 
wrote:

> Hi Folks:
> Just wanted to see if anyone has any suggestions on this issue.
> Thanks
>
>
> On Monday, March 26, 2018, 11:04:02 AM PDT, M Singh
>  wrote:
>
>  Hi Ted:
> Here is the exception trace (note: the exception is occurring in the Kafka
> Spark writer class).
>
> I will try to check the broker logs.  Is there anything specific I should
> look for?
> Driver stacktrace:
>  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1708)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1696)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1695)
>  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1695)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
>  at scala.Option.foreach(Option.scala:257)
>  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:855)
>  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1923)
>  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1878)
>  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1867)
>  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:671)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2029)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2050)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2069)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
>  at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:926)
>  at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:924)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>  at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
>  at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:924)
>  at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply$mcV$sp(KafkaWriter.scala:89)
>  at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply(KafkaWriter.scala:89)
>  at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply(KafkaWriter.scala:89)
>  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
>  at org.apache.spark.sql.kafka010.KafkaWriter$.write(KafkaWriter.scala:88)
>  at org.apache.spark.sql.kafka010.KafkaSink.addBatch(KafkaSink.scala:38)
>
>
> On Monday, March 26, 2018, 9:34:06 AM PDT, Ted Yu 
> wrote:
>
>  Can you post the stack trace for NetworkException (pastebin)?
>
> Please also check the broker logs to see if there was some clue around the
> time this happened.
>
> Thanks
>
> On Mon, Mar 26, 2018 at 9:30 AM, M Singh 
> wrote:
>
> > Hi:
> > I am working with Spark 2.2.1 and the Spark Kafka 0.10 client integration
> > with Kafka brokers running 0.11.
> > I get the exception - org.apache.kafka.common.errors.NetworkException:
> > The server disconnected before a response was received - when the
> > application is trying to write to a topic. This exception kills the Spark
> > application.
> > Based on some similar issues I saw on the web, I've added the following
> > Kafka configuration, but it has not helped.
> > acks = 0
> > request.timeout.ms = 45000
> > receive.buffer.bytes = 1024000
> > I've posted this question to the Apache Spark users list but have not
> > received any response.  If anyone has any suggestions/pointers, please
> > let me know.
> > Thanks
> >
>
>


Re: Apache Kafka / Spark Integration - Exception - The server disconnected before a response was received.

2018-04-09 Thread M Singh
Hi Folks:
Just wanted to see if anyone has any suggestions on this issue.
Thanks
 

On Monday, March 26, 2018, 11:04:02 AM PDT, M Singh 
 wrote:  
 
 Hi Ted:
Here is the exception trace (note: the exception is occurring in the Kafka
Spark writer class).

I will try to check the broker logs.  Is there anything specific I should look for?
Driver stacktrace:
 at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1708)
 at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1696)
 at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1695)
 at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
 at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1695)
 at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
 at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
 at scala.Option.foreach(Option.scala:257)
 at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:855)
 at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1923)
 at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1878)
 at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1867)
 at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
 at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:671)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2029)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2050)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2069)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
 at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:926)
 at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:924)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
 at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
 at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:924)
 at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply$mcV$sp(KafkaWriter.scala:89)
 at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply(KafkaWriter.scala:89)
 at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply(KafkaWriter.scala:89)
 at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
 at org.apache.spark.sql.kafka010.KafkaWriter$.write(KafkaWriter.scala:88)
 at org.apache.spark.sql.kafka010.KafkaSink.addBatch(KafkaSink.scala:38)
 

    On Monday, March 26, 2018, 9:34:06 AM PDT, Ted Yu  
wrote:  
 
 Can you post the stack trace for NetworkException (pastebin)?

Please also check the broker logs to see if there was some clue around the
time this happened.

Thanks

On Mon, Mar 26, 2018 at 9:30 AM, M Singh 
wrote:

> Hi:
> I am working with Spark 2.2.1 and the Spark Kafka 0.10 client integration
> with Kafka brokers running 0.11.
> I get the exception - org.apache.kafka.common.errors.NetworkException:
> The server disconnected before a response was received - when the
> application is trying to write to a topic. This exception kills the Spark
> application.
> Based on some similar issues I saw on the web, I've added the following
> Kafka configuration, but it has not helped.
> acks = 0
> request.timeout.ms = 45000
> receive.buffer.bytes = 1024000
> I've posted this question to the Apache Spark users list but have not
> received any response.  If anyone has any suggestions/pointers, please
> let me know.
> Thanks
>
    

Re: Apache Kafka / Spark Integration - Exception - The server disconnected before a response was received.

2018-03-26 Thread M Singh
Hi Ted:
Here is the exception trace (note: the exception is occurring in the Kafka
Spark writer class).

I will try to check the broker logs.  Is there anything specific I should look for?
Driver stacktrace:
 at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1708)
 at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1696)
 at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1695)
 at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
 at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1695)
 at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
 at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
 at scala.Option.foreach(Option.scala:257)
 at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:855)
 at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1923)
 at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1878)
 at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1867)
 at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
 at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:671)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2029)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2050)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2069)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
 at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:926)
 at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:924)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
 at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
 at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:924)
 at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply$mcV$sp(KafkaWriter.scala:89)
 at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply(KafkaWriter.scala:89)
 at org.apache.spark.sql.kafka010.KafkaWriter$$anonfun$write$1.apply(KafkaWriter.scala:89)
 at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
 at org.apache.spark.sql.kafka010.KafkaWriter$.write(KafkaWriter.scala:88)
 at org.apache.spark.sql.kafka010.KafkaSink.addBatch(KafkaSink.scala:38)
 

On Monday, March 26, 2018, 9:34:06 AM PDT, Ted Yu  
wrote:  
 
 Can you post the stack trace for NetworkException (pastebin)?

Please also check the broker logs to see if there was some clue around the
time this happened.

Thanks

On Mon, Mar 26, 2018 at 9:30 AM, M Singh 
wrote:

> Hi:
> I am working with Spark 2.2.1 and the Spark Kafka 0.10 client integration
> with Kafka brokers running 0.11.
> I get the exception - org.apache.kafka.common.errors.NetworkException:
> The server disconnected before a response was received - when the
> application is trying to write to a topic. This exception kills the Spark
> application.
> Based on some similar issues I saw on the web, I've added the following
> Kafka configuration, but it has not helped.
> acks = 0
> request.timeout.ms = 45000
> receive.buffer.bytes = 1024000
> I've posted this question to the Apache Spark users list but have not
> received any response.  If anyone has any suggestions/pointers, please
> let me know.
> Thanks
>
  

Re: Apache Kafka / Spark Integration - Exception - The server disconnected before a response was received.

2018-03-26 Thread Ted Yu
Can you post the stack trace for NetworkException (pastebin)?

Please also check the broker logs to see if there was some clue around the
time this happened.

Thanks

On Mon, Mar 26, 2018 at 9:30 AM, M Singh 
wrote:

> Hi:
> I am working with Spark 2.2.1 and the Spark Kafka 0.10 client integration
> with Kafka brokers running 0.11.
> I get the exception - org.apache.kafka.common.errors.NetworkException:
> The server disconnected before a response was received - when the
> application is trying to write to a topic. This exception kills the Spark
> application.
> Based on some similar issues I saw on the web, I've added the following
> Kafka configuration, but it has not helped.
> acks = 0
> request.timeout.ms = 45000
> receive.buffer.bytes = 1024000
> I've posted this question to the Apache Spark users list but have not
> received any response.  If anyone has any suggestions/pointers, please
> let me know.
> Thanks
>
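
For reference, the three settings quoted above are plain Kafka producer configs. A minimal standalone producer sketch using them, which can help exercise the brokers independently of Spark (broker address and topic name are placeholders, and this is illustrative only, not the original application):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

// Illustrative sketch only: a bare producer with the settings from the thread.
// Broker address and topic name are placeholders.
object ProducerConfigSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092")
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
    props.put(ProducerConfig.ACKS_CONFIG, "0")                    // acks = 0
    props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, "45000")  // request.timeout.ms
    props.put(ProducerConfig.RECEIVE_BUFFER_CONFIG, "1024000")    // receive.buffer.bytes

    val producer = new KafkaProducer[String, String](props)
    try {
      // send() returns a java.util.concurrent.Future; get() surfaces any producer error.
      producer.send(new ProducerRecord[String, String]("test-topic", "key", "value")).get()
    } finally {
      producer.close()
    }
  }
}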