Re: Spark Kafka API tries connecting to dead node for every batch, which increases the processing time

2017-10-16 Thread Cody Koeninger
Have you tried the 0.10 integration? I'm not sure how you would know whether a broker is up or down without attempting to connect to it. Do you have an alternative suggestion? Not sure how much interest there is in patches to the 0.8 integration at this point.
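For context, a minimal sketch of what moving to the 0.10 integration looks like on the configuration side. The broker addresses, group id, and topic name here are placeholders, not from the thread; the commented-out `createDirectStream` call shows roughly how the params would be consumed by the `spark-streaming-kafka-0-10` API.

```scala
// Consumer properties for the new-style Kafka consumer used by the
// 0.10 integration (all values below are example placeholders).
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker1:9092,broker2:9092,broker3:9092",
  "key.deserializer" -> "org.apache.kafka.common.serialization.StringDeserializer",
  "value.deserializer" -> "org.apache.kafka.common.serialization.StringDeserializer",
  "group.id" -> "example-group",
  "auto.offset.reset" -> "latest"
)

// With these params, the stream would be created roughly like:
// val stream = KafkaUtils.createDirectStream[String, String](
//   ssc,
//   LocationStrategies.PreferConsistent,
//   ConsumerStrategies.Subscribe[String, String](Seq("my-topic"), kafkaParams))
```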

Re: Spark Kafka API tries connecting to dead node for every batch, which increases the processing time

2017-10-16 Thread Suprith T Jain
Yes, I tried that, but it's not that effective. In fact, Kafka's SimpleConsumer tries to reconnect in case of a socket error (in the sendRequest method), so it'll always be twice the timeout for every window and for every node that is down.
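To illustrate the arithmetic being described (values are assumptions for the example, not measurements from the thread): with the SimpleConsumer retrying once after a socket error, each dead broker can add roughly two socket timeouts to every batch.

```scala
// Illustrative worst-case delay added per batch by one dead broker,
// assuming the 0.8 default socket.timeout.ms of 30 seconds.
val socketTimeoutMs = 30000    // kafka 0.8 consumer default socket.timeout.ms
val attemptsPerDeadNode = 2    // initial attempt + one reconnect on socket error
val deadNodes = 1
val worstCaseAddedMs = socketTimeoutMs * attemptsPerDeadNode * deadNodes
// With these assumptions: 30000 * 2 * 1 = 60000 ms of extra latency per batch.
```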

Re: Spark Kafka API tries connecting to dead node for every batch, which increases the processing time

2017-10-16 Thread Cody Koeninger
Have you tried adjusting the timeout?

On Mon, Oct 16, 2017 at 8:08 AM, Suprith T Jain wrote:
> Hi guys,
>
> I have a 3 node cluster and I am running a Spark Streaming job. Consider the
> below example
>
> spark-submit --master yarn-cluster --class >
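Adjusting the timeout in the 0.8 direct stream means lowering the SimpleConsumer settings passed in the Kafka params, so a dead broker fails fast instead of stalling each batch. A hedged sketch, with example values (the broker list, timeout values, and topic name are placeholders):

```scala
// Example 0.8-integration params with a lowered socket timeout.
// socket.timeout.ms defaults to 30000 in the 0.8 consumer; the values
// below are illustrative, not recommendations from the thread.
val kafkaParams = Map[String, String](
  "metadata.broker.list" -> "broker1:9092,broker2:9092,broker3:9092",
  "socket.timeout.ms" -> "3000",
  "refresh.leader.backoff.ms" -> "1000"
)

// These would be passed to the 0.8 integration roughly as:
// val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
//   ssc, kafkaParams, Set("my-topic"))
```

Note the trade-off the thread points at: even with a small timeout, a retry on socket error means roughly two timeouts of delay per dead node per batch.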