> On July 21, 2015, 11:15 p.m., Gwen Shapira wrote:
> > core/src/main/scala/kafka/network/SocketServer.scala, line 465
> > <https://reviews.apache.org/r/36664/diff/1/?file=1018238#file1018238line465>
> >
> >     Turns out that catching Throwable is a really bad idea: 
> > https://www.sumologic.com/2014/05/05/why-you-should-never-catch-throwable-in-scala/

Ah... Didn't know that before. I explicitly listed the exceptions.


> On July 21, 2015, 11:15 p.m., Gwen Shapira wrote:
> > core/src/main/scala/kafka/network/SocketServer.scala, line 400
> > <https://reviews.apache.org/r/36664/diff/1/?file=1018238#file1018238line400>
> >
> >     So in case of unexpected exception, we log an error and keep running?
> >     
> >     Isn't it better to kill the processor, since we don't know what's the 
> > state of the system? If the acceptor keeps placing messages in the queue 
> > for a dead processor, isn't it a separate issue?

This part I'm not quite sure. I am not very experienced in the error handling 
in such case, so please correct me if I missed something.
Here is what I thought.

The way it currently works is that the acceptor will 
1. accept new connection request and create new socket channel
2. choose a processor and put the socket channel into the processor's new 
connection queue

The processor will just take the socket channels from the queue and register it 
to the selector.
If the processor runs and get an uncaught exception, there are several 
possibilities. 
Case 1: The exception was from one socket channel. 
Case 2: The exception was associated with a bad request. 
In case 1, ideally we should just disconnect that socket channel without 
affecting other socket channels.
In case 2, I think we should log the error and skip the message - assuming 
client will retry sending data if no response was received for a given peoriod 
of time.

I am not sure if letting processor exit is a good idea because this will lead 
to the result of a badly behaving client screw the entire cluster - it might 
screw processors one by one. Comparing with that, I kind of leaning towards 
keeping the processor running and serving other normal TCP connections if 
possible, but log the error so monitoring system can detect and see if human 
intervention is needed.

Also, I don't know what to do here to prevent the thread from exiting without 
catching all the throwables.
According to this blog 
http://www.tzavellas.com/techblog/2010/09/20/catching-throwable-in-scala/
I guess I can rethrow all the ControlThrowables, but intercept the rests?


- Jiangjie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36664/#review92496
-----------------------------------------------------------


On July 21, 2015, 11:03 p.m., Jiangjie Qin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36664/
> -----------------------------------------------------------
> 
> (Updated July 21, 2015, 11:03 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2353
>     https://issues.apache.org/jira/browse/KAFKA-2353
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> Catch exception in kafka.network.Processor to avoid socket leak and exiting 
> unexpectedly.
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/network/SocketServer.scala 
> 91319fa010b140cca632e5fa8050509bd2295fc9 
> 
> Diff: https://reviews.apache.org/r/36664/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jiangjie Qin
> 
>

Reply via email to