Hi all,

I found the cause and fixed it.
When a connection attempt was aborted, we ended up having created a thread 
pool (the Netty worker pool) that was never shut down, because no channel was 
ever created that would take care of the cleanup later on.
I opened PR https://github.com/apache/plc4x/pull/76 against develop.
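
For anyone following along, here is a rough sketch of the pattern. It is not 
the code from the PR. It uses a plain java.util.concurrent ExecutorService as 
a stand-in for the Netty worker pool (with a Netty EventLoopGroup the 
equivalent cleanup call would be shutdownGracefully()). The names 
WorkerPoolCleanup, Connection, connect and lastPool are made up for 
illustration.

```java
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WorkerPoolCleanup {

    /** Hypothetical stand-in for an open connection that owns its worker pool. */
    public static final class Connection implements AutoCloseable {
        public final ExecutorService workerPool;

        Connection(ExecutorService workerPool) {
            this.workerPool = workerPool;
        }

        @Override
        public void close() {
            // Normal path: the pool is released when the connection is closed.
            workerPool.shutdown();
        }
    }

    // Last pool created; kept only so the demo can inspect its state.
    public static ExecutorService lastPool;

    public static Connection connect(boolean reachable) throws IOException {
        // The pool is created BEFORE we know whether the connect will succeed.
        ExecutorService workerPool = Executors.newFixedThreadPool(2);
        lastPool = workerPool;
        try {
            if (!reachable) {
                // Simulates the aborted connection (assumption: a real connect
                // would surface the failure as an exception in a similar way).
                throw new IOException("PLC unreachable");
            }
            return new Connection(workerPool);
        } catch (IOException e) {
            // No Connection will ever own this pool, so it must be released here.
            // Without this call, every retry leaves another pool behind.
            workerPool.shutdown();
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            connect(false);
        } catch (IOException expected) {
            System.out.println("pool shut down after failed connect: " + lastPool.isShutdown());
        }
        try (Connection c = connect(true)) {
            System.out.println("pool running while connected: " + !c.workerPool.isShutdown());
        }
    }
}
```

The point is only that the failure path needs its own cleanup, since the 
object that would normally release the pool never comes into existence.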

If this PR gets accepted, I suggest creating a bugfix release 0.4.1, as this 
is really an issue for us in production.

Any concerns with this approach?

Thanks!
Julian

On 2019/08/02 15:05:47, Julian Feinauer <j.feina...@pragmaticminds.de> wrote: 
> Hey,
> 
> agreed, @cdutz... I am running an example right now and it really looks that way.
> So I'll try to finish an MWE and perhaps ask on the netty list : )
> 
> Julian
> 
> On 02.08.19, 16:58, "Christofer Dutz" <christofer.d...@c-ware.de> wrote:
> 
>     Hi Julian,
>     
>     Well if I look into my sock drawer at home I think we might be leaking 
> some socks ... I agree ... there are several single-socks in there ;-)
>     
>     But regarding netty ... yes it is absolutely possible we're not handling 
> this correctly as the docs are quite extensive and I didn't bother reading 
> all of them ;-)
>     
>     So perhaps we should read them or ask some Netty pro.
>     
>     Chris
>     
>     On 02.08.19, 16:50, "Julian Feinauer" 
> <j.feina...@pragmaticminds.de> wrote:
>     
>         Hi all,
>         
>         we are observing strange behavior in production.
>         We are still investigating the exact scenario, and it’s a bit complex 
> as we have many connections to many PLCs and fire many requests through many 
> different channels…
>         But what we observe is that we get the well-known “too many open 
> files” exception on a Linux server WHEN one of the PLCs becomes unreachable 
> (the pool will try many times to recreate the connection).
>         
>         I just checked the codebase for a second, and I think we are handling 
> the exceptions wrong (or not at all?).
>         If I understand it correctly from [1] (I didn’t bother to check Netty’s 
> docs, as they are rather poor), we should close the socket somewhere, but we 
> ALWAYS call super.exceptionCaught(), which just passes the exception on to the 
> next handler in the channel pipeline but seems to NEVER close the socket.
>         
>         Am I wrong with that?
>         
>         We are trying to create an MWE that reproduces this behavior, to check 
> whether we can fix it like that.
>         
>         Best
>         Julian
>         
>         [1] https://www.baeldung.com/netty-exception-handling
>         
>     
>     
> 
> 
