As I mentioned in the PR,

Are we sure the callback is only called on failures in the connection layer? I 
wouldn't want PLC4X to kill the worker just because we are communicating with a 
non-standard PLC and one of our protocol layers fires an error while processing 
the data.

Chris

On 04.08.19, 20:26, "Julian Feinauer" <jfeina...@apache.org> wrote:

    Hi all,
    
    so I found the cause and fixed it.
    In fact, when the connection attempt was aborted, we ended up having
    created a thread pool (the worker pool for Netty) that was never shut down,
    because no channel was ever created that would have handled the shutdown
    later on.
    I created the PR https://github.com/apache/plc4x/pull/76 to develop.
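
A minimal sketch of the failure mode described above, using only the JDK rather than Netty. It is not the code from the PR; `WorkerPoolCleanup`, `Connection`, `connect` and `onProtocolError` are hypothetical names chosen for illustration. The idea shown: if the connect step fails, nothing exists yet that could release the worker pool later, so it has to be released right there. Errors raised after the connection exists are deliberately kept separate and do not touch the pool. In Netty, the analogous release call would presumably be `EventLoopGroup.shutdownGracefully()`.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class WorkerPoolCleanup {

    /** Hypothetical stand-in for an established connection that owns its worker pool. */
    public static final class Connection implements AutoCloseable {
        private final ExecutorService workers;

        Connection(ExecutorService workers) {
            this.workers = workers;
        }

        /** In this sketch a protocol-level error is only reported; the pool keeps running. */
        public void onProtocolError(Throwable cause) {
            System.err.println("protocol error, keeping workers alive: " + cause);
        }

        /** Normal path: the connection releases the pool when it is closed. */
        @Override
        public void close() {
            workers.shutdown();
        }
    }

    /**
     * Runs the (assumed) connect step. If it fails, no Connection will ever exist
     * to shut the pool down later, so the pool is shut down here before rethrowing.
     */
    public static Connection connect(ExecutorService workers, Callable<Void> connectStep)
            throws Exception {
        try {
            connectStep.call();
        } catch (Exception e) {
            workers.shutdown();
            throw e;
        }
        return new Connection(workers);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Simulate an unreachable PLC: the connect step always fails.
            connect(pool, () -> { throw new IOException("PLC unreachable"); });
        } catch (IOException e) {
            System.out.println("connect failed, pool shut down: " + pool.isShutdown());
        }
    }
}
```

Without the `workers.shutdown()` call in the catch block, every failed reconnect attempt would leave one more pool behind, which matches the kind of resource build-up reported further down in this thread.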
    
    If this PR gets accepted, I suggest creating a bugfix release 0.4.1, as
    this is really an issue for us in production.
    
    Any concerns with this approach?
    
    Thanks!
    Julian
    
    On 2019/08/02 15:05:47, Julian Feinauer <j.feina...@pragmaticminds.de> wrote:
    > Hey,
    > 
    > agree @cdutz... I am just running an example and it really does look
    > like that.
    > So I'll try to finish an MWE and perhaps ask on the Netty list : )
    > 
    > Julian
    > 
    > On 02.08.19, 16:58, "Christofer Dutz" <christofer.d...@c-ware.de> wrote:
    > 
    >     Hi Julian,
    >     
    >     Well if I look into my sock drawer at home I think we might be 
leaking some socks ... I agree ... there are several single-socks in there ;-)
    >     
    >     But regarding Netty ... yes, it is absolutely possible we're not
    >     handling this correctly, as the docs are quite extensive and I didn't
    >     bother reading all of them ;-)
    >     
    >     So perhaps we should read them or ask a Netty pro.
    >     
    >     Chris
    >     
    >     On 02.08.19, 16:50, "Julian Feinauer" <j.feina...@pragmaticminds.de> wrote:
    >     
    >         Hi all,
    >         
    >         we are observing strange behavior in production.
    >         We are still investigating the exact scenario, and it’s a bit
    >         complex, as we have many connections to many PLCs and fire many
    >         requests through many different channels…
    >         But what we observe is that we get the well-known “too many open
    >         files” exception on a Linux server WHEN one of the PLCs becomes
    >         unreachable (the pool will try many times to recreate the
    >         connection).
    >         
    >         I just checked the codebase for a second and I think we are
    >         handling the exceptions wrong (or not at all?).
    >         If I understand it correctly from [1] (didn’t bother to check
    >         Netty’s docs as they are rather poor), we should close the socket
    >         somewhere, but we ALWAYS call super.exceptionCaught(), which just
    >         propagates the exception upward in the channel hierarchy but
    >         seems to NEVER close it.
    >         
    >         Am I wrong with that?
    >         
    >         We are trying to create an MWE that reproduces this behavior, to
    >         check whether we can fix it that way.
    >         
    >         Best
    >         Julian
    >         
    >         [1] https://www.baeldung.com/netty-exception-handling
    >         
    >     
    >     
    > 
    > 
    
