Re: AW: Trying to fix the timeout issues ...

2023-03-17 Thread Cesar Garcia
Hello,

Indeed, it is a solution focused on the S7-400H(F)[1]. It should also apply to
the S7-1500H, but since I don't have the hardware, I can't confirm that it
works. :-)

From the video you can see that there are two PLCs in parallel, which
provides HA for the entire hardware.

For the tests I have an S7-400 CPU with two CP 443-1 modules, which allows me
to emulate the S7-400H.

Going back to the point: the connection scheme monitors both outgoing and
incoming traffic, because there is a subscription service that can be
synchronous (CYC) or asynchronous (ALRM), as well as internal services
to check the channel (PING).

I am currently reincorporating all of these features into the driver; they
were removed at some point :-( .

I absolutely agree with Chris regarding functional tests for generic routines
(SPI, etc.); I'm just presenting this as a specific solution that might
help visualize other solutions.

Ok and so on...

My grain of sand.

1. https://www.youtube.com/watch?v=cwHHqCoGKeA

On Fri, 17 Mar 2023 at 14:21, Łukasz Dywicki wrote:

> If you have a look at the test I pointed to, its current weakness is that
> the timeout does not occur unless a data packet comes in from the
> network pipe.
>
> The present implementation simply needs a working network to spot a
> timeout, which contradicts the timeout concept in general:
>
> https://github.com/apache/plc4x/blob/1045cf01a525acddf118ecfb1916df3f84921853/plc4j/spi/src/test/java/org/apache/plc4x/java/spi/Plc4xNettyWrapperTest.java#L94
>
> If you remove that line, the driver handler will never be notified about
> the failure.
> The solution used by heyoulin/spnettec was to introduce a watchdog thread
> on our end, which is, frankly speaking, the best way to do it. The patch I
> made is there to define SPIs and briefly cover their behavior with the
> Netty wrapper test. It will definitely help with, e.g., ADS doing a silent
> tcp_close on connections, which results in no more PLC traffic.
>
> Cesar's approach addresses a slightly different issue, one oriented
> towards HA; I am not sure whether the channels target the same or
> different PLCs.
>
> Cheers,
> Łukasz

Re: AW: Trying to fix the timeout issues ...

2023-03-17 Thread Łukasz Dywicki
If you have a look at the test I pointed to, its current weakness is that
the timeout does not occur unless a data packet comes in from the
network pipe.


The present implementation simply needs a working network to spot a timeout,
which contradicts the timeout concept in general:

https://github.com/apache/plc4x/blob/1045cf01a525acddf118ecfb1916df3f84921853/plc4j/spi/src/test/java/org/apache/plc4x/java/spi/Plc4xNettyWrapperTest.java#L94

If you remove that line, the driver handler will never be notified about the
failure.
The solution used by heyoulin/spnettec was to introduce a watchdog thread on
our end, which is, frankly speaking, the best way to do it. The patch I made is
there to define SPIs and briefly cover their behavior with the Netty wrapper
test. It will definitely help with, e.g., ADS doing a silent tcp_close on
connections, which results in no more PLC traffic.
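
To make the watchdog idea concrete, here is a minimal sketch in plain Java. It assumes nothing about the actual PLC4X or heyoulin/spnettec code: `WatchdogSketch`, `RequestWatchdog`, `watch` and `timesOut` are hypothetical names invented for this illustration. It only shows that when the deadline is enforced by a scheduler thread on our side, the timeout callback fires even if the network pipe stays completely silent.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Illustration only: none of these names exist in PLC4X.
final class WatchdogSketch {

    // Fires timeouts from its own daemon thread, independent of any I/O events.
    static final class RequestWatchdog implements AutoCloseable {
        private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "request-watchdog");
                thread.setDaemon(true);
                return thread;
            });

        // Schedules onTimeout to run after timeoutMillis. The caller cancels
        // the returned handle when the response arrives in time.
        ScheduledFuture<?> watch(long timeoutMillis, Runnable onTimeout) {
            return scheduler.schedule(onTimeout, timeoutMillis, TimeUnit.MILLISECONDS);
        }

        @Override
        public void close() {
            scheduler.shutdownNow();
        }
    }

    // Returns true if the timeout callback ran within waitMillis.
    static boolean timesOut(long timeoutMillis, long waitMillis, boolean responseArrives) {
        CountDownLatch fired = new CountDownLatch(1);
        try (RequestWatchdog watchdog = new RequestWatchdog()) {
            ScheduledFuture<?> handle = watchdog.watch(timeoutMillis, fired::countDown);
            if (responseArrives) {
                handle.cancel(false); // response came in time: disarm the watchdog
            }
            return fired.await(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // No packet ever arrives in either case; only the watchdog thread runs.
        System.out.println("silent connection timed out: " + timesOut(50, 1000, false));
        System.out.println("answered request timed out:  " + timesOut(50, 200, true));
    }
}
```

In a real driver the cancel call would sit in the response handler, and the callback would notify the driver handler about the failure; how that is wired into the SPIs is left open here.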


Cesar's approach addresses a slightly different issue, one oriented
towards HA; I am not sure whether the channels target the same or different
PLCs.


Cheers,
Łukasz

On 17.03.2023 18:35, Christofer Dutz wrote:

Hi all,

I mean … I would feel more comfortable if we came up with a test that
demonstrates the current implementation’s weaknesses. This way we can prove
that another implementation addresses the issue. Otherwise it just feels like
we’re swapping one option for another without any idea whether it will really
be better.

Who knows? Perhaps it addresses this one issue better but handles other
situations in other usage patterns worse?

I think if we work on stuff like this, we should start implementing
tests.

Chris



AW: Trying to fix the timeout issues ...

2023-03-17 Thread Christofer Dutz
Hi all,

I mean … I would feel more comfortable if we came up with a test that
demonstrates the current implementation’s weaknesses. This way we can prove
that another implementation addresses the issue. Otherwise it just feels like
we’re swapping one option for another without any idea whether it will really
be better.

Who knows? Perhaps it addresses this one issue better but handles other
situations in other usage patterns worse?

I think if we work on stuff like this, we should start implementing
tests.

Chris

From: Cesar Garcia
Date: Friday, 17 March 2023 at 18:03
To: dev@plc4x.apache.org
Subject: Re: Trying to fix the timeout issues ...

Hello everyone,

For the S7HA version I used a different path for the timeout implementation.

                            / NIOChannel B
PLC4X -> EmbeddedChannel ->
                            \ NIOChannel A

Channels A and B have timeout monitoring (IdleStateHandler) to handle
disconnection.

The EmbeddedChannel implementation contains all the state machines required
to manage TCP/IP. It also handles the failover between the channels.

This way you don't have to recreate the connection from the client's point
of view (PLC4X).

The client is notified of a disconnect when the TCP/IP channels A and B are
actually disconnected, but this does not destroy the main pipe.

my grain of sand,



Re: Trying to fix the timeout issues ...

2023-03-17 Thread Cesar Garcia
Hello everyone,

For the S7HA version I used a different path for the timeout implementation.

                            / NIOChannel B
PLC4X -> EmbeddedChannel ->
                            \ NIOChannel A

Channels A and B have timeout monitoring (IdleStateHandler) to handle
disconnection.

The EmbeddedChannel implementation contains all the state machines required
to manage TCP/IP. It also handles the failover between the channels.

This way you don't have to recreate the connection from the client's point
of view (PLC4X).

The client is notified of a disconnect when the TCP/IP channels A and B are
actually disconnected, but this does not destroy the main pipe.
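
The failover rule described above can be sketched in plain Java as a small state model. This is only an illustration under stated assumptions, not the S7HA driver code: `FailoverSketch`, `LogicalPipe`, `linkStateChanged`, `clientSeesConnected` and `activeLink` are hypothetical names, and no Netty classes are involved. The state-change call stands in for whatever the per-channel timeout monitor would report.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustration only: a state model of one logical pipe in front of two links.
final class FailoverSketch {

    enum Link { A, B }

    static final class LogicalPipe {
        private final Map<Link, Boolean> up = new EnumMap<>(Link.class);
        private Link active = Link.A;

        LogicalPipe() {
            up.put(Link.A, true);
            up.put(Link.B, true);
        }

        // Called when a link's timeout monitor reports a state change.
        void linkStateChanged(Link link, boolean isUp) {
            up.put(link, isUp);
            if (!up.get(active)) {
                Link other = (active == Link.A) ? Link.B : Link.A;
                if (up.get(other)) {
                    active = other; // fail over; the pipe object itself is reused
                }
            }
        }

        // The client only sees a disconnect once both links are down.
        boolean clientSeesConnected() {
            return up.get(Link.A) || up.get(Link.B);
        }

        Link activeLink() {
            return active;
        }
    }

    public static void main(String[] args) {
        LogicalPipe pipe = new LogicalPipe();
        pipe.linkStateChanged(Link.A, false); // first link drops
        System.out.println("active=" + pipe.activeLink()
            + " connected=" + pipe.clientSeesConnected());
        pipe.linkStateChanged(Link.B, false); // second link drops as well
        System.out.println("connected=" + pipe.clientSeesConnected());
    }
}
```

Because the `LogicalPipe` instance survives both links going down, a link that comes back later can be picked up again without the client recreating anything.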

my grain of sand,

On Fri, 17 Mar 2023 at 3:09, Christofer Dutz wrote:

> Hi all,
>
> I would like to address the timeout handling in our Java driver core next.
> Unfortunately, I am a bit unsure how to go about it.
> Usually, I would whip up a unit test that triggers the error and then fix
> it.
>
> However, I don’t quite know how to reproduce the problem that people are
> describing.
>
> Would anyone here be able to assist me with at least that? I’m happy to do
> the fixing.
> I just feel uncomfortable swapping some code out for random other
> code.
>
>
> Chris
>
>

-- 
*CEOS Automatización, C.A.*
*GALPON SERVICIO INDUSTRIALES Y NAVALES FA, C.A.,*
*PISO 1, OFICINA 2, AV. RAUL LEONI, SECTOR GUAMACHITO,*

*FRENTE A LA ASOCIACION DE GANADEROS,BARCELONA,EDO. ANZOATEGUI*
*Ing. César García*

*Cel: +58 414-760.98.95*

*Hotline Técnica SIEMENS: 0800 1005080*

*Email: support.aan.automat...@siemens.com*


Re: Trying to fix the timeout issues ...

2023-03-17 Thread youlin he
PR 822 handles the base Netty timeout very well.

But CachedConnection should have its own timeout handling.

The two are not the same.

Most of the connection problems are caused by the base Netty connection
not being handled very well. @Łukasz Dywicki has fixed the response timeout
problem.

As I pointed out in https://github.com/apache/plc4x/issues/801,
there are other Netty connection issues that cause unpredictable errors.


On Fri, 17 Mar 2023 at 19:25, Łukasz Dywicki wrote:

> Hey Chris,
> We do have one fairly basic test, Plc4xNettyWrapperTest, which covers
> timeout handling. The issue behind the PR I made,
> https://github.com/apache/plc4x/pull/822, was spotted by this test, which
> showed that we had a behavior change.
>
> One thing I still haven't found an answer to is how to bind the lifecycle
> of the timeout manager to the connection, and how to let the end user pass
> an external timeout manager into the driver (so they can centrally track
> degraded operations).
>
> Best,
> Łukasz


Re: Trying to fix the timeout issues ...

2023-03-17 Thread Łukasz Dywicki

Hey Chris,
We do have one fairly basic test, Plc4xNettyWrapperTest, which covers timeout
handling. The issue behind the PR I made,
https://github.com/apache/plc4x/pull/822, was spotted by this test, which
showed that we had a behavior change.


One thing I still haven't found an answer to is how to bind the lifecycle
of the timeout manager to the connection, and how to let the end user pass an
external timeout manager into the driver (so they can centrally track degraded
operations).


Best,
Łukasz

On 17.03.2023 08:09, Christofer Dutz wrote:

Hi all,

I would like to address the timeout handling in our Java driver core next.
Unfortunately, I am a bit unsure how to go about it.
Usually, I would whip up a unit test that triggers the error and then fix it.

However, I don’t quite know how to reproduce the problem that people are
describing.

Would anyone here be able to assist me with at least that? I’m happy to do the
fixing.
I just feel uncomfortable swapping some code out for random other code.


Chris




Trying to fix the timeout issues ...

2023-03-17 Thread Christofer Dutz
Hi all,

I would like to address the timeout handling in our Java driver core next.
Unfortunately, I am a bit unsure how to go about it.
Usually, I would whip up a unit test that triggers the error and then fix it.

However, I don’t quite know how to reproduce the problem that people are
describing.

Would anyone here be able to assist me with at least that? I’m happy to do the
fixing.
I just feel uncomfortable swapping some code out for random other code.


Chris