Re: Possible Http11NioProtocol regression since 9.0.48?

2021-07-06 Thread Mark Thomas

On 06/07/2021 10:01, Rémy Maucherat wrote:

On Tue, Jul 6, 2021 at 10:41 AM Mark Thomas  wrote:


On 05/07/2021 20:19, André van der Lugt wrote:


Mark, thank you for your efforts so far.


No problem. Happy to help.

At this point it is hard to tell where the root cause is. It is possible
that the Tomcat changes introduced a bug in Tomcat. It is also possible
that the Tomcat changes triggered a set of circumstances that exposed a
pre-existing bug elsewhere (Java, OS, network).


I can revert the changes again and we can figure out what's wrong in
10.x eventually.


I'm not sure there is anything wrong in Tomcat at this point.

- No-one else has reported the issue (yet).
- The issue doesn't occur with a direct connection to Tomcat.
- The network trace shows a lot of packet re-ordering which must be
  happening somewhere in the network - which makes me wonder what else
  the network is doing.

At this point I think we need more information before making a decision.

Mark



Rémy



As the next step, I'd suggest collecting a network trace from the Tomcat
VM at the same you collect a trace from the client. The aim of this
would be to determine if Tomcat / Java / OS/ VM is delaying those final
packets or if the delay is occurring somewhere in the network. By
comparing the two traces you should be able to narrow down where the
delay is happening.

Mark






Other questions that come to mind:

1. Was the network trace created from a client connected directly to
Tomcat or was there a proxy / firewall / something else between the
client and Tomcat?


Tomcat is running on the company network to which I'm connected
through company mandated VPN and who knows what else
equipment there is between me and Tomcat's VM.

Your question made me think: what if I try from a browser installed
on another VM on the company network? So I did from two different
machines and bingo ... the problem does not manifest itself.
Maybe I need to have a word with our IT?



2. Where was the trace collected from? On the client? On the server?
Somewhere else?


Trace was collected on the client.

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org




-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org




-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: Possible Http11NioProtocol regression since 9.0.48?

2021-07-06 Thread Rémy Maucherat
On Tue, Jul 6, 2021 at 10:41 AM Mark Thomas  wrote:
>
> On 05/07/2021 20:19, André van der Lugt wrote:
> >
> > Mark, thank you for your efforts so far.
>
> No problem. Happy to help.
>
> At this point it is hard to tell where the root cause is. It is possible
> that the Tomcat changes introduced a bug in Tomcat. It is also possible
> that the Tomcat changes triggered a set of circumstances that exposed a
> pre-existing bug elsewhere (Java, OS, network).

I can revert the changes again and we can figure out what's wrong in
10.x eventually.

Rémy

>
> As the next step, I'd suggest collecting a network trace from the Tomcat
> VM at the same you collect a trace from the client. The aim of this
> would be to determine if Tomcat / Java / OS/ VM is delaying those final
> packets or if the delay is occurring somewhere in the network. By
> comparing the two traces you should be able to narrow down where the
> delay is happening.
>
> Mark
>
>
> >
> >>
> >> Other questions that come to mind:
> >>
> >> 1. Was the network trace created from a client connected directly to
> >> Tomcat or was there a proxy / firewall / something else between the
> >> client and Tomcat?
> >
> > Tomcat is running on the company network to which I'm connected
> > through company mandated VPN and who knows what else
> > equipment there is between me and Tomcat's VM.
> >
> > Your question made me think: what if I try from a browser installed
> > on another VM on the company network? So I did from two different
> > machines and bingo ... the problem does not manifest itself.
> > Maybe I need to have a word with our IT?
> >
> >>
> >> 2. Where was the trace collected from? On the client? On the server?
> >> Somewhere else?
> >
> > Trace was collected on the client.
> >
> > -
> > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > For additional commands, e-mail: users-h...@tomcat.apache.org
> >
>
>
> -
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: Possible Http11NioProtocol regression since 9.0.48?

2021-07-06 Thread Mark Thomas

On 05/07/2021 20:19, André van der Lugt wrote:


Mark, thank you for your efforts so far.


No problem. Happy to help.

At this point it is hard to tell where the root cause is. It is possible 
that the Tomcat changes introduced a bug in Tomcat. It is also possible 
that the Tomcat changes triggered a set of circumstances that exposed a 
pre-existing bug elsewhere (Java, OS, network).


As the next step, I'd suggest collecting a network trace from the Tomcat 
VM at the same you collect a trace from the client. The aim of this 
would be to determine if Tomcat / Java / OS/ VM is delaying those final 
packets or if the delay is occurring somewhere in the network. By 
comparing the two traces you should be able to narrow down where the 
delay is happening.


Mark






Other questions that come to mind:

1. Was the network trace created from a client connected directly to
Tomcat or was there a proxy / firewall / something else between the
client and Tomcat?


Tomcat is running on the company network to which I'm connected
through company mandated VPN and who knows what else
equipment there is between me and Tomcat's VM.

Your question made me think: what if I try from a browser installed
on another VM on the company network? So I did from two different
machines and bingo ... the problem does not manifest itself.
Maybe I need to have a word with our IT?



2. Where was the trace collected from? On the client? On the server?
Somewhere else?


Trace was collected on the client.

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org




-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



RE: Possible Http11NioProtocol regression since 9.0.48?

2021-07-05 Thread André van der Lugt

Mark, thank you for your efforts so far.

> 
> Other questions that come to mind:
> 
> 1. Was the network trace created from a client connected directly to
> Tomcat or was there a proxy / firewall / something else between the
> client and Tomcat?

Tomcat is running on the company network to which I'm connected
through company mandated VPN and who knows what else
equipment there is between me and Tomcat's VM.

Your question made me think: what if I try from a browser installed
on another VM on the company network? So I did from two different
machines and bingo ... the problem does not manifest itself.
Maybe I need to have a word with our IT?

> 
> 2. Where was the trace collected from? On the client? On the server?
> Somewhere else?

Trace was collected on the client.

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: Possible Http11NioProtocol regression since 9.0.48?

2021-07-05 Thread Mark Thomas

On 02/07/2021 13:09, Mark Thomas wrote:

On 02/07/2021 12:43, André van der Lugt wrote:

I finally managed to create a decrypted Wireshark capture with 
injected TLS session keys, will send it in a direct message due to 
size. I hope it provides the information needed.


Thanks. I have the file. I'll hopefully have time to look at this later 
today.


Thanks for the trace.

I've spent most of the day reviewing it with the following conclusions:

1. Out of the 10 streams included in the trace, it is stream 2 that is 
of interest.


2. There is a lot of out of order packets in steam 2 - to the point that 
Wireshark appears to be unable to follow the stream and decrypt 
everything (although that doesn't appear to be an issue at this stage).


3. The response pauses about 10KB from the end. After 10s there is a TCP 
keep alive ping/pong and after a further 4s the remaining 10KB is received.


My original theory (an inconsistent content-length header or chunk size) 
looks to be wrong.


Based on the observations above, it looks like Tomcat might not be 
flushing the final part of the response after the application has 
written it. It is only being flushed when the connection is closed.


I'll see if I can find a way to recreate the issue locally but what 
would be really helpful would be a test WAR that demonstrates the issue. 
Given this seems to be triggered by static resources, are you able to 
create a cut-down version of your app that exhibits the issue that you 
could share (privately if necessary)?


Other questions that come to mind:

1. Was the network trace created from a client connected directly to 
Tomcat or was there a proxy / firewall / something else between the 
client and Tomcat?


2. Where was the trace collected from? On the client? On the server? 
Somewhere else?


Thanks,

Mark

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: Possible Http11NioProtocol regression since 9.0.48?

2021-07-02 Thread Mark Thomas

On 02/07/2021 12:43, André van der Lugt wrote:


I finally managed to create a decrypted Wireshark capture with injected TLS 
session keys, will send it in a direct message due to size. I hope it provides 
the information needed.


Thanks. I have the file. I'll hopefully have time to look at this later 
today.


Mark

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



RE: Possible Http11NioProtocol regression since 9.0.48?

2021-07-02 Thread André van der Lugt


> -Original Message-
> From: Mark Thomas 
> Sent: Wednesday, June 30, 2021 15:22
> To: users@tomcat.apache.org
> Subject: Re: Possible Http11NioProtocol regression since 9.0.48?
> 
> On 30/06/2021 13:57, André van der Lugt wrote:
> > Hi,
> >
> > Since upgrading our Tomcat 9.0.x installation from 9.0.46 to 9.0.48, I've
> noticed
> > several times that the download of a response with static content from a
> web archive
> > (i.e. JavaScript file) takes 'keepAliveTimeout' time to complete.
> > It only occurs with SSL enabled Http11NioProtocol and useSendfile=false.
> >
> > Could this be a bug?
> 
> Possibly.
> 
>  > What is needed to further clarify this?
> 
> A decrypted network trace (e.g. from wireshark) showing the problematic
> connection.
> 
> You'll need to set the SSLKEYLOGFILE environment variable.
> 

I finally managed to create a decrypted Wireshark capture with injected TLS 
session keys, will send it in a direct message due to size. I hope it provides 
the information needed.

> How often does this occur? Is it on every request?
The bootstrap client program is loading a number of Javascript files, with only 
one of them exceeding 700 KB compressed size.

The first time after Tomcat start, everything is fine for all requests. When 
rerunning the bootstrap client program, the problem (taking 'keepAliveTimeout' 
to complete) occurs only for the 'big' Javascript file. And it stays like that 
with each next run.

> 
> > The issue is clearly visible in the browser's developer tools on tab Timing:
> > the Content Download phase takes its normal time to download the
> content but
> > also indicates "Caution: request is not finished yet!".
> 
> The suggests some sort of issue with the client determining the content
> length.
> 
> Nothing in the changelog jumps out at me as a potential cause for this
> at the moment.
> 
> >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fto
> mcat.apache.org%2Fmigration-
> 9.html%23Upgrading_9.0.xdata=04%7C01%7C%7C1d1fc0d34df5420b5
> 79f08d93bca19e3%7C84df9e7fe9f640afb435%7C1%7C0%7C637
> 606561467284838%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAi
> LCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=cX
> lU0Zg6Hwly7d4ZN0wUyfnstQ3UoNVnkdPCvwpvSmA%3Dreserved=0
> states this:
> > "In 9.0.48 onwards, the NIO poller was simplified and the block poller 
> > and
> >  selector configuration has been removed"
> > Could it be that the problem symptom is a consequence or side effect of
> these changes?
> 
> I wouldn't rule anything out at this point.
> 
> Mark
> 
> -
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org


-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: Possible Http11NioProtocol regression since 9.0.48?

2021-06-30 Thread Mark Thomas

On 30/06/2021 13:57, André van der Lugt wrote:

Hi,

Since upgrading our Tomcat 9.0.x installation from 9.0.46 to 9.0.48, I've 
noticed
several times that the download of a response with static content from a web 
archive
(i.e. JavaScript file) takes 'keepAliveTimeout' time to complete.
It only occurs with SSL enabled Http11NioProtocol and useSendfile=false.

Could this be a bug?


Possibly.

> What is needed to further clarify this?

A decrypted network trace (e.g. from wireshark) showing the problematic 
connection.


You'll need to set the SSLKEYLOGFILE environment variable.

How often does this occur? Is it on every request?


The issue is clearly visible in the browser's developer tools on tab Timing:
the Content Download phase takes its normal time to download the content but
also indicates "Caution: request is not finished yet!".


The suggests some sort of issue with the client determining the content 
length.


Nothing in the changelog jumps out at me as a potential cause for this 
at the moment.



https://tomcat.apache.org/migration-9.html#Upgrading_9.0.x states this:
"In 9.0.48 onwards, the NIO poller was simplified and the block poller and
 selector configuration has been removed"
Could it be that the problem symptom is a consequence or side effect of these 
changes?


I wouldn't rule anything out at this point.

Mark

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org