On 23/01/2025 14:34, Mark Thomas wrote:
<snip/>
All of that suggests that something detects an issue with this request
(or it just times out) which triggers the async error handling which
eventually leads to the async request being completed/dispatched.
In the case of the unit test, it was a client disconnect during async
processing that triggered the issue.
I suspect the NPE occurs more often on the CI system because it is
relatively heavily loaded. We often see that with rare errors that are
timing dependent.
As soon as I put my dev machine under load, I was able to recreate the
issue in my IDE.
The NPE is a Tomcat bug which will be fixed shortly and the fix included
in the next round of releases.
Now, the next thing that happens is even more surprising to me.
Sometimes the uncaught NPE triggers an HTTP 500 response on a request
to a servlet in application B! This I've been able to reproduce in my
development environment, if I simply throw an NPE in application A
from within the AsyncContext as shown in the stack trace above while I
simultaneously keep requests coming in to application B. Then it
sometimes happens. I haven't been able to see what's going on in
application B when this happens, but my trace logs don't show anything
suspicious. The client making requests to application B just gets
back this:
<!doctype html><html lang="en"><head><title>HTTP Status 500 – Internal
Server Error</title><style type="text/css">body
{font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b
{color:white;background-color:#525D76;} h1 {font-size:22px;} h2
{font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a
{color:black;} .line
{height:1px;background-color:#525D76;border:none;}</style></head>
<body><h1>HTTP Status 500 – Internal Server Error</h1><hr class="line"
/><p><b>Type</b> Exception Report</p><p><b>Description</b> The server
encountered an unexpected condition that prevented it from fulfilling
the request.</p><p><b>Exception</b></p>
<pre>java.lang.NullPointerException
</pre><p><b>Note</b> The full stack trace of the root cause is
available in the server logs.</p><hr class="line" /><h3>Apache
Tomcat/10.1.28</h3></body></html>
That is most likely because the application A thread has continued to use a
request/response that should have been recycled and reused for
application B. That said, I wouldn't rule out a Tomcat bug so it is
worth continuing to look at this.
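To illustrate the kind of failure I have in mind, here is a minimal,
self-contained simulation. It is not Tomcat code: PooledResponse,
ResponsePool and StaleReferenceDemo are hypothetical names invented for
this sketch, and the real request/response recycling is more involved.
It only shows how a reference retained after an object has been recycled
can leak an error status into whichever request is given that object
next.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a pooled response object; not a Tomcat class.
final class PooledResponse {
    private int status = 200;
    private String owner;

    void assignTo(String owner) { this.owner = owner; }
    void recycle() { status = 200; owner = null; }
    void setStatus(int status) { this.status = status; }
    int getStatus() { return status; }
    String getOwner() { return owner; }
}

// Hypothetical pool that hands recycled objects to the next request.
final class ResponsePool {
    private final Deque<PooledResponse> free = new ArrayDeque<>();

    PooledResponse acquire(String owner) {
        PooledResponse r = free.isEmpty() ? new PooledResponse() : free.pop();
        r.assignTo(owner);
        return r;
    }

    void release(PooledResponse r) {
        r.recycle();
        free.push(r);
    }
}

public class StaleReferenceDemo {
    // Returns the status that application B's request ends up with.
    static int run() {
        ResponsePool pool = new ResponsePool();

        PooledResponse forA = pool.acquire("A");
        PooledResponse stale = forA;              // A's async code keeps this reference
        pool.release(forA);                       // container completes and recycles it

        PooledResponse forB = pool.acquire("B");  // the same object is reused for B
        stale.setStatus(500);                     // A's late error handling writes to it

        return forB.getStatus();
    }

    public static void main(String[] args) {
        System.out.println("Status seen by B: " + run()); // prints "Status seen by B: 500"
    }
}
```

In this sketch application B never does anything wrong, yet its response
carries the 500 set on behalf of application A. That would be consistent
with seeing nothing suspicious in application B's own trace logs.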
It would be worth testing with a 10.1.x build if you can. You can either
build Tomcat from source or get a snapshot build from the CI system in a
couple of hours once it has built the latest 10.1.x.
If this Tomcat bug was the cause of all the issues then great. That
said, it is usually the case that more than one thing is going wrong.
Fixing one thing just makes the next one easier to see. Do let us know
how you get on once the NPE issue is out of the picture.
Mark