On 23/01/2025 14:34, Mark Thomas wrote:

<snip/>

All of that suggests that something detects an issue with this request (or it just times out). That triggers the async error handling, which eventually leads to the async request being completed/dispatched.

In the case of the unit test, it was a client disconnect during async processing that triggered the issue.

I suspect the NPE occurs more often on the CI system because it is relatively heavily loaded. We often see that with rare errors that are timing dependent.

As soon as I put my dev machine under load, I was able to recreate the issue in my IDE.

The NPE is a Tomcat bug which will be fixed shortly and the fix included in the next round of releases.

Now, the next thing that happens is even more surprising to me.
Sometimes the uncaught NPE triggers an HTTP 500 response on a request
to a servlet in application B! This I've been able to reproduce in my
development environment, if I simply throw an NPE in application A
from within the AsyncContext as shown in the stack trace above while I
simultaneously keep requests coming in to application B. Then it
sometimes happens. I haven't been able to see what's going on in
application B when this happens, but my trace logs don't show anything
suspicious. The client making requests to application B just gets
back this:
<!doctype html><html lang="en"><head><title>HTTP Status 500 – Internal
Server Error</title><style type="text/css">body
{font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b
{color:white;background-color:#525D76;} h1 {font-size:22px;} h2
{font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a
{color:black;} .line
{height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP
Status 500 – Internal Server Error</h1><hr class="line"
/><p><b>Type</b> Exception Report</p><p><b>Description</b> The server
encountered an unexpected condition that prevented it from fulfilling
the request.</p><p><b>Exception</b></p><pre>java.lang.NullPointerException
</pre><p><b>Note</b> The full stack trace of the root cause is
available in the server logs.</p><hr class="line" /><h3>Apache
Tomcat/10.1.28</h3></body></html>

This is most likely because the application A thread has continued to use a request/response that should have been recycled and reused for application B. That said, I wouldn't rule out a Tomcat bug, so it is worth continuing to look at this.
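
To illustrate the recycling hazard, here is a minimal, self-contained sketch. It does not use any Tomcat or Servlet API classes: PooledResponse, ResponsePool and StaleReferenceDemo are hypothetical names invented for this example, and the pool is only assumed to behave roughly like a container that reuses request/response objects. It shows how a reference kept by application A after the object has been recycled can end up writing into the response that is now serving application B.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a container-managed response object (not a Tomcat class).
final class PooledResponse {
    private String owner;
    private final StringBuilder body = new StringBuilder();

    void assignTo(String newOwner) { owner = newOwner; body.setLength(0); }
    void recycle() { owner = null; body.setLength(0); }
    void write(String text) { body.append(text); }
    String owner() { return owner; }
    String body() { return body.toString(); }
}

// Hypothetical pool that hands the same instance to successive requests.
final class ResponsePool {
    private final Deque<PooledResponse> free = new ArrayDeque<>();

    PooledResponse acquire(String owner) {
        PooledResponse r = free.isEmpty() ? new PooledResponse() : free.pop();
        r.assignTo(owner);
        return r;
    }

    void release(PooledResponse r) {
        r.recycle();
        free.push(r);
    }
}

final class StaleReferenceDemo {
    static String run() {
        ResponsePool pool = new ResponsePool();

        // Application A's request is given a response object.
        PooledResponse heldByA = pool.acquire("application A");

        // The async request completes (or errors) and the object is recycled,
        // but application A's thread still holds its reference.
        pool.release(heldByA);

        // The next request, for application B, receives the same instance.
        PooledResponse forB = pool.acquire("application B");

        // Application A writes through its stale reference.
        heldByA.write("500 from A's failure");

        return forB.owner() + " sees: " + forB.body();
    }

    public static void main(String[] args) {
        // Prints: application B sees: 500 from A's failure
        System.out.println(run());
    }
}
```

If something like this is what is happening, the thing to check in application A is whether any thread can still touch the request, response or AsyncContext after complete() or dispatch() has been called, or after an error or timeout has been reported to an AsyncListener.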

It would be worth testing with a 10.1.x build if you can. You can either build Tomcat from source or get a snapshot build from the CI system in a couple of hours, once it has built the latest 10.1.x.

If this Tomcat bug was the cause of all the issues then great. That said, it is usually the case that more than one thing is going wrong. Fixing one thing just makes the next one easier to see. Do let us know how you get on once the NPE issue is out of the picture.

Mark

