Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
Mark and Dan,

On 6/21/23 04:57, Mark Thomas wrote:
> On 20/06/2023 17:12, Dan McLaughlin wrote:
>> Mark,
>>
>> What are your thoughts on changing the Tomcat codebase to return a 503
>> instead of a 404 if a context is marked as distributable or if
>> clustering is enabled and deployed but stopped?
>>
>> When I did searches years ago on this issue, most people at the time
>> would recommend adding 404 to the fail_on_status, which is what we
>> did...until I realized that we were causing our own internal DOS
>> attack when we had a 404 mistakenly left in our apps; that got me
>> thinking how easy it would be to make mod_jk thrash by just requesting
>> pages that didn't exist.
>>
>> It's not a huge issue for us since most of our apps are authenticated
>> using SAML, so all requests are intercepted before the request is ever
>> sent to Tomcat, but for our apps that don't require authentication, it
>> would be easy to exploit any app that had 404 in the fail_on_status.
>
> I think the problem is the "STOPPED" state is used by different users
> for different things. Some want it to be equivalent to "The application
> isn't deployed" while others want it to be equivalent to "The
> application is present but currently under maintenance".
>
> I don't think we can safely infer which of those behaviors the user
> wants from the clustering and/or distributable settings.
>
> I think the best solution is the "maintenance in progress" servlet
> deployed in the ROOT web application.
>
> Other options I considered:
>
> 1. New Lifecycle state "MAINTENANCE".
>    This would be a significant change and I don't think the size of the
>    problem justifies the scale of the changes required.
>
> 2. Extending/enhancing the "pause" feature.
>    Not really the right place to start as pausing a context doesn't
>    allow it to be updated (assuming updates are the main reason for the
>    maintenance).
>
> 3. A per Host configuration option to set the status to be used for
>    deployed but stopped web applications. Defaults to 404. Could be
>    configured to be 503.
>    Would require some changes to the mapper to add/remove contexts on
>    deploy/undeploy rather than start/stop. Actually, this is a
>    significant behavioural change since it changes the mapping. And the
>    rewrite valve may complicate things further.
>
> The more I think about this, the more nervous I get about changes like
> this introducing regressions.
>
> I come back to the "maintenance in progress" servlet deployed in the
> ROOT web application. The one use case this doesn't cover is
> maintenance of the ROOT web application. Currently Tomcat is hard-coded
> to return a 404 if a request would be mapped to ROOT but that
> application isn't started. I think a request to make that status
> configurable would be implemented pretty quickly.

If you want to remove the node from the load-balancer, why not ... just
do it?

You can't test an application without it being deployed, and taking a
node down for maintenance can (and IMHO should) include notifying the
load-balancer that that node is coming down for maintenance. Otherwise,
you'll bounce users off the node unnecessarily.

Since mod_jk is being used, why not simply change the state of the
node-worker in mod_jk from ACTIVE to DISABLED (for testing, since
requests with that node as a target will continue to go to it) or
STOPPED (where mod_jk won't send any requests to it anymore)?

http://home.apache.org/~schultz/ApacheCon%20NA%202015/Load-balancing%20Tomcat%20with%20mod_jk.pdf

Start on Slide 41

-chris

To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
On 20/06/2023 17:12, Dan McLaughlin wrote:
> Mark,
>
> What are your thoughts on changing the Tomcat codebase to return a 503
> instead of a 404 if a context is marked as distributable or if
> clustering is enabled and deployed but stopped?
>
> When I did searches years ago on this issue, most people at the time
> would recommend adding 404 to the fail_on_status, which is what we
> did...until I realized that we were causing our own internal DOS
> attack when we had a 404 mistakenly left in our apps; that got me
> thinking how easy it would be to make mod_jk thrash by just requesting
> pages that didn't exist.
>
> It's not a huge issue for us since most of our apps are authenticated
> using SAML, so all requests are intercepted before the request is ever
> sent to Tomcat, but for our apps that don't require authentication, it
> would be easy to exploit any app that had 404 in the fail_on_status.

I think the problem is the "STOPPED" state is used by different users
for different things. Some want it to be equivalent to "The application
isn't deployed" while others want it to be equivalent to "The
application is present but currently under maintenance".

I don't think we can safely infer which of those behaviors the user
wants from the clustering and/or distributable settings.

I think the best solution is the "maintenance in progress" servlet
deployed in the ROOT web application.

Other options I considered:

1. New Lifecycle state "MAINTENANCE".
   This would be a significant change and I don't think the size of the
   problem justifies the scale of the changes required.

2. Extending/enhancing the "pause" feature.
   Not really the right place to start as pausing a context doesn't
   allow it to be updated (assuming updates are the main reason for the
   maintenance).

3. A per Host configuration option to set the status to be used for
   deployed but stopped web applications. Defaults to 404. Could be
   configured to be 503.
   Would require some changes to the mapper to add/remove contexts on
   deploy/undeploy rather than start/stop. Actually, this is a
   significant behavioural change since it changes the mapping. And the
   rewrite valve may complicate things further.

The more I think about this, the more nervous I get about changes like
this introducing regressions.

I come back to the "maintenance in progress" servlet deployed in the
ROOT web application. The one use case this doesn't cover is
maintenance of the ROOT web application. Currently Tomcat is hard-coded
to return a 404 if a request would be mapped to ROOT but that
application isn't started. I think a request to make that status
configurable would be implemented pretty quickly.

Mark
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
Dan,

On 6/20/23 11:32, Dan McLaughlin wrote:
> When I attach with a debugger, I can see what's causing it not to
> work. When the Web Application is started, then request.getContext();
> returns the correct Web Application context, but when the application
> is stopped, request.getContext(); returns the ROOT context, which is
> up, so the 404 is passed on.
>
> Why would request.getContext(); return ROOT if that wasn't the
> requested context? Is this a bug?

I know you posted a lot of messages in a short amount of time, and
maybe you've moved on from this question, but...

How does Tomcat know the difference between a request for /foo/bar and
/bar/foo if there is a ROOT application and a /foo application? What
happens when /foo is stopped, undeployed, etc.?

Why *wouldn't* ROOT handle all requests that don't go to another
application?

URL paths dictate which application handles a particular request, and
the ROOT application, by definition, handles requests that don't map to
another application.

-chris
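The mapping rule described above can be illustrated with a small, Tomcat-free sketch. This is not Tomcat's Mapper; `ContextMappingSketch` and `mapToContext` are hypothetical names, and the model assumes only what the thread states: the longest started context path wins, ROOT (modelled here as the empty path) catches everything else, and a stopped application is simply absent from the list of candidates.

```java
import java.util.List;

// Hypothetical, simplified model of context mapping; not Tomcat's Mapper.
final class ContextMappingSketch {

    // Returns the longest started context path that owns the URI.
    // "" stands for the ROOT context; null means no context matched,
    // which can only happen when ROOT itself is not in the list.
    static String mapToContext(String uri, List<String> startedContextPaths) {
        String best = null;
        for (String path : startedContextPaths) {
            boolean owns = path.isEmpty()
                    || uri.equals(path)
                    || uri.startsWith(path + "/");
            if (owns && (best == null || path.length() > best.length())) {
                best = path;
            }
        }
        return best;
    }
}
```

With ROOT and /foo started, /foo/bar maps to /foo and /bar/foo maps to ROOT. Once /foo is stopped, /foo/bar also falls through to ROOT, which is the behaviour seen in the debugger. With no ROOT deployed the sketch returns null, matching the null context reported elsewhere in this thread.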
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
FYI... Here is the valve I finally came up with that seems to work.

    import org.apache.catalina.*;
    import org.apache.catalina.connector.Request;
    import org.apache.catalina.connector.Response;
    import org.apache.catalina.valves.ValveBase;

    import jakarta.servlet.ServletException;
    import jakarta.servlet.http.HttpServletResponse;
    import java.io.IOException;
    import java.util.logging.Logger;

    public class DownForMaintenanceValve extends ValveBase {

        // Logger for this valve
        private static final Logger log =
                Logger.getLogger(DownForMaintenanceValve.class.getName());

        // Log that the valve has been instantiated
        public DownForMaintenanceValve() {
            log.info("DownForMaintenanceValve started");
        }

        // Main method of the Valve, where the logic is implemented
        @Override
        public void invoke(Request request, Response response)
                throws IOException, ServletException {
            // Context the request was mapped to; null when nothing
            // matched (for example, when no ROOT context is deployed)
            Context context = request.getContext();
            if (context == null) {
                log.info("Context is null, sending 503");
                response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
                return;
            }

            // The mapped context itself is not available
            if (!context.getState().isAvailable()) {
                log.info("Application is not available, sending 503");
                response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
                return;
            }

            // The mapped context is available, so check whether the URI
            // belongs to a deployed but unavailable child of the host
            String uri = request.getDecodedRequestURI();
            for (Container container : request.getHost().findChildren()) {
                if (container.getState().isAvailable()) {
                    continue;
                }
                // Cast to Context to be able to call Context methods
                Context stopped = (Context) container;
                // Match the context path itself or any subpath of it
                if (uri.equals(stopped.getPath())
                        || uri.startsWith(stopped.getPath() + '/')) {
                    log.info("Application is not available, sending 503");
                    response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
                    return;
                }
            }

            // No unavailable context matched the request URI: log a fine
            // message and pass the request to the next Valve
            log.fine("Application is available, passing to next valve");
            getNext().invoke(request, response);
        }
    }

--
Thanks,
Dan

On Tue, Jun 20, 2023 at 12:15 PM Dan McLaughlin wrote:
>
> One thing I just tested was to undeploy the ROOT context, which is how
> we run anyways, and this causes request.getContext() to return null,
> which with the code, as is, results in a null pointer and a 500 being
> thrown--which inadvertently would cause mod_jk to retry on another
> node. I don't like letting code knowingly throw null pointers, so I
> was thinking of just checking if the context is null and throwing a
> 503. The only problem is that the valve would only work when the ROOT
> context wasn't deployed, so your two other suggestions would be the
> only options.
>
> Mark,
>
> I've been considering opening an official enhancement request to the
> clustering implementation in Tomcat that would state the following...
>
> Currently, when an application within a clustered environment is
> unavailable or stopped, Tomcat returns an HTTP 404 (Not Found) status
> code. While this behavior is generally acceptable in a non-clustered
> environment, it can lead to less than optimal routing decisions by
> load balancers within a clustered setup.
>
> Most load balancers, including mod_jk, do not interpret a 404 status
> code as an indication of application unavailability warranting a
> failover. Moreover, reconfiguring load balancers to treat 404 codes as
> triggers for failover could potentially expose systems to DOS attacks,
> as malicious users could generate unnecessary failovers by requesting
> non-existent resources.
>
> While there are workarounds to this issue, such as creating a custom
> valve to check the application status and modifying the 404 to a 503,
> or using root context and servlet mappings to return a 503, these
> solutions require custom implementations by the end user. This adds
> complexity and is not an ideal solution.
>
> In light of this, I propose that Tomcat should return an HTTP 503
> (Service Unavailable) status code when an application is not available
> in a clustered environment. The 503 code, which signifies temporary
> unavailability of the application, would align more accurately with
> the circumstances and could enable load balancers to make more
> informed and effective routing decisions.
>
> Thoughts?
>
> --
> Thanks,
> Dan
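The order of checks in the valve posted in this message can be restated without any Tomcat types, which makes the three 503 cases easier to reason about and to unit test. This is only a sketch: `MaintenanceDecisionSketch`, `Action`, and `decide` are hypothetical names, and the inputs stand in for `request.getContext()`, `context.getState().isAvailable()`, and the paths of the unavailable children of the host.

```java
import java.util.List;

// Hypothetical, Tomcat-free restatement of the valve's decision order.
final class MaintenanceDecisionSketch {

    enum Action { SEND_503, PASS_ON }

    // mappedContextAvailable is null when no context was mapped at all
    // (the case reported in the thread when ROOT is not deployed).
    static Action decide(String uri,
                         Boolean mappedContextAvailable,
                         List<String> stoppedContextPaths) {
        if (mappedContextAvailable == null) {
            return Action.SEND_503;      // nothing was mapped
        }
        if (!mappedContextAvailable) {
            return Action.SEND_503;      // mapped context is unavailable
        }
        for (String path : stoppedContextPaths) {
            if (uri.equals(path) || uri.startsWith(path + "/")) {
                return Action.SEND_503;  // a stopped context owns this URI
            }
        }
        return Action.PASS_ON;           // let the next valve handle it
    }
}
```

One consequence worth checking before relying on this order: in the first branch a request for a path that was never deployed is indistinguishable from a request for a stopped application, so when no ROOT context is deployed both would receive a 503.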
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
One thing I just tested was to undeploy the ROOT context, which is how
we run anyways, and this causes request.getContext() to return null,
which with the code, as is, results in a null pointer and a 500 being
thrown--which inadvertently would cause mod_jk to retry on another
node. I don't like letting code knowingly throw null pointers, so I
was thinking of just checking if the context is null and throwing a
503. The only problem is that the valve would only work when the ROOT
context wasn't deployed, so your two other suggestions would be the
only options.

Mark,

I've been considering opening an official enhancement request to the
clustering implementation in Tomcat that would state the following...

Currently, when an application within a clustered environment is
unavailable or stopped, Tomcat returns an HTTP 404 (Not Found) status
code. While this behavior is generally acceptable in a non-clustered
environment, it can lead to less than optimal routing decisions by
load balancers within a clustered setup.

Most load balancers, including mod_jk, do not interpret a 404 status
code as an indication of application unavailability warranting a
failover. Moreover, reconfiguring load balancers to treat 404 codes as
triggers for failover could potentially expose systems to DOS attacks,
as malicious users could generate unnecessary failovers by requesting
non-existent resources.

While there are workarounds to this issue, such as creating a custom
valve to check the application status and modifying the 404 to a 503,
or using root context and servlet mappings to return a 503, these
solutions require custom implementations by the end user. This adds
complexity and is not an ideal solution.

In light of this, I propose that Tomcat should return an HTTP 503
(Service Unavailable) status code when an application is not available
in a clustered environment. The 503 code, which signifies temporary
unavailability of the application, would align more accurately with
the circumstances and could enable load balancers to make more
informed and effective routing decisions.

Thoughts?

--
Thanks,
Dan

--
Thanks,
Dan McLaughlin
Robert Clay Vineyards
Proprietor/Vigneron
d...@robertclayvineyards.com
mobile: 512.633.8086
main: 325.261.0075
https://robertclayvineyards.com
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
Mark,

What are your thoughts on changing the Tomcat codebase to return a 503
instead of a 404 if a context is marked as distributable or if
clustering is enabled and deployed but stopped?

When I did searches years ago on this issue, most people at the time
would recommend adding 404 to the fail_on_status, which is what we
did...until I realized that we were causing our own internal DOS
attack when we had a 404 mistakenly left in our apps; that got me
thinking how easy it would be to make mod_jk thrash by just requesting
pages that didn't exist.

It's not a huge issue for us since most of our apps are authenticated
using SAML, so all requests are intercepted before the request is ever
sent to Tomcat, but for our apps that don't require authentication, it
would be easy to exploit any app that had 404 in the fail_on_status.

--
Thanks,
Dan
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
We typically don't deploy a ROOT context in our production
environments--for no other reason than making it more difficult to poke
around. I'll look at that as an option. Thanks for the tips.

--
Thanks,
Dan
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
When I attach with a debugger, I can see what's causing it not to work. When the Web Application is started, then request.getContext(); returns the correct Web Application context, but when the application is stopped, request.getContext(); returns the ROOT context, which is up, so the 404 is passed on. Why would request.getContext(); return ROOT if that wasn't the requested context? Is this a bug? -- Thanks, Dan -- Thanks, Dan McLaughlin DJAB Enterprises, LLC d...@djabenterprises.com mobile: 512.633.8086 NOTICE: This e-mail message and all attachments transmitted with it are for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is strictly prohibited. The contents of this e-mail are confidential and may be subject to work product privileges. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. On Tue, Jun 20, 2023 at 9:41 AM Dan McLaughlin wrote: > > So I tried to create a Valve to check to see if the application is stopped > and convert the 404 response to a 503, but I haven't had any luck getting it > to work. Is there another internal API that I should be using? > context.getState().isAvailable ways seems to report the app is available even > though it's stopped. 
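A note for readers following the thread: Mark's reply explains that a stopped web application is not a mapping candidate, so the request falls through to the next-longest context path, which is usually ROOT. The sketch below illustrates that fall-through in plain Java. It is a simplified stand-in and not Tomcat's actual Mapper; the class name, method names, and the use of the empty string for ROOT's path are assumptions made for illustration.

```java
import java.util.Collection;

/**
 * Illustration only: a simplified stand-in for context mapping.
 * This is NOT Tomcat's Mapper; all names here are hypothetical.
 */
final class ContextMappingSketch {

    private ContextMappingSketch() {
    }

    /** True if the URI is the context path itself or lies below it. */
    static boolean belongsTo(String uri, String contextPath) {
        return uri.equals(contextPath) || uri.startsWith(contextPath + "/");
    }

    /**
     * Returns the longest matching context path among the contexts that
     * are currently available. ROOT is represented by the empty string,
     * so it matches every URI that starts with "/".
     */
    static String mapContext(String uri, Collection<String> availablePaths) {
        String best = null;
        for (String path : availablePaths) {
            if (belongsTo(uri, path)
                    && (best == null || path.length() > best.length())) {
                best = path;
            }
        }
        // null means nothing matched, e.g. ROOT is not available either
        return best;
    }
}
```

With "", "/app1" and "/app2" available, a request for /app1/index.jsp maps to /app1. Once /app1 is left out of the available set, the same request maps to the empty string, i.e. ROOT, which matches what the debugger shows.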
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
On 20/06/2023 15:41, Dan McLaughlin wrote:
> So I tried to create a Valve to check to see if the application is stopped
> and convert the 404 response to a 503, but I haven't had any luck getting it
> to work. Is there another internal API that I should be using?
> context.getState().isAvailable() always seems to report the app is available
> even though it's stopped.

The code is looking at the wrong Context. Since the web application has been stopped, the request won't be mapped to it. I'm guessing the request has been mapped to the ROOT context, which is available.

You'll need to do something like:

    Container[] containers = request.getHost().findChildren();
    for (Container container : containers) {
        // Only stopped (unavailable) contexts are of interest
        if (container.getState().isAvailable()) {
            continue;
        }
        Context context = (Context) container;
        // Is the request for this stopped context or a path below it?
        if (request.getDecodedRequestURI().equals(context.getPath()) ||
                request.getDecodedRequestURI().startsWith(
                        context.getPath() + '/')) {
            response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
            return; // the response is complete; skip the rest of the pipeline
        }
    }

I haven't optimised this at all. It isn't particularly efficient. It is just to give you an idea.

Actually, I have just had a much better idea. It works by taking advantage of the Servlet specification mapping rules, which require the longest context path match.

Let's assume you have /app1, /app2 and /app3.

In your ROOT web application, create a maintenance Servlet that just returns a 503 and map it to "/app1/*", "/app2/*" and "/app3/*".

If app1 is running, the longest context path match rule means the request will be mapped to /app1 and the application will handle it. If the web application is stopped, the request will be mapped to ROOT, where it will match the maintenance Servlet and return a 503.

The only thing this won't work for is taking the ROOT web application out of service.
Mark
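To make the maintenance Servlet idea concrete, here is a small plain-Java simulation of the outcome it should produce for the /app1, /app2 and /app3 example. It is only a sketch under stated assumptions: the class and method names are hypothetical, 200 merely stands in for "handled by the running application", and the matching is a simplification of the real mapping rules. A real maintenance Servlet in ROOT would extend HttpServlet and call response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE).

```java
import java.util.Set;

/**
 * Illustration only: simulates which status a client would see when ROOT
 * hosts a maintenance servlet mapped to "<app>/*". All names are hypothetical.
 */
final class MaintenanceMappingSketch {

    // 200 is a placeholder for "the running application handles the request"
    static final int OK = 200;
    static final int NOT_FOUND = 404;
    static final int UNAVAILABLE = 503;

    private MaintenanceMappingSketch() {
    }

    /** True if the URI equals the prefix or lies below it. */
    static boolean under(String uri, String prefix) {
        return uri.equals(prefix) || uri.startsWith(prefix + "/");
    }

    static int status(String uri, Set<String> runningApps,
            Set<String> maintenancePrefixes) {
        // A running application has the longest context path match and wins
        for (String app : runningApps) {
            if (under(uri, app)) {
                return OK;
            }
        }
        // Otherwise the request lands in ROOT; a pattern such as "/app1/*"
        // sends it to the maintenance servlet
        for (String prefix : maintenancePrefixes) {
            if (under(uri, prefix)) {
                return UNAVAILABLE;
            }
        }
        // Unknown paths stay ordinary 404s and never trigger failover
        return NOT_FOUND;
    }
}
```

The point of the design is the last branch: only paths that belong to a known but stopped application produce a 503, so 404 can stay out of fail_on_status and requests for pages that do not exist cannot be used to mark nodes as failed.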
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
So I tried to create a Valve to check to see if the application is stopped and convert the 404 response to a 503, but I haven't had any luck getting it to work. Is there another internal API that I should be using? context.getState().isAvailable() always seems to report the app is available even though it's stopped.

    import org.apache.catalina.*;
    import org.apache.catalina.connector.Request;
    import org.apache.catalina.connector.Response;
    import org.apache.catalina.valves.ValveBase;

    import jakarta.servlet.ServletException;
    import java.io.IOException;
    import java.util.logging.Logger;
    import java.util.logging.Level;

    public class DownForMaintenanceValve extends ValveBase {

        // Create a Logger
        private static final Logger log =
                Logger.getLogger(DownForMaintenanceValve.class.getName());

        public DownForMaintenanceValve() {
            log.info("DownForMaintenanceValve started");
        }

        @Override
        public void invoke(Request request, Response response)
                throws IOException, ServletException {
            // The context the request was mapped to
            Context context = request.getContext();
            if (!context.getState().isAvailable()) {
                log.info("Application is not available, sending 503");
                response.sendError(503);
            } else {
                log.fine("Application is available, passing to next valve");
                getNext().invoke(request, response);
            }
        }
    }

--
Thanks,
Dan

On Wed, Jun 14, 2023 at 2:32 PM Mark Thomas wrote:

> On 14/06/2023 19:49, Dan McLaughlin wrote:
> > Hello,
> >
> > This is probably a question that would be better suited for the dev list,
> > but I thought I'd start here first.
>
> That depends. It is generally better to start on the users list.
>
> > Does anyone understand the reasoning behind why Tomcat, when clustered,
> > throws an HTTP status 404 and not a 503 when you have an application
> > deployed but stopped or paused?
>
> The issue you describe only affects stopped applications. If an
> application is paused then any requests to that application should be
> held until the application is unpaused (or the client times out).
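The check in the valve above could instead be driven by the host's full list of deployed contexts, as Mark suggests in his reply. The decision itself can be isolated in plain Java, which makes it easy to test outside Tomcat. This is a sketch under stated assumptions, not a definitive implementation: the class and method names are hypothetical, and it deliberately differs from Mark's loop in one respect, as it picks the longest matching deployed context before looking at its state, so that a stopped ROOT context (path "") does not turn requests for running applications into 503s.

```java
import java.util.Map;

/**
 * Illustration only: the 404-versus-503 decision for a maintenance valve,
 * separated from the Tomcat API. All names here are hypothetical.
 */
final class StoppedContextDecision {

    private StoppedContextDecision() {
    }

    /**
     * @param decodedUri the decoded request URI, e.g. "/app1/index.jsp"
     * @param deployed   context path -> whether that context is available.
     *                   In a valve this map could be built from
     *                   request.getHost().findChildren(), using
     *                   ((Context) child).getPath() as the key and
     *                   child.getState().isAvailable() as the value.
     * @return true if the request belongs to a deployed but unavailable
     *         context and should therefore be answered with a 503
     */
    static boolean shouldSend503(String decodedUri,
            Map<String, Boolean> deployed) {
        String best = null;
        for (String path : deployed.keySet()) {
            boolean matches = decodedUri.equals(path)
                    || decodedUri.startsWith(path + "/");
            if (matches && (best == null || path.length() > best.length())) {
                best = path;
            }
        }
        // No deployed context owns the URI: leave the normal 404 alone
        return best != null && !deployed.get(best);
    }
}
```

In invoke(), a true result would lead to response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE) followed by a return, and a false result to getNext().invoke(request, response).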
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
Hey Mark,

Thanks for the information and quick response!

On Wed, Jun 14, 2023 at 2:32 PM Mark Thomas wrote:
> What is the use case for stopping one (or more) web applications on a node?

There are two cases. The typical one is a hot redeployment of an application. We don't use application context versions, only because we had issues with them in the past, although the last time I tried was years ago. If I remember correctly, the problems might have been classloader issues or related to JMX conflicts. For that reason, we redeploy using the same context and version. When the redeployment happens using the same context version, there is a small window during which the app is stopped.

The other case is that, on rare occasions, we need to stop just one application deployed on a Tomcat node to troubleshoot something where clustering is making it more difficult to debug. We don't want to take down all the apps or the entire Tomcat node because we need it to handle the load.

We don't hot deploy often, so it's not a huge issue, and even more rarely do we run into issues in production where we need to stop just one app, but it has happened. It would just be nice not to have to tell mod_jk that a node is down for an application, or to stop Tomcat, to keep requests from being sent to a stopped app. If a stopped app returned a 503, the failover would just happen.

The only reason I even looked at this is that I've been tasked with implementing a comprehensive solution for handling all the different error conditions properly and displaying the proper error pages. We are also implementing a way to put all our applications into a "Down for Maintenance" mode without having to stop them, one that can be scheduled at the individual application level.

I'll look at using a valve if we decide it's a big enough issue.

Thanks again for the explanation!

Dan
Re: Tomcat Clustering, Mod_JK, Fail_on_Status, Stopped Application
On 14/06/2023 19:49, Dan McLaughlin wrote:
> Hello,
>
> This is probably a question that would be better suited for the dev list,
> but I thought I'd start here first.

That depends. It is generally better to start on the users list.

> Does anyone understand the reasoning behind why Tomcat, when clustered,
> throws an HTTP status 404 and not a 503 when you have an application
> deployed but stopped or paused?

The issue you describe only affects stopped applications. If an application is paused then any requests to that application should be held until the application is unpaused (or the client times out).

The current Tomcat Mapper dates back to at least Tomcat 4. It might be earlier but I don't know the Tomcat 3 code well enough to find the Tomcat 3 mapping code in the web interface and I'm not curious enough to check the code out so I can use grep.

The clustering implementation dates back to Tomcat 5.

You'll need to dig through the archives to see if this topic was ever raised and, if it was, the result of that discussion. Probably around the time clustering was added.

> I think I understand that my only option is to
> failover for 404s considering the current implementation.

That might cause problems. If the node returning 404 is marked as down you'll have a DoS vulnerability that is trivial to exploit.

> I've looked to
> see if there was a configuration setting related to clustering that would
> allow me to change the behavior, and I couldn't find one; the only solution
> seems to be to write a custom listener that detects that an application is
> deployed but stopped or paused, and then throw a 503 instead.

That would be a better short-term solution and fairly simple to write. I'd probably do it as a Valve as you'll get access to Tomcat's internals that way.

The clustering implementation generally assumes that all applications are available on all nodes. If that isn't the case I wouldn't be surprised to see log messages indicating issues with replication.

What is the use case for stopping one (or more) web applications on a node?

Mark

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org