Chris,

Thank you for the completeness. I always miss the mark on the right level of detail (too little, too much) in my postings.
> On Dec 11, 2020, at 11:31 AM, Christopher Schultz
> <ch...@christopherschultz.net> wrote:
>
> Rob,
>
> On 12/9/20 23:58, Rob Sargent wrote:
>> My apologies if this is too vague to warrant consideration.
>
> It is vague, but we can always ask questions :)
>
>> In the recent past I managed a naked port, a Selector, a
>> ThreadPoolExecutor and friends (and it worked well enough...) but a
>> dear and knowledgeable friend suggested embedding Tomcat and using
>> HTTP.[3]
>
> Managing that yourself can be a pain. The downside to using Tomcat
> is that you have to use HTTP. But maybe that can work in your favor
> in certain cases, especially if you have other options (e.g. upgrade
> from HTTP to WebSocket after connection).
>
>> I have that working, one request at a time.
>
> Great.
>
>> Is it advisable and practical to (re)establish a ThreadPoolExecutor,
>> queue, etc. as a Tomcat-accessible "Resource" with JNDI lookup, and
>> have my servlet pass the work off to the Executor's queue?
>
> I don't really understand this at all. Are you asking how to
> mitigate a self-inflicted DoS because you have so many incoming
> connections?
>
> If you have enough hardware to satisfy the requests, and your usage
> pattern is as you suggest, then you will mostly have one or two huge
> requests in-flight at any time, and some large number of smaller
> (faster?) requests also in-flight at the same time.
>
> Again, no problem. You are constrained only by the resources you
> have available:
>
> 1. Memory
> 2. Maximum connections
> 3. Maximum threads (in your executor / request-processor thread pool)

For the large request, the middleware (my impl, or the Tomcat layer) takes the payload from the client and writes "lots" of records to the database. Do I want that save() call in the servlet, or should I queue it up for some other handler? All on the same hardware, but that frees up the servlet. In the small client (my self-made DoS), there's only a handful of writes, but it's still faster to hand that memory to a queue and let the servlet go back to the storm. That's the thinking behind the question of accessing a ThreadPoolExecutor via JNDI. I know my existing impl does queue jobs (so the load is greater than the capacity to handle requests). I worry that without off-loading, Tomcat would just spin up more servlet threads and exhaust resources. I can lose a client, but would rather not lose the server (that loses all clients...).
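For concreteness, here is roughly the shape I have in mind, as an untested sketch. I've parked a bounded pool in the ServletContext rather than wiring up a real JNDI ObjectFactory, and every name in it (PoolBootstrap, savePool, SaveServlet, /save) is made up:

    // PoolBootstrap.java -- create one bounded pool for the whole webapp
    import java.util.concurrent.*;
    import javax.servlet.*;
    import javax.servlet.annotation.WebListener;

    @WebListener
    public class PoolBootstrap implements ServletContextListener {
        public void contextInitialized(ServletContextEvent sce) {
            // Bounded queue + bounded threads: a storm of requests cannot
            // exhaust memory or spin up unlimited workers.
            ExecutorService pool = new ThreadPoolExecutor(
                    4, 8, 60, TimeUnit.SECONDS,
                    new ArrayBlockingQueue<Runnable>(1000),
                    new ThreadPoolExecutor.AbortPolicy()); // reject when full
            sce.getServletContext().setAttribute("savePool", pool);
        }
        public void contextDestroyed(ServletContextEvent sce) {
            ((ExecutorService) sce.getServletContext()
                    .getAttribute("savePool")).shutdown();
        }
    }

    // SaveServlet.java -- hand the payload off and free the servlet thread
    import java.util.concurrent.*;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.*;

    @WebServlet("/save")
    public class SaveServlet extends HttpServlet {
        protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws java.io.IOException {
            // Read the whole payload and answer the client before this
            // thread goes back to the pool (readAllBytes is Java 9+).
            byte[] payload = req.getInputStream().readAllBytes();
            ExecutorService pool = (ExecutorService)
                    getServletContext().getAttribute("savePool");
            try {
                pool.submit(() -> save(payload)); // db work off the servlet thread
                resp.setStatus(HttpServletResponse.SC_ACCEPTED);
            } catch (RejectedExecutionException e) {
                resp.setStatus(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
            }
        }
        private void save(byte[] payload) { /* the db writes */ }
    }

The bounded queue plus AbortPolicy is my answer to my own exhaust-resources worry: when the queue fills, the servlet answers 503 instead of piling up work.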
> If you have data that doesn't fit into byte[MAXINT] then maybe you
> don't want to try to handle it all at once. That's an application
> design decision, and if gzipping helps you in the short term, then
> great. My recommendation would be to look at ways of handling that
> request in a streaming fashion instead of buffering everything up in
> memory. The overall performance of your application will likely
> improve because of that change.

Re-working the structure of the payload (breaking it up) is an option, but not a pleasant one :)

>> [2] Given infinite EC2 capacity there would be tens of thousands of
>> jobs started at once. Realistic AWS capacity constraints limit this
>> to hundreds of instances from a queue of thousands. The duration of
>> any instance varies from hours to days. But the payload is simple,
>> under 5K bytes.
>
> If you are using AWS, then you can load-balance between any number
> of back-end Tomcat instances. The lb just has to decide which
> back-end instance to use. Sometimes lbs make bad decisions and you
> get all the load on a single node. That's bad because (a) one node
> is overloaded and (b) the other nodes are under-utilized. It's up to
> you to figure out how to get your load distributed in an equitable
> way.

Not anxious to add more Tomcat instances. I can manually throttle both types of requests for now.

> Back to the problem you are actually trying to solve.
>
> Are these "small requests" in any way a problem? Do they arrive
> frequently, and are they handled quickly? If so, then you can
> probably mostly just ignore them. It's the huge requests that are
> (likely) the problem.
>
> If you want to hand off control of a request to another thread for
> processing, there are a few ways to do that I can think of:
>
> 1. new Thread(new Runnable() { /* your stuff */ }).start();
>
> This is bad for a few reasons:
>
> 1a. Unlimited threads created by remote clients? Bad.
>
> 1b. The call to Thread.start() returns immediately and the servlet's
> execution ends. There is no way to reply to the client, and if you
> haven't read all their input, Bad Things will happen in Tomcat.
> (Like, your request and response objects will be re-used and you'll
> observe mass chaos.)
>
> 2. sharedExecutor.submit(new Runnable() { /* your stuff */ });
>
> This is bad for the same reason as 1b above, but it does not suffer
> from 1a. 1a is now replaced by:
>
> 2a. Unlimited jobs submitted by remote clients? Bad.
>
> 3. Use servlet async processing.
>
> I think this is probably ideal for your use-case. When you go into
> asynchronous mode, the request-processing thread is allowed to go
> back and service other requests, so you get a more responsive
> server, at least from your clients' perspectives.
>
> The bad news is that asynchronous mode requires that you completely
> change the way you think about communicating with a client. Instead
> of reading the input until you are satisfied, performing your
> business logic, then writing to the client until you are done, you
> have to subscribe to I/O events and handle them all appropriately.
> If you get it wrong, you can make a mess of things.

Here's where my ignorance will really shine: are you talking about HttpClient.sendAsync, or about a Tomcat mode of operation? The clients die as soon as they send the request. I wait for an OK, but don't really have to. (It would be nice if I could catch statusCode() != 200 and write it to a file (as I used to), but I can lose a client or two occasionally. Release 2.1 :) ) There's very little two-way communication: the AWS queue starts clients; clients ask for data to work on, do a bunch of simulations, and send results to the middleware; the middleware makes the db calls. The client does not know about the db.
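In case you mean the servlet-side machinery (the Servlet 3 AsyncContext) rather than anything in the client: my rough, untested picture is below. It only defers the processing (the input is still read on the request thread), so it dodges the full subscribe-to-I/O-events rewrite, which as I understand it would use ReadListener/WriteListener. Names invented:

    // AsyncSaveServlet.java -- untested sketch of Servlet 3 async mode
    import java.io.IOException;
    import javax.servlet.AsyncContext;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.*;

    @WebServlet(urlPatterns = "/simulate", asyncSupported = true)
    public class AsyncSaveServlet extends HttpServlet {
        protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            byte[] payload = req.getInputStream().readAllBytes();
            AsyncContext ctx = req.startAsync();
            ctx.setTimeout(0); // no container timeout; use with care
            // doPost returns here and the request thread goes back to
            // Tomcat's pool; the job finishes the exchange later.
            ctx.start(() -> {
                HttpServletResponse r = (HttpServletResponse) ctx.getResponse();
                try {
                    save(payload);
                    r.setStatus(HttpServletResponse.SC_OK);
                } catch (Exception e) {
                    r.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
                } finally {
                    ctx.complete(); // releases the connection
                }
            });
        }
        private void save(byte[] payload) { /* the db writes */ }
    }

Since my clients die right after sending anyway, the earlier hand-off-and-202 shape may serve me better than holding the exchange open like this.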
> It would help to understand the nature of what has to happen with
> the data once it's received by your servlet. Does the processing
> take a long time, or is it mostly the data upload that takes
> forever? Is there a way for the client to cut up the work in any
> way? Are there other ways for the client to get a response? Making
> an HTTP connection and then waiting 18 hours for a response is ...
> not how HTTP is supposed to work.
>
> Long ago I worked on a product where we had to perform some long
> operation on the server, and client browsers would time out. (This
> was a web browser, so they have timeouts of like 60 seconds or so.
> They aren't custom clients where you just say "wait 18 hours for a
> response.")

Your "Job" example seems along the lines of get-it-off-the-servlet, which again points back to my current queue handler, I think.

> This allows the server to make progress even if the client times
> out, or loses a network connection, or power, or whatever. It also
> frees up the network connection used to submit the initial job so
> the server can accept other connections. And you don't have to use
> servlet async and re-write your whole process.
>
> I don't know if any of this helps, but I think it will get you
> thinking in a more HTTP-style way rather than a connection-oriented
> service like the one you had originally implemented.

Helps a ton. Very thankful for the indulgence. I hope I've given useful responses.

Next up is SSL, one of the reasons I must switch from my naked-socket impl.

rjs

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org