Owen Rubel
oru...@gmail.com

On Tue, Aug 15, 2017 at 8:23 AM, Christopher Schultz <
ch...@christopherschultz.net> wrote:
> Owen,
>
> On 8/13/17 10:46 AM, Owen Rubel wrote:
> > Owen Rubel oru...@gmail.com
> >
> > On Sun, Aug 13, 2017 at 5:57 AM, Christopher Schultz <
> > ch...@christopherschultz.net> wrote:
> >
> > Owen,
> >
> > On 8/12/17 12:47 PM, Owen Rubel wrote:
> >>>> What I am talking about is something that improves
> >>>> communication as we notice that a communication channel needs
> >>>> more resources. Not caching what is communicated... improving
> >>>> the CHANNEL for communicating the resource (whatever it may
> >>>> be).
> >
> > If the channel is an HTTP connection (or TCP; the application
> > protocol isn't terribly relevant), then you are limited by the
> > following:
> >
> > 1. Network bandwidth
> > 2. Available threads (to service a particular request)
> > 3. Hardware resources on the server (CPU/memory/disk/etc.)
> >
> > Let's ignore 1 and 3 for now, since you are primarily concerned
> > with concurrency, and concurrency is useless if the other resources
> > are constrained or otherwise limiting the equation.
> >
> > Let's say we had "per endpoint" thread pools, so that e.g. /create
> > had its own thread pool, and /show had another one, etc. What would
> > that buy us?
> >
> > (Let's ignore for now the fact that one set of threads must always
> > be used to decode the request to decide where it's going, like
> > /create or /show.)
> >
> > If we have a limited total number of threads (e.g. 10), then we
> > could "reserve" some of them so that we could always have 2 threads
> > for /create even if all the other threads in the system (the other
> > 8) were being used for something else. If we had 2 threads for
> > /create and 2 threads for /show, then only 6 would remain for e.g.
> > /edit or /delete.
> > So if 6 threads were already being used for /edit or /delete, the
> > 7th incoming request would be queued, but anyone making a request
> > for /show or /create would (if a thread in those pools is
> > available) be serviced immediately.
> >
> > I can see some utility in this ability, because it would allow the
> > container to ensure that some resources were never starved... or,
> > rather, that they have some priority over certain other services.
> > In other words, the service could enjoy guaranteed provisioning
> > for certain endpoints.
> >
> > As it stands, Tomcat (and, I would venture a guess, most if not
> > all other containers) implements a fair request pipeline where
> > requests are (at least roughly) serviced in the order in which
> > they are received. Rather than guaranteeing provisioning for a
> > particular endpoint, the closest thing that could be implemented
> > (at the application level) would be a
> > resource-availability-limiting mechanism, such as counting the
> > number of in-flight requests and rejecting those which exceed some
> > threshold with e.g. a 503 response.
> >
> > Unfortunately, that doesn't actually prioritize some requests, it
> > merely rejects others in order to attempt to prioritize those
> > others. It also starves endpoints even when there is no reason to
> > do so (e.g. in the 10-thread scenario, if all 4 /show and /create
> > threads are idle, but 6 requests are already in process for the
> > other endpoints, a 7th request for those other endpoints will be
> > rejected).
> >
> > I believe that per-endpoint provisioning is a possibility, but I
> > don't think that the potential gains are worth the certain
> > complexity of the system required to implement it.
> >
> > There are other ways to handle heterogeneous service requests in a
> > way that doesn't starve one type of request in favor of another.
> > One obvious solution is horizontal scaling with a load-balancer.
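The application-level limiting described above (count in-flight requests per endpoint, reject the overflow with a 503) could look roughly like this. A minimal sketch only: the `EndpointLimiter` class and its `tryEnter`/`exit` methods are invented names for illustration, not part of Tomcat or the Servlet API.

```java
import java.util.Map;
import java.util.concurrent.Semaphore;

// Sketch of per-endpoint in-flight limiting: each throttled endpoint
// gets a Semaphore sized to its concurrency budget. A request that
// cannot acquire a permit would be answered with a 503.
public class EndpointLimiter {
    private final Map<String, Semaphore> limits;

    public EndpointLimiter(Map<String, Semaphore> limits) {
        this.limits = limits;
    }

    /** Returns true if the request may proceed; false means send a 503. */
    public boolean tryEnter(String endpoint) {
        Semaphore s = limits.get(endpoint);
        return s == null || s.tryAcquire(); // unthrottled endpoints pass
    }

    /** Call from a finally block once the request completes. */
    public void exit(String endpoint) {
        Semaphore s = limits.get(endpoint);
        if (s != null) s.release();
    }

    public static void main(String[] args) {
        EndpointLimiter limiter =
            new EndpointLimiter(Map.of("/delete", new Semaphore(2)));
        System.out.println(limiter.tryEnter("/delete")); // true
        System.out.println(limiter.tryEnter("/delete")); // true
        System.out.println(limiter.tryEnter("/delete")); // false -> 503
        limiter.exit("/delete");
        System.out.println(limiter.tryEnter("/delete")); // true again
    }
}
```

Note that this reproduces exactly the drawback named above: it rejects /delete requests once its budget is full even when the rest of the server is idle.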
> > An LB can be used to implement a sort of guaranteed-provisioning
> > for certain endpoints by providing more back-end servers for
> > certain endpoints. If you want to make sure that /show can be
> > called by any client at any time, then make sure you spin-up 1000
> > /show servers and register them with the load-balancer. You can
> > survive with only maybe 10 nodes servicing /delete requests;
> > others will either wait in a queue or receive a 503 from the lb.
> >
> > For my money, I'd maximize the number of threads available for all
> > requests (whether within a single server, or across a large
> > cluster) and not require that they be available for any particular
> > endpoint. Once you have to depart from a single server, you MUST
> > have something like a load-balancer involved, and therefore the
> > above solution becomes not only more practical but also more
> > powerful.
> >
> > Since relying on a one-box-wonder to run a high-availability web
> > service isn't practical, provisioning is necessarily above the
> > cluster-node level, and so the problem has effectively moved from
> > the app server to the load-balancer (or reverse proxy). I believe
> > the application server is an inappropriate place to implement this
> > type of provisioning because it's too small-scale. The app server
> > should serve requests as quickly as possible, and arranging for
> > this kind of provisioning would add a level of complexity that
> > would jeopardize performance of all requests within the
> > application server.
> >
> >>>> But like you said, this is not something that is doable so
> >>>> I'll look elsewhere.
> >
> > I think it's doable, just not worth it given the orthogonal
> > solutions available. Some things are better-implemented at other
> > layers of the application (as a whole system) and perhaps not the
> > application server itself.
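For concreteness, the "per endpoint thread pool" reservation scenario from earlier in the thread (2 threads for /create, 2 for /show, 6 shared by everything else) could be sketched as separate executors. All names here are invented; Tomcat itself uses a single shared connector pool, which is precisely the point being argued.

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical guaranteed provisioning: reserved endpoints get their own
// fixed pools; all other endpoints contend for the shared pool.
public class ProvisionedPools {
    private final Map<String, ExecutorService> reserved = Map.of(
        "/create", Executors.newFixedThreadPool(2),
        "/show",   Executors.newFixedThreadPool(2));
    private final ExecutorService shared = Executors.newFixedThreadPool(6);

    public ExecutorService poolFor(String endpoint) {
        return reserved.getOrDefault(endpoint, shared);
    }

    public void submit(String endpoint, Runnable work) {
        poolFor(endpoint).execute(work);
    }

    public static void main(String[] args) {
        ProvisionedPools pools = new ProvisionedPools();
        // /edit and /delete share the unreserved pool; /create does not
        System.out.println(pools.poolFor("/edit") == pools.poolFor("/delete")); // true
        System.out.println(pools.poolFor("/create") == pools.poolFor("/edit")); // false
    }
}
```

The sketch makes the starvation trade-off visible: a 7th request for /edit queues in `shared` even while both /create threads sit idle.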
> >
> > Someone with intimate experience with Obidos should be familiar
> > with the benefits of separation of these kinds of concerns ;)
> >
> > If you are really more concerned with threads that are tied-up
> > with I/O-bound work, then Websocket really is your friend. The
> > complex threading model of Websocket allows applications to do
> > Real Work on application threads and then delegate the work of
> > pushing bytes across the wire to the container, resulting in very
> > few I/O-bound threads.
> >
> > But the way you have phrased your questions seems like you were
> > more interested in guaranteed provisioning than avoiding I/O-bound
> > threads.
> >
> > -chris
>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >> For additional commands, e-mail: users-h...@tomcat.apache.org
>
> >> If we have a limited total number of threads (e.g. 10), then we
> >> could "reserve" some of them so that we could always have 2
> >> threads for /create even if all the other threads in the system
> >> (the other 8) were being used for something else. If we had 2
> >> threads for /create and 2 threads for /show, then only 6 would
> >> remain for e.g. /edit or /delete. So if 6 threads were already
> >> being used for /edit or /delete, the 7th incoming request would
> >> be queued, but anyone making a request for /show or /create would
> >> (if a thread in those pools is available) be serviced
> >> immediately.
> >
> > Use percentages like most load balancers do to solve that problem
> > and then adjust the percentages as traffic changes.
> >
> > So say we have the following assigned thread percentages:
> >
> > person/show   - 5%
> > person/create - 2%
> > person/edit   - 2%
> > person/delete - 1%
>
> What happened to the remaining 90% of threads? If they don't exist,
> then everything above needs to be multiplied by 10x.
> If they do exist, then they either need to be "provisioned" to a
> specific endpoint, or they need to be explicitly defined to be
> "unprovisioned", meaning that they can be used by/for any endpoint.
>
> > *(always guaranteeing that each would have 1 thread shared from
> > the pool at all times)
>
> You have been talking about guaranteed provisioning and not really
> talking about any kind of "shared" pool. I'm not entirely sure what
> a hybrid approach would look like, here, but it really all goes back
> to the fact that all threads are really created equal, unless you
> are really trying to create persistent connections (e.g. Websocket,
> HTTP keepalive between lb/reverse-proxy and app server endpoints).
>
> > If suddenly traffic starts to spike on 'person/edit', we steal
> > from 'person/show'. Why? 'person/show' had those threads created
> > dynamically and may not be using them all currently.
>
> Sounds like a plain-old shared thread pool.
>
> > We steal from the highest percentages during spikes because we
> > currently have a new highest percentage.
> >
> > And if that changes, they will steal back.
> >
> > At least this is what I was envisioning for an implementation.
>
> There is no penalty for "stealing" a thread from another pool, so
> the result is that all pools are equal, and a single pool will do
> the job just as well.
>
> I'm obviously missing something fundamental about your reasoning,
> here.
>
> If it's communication channels you are concerned with, then I think
> there is an argument to be made for guaranteed provisioning. For
> threads, there is no property of the thread that can make it any
> better-suited for handling requests for endpoint A versus endpoint
> B.
> -chris
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org

Well, you only steal when you need to steal resources, so no... it would
NEVER be the same; certain endpoints would always be balanced
differently. Think of it like 'load balancing per endpoint', but with
threads.
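The stealing scheme debated above can be put into code to make the disagreement concrete. A rough sketch, with every name invented (nothing like this exists in Tomcat): each endpoint holds a share of the thread budget, and a spiking endpoint takes one unit from whichever endpoint currently holds the largest share.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of percentage-based per-endpoint allocation with stealing:
// a spiking endpoint takes a thread from the current largest holder.
// Since a steal costs nothing, the total budget simply flows to
// wherever demand is -- which is exactly how one shared pool behaves.
public class PercentagePools {
    private final Map<String, Integer> allocation = new HashMap<>();

    public PercentagePools(Map<String, Integer> initial) {
        allocation.putAll(initial);
    }

    public int threadsFor(String endpoint) {
        return allocation.getOrDefault(endpoint, 0);
    }

    /** Move one thread from the largest allocation to the spiking endpoint. */
    public void stealFor(String spiking) {
        String victim = allocation.entrySet().stream()
            .filter(e -> !e.getKey().equals(spiking))
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElseThrow();
        allocation.merge(victim, -1, Integer::sum);
        allocation.merge(spiking, 1, Integer::sum);
    }

    public static void main(String[] args) {
        PercentagePools pools = new PercentagePools(Map.of(
            "person/show", 5, "person/create", 2,
            "person/edit", 2, "person/delete", 1));
        pools.stealFor("person/edit"); // spike on /edit steals from /show
        System.out.println(pools.threadsFor("person/show")); // 4
        System.out.println(pools.threadsFor("person/edit")); // 3
    }
}
```

This is the crux of the thread: because `stealFor` is free and any thread can serve any endpoint, the bookkeeping adds complexity without changing which requests actually get served.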