Owen Rubel
oru...@gmail.com

On Tue, Aug 15, 2017 at 8:23 AM, Christopher Schultz <ch...@christopherschultz.net> wrote:

>
> Owen,
>
> On 8/13/17 10:46 AM, Owen Rubel wrote:
> > Owen Rubel oru...@gmail.com
> >
> > On Sun, Aug 13, 2017 at 5:57 AM, Christopher Schultz <ch...@christopherschultz.net> wrote:
> >
> > Owen,
> >
> > On 8/12/17 12:47 PM, Owen Rubel wrote:
> >>>> What I am talking about is something that improves
> >>>> communication as we notice that communication channel needing
> >>>> more resources. Not caching what is communicated... improving
> >>>> the CHANNEL for communicating the resource (whatever it may
> >>>> be).
> >
> > If the channel is an HTTP connection (or TCP; the application
> > protocol isn't terribly relevant), then you are limited by the
> > following:
> >
> > 1. Network bandwidth
> > 2. Available threads (to service a particular request)
> > 3. Hardware resources on the server (CPU/memory/disk/etc.)
> >
> > Let's ignore 1 and 3 for now, since you are primarily concerned
> > with concurrency, and concurrency is useless if the other resources
> > are constrained or otherwise limiting the equation.
> >
> > Let's say we had "per endpoint" thread pools, so that e.g. /create
> > had its own thread pool, and /show had another one, etc. What would
> > that buy us?
> >
> > (Let's ignore for now the fact that one set of threads must always
> > be used to decode the request to decide where it's going, like
> > /create or /show.)
> >
> > If we have a limited total number of threads (e.g. 10), then we
> > could "reserve" some of them so that we could always have 2 threads
> > for /create even if all the other threads in the system (the other
> > 8) were being used for something else. If we had 2 threads for
> > /create and 2 threads for /show, then only 6 would remain for e.g.
> > /edit or /delete. So if 6 threads were already being used for /edit
> > or /delete, the 7th incoming request would be queued, but anyone
> > making a request for /show or /create would (if a thread in those
> > pools is available) be serviced immediately.
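> >
> > To make that concrete, here is a rough sketch (hypothetical
> > application-level code, not any real Tomcat API; the endpoint names
> > and pool sizes are invented):
> >
> >     import java.util.HashMap;
> >     import java.util.Map;
> >     import java.util.concurrent.ExecutorService;
> >     import java.util.concurrent.Executors;
> >
> >     public class EndpointPools {
> >         // Reserved pools: /create and /show each keep 2 threads,
> >         // leaving 6 shared threads for every other endpoint.
> >         private final Map<String, ExecutorService> reserved = new HashMap<>();
> >         private final ExecutorService shared = Executors.newFixedThreadPool(6);
> >
> >         public EndpointPools() {
> >             reserved.put("/create", Executors.newFixedThreadPool(2));
> >             reserved.put("/show",   Executors.newFixedThreadPool(2));
> >         }
> >
> >         public void dispatch(String endpoint, Runnable request) {
> >             // /create and /show can never be starved by the others;
> >             // everything else competes for the 6 shared threads.
> >             reserved.getOrDefault(endpoint, shared).execute(request);
> >         }
> >     }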
> >
> > I can see some utility in this ability, because it would allow the
> > container to ensure that some resources were never starved... or,
> > rather, that they have some priority over certain other services.
> > In other words, the service could enjoy guaranteed provisioning
> > for certain endpoints.
> >
> > As it stands, Tomcat (and, I would venture a guess, most if not
> > all other containers) implements a fair request pipeline where
> > requests are (at least roughly) serviced in the order in which they
> > are received. Rather than guaranteeing provisioning for a
> > particular endpoint, the closest thing that could be implemented
> > (at the application level) would be a
> > resource-availability-limiting mechanism, such as counting the
> > number of in-flight requests and rejecting those which exceed some
> > threshold with e.g. a 503 response.
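> >
> > A minimal sketch of that counting mechanism as a servlet filter
> > (the class is hypothetical and the cap of 6 is arbitrary):
> >
> >     import java.io.IOException;
> >     import java.util.concurrent.Semaphore;
> >     import javax.servlet.Filter;
> >     import javax.servlet.FilterChain;
> >     import javax.servlet.FilterConfig;
> >     import javax.servlet.ServletException;
> >     import javax.servlet.ServletRequest;
> >     import javax.servlet.ServletResponse;
> >     import javax.servlet.http.HttpServletResponse;
> >
> >     public class InFlightLimitFilter implements Filter {
> >         // Cap on concurrent in-flight requests for this endpoint
> >         private final Semaphore inFlight = new Semaphore(6);
> >
> >         @Override
> >         public void doFilter(ServletRequest req, ServletResponse res,
> >                 FilterChain chain) throws IOException, ServletException {
> >             if (!inFlight.tryAcquire()) {
> >                 // Over the threshold: shed load instead of queueing
> >                 ((HttpServletResponse) res).sendError(
> >                         HttpServletResponse.SC_SERVICE_UNAVAILABLE);
> >                 return;
> >             }
> >             try {
> >                 chain.doFilter(req, res);
> >             } finally {
> >                 inFlight.release();
> >             }
> >         }
> >
> >         @Override
> >         public void init(FilterConfig config) { }
> >
> >         @Override
> >         public void destroy() { }
> >     }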
> >
> > Unfortunately, that doesn't actually prioritize some requests; it
> > merely rejects some in an attempt to leave room for the others. It
> > also starves endpoints even when there is no reason to do so (e.g.
> > in the 10-thread scenario, if all 4 /show and /create threads are
> > idle, but 6 requests are already in process for the other
> > endpoints, a 7th request for those other endpoints will be
> > rejected).
> >
> > I believe that per-endpoint provisioning is a possibility, but I
> > don't think that the potential gains are worth the certain
> > complexity of the system required to implement it.
> >
> > There are other ways to handle heterogeneous service requests in a
> > way that doesn't starve one type of request in favor of another.
> > One obvious solution is horizontal scaling with a load-balancer. An
> > LB can be used to implement a sort of guaranteed-provisioning for
> > certain endpoints by providing more back-end servers for certain
> > endpoints. If you want to make sure that /show can be called by any
> > client at any time, then make sure you spin up 1000 /show servers
> > and register them with the load-balancer. You can survive with only
> > maybe 10 nodes servicing /delete requests; requests beyond that will
> > either wait in a queue or receive a 503 from the LB.
> >
> > For my money, I'd maximize the number of threads available for all
> > requests (whether within a single server, or across a large
> > cluster) and not require that they be available for any particular
> > endpoint. Once you have to depart from a single server, you MUST
> > have something like a load-balancer involved, and therefore the
> > above solution becomes not only more practical but also more
> > powerful.
> >
> > Since relying on a one-box-wonder to run a high-availability web
> > service isn't practical, provisioning is necessarily above the
> > cluster-node level, and so the problem has effectively moved from
> > the app server to the load-balancer (or reverse proxy). I believe
> > the application server is an inappropriate place to implement this
> > type of provisioning because it's too small-scale. The app server
> > should serve requests as quickly as possible, and arranging for
> > this kind of provisioning would add a level of complexity that
> > would jeopardize performance of all requests within the application
> > server.
> >
> >>>> But like you said, this is not something that is doable so
> >>>> I'll look elsewhere.
> >
> > I think it's doable, just not worth it given the orthogonal
> > solutions available. Some things are better-implemented at other
> > layers of the application (as a whole system) and perhaps not the
> > application server itself.
> >
> > Someone with intimate experience with Obidos should be familiar
> > with the benefits of separation of these kinds of concerns ;)
> >
> > If you are really more concerned with threads that are tied-up
> > with I/O-bound work, then Websocket really is your friend. The
> > complex threading model of Websocket allows applications to do Real
> > Work on application threads and then delegate the work of pushing
> > bytes across the wire to the container, resulting in very few
> > I/O-bound threads.
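> >
> > For instance, with the JSR-356 API the application thread does the
> > Real Work and then hands the write off to the container (the
> > endpoint below is a made-up example, with a placeholder for the
> > actual work):
> >
> >     import javax.websocket.OnMessage;
> >     import javax.websocket.Session;
> >     import javax.websocket.server.ServerEndpoint;
> >
> >     @ServerEndpoint("/work")
> >     public class WorkEndpoint {
> >
> >         @OnMessage
> >         public void onMessage(String message, Session session) {
> >             String result = doRealWork(message); // Real Work happens here
> >             // Delegate the actual write to the container; this thread
> >             // does not block while the bytes go over the wire.
> >             session.getAsyncRemote().sendText(result);
> >         }
> >
> >         private String doRealWork(String message) {
> >             return message.toUpperCase(); // placeholder for real logic
> >         }
> >     }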
> >
> > But the way you have phrased your questions seems like you were
> > more interested in guaranteed provisioning than avoiding I/O-bound
> > threads.
> >
> > -chris
>
> >> If we have a limited total number of threads (e.g. 10), then we
> >> could "reserve" some of them so that we could always have 2
> >> threads for /create even if all the other threads in the system
> >> (the other 8) were being used for something else. If we had 2
> >> threads for /create and 2 threads for /show, then only 6 would
> >> remain for e.g. /edit or /delete. So if 6 threads were already
> >> being used for /edit or /delete, the 7th incoming request would
> >> be queued, but anyone making a request for /show or /create would
> >> (if a thread in those pools is available) be serviced
> >> immediately.
> >
> > Use percentages like most load balancers do to solve that problem
> > and then adjust the percentages as traffic changes.
> >
> >
> > So say we have the following assigned thread percentages:
> >
> > person/show   - 5%
> > person/create - 2%
> > person/edit   - 2%
> > person/delete - 1%
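> >
> > (e.g. against a hypothetical pool of 200 threads, that works out to:)
> >
> >     person/show:   5% of 200 = 10 threads
> >     person/create: 2% of 200 =  4 threads
> >     person/edit:   2% of 200 =  4 threads
> >     person/delete: 1% of 200 =  2 threads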
>
> What happened to the remaining 90% of threads? If they don't exist,
> then everything above needs to be multiplied by 10x. If they do exist,
> then they either need to be "provisioned" to a specific endpoint, or
> they need to be explicitly defined as "unprovisioned", meaning that
> they can be used by/for any endpoint.
>
> > *(always guaranteeing that each would have 1 thread shared from the
> > pool at all times)
>
> You have been talking about guaranteed provisioning and not really
> talking about any kind of "shared" pool. I'm not entirely sure what a
> hybrid approach would look like, here, but it really all goes back to
> the fact that all threads are created equal, unless you are trying to
> create persistent connections (e.g. Websocket, HTTP keepalive between
> lb/reverse-proxy and app server endpoints).
>
> > If suddenly traffic starts to spike on 'person/edit', we steal
> > from 'person/show'. Why? 'person/show' had those threads created
> > dynamically and may not be using them all currently.
>
> Sounds like a plain-old shared thread pool.
>
> > We steal from the highest percentages during spikes because we
> > currently have a new highest percentage.
> >
> > And if that changes, they will steal back.
> >
> > At least this is what I was envisioning for an implementation.
>
> There is no penalty for "stealing" a thread from another pool, so the
> result is that all pools are equal, and a single pool will do the job
> just as well.
>
> I'm obviously missing something fundamental about your reasoning, here.
>
> If it's communication channels you are concerned with, then I think
> there is an argument to be made for guaranteed provisioning. But for
> threads, there is no property of the thread that makes one any
> better-suited for handling requests for endpoint A versus endpoint B.
>
> -chris
>
Well, you only steal when you need to steal resources, so no... it would
NEVER be the same; certain endpoints would always be balanced differently.

Think of it like 'load balancing per endpoint' but with threads.
