Re: Ideas from ApacheCon

2017-06-05 Thread Jim Jagielski
Will do! THX!

> On Jun 2, 2017, at 9:46 AM, Jim Riggs  wrote:
> 
>> On 2 Jun 2017, at 07:20, Jim Jagielski  wrote:
>> 
>>> On Jun 1, 2017, at 1:29 PM, Jim Riggs  wrote:
>>> 
>>> Regardless, even worst case, we are looking at what, iterating 6 pointers 
>>> instead of 3 or 10 instead of 5? We probably have some lower hanging fruit 
>>> across the request lifecycle code to increase performance than saving some 
>>> arithmetic on a handful of structs, no? ;-)
>>> 
>> 
>> Oh I agree... it just seems that we add overhead during the
>> normal and nominal case (everybody is up and available) and
>> that the overhead is worse the more workers we have. I am
>> personally OK with abnormal situations being slower
>> 
>> And yeah, I also have never liked the duplication that's inherent
>> in all the lbmethods, and so I really like it being simplified.
>> But I could never figure out a way to remove that without making
>> it more complex or slower. Last idea I had was to create some
>> sort of looping cpp MACRO to encapsulate it at some level :/
> 
> Yeah. Take a look at the callback version I submitted last night. It has 
> reduced the code in the lbmethods to almost nothing. The find_best_*() 
> functions just call ap_proxy_balancer_get_best_worker() with a small, quick 
> callback function and then do a little housekeeping. Nothing else.
> 
> BUT, this implementation makes it extremely flexible if someone wants to 
> create a complex lbmethod. It can do whatever special/custom things it wants 
> to in its find_best_() function and/or in the is_best_() callback...or maybe 
> it wants to implement its own iteration for finding usable workers and 
> doesn't want to use ap_proxy_balancer_get_best_worker()'s logic at all. Just 
> because the 3 existing lbmethods use that flow doesn't mean everything has 
> to, but we removed a lot of duplicate code with a shared function for them 
> (and future lbmethods?) to use. I think it's a big win for extensibility and 
> simplification of the lbmethod interface.
> 



Re: Ideas from ApacheCon

2017-06-02 Thread Jim Riggs
> On 2 Jun 2017, at 07:20, Jim Jagielski  wrote:
> 
>> On Jun 1, 2017, at 1:29 PM, Jim Riggs  wrote:
>> 
>> Regardless, even worst case, we are looking at what, iterating 6 pointers 
>> instead of 3 or 10 instead of 5? We probably have some lower hanging fruit 
>> across the request lifecycle code to increase performance than saving some 
>> arithmetic on a handful of structs, no? ;-)
>> 
> 
> Oh I agree... it just seems that we add overhead during the
> normal and nominal case (everybody is up and available) and
> that the overhead is worse the more workers we have. I am
> personally OK with abnormal situations being slower
> 
> And yeah, I also have never liked the duplication that's inherent
> in all the lbmethods, and so I really like it being simplified.
> But I could never figure out a way to remove that without making
> it more complex or slower. Last idea I had was to create some
> sort of looping cpp MACRO to encapsulate it at some level :/

Yeah. Take a look at the callback version I submitted last night. It has 
reduced the code in the lbmethods to almost nothing. The find_best_*() 
functions just call ap_proxy_balancer_get_best_worker() with a small, quick 
callback function and then do a little housekeeping. Nothing else.

BUT, this implementation makes it extremely flexible if someone wants to create 
a complex lbmethod. It can do whatever special/custom things it wants to in its 
find_best_() function and/or in the is_best_() callback...or maybe it wants to 
implement its own iteration for finding usable workers and doesn't want to use 
ap_proxy_balancer_get_best_worker()'s logic at all. Just because the 3 existing 
lbmethods use that flow doesn't mean everything has to, but we removed a lot of 
duplicate code with a shared function for them (and future lbmethods?) to use. 
I think it's a big win for extensibility and simplification of the lbmethod 
interface.




Re: Ideas from ApacheCon

2017-06-02 Thread Jim Jagielski

> On Jun 1, 2017, at 1:29 PM, Jim Riggs  wrote:
> 
> 
> Regardless, even worst case, we are looking at what, iterating 6 pointers 
> instead of 3 or 10 instead of 5? We probably have some lower hanging fruit 
> across the request lifecycle code to increase performance than saving some 
> arithmetic on a handful of structs, no? ;-)
> 

Oh I agree... it just seems that we add overhead during the
normal and nominal case (everybody is up and available) and
that the overhead is worse the more workers we have. I am
personally OK with abnormal situations being slower

And yeah, I also have never liked the duplication that's inherent
in all the lbmethods, and so I really like it being simplified.
But I could never figure out a way to remove that without making
it more complex or slower. Last idea I had was to create some
sort of looping cpp MACRO to encapsulate it at some level :/


Implement hot spares in mod_proxy_balancer [was Re: Ideas from ApacheCon]

2017-06-01 Thread Jim Riggs
> On 1 Jun 2017, at 17:15, Yann Ylavic  wrote:
> 
> On Thu, Jun 1, 2017 at 10:48 PM, Jim Riggs  wrote:
>>> On 1 Jun 2017, at 15:25, Yann Ylavic  wrote:
>>> 
>>> On Thu, Jun 1, 2017 at 10:22 PM, Yann Ylavic 
>>> wrote:
 On Thu, Jun 1, 2017 at 7:29 PM, Jim Riggs 
 wrote:
>> On 1 Jun 2017, at 07:55, Jim Jagielski 
>> wrote:
>> 2. I understand the logic behind creating the arrays,
>> but doesn't this increase the overhead. We go thru the full
>> list of workers one time, and then go thru the list of avail
>> works and spares right after that. Assume that all workers
>> are available; doesn't it mean we go thru that last 2x?
 []
> 
> The only way I can think of to avoid this without going back
> to duplicating code across lbmethods, which I would argue is
> worse, is to maybe have the lbmethod provide a callback
> function and context pointer to
> ap_proxy_balancer_usable_workers() that gets called during the
> loop iteration to pick the best member in flight.
 
 Couldn't a simple 'best' for ap_proxy_balancer_usable_workers()
 make it return a single entry?
>>> 
>>> ... a simple 'best' *flag* (as argument) ...
>> 
>> I'm not sure I follow what this flag would be. The lbmethod would
>> somehow have to tell ap_proxy_balancer_usable_workers() how to pick
>> the best worker (e.g. by comparing the number of bytes sent or the
>> number of requests processed). I'm not sure how that information
>> could be passed as a flag unless we baked the behavior of byrequests,
>> bybusyness, and bytraffic into ap_proxy_balancer_usable_workers().
>> But then how would we allow for plugging in additional lbmethods?
> 
> Oh right, nevermind, I thought per lbmethod logic was already there.
> After a better look into it, a callback (and its baton) with that
> logic looks straightforward, though.

Indeed. I put a new patch that uses callbacks on BZ61140 (and visually 
available at https://github.com/jhriggs/httpd/pull/1/files). So, the lbmethods 
call ap_proxy_balancer_get_best_worker() with a callback and baton so that the 
best worker is chosen in flight during iteration of the workers, addressing 
Jim's overhead concern. Changes:

- use a callback and baton to the lbmethod in
  ap_proxy_balancer_get_best_worker() (previously
  ap_proxy_balancer_usable_workers()) so that we can find the best worker in
  flight while iterating rather than returning an array of usable workers that
  then has to be iterated in turn

- consider a draining worker unusable and suitable for replacement by a spare


> Regarding:
> worker = &APR_ARRAY_IDX(balancer->workers, i, proxy_worker *);
> I don't see why worker needs to be a 'proxy_worker**' ('*worker' is
> never updated IIUC).
> 
> Looks like:
> proxy_worker *worker = APR_ARRAY_IDX(balancer->workers, i, proxy_worker *);
> would be fine and allow using 'worker->xxx' instead of
> '(*worker)->xxx' all over the place...

Much of what I did was based on what already existed. The existing code was 
iterating through proxy_worker**, so mine did too. You are correct, though, 
that there is no real reason for it. The latest patch switches to just using * 
rather than **.

Thanks for putting eyes on this! I appreciate it.



Re: Ideas from ApacheCon

2017-06-01 Thread Yann Ylavic
On Thu, Jun 1, 2017 at 10:48 PM, Jim Riggs  wrote:
>> On 1 Jun 2017, at 15:25, Yann Ylavic  wrote:
>>
>> On Thu, Jun 1, 2017 at 10:22 PM, Yann Ylavic 
>> wrote:
>>> On Thu, Jun 1, 2017 at 7:29 PM, Jim Riggs 
>>> wrote:
> On 1 Jun 2017, at 07:55, Jim Jagielski 
> wrote:
> 2. I understand the logic behind creating the arrays,
> but doesn't this increase the overhead. We go thru the full
> list of workers one time, and then go thru the list of avail
> works and spares right after that. Assume that all workers
> are available; doesn't it mean we go thru that last 2x?
>>> []

 The only way I can think of to avoid this without going back
 to duplicating code across lbmethods, which I would argue is
 worse, is to maybe have the lbmethod provide a callback
 function and context pointer to
 ap_proxy_balancer_usable_workers() that gets called during the
 loop iteration to pick the best member in flight.
>>>
>>> Couldn't a simple 'best' for ap_proxy_balancer_usable_workers()
>>> make it return a single entry?
>>
>> ... a simple 'best' *flag* (as argument) ...
>
> I'm not sure I follow what this flag would be. The lbmethod would
> somehow have to tell ap_proxy_balancer_usable_workers() how to pick
> the best worker (e.g. by comparing the number of bytes sent or the
> number of requests processed). I'm not sure how that information
> could be passed as a flag unless we baked the behavior of byrequests,
> bybusyness, and bytraffic into ap_proxy_balancer_usable_workers().
> But then how would we allow for plugging in additional lbmethods?

Oh right, nevermind, I thought per lbmethod logic was already there.
After a better look into it, a callback (and its baton) with that
logic looks straightforward, though.

Regarding:
 worker = &APR_ARRAY_IDX(balancer->workers, i, proxy_worker *);
I don't see why worker needs to be a 'proxy_worker**' ('*worker' is
never updated IIUC).

Looks like:
 proxy_worker *worker = APR_ARRAY_IDX(balancer->workers, i, proxy_worker *);
would be fine and allow using 'worker->xxx' instead of
'(*worker)->xxx' all over the place...


Regards,
Yann.


Re: Ideas from ApacheCon

2017-06-01 Thread Jim Riggs
> On 1 Jun 2017, at 15:25, Yann Ylavic  wrote:
> 
> On Thu, Jun 1, 2017 at 10:22 PM, Yann Ylavic  wrote:
>> On Thu, Jun 1, 2017 at 7:29 PM, Jim Riggs  wrote:
 On 1 Jun 2017, at 07:55, Jim Jagielski  wrote:
 2. I understand the logic behind creating the arrays, but doesn't
this increase the overhead. We go thru the full list of workers
one time, and then go thru the list of avail works and spares
right after that. Assume that all workers are available; doesn't
it mean we go thru that last 2x?
>> []
>>> 
>>> The only way I can think of to avoid this without going back to
>>> duplicating code across lbmethods, which I would argue is worse, is
>>> to maybe have the lbmethod provide a callback function and context
>>> pointer to ap_proxy_balancer_usable_workers() that gets called during
>>> the loop iteration to pick the best member in flight.
>> 
>> Couldn't a simple 'best' for ap_proxy_balancer_usable_workers() make
>> it return a single entry?
> 
> ... a simple 'best' *flag* (as argument) ...

I'm not sure I follow what this flag would be. The lbmethod would somehow have 
to tell ap_proxy_balancer_usable_workers() how to pick the best worker (e.g. by 
comparing the number of bytes sent or the number of requests processed). I'm 
not sure how that information could be passed as a flag unless we baked the 
behavior of byrequests, bybusyness, and bytraffic into 
ap_proxy_balancer_usable_workers(). But then how would we allow for plugging in 
additional lbmethods?


>> 
>> By the way, retrieving ap_proxy_retry_worker() via an OPTIONAL_FN
>> looks unnecessary since it's implemented in the exact same
>> "proxy_util.c" file.

LOL! That's a copy-paste remnant from when I had that function in a different 
file originally. Patch updated. Thanks for catching it!



Re: Ideas from ApacheCon

2017-06-01 Thread Yann Ylavic
On Thu, Jun 1, 2017 at 10:22 PM, Yann Ylavic  wrote:
> On Thu, Jun 1, 2017 at 7:29 PM, Jim Riggs  wrote:
>>> On 1 Jun 2017, at 07:55, Jim Jagielski  wrote:
>>>
>>> I really like where this is going... I just have a few questions:
>>>
>>> 1. The style, esp with array usage is different; eg APR_ARRAY_PUSH
>>> and APR_ARRAY_IDX... any particular reason why?
>>
>> Well, we don't seem to be entirely consistent with using elts and
>> apr_array*() directly vs. the helper #defines across the codebase. I
>> was doing everything with direct pointer access and arithmetic until
>> I started looking at the heartbeat lbmethod that was using some of
>> the helpers. IMNSHO, I think they are a little cleaner and easier to
>> use/read. I got lost in dereferencing hell with all of the `&((*...)
>> **)->(*)'! :-) The helpers are there, and I think they are convenient
>> and worth using, but I'm not digging through all of this code as
>> regularly as some of the others on here. If the standard/consensus is
>> the other way, it's easy to change.
>
> I also (much, much) prefer the macros, and preferably would like all
> the 'elts' usages and cast trickeries changed to them rather. But
> that's another (own) story...
>
>>
>>
>>>  2. I understand the logic behind creating the arrays, but doesn't
>>> this increase the overhead. We go thru the full list of workers
>>> one time, and then go thru the list of avail works and spares
>>> right after that. Assume that all workers are available; doesn't
>>> it mean we go thru that last 2x?
> []
>>
>> The only way I can think of to avoid this without going back to
>> duplicating code across lbmethods, which I would argue is worse, is
>> to maybe have the lbmethod provide a callback function and context
>> pointer to ap_proxy_balancer_usable_workers() that gets called during
>> the loop iteration to pick the best member in flight.
>
> Couldn't a simple 'best' for ap_proxy_balancer_usable_workers() make
> it return a single entry?

... a simple 'best' *flag* (as argument) ...

>
> By the way, retrieving ap_proxy_retry_worker() via an OPTIONAL_FN
> looks unnecessary since it's implemented in the exact same
> "proxy_util.c" file.
>
>>
>> Regardless, even worst case, we are looking at what, iterating 6
>> pointers instead of 3 or 10 instead of 5? We probably have some lower
>> hanging fruit across the request lifecycle code to increase
>> performance than saving some arithmetic on a handful of structs, no?
>> ;-)
>
> Agreed.


Re: Ideas from ApacheCon

2017-06-01 Thread Yann Ylavic
On Thu, Jun 1, 2017 at 7:29 PM, Jim Riggs  wrote:
>> On 1 Jun 2017, at 07:55, Jim Jagielski  wrote:
>>
>> I really like where this is going... I just have a few questions:
>>
>> 1. The style, esp with array usage is different; eg APR_ARRAY_PUSH
>> and APR_ARRAY_IDX... any particular reason why?
>
> Well, we don't seem to be entirely consistent with using elts and
> apr_array*() directly vs. the helper #defines across the codebase. I
> was doing everything with direct pointer access and arithmetic until
> I started looking at the heartbeat lbmethod that was using some of
> the helpers. IMNSHO, I think they are a little cleaner and easier to
> use/read. I got lost in dereferencing hell with all of the `&((*...)
> **)->(*)'! :-) The helpers are there, and I think they are convenient
> and worth using, but I'm not digging through all of this code as
> regularly as some of the others on here. If the standard/consensus is
> the other way, it's easy to change.

I also (much, much) prefer the macros, and preferably would like all
the 'elts' usages and cast trickeries changed to them rather. But
that's another (own) story...

>
>
>>  2. I understand the logic behind creating the arrays, but doesn't
>> this increase the overhead. We go thru the full list of workers
>> one time, and then go thru the list of avail works and spares
>> right after that. Assume that all workers are available; doesn't
>> it mean we go thru that last 2x?
[]
>
> The only way I can think of to avoid this without going back to
> duplicating code across lbmethods, which I would argue is worse, is
> to maybe have the lbmethod provide a callback function and context
> pointer to ap_proxy_balancer_usable_workers() that gets called during
> the loop iteration to pick the best member in flight.

Couldn't a simple 'best' for ap_proxy_balancer_usable_workers() make
it return a single entry?

By the way, retrieving ap_proxy_retry_worker() via an OPTIONAL_FN
looks unnecessary since it's implemented in the exact same
"proxy_util.c" file.

>
> Regardless, even worst case, we are looking at what, iterating 6
> pointers instead of 3 or 10 instead of 5? We probably have some lower
> hanging fruit across the request lifecycle code to increase
> performance than saving some arithmetic on a handful of structs, no?
> ;-)

Agreed.


Re: Ideas from ApacheCon

2017-06-01 Thread Jim Riggs
> On 1 Jun 2017, at 07:55, Jim Jagielski  wrote:
> 
> I really like where this is going... I just have a few questions:
> 
>  1. The style, esp with array usage is different; eg APR_ARRAY_PUSH
> and APR_ARRAY_IDX... any particular reason why?

Well, we don't seem to be entirely consistent with using elts and apr_array*() 
directly vs. the helper #defines across the codebase. I was doing everything 
with direct pointer access and arithmetic until I started looking at the 
heartbeat lbmethod that was using some of the helpers. IMNSHO, I think they are 
a little cleaner and easier to use/read. I got lost in dereferencing hell with all 
of the `&((*...) **)->(*)'! :-) The helpers are there, and I think they are 
convenient and worth using, but I'm not digging through all of this code as 
regularly as some of the others on here. If the standard/consensus is the other 
way, it's easy to change.


>  2. I understand the logic behind creating the arrays, but doesn't
> this increase the overhead. We go thru the full list of workers
> one time, and then go thru the list of avail works and spares
> right after that. Assume that all workers are available; doesn't
> it mean we go thru that last 2x?

If everything is available, the old implementation only goes through the list 
once and picks a worker. This new implementation goes through the list once and 
returns a *subset* that the lbmethod then has to loop through to pick a worker. 
Worst case (e.g. X workers all available in lbset 0), yes, we would now do 2X.

In a more complex scenario (e.g. X workers all available in each lbsets 0, 1, 
and 2), we do 3X (1st loop) + X (2nd loop) = 4X, not 2 * 3X = 6X. So, yes, 
there is slightly more overhead...but, the second loop (in the lbmethod) has at 
most one lbset's worth of workers and really isn't doing anything except an 
arithmetic comparison to pick the best from them.

If a number of workers are unavailable, however, it's different. The old 
implementation loops the whole list twice per lbset if it doesn't find a 
worker, once for workers and once for standbys, whereas this new implementation 
still loops the whole list only once per lbset. In the example above, 2 * 3X = 
6X for old, 3X + X = 4X for new.

The only way I can think of to avoid this without going back to duplicating 
code across lbmethods, which I would argue is worse, is to maybe have the 
lbmethod provide a callback function and context pointer to 
ap_proxy_balancer_usable_workers() that gets called during the loop iteration 
to pick the best member in flight. I am open to that...it just seemed to add 
unnecessary complexity.

Regardless, even worst case, we are looking at what, iterating 6 pointers 
instead of 3 or 10 instead of 5? We probably have some lower hanging fruit 
across the request lifecycle code to increase performance than saving some 
arithmetic on a handful of structs, no? ;-)


>> On May 31, 2017, at 1:44 PM, Jim Riggs  wrote:
>> 
>>> On 23 May 2017, at 09:16, Jim Riggs  wrote:
>>> 
 On 18 May 2017, at 13:22, Rainer Jung  wrote:
 
 Am 18.05.2017 um 19:46 schrieb Jim Jagielski:
> Based on feedback from various sessions:
> 
> o A new-kind of "hot standby" in mod_proxy which kicks
> in whenever a worker moves out of the pool (ie, doesn't
> wait until all workers are out)... ala a redundant
> hard drive.
 
 Maybe "spare worker" (and we could have more than one such spare).
>>> 
>>> Exactly. I am already working on some code for this, though it seems to 
>>> necessarily be a bit convoluted in the current codebase.
>>> 
>>> The way we treat a "hot standby" in mod_proxy_balancer is as a last-ditch 
>>> effort to return something. I.e. *all* workers are unavailable, so then we 
>>> use the hot standby. (This problem can actually be solved a different way 
>>> by setting a high value for lbset.)
>>> 
>>> In my mind, though, what is proposed here is actually how I expect 
>>> a "hot standby" to work. Think of it more like a "hot spare" in disk RAID 
>>> terms. That is, if *any* worker is unavailable, the hot spare will be used 
>>> (or at least added to the list of potential workers...still to be 
>>> determined by the lbmethod implementation).
>>> 
>>> Example:
>>> 
>>> 
>>> BalancerMember "http://192.168.1.50:80"
>>> BalancerMember "http://192.168.1.51:80"
>>> BalancerMember "http://192.168.1.52:80"
>>> BalancerMember "http://192.168.1.53:80" status=+H
>>> 
>>> 
>>> In this case, .53 will only get used if .50, .51, and .52 are *all* 
>>> unavailable.
>>> 
>>> 
>>> BalancerMember "http://192.168.1.50:80"
>>> BalancerMember "http://192.168.1.51:80"
>>> BalancerMember "http://192.168.1.52:80"
>>> BalancerMember "http://192.168.1.53:80" status=+R # new "hot spare" status
>>> BalancerMember "http://192.168.1.54:80" status=+R # new "hot spare" status
>>> 
>>> 
>>> In this case, if .50 

Re: Ideas from ApacheCon

2017-06-01 Thread Jim Jagielski
I really like where this is going... I just have a few questions:

  1. The style, esp with array usage is different; eg APR_ARRAY_PUSH
 and APR_ARRAY_IDX... any particular reason why?

  2. I understand the logic behind creating the arrays, but doesn't
 this increase the overhead. We go thru the full list of workers
 one time, and then go thru the list of avail works and spares
 right after that. Assume that all workers are available; doesn't
 it mean we go thru that last 2x?

> On May 31, 2017, at 1:44 PM, Jim Riggs  wrote:
> 
>> On 23 May 2017, at 09:16, Jim Riggs  wrote:
>> 
>>> On 18 May 2017, at 13:22, Rainer Jung  wrote:
>>> 
>>> Am 18.05.2017 um 19:46 schrieb Jim Jagielski:
 Based on feedback from various sessions:
 
 o A new-kind of "hot standby" in mod_proxy which kicks
 in whenever a worker moves out of the pool (ie, doesn't
 wait until all workers are out)... ala a redundant
 hard drive.
>>> 
>>> Maybe "spare worker" (and we could have more than one such spare).
>> 
>> Exactly. I am already working on some code for this, though it seems to 
>> necessarily be a bit convoluted in the current codebase.
>> 
>> The way we treat a "hot standby" in mod_proxy_balancer is as a last-ditch 
>> effort to return something. I.e. *all* workers are unavailable, so then we 
>> use the hot standby. (This problem can actually be solved a different way by 
>> setting a high value for lbset.)
>> 
>> In my mind, though, what is proposed here is actually how I expect 
>> a "hot standby" to work. Think of it more like a "hot spare" in disk RAID 
>> terms. That is, if *any* worker is unavailable, the hot spare will be used 
>> (or at least added to the list of potential workers...still to be determined 
>> by the lbmethod implementation).
>> 
>> Example:
>> 
>> 
>> BalancerMember "http://192.168.1.50:80"
>> BalancerMember "http://192.168.1.51:80"
>> BalancerMember "http://192.168.1.52:80"
>> BalancerMember "http://192.168.1.53:80" status=+H
>> 
>> 
>> In this case, .53 will only get used if .50, .51, and .52 are *all* 
>> unavailable.
>> 
>> 
>> BalancerMember "http://192.168.1.50:80"
>> BalancerMember "http://192.168.1.51:80"
>> BalancerMember "http://192.168.1.52:80"
>> BalancerMember "http://192.168.1.53:80" status=+R # new "hot spare" status
>> BalancerMember "http://192.168.1.54:80" status=+R # new "hot spare" status
>> 
>> 
>> In this case, if .50 becomes unavailable, .53 (or .54 depending on 
>> implementation) will be treated as an available worker for the lbmethod to 
>> potentially choose. If 2 or more of .50, .51, and .52 become unavailable, 
>> both .53 and .54 would be available to be chosen.
>> 
>> So, instead of having a single fallback option when *all* workers are dead, 
>> we will have a way of trying to ensure that a specific number of workers (3 
>> in the example above) are always available...just like a hot spare drive 
>> plugs into the RAID array when one of the members dies. In our case, though, 
>> once the main worker recovers, the hot spare will go back to being a hot 
>> spare (except for matching routes).
>> 
>> Comments welcome.
> 
> 
> As promised, balancer hot spare support: 
> https://bz.apache.org/bugzilla/show_bug.cgi?id=61140
> 



Re: Ideas from ApacheCon

2017-05-31 Thread Jim Riggs
> On 23 May 2017, at 09:16, Jim Riggs  wrote:
> 
>> On 18 May 2017, at 13:22, Rainer Jung  wrote:
>> 
>> Am 18.05.2017 um 19:46 schrieb Jim Jagielski:
>>> Based on feedback from various sessions:
>>> 
>>> o A new-kind of "hot standby" in mod_proxy which kicks
>>>  in whenever a worker moves out of the pool (ie, doesn't
>>>  wait until all workers are out)... ala a redundant
>>>  hard drive.
>> 
> Maybe "spare worker" (and we could have more than one such spare).
> 
> Exactly. I am already working on some code for this, though it seems to 
> necessarily be a bit convoluted in the current codebase.
> 
> The way we treat a "hot standby" in mod_proxy_balancer is as a last-ditch 
> effort to return something. I.e. *all* workers are unavailable, so then we 
> use the hot standby. (This problem can actually be solved a different way by 
> setting a high value for lbset.)
> 
> In my mind, though, what is proposed here is actually how I expect a 
> "hot standby" to work. Think of it more like a "hot spare" in disk RAID 
> terms. That is, if *any* worker is unavailable, the hot spare will be used 
> (or at least added to the list of potential workers...still to be determined 
> by the lbmethod implementation).
> 
> Example:
> 
> 
> BalancerMember "http://192.168.1.50:80"
> BalancerMember "http://192.168.1.51:80"
> BalancerMember "http://192.168.1.52:80"
> BalancerMember "http://192.168.1.53:80" status=+H
> 
> 
> In this case, .53 will only get used if .50, .51, and .52 are *all* 
> unavailable.
> 
> 
> BalancerMember "http://192.168.1.50:80"
> BalancerMember "http://192.168.1.51:80"
> BalancerMember "http://192.168.1.52:80"
> BalancerMember "http://192.168.1.53:80" status=+R # new "hot spare" status
> BalancerMember "http://192.168.1.54:80" status=+R # new "hot spare" status
> 
> 
> In this case, if .50 becomes unavailable, .53 (or .54 depending on 
> implementation) will be treated as an available worker for the lbmethod to 
> potentially choose. If 2 or more of .50, .51, and .52 become unavailable, 
> both .53 and .54 would be available to be chosen.
> 
> So, instead of having a single fallback option when *all* workers are dead, 
> we will have a way of trying to ensure that a specific number of workers (3 
> in the example above) are always available...just like a hot spare drive 
> plugs into the RAID array when one of the members dies. In our case, though, 
> once the main worker recovers, the hot spare will go back to being a hot 
> spare (except for matching routes).
> 
> Comments welcome.


As promised, balancer hot spare support: 
https://bz.apache.org/bugzilla/show_bug.cgi?id=61140



Re: Ideas from ApacheCon

2017-05-23 Thread Graham Leggett
On 23 May 2017, at 3:29 PM, Jim Riggs  wrote:

>>> Does that differ from "CacheQuickHandler OFF"?
> 
> 
> This one is a bit quirky and requires some investigation to see the edge 
> cases. There appear to be some situations where Require directives with 
> `CacheQuickHandler off' do not always take effect. Thus, items that have been 
> cached will be served from the cache even if access restrictions would 
> otherwise disallow them.
> 
> We will just need to do some more testing to see when -- and if -- this is 
> actually occurring. I stumbled across it while doing testing/prep for my 
> talk, but did not investigate it thoroughly. (It is entirely possible that I 
> just messed something up in my config and didn't catch it...or forgot a 
> graceful...or something...)
> 
> In my talk, I just told people to "be careful" with access control and CQF 
> off.

When CacheQuickHandler is off, mod_cache acts as a normal content generator, 
and all normal request handling - including the require directives - applies.

If there are cases where this doesn’t apply we need to investigate them and 
either fix them or properly explain the behaviour.
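
A minimal configuration for testing the behaviour in question might look like the following (paths, cache root, and the IP range are made up for illustration). With the quick handler off, a request for a cached resource under /app/private should still be refused for clients outside the allowed range:

```apacheconf
# Hypothetical test setup -- paths and addresses are illustrative only.
CacheQuickHandler off
CacheEnable disk "/app"
CacheRoot "/var/cache/httpd"

<Location "/app/private">
    # With the quick handler off, this Require should be evaluated on
    # every request, including ones served from the cache.
    Require ip 10.0.0.0/8
</Location>
```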

Regards,
Graham


Re: Ideas from ApacheCon

2017-05-23 Thread Stefan Eissing

> Am 23.05.2017 um 17:44 schrieb Jacob Champion :
> 
> On 05/18/2017 10:46 AM, Jim Jagielski wrote:
>> Based on feedback from various sessions:
> 
> Thanks for the list, Jim!
> 
>>   o Warn if the trailing '/'s don't match in ProxyPass/Reverse
>> directives (eg: ProxyPass /foo http://www.example.com/foo/ )
> 
> This one is easy enough to put into the directives themselves, but I'd like 
> to expand on the idea in general.
> 
> What would you all think about a linter for httpd config files? Something 
> that can be updated independently of httpd releases with a ruleset, and that 
> can target multiple versions of the server at once so that everyone gets the 
> benefits without having to upgrade. Ideally the output would be standardized 
> to the point that IDEs could dynamically run the linter as you typed.
> 
> I started playing with this idea last year but got pulled into security and 
> testing, so I haven't taken a look at my (Python-and-Atom-based) project in a 
> while. This trailing-slash warning was in my notes, as were things like
> 
> - Unused/unnecessary /
> - VirtualHosts declared with hostnames instead of IP
> - Location blocks in the wrong order
> - Duplicate Listen directives
> 
> etc.
> 
> Short term, this helps automate spreading the wisdom that we have to impart 
> over and over again on the support channels. In the long term, linter 
> rulesets document what's difficult about the configuration language so we can 
> potentially design a better one in the future.
> 
> --Jacob

While speaking of ProxyPass/Reverse: the reverse mapping applied to headers 
such as "Location" currently does not support the "Link" header, last I 
looked. I added support for that in mod_proxy_http2, to get PUSHes from the 
backend working, but it would be valid for the other proxy modules as well. 

If someone on this list feels like it, I imagine it would be a nice little 
exercise to lift the code into the general mod_proxy handling.
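To make the idea concrete, here is a rough sketch of the reverse mapping 
applied to a Link header value, analogous to what ProxyPassReverse does for 
Location (the function name, URL prefixes, and simplified header parsing are 
all illustrative, not the actual mod_proxy code):

```python
import re

def reverse_map_link(link_value, backend, public):
    """Rewrite backend URLs inside a Link header value, the way
    ProxyPassReverse rewrites Location. `backend` and `public` are the
    mapped URL prefixes; real Link header syntax is more involved."""
    # Link values look like: <http://backend/a.css>; rel=preload, <...>; ...
    def repl(m):
        url = m.group(1)
        if url.startswith(backend):
            url = public + url[len(backend):]
        return '<%s>' % url
    return re.sub(r'<([^>]*)>', repl, link_value)
```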

Cheers,

Stefan

Re: Ideas from ApacheCon

2017-05-23 Thread Jacob Champion

On 05/18/2017 10:46 AM, Jim Jagielski wrote:

Based on feedback from various sessions:


Thanks for the list, Jim!


   o Warn if the trailing '/'s don't match in ProxyPass/Reverse
 directives (eg: ProxyPass /foo http://www.example.com/foo/ )


This one is easy enough to put into the directives themselves, but I'd 
like to expand on the idea in general.


What would you all think about a linter for httpd config files? 
Something that can be updated independently of httpd releases with a 
ruleset, and that can target multiple versions of the server at once so 
that everyone gets the benefits without having to upgrade. Ideally the 
output would be standardized to the point that IDEs could dynamically 
run the linter as you typed.


I started playing with this idea last year but got pulled into security 
and testing, so I haven't taken a look at my (Python-and-Atom-based) 
project in a while. This trailing-slash warning was in my notes, as were 
things like


- Unused/unnecessary /
- VirtualHosts declared with hostnames instead of IP
- Location blocks in the wrong order
- Duplicate Listen directives

etc.

Short term, this helps automate spreading the wisdom that we have to 
impart over and over again on the support channels. In the long term, 
linter rulesets document what's difficult about the configuration 
language so we can potentially design a better one in the future.


--Jacob
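As a sketch of the trailing-slash rule described above (the directive parsing 
here is deliberately simplified and hypothetical, not tied to any real linter 
project):

```python
import re

def check_trailing_slash(line):
    """Warn when a ProxyPass/ProxyPassReverse path and URL disagree on a
    trailing slash, e.g. `ProxyPass /foo http://www.example.com/foo/`."""
    m = re.match(r'\s*(ProxyPass(?:Reverse)?)\s+(\S+)\s+(\S+)', line)
    if not m:
        return None  # not a ProxyPass-style directive
    directive, path, url = m.groups()
    if path.endswith('/') != url.endswith('/'):
        return ("%s: trailing '/' mismatch between %r and %r"
                % (directive, path, url))
    return None
```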


Re: Ideas from ApacheCon

2017-05-23 Thread Jim Riggs

> On 18 May 2017, at 12:46, Jim Jagielski  wrote:
> 
> Based on feedback from various sessions:
> 
>  o A new-kind of "hot standby" in mod_proxy which kicks
>in whenever a worker moves out of the pool (ie, doesn't
>wait until all workers are out)... ala a redundant
>hard drive.
> 
>  o Look into AAA and mod_cache; eg: "bolt in at the end"
> 
>  o HTTP/2 no longer experimental!
> 
>  o balancer-manager more scriptable (move to REST for realz?)
> 
>  o When restarting w/ persistent balancer data, warn if
>config files mtime is newer.
> 
>  o Warn if the trailing '/'s don't match in ProxyPass/Reverse
>directives (eg: ProxyPass /foo http://www.example.com/foo/ )
> 
> All I can recall at present...


  o Investigate mod_cache performance when bypassing cache (e.g.
`Cache-Control: no-cache'). In benchmark testing for my
talk,

  (1) no mod_cache
  vs.
  (2) cached responses
  vs.
  (3) cache enabled but requests specify `Cache-Control: no-cache'

showed that (3) was significantly slower than (1). I would expect
that no-cache would short-circuit/bypass mod_cache and have
results similar to (1), but there is *something* going on in
there. I have not investigated the code...just noted the results.



Re: Ideas from ApacheCon

2017-05-23 Thread Jim Riggs
> On 18 May 2017, at 13:22, Rainer Jung  wrote:
> 
> Am 18.05.2017 um 19:46 schrieb Jim Jagielski:
>> Based on feedback from various sessions:
>> 
>> o A new-kind of "hot standby" in mod_proxy which kicks
>>   in whenever a worker moves out of the pool (ie, doesn't
>>   wait until all workers are out)... ala a redundant
>>   hard drive.
> 
> Maybe "spare worker" (and we could have more than one such spare).

Exactly. I am already working on some code for this, though it seems to end up 
necessarily a bit convoluted in the current codebase.

The way we treat a "hot standby" in mod_proxy_balancer is as a last-ditch 
effort to return something. I.e. *all* workers are unavailable, so then we use 
the hot standby. (This problem can actually be solved a different way by 
setting a high value for lbset.)

In my mind, though, what is proposed here is how I would actually expect a 
"hot standby" to work. Think of it more like a "hot spare" in disk RAID terms. 
That is, if *any* worker is unavailable, the hot spare will be used (or at 
least added to the list of potential workers...still to be determined by the 
lbmethod implementation).

Example:


  BalancerMember "http://192.168.1.50:80"
  BalancerMember "http://192.168.1.51:80"
  BalancerMember "http://192.168.1.52:80"
  BalancerMember "http://192.168.1.53:80" status=+H


In this case, .53 will only get used if .50, .51, and .52 are *all* unavailable.


  BalancerMember "http://192.168.1.50:80"
  BalancerMember "http://192.168.1.51:80"
  BalancerMember "http://192.168.1.52:80"
  BalancerMember "http://192.168.1.53:80" status=+R # new "hot spare" status
  BalancerMember "http://192.168.1.54:80" status=+R # new "hot spare" status


In this case, if .50 becomes unavailable, .53 (or .54 depending on 
implementation) will be treated as an available worker for the lbmethod to 
potentially choose. If 2 or more of .50, .51, and .52 become unavailable, both 
.53 and .54 would be available to be chosen.

So, instead of having a single fallback option when *all* workers are dead, we 
will have a way of trying to ensure that a specific number of workers (3 in the 
example above) are always available...just like a hot spare drive plugs into 
the RAID array when one of the members dies. In our case, though, once the main 
worker recovers, the hot spare will go back to being a hot spare (except for 
matching routes).
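The selection semantics described above can be sketched as follows (worker 
names and the tuple layout are illustrative only, not actual mod_proxy 
structures):

```python
def eligible_workers(members, target=None):
    """Sketch of the proposed hot-spare semantics: all usable regular
    workers, plus just enough usable spares to bring the pool back up to
    the regular pool size. `members` is a list of
    (name, is_spare, is_up) tuples."""
    regulars = [m for m in members if not m[1]]
    target = len(regulars) if target is None else target
    up_regulars = [m[0] for m in regulars if m[2]]
    missing = target - len(up_regulars)
    # Promote only as many up spares as are needed to reach the target.
    spares = [m[0] for m in members if m[1] and m[2]][:max(missing, 0)]
    return up_regulars + spares
```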

Comments welcome.



Re: Ideas from ApacheCon

2017-05-23 Thread Jim Riggs
> On 22 May 2017, at 06:45, Jim Jagielski  wrote:
> 
> I'll let Jim Riggs answer that...it came up during his mod_cache
> talk.
>> On May 18, 2017, at 2:25 PM, Eric Covener  wrote:
>> 
>> On Thu, May 18, 2017 at 2:22 PM, Rainer Jung  wrote:
 o Look into AAA and mod_cache; eg: "bolt in at the end"
>> 
>> Does that differ from "CacheQuickHandler OFF"?


This one is a bit quirky and requires some investigation to see the edge cases. 
There appear to be some situations where Require directives with 
`CacheQuickHandler off' do not always take effect. Thus, items that have been 
cached will be served from the cache even if access restrictions would 
otherwise disallow them.

We will just need to do some more testing to see when -- and if -- this is 
actually occurring. I stumbled across it while doing testing/prep for my talk, 
but did not investigate it thoroughly. (It is entirely possible that I just 
messed something up in my config and didn't catch it...or forgot a 
graceful...or something...)

In my talk, I just told people to "be careful" with access control and CQF off.
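For reference, the shape of configuration under discussion looks something 
like this (paths, auth provider, and cache settings are placeholders, not a 
reproduction of my test setup):

```apache
CacheQuickHandler off
CacheEnable disk "/private/"

<Location "/private/">
    AuthType Basic
    AuthName "Restricted"
    AuthBasicProvider file
    AuthUserFile "/usr/local/apache2/conf/htpasswd"
    Require valid-user
</Location>
```

With CacheQuickHandler off, the Require check should run before mod_cache 
serves a hit; the open question is whether that always happens.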



Re: Ideas from ApacheCon

2017-05-22 Thread Jim Jagielski
I'll let Jim Riggs answer that...it came up during his mod_cache
talk.
> On May 18, 2017, at 2:25 PM, Eric Covener  wrote:
> 
> On Thu, May 18, 2017 at 2:22 PM, Rainer Jung  wrote:
>>>  o Look into AAA and mod_cache; eg: "bolt in at the end"
> 
> Does that differ from "CacheQuickHandler OFF"?
> 
> 
> 
> -- 
> Eric Covener
> cove...@gmail.com



Re: Ideas from ApacheCon

2017-05-18 Thread Eric Covener
On Thu, May 18, 2017 at 2:22 PM, Rainer Jung  wrote:
>>   o Look into AAA and mod_cache; eg: "bolt in at the end"

Does that differ from "CacheQuickHandler OFF"?



-- 
Eric Covener
cove...@gmail.com


Re: Ideas from ApacheCon

2017-05-18 Thread Rainer Jung

Thanks for the list.

One remark inline ...

Am 18.05.2017 um 19:46 schrieb Jim Jagielski:

Based on feedback from various sessions:

  o A new-kind of "hot standby" in mod_proxy which kicks
in whenever a worker moves out of the pool (ie, doesn't
wait until all workers are out)... ala a redundant
hard drive.


Maybe "spare worker" (and we could have more than one such spare).


  o Look into AAA and mod_cache; eg: "bolt in at the end"

  o HTTP/2 no longer experimental!

  o balancer-manager more scriptable (move to REST for realz?)

  o When restarting w/ persistent balancer data, warn if
config files mtime is newer.

  o Warn if the trailing '/'s don't match in ProxyPass/Reverse
directives (eg: ProxyPass /foo http://www.example.com/foo/ )

All I can recall at present...


Regards,

Rainer