Re: [squid-dev] [PATCH] url_rewrite_timeout directive

2014-12-03 Thread Tsantilas Christos

If there are no objections, I will apply the latest patch to trunk.


On 11/24/2014 02:36 PM, Tsantilas Christos wrote:

This is a new patch for the url_rewrite_timeout feature.

Changes since the last patch:
- tools/helper-mux/helper-mux is fixed to work with the new helper
request-id.
- There is now a limit on request retries; it is hardcoded to 2
retries.
- Retrying requests on BH replies for the storeID and redirector
helpers is now handled inside the helpers.cc code.
- Other minor polishing changes.
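
The retry limit mentioned in the list above could look roughly like the following. This is a minimal Python sketch, not the C++ code in helpers.cc; `query_with_retries`, `send`, and `MAX_RETRIES` are hypothetical names, and a timeout is modelled simply as a `None` reply.

```python
from typing import Callable, Optional

MAX_RETRIES = 2  # mirrors the hardcoded limit of 2 retries described above


def query_with_retries(send: Callable[[str], Optional[str]], request: str,
                       max_retries: int = MAX_RETRIES) -> Optional[str]:
    """Send `request`; when it times out (None), retry at most `max_retries` times."""
    attempts = 1 + max_retries  # the initial attempt plus the retries
    for _ in range(attempts):
        reply = send(request)
        if reply is not None:
            return reply
    return None  # every attempt timed out; the caller decides what to do next
```

With the limit at 2, a request is sent at most three times before the caller has to fall back to some other action.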

I did not remove the on_timeout option from the url_rewrite_timeout
directive. Although it can be emulated using the
use_configured_response option, I believe it is a clearer configuration
method. The use_configured_response approach looks more like a trick,
and I am sure that in 1-2 years even I, who developed the patch, will
forget that such a configuration option exists.

I must note again that the default behaviour of the current helper
configuration should not change with this patch.

I hope it is OK.

Regards,
Christos




___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev





Re: [squid-dev] [PATCH] url_rewrite_timeout directive

2014-11-18 Thread Tsantilas Christos

On 11/18/2014 01:27 AM, Amos Jeffries wrote:


On 18/11/2014 6:10 a.m., Tsantilas Christos wrote:

On 11/16/2014 01:05 PM, Amos Jeffries wrote:


On 16/11/2014 7:38 a.m., Tsantilas Christos wrote:


 



For the record I am still extremely skeptical about the use-case
behind this feature. Timeout is a clear sign that the helper or
system behind it is broken (capacity overload / network
congestion). Continuing to use such systems in a way which
further increases network load is very often a bad choice. Hiding
the issue is an even worse choice.



Maybe the administrator does not care about unanswered queries.


I posit that admins who truly do not care about unanswered questions
will not care enough to bother configuring this feature.


So he will not do it.
The default behaviour is not to use a timeout.
The default behaviour is not changed.



The administrator should always care about these unanswered questions.
Each one means worse proxy performance. Offering to hide them silently
and automatically, without anybody having to do any work on fixing
the underlying cause, implies they are not actually problems. Which is
a false implication.



This option can be used to increase proxy performance. This is the 
purpose of this option.






Imagine a huge Squid proxy, which may have to process thousands of
URLs per minute, and the administrator has decided that 10% of helper
requests can fail. In this case we are giving him a way to configure
Squid's behaviour: it may ignore the problem, use a
predefined answer, etc.


Now that same Squid proxy suddenly starts randomly wrongly rejecting
just one request per minute out of all those thousands.
  Which one, why, and WTF can anybody do about it?


Only requests which timed out, and only if configured to do so...



Previous Squid would log a client request with _TIMEOUT, leave a
helper in Busy or Pending state with full trace of the pending lookup
in mgr reports, possibly even cache.log warnings about helpers queue
length.


The patch does not change Squid's behaviour if the timeout is not 
configured (the default).




Now all that is reduced to an overly simple aggregated "Requests timed
out:" hiding in a rarely viewed manager report and a hidden level-3
debug message that lists an anonymous requestId from N identical
requestIds spread over N helpers.



If configured, you will see in the mgr report a "Requests timed out:" entry for 
each running server.
If you see that a server has many more timed-out requests than the other 
servers, you can kill it if you consider it a problem.





The reason the helper does not answer fast enough may be a
database or an external lookup failure (for example, a categorized
URL list served as a DNS service). In these cases the system admin or
service provider may prefer no answer from the helper, or a
preconfigured answer, instead of waiting too long for the answer.


What to do if the internal dependencies are going badly is something
that should be handled *inside the helper*.
  After all Squid has absolutely no way to know if its generic lookup
helper has the DNS lookup on the DB server name or the DB query itself
broken. Each of which might have a different best way to respond.

The intention of the BrokenHelper code was to have the helpers
explicitly inform Squid that they were in trouble and needed help
shifting the load elsewhere.

Squid silently forgetting that requests have been sent and then
sending *more* is a truly terrible way to fix all the cases of helper
overload.


Again, the timeout is optional. It is an extra option.
If the customer/Squid user wants to use the BH code, they still can.




The helper can be implemented to not answer at all after a timeout
period. A policy of "if you do not answer in a day, please do not
answer at all, I am not interested any more" is common in the human
world, and in business.



Yes, it is also being fingered as a major reason for businesses dying
off during the recent recession


:-)
Well, we could have a huge discussion about the recent recession, but I am 
sure I have more examples than you of businesses failing in a 
recessionary environment!



. Late respondents lost out on work and
thus cashflow.


Yes, but you cannot avoid such cases. A stop-loss after a reasonable 
configured timeout is not a bad tactic in such cases.





in src/cache_cf.cc

* please move the *_url_rewrite_timeout() functions to
redirect.cc - that will help reduce dependency issues when we get
around to making the redirector config a standalone FooConfig
object.


Not so easy, because of dependencies on time-related parsing
functions. I left it for now. If required, we must move the time
parsing functions to a library file, for example Config.cc.



Pity. Oh well.



* it would be simpler and easier on users to omit the
on_timeout=use_configured_response response= and just have the
existence of a response= 

Re: [squid-dev] [PATCH] url_rewrite_timeout directive

2014-11-18 Thread Alex Rousskov
On 11/16/2014 04:05 AM, Amos Jeffries wrote:

 For the record I am still extremely skeptical about the use-case 
 behind this feature.

This is a real use case (or we would not be proposing this feature):
admins want to control what happens when their helper transactions
time out.

Some of us may prefer to ignore timeouts completely (because they are
usually benign), some may prefer to kill Squid on the first timeout
(so that somebody notices and investigates sooner rather than later),
and some may want to do something in-between. Since both extremes and
especially the range of actions between them are valid in many
environments, our personal preferences are not really that important
here: We should give the admins the knobs they need to do their job
the way they see fit.


 Timeout is a clear sign that the helper or system behind it is
 broken (capacity overload / network congestion). Continuing to use
 such systems in a way which further increases network load is very
 often a bad choice. Hiding the issue is an even worse choice.

The above logic is very true in some deployment environments and
completely bogus in others. Fortunately, there is no need to force
everybody to use this logic or spend hours discussing the best approach
to dealing with proxy errors. We can just give admins the power to
decide what works best for them!


 My vote on this is -0. Too many problems and still no clear
 use-case has been described[1] for this to be a reasonable
 addition.

 [1] "helper is slow" and "helper times out" are not use-case 
 descriptions. They are symptoms of an underlying problem.

They are both a use case and a symptom. I know it is difficult to
imagine, but sometimes admins have to work around symptoms of the
problems they cannot control or cannot fix.

The ability to deal with imperfect helpers has been requested many
times. Squid's inability to function when optional helpers misbehave is
a serious flaw that drives users away. As far as the overall
functionality of the proposed feature is concerned, I am sure it would
be welcomed by many, so I am happy to give it +1 if that is needed to
offset your "they should just make perfect helpers instead" -0.


Cheers,

Alex.


Re: [squid-dev] [PATCH] url_rewrite_timeout directive

2014-11-17 Thread Amos Jeffries

On 18/11/2014 6:10 a.m., Tsantilas Christos wrote:
 On 11/16/2014 01:05 PM, Amos Jeffries wrote:
 
 On 16/11/2014 7:38 a.m., Tsantilas Christos wrote:
 Hi all,
 
 This patch adds the url_rewrite_timeout directive.
 
 When configured, Squid keeps track of active requests and
 treats timed-out requests to the redirector as failed requests.
 
 url_rewrite_timeout format:
   url_rewrite_timeout timeout time-units
       on_timeout=fail|bypass|retry|use_configured_response
       [response=quoted-string]
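
To make the proposed format concrete, here is a minimal Python sketch that parses one such line. It is an illustration only and not Squid's parser (which lives in C++ in src/cache_cf.cc): `parse_url_rewrite_timeout`, `RewriteTimeout`, and the small `UNIT_SECONDS` table are hypothetical, and the sketch assumes on_timeout= is mandatory because the format line shows it without brackets.

```python
import shlex
from dataclasses import dataclass
from typing import Optional

ACTIONS = {"fail", "bypass", "retry", "use_configured_response"}
# Assumed unit table for this sketch only; Squid's own parser may differ.
UNIT_SECONDS = {"second": 1, "seconds": 1, "minute": 60, "minutes": 60}


@dataclass
class RewriteTimeout:
    seconds: int
    on_timeout: str
    response: Optional[str] = None


def parse_url_rewrite_timeout(line: str) -> RewriteTimeout:
    tokens = shlex.split(line)  # keeps response="quoted string" as one token
    if len(tokens) < 4 or tokens[0] != "url_rewrite_timeout":
        raise ValueError("expected: url_rewrite_timeout timeout time-units on_timeout=...")
    if tokens[2] not in UNIT_SECONDS:
        raise ValueError(f"unknown time unit: {tokens[2]}")
    seconds = int(tokens[1]) * UNIT_SECONDS[tokens[2]]

    options = {}
    for token in tokens[3:]:
        key, sep, value = token.partition("=")
        if not sep or key not in ("on_timeout", "response"):
            raise ValueError(f"unexpected token: {token}")
        options[key] = value

    action = options.get("on_timeout")
    if action not in ACTIONS:
        raise ValueError(f"unknown on_timeout action: {action}")
    response = options.get("response")
    if action == "use_configured_response" and response is None:
        raise ValueError("use_configured_response requires response=")
    return RewriteTimeout(seconds, action, response)
```

For example, `parse_url_rewrite_timeout("url_rewrite_timeout 2 minutes on_timeout=bypass")` yields a 120-second timeout with the bypass action.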
 
 The url_rewrite_timeout directive can accept the on_timeout 
 argument to allow the user to configure the action taken when the
 helper request times out. The available actions are:
 - fail: Squid returns an ERR_GATEWAY_FAILURE error page
 
 Why? It seems to me this should be doing one of the below:
 
 on_timeout=use_configured_response response=ERR
 
 on_timeout=use_configured_response response=BH
 
 although being able to redirect to a Squid default error page
 template has its attractions regardless of this. Perhaps BH with
 an error=ERR_GATEWAY_FAILURE kv-pair ?
 
 The on_timeout= option is easier to understand and configure.
 
 Also this patch adds the retry operation inside the helpers.cc code
 and makes it easier to implement similar features for other helpers
 too...
 
 Also the BH retry can now be implemented using the helpers retry
 operation.
 
 
 
 - bypass: the url is not rewritten.
 
 Identical to: on_timeout=use_configured_response response=OK
 
 - retry: retry the request to helper
 
 Equivalent to what is supposed to happen for: 
 on_timeout=use_configured_response response=BH
 
 NP: if retry with different helper process is not already being
 done on BH then it needs to be added.
 
 It is not implemented for BH redirects. Also, although it makes
 sense not to use the same server after a BH response, it is not clear
 why that is needed for a timed-out server.

We don't know why the timeout happened. There are a few cases where it
may have happened due to internal helper state. Moving to a different
helper guarantees that those cases will have changed in some way,
reducing the overall probability that it will repeat.
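
As an illustration of retrying on a different helper, a selection step might look like this in Python. The helper names and `pick_retry_helper` are hypothetical; this is not how helpers.cc selects a server.

```python
from typing import Optional, Sequence


def pick_retry_helper(helpers: Sequence[str], timed_out: str) -> Optional[str]:
    """Prefer any helper other than the one that timed out.

    Falls back to the timed-out helper when it is the only one running,
    and returns None when no helper is available at all.
    """
    for helper in helpers:
        if helper != timed_out:
            return helper
    return timed_out if timed_out in helpers else None
```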

 
 I believe this should have a different form. If there are many 
 timed-out responses from a server, then do not send more requests
 to it for a while, or maybe shut it down. We can add a TODO for this
 one.
 

Both are worth doing.
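
The "do not send more requests for a while" idea could be sketched as a small back-off tracker per helper. This is only an illustration in Python; `HelperBackoff` and its thresholds are hypothetical and nothing like it is part of the patch under discussion.

```python
class HelperBackoff:
    """Pause a helper after too many consecutive timeouts (illustrative values)."""

    def __init__(self, max_timeouts: int = 5, pause_seconds: float = 30.0):
        self.max_timeouts = max_timeouts
        self.pause_seconds = pause_seconds
        self.timeouts = 0
        self.paused_until = 0.0

    def record_timeout(self, now: float) -> None:
        self.timeouts += 1
        if self.timeouts >= self.max_timeouts:
            # Stop using this helper until the pause expires, then start counting afresh.
            self.paused_until = now + self.pause_seconds
            self.timeouts = 0

    def record_reply(self) -> None:
        self.timeouts = 0  # only consecutive timeouts count

    def available(self, now: float) -> bool:
        return now >= self.paused_until
```

Passing `now` explicitly keeps the sketch independent of any particular clock; a caller could supply `time.monotonic()`.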

 
 - use_configured_response: use a response which can be
 configured using the response= option
 
 
 So as you can see, with a change to make the URL-rewriter capable
 of redirecting to one of the built-in error page templates we
 could completely drop the on_timeout setting.
 
 I still believe that on_timeout is the better option and easier
 to understand and configure.
 

Whatever it gets called, the main point was that there is no need for 2
different options.

 
 Technical details
 =================
 
 This patch:
 - adds a mechanism inside the helpers.cc code to handle timeouts:
   - return a pre-configured response on timeout,
   - or retry on timeout,
   - or return a timed-out (Helper::TimedOut code) response to the
     caller. The caller can choose to ignore the timed-out request
     or produce an error.
 - modifies client_side_request.cc to return the
   ERR_GATEWAY_FAILURE error page. The error detail
   ERR_DETAIL_REDIRECTOR_TIMEDOUT is also set to identify the type of
   error.
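
The handling paths listed in the technical details can be summarised as a small dispatch on the configured action. The following Python sketch is an assumption-laden illustration, not the patch's C++ logic: `resolve_timeout` and the result labels it returns are hypothetical, while the action names and ERR_GATEWAY_FAILURE come from the description above.

```python
from typing import Optional, Tuple


def resolve_timeout(action: str, response: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Map an on_timeout action to (what to do, optional payload)."""
    if action == "fail":
        return ("error_page", "ERR_GATEWAY_FAILURE")
    if action == "bypass":
        return ("keep_original_url", None)  # the URL is not rewritten
    if action == "retry":
        return ("resend_to_helper", None)
    if action == "use_configured_response":
        if response is None:
            raise ValueError("use_configured_response needs response=")
        return ("helper_reply", response)  # treat response= as the helper's answer
    raise ValueError(f"unknown on_timeout action: {action}")
```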
 
 
 For the record I am still extremely skeptical about the use-case 
 behind this feature. Timeout is a clear sign that the helper or
 system behind it is broken (capacity overload / network
 congestion). Continuing to use such systems in a way which
 further increases network load is very often a bad choice. Hiding
 the issue is an even worse choice.
 
 
 Maybe the administrator does not care about unanswered queries.

I posit that admins who truly do not care about unanswered questions
will not care enough to bother configuring this feature.

The administrator should always care about these unanswered questions.
Each one means worse proxy performance. Offering to hide them silently
and automatically, without anybody having to do any work on fixing
the underlying cause, implies they are not actually problems. Which is
a false implication.


 Imagine a huge Squid proxy, which may have to process thousands of
 URLs per minute, and the administrator has decided that 10% of helper
 requests can fail. In this case we are giving him a way to configure
 Squid's behaviour: it may ignore the problem, use a
 predefined answer, etc.

Now that same Squid proxy suddenly starts randomly wrongly rejecting
just one request per minute out of all those thousands.
 Which one, why, and WTF can anybody do about it?

Previous Squid would log a client request with _TIMEOUT, leave a
helper in Busy or Pending state with full trace of the pending lookup
in mgr reports, possibly even cache.log warnings about helpers queue
length.

Now all that is reduced to