On 12/02/2012 5:05 p.m., Pieter De Wit wrote:
<snip>
* the parsing bottleneck gets crunched several times: on first
arrival, in the ICAP server, and on return to Squid,
* the ICAP server bypass optimization can't be used since quota needs
to measure every byte,
* tunneled data does not get sent to ICAP services,
Not exactly perfect service, but it offers the most complete quota
control without adding complexity to Squid.
eCAP might be slightly better. It still runs inside Squid and has
some processing overhead, but should reduce the parse problems and
network delays involved with ICAP.
Pointers to reading URLs are more than welcome, and so are examples
of libicapapi :)
Hopefully someone else knows some then, because I don't :(
Amos
Hi Amos,
You said that you proposed some work a while ago, would you mind
sharing that? I gave the network thing some thought and I can see how
the delay would hurt Squid. I kept on comparing it to milters, but
these don't mind a few ms of delay, since email is a lot less interactive.
The thought process I am going with is something along the lines of a
process that is "spoken" to, like ecap perhaps, via pipes or a lib or
some such. This process will be notified based on the following:
(* - Request, **-Reply)
* I would like to go to protocol://site
** Is there quota left to allow this? If the user has 0 quota left,
block the request, no use
* The server said the object is X bytes long, can I continue to
download it?
** Yes, there is quota. The problem comes in if the server didn't give
a length. If that is the case, perhaps only allow 1024 bytes at a time
until his quota runs out. There is also the problem if the server said
the object is bigger than it really is...
* Can I send the following 1024 bytes?
** Yes, there is quota.
At any given step, if the quota runs out, the connection is aborted. This
will involve some tie in with the FD struct that you guys have
already. I do recall myself and Alex having a chat about this. I
referred to it as "hooks" into the FD struct. I *think* the talk about
"hooks" in the FD struct was aborted because it didn't add enough
value at the time, or real life caught up to me or or or :)
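
A rough sketch of the request/reply steps above, in C++ only because that is what Squid is written in. Everything named here (Quota, allowRequest, allowResponse, chargeBytes, kChunk) is made up for illustration and is not existing Squid code. It also assumes that a response whose declared length exceeds the remaining quota is refused outright, which the steps above do not actually specify.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-user quota state; not a Squid type.
struct Quota {
    std::uint64_t bytesLeft;
};

// Slice size used when the server gives no length (the 1024 from the steps above).
constexpr std::uint64_t kChunk = 1024;

// Step 1: "I would like to go to protocol://site". Block if nothing is left.
bool allowRequest(const Quota &q) { return q.bytesLeft > 0; }

// Step 2: "the object is X bytes long". Returns how many bytes may be
// fetched before asking again; 0 means abort.
std::uint64_t allowResponse(const Quota &q, bool lengthKnown, std::uint64_t declaredLength)
{
    if (q.bytesLeft == 0)
        return 0;
    if (!lengthKnown)   // no length given: hand out one chunk at a time
        return q.bytesLeft < kChunk ? q.bytesLeft : kChunk;
    // Assumption: refuse objects that cannot fit in the remaining quota.
    return declaredLength <= q.bytesLeft ? declaredLength : 0;
}

// Step 3: "can I send the following bytes". Charges what was really
// transferred; false means the quota ran out and the connection is aborted.
bool chargeBytes(Quota &q, std::uint64_t sent)
{
    if (sent > q.bytesLeft) {
        q.bytesLeft = 0;
        return false;
    }
    q.bytesLeft -= sent;
    return true;
}

// Usage: a 2048-byte quota carries exactly two 1024-byte chunks.
void exampleTransfer()
{
    Quota q{2048};
    assert(allowRequest(q));
    assert(allowResponse(q, false, 0) == kChunk);
    assert(chargeBytes(q, 1024));
    assert(chargeBytes(q, 1024));
    assert(!allowRequest(q));   // the next request is blocked
}
```

Because chargeBytes() counts the bytes that actually moved, a server that overstates the object size only affects the up-front check, not what gets deducted.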
The download, even if of known length, can be aborted at any time, and the
backend system may change the quota at any time as well.
So IMO the best idea is to collapse the requests all down to a single
request asking for N bytes and passing along any parameters which the
quota backend needs.
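
To illustrate the collapsed form, here is a minimal C++ sketch of a backend with one call: ask for N bytes, get back how many were granted. QuotaRequest, QuotaBackend, grant() and setQuota() are hypothetical names, and the single "account" key stands in for whatever parameters the real backend would need. It assumes unknown accounts are granted nothing.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical request: who is asking and how many bytes they want.
struct QuotaRequest {
    std::string account;    // stand-in for the backend's parameters
    std::uint64_t wanted;   // N bytes
};

class QuotaBackend {
public:
    // The backend may change a quota at any time; the next grant() sees it.
    void setQuota(const std::string &account, std::uint64_t bytes) { left_[account] = bytes; }

    // Grants up to req.wanted bytes and deducts them. 0 means "abort".
    std::uint64_t grant(const QuotaRequest &req)
    {
        const auto it = left_.find(req.account);
        if (it == left_.end())
            return 0;   // assumption: unknown accounts get nothing
        const std::uint64_t granted = req.wanted < it->second ? req.wanted : it->second;
        it->second -= granted;
        return granted;
    }

private:
    std::map<std::string, std::uint64_t> left_;
};

// Usage: 1500 bytes of quota, requested 1024 bytes at a time.
void exampleGrants()
{
    QuotaBackend backend;
    backend.setQuota("pieter", 1500);
    assert(backend.grant({"pieter", 1024}) == 1024);
    assert(backend.grant({"pieter", 1024}) == 476);   // only the remainder
    assert(backend.grant({"pieter", 1024}) == 0);     // exhausted: abort
}
```

Since the caller has to come back for every N bytes, an aborted download simply stops asking, and a quota change takes effect on the next grant.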
The basic idea was started here:
http://bugs.squid-cache.org/show_bug.cgi?id=1849
Looking back at the discussion thread, it was started by you in Feb 2009;
the model description is here:
http://marc.info/?l=squid-dev&m=123570800116923&w=1. Although it seems I
sent you something in private before that with more details. Sorry, that
mail is gone now.
The Measurement Factory have since created the client_delay_pool part of
it but without any helper hooks. So the current code only does per-second
capping. What remains is adding a helper API hook that sets the client DB
quota field values and updates them when exhausted.
That is fully controllable already with per-request limitations and speeds.
The big case that is left is fixed-size quotas that run down. No need
for lookups with details from particular headers or such at this point.
Based on this, I would like to re-float the idea of "hooks" in the FD
struct. Off the top of my head, one would have modules that expose
certain functions/procedures:
The FD struct is on the hitlist for erasure or at least removing
anything that is not particularly directly related to the FD value. The
Comm layer has been restructured in Squid-3.2+ into a set of dynamically
created listener Jobs (TcpAcceptor) which spawn traffic handler
AsyncCalls based on the http(s)_port settings. The hooks would be best
added into the call sequence and run out of those traffic handler
functions. Incidentally that would be...
OnClientConnect (source_ip,source_port,target_ip,target_port);
This would be httpAccept(), httpsAccept() in client_side.cc where the
client DB entry is created/updated. It would need the config settings to
handle being limited to the TCP level details available here, with no
request details.
The main idea behind using a helper, was that we can completely avoid
the work of figuring out generically useful config directives. Just pass
the TCP details to the helper and let the admin decide which are used
and how.
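
As a sketch of what such a helper could look like on the inside, here is some C++ that takes one line of TCP details and produces a reply line. The input format ("src_ip src_port dst_ip dst_port"), the "OK quota=N" / "ERR" replies, and all the names (TcpDetails, parseLine, answer) are assumptions for illustration; no such helper interface exists in Squid yet. The policy shown, a byte quota keyed on client IP, is just one thing an admin might choose to do with the details.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical TCP-level details for one accepted connection.
struct TcpDetails {
    std::string srcIp;
    unsigned srcPort = 0;
    std::string dstIp;
    unsigned dstPort = 0;
};

// Parses "src_ip src_port dst_ip dst_port"; false on malformed input.
bool parseLine(const std::string &line, TcpDetails &out)
{
    std::istringstream in(line);
    return static_cast<bool>(in >> out.srcIp >> out.srcPort >> out.dstIp >> out.dstPort);
}

// Example policy: a byte quota per client IP. Returns the reply line.
std::string answer(const std::string &line,
                   const std::map<std::string, std::uint64_t> &quotaByIp)
{
    TcpDetails d;
    if (!parseLine(line, d))
        return "ERR";
    const auto it = quotaByIp.find(d.srcIp);
    if (it == quotaByIp.end() || it->second == 0)
        return "ERR";   // no entry or nothing left
    return "OK quota=" + std::to_string(it->second);
}

// Usage with documentation-range addresses.
void exampleHelper()
{
    const std::map<std::string, std::uint64_t> quotas{{"192.0.2.10", 5000}};
    assert(answer("192.0.2.10 40000 198.51.100.1 80", quotas) == "OK quota=5000");
    assert(answer("192.0.2.99 40000 198.51.100.1 80", quotas) == "ERR");
}
```

A real helper would wrap answer() in a loop reading lines from stdin and writing replies to stdout.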
OnClientRequest (URL);
OnClientRequestContent (content,size,offset);
The code structure allows for a hook after the request headers are fully
received and parsing has completed, in doCallouts(). The earlier processing
is locked inside some annoying loops. I'm hoping to kill those, but that
will take a while. For now we are stuck with doCallouts() being the
start- and end-all of request processing.
OnClientResponse (URL,size);
OnClientResponseContent (content,size,offset);
Squid offers http.cc processRequest() for hooks after the response
headers have been parsed.
OnClientDisconnect (<not sure>);
I will outright say, I have no clue how modules work (thinking about
Apache etc) and these are shamelessly based on my Delphi XP with Objects.
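
For what it is worth, the hook list above could be written down as a C++ interface roughly like this. It is only a sketch: QuotaHooks and RunDownQuota are hypothetical names, the parameter types are guesses, the "return false to abort" convention is an assumption, and nothing here reflects how Squid modules are actually loaded or called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical module interface mirroring the hook names above.
// Assumption: a hook returns false to ask for the connection to be aborted.
class QuotaHooks {
public:
    virtual ~QuotaHooks() = default;
    virtual bool onClientConnect(const std::string & /*srcIp*/, unsigned /*srcPort*/,
                                 const std::string & /*dstIp*/, unsigned /*dstPort*/) { return true; }
    virtual bool onClientRequest(const std::string & /*url*/) { return true; }
    virtual bool onClientRequestContent(const char * /*content*/, std::size_t /*size*/,
                                        std::uint64_t /*offset*/) { return true; }
    virtual bool onClientResponse(const std::string & /*url*/, std::uint64_t /*size*/) { return true; }
    virtual bool onClientResponseContent(const char * /*content*/, std::size_t /*size*/,
                                         std::uint64_t /*offset*/) { return true; }
    virtual void onClientDisconnect() {}
};

// Example module: a fixed-size quota that runs down as response bytes flow.
class RunDownQuota : public QuotaHooks {
public:
    explicit RunDownQuota(std::uint64_t bytes) : left_(bytes) {}

    bool onClientRequest(const std::string &) override { return left_ > 0; }

    bool onClientResponseContent(const char *, std::size_t size, std::uint64_t) override
    {
        if (size > left_) {   // quota runout: tell the caller to abort
            left_ = 0;
            return false;
        }
        left_ -= size;
        return true;
    }

    std::uint64_t left() const { return left_; }

private:
    std::uint64_t left_;
};

// Usage: 2000 bytes of quota, response arriving in 1024-byte pieces.
void exampleHooks()
{
    RunDownQuota module(2000);
    QuotaHooks &hooks = module;
    assert(hooks.onClientRequest("http://example.com/"));
    assert(hooks.onClientResponseContent("", 1024, 0));
    assert(module.left() == 976);
    assert(!hooks.onClientResponseContent("", 1024, 1024));   // runs out here
    assert(!hooks.onClientRequest("http://example.com/"));
}
```

Default implementations that allow everything mean a module only has to override the hooks it cares about.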
The hardest part is making the hook on quota runout work cleanly. Have a
good look at client_db.cc for how the "quota" stuff in there works already.
Cheers,
Pieter
P.S. Might be worth starting a new thread perhaps ?
Same topic though. Rename?
Amos