On 12/02/2012 5:05 p.m., Pieter De Wit wrote:
<snip>
* the parsing bottleneck gets crunched several times: on first arrival, in the ICAP server, and on return to Squid,
* the ICAP server bypass optimization can't be used since quota needs to measure every byte,
* tunneled data does not get sent to ICAP services,

Not exactly perfect service, but it offers the most complete quota control without adding complexity to Squid.

eCAP might be slightly better. It still runs inside Squid and has some processing overhead, but should reduce the parse problems and network delays involved with ICAP.


Pointers to URLs for reading are more than welcome, as are examples of libicapapi :)

Hopefully someone else knows some then, because I don't :(

Amos
Hi Amos,

You said that you proposed some work a while ago, would you mind sharing that? I gave the network thing some thought and I can see how the delay would hurt Squid. I kept comparing it to milters, but those don't mind a few ms of delay; email is a lot less interactive.

The thought process I am going with is something along the lines of a process that is "spoken" to, like eCAP perhaps, via pipes or a lib or some such. This process will be notified based on the following:

(* - Request, ** - Reply)

* I would like to go to protocol://site
** Is there quota left to allow this? If the user has 0 quota left, block the request, no use
* The server said the object is X bytes long, can I continue to download it
** Yes, there is quota. The problem comes in if the server didn't give a length; if that is the case, perhaps only allow 1024 bytes at a time until their quota runs out. There is also the problem if the server said the object is bigger than it really is...
* Can I send the following 1024 bytes
** Yes, there is quota.
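
A minimal sketch of the three questions in that exchange, assuming a plain per-user byte counter. None of these names (`QuotaState`, `allowRequest`, `allowObject`, `allowChunk`) exist in Squid; they are made up for illustration, and the 1024-byte chunk is simply the figure used in the list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical per-user quota state; not an existing Squid structure.
struct QuotaState {
    std::uint64_t remaining; // bytes the user may still transfer
};

const std::uint64_t kChunk = 1024; // chunk size taken from the example list

// Question 1: "I would like to go to protocol://site".
// With nothing left, the request is blocked before it starts.
bool allowRequest(const QuotaState &q) { return q.remaining > 0; }

// Question 2: "The server said the object is X bytes long".
// When no length was given, the object cannot be checked up front, so it is
// allowed to start and is policed chunk by chunk instead.
bool allowObject(const QuotaState &q, bool lengthKnown, std::uint64_t declared) {
    if (!lengthKnown) return q.remaining > 0;
    return declared <= q.remaining;
}

// Question 3: "Can I send the following bytes".
// Returns how many bytes may go out (at most one chunk) and charges them.
// A return value of 0 means the connection should be aborted.
std::uint64_t allowChunk(QuotaState &q, std::uint64_t wanted) {
    const std::uint64_t granted = std::min(std::min(wanted, kChunk), q.remaining);
    assert(granted <= q.remaining);
    q.remaining -= granted;
    return granted;
}
```

Charging only for the bytes actually granted in `allowChunk` means a server that declares a larger object than it really sends does not cost the user more than what was transferred.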

At any given step, if the quota runs out, the connection is aborted. This will involve some tie-in with the FD struct that you guys have already. I do recall Alex and me having a chat about this. I referred to it as "hooks" into the FD struct. I *think* the talk about "hooks" in the FD struct was aborted because it didn't add enough value at the time, or real life caught up to me or or or :)

The download, even if known-length, can be aborted at any time, and the backend system may change the quota at any time as well. So IMO the best idea is to collapse all the requests down to a single request asking for N bytes, passing along any parameters which the quota backend needs.
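
A rough sketch of what that collapsed interface might look like, assuming a hypothetical `QuotaBackend` class (nothing here is existing Squid code). Every stage of the transaction asks the same question, "may I have N bytes?", and a short or zero grant means abort. Nothing is reserved in advance, so a quota change in the backend takes effect on the next call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical quota backend; in practice this could sit behind a helper.
class QuotaBackend {
public:
    // The backend system may change a user's quota at any time.
    void setQuota(const std::string &user, std::uint64_t bytes) { left_[user] = bytes; }

    // The single collapsed request: ask for n bytes and get back the number
    // granted (possibly fewer, possibly 0). 'params' stands in for whatever
    // extra details the backend wants; it is unused in this sketch.
    std::uint64_t requestBytes(const std::string &user, std::uint64_t n,
                               const std::string &params = "") {
        (void)params;
        std::map<std::string, std::uint64_t>::iterator it = left_.find(user);
        if (it == left_.end()) return 0; // unknown user: no quota
        const std::uint64_t granted = std::min(n, it->second);
        it->second -= granted;
        return granted;
    }

private:
    std::map<std::string, std::uint64_t> left_;
};

// Transfer up to 'total' bytes in 'chunk'-sized requests and return how many
// were actually sent. A short or zero grant ends the transfer early.
std::uint64_t transfer(QuotaBackend &backend, const std::string &user,
                       std::uint64_t total, std::uint64_t chunk) {
    assert(chunk > 0);
    std::uint64_t sent = 0;
    while (sent < total) {
        const std::uint64_t want = std::min(chunk, total - sent);
        const std::uint64_t got = backend.requestBytes(user, want);
        sent += got;
        if (got < want) break; // quota ran out: abort
    }
    return sent;
}
```

The caller never needs to know whether the object length was declared; it just keeps asking until the transfer completes or a grant comes back short.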

The basic idea was started here:
  http://bugs.squid-cache.org/show_bug.cgi?id=1849

Looking back at the discussion thread, it was started by you in Feb 2009. The model description is here:
  http://marc.info/?l=squid-dev&m=123570800116923&w=1

It seems I sent you something in private before that with more details, though. Sorry, that mail is gone now.


The Measurement Factory have since created the client_delay_pool part of it, but without any helper hooks. So the current code only does /sec capping. The next step would be adding a helper API hook that sets the client DB quota field values and updates them when exhausted.

That is fully controllable already with per-request limitations and speeds.

The big case that is left is fixed-size quotas that run down. No need for lookups with details from particular headers or such at this point.




Based on this, I would like to re-float the idea of "hooks" in the FD struct. Off the top of my head, one would have modules that expose certain functions/procedures:

The FD struct is on the hitlist for erasure, or at least for removing anything that is not directly related to the FD value. The Comm layer has been restructured in Squid-3.2+ into a set of dynamically created listener Jobs (TcpAcceptor) which spawn traffic handler AsyncCalls based on the http(s)_port settings. The hooks would best be added into the call sequence and run out of those traffic handler functions. Incidentally that would be...


OnClientConnect (source_ip,source_port,target_ip,target_port);

This would be httpAccept(), httpsAccept() in client_side.cc where the client DB entry is created/updated. It would need the config settings to handle being limited to the TCP-level details available here, with no request details. The main idea behind using a helper was that we can completely avoid the work of figuring out generically useful config directives. Just pass the TCP details to the helper and let the admin decide which are used and how.
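
As a sketch of "just pass the TCP details to the helper", assuming a line-based exchange: the `TcpDetails` struct, the space-separated field order, and the "OK <bytes>" / "ERR" reply shape are all assumptions made for illustration, not an existing Squid helper format.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// TCP-level details available at accept time (hypothetical struct).
struct TcpDetails {
    std::string srcIp;
    std::uint16_t srcPort;
    std::string dstIp;
    std::uint16_t dstPort;
};

// Build one request line for the helper. The helper (written by the admin)
// decides which of these fields it actually uses.
std::string helperRequestLine(const TcpDetails &t) {
    assert(!t.srcIp.empty() && !t.dstIp.empty());
    std::ostringstream out;
    out << t.srcIp << ' ' << t.srcPort << ' ' << t.dstIp << ' ' << t.dstPort << '\n';
    return out.str();
}

// Interpret an assumed reply of the form "OK <bytes>" or "ERR".
// Returns true and sets 'quota' on OK; returns false otherwise.
bool parseHelperReply(const std::string &line, std::uint64_t &quota) {
    std::istringstream in(line);
    std::string verdict;
    if (!(in >> verdict) || verdict != "OK") return false;
    return static_cast<bool>(in >> quota);
}
```

Keeping the request to raw TCP details leaves the policy entirely in the helper, which is the point of avoiding new config directives.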

OnClientRequest (URL);
OnClientRequestContent (content,size,offset);

The code structure allows for a hook once the request headers are fully received and parsing has completed, in doCallouts(). The earlier processing is locked inside some annoying loops. I'm hoping to kill those, but that will take a while. For now we are stuck with doCallouts() being the start- and end-all of request processing.

OnClientResponse (URL,size);
OnClientResponseContent (content,size,offset);

Squid offers http.cc processRequest() for hooks after the response headers have been parsed.

OnClientDisconnect (<not sure>);

I will outright say, I have no clue how modules work (thinking about apache etc) and these are shamelessly based on my Delphi XP with Objects.
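
One way the hook names above could be gathered into a module interface, sketched as a C++ abstract base class. This is only an illustration: `QuotaHooks` and `FixedQuota` are hypothetical, the return-false-to-abort convention is an assumption, and OnClientDisconnect takes no arguments here because its parameters were left open.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical module interface built from the hook names above.
// Each hook returns true to continue and false to abort the connection.
// The defaults allow everything, so a module overrides only what it needs.
class QuotaHooks {
public:
    virtual ~QuotaHooks() {}
    virtual bool onClientConnect(const std::string & /*srcIp*/, std::uint16_t /*srcPort*/,
                                 const std::string & /*dstIp*/, std::uint16_t /*dstPort*/) { return true; }
    virtual bool onClientRequest(const std::string & /*url*/) { return true; }
    virtual bool onClientRequestContent(const char * /*content*/, std::size_t /*size*/,
                                        std::uint64_t /*offset*/) { return true; }
    virtual bool onClientResponse(const std::string & /*url*/, std::uint64_t /*size*/) { return true; }
    virtual bool onClientResponseContent(const char * /*content*/, std::size_t /*size*/,
                                         std::uint64_t /*offset*/) { return true; }
    virtual void onClientDisconnect() {}
};

// Example module: a fixed-size quota that runs down as response bytes flow.
class FixedQuota : public QuotaHooks {
public:
    explicit FixedQuota(std::uint64_t bytes) : remaining_(bytes) {}

    // Refuse new requests once nothing is left.
    bool onClientRequest(const std::string &) override { return remaining_ > 0; }

    // Charge each block of response data; abort when it does not fit.
    bool onClientResponseContent(const char *content, std::size_t size,
                                 std::uint64_t) override {
        assert(content != nullptr || size == 0);
        if (size > remaining_) {
            remaining_ = 0;
            return false;
        }
        remaining_ -= size;
        return true;
    }

    std::uint64_t remaining() const { return remaining_; }

private:
    std::uint64_t remaining_;
};
```

Default no-op hooks keep a simple module small: `FixedQuota` only touches the two points where it has something to decide.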

The hardest part is making the hook on quota runout work cleanly. Have a good look at client_db.cc for how the "quota" stuff in there works already.


Cheers,

Pieter

P.S. Might be worth starting a new thread perhaps ?

Same topic though. Rename?


Amos
