Sorry for the late reply. I was traveling for the last two days.

On Wed, Sep 4, 2013 at 10:05 AM, Amos Jeffries <[email protected]> wrote:
>
> On 4/09/2013 7:14 a.m., Niki Gorchilov wrote:
>>
>> 2. We know that 50% of the objects in our cache never get requested
>> second time, thus only creating load on the system to store and later
>> to evict them.
>
> How did you get to that conclusion please?

These are not the squid stats, but our own custom cache peer stats.

>  What Squid version are you using at present?

3.3.8. Waiting for 3.4 :-)

>>   So we prefer to be able to cache on second, third,
>> etc... request without passing the first requests via the peer at all.
>
> You understand that will possibly halve your caching efficiency right? 
> turning the 2-request URLs into MISS+MISS+... and making only the 3-times 
> fetched URLs worth caching...

Yeah. Did the math very carefully :) for the 1+, 2+, 3+, 4+, up to 10+
request scenarios.

> Caching is at its core a tradeoff between storage delays and bandwidth 
> delays. If you explicitly weight it in the direction of bandwidth delays by 
> not caching things on first request the benefits drop off significantly fast.

You are right in general, but not in our specific case. Our cache peer
produces about a 60% cache hit rate. Going from 1st-request to
2nd-request caching will reduce the hit rate to about 58%. That is still
a good result, with much less thrashing of the HDDs. Or we keep the
thrashing, but reduce the cache size by half :)
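
To illustrate the kind of math involved, here is a minimal sketch. It
assumes an idealized cache (no eviction, every object cacheable) and a
list of per-URL request counts; the function name and the sample counts
are hypothetical, not our real numbers, so the toy data does not
reproduce the 60% and 58% figures above.

```python
def admission_stats(request_counts, k):
    """Estimate hit rate and stored objects when caching on the k-th request.

    request_counts: per-URL request totals (hypothetical input).
    Assumes no eviction: once admitted, an object stays in the cache.
    """
    total = sum(request_counts)
    # A URL requested n times is a MISS for its first k requests
    # (the k-th one stores it), then a HIT for the remaining n - k.
    hits = sum(max(0, n - k) for n in request_counts)
    # Only URLs that reach k requests are ever written to disk.
    stored = sum(1 for n in request_counts if n >= k)
    return hits / total, stored


# Made-up distribution: half of the URLs are requested only once.
counts = [1, 1, 2, 5]
for k in (1, 2):
    hit_rate, stored = admission_stats(counts, k)
    print(f"cache on request {k}: hit rate {hit_rate:.2f}, stored {stored}")
```

The point is the tradeoff: raising k lowers the hit rate somewhat, but
it also cuts the number of objects written and later evicted.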

>> Why? Same reasons as above.... ICP is cheap enough for statistics and
>> decision making...
>
> This is not possible with ICP as far as I know.

OK. Clear.

>> Any ideas how to resolve my issue and offload the cache peer by at
>> least 50% of the requests it serves currently?
>
> Answer: Do not use cache existence test(s) to solve access control and 
> routing problems.

Great point :-) Thanks.

> I would use an external_acl_type helper to do the calculation about whether a 
> request was to be cached and set a tag=value on the transaction. The tag type 
> ACL can then test for this tag and do a "cache deny". Since you have all 
> traffic
>
> Something like this:
>
>   external_acl_type tagger ttl=0 %URL ...  (helper returns "OK 
> tag=first-seen" or just "OK").
>   acl firstSeen external tagger
>   acl taggedFirst tag first-seen
>   http_access deny firstSeen !all
>   cache deny !taggedFirst

Yeah. Did something like this, works like a charm.
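
For the archives, a minimal sketch of such a helper. It is not our
production code: the class name, the in-memory LRU table and its size
limit are assumptions for illustration. It assumes the helper runs
without concurrency, so each input line is just the %URL value, and it
answers "OK tag=first-seen" the first time a URL shows up and plain "OK"
afterwards.

```python
import sys
from collections import OrderedDict


class FirstSeenTagger:
    """Remembers recently seen URLs in a bounded LRU table (hypothetical)."""

    def __init__(self, max_entries=1_000_000):
        self.max_entries = max_entries
        self.seen = OrderedDict()

    def respond(self, url):
        if url in self.seen:
            # Seen before: refresh its LRU position and send no tag.
            self.seen.move_to_end(url)
            return "OK"
        self.seen[url] = True
        if len(self.seen) > self.max_entries:
            # Forget the least recently seen URL to bound memory use.
            self.seen.popitem(last=False)
        return "OK tag=first-seen"


def serve(stdin, stdout, tagger=None):
    """Answer one line per request until Squid closes the pipe."""
    tagger = tagger or FirstSeenTagger()
    for line in stdin:
        stdout.write(tagger.respond(line.strip()) + "\n")
        # Flush after every reply; Squid waits for each answer.
        stdout.flush()

# A real helper script would end with: serve(sys.stdin, sys.stdout)
```

The state lives in the helper process only, so with several helper
children each one keeps its own table; a shared store would be needed
for exact counts.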

Even tried to remove all ICP, as it is used only for marking via
qos_flows parent. The helper mostly replicates the logic behind our
custom ICP listener, so returning tag=parent-hit was a no-brainer.
Unfortunately, I discovered that clientside_tos doesn't support
slow ACLs like tag. I believe this should be mentioned somewhere
at http://www.squid-cache.org/Doc/config/clientside_tos/. Will stick
to ICP HIT/MISS & qos_flows for DSCP marking for now, while looking at
the ZPH kernel patch as an alternative.

Thanks, Amos, for all the effort keeping Squid development going
and its community alive :)

Best,
Niki
