So basically, from what I understand, the limit of an AUFS\UFS cache_dir is:
16,777,215 objects.
So for a very loaded system it might be pretty "small".

I asked because:
I have seen the mongodb eCAP adapter that stores chunks, and I didn't like it.
On the other hand, I wrote a cache_dir in GoLang which I am using for the 
Windows Updates caching proxy, and for now it surpasses the AUFS\UFS limits.

Based on the success of the Windows Updates Cache proxy, which strives to cache 
only public objects, I was thinking about writing something similar for more 
general usage.
The basic constraint on what would be cached is that the object must carry 
Cache-Control "public".
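
Just to make that rule concrete, a minimal check in Go could look roughly like 
this (the names are mine, nothing from an existing code base):

package main

import (
    "fmt"
    "net/http"
    "strings"
)

// isExplicitlyPublic reports whether a response carries a Cache-Control
// "public" directive. This is only the single rule described above, not a
// full HTTP cacheability test.
func isExplicitlyPublic(resp *http.Response) bool {
    for _, d := range strings.Split(resp.Header.Get("Cache-Control"), ",") {
        if strings.EqualFold(strings.TrimSpace(d), "public") {
            return true
        }
    }
    return false
}

func main() {
    resp := &http.Response{Header: http.Header{}}
    resp.Header.Set("Cache-Control", "public, max-age=86400")
    fmt.Println(isExplicitlyPublic(resp)) // true
}
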
The first step would be an ICAP service (respmod) which will log requests and 
responses and decide which GET results are worth fetching later.
Squid currently does everything on-the-fly, while the response is being 
delivered to the client.
For an effective cache I believe we can compromise on another approach, one 
which relies on statistics.
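
The ICAP wire handling itself is left out here, but assuming the respmod 
service hands us the encapsulated request and response as net/http values, the 
"worthiness" decision could look roughly like this (recordCandidate is just an 
assumed name):

package main

import (
    "log"
    "net/http"
    "strings"
)

// recordCandidate is an assumed hook the respmod service would call once per
// completed transaction; here it only writes a log line that a later
// statistics pass (or the counter sketched further below) can consume.
func recordCandidate(req *http.Request, resp *http.Response) {
    if req.Method != http.MethodGet || resp.StatusCode != http.StatusOK {
        return // only successful GET results are of interest
    }
    cc := strings.ToLower(resp.Header.Get("Cache-Control"))
    if !strings.Contains(cc, "public") { // simplified form of the check above
        return
    }
    log.Printf("fetch candidate: %s (%s bytes)", req.URL, resp.Header.Get("Content-Length"))
}

func main() {
    req, _ := http.NewRequest(http.MethodGet, "http://example.com/big.bin", nil)
    resp := &http.Response{StatusCode: http.StatusOK, Header: http.Header{}}
    resp.Header.Set("Cache-Control", "public, max-age=3600")
    resp.Header.Set("Content-Length", "1048576")
    recordCandidate(req, resp)
}
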
The first rule is: not everything is worth caching!
Then, after understanding and configuring this, we can move on to fetching 
*public*-only objects once they show a high number of repeated downloads.
This is actually how Google's cache and other similar cache systems work.
They first let traffic reach the "DB" or "DATASTORE" when an object is seen 
for the first time.
Then, once an object crosses a specific threshold, it is fetched by the cache 
system without any connection to the transactions which the clients consume.
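
A rough sketch of that threshold idea in Go (just an illustration; the names 
and the in-memory map are my assumptions, a real system would persist the 
counters and age them out):

package main

import (
    "fmt"
    "sync"
)

// hitCounter is a toy illustration of the threshold idea: count downloads per
// URL and report when a URL first crosses the threshold, so a separate worker
// can fetch and store it without touching any client transaction.
type hitCounter struct {
    mu        sync.Mutex
    hits      map[string]int
    threshold int
}

func newHitCounter(threshold int) *hitCounter {
    return &hitCounter{hits: make(map[string]int), threshold: threshold}
}

// Record notes one more download of url and returns true exactly once, when
// the count first reaches the threshold.
func (c *hitCounter) Record(url string) bool {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.hits[url]++
    return c.hits[url] == c.threshold
}

func main() {
    c := newHitCounter(3)
    for i := 0; i < 4; i++ {
        if c.Record("http://example.com/big.iso") {
            fmt.Println("threshold reached, schedule a background fetch")
        }
    }
}
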
It might not be the most effective caching method for some very loaded systems, 
or for specific big files over *very* high-cost upstream connections, but for 
many setups it will be fine.
The actual logic and implementation can use any of a couple of algorithms, with 
LRU as the default and a couple of others as options.
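
For illustration only, a bare-bones LRU index in Go could look like this (it 
only tracks keys; evicting the actual stored data is out of scope here):

package main

import (
    "container/list"
    "fmt"
)

// lruIndex is a minimal sketch of an LRU replacement policy over object keys.
type lruIndex struct {
    max   int
    order *list.List               // front = most recently used
    items map[string]*list.Element // key -> position in order
}

func newLRUIndex(max int) *lruIndex {
    return &lruIndex{max: max, order: list.New(), items: make(map[string]*list.Element)}
}

// Touch marks key as used, inserting it if new, and returns the key that was
// evicted (if any) to stay within the configured maximum.
func (l *lruIndex) Touch(key string) (evicted string, ok bool) {
    if el, found := l.items[key]; found {
        l.order.MoveToFront(el)
        return "", false
    }
    l.items[key] = l.order.PushFront(key)
    if l.order.Len() <= l.max {
        return "", false
    }
    oldest := l.order.Back()
    l.order.Remove(oldest)
    k := oldest.Value.(string)
    delete(l.items, k)
    return k, true
}

func main() {
    l := newLRUIndex(2)
    for _, k := range []string{"a", "b", "c", "a"} {
        if evicted, ok := l.Touch(k); ok {
            fmt.Println("evict", evicted)
        }
    }
}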

I believe that this logic will be good for certain systems and will remove all 
sorts of weird store\cache_dir limitations.
I already have a ready-to-use system, which I named "YouTube-Store", that allows 
the admin to download specific YouTube videos and serve them from a local 
web service.
It can be used together with an external_acl helper that redirects clients to a 
special page hosting the cached\stored video, with an option to bypass the 
cached version.
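
For reference, the helper side can be very small. A rough, non-concurrent 
sketch in Go, assuming a squid.conf line along the lines of 
"external_acl_type yt_store %URI /usr/local/bin/yt-store-helper" (the helper 
name and the isStored lookup are placeholders of mine):

package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
)

// isStored is a placeholder for the real lookup against the local store.
func isStored(url string) bool {
    return strings.Contains(url, "watch?v=") // placeholder check only
}

func main() {
    in := bufio.NewScanner(os.Stdin)
    out := bufio.NewWriter(os.Stdout)
    for in.Scan() {
        // One URL per line, as defined by the external_acl_type %FORMAT.
        if isStored(strings.TrimSpace(in.Text())) {
            fmt.Fprintln(out, "OK")
        } else {
            fmt.Fprintln(out, "ERR")
        }
        out.Flush() // helpers must answer line by line
    }
}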

I hope to publish this system soon under a BSD license.

Thanks,
Eliezer

----
Eliezer Croitoru
Linux System Administrator
Mobile: +972-5-28704261
Email: elie...@ngtech.co.il



-----Original Message-----
From: squid-users [mailto:squid-users-boun...@lists.squid-cache.org] On Behalf 
Of Alex Rousskov
Sent: Friday, July 14, 2017 20:49
To: Amos Jeffries <squ...@treenet.co.nz>; squid-users@lists.squid-cache.org
Subject: Re: [squid-users] What would be the maximum ufs\aufs cache_dir objects?

On 07/14/2017 10:47 AM, Amos Jeffries wrote:

> One UFS cache_dir can hold a maximum of (2^27)-1 safely. 

You probably meant to say (2^25)-1 but the actual number is (2^24)-1
because the sfileno is signed. This is why you get 16'777'215 (a.k.a.
0xFFFFFF) as the actual limit.


> The index hash entries are stored as a 32-bit bitmask (sfileno) - with 5
> bits for cache_dir ID and 27 bits for hash of the file details.

The cache index entries are hashed on their keys, not file numbers (of
any kind). The index entry is using 25 bits for the file number, but
IIRC, those 25 bits are never merged/combined with the 7 bits of the
cache_dir ID in any meaningful way.


Alex.

> typedef signed_int32_t sfileno;
>     sfileno swap_filen:25; // keep in sync with SwapFilenMax
>     sdirno swap_dirn:7;
> enum { SwapFilenMax = 0xFFFFFF }; // keep in sync with StoreEntry::swap_filen

_______________________________________________
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users
