Re: [squid-users] Large Files Not Caching
On 11/12/15 12:35 PM, Antony Stone wrote:
>>> I'm trying to set up a CDN-like frontend to our (bandwidth-constrained)
>>> master package repository. Everything seems to be working (including
>>> memory cache hits) except for some reason it does not seem to be
>>> caching/keeping large files.
>
> Define "large"?

Sorry. To back up a little:

squid version: 3.4.8-6+deb8u1 (Debian jessie)

With that config, I see memory hits to the cache, working fine. However, if I try to download something that's a couple of MB, it never writes to either cache directory.

I get this in the store.log:

> 1447350253.330 RELEASE -1 41BD9B4385C540AB29F252B7B7DDF41C 200
> 1447350184 1447185078 1447954984 application/x-rpm 2368070/2368070 GET
> http://uk-1.mirrors.opennms.org:3128/yum/stable/common/opennms/opennms-jmx-config-generator-16.0.4-1.noarch.rpm

...and this in the access.log:

> 1447350253.330 7 2606:a000:45e2:1200:f0cb:6c0a:1e57:68bd TCP_MISS/200
> 2368590 GET
> http://uk-1.mirrors.opennms.org:3128/yum/stable/common/opennms/opennms-jmx-config-generator-16.0.4-1.noarch.rpm
> - TIMEOUT_FIRSTUP_PARENT/108.169.150.249 application/x-rpm

On a second hit, I get the same thing: RELEASE and TCP_MISS.

>>> Attached is my configuration. Is there something obvious that I'm missing?
>>>
>>> maximum_object_size 600 MB
>
> I assume you don't mean "it's not caching stuff bigger than 600 Mb"

Hah, no. The goal is to cache the most popular RPM and Debian packages and to spread the load out geographically. Most of them are somewhere between 20-300MB. Unfortunately, right now it seems to only cache what fits in memory.

Also, sorry, just noticed this after I'd already reply-all'd:

> Please reply to the list; please *don't* CC me.

I won't do it again... :/
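For anyone reproducing this from the archive: a quick cross-check, not from the original mail, is to ask the cache manager how much each cache_dir actually holds. This assumes the squidclient tool is installed; manager access from localhost is allowed by the configuration in the original post below.

    squidclient -p 3128 mgr:storedir   # per-cache_dir capacity and usage
    squidclient -p 3128 mgr:info       # overall "Storage Swap size" and hit ratios

If mgr:storedir shows both directories essentially empty while memory hits keep working, the problem is disk admission rather than the on-disk layout.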
[squid-users] Large Files Not Caching
I'm trying to set up a CDN-like frontend to our (bandwidth-constrained) master package repository. Everything seems to be working (including memory cache hits) except for some reason it does not seem to be caching/keeping large files.

Attached is my configuration. Is there something obvious that I'm missing?

acl SSL_ports port 443
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT

acl our_sites dstdomain yum.opennms.org debian.opennms.org maven.opennms.org repo.opennms.org .mirrors.opennms.org .mirrors.opennms.com

acl mirrors src 45.55.163.22/32
acl mirrors src 2604:a880:800:10::60:4001/128
acl mirrors src 104.236.160.233/32
acl mirrors src 2604:a880:1:20::d6:7001/128
acl mirrors src 46.101.6.157/32
acl mirrors src 2a03:b0c0:1:d0::7a:7001/128
acl mirrors src 46.101.211.239/32
acl mirrors src 2a03:b0c0:3:d0::8a:6001/128

http_access deny !Safe_ports
#http_access deny CONNECT !SSL_ports
http_access deny CONNECT

# manager access
http_access allow localhost manager
http_access deny manager

# proxy access
http_access allow our_sites
http_access allow localhost
http_access deny all

# peer access
icp_access allow mirrors
icp_access deny all
icp_port 3130

#http_port 80 accel defaultsite=www.mirrors.opennms.org vhost
#http_port 8080 accel defaultsite=www.mirrors.opennms.org vhost
http_port 3128 accel defaultsite=www.mirrors.opennms.org vhost

coredump_dir /var/spool/squid3
client_ip_max_connections 8

# how much to cache/keep
minimum_object_size 0
maximum_object_size 600 MB
minimum_expiry_time 60 seconds
refresh_pattern . 900 80% 604800
cache allow all
memory_cache_mode disk

cache_peer mirror.internal.opennms.com parent 80 0 no-query originserver name=myAccel
cache_peer_access myAccel allow our_sites
cache_peer_access myAccel deny all

cache_peer ny-1.mirrors.opennms.org sibling 80 3130 name=ny1
cache_peer sf-1.mirrors.opennms.org sibling 80 3130 name=sf1
cache_peer uk-1.mirrors.opennms.org sibling 80 3130 name=uk1
cache_peer de-1.mirrors.opennms.org sibling 80 3130 name=de1
cache_peer_access ny1 allow all
cache_peer_access sf1 allow all
cache_peer_access uk1 allow all
cache_peer_access de1 allow all

cache_dir aufs /var/spool/squid3/cache-small 2000 16 256 min-size=0 max-size=100KB
cache_dir aufs /var/spool/squid3/cache-large 14000 16 256 min-size=100KB max-size=600MB

# cache 404s for 1 minute
negative_ttl 60 seconds
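An archive-reader's aside, not part of the poster's configuration: when a config like this looks sane but objects are still RELEASEd, raising the debug level of Squid's storage manager (debug section 20) usually makes cache.log show what happens to each object at swap-out time. A minimal, hedged sketch:

    # Troubleshooting only - verbose; revert once done.
    debug_options ALL,1 20,3
    cache_store_log /var/log/squid3/store.log   # keep the store.log shown above enabled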
Re: [squid-users] Large Files Not Caching
On Thursday 12 November 2015 at 18:31:10, Benjamin Reed wrote:

> I'm trying to set up a CDN-like frontend to our (bandwidth-constrained)
> master package repository. Everything seems to be working (including
> memory cache hits) except for some reason it does not seem to be
> caching/keeping large files.

Define "large"?

> Attached is my configuration. Is there something obvious that I'm missing?
>
> maximum_object_size 600 MB

I assume you don't mean "it's not caching stuff bigger than 600 Mb" :)

Antony.

--
This sentence contains exactly threee erors.

Please reply to the list; please *don't* CC me.
Re: [squid-users] Large Files and Reverse proxy
On Friday 29 August 2008 21:44:48 Henrik Nordstrom wrote:
> On Fri, 2008-08-29 at 14:08 +0100, Simon Waters wrote:
> > I don't care if Squid does a refresh query for an 8MB object, indeed
> > I'm happy for it to check freshness every time such an object is
> > fetched if needed to comply with HTTP RFCs, I was just concerned that
> > Squid is fetching the whole 8MB file many times a day.
>
> Do the objects have a usable cache validator (Last-Modified / ETag) and
> does the web server respond properly on cache validation requests
> (If-Modified-Since / If-None-Match)?

Yes (including ETags, although I believe that is the default Apache behaviour), and yes. It does sometimes do refresh queries instead of full fetches, just not as often as I'd hope.
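A hedged way to check both of Henrik's questions from the command line (not from the original thread; the URL and date are placeholders): request the headers, then repeat the request conditionally and look for a 304.

    curl -sI http://www.example.com/video/big.wmv
    # suppose the reply contains: Last-Modified: Tue, 26 Aug 2008 10:00:00 GMT
    curl -sI -H 'If-Modified-Since: Tue, 26 Aug 2008 10:00:00 GMT' \
        http://www.example.com/video/big.wmv

A well-behaved origin answers the second request with "HTTP/1.1 304 Not Modified" and no body, which is exactly the cheap revalidation Simon is hoping to see instead of full 8MB transfers.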
Re: [squid-users] Large Files and Reverse proxy
On Friday 29 August 2008 03:40:21 Amos Jeffries wrote:
> > For various reasons we have a number of multimedia files on this end of
> > the connection, all large, and all with no explicit expiry information
> > (which I can adjust if it helps).
>
> That will help. Enormously. The longer it can be explicitly known
> cacheable the better (RFC states only up to a year though).

Can I ask why? Is the default LRU or heap LFUDA policy concerned with expiry dates?

> > However are there other likely gotchas with handling larger files?
>
> Some people find it more efficient to store them on disk rather than in
> memory. If your squid is already 64-bit or handling it nicely then no
> problem.

I don't think there is a performance issue here with memory. I think it is just down to how the proxy decides which files to keep. As I said, the goal is to offload bandwidth usage.

I'm pondering dropping the caching of small objects, since mostly they cause a REFRESH_HIT in this reverse proxy configuration, and the saving on bandwidth isn't huge (although presumably it saves a trip over a congested link). There is also a large number of small image files which I believe can have a long expiry date set in Apache; I just need to check that with the guy who did the file naming algorithm. This would probably be a bigger win.

Perhaps I just need a cache which is larger than all the data to be served - which might be possible to organise given the current price of disk space.
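Dropping the caching of small objects, as pondered above, maps onto a single squid.conf directive; a minimal sketch with a made-up threshold:

    # Illustrative value only: skip caching of anything under 256 KB so the
    # 17GB cache is reserved for the large media files.
    minimum_object_size 256 KB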
Re: [squid-users] Large Files and Reverse proxy
Simon Waters wrote:
> On Friday 29 August 2008 03:40:21 Amos Jeffries wrote:
> > > For various reasons we have a number of multimedia files on this end
> > > of the connection, all large, and all with no explicit expiry
> > > information (which I can adjust if it helps).
> >
> > That will help. Enormously. The longer it can be explicitly known
> > cacheable the better (RFC states only up to a year though).
>
> Can I ask why? Is the default LRU or heap LFUDA policy concerned with
> expiry dates?

With known expiry info, Squid can calculate fresh/stale properly. Without it, Squid has to estimate and periodically refreshes the object. The LRU/LFUDA algorithms are only related to garbage collection on objects in the cache.

refresh_pattern can tune the freshness estimation algorithm in your cache. But nothing beats proper authoritative info about an object's freshness, since that will affect every cache on the web.

> > > However are there other likely gotchas with handling larger files?
> >
> > Some people find it more efficient to store them on disk rather than in
> > memory. If your squid is already 64-bit or handling it nicely then no
> > problem.
>
> I don't think there is a performance issue here with memory. I think it
> is just down to how the proxy decides which files to keep. As I said, the
> goal is to offload bandwidth usage.

There's an issue in Squid-2.x related to the size of memory chunks that can play badly with very large files. If you are not seeing that, then fine.

> I'm pondering dropping the caching of small objects, since mostly they
> cause a REFRESH_HIT in this reverse proxy configuration, and the saving
> on bandwidth isn't huge (although presumably it saves a trip over a
> congested link). There is also a large number of small image files which
> I believe can have a long expiry date set in Apache; I just need to check
> that with the guy who did the file naming algorithm. This would probably
> be a bigger win.
>
> Perhaps I just need a cache which is larger than all the data to be
> served - which might be possible to organise given the current price of
> disk space.

Amos

--
Please use Squid 2.7.STABLE4 or 3.0.STABLE8
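To make the freshness estimation concrete (an illustrative sketch, not from the thread): without server-supplied expiry information, Squid heuristically treats an object as fresh for roughly (Date - Last-Modified) x lm-factor, clamped between the min and max fields of the matching refresh_pattern rule:

    #                  regex    min(minutes)  lm-factor  max(minutes)
    refresh_pattern -i \.wmv$   10080         90%        525600
    refresh_pattern .           0             20%        4320

With the first line, a WMV file last modified ten days ago would be considered fresh for about nine days before Squid revalidates it; the pattern and numbers here are examples, not a recommendation.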
Re: [squid-users] Large Files and Reverse proxy
On Friday 29 August 2008 13:41:14 Amos Jeffries wrote:
> Simon Waters wrote:
> > On Friday 29 August 2008 03:40:21 Amos Jeffries wrote:
> > > > For various reasons we have a number of multimedia files on this
> > > > end of the connection, all large, and all with no explicit expiry
> > > > information (which I can adjust if it helps).
> > >
> > > That will help. Enormously. The longer it can be explicitly known
> > > cacheable the better (RFC states only up to a year though).
> >
> > Can I ask why? Is the default LRU or heap LFUDA policy concerned with
> > expiry dates?
>
> With known expiry info, Squid can calculate fresh/stale properly.
> Without it, Squid has to estimate and periodically refreshes the object.
> The LRU/LFUDA algorithms are only related to garbage collection on
> objects in the cache.

Perhaps I wasn't clear. I don't care if Squid does a refresh query for an 8MB object; indeed I'm happy for it to check freshness every time such an object is fetched, if needed to comply with HTTP RFCs. I was just concerned that Squid is fetching the whole 8MB file many times a day.

It may be that Squid is doing a sensible thing with the available resources! But when I see the whole 8MB file shipped, at one point with a 15-second interval between transfers to the proxy, I do wonder how it became the least-recently-used object out of 17GB of data in 15 seconds (our proxy isn't THAT busy), and whether I'm missing something basic about the performance of the cache.
Re: [squid-users] Large Files and Reverse proxy
On Fri, 2008-08-29 at 14:08 +0100, Simon Waters wrote:
> I don't care if Squid does a refresh query for an 8MB object, indeed I'm
> happy for it to check freshness every time such an object is fetched if
> needed to comply with HTTP RFCs, I was just concerned that Squid is
> fetching the whole 8MB file many times a day.

Do the objects have a usable cache validator (Last-Modified / ETag) and does the web server respond properly on cache validation requests (If-Modified-Since / If-None-Match)?

Regards
Henrik
[squid-users] Large Files and Reverse proxy
Goal: maximise byte hit rate, to alleviate a bandwidth issue across a limited connection (till it is upgraded, or the systems relocated).

I have Squid configured as a reverse proxy, and the fact that I now have some free bandwidth shows it is doing something useful - much thanks!

However, revisiting my bandwidth statistics surprised me: I'm still shipping a lot more duplicated content over our connection to the Squid proxy than I expected. The biggest offender is a WMV file (it was the biggest offender first thing this week when I realised I hadn't set the maximum size of objects high enough to get it cached! It is still the biggest offender).

For various reasons we have a number of multimedia files on this end of the connection, all large, and all with no explicit expiry information (which I can adjust if it helps).

What I am hoping is that I can persuade Squid to do a TCP_REFRESH_HIT and burn 350-odd bytes instead of 8 MB when serving our most popular WMV file across this connection, or other media files it has cached.

Squid has 420MB of RAM and 17GB of cache (now all populated). Tuesday, before the cache was full, it was behaving as I expected, since I explicitly tested the top WMV file after spotting the object size mistake. And now it intermittently does what I want it to. I assume this is simply that the cache is full, and that it is choosing to drop this object from cache.

I see advice to try heap LFUDA as the cache policy for maximising byte hit rate - which I will try. However, are there other likely gotchas with handling larger files? Are there other levers to twiddle to persuade Squid to hang onto larger files?

I'm seeing HITs or REFRESHes about 50% of the time; the other 50% are straight TCP_MISS for my worst WMV. Bandwidth figures suggest similar results for other files.
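The heap LFUDA advice mentioned above is a one-line change; a minimal sketch with illustrative sizes (and one hedge: in the versions I know, cache_replacement_policy must appear before the cache_dir lines it should apply to):

    # LFUDA keeps frequently-used objects regardless of size, which favours
    # byte hit rate over request hit rate.
    cache_replacement_policy heap LFUDA
    cache_dir aufs /var/spool/squid 17000 16 256
    maximum_object_size 32 MB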
Re: [squid-users] Large Files and Reverse proxy
> Goal: maximise byte hit rate, to alleviate a bandwidth issue across a
> limited connection (till it is upgraded, or the systems relocated).
>
> I have Squid configured as a reverse proxy, and the fact that I now have
> some free bandwidth shows it is doing something useful - much thanks!
>
> However, revisiting my bandwidth statistics surprised me: I'm still
> shipping a lot more duplicated content over our connection to the Squid
> proxy than I expected. The biggest offender is a WMV file (it was the
> biggest offender first thing this week when I realised I hadn't set the
> maximum size of objects high enough to get it cached! It is still the
> biggest offender).
>
> For various reasons we have a number of multimedia files on this end of
> the connection, all large, and all with no explicit expiry information
> (which I can adjust if it helps).

That will help. Enormously. The longer it can be explicitly known cacheable the better (RFC states only up to a year though).

> What I am hoping is that I can persuade Squid to do a TCP_REFRESH_HIT and
> burn 350-odd bytes instead of 8 MB when serving our most popular WMV file
> across this connection, or other media files it has cached.
>
> Squid has 420MB of RAM and 17GB of cache (now all populated). Tuesday,
> before the cache was full, it was behaving as I expected, since I
> explicitly tested the top WMV file after spotting the object size
> mistake. And now it intermittently does what I want it to. I assume this
> is simply that the cache is full, and that it is choosing to drop this
> object from cache.
>
> I see advice to try heap LFUDA as the cache policy for maximising byte
> hit rate - which I will try. However, are there other likely gotchas with
> handling larger files?

Some people find it more efficient to store them on disk rather than in memory. If your squid is already 64-bit or handling it nicely then no problem.

> Are there other levers to twiddle to persuade Squid to hang onto larger
> files? I'm seeing HITs or REFRESHes about 50% of the time; the other 50%
> are straight TCP_MISS for my worst WMV. Bandwidth figures suggest similar
> results for other files.

Amos
Re: [squid-users] Large Files
Lighttpd uses it... Is it just that it would require a substantial redesign? (he says, completely ignorant of the internals...)

On 2006/09/06, at 6:51 PM, Henrik Nordstrom wrote:
> > - How does sendfile support in 2.6 affect this?
>
> It doesn't. Not really usable for Squid. Using sendfile outside one
> thread/process per request designs is not trivial. So it works quite
> nicely for most traditional servers, but not that well for event loop
> based ones.

--
Mark Nottingham [EMAIL PROTECTED]
Re: [squid-users] Large Files
Thu 2006-09-07 at 10:24 -0700, Mark Nottingham wrote:
> Lighttpd uses it... Is it just that it would require a substantial
> redesign? (he says, completely ignorant of the internals...)

It works fine if the number of files is small and they all fit in the filesystem cache. Using sendfile efficiently when the number of files is large, with a high likelihood that the data is not found in the filesystem cache, is quite problematic, unless you accept blocking while the filesystem buffers get filled (which we don't...).

Regards
Henrik
Re: [squid-users] Large Files
On Thu, Sep 07, 2006, Mark Nottingham wrote:
> Lighttpd uses it... Is it just that it would require a substantial
> redesign? (he says, completely ignorant of the internals...)

It wouldn't give Squid much of a performance benefit considering the codebase and how we do IO.

Lighttpd currently implements its disk IO using a blocking method. If they wanted to change to async they'd probably have to drop the sendfile() support. (Or, really, send someone on a trip to implement some slightly better kernel interfaces for async disk/network IO.)

Adrian
Re: [squid-users] Large Files
Fri 2006-09-08 at 07:14 +0800, Adrian Chadd wrote:
> Lighttpd currently implements its disk IO using a blocking method. If
> they wanted to change to async they'd probably have to drop the
> sendfile() support.

Really? Async sendfile is effectively the same as async read, just with an fd as output instead of memory.

Regards
Henrik
Re: [squid-users] Large Files
On Fri, Sep 08, 2006, Henrik Nordstrom wrote:
> Fri 2006-09-08 at 07:14 +0800, Adrian Chadd wrote:
> > Lighttpd currently implements its disk IO using a blocking method. If
> > they wanted to change to async they'd probably have to drop the
> > sendfile() support.
>
> Really? Async sendfile is effectively the same as async read, just with
> an fd as output instead of memory.

Hey, cool! If the async sendfile() works then even better.

Adrian
Re: [squid-users] Large Files
Hi, Adrian:

> Hm!
>
> What, it aborted even though others were downloading the file?

Yes, large files are hard for Squid to fetch and cache if they are concurrently accessed by many clients using multi-threaded download tools.

For example, if the first client requests one cache-miss file (say 2 megabytes large) in two threads, the first thread requests range 0-1 megabytes and the second thread requests range 1-2 megabytes. If the first request is aborted before more than 1 megabyte has been received, we often find Squid's back-side connection stopped at the 1-megabyte point, so the file can never be fetched whole and cached normally.

Our purpose:
1. Squid can cache large files normally.
2. Squid can serve multi-threaded (Range) requests once large files are cached. (Squid responds TCP_HIT/206 once the file is cached.)

Our configuration:

range_offset_limit -1 KB  # support byte ranges on the front side, and request without a Range header on the back side
quick_abort_min -1 KB     # keep fetching the whole file, no matter when clients abort

Adrian, please help take a look. Thank you in advance.

Adam

-----Original Message-----
From: Adrian Chadd [mailto:[EMAIL PROTECTED]]
Sent: 5 September 2006 10:00
To: adam.cheng
Cc: squid-users@squid-cache.org
Subject: Re: [squid-users] Large Files

On Tue, Sep 05, 2006, adam.cheng wrote:
> Hi, Adrian
>
> We have a similar implementation as YouTube. But when we deploy squid for
> large files, we found quick_abort does not work well. The problem looks
> like this:
>
> When there is a cache miss, it is very hard for Squid to get a whole
> large file and cache it if lots of multi-threaded download tools, like
> FlashGet or NetTransport, are accessing the file at the same time.
> Observing through a sniffer, we found the back-side connection often
> stopped at the point where the first thread stopped.

Hm!

What, it aborted even though others were downloading the file?

Adrian
[squid-users] Large Files
Hi, Adrian:

> Hm!
>
> What, it aborted even though others were downloading the file?

Yes, large files are hard for Squid to fetch and cache if they are accessed simultaneously by many clients using multi-threaded download tools.

For example, if the first client requests one cache-miss file (say 2 megabytes large) in two threads, the first thread requests range 0-1 megabytes and the second thread requests range 1-2 megabytes. If the first request is aborted before more than 1 megabyte has been received, we often find Squid's back-side connection stopped at the 1-megabyte point, so the file can never be fetched whole and cached normally.

Our purpose:
1. Squid can cache large files normally.
2. Squid can serve multi-threaded (Range) requests once large files are cached. (Squid responds TCP_HIT/206 once the file is cached.)

Our configuration:

range_offset_limit -1 KB  # support byte ranges at the front side, and request without a Range header at the back side
quick_abort_min -1 KB     # keep fetching the whole file, no matter when clients abort

Adrian, please help take a look. Thank you in advance.

Adam

> On Tue, Sep 05, 2006, adam.cheng wrote:
> > Hi, Adrian
> >
> > We have a similar implementation as YouTube. But when we deploy squid
> > for large files, we found quick_abort does not work well. The problem
> > looks like this:
> >
> > When there is a cache miss, it is very hard for Squid to get a whole
> > large file and cache it if lots of multi-threaded download tools, like
> > FlashGet or NetTransport, are accessing the file at the same time.
> > Observing through a sniffer, we found the back-side connection often
> > stopped at the point where the first thread stopped.
>
> Hm!
>
> What, it aborted even though others were downloading the file?
>
> Adrian
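An archive note, not from the thread itself: the same symptom - several clients needing to share one upstream miss - is also what the collapsed_forwarding directive addresses (it exists in Squid 2.6/2.7 and was reintroduced in 3.5; hedged accordingly for other versions). A sketch combining it with Adam's settings:

    # Merge concurrent misses for the same URL into a single upstream fetch.
    collapsed_forwarding on
    range_offset_limit -1     # always fetch the whole object, ignoring Range offsets
    quick_abort_min -1 KB     # finish the fetch even after all clients abort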
Re: [squid-users] Large Files
Hi, Adrian

We have a similar implementation as YouTube. But when we deploy squid for large files, we found quick_abort does not work well. The problem looks like this:

When there is a cache miss, it is very hard for Squid to get a whole large file and cache it if lots of multi-threaded download tools, like FlashGet or NetTransport, are accessing the file at the same time. Observing through a sniffer, we found the back-side connection often stopped at the point where the first thread stopped.

-----Original Message-----
From: Adrian Chadd [mailto:[EMAIL PROTECTED]]
Sent: 2 September 2006 8:58
To: Mark Nottingham
Cc: squid-users@squid-cache.org
Subject: Re: [squid-users] Large Files

On Fri, Sep 01, 2006, Mark Nottingham wrote:
> I'd appreciate some enlightenment as to how Squid handles large files
> WRT memory and disk.
>
> In particular;
>
> - squid.conf says that memory is used for in-transit objects.
> What exactly is kept in memory for in-transit objects; just metadata,
> or the whole thing?

Squid used to keep the whole thing in memory. It now:

* can keep the whole object in memory
* if memory is needed, and the object is being swapped out, Squid will
  'free' the start of the object in memory - the in-memory copy is
  then not used to serve replies but is just there to be written to
  disk. All subsequent hits come from disk.

> - if something is in memory cache, does it get copied when it is
> requested (because it is in-transit)?

The whole object isn't copied during a memory hit; only the current '4k' range being read.

> - How does sendfile support in 2.6 affect this?

Sendfile support? :)

> - Does anyone have any experiences they'd care to relate regarding
> memory-caching very large objects?

Not yet; but there's a project on my plate to evaluate this to begin caching p2p and youtube stuff a bit better.

Adrian
Re: [squid-users] Large Files
On Tue, Sep 05, 2006, adam.cheng wrote:
> Hi, Adrian
>
> We have a similar implementation as YouTube. But when we deploy squid for
> large files, we found quick_abort does not work well. The problem looks
> like this:
>
> When there is a cache miss, it is very hard for Squid to get a whole
> large file and cache it if lots of multi-threaded download tools, like
> FlashGet or NetTransport, are accessing the file at the same time.
> Observing through a sniffer, we found the back-side connection often
> stopped at the point where the first thread stopped.

Hm!

What, it aborted even though others were downloading the file?

Adrian
[squid-users] Large Files
I'd appreciate some enlightenment as to how Squid handles large files WRT memory and disk. In particular:

- squid.conf says that memory is used for in-transit objects. What exactly is kept in memory for in-transit objects; just metadata, or the whole thing?

- If something is in memory cache, does it get copied when it is requested (because it is in-transit)?

- How does sendfile support in 2.6 affect this?

- Does anyone have any experiences they'd care to relate regarding memory-caching very large objects?

Thanks!

--
Mark Nottingham [EMAIL PROTECTED]
Re: [squid-users] Large Files
On Fri, Sep 01, 2006, Mark Nottingham wrote:
> I'd appreciate some enlightenment as to how Squid handles large files
> WRT memory and disk.
>
> - squid.conf says that memory is used for in-transit objects.
> What exactly is kept in memory for in-transit objects; just metadata,
> or the whole thing?

Squid used to keep the whole thing in memory. It now:

* can keep the whole object in memory
* if memory is needed, and the object is being swapped out, Squid will
  'free' the start of the object in memory - the in-memory copy is
  then not used to serve replies but is just there to be written to
  disk. All subsequent hits come from disk.

> - if something is in memory cache, does it get copied when it is
> requested (because it is in-transit)?

The whole object isn't copied during a memory hit; only the current '4k' range being read.

> - How does sendfile support in 2.6 affect this?

Sendfile support? :)

> - Does anyone have any experiences they'd care to relate regarding
> memory-caching very large objects?

Not yet; but there's a project on my plate to evaluate this to begin caching p2p and youtube stuff a bit better.

Adrian
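The behaviour Adrian describes is bounded by two squid.conf directives; a minimal sketch with illustrative values (not a recommendation from the thread):

    # cache_mem sizes the pool that holds in-transit and hot objects;
    # maximum_object_size_in_memory caps which single objects may stay there.
    cache_mem 256 MB
    maximum_object_size_in_memory 512 KB

Anything larger than the in-memory cap follows the swap-out path described above: buffered briefly, written to disk, and served from disk on later hits.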