Time for a Bug Hunt
Hey everybody,

We still have a few open RC bugs which should be solvable given a few days each with a dedicated researcher. They are all blocking 3.1 release and some are causing major headaches to users of 3.0.

2305 - auth assertions (under refcounting?)
2424 - FD 'leaks' due to auth locks (over refcounting?)
2524 - range request connection handling
2517 - LDAP helper memory leaks

These are in HEAD and need testing by someone other than me. Just waiting on next snapshot (200812**) to build a testable bundle of code.

2526 - ACL checklist default ALLOW.
2395 - FTP error messages (407 particularly)

These are serious but are known to require a whole re-design of something:

2459 - dns_error_message system
2404 - WCCP in mask mode

The current RC blocker bugs list can be seen at
http://www.squid-cache.org/bugs/buglist.cgi?query_format=advanced&product=Squid&product=Website&target_milestone=3.0&target_milestone=3.1&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_severity=blocker&bug_severity=critical&bug_severity=major&emailtype1=substring&email1=&emailtype2=substring&email2=&bugidtype=include&order=bugs.bug_severity%2Cbugs.bug_id&chfieldto=Now&cmdtype=doit

Amos Jeffries
Squid-3 Release Maintainer
The cache deny QUERY change... partial rollback?
After analyzing a large cache whose hit ratio has declined significantly over the last months, I have come to the conclusion that the removal of "cache deny QUERY" can have a very negative impact on hit ratio. This is due to a number of flash video sites (youtube, google, various porno sites etc) which include per-view unique query parameters in the URL and respond with a cacheable response.

Because of this I suggest that we add the cache deny rule back to the recommended config, but leave the refresh_pattern change as-is.

People running reverse proxies, or combating these cache-busting sites using store rewrites, know how to change the cache rules, while many users running general proxy servers are quite negatively impacted by these sites if caching of query URLs is allowed.

Regards
Henrik
Re: The cache deny QUERY change... partial rollback?
2008/12/1 Henrik Nordstrom [EMAIL PROTECTED]:
> After analyzing a large cache whose hit ratio has declined significantly over the last months, I have come to the conclusion that the removal of "cache deny QUERY" can have a very negative impact on hit ratio. This is due to a number of flash video sites (youtube, google, various porno sites etc) which include per-view unique query parameters in the URL and respond with a cacheable response.
>
> Because of this I suggest that we add the cache deny rule back to the recommended config, but leave the refresh_pattern change as-is.
>
> People running reverse proxies, or combating these cache-busting sites using store rewrites, know how to change the cache rules, while many users running general proxy servers are quite negatively impacted by these sites if caching of query URLs is allowed.

Hm, that's kind of interesting actually. What's it displacing from the cache? Is the drop in hit ratio due to the removal of other cacheable large objects, or other cacheable small objects? Is it -just- flash video that's exhibiting this behaviour?

Are you able to put up some examples and statistics?

I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content.

Adrian
Re: The cache deny QUERY change... partial rollback?
On Mon, 2008-12-01 at 09:40 -0500, Adrian Chadd wrote:
> Hm, that's kind of interesting actually. What's it displacing from the cache? Is the drop in hit ratio due to the removal of other cacheable large objects, or other cacheable small objects? Is it -just- flash video that's exhibiting this behaviour?

The studied cache is using LRU, and these flash videos effectively reduce the cache size by filling the cache with large objects that are never referenced again.

> Are you able to put up some examples and statistics?

I'll try.

> I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content.

Probably true. Based on the URL styles there seem to be only two or three of these authentication/session schemes.

Regards
Henrik
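The LRU pollution described above can be illustrated with a small Python simulation. This is only a sketch, not Squid code: the byte-bounded `LruCache`, the example URLs, the object sizes and the `deny_query` switch are all assumptions made up for the illustration. `deny_query=True` stands in for the effect of "cache deny QUERY", i.e. objects whose URL carries a per-view token are simply not stored.

```python
from collections import OrderedDict


class LruCache:
    """Toy byte-bounded LRU cache keyed by the full URL (query string included)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()  # url -> size, least recently used first

    def request(self, url, size, cacheable=True):
        """Return True on a hit; on a miss, store the object if allowed."""
        if url in self.items:
            self.items.move_to_end(url)
            return True
        if cacheable and size <= self.capacity:
            # Evict least recently used objects until the new one fits.
            while self.used + size > self.capacity:
                _, evicted_size = self.items.popitem(last=False)
                self.used -= evicted_size
            self.items[url] = size
            self.used += size
        return False


def run(deny_query, rounds=50):
    """Hit ratio for a hypothetical workload of static files plus one-off videos."""
    cache = LruCache(capacity_bytes=100)
    hits = requests = 0
    for view in range(rounds):
        # Ten small static objects that are re-requested in every round.
        for n in range(10):
            requests += 1
            hits += cache.request(f"http://example.com/static/{n}.css", 5)
        # One large video whose URL carries a per-view unique token.
        url = f"http://video.example/get?id=42&token={view}"
        requests += 1
        hits += cache.request(url, 60, cacheable=not deny_query)
    return hits / requests


if __name__ == "__main__":
    print("caching query URLs:", run(deny_query=False))
    print("denying query URLs:", round(run(deny_query=True), 3))
```

In this deliberately adversarial toy workload the cache never gets a hit once the tokenised videos are stored, while refusing to store them lets every static object hit after the first round. Real traffic is far less extreme, so the numbers only show the direction of the effect, not its size.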
ICAP, bypassing respmod depending reqmod result
Hi,

I am trying to add an extension to the ICAP layer in Squid 3.0 STABLE 10. Our service works in both respmod and reqmod. In some cases we know right after reqmod that the data should be passed directly to the client. In those cases we would like to be more efficient by not using respmod at all: the server's response would be sent directly to the client, bypassing the ICAP server's respmod.

Does anyone have an idea how to decide, after the reqmod reply, whether the HTTP server response will be sent to the ICAP server's respmod or directly to the client? Any suggestions on how to trick the mechanism?

Thanks
Allot, Moshe Beeri.
Re: The cache deny QUERY change... partial rollback?
Henrik Nordstrom wrote:
> After analyzing a large cache whose hit ratio has declined significantly over the last months, I have come to the conclusion that the removal of "cache deny QUERY" can have a very negative impact on hit ratio. This is due to a number of flash video sites (youtube, google, various porno sites etc) which include per-view unique query parameters in the URL and respond with a cacheable response.
>
> Because of this I suggest that we add the cache deny rule back to the recommended config, but leave the refresh_pattern change as-is.
>
> People running reverse proxies, or combating these cache-busting sites using store rewrites, know how to change the cache rules, while many users running general proxy servers are quite negatively impacted by these sites if caching of query URLs is allowed.

Having a single recommended config seems dubious: I for one never run squid as a forward proxy, for instance. We should probably split apart the default / recommended forward and reverse configurations (which are just starting points, right?) and document how to tell which one to start with.

Tres.
--
Tres Seaver +1 540-429-0999 [EMAIL PROTECTED]
Palladion Software
Excellence by Design
http://palladion.com
Re: The cache deny QUERY change... partial rollback?
On Mon, 2008-12-01 at 11:00 -0500, Tres Seaver wrote:
> Having a single recommended config seems dubious: I for one never run squid as a forward proxy, for instance. We should probably split apart the default / recommended forward and reverse configurations (which are just starting points, right?) and document how to tell which one to start with.

The example/default configuration shipped with Squid is that of a normal proxy. Reverse proxies do need some changes to that config; it's unavoidable.

Also, if your site is sane then you use query parameters in a sane manner and this whole discussion is irrelevant.

Regards
Henrik
Re: Time for a Bug Hunt
Amos Jeffries wrote:
> Hey everybody,
>
> We still have a few open RC bugs which should be solvable given a few days each with a dedicated researcher. They are all blocking 3.1 release and some are causing major headaches to users of 3.0.
>
> 2305 - auth assertions (under refcounting?)
> 2424 - FD 'leaks' due to auth locks (over refcounting?)
> 2524 - range request connection handling
> 2517 - LDAP helper memory leaks
>
> These are in HEAD and need testing by someone other than me. Just waiting on next snapshot (200812**) to build a testable bundle of code.
>
> 2526 - ACL checklist default ALLOW.

The ICAP ACLs in squid3.HEAD are broken too. For example, the following:

  acl com dstdomain .com
  adaptation_access class_clamresp deny com
  adaptation_access class_clamresp allow all

results in the responses from all sites (.com and others) being sent to the ICAP server. If I omit the "allow all" line it works well (.com sites are not sent, but other sites are)...

Grr.

> 2395 - FTP error messages (407 particularly)
>
> These are serious but are known to require a whole re-design of something:
>
> 2459 - dns_error_message system
> 2404 - WCCP in mask mode
>
> The current RC blocker bugs list can be seen at
> http://www.squid-cache.org/bugs/buglist.cgi?query_format=advanced&product=Squid&product=Website&target_milestone=3.0&target_milestone=3.1&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_severity=blocker&bug_severity=critical&bug_severity=major&emailtype1=substring&email1=&emailtype2=substring&email2=&bugidtype=include&order=bugs.bug_severity%2Cbugs.bug_id&chfieldto=Now&cmdtype=doit
>
> Amos Jeffries
> Squid-3 Release Maintainer
Re: /bzr/squid3/trunk/ r9386: Bug 2395: FTP auth errors not displayed
Isn't the real question who called FwdState::complete()? It's only meant to be called when all data has been placed into the entry, not before.

On Tue, 2008-12-02 at 00:56 +1300, Amos Jeffries wrote:
> revno: 9386
> committer: Amos Jeffries [EMAIL PROTECTED]
> branch nick: trunk
> timestamp: Tue 2008-12-02 00:56:34 +1300
> message:
>   Bug 2395: FTP auth errors not displayed
>
>   It appears to be the StoreEntry reporting an error on zero-length objects. This somehow overrides the FTP reported error and aborts the reply page.
>
>   Add an extra check to prevent StoreEntry::complete() being called too early on error responses.
> modified:
>   src/forward.cc
>   src/ftp.cc
>
> plain text document attachment (r9386.diff)
> === modified file 'src/forward.cc'
> --- a/src/forward.cc	2008-10-16 04:51:12 +0000
> +++ b/src/forward.cc	2008-12-01 11:56:34 +0000
> @@ -335,9 +335,12 @@
>          startComplete(servers);
>      } else {
> -        debugs(17, 3, "fwdComplete: not re-forwarding status " << entry->getReply()->sline.status);
> -        EBIT_CLR(entry->flags, ENTRY_FWD_HDR_WAIT);
> -        entry->complete();
> +        debugs(17, 3, "fwdComplete: server FD " << server_fd << " not re-forwarding status " << entry->getReply()->sline.status);
> +        if (entry->isEmpty() && !err)
> +        {
> +            EBIT_CLR(entry->flags, ENTRY_FWD_HDR_WAIT);
> +            entry->complete();
> +        }
>          if (server_fd < 0)
>              completed();
> === modified file 'src/ftp.cc'
> --- a/src/ftp.cc	2008-09-24 13:21:04 +0000
> +++ b/src/ftp.cc	2008-12-01 11:56:34 +0000
> @@ -1991,7 +1991,7 @@
>  ftpReadPass(FtpStateData * ftpState)
>  {
>      int code = ftpState->ctrl.replycode;
> -    debugs(9, 3, HERE);
> +    debugs(9, 3, HERE << "code=" << code);
>      if (code == 230) {
>          ftpSendType(ftpState);
> @@ -3462,7 +3462,11 @@
>  static void
>  ftpFail(FtpStateData *ftpState)
>  {
> -    debugs(9, 3, HERE);
> +    debugs(9, 6, HERE << "flags(" <<
> +           (ftpState->flags.isdir?"IS_DIR,":"") <<
> +           (ftpState->flags.try_slash_hack?"TRY_SLASH_HACK":"") << "), " <<
> +           "mdtm=" << ftpState->mdtm << ", size=" << ftpState->theSize <<
> +           "slashhack=" << (ftpState->request->urlpath.caseCmp("/%2f", 4)==0? "T":"F") );
>      /* Try the / hack to support Netscape FTP URL's for retreiving files */
>      if (!ftpState->flags.isdir &&	/* Not a directory */
> @@ -3491,6 +3495,7 @@
>  void
>  FtpStateData::failed(err_type error, int xerrno)
>  {
> +    debugs(9,3,HERE << " entry-null=" << (entry?entry->isEmpty():0) << ", entry=" << entry);
>      if (entry->isEmpty())
>          failedErrorMessage(error, xerrno);
Re: The cache deny QUERY change... partial rollback?
> On Mon, 2008-12-01 at 09:40 -0500, Adrian Chadd wrote:
> > Hm, that's kind of interesting actually. What's it displacing from the cache? Is the drop in hit ratio due to the removal of other cacheable large objects, or other cacheable small objects? Is it -just- flash video that's exhibiting this behaviour?
>
> The studied cache is using LRU, and these flash videos effectively reduce the cache size by filling the cache with large objects that are never referenced again.
>
> > Are you able to put up some examples and statistics?
>
> I'll try.
>
> > I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content.
>
> Probably true. Based on the URL styles there seem to be only two or three of these authentication/session schemes.
>
> Regards
> Henrik

A global blockade is a little harsh when it's only a few offenders.

If we can locate a pattern to match just these sites while any dialogue is going on, I'd be happy to support a reversal for just them. That would keep most of the main bandwidth gains from doing it in the first place.

Amos
Re: The cache deny QUERY change... partial rollback?
On Tue, 2008-12-02 at 12:35 +1300, Amos Jeffries wrote:
> A global blockade is a little harsh when it's only a few offenders.
>
> If we can locate a pattern to match just these sites while any dialogue is going on, I'd be happy to support a reversal for just them. That would keep most of the main bandwidth gains from doing it in the first place.

In the analyzed cache there were no identified query objects > 10 MB without session identifiers in the query parameters.

These objects came from a wide range of sites, with some being more prominent than others. The majority were flash videos, but not all. There were also software downloads, and some other data.

Among the flash video sites, there were about 3 different styles in how the query parameters were encoded, suggesting that there are about as many providers of the software used, or that the styles may be related to CDN networks (not sure, as it's impossible to tell from the URL alone).

Regards
Henrik
Re: Time for a Bug Hunt
> Amos Jeffries wrote:
> > Hey everybody,
> >
> > We still have a few open RC bugs which should be solvable given a few days each with a dedicated researcher. They are all blocking 3.1 release and some are causing major headaches to users of 3.0.
> >
> > 2305 - auth assertions (under refcounting?)
> > 2424 - FD 'leaks' due to auth locks (over refcounting?)
> > 2524 - range request connection handling
> > 2517 - LDAP helper memory leaks
> >
> > These are in HEAD and need testing by someone other than me. Just waiting on next snapshot (200812**) to build a testable bundle of code.
> >
> > 2526 - ACL checklist default ALLOW.
>
> The ICAP ACLs in squid3.HEAD are broken too. For example, the following:
>
>   acl com dstdomain .com
>   adaptation_access class_clamresp deny com
>   adaptation_access class_clamresp allow all
>
> results in the responses from all sites (.com and others) being sent to the ICAP server. If I omit the "allow all" line it works well (.com sites are not sent, but other sites are)...
>
> Grr.

That sounds like another issue altogether. Like the list is being constructed in reverse order :(
(list push to head instead of to tail?)

Amos

> > 2395 - FTP error messages (407 particularly)
> >
> > These are serious but are known to require a whole re-design of something:
> >
> > 2459 - dns_error_message system
> > 2404 - WCCP in mask mode
> >
> > The current RC blocker bugs list can be seen at
> > http://www.squid-cache.org/bugs/buglist.cgi?query_format=advanced&product=Squid&product=Website&target_milestone=3.0&target_milestone=3.1&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_severity=blocker&bug_severity=critical&bug_severity=major&emailtype1=substring&email1=&emailtype2=substring&email2=&bugidtype=include&order=bugs.bug_severity%2Cbugs.bug_id&chfieldto=Now&cmdtype=doit
> >
> > Amos Jeffries
> > Squid-3 Release Maintainer
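The reverse-order hypothesis is consistent with the reported symptom, which a few lines of Python can show. This is a sketch of first-match rule evaluation only, not the Squid ACL code: `first_match`, `is_com` and `is_any` are hypothetical names, and the fallback used when no rule matches (the opposite of the last rule's action) is the behaviour documented for http_access, assumed here to carry over to adaptation_access.

```python
def is_com(host):
    """Rough stand-in for 'acl com dstdomain .com'."""
    return host.endswith(".com")


def is_any(host):
    """Stand-in for the built-in 'all' ACL."""
    return True


def first_match(rules, host):
    """Return the action of the first rule whose ACL matches the host."""
    for action, matches in rules:
        if matches(host):
            return action
    # Assumption: with no match, use the opposite of the last rule's action.
    return "deny" if rules[-1][0] == "allow" else "allow"


# The rules in the order they were written in the report above.
configured = [("deny", is_com), ("allow", is_any)]
# The same rules if each one were pushed to the head of the list instead.
reversed_list = list(reversed(configured))
# The workaround from the report: drop the "allow all" line.
without_allow_all = [("deny", is_com)]

if __name__ == "__main__":
    for name, rules in [("configured", configured),
                        ("reversed", reversed_list),
                        ("without allow all", without_allow_all)]:
        print(name,
              first_match(rules, "www.example.com"),
              first_match(rules, "www.example.org"))
```

With the reversed list, "allow all" is checked first and everything goes to the ICAP server, which matches the report. With the "allow all" line removed, the order no longer matters and the fallback allows the non-.com sites, which also matches.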
Re: /bzr/squid3/trunk/ r9386: Bug 2395: FTP auth errors not displayed
> Isn't the real question who called FwdState::complete()? It's only meant to be called when all data has been placed into the entry, not before.

I started with that idea, but the code proved to be very convoluted.

On an error the FTP engine calls ftpFail(), which syncs down to FtpStateData::failed(), generates a 407 error object in the state member for holding errors, closes the server links properly and then calls serverComplete().

Once serverComplete() is called the FwdState jumble kicks off. There are two functions there: complete() and completed().

The code in complete() appears to handle two distinct cases: (1) when the connection is done with and everything is okay, and (2) when an error has occurred, as seen by *_INCOMPLETE.

When the failure case (2) is active, it tries to store any partial object received (the 4xx or 5xx message from the server???), then calls completed(), then repeats the storage for the squid outgoing error.

By trying to store the partial object before completed(), even if it's empty, we actually cause the zero-length-reply state to be triggered by the store and unset the real error message member variable.

I tried doing the calls to copy the correct error page into the store object before calling serverComplete(), but that caused worse side effects (such as status 200 with error page content, or object-already-stored assertions, depending on where it was saved).

This fix still leaves the pre-save behavior if any entry data was read, but skips the possibility of generating a zero-length-reply when another more meaningful error may be present. The error page object is left to wherever the error-handling logic below completed() actually takes place.

Amos

> On Tue, 2008-12-02 at 00:56 +1300, Amos Jeffries wrote:
> > revno: 9386
> > committer: Amos Jeffries [EMAIL PROTECTED]
> > branch nick: trunk
> > timestamp: Tue 2008-12-02 00:56:34 +1300
> > message:
> >   Bug 2395: FTP auth errors not displayed
> >
> >   It appears to be the StoreEntry reporting an error on zero-length objects. This somehow overrides the FTP reported error and aborts the reply page.
> >
> >   Add an extra check to prevent StoreEntry::complete() being called too early on error responses.
> > modified:
> >   src/forward.cc
> >   src/ftp.cc
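The empty-entry case described above can be reduced to a tiny decision function. This is a sketch in Python rather than the Squid C++ code, it models only what happens when the store entry is still empty, and `reply_for_empty_entry`, `guard_enabled` and the reply strings are hypothetical names chosen for the illustration.

```python
def reply_for_empty_entry(pending_error, guard_enabled):
    """Which reply the client sees when the store entry holds no data.

    pending_error: the error recorded by the FTP code, or None.
    guard_enabled: False models the old behaviour (always complete the
                   entry early), True models the extra check from the fix.
    """
    if guard_enabled:
        # With the check, early completion happens only if no error is pending.
        complete_early = pending_error is None
    else:
        complete_early = True

    if complete_early:
        # Completing an empty entry makes the store report a zero-length
        # reply, which replaces whatever error was pending.
        return "zero-length reply"
    # Otherwise the pending error survives until the later error handling.
    return pending_error


if __name__ == "__main__":
    print(reply_for_empty_entry("FTP login failed", guard_enabled=False))
    print(reply_for_empty_entry("FTP login failed", guard_enabled=True))
```

Without the guard the meaningful FTP error is masked by the zero-length reply; with it, the FTP error is what remains to be shown. The case where entry data has already been read is left out on purpose, since the message above only describes it in prose.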
Re: The cache deny QUERY change... partial rollback?
Hmm. Given that heap GDSF out-performs LRU in the common case, and there's a crashing bug in LRU at the moment anyway, maybe the best thing to do is to change the default replacement policy -- and always compile in the heap algorithms?

On 02/12/2008, at 2:05 AM, Henrik Nordstrom wrote:
> On Mon, 2008-12-01 at 09:40 -0500, Adrian Chadd wrote:
> > Hm, that's kind of interesting actually. What's it displacing from the cache? Is the drop in hit ratio due to the removal of other cacheable large objects, or other cacheable small objects? Is it -just- flash video that's exhibiting this behaviour?
>
> The studied cache is using LRU, and these flash videos effectively reduce the cache size by filling the cache with large objects that are never referenced again.
>
> > Are you able to put up some examples and statistics?
>
> I'll try.
>
> > I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content.
>
> Probably true. Based on the URL styles there seem to be only two or three of these authentication/session schemes.
>
> Regards
> Henrik

--
Mark Nottingham [EMAIL PROTECTED]
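Why a size-aware policy could help with this particular traffic can be sketched in Python. This is a toy model, not Squid's heap implementation: it assumes the textbook GDSF priority (inflation clock + frequency / size, with cost fixed at 1), and the `Cache` class, URLs, sizes and workload are made up for the illustration.

```python
class Cache:
    """Toy byte-bounded cache that evicts the lowest-priority object first."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy      # "lru" or "gdsf"
        self.used = 0
        self.clock = 0.0          # GDSF inflation value, unused by LRU
        self.tick = 0             # access counter, used as the LRU priority
        self.objects = {}         # url -> [size, frequency, priority]

    def _priority(self, size, frequency):
        if self.policy == "lru":
            return self.tick                    # most recent access wins
        return self.clock + frequency / size    # small, popular objects win

    def request(self, url, size):
        """Return True on a hit; on a miss, admit the object."""
        self.tick += 1
        entry = self.objects.get(url)
        if entry is not None:
            entry[1] += 1
            entry[2] = self._priority(entry[0], entry[1])
            return True
        if size > self.capacity:
            return False
        while self.used + size > self.capacity:
            victim = min(self.objects, key=lambda u: self.objects[u][2])
            self.clock = self.objects[victim][2]
            self.used -= self.objects[victim][0]
            del self.objects[victim]
        self.objects[url] = [size, 1, self._priority(size, 1)]
        self.used += size
        return False


def hit_ratio(policy, rounds=50):
    """Ten small static objects per round, then five never-repeated videos."""
    cache = Cache(capacity=200, policy=policy)
    hits = total = 0
    video_id = 0
    for _ in range(rounds):
        for n in range(10):
            hits += cache.request(f"/static/{n}", 5)
            total += 1
        for _ in range(5):
            hits += cache.request(f"/video?token={video_id}", 60)
            total += 1
            video_id += 1
    return hits / total


if __name__ == "__main__":
    print("lru :", hit_ratio("lru"))
    print("gdsf:", round(hit_ratio("gdsf"), 3))
```

In this workload the large one-off videos push every static object out of the LRU cache before it is requested again, while under the GDSF priority the videos mostly evict each other. The workload is chosen to make the contrast obvious, so it says nothing about how large the difference would be on a production cache.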