source code reorg makefiles
I have one small request vis-à-vis the source code reorg. Please don't undo the non-recursive make support. I put a lot of effort into getting it to where it is (even though that wasn't propagated across the entire code base). It makes a substantial difference to the correctness, performance and ease of use of incremental builds. Where the goal is to split out a Makefile, I suggest using an include to incorporate smaller files. I can see a number of regressions in this to date, which is why I'm asking that we not let it slide any more. -Rob
Re: source code reorg makefiles
Robert Collins wrote: I have one small request vis-à-vis the source code reorg. Please don't undo the non-recursive make support. I put a lot of effort into getting it to where it is (even though that wasn't propagated across the entire code base). It makes a substantial difference to the correctness, performance and ease of use of incremental builds. Do you mean the DiskIO/* and fs/*/* stuff? Where the goal is to split out a Makefile, I suggest using an include to incorporate smaller files. We are doing that already (Common.am and TestHeaders.am etc). I can see a number of regressions in this to date, which is why I'm asking that we not let it slide any more. -Rob Can you explain in a bit more detail please? What are you seeing us do wrong? We are not trying to remove anything at this point AFAIK. Amos -- Please be using Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15 Current Beta Squid 3.1.0.7
3.0 assertion in comm.cc:572
We have one user with a fairly serious production machine hitting this assertion. It's an attempted comm_read of a closed FD after reconfigure. Nasty, but I think the asserts can be converted to a no-op return. Does anyone know of a subsystem that would fail badly after a failed read, with all its sockets and networking closed anyway? Amos
Re: [squid-users] Error making squid-3.1.0.7-20090412 on Mac OS X 10.4
Here's what I did and the outcome, with squid-3.HEAD-20090511: patch -p0 < b9683.patch vi src/asn.cc /* template cbdata_type CbDataList<int>::CBDATA_CbDataList; */ Then steps 2) and 3). /bin/sh ../../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -g -O2 -c -o Asn.lo Asn.cc g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -g -O2 -c Asn.cc -fno-common -DPIC -o .libs/Asn.o Asn.cc:44: error: expected unqualified-id before 'template' make[3]: *** [Asn.lo] Error 1 make[2]: *** [all-recursive] Error 1 make[1]: *** [all] Error 2 make: *** [all-recursive] Error 1 Original message: I wonder. Let's experiment then. If you would please: 1. comment out the annoying line: template cbdata_type CbDataList<int>::CBDATA_CbDataList; 2. add this to src/asn.cc: class FubarA { public: char a; }; template cbdata_type CbDataList<FubarA>::CBDATA_CbDataList; 3. add this to acl/Asn.cc: class FubarB { public: char b; }; template cbdata_type CbDataList<FubarB>::CBDATA_CbDataList; 4. rebuild and see what happens... Amos
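For readers following along, here is a self-contained illustration of the construct being tested. The type and class names are simplified stand-ins for Squid's cbdata machinery, not the real definitions. Note that the `;` terminating the helper class matters: leaving it off can produce exactly an "expected unqualified-id before 'template'" error on the following line with gcc.

```cpp
#include <cassert>

// Hypothetical stand-in for Squid's cbdata_type.
typedef int cbdata_type;

template <class T>
class CbDataList {
public:
    T element;
    // One static member per instantiated T; something must define it.
    static cbdata_type CBDATA_CbDataList;
};

// Templated definition of the static member...
template <class T>
cbdata_type CbDataList<T>::CBDATA_CbDataList = 0;

// ...and the explicit instantiation that the experiment above probes.
// The ';' ending FubarA's definition is required before 'template'.
class FubarA { public: char a; };
template cbdata_type CbDataList<FubarA>::CBDATA_CbDataList;
```

The question in the thread is whether the Mac OS X toolchain mishandles such explicit instantiations of static members when they appear in more than one translation unit.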
Re: source code reorg makefiles
On Tue, 2009-05-12 at 01:48 +1200, Amos Jeffries wrote: Can you explain in a bit more detail please? What are you seeing us do wrong? We are not trying to remove anything at this point AFAIK. acl is the one I noticed, but I had the others building non-recursively too. Non-recursive make is where make builds objects in subdirectories without reinvoking itself: it holds a single build graph and so can handle more complex dependencies safely. -Rob
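A sketch of the include approach Rob describes (file and variable names here are hypothetical, not the actual Squid layout): the top-level automake file pulls in per-directory fragments, so one make invocation owns the entire dependency graph instead of recursing via SUBDIRS.

```
# src/Makefile.am -- hypothetical sketch, not the real Squid file.
# No SUBDIRS recursion; each directory contributes a fragment instead.
include $(srcdir)/acl/Makefile.inc
include $(srcdir)/fs/Makefile.inc

# src/acl/Makefile.inc -- paths are written relative to src/, so the
# single build graph keeps incremental builds correct and fast.
noinst_LTLIBRARIES += acl/libacl.la
acl_libacl_la_SOURCES = \
	acl/Asn.cc \
	acl/Asn.h
```

Splitting a large Makefile.am this way keeps files small without reintroducing the recursive-make correctness problems Rob mentions.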
Re: [squid-users] CARP Failover behavior - multiple parents chosen for URL
Moving this to squid-dev due to increasingly propellerhead-like content... :) Looking over the code and some debugging output, it's pretty clear what's happening here. The carpSelectParent() function does the appropriate hashing of each URL+parent hash and the requisite ranking of the results. To determine whether or not the highest-hash-value parent is the parent that should, in fact, be returned, it uses peerHTTPOkay() as its test. The problem here is that peerHTTPOkay() only returns 0 if the peer in question has been marked DEAD; carpSelectParent() has no way of knowing if the peer is down unless squid has officially marked it DEAD. So, if the highest-ranked peer is a peer that is refusing connections but isn't marked DEAD yet, then peer_select tries to use it, and when it fails, falls back to ANY_PARENT - this actually shows up in the access.log, which I didn't realize when I initially sent this in. Once we've tried to hit the parent 10 times, we officially mark it DEAD, and then carpSelectParent() does the Right Thing. So, we have a couple of options here as far as how to resolve this: 1. Adjust PEER_TCP_MAGIC_COUNT from 10 to 1, so that a parent is marked DEAD after only one failure. This may be overly sensitive, however. Alternatively, carpSelectParent() can check peer->tcp_up and disqualify the peer if it's not equal to PEER_TCP_MAGIC_COUNT; this will have a similar effect without going through the overhead of actually marking the peer DEAD and then reviving it. 2. Somehow have carpSelectParent() return the entire sorted list of peers, so that if the top choice is found to be down, then peer_select() already knows where to go next... 3. Add some special-case code (I'm guessing this would be either in forward.c or peer_select.c) so that if a connection to a peer selected by carpSelectParent() fails, then increment a counter (which would be unique to that request) and call carpSelectParent() again. 
This counter can be used in carpSelectParent() to ignore the X highest-ranked entries. Once this peer gets officially declared DEAD, this becomes moot. Personally, I'm partial to #3, but other approaches are welcome :) Thanks, -C On May 6, 2009, at 10:13 PM, Amos Jeffries wrote: On May 6, 2009, at 8:14 PM, Chris Woodfield wrote: Hi, I've noticed a behavior in CARP failover (on 2.7) that I was wondering if someone could explain. In my test environment, I have a non-caching squid configured with multiple CARP parent caches - two servers, three per box (listening on ports 1080/1081/1082, respectively), for a total of six servers. When I fail a squid instance and immediately afterwards run GETs to URLs that were previously directed to that instance, I notice that the request goes to a different squid, as expected, and I see the following in the log for each request: May 6 11:43:28 cdce-den002-001 squid[1557]: TCP connection to http-cache-1c.den002 (http-cache-1c.den002:1082) failed And I notice that the request is being forwarded to a different, but consistent, parent. After ten of the above requests, I see this: May 6 11:43:41 cdce-den002-001.den002 squid[1557]: Detected DEAD Parent: http-cache-1c.den002 So, I'm presuming that after ten failed requests, the peer is considered DEAD. So far, so good. The problem is this: During my test GETs, I noticed that immediately after the Detected DEAD Parent message was generated, the parent server that the request was being forwarded to changed - as if there's an interim decision made until the peer is officially declared DEAD, and then another hash decision made afterwards. So while consistent afterwards, it's apparent that during the failover, the parent server for the test URL changed twice, not once. Can someone explain this behavior? Do you have 'default' set on any of the parents? It is entirely possible that multiple paths are selected as usable and only the first taken. 
No, my cache_peer config options are: cache_peer http-cache-1a.den002 parent 1080 0 carp http11 idle=10 (repeated for each hostname). During the period between death and detection the dead peer will still be attempted, but failover happens to send the request to another location. When death is detected the hashes are actually re-calculated. OK, correct me if I misread, but my understanding of the spec is that each parent cache gets its own hash value, each of which is then combined with the URL's hash to come up with a set of values. The parent cache corresponding with the highest result is the cache chosen. If that peer is unavailable, the next-best peer is selected, then the next, etc. If that is correct, what hashes are re-calculated when a dead peer is detected? And why would those hashes produce different results than the pre-dead-peer run of the algorithm? And more importantly, will that recalculation result in URLs being remapped that weren't originally pointed to the failed parent?
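Chris's reading matches the CARP scheme: each peer's hash is combined with the URL hash and the highest score wins, so removing one peer from the ranking only shifts the URLs that mapped to it. A toy sketch of that ranking, plus the per-request skip counter from option #3 above. The hash functions and all names here are simplified stand-ins, not Squid's carpSelectParent():

```cpp
#include <cstddef>

typedef struct {
    const char *name;
    unsigned int hash;   // precomputed per-peer hash
    int tcp_up;          // nonzero while the peer looks reachable
} peer_t;

// Combine the URL hash with a peer hash. Real CARP uses a specific
// rotate-and-multiply combination; this is only shape-compatible.
static unsigned int combine(unsigned int url_hash, unsigned int peer_hash) {
    return (url_hash ^ peer_hash) * 2654435761u;
}

static unsigned int hash_url(const char *url) {
    unsigned int h = 0;
    while (*url)
        h = h * 31 + (unsigned char)*url++;
    return h;
}

// Return the live peer ranked (skip+1)-th by combined score, or NULL.
// skip=0 is the normal selection; each failed connect retries with
// skip+1, implementing the per-request counter from option #3.
static peer_t *carp_select(peer_t *peers, int n, const char *url, int skip) {
    unsigned int url_hash = hash_url(url);
    int excluded[64] = {0};  // toy limit: at most 64 peers
    if (n > 64)
        return NULL;
    for (;;) {
        int best = -1;
        unsigned int best_score = 0;
        for (int i = 0; i < n; ++i) {
            if (excluded[i] || !peers[i].tcp_up)
                continue;
            unsigned int s = combine(url_hash, peers[i].hash);
            if (best < 0 || s > best_score) {
                best = i;
                best_score = s;
            }
        }
        if (best < 0)
            return NULL;        // every live peer already tried
        if (skip-- <= 0)
            return &peers[best];
        excluded[best] = 1;     // earlier attempt failed; take next-best
    }
}
```

In this sketch, only the failed peer's entry drops out of the ranking, so URLs that never mapped to it keep their original parent; that is the property the remapping question above is asking Squid to preserve.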
Re: 3.0 assertion in comm.cc:572
2009/5/11 Amos Jeffries squ...@treenet.co.nz: We have one user with a fairly serious production machine hitting this assertion. It's an attempted comm_read of a closed FD after reconfigure. Nasty, but I think the asserts can be converted to a no-op return. Does anyone know of a subsystem that would fail badly after a failed read with all its sockets and networking closed anyway? That will bite you later on if/when you want to move to supporting Windows overlapped I/O or POSIX AIO style kernel async I/O on network sockets. You don't want reads scheduled on FDs that are closed, nor do you want the FD closed during the execution of the read. Figure out what is scheduling a read / what is scheduling the completion incorrectly, and fix the bug. Adrian
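A minimal sketch of the two positions in this exchange, using hypothetical stand-ins for Squid's fd_table and comm layer (these are not the real structures or signatures):

```cpp
#include <cstddef>

const int MAX_FD = 1024;

typedef void read_handler(int fd, void *data);

// Simplified stand-in for Squid's per-descriptor state table.
static struct fde_sketch {
    int open;              // nonzero while the descriptor is valid
    read_handler *reader;  // pending read callback, if any
    void *reader_data;
} fd_table[MAX_FD];

// Amos's option: the assert on a closed FD becomes a no-op error
// return (0 = read scheduled, -1 = refused because the FD is closed).
static int comm_read_sketch(int fd, read_handler *cb, void *data) {
    if (fd < 0 || fd >= MAX_FD || !fd_table[fd].open)
        return -1;  // previously an assert() would abort here
    fd_table[fd].reader = cb;
    fd_table[fd].reader_data = data;
    return 0;
}

// Adrian's point: the real fix is making close cancel any pending
// read, so nothing is ever left scheduled against a dead descriptor -
// which kernel async I/O (overlapped I/O, POSIX AIO) would require.
static void comm_close_sketch(int fd) {
    fd_table[fd].open = 0;
    fd_table[fd].reader = NULL;
    fd_table[fd].reader_data = NULL;
}
```

The no-op return papers over the symptom; cancelling on close removes the class of bug entirely, which is why it matters for future async I/O backends where the kernel, not Squid, completes the read.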
Re: One final (?) set of patches for 2.HEAD
All applied to 2-HEAD. On 09/05/2009, at 12:43 PM, Amos Jeffries wrote: Mark Nottingham wrote: Just a few more; HTCP logging http://www.squid-cache.org/bugs/show_bug.cgi?id=2627 +1. ignore-must-revalidate http://www.squid-cache.org/bugs/show_bug.cgi?id=2645 +1. Create request methods consistently http://www.squid-cache.org/bugs/show_bug.cgi?id=2646 +1. Do override-* before stale-while-revalidate, stale-if-error http://www.squid-cache.org/bugs/show_bug.cgi?id=2647 Yeesh. +1 if tested true. Amos -- Mark Nottingham m...@yahoo-inc.com
Re: [squid-users] CARP Failover behavior - multiple parents chosen for URL
A patch to make PEER_TCP_MAGIC_COUNT configurable is on 2-HEAD; http://www.squid-cache.org/Versions/v2/HEAD/changesets/12208.patch Cheers, On 12/05/2009, at 1:15 PM, Chris Woodfield wrote: 1. Adjust PEER_TCP_MAGIC_COUNT from 10 to 1, so that a parent is marked DEAD after only one failure. This may be overly sensitive however. -- Mark Nottingham m...@yahoo-inc.com
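With that patch, the hard-coded threshold of 10 becomes tunable per peer in squid.conf. Assuming the option landed under the name it carries in later releases (connect-fail-limit - treat the exact spelling here as an assumption; the patch is authoritative), the earlier test setup could be made more sensitive like so:

```
# Mark this parent DEAD after 2 failed connects instead of the old
# hard-coded PEER_TCP_MAGIC_COUNT of 10. Option name assumed from
# later releases; verify against the changeset above.
cache_peer http-cache-1a.den002 parent 1080 0 carp http11 connect-fail-limit=2
```

A low limit shortens the window in which CARP keeps routing to a refusing peer, at the cost of marking peers DEAD on transient failures.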
Re: [squid-users] CARP Failover behavior - multiple parents chosen for URL
Moving this to squid-dev due to increasingly propellerhead-like content... :) Looking over the code and some debugging output, it's pretty clear what's happening here. The carpSelectParent() function does the appropriate hashing of each URL+parent hash and the requisite ranking of the results. To determine whether or not the highest-hash-value parent is the parent that should, in fact, be returned, it uses peerHTTPOkay() as its test. The problem here is that peerHTTPOkay() only returns 0 if the peer in question has been marked DEAD; carpSelectParent() has no way of knowing if the peer is down unless squid has officially marked it DEAD. So, if the highest-ranked peer is a peer that is refusing connections but isn't marked DEAD yet, then peer_select tries to use it, and when it fails, falls back to ANY_PARENT - this actually shows up in the access.log, which I didn't realize when I initially sent this in. Once we've tried to hit the parent 10 times, we officially mark it DEAD, and then carpSelectParent() does the Right Thing. So, we have a couple of options here as far as how to resolve this: 1. Adjust PEER_TCP_MAGIC_COUNT from 10 to 1, so that a parent is marked DEAD after only one failure. This may be overly sensitive, however. Alternatively, carpSelectParent() can check peer->tcp_up and disqualify the peer if it's not equal to PEER_TCP_MAGIC_COUNT; this will have a similar effect without going through the overhead of actually marking the peer DEAD and then reviving it. Patches went in recently to make that setting a squid.conf option. Squid-3: http://www.squid-cache.org/Versions/v3/HEAD/changesets/b9678.patch Squid-2: http://www.squid-cache.org/Versions/v2/HEAD/changesets/12208.patch http://www.squid-cache.org/Versions/v2/HEAD/changesets/12209.patch 2. Somehow have carpSelectParent() return the entire sorted list of peers, so that if the top choice is found to be down, then peer_select() already knows where to go next... 3. 
Add some special-case code (I'm guessing this would be either in forward.c or peer_select.c) so that if a connection to a peer selected by carpSelectParent() fails, then increment a counter (which would be unique to that request) and call carpSelectParent() again. This counter can be used in carpSelectParent() to ignore the X highest-ranked entries. Once this peer gets officially declared DEAD, this becomes moot. Personally, I'm partial to #3, but other approaches are welcome :) I'm partial to #2. But not for any particular reason. Patches for either #2 or #3 are welcome. Amos