Re: SMP: logging
On 24 February 2010 06:55, Amos Jeffries squ...@treenet.co.nz wrote: Ah, I did not realize cache.log daemon logging is not supported yet. One more reason to start with simple O_APPEND. As a side effect, we would be able to debug daemon log starting problems as well :-). Yay. Definitely +3 then. :) Uhm, is O_APPEND defined as an atomic write? I didn't think so. It may be under Linux and it may be under certain FreeBSD versions, but that's more likely a side-effect of VFS locking than of the actual specification. adrian
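For context, the "simple O_APPEND" scheme under discussion amounts to the following (a minimal sketch; whether two processes' single-write() appends can interleave is exactly the question raised here):

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    /* every process opens the shared log with O_APPEND and emits one
     * write() per line; the kernel repositions to EOF for each write,
     * but POSIX does not promise that concurrent lines never interleave */
    void
    log_line(const char *path, const char *line)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0640);
        if (fd >= 0) {
            (void) write(fd, line, strlen(line));
            close(fd);
        }
    }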
Re: SMP: logging
On 24 February 2010 18:06, Adrian Chadd adr...@squid-cache.org wrote: Uhm, is O_APPEND defined as an atomic write? I didn't think so. It may be under Linux and it may be under certain FreeBSD versions, but it's likely a side-effect of VFS locking than the actual specification. .. and it certainly won't be supported for logging-to-NFS. I'd honestly just investigate a logging layer that implements some kind of IPC mechanism (sockets, sysvshm, etc) that can handle logs from multiple processes. Or you go down the apache path - lock, append, unlock. Eww. adrian
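The "apache path" mentioned here would look roughly like this (a sketch only; flock() semantics vary across platforms and are notoriously unreliable over NFS):

    #include <string.h>
    #include <sys/file.h>
    #include <unistd.h>

    /* lock, append, unlock: serialise writers explicitly rather than
     * trusting O_APPEND atomicity */
    void
    locked_append(int fd, const char *line)
    {
        flock(fd, LOCK_EX);               /* blocks until we own the file */
        (void) write(fd, line, strlen(line));
        flock(fd, LOCK_UN);
    }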
Re: [squid-users] 'gprof squid squid.gmon' only shows the initial configuration functions
Talk to the freebsd guys (eg me) about pmcstat and support for your hardware. You may just need to find / organise a backport of the particular hardware support for your platform. I've been working on profiling Lusca with pmcstat and some new-ish tools which use and extend it in useful ways. gprof data is almost certainly uselessly unreliable on modern CPUs. Too much can and will happen between profiling ticks. I can hazard a few guesses about where your CPU is going. A likely candidate is poll() if your Squid is too old. The first thing to do is organise porting the kqueue() stuff if it isn't already included. I can make more educated guesses about where the likely CPU hog culprits are given workload and configuration file information. Adrian 2009/12/10 Guy Bashkansky guy...@gmail.com: Is there an oprofile version for FreeBSD? I thought it is limited to Linux. On FreeBSD I tried pmcstat, but it gives an initialization error. My version of Squid is old and customized (so I can't upgrade) and may not have built-in timers. Since what version did they appear? As for gprof - even with the event loop on top, the rest of the table might still give some idea why the CPU is overloaded. The problem is - I see only the initial configuration functions:

                                      called/total      parents
    index  %time    self descendents  called+self    name          index
                                      called/total      children

                                                     <spontaneous>
    [1]     63.4    0.17        0.00                 _mcount [1]
    -----------------------------------------------
                    0.00        0.10       1/1           _start [3]
    [2]     36.0    0.00        0.10       1         main [2]
                    0.00        0.10       1/1           parseConfigFile [4]
                    ...
    -----------------------------------------------
                                                     <spontaneous>
    [3]     36.0    0.00        0.10                 _start [3]
                    0.00        0.10       1/1           main [2]
    -----------------------------------------------
                    0.00        0.10       1/1           main [2]
    [4]     36.0    0.00        0.10       1         parseConfigFile [4]
                    0.00        0.09       1/1           readConfigLines [5]
                    0.00        0.00     169/6413        parse_line [6]
                    ..

System info:

    # uname -m -r -s
    FreeBSD 6.2-RELEASE-p9 amd64
    # gcc -v
    Using built-in specs.
    Configured with: FreeBSD/amd64 system compiler
    Thread model: posix
    gcc version 3.4.6 [FreeBSD] 20060305

There are 7 fork()s for unlinkd/diskd helpers. Can these fork()s affect profiling info? On Wed, Dec 9, 2009 at 2:04 AM, Robert Collins robe...@robertcollins.net wrote: On Tue, 2009-12-08 at 15:32 -0800, Guy Bashkansky wrote: I've built squid with the -pg flag and run it in no-daemon mode (-N flag), without the initial fork(). I send it the SIGTERM signal, which is caught by the signal handler, to flag a graceful exit from main(). I expect to see a meaningful squid.gmon, but 'gprof squid squid.gmon' only shows the initial configuration functions: gprof isn't terribly useful anyway - due to squid's callback-based model, it will see nearly all the time belonging to the event loop. oprofile and/or squid's built-in analytic timers will get much better info. -Rob
Re: your suggestion for range_offset_limit
The trick, at least in squid-2, is to make sure that quick abort isn't occurring. Or it will begin downloading the whole object, return the requested range bit, and then abort the remainder of the fetch. Adrian 2009/11/25 Amos Jeffries squ...@treenet.co.nz: Matthew Morgan wrote: On Wed, Nov 25, 2009 at 7:09 PM, Amos Jeffries squ...@treenet.co.nz wrote: Matthew Morgan wrote: Sorry it's taking me so long to get this done, but I do have a question. You suggested making getRangeOffsetLimit a member of HttpReply. There are two places where this method currently needs to be called: one is CheckQuickAbort2() in store_client.cc. This one will be easy, as I can just do entry->getReply()->getRangeOffsetLimit(). The other is HttpStateData::decideIfWeDoRanges in http.cc. Here, all we have access to is an HttpRequest object. I looked through the source to see if I could find where a request owned or had access to a reply, but I don't see anything like that. If getRangeOffsetLimit were a member of HttpReply, what do you suggest doing here? I could make a static version of the method, but that wouldn't allow caching the result. Ah. I see. Quite right. After a bit more thought I find my original request a bit weird. Yes, it should be a _Request_ member and do its caching there. You can go ahead with that now while we discuss whether to do a slight tweak on top of the basic feature. [cc'ing squid-dev so others can provide input] I'm not certain of the behavior we want here if we do open the ACLs to reply details. Some discussion is in order. The simple way would be to not cache the lookup the first time, when reply details are not provided. It would mean making it return potentially two different values across the transaction: 1) based only on request details, to decide if a range request is possible, and then 2) based on additional reply details, to see if the abort could be done. No problem if the reply details cause an increase in the limit. But if they restrict it, we enter the grounds of potentially making a request, then canceling it and being unable to store the results. Or, take the maximum of the two across two calls, so it can only increase? That would be slightly trickier, involving a flag as well, to short-circuit the reply lookups instead of just a magic cache value. Am I seriously over-thinking things today? Amos Here's a question, too: is this feature going to benefit anyone? I realized later that it will not solve my problem, because all the traffic that was getting force-downloaded ended up being from windows updates. The URLs showing up in netstat and such were just weird because the windows update traffic was actually coming from Limelight. My ultimate solution was to write a script that reads access.log, checks for windows update URLs that are not cached, and manually downloads them one at a time after hours. If there is anyone at all who would benefit from this I would still be *more* than glad to code it (as I said, it would be my first real open source contribution...very exciting), but I just wondered if anyone will actually use it. I believe people will find more control here useful. Windows update service packs are a big reason, but there are also similar range issues with Adobe Reader online PDFs, google maps/earth, and flash videos when paused/resumed. Potentially other stuff, but I have not heard of problems. This will allow anyone to fine-tune the places where ranges are permitted or forced to fully cache, avoiding the problems a blanket limit adds.
As to which approach would be better, I don't know enough about that data path to really suggest. When I initially made my changes, I just replaced each reference to Config.range_offset_limit or whatever. Today I went back and read some more of the code, but I'm still figuring it out. How often would the limit change based on the request vs. the reply? Just the once, the first time it is checked for the reply - and most likely in the case of testing for a reply mime type. The other useful info I can think of is all request data. You can ignore this if you like; I'm just worrying over a borderline case. Someone else can code a fix if they find it a problem or need to do mime checks. Amos -- Please be using Current Stable Squid 2.7.STABLE7 or 3.0.STABLE20 Current Beta Squid 3.1.0.15
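The caching pattern being settled on above, reduced to a sketch (the names and the -2 "not yet computed" sentinel are illustrative, not the actual Squid API):

    #include <stdint.h>

    /* hypothetical request object: the computed limit is cached on it
     * so the ACL lookup runs at most once per transaction */
    typedef struct {
        int64_t range_offset_limit;   /* -2 = not yet computed */
        /* ... request details ... */
    } Request;

    int64_t
    request_range_offset_limit(Request *req, int64_t (*acl_lookup)(Request *))
    {
        if (req->range_offset_limit != -2)
            return req->range_offset_limit;    /* cached result */
        req->range_offset_limit = acl_lookup(req);
        return req->range_offset_limit;
    }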
Re: squid-smp: synchronization issue solutions
Right. That's the easy bit. I could even do that in Squid-2 with a little bit of luck. The hard bit is rewriting the relevant code which relies on cbdata style reference counting behaviour. That is the tricky bit. Adrian 2009/11/20 Robert Collins robe...@robertcollins.net: On Wed, 2009-11-18 at 10:46 +0800, Adrian Chadd wrote: Plenty of kernels nowadays do a bit of TCP and socket processing in process/thread context; so you need to do your socket TX/RX in different processes/threads to get parallelism in the networking side of things. Very good point. You could fake it somewhat by pushing socket IO into different threads, but then you have all the overhead of shuffling IO and completed IO between threads. This may be .. complicated. The event loop I put together for -3 should be able to do that without changing the loop - just extending the modules that hook into it. -Rob
Re: Recent Facebook Issues
I've emailed the facebook NOC directly about the issue. Thanks, Adrian 2009/10/9 Kinkie gkin...@gmail.com: You can try to access facebook with konqueror. It complains rather loudly, drops the excess data and the site generally doesn't work (has been doing so for a few days, but only NOW I'm connecting the wires...) Kinkie On Fri, Oct 9, 2009 at 2:52 AM, Adrian Chadd adr...@squid-cache.org wrote: Ok, this happens for all versions? I can bring this up with facebook engineering if someone provides me with further information. Adrian 2009/10/9 Amos Jeffries squ...@treenet.co.nz: Thanks to several people I've managed to track down why the facebook issues are suddenly appearing and why it's intermittent. On the sometimes-works-sometimes-doesn't problem: facebook.com does User-Agent header checks and sends back one of four pages:
 1) a generic page saying 'please use another browser'.
 2) a redirect to login for each of IE, Firefox and Safari.
 3) a home page (if cookies sent initially).
Going through the login redirects to the page also presented at (3) above. The home page is the real problem. When no cookies are presented it ships without Content-Length (fine). When they _are_ present, i.e. after the user has logged in, it ships with Content-Length: 18487 and a data size of 18576. Amos -- /kinkie
Re: Recent Facebook Issues
Ok, this happens for all versions? I can bring this up with facebook engineering if someone provides me with further information. Adrian 2009/10/9 Amos Jeffries squ...@treenet.co.nz: Thanks to several people I've managed to track down why the facebook issues are suddenly appearing and why it's intermittent. On the sometimes-works-sometimes-doesn't problem: facebook.com does User-Agent header checks and sends back one of four pages:
 1) a generic page saying 'please use another browser'.
 2) a redirect to login for each of IE, Firefox and Safari.
 3) a home page (if cookies sent initially).
Going through the login redirects to the page also presented at (3) above. The home page is the real problem. When no cookies are presented it ships without Content-Length (fine). When they _are_ present, i.e. after the user has logged in, it ships with Content-Length: 18487 and a data size of 18576. Amos
Re: Segfault in HTCP CLR request on 64-bit
The whole struct is on the local stack. Hence bzero() or memset() to 0. 2009/10/2 Matt W. Benjamin m...@linuxbox.com: Bzero? Is it an already-allocated array/byte sequence? (Apologies, I haven't seen the code.) Assignment to NULL/0 is in fact correct for initializing a sole pointer, and using bzero for that certainly isn't typical. Also, for initializing a byte range, memset is preferred [see Linux BZERO(3), which refers to POSIX.1-2008 on that point]. STYLE(9) says use NULL rather than 0, and it is clearer. But C/C++ programmers should know that NULL is 0. And note that at least through 1998, initialization to 0 was the preferred style in C++, IIRC. Matt - Adrian Chadd adr...@squid-cache.org wrote: I've just replied to the ticket in question. It should probably just be a bzero() rather than setting the pointer to 0. Which should really be setting it to NULL. Anyway, please test whether the bzero() works. If it does then I'll commit that fix to HEAD and 2.7. 2009/9/28 Jason Noble ja...@linuxbox.com: I have opened a bug for this issue here: http://bugs.squid-cache.org/show_bug.cgi?id=2788 Also, the previous patch was not generated against head so I re-rolled the patch against current head and attached to the bug report -- Matt Benjamin The Linux Box 206 South Fifth Ave. Suite 150 Ann Arbor, MI 48104 http://linuxbox.com tel. 734-761-4689 fax. 734-769-8938 cel. 734-216-5309
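The suggestion above, in miniature - zero the whole stack struct before filling it in, so pointer members never hold stack garbage (a sketch with a stand-in struct, not the actual htcpStuff definition):

    #include <string.h>

    struct htcp_stuff_like {    /* stand-in for squid's htcpStuff */
        int op;
        char *req_hdrs;
        /* ... */
    };

    void
    example(void)
    {
        struct htcp_stuff_like stuff;
        memset(&stuff, 0, sizeof(stuff));  /* all-bits-zero: pointers read as
                                            * NULL on mainstream platforms */
        stuff.op = 1;                      /* illustrative field assignment */
    }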
Re: Segfault in HTCP CLR request on 64-bit
Could you please create a bugzilla report for this, complete with a patch against Squid-2.HEAD and 2.7? I'll then commit it. 2009/9/26 Jason Noble ja...@linuxbox.com: I recently ran into an issue where Squid 2.7 would segfault trying to issue HTCP CLR requests. I found the segfault only occurred on 64-bit machines. While debugging, I found that the value of stuff.S.req_hdrs was not initialized but later, strlen was being called on it. This seems to -- by chance -- not fail on 32 bit builds, but always segfaults on 64-bit. The attached patch fixed the problem for me and it seems good programming practice to properly initialize pointers to prevent issues such as this. As the htcpStuff struct is used in other places, I have concerns that other issues may be lurking as well, although I have yet to run into them. Regards, Jason
Re: Squid-smp : Please discuss
2009/9/15 Sachin Malave sachinmal...@gmail.com: On Tue, Sep 15, 2009 at 1:18 AM, Adrian Chadd adr...@squid-cache.org wrote: Guys, Please look at what other multi-CPU network applications do, how they work and don't work well, before continuing this kind of discussion. Everything that has been discussed has already been done to death elsewhere. Please don't re-invent the wheel, badly. Yes, synchronization is always expensive, so we must target only those areas where shared data is updated infrequently. Also, if we create threads, the work done per thread must outweigh the overheads of thread creation, synchronization and scheduling. Current generation CPUs are a lot, lot better at the thread-style sync primitives than older CPUs. There are other things to think about, such as lockless queues, transactional memory hackery, atomic instructions in general, etc, etc, which depend entirely upon the type of hardware being targeted. If we try to add locks to the existing data structures then the synchronization overhead will definitely affect our design. Redesigning such structures and their behavior is time consuming and may change the whole design of Squid. Adrian
Re: Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?
But in that case, ACCESS_REQ_PROXY_AUTH would be returned rather than ACCESS_DENIED.. Adrian 2009/9/15 Robert Collins robe...@robertcollins.net: On Tue, 2009-09-15 at 15:22 +1000, Adrian Chadd wrote: G'day. This question is aimed mostly at Henrik, who I recall replying to a similar question years ago but without explaining why. Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on a denied ACL? The particular bit in src/client_side.c:

    int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;

Is there any particular reason why auth is tried again? It forces a pop-up on browsers that have already done authentication via NTLM. Because it should? Perhaps you can expand on where you are seeing this - I suspect a misconfiguration or some such. It's entirely appropriate to signal HTTP_PROXY_AUTHENTICATION_REQUIRED when a user is denied access to a resource *and if they log in differently they could get access*. -Rob
Re: Squid-smp : Please discuss
If you want to start looking at -threading- inside Squid, I'd suggest thinking first how you'd create a generic thread helper framework that allows Squid to run multiple internal threads that can do stuff, and then implement some message/data queues and handle notification between threads. You can then push some stuff into these worker threads as an experiment and see exactly what the issues are. Building worker threads into Squid is easy. Making them do anything? Not so easy :) Adrian 2009/9/15 Sachin Malave sachinmal...@gmail.com: On Tue, Sep 15, 2009 at 1:38 AM, Adrian Chadd adr...@squid-cache.org wrote: 2009/9/15 Sachin Malave sachinmal...@gmail.com: On Tue, Sep 15, 2009 at 1:18 AM, Adrian Chadd adr...@squid-cache.org wrote: Guys, Please look at what other multi-CPU network applications do, how they work and don't work well, before continuing this kind of discussion. Everything that has been discussed has already been done to death elsewhere. Please don't re-invent the wheel, badly. Yes, synchronization is always expensive, so we must target only those areas where shared data is updated infrequently. Also, if we create threads, the work done per thread must outweigh the overheads of thread creation, synchronization and scheduling. Current generation CPUs are a lot, lot better at the thread-style sync primitives than older CPUs. There are other things to think about, such as lockless queues, transactional memory hackery, atomic instructions in general, etc, etc, which depend entirely upon the type of hardware being targeted. If we try to add locks to the existing data structures then the synchronization overhead will definitely affect our design. Redesigning such structures and their behavior is time consuming and may change the whole design of Squid. Adrian And current generation libraries are also far better than older ones, like OpenMP: creating threads and handling synchronization issues in OpenMP is very easy... Automatic locks are provided; you need not design your own locking mechanisms. Just a statement and you can lock the shared variable... Then the major work remaining is to identify the shared accesses. I WANT TO USE the OPENMP library. ANY suggestions.
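For reference, the "just a statement" OpenMP locking described above looks like this (a generic sketch, not Squid code):

    #include <omp.h>

    static long shared_hits = 0;

    /* one pragma serialises access to the shared counter; OpenMP
     * supplies the lock, the programmer only marks the critical region */
    void
    count_hits(long n)
    {
        long i;
        #pragma omp parallel for
        for (i = 0; i < n; i++) {
            #pragma omp critical(hit_lock)
            shared_hits++;
        }
    }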
Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?
G'day. This question is aimed mostly at Henrik, who I recall replying to a similar question years ago but without explaining why. Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on a denied ACL? The particular bit in src/client_side.c:

    int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;

Is there any particular reason why auth is tried again? It forces a pop-up on browsers that have already done authentication via NTLM. I've written a patch to fix this in Squid-2.7: http://www.creative.net.au/diffs/2009-09-15-squid-2.7-auth_required_on_auth_acl_deny.diff I'll create a bugtraq entry when I have some more background information about this. Thanks, adrian
Re: Squid-smp : Please discuss
Guys, Please look at what other multi-CPU network applications do, how they work and don't work well, before continuing this kind of discussion. Everything that has been discussed has already been done to death elsewhere. Please don't re-invent the wheel, badly. Adrian 2009/9/15 Robert Collins robe...@robertcollins.net: On Tue, 2009-09-15 at 14:27 +1200, Amos Jeffries wrote: RefCounting done properly forms a lock on certain read-only types like Config. Though we are currently handling that for Config by leaking the memory out every gap. SquidString is not thread-safe. But StringNG with its separate refcounted buffers is almost there. Each thread having a copy of StringNG sharing an SBuf equates to a lock with copy-on-write, possibly causing issues we need to look at if/when we get to that scope. General rule: you do /not/ want thread-safe objects for high usage objects like RefCount and StringNG. Synchronisation is expensive; design to avoid synchronisation and hand-offs as much as possible. -Rob
squid-2 - vary and x-accelerator-vary differences?
G'day, I just noticed in src/HttpReply.c that the vary expire option (Config.onoff.vary_ignore_expire) is checked if the reply has HDR_VARY set, but it does not check if HDR_X_ACCELERATOR_VARY is set. Everywhere else, the code checks them both consistently and assembles Vary header contents consistently from both. Is this an oversight/bug? Is it intentional behaviour? Thanks, Adrian
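For illustration, the consistent both-headers form used elsewhere in the code would look roughly like this (a sketch against the squid-2 style httpHeaderHas() API, not a tested patch):

    /* treat the reply the same whichever variant of Vary it carries */
    if (Config.onoff.vary_ignore_expire
        && (httpHeaderHas(&reply->header, HDR_VARY)
            || httpHeaderHas(&reply->header, HDR_X_ACCELERATOR_VARY))) {
        /* ... apply the ignore-expire handling here ... */
    }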
multiple store-dir issues
G'day, I've fixed a potentially risky situation in Lusca relating to the initialisation of the storeIOState cbdata type. Each storedir has a different idea of how the allocation should be free()'ed. The relevant commit in Lusca is r14208 - http://code.google.com/p/lusca-cache/source/detail?r=14208 . I'd like this approach to be included in Squid-2.HEAD and backported to Squid-2.7 / Squid-2.6. Thanks, adrian
Re: multiple store-dir issues
2009/7/20 Henrik Nordstrom hen...@henriknordstrom.net: I've fixed a potentially risky situation in Lusca relating to the initialisation of the storeIOState cbdata type. Each storedir has a different idea of how the allocation should be free()'ed. Risky in what sense? Ah. I just re-re-re-read the code again and I now understand what is going on. There are multiple definitions of storeIOState cbdata being allocated instead of one. The definitions are local to each module. Ok. Sorry for the noise. I'll commit a fix to COSS for the initialisation issue someone reported during reconfigure. Adrian
Re: Hello from Mozilla
2009/7/17 Ian Hickson i...@hixie.ch: That way you are still speaking HTTP right until the protocol change occurs, so any and all HTTP compatible changes in the path(s) will occur. As mentioned earlier, we need the handshake to be very precisely defined because otherwise people could trick unsuspecting servers into opting in, or rather appearing to opt in, and could then send all kinds of commands down to those servers. Would you please provide an example of where an unsuspecting server is tricked into doing something? Ian, don't you see and understand the semantic difference between speaking HTTP and speaking a magic bytecode that is intended to look HTTP-enough to fool a bunch of things until the upgrade process occurs? Don't you understand that the possible set of things that can go wrong here is quite unbounded? Don't you understand the whole reason for well-known ports and protocol descriptions in the first place? Apparently not. Ok. Look at this. The byte sequence "GET / HTTP/1.0\r\nHost: foo\r\nConnection: close\r\n\r\n" is not byte-equivalent to the sequence "GET / HTTP/1.0\r\nConnection: close\r\nHost: foo\r\n\r\n". The same two sequences interpreted as HTTP protocol exchanges are equivalent. There's a mostly-expected understanding that what happens over port 80 is HTTP. The few cases where that has broken (specifically Shoutcast, but I do see other crap on port 80 from time to time..) have been by people who have implemented a mostly-HTTP-looking protocol, tested that it mostly works via a few gateways/firewalls/proxies, and then deployed it. My suggestion is to completely toss the whole "pretend to be HTTP" thing out of the window and look at extending or adding a new HTTP mechanism for negotiating proper tunneling on port 80. If this involves making CONNECT work on port 80 then so be it. Redesigning HTTP is really much more work than I intend to take on here. HTTP already has an Upgrade mechanism; reusing it seems the right thing to do. What you intend to take on here and what should be taken on here is very relevant. You're intending to do stuff over tcp/80 which looks like HTTP but isn't HTTP. Everyone who implements anything HTTP gateway related (be it a transparent proxy, a firewall, a HTTP router, etc) suddenly may have to implement your websockets stuff as well. So all of a sudden your attempt to not extend HTTP ends up extending HTTP. The point is, there may be a whole lot of stuff going on with HTTP implementations that you're not aware of. Sure, but with the exception of man-in-the-middle proxies, this isn't a big deal -- the people implementing the server side are in control of what the HTTP implementation is doing. That may be your understanding of how the world works, but out here in the rest of the world, the people who deploy the edge and the people who deploy the core may not be the same people. There may be a dozen layers of red tape, equipment lifecycle, security features, etc, that need to be handled before websockets happy stuff can be deployed everywhere it needs to be. Please don't discount man-in-the-middle -anything- as being easy to deal with. In all cases except a man-in-the-middle proxy, this seems to be what we do. I'm not sure how we can do anything in the case of such a proxy, since by definition the client doesn't know it is present. .. so you're still not speaking HTTP? Ian, are you absolutely certain that everywhere you use the internet, there is no man in the middle between you and the server you're speaking to?
Haven't you ever worked at any form of corporate or enterprise environment? What about existing captive portal deployments like wifi hotspots, some of which still use squid-2.5 (eww!) as their http firewall/proxy to control access to the internet? That stuff is going to need upgrading, sure, but I'd rather see the upgrade happen once, to a well thought out and reasonably well designed protocol, versus having lots of little upgrades need to occur because your "HTTP but not quite HTTP" exchange on port 80 isn't thought out enough. Adrian
Re: Hello from Mozilla
2009/7/15 Ian Hickson i...@hixie.ch: On Tue, 14 Jul 2009, Alex Rousskov wrote: WebSocket made the handshake bytes look like something Squid thinks it understands. That is the whole point of the argument. You are sending an HTTP-looking message that is not really an HTTP message. I think this is a recipe for trouble, even though it might solve some problem in some environments. Could you elaborate on what bytes Squid thinks it should change in the WebSocket handshake? Anything which it can under the HTTP/1.x RFCs. Maybe I missed it - why exactly again aren't you just talking HTTP on the HTTP port(s), and doing a standard HTTP upgrade? Adrian
Re: Hello from Mozilla
2009/7/15 Amos Jeffries squ...@treenet.co.nz:
a) Getting a dedicated WebSocket port assigned.
   * You and the client needing it have an argument to get that port opened through the firewall.
   * Squid and other proxies can be altered to allow CONNECT through to safe defined ports (80 is not one). Or to do the WebSocket upgrade itself.
b) accepting that the network being traversed is screwed beyond redemption by its own policy or admin.
I think the fundamental mistake being made here by Ian (and potentially others) is breaking the assumption that specific protocols exist on the well-known ports. Suddenly treating stuff on port 80 as almost-but-not-quite HTTP is bound to cause issues, both with devices speaking valid HTTP (eg Squid) and with firewalls etc, which may treat the exchange as not HTTP and decide to start dropping things. Or worse - passing it through, sort of. Ian - I understand your motivations here but I think it shows a fundamental misunderstanding of the glue which keeps the internet mostly functioning together. Here's a question for you - would you run a mythical protocol, call it foonet, over IP, if it looked almost-but-not-quite like IP, so people could run it on their existing IP networks? Can you see any particular issues with that? Other slots in the mythical OSI stack shouldn't be treated any differently. Adrian
Re: [PATCH] Bug 2680: ** helper errors after -k rotate
Note that winbind has a hard-coded limit that is by default very low. Opening 2n ntlm_auth helpers may make things blow up in horrible ways. Adrian 2009/7/16 Robert Collins robe...@robertcollins.net: On Thu, 2009-07-16 at 14:08 +1200, Amos Jeffries wrote: Both reconfigure and helper recovery use startHelpers() where the limit needs to take place. The DOS bug fix broke *rotate* (reconfigure has an async step added by Alex that prevents it being a problem). s/rotate/reconfigure then :) In my mind one is a subset of the other. If someone is running hundreds of helpers on openwrt/olpc then things are broken already :). I'd really suggest that such environments pipeline through a single helper rather than many concurrent helpers. Such platforms are single core and you'll get better usage of memory doing many requests in a single helper than one request each to many helpers. lol, NTLM concurrent? try it! I did. IIRC the winbindd is fully capable of handling multiple overlapping requests, and each NTLM helper is *solely* a thunk layer between squid's format and the winbindd *state*. ASCII art time, 3 requests:

Multiple helpers:

          /--1-helper--\
   squid-*---2-helper---* winbindd [state1, state2, state3]
          \--3-helper--/

One helper:

   squid-*---1-helper---* winbindd [state1, state2, state3]

-Rob
squid-2.HEAD hanging with 304 not modified responses
G'day guys, I've fixed a bug in Lusca which was introduced with Benno's method_t stuff. The specific bug is revalidation replies 'hanging' until the upstream socket closes, forcing an end of message to occur. The history and patch are here: http://code.google.com/p/lusca-cache/source/detail?r=14103 Those of you toying with Squid-2.HEAD (eg Mark) - would you mind verifying that you can reproduce it on Squid-2.HEAD and comment on the fix? Thanks, adrian
NTLM authentication popups, etc
I'm working on a couple of paid squid + active directory deployments and they're both seeing the occasional NTLM auth popup happening. The workaround is pretty simple - just enable the IP auth cache. This however doesn't solve the fundamental problem(s), whatever they are. The symptom is logs like this:

    [2009/06/15 16:20:17, 1] libsmb/ntlmssp.c:ntlmssp_update(334)
      got NTLMSSP command 1, expected 3

And vice versa (expected 3, got 1). These correspond to states in samba/source/include/ntlmssp.h - 1 is NTLMSSP_NEGOTIATE; 3 is NTLMSSP_AUTH. The conclusion here is that there's a disconnect between the authentication state of the client -and- the authentication state of ntlm_auth. I'm trying to eliminate the possibilities here. The stateful helper stuff seems correct enough, so requests aren't being queued to already-busy stateful helpers. The other two possibilities I can immediately think of:

* 1 - authentication is aborted somewhere for whatever reason; an authentication helper is stuck at the wrong point in the state engine; the next request coming along starts at NTLMSSP_NEGOTIATE but the ntlm_auth helper it is handed to is at NTLMSSP_AUTH (from the partial authentication attempt earlier); error.
* 2 - the web browser is stuffing different phases of the negotiation down different connections to the proxy.

Now, debugging (1) shouldn't be difficult at all. I'm going to try and determine the code paths that lead to and from an aborted auth request, add in some debugging and see if the helper is closed. Debugging (2) without full logs (impractical in this environment) and a full traffic dump (again, impractical in production) is going to be a bit more difficult. I'm thinking about adding some hacky code to the Squid ntlm auth class which keeps a log of the auth blobs sent/received from/to the client and ntlm_auth. I can then dump the entire conversation out to cache.log whenever authentication fails/errors. This should at least give me a hint as to what is going on. (1) can explain the "client state == NTLMSSP_NEGOTIATE but ntlm_auth state is NTLMSSP_AUTH" problem but not vice versa. (2) explains both. It is quite possible it is the combination of both, however. Now, the reason this is getting somewhat annoying and why I'd like to try and understand/fix it is that -another- problem seen by one of these clients is negotiate/ntlm authentication from IE (at least IE8) through Squid. I've got packet dumps showing the browser sending different phases of the negotiation down separate proxy connections and then reusing the original one incorrectly. My medium term plan is to take whatever evidence I have of this behaviour and throw it at the IE group(s) at Microsoft, but in the short term I'd like to make certain the proxy authentication side of things is completely blameless before I hand off stuff to third parties. Ideas? Comments? adrian
Re: Very odd problem running squid 2.7 on Windows
strtoul(). But if you want to verify the -whole- thing is numeric, just write a bit of C which does this:

    int
    isNumeric(const char *str)
    {
        if (*str == '\0')
            return 0;
        for (; *str; str++) {
            if (!isdigit((unsigned char) *str))
                return 0;
        }
        return 1;
    }

2009/5/25 Amos Jeffries squ...@treenet.co.nz: Guido Serassio wrote: Hi, At 16.17 24/05/2009, Adrian Chadd wrote: Well as Amos said, this isn't the way to call getservbyname(). getservbyname() doesn't translate ports to ports; it translates tcp/udp service names to ports. It should be returning NULL if it can't find the service string in the file. Methinks numeric values shouldn't be handed to getservbyname() under Windows. :) So, we have just found a Squid bug :-) Regards Yes. The question becomes, though: what's the fastest way to detect numeric-only strings? Amos -- Please be using Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15 Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1
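A sketch of the workaround implied by this thread - never hand numeric tokens to getservbyname() - assuming the isNumeric() helper above and squid-2's xatos() string-to-number helper:

    #include <arpa/inet.h>
    #include <netdb.h>
    #include <sys/types.h>

    u_short
    get_port(const char *token, const char *proto)
    {
        struct servent *s;
        if (isNumeric(token))               /* numeric: parse it ourselves */
            return xatos(token);
        s = getservbyname(token, proto);    /* otherwise it's a service name */
        return s ? ntohs((u_short) s->s_port) : xatos(token);
    }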
Re: Very odd problem running squid 2.7 on Windows
Actually, it should probably be 1 vs 0; -1 still evaluates to true if you go if (func()). I think the last few hours of fixing bad C and putting in error checking messed me around a bit. I just wrote a quick bit of C to double-check. (Of course in C++ there's a native bool type, no? :) Sorry! Adrian 2009/5/25 Kinkie gkin...@gmail.com: On Mon, May 25, 2009 at 2:21 PM, Adrian Chadd adr...@squid-cache.org wrote:

    int
    isUnsignedNumeric(const char *str)
    {
        for (; *str; str++) {
            if (!isdigit(*str))
                return -1;
        }
        return 1;
    }

Wouldn't returning 0 on false instead of -1 be easier? Just a random thought.. -- /kinkie
Re: Very odd problem running squid 2.7 on Windows
Well as Amos said, this isn't the way to call getservbyname(). getservbyname() doesn't translate ports to ports; it translates tcp/udp service names to ports. It should be returning NULL if it can't find the service string in the file. Methinks numeric values shouldn't be handed to getservbyname() under Windows. :) adrian 2009/5/24 Guido Serassio guido.seras...@acmeconsulting.it: Hi, At 04.38 24/05/2009, Adrian Chadd wrote: Can you craft a small C program to replicate the behaviour? Sure, I wrote the following test program:

    #include <stdio.h>
    #include <Winsock2.h>

    void main(void)
    {
        u_short i, converted;
        WSADATA wsaData;
        struct servent *port = NULL;
        char token[32];
        const char proto[] = "tcp";

        WSAStartup(2, &wsaData);
        for (i = 1; i < 65535; i++) {
            sprintf(token, "%d", i);
            port = getservbyname(token, proto);
            if (port != NULL) {
                converted = ntohs((u_short) port->s_port);
                if (i != converted)
                    printf("%d %d\n", i, converted);
            }
        }
        WSACleanup();
    }

And this is the result on my Windows XP x64 machine (similar results on Windows 2000 and Vista): 2 512 258 513 524 3074 770 515 782 3587 1288 2053 1792 7 1807 3847 2050 520 2234 47624 2304 9 2311 1801 2562 522 2564 1034 2816 11 3328 13 3586 526 3853 3343 4352 17 4354 529 4610 530 4864 19 4866 531 5120 20 5122 532 5376 21 5632 22 5888 23 6400 25 7170 540 7938 543 8194 544 8706 546 8962 547 9472 37 10752 42 10767 3882 11008 43 11266 556 12054 5679 13058 563 13568 53 13570 565 13579 2869 14380 11320 14856 2106 15372 3132 15629 3389 16165 9535 16897 322 17920 70 18182 1607 18183 1863 19977 2382 20224 79 20233 2383 20480 80 20736 81 20738 593 21764 1109 22528 88 22550 5720 22793 2393 23049 2394 23809 349 24335 3935 25602 612 25856 101 25858 613 26112 102 27392 107 27655 1900 27904 109 28160 110 28416 111 28928 113 29952 117 30208 118 30222 3702 30464 119 31746 636 34049 389 34560 135 35072 137 35584 139 36106 2701 36362 2702 36608 143 36618 2703 36874 2704 37905 4500 38400 150 38919 1944 39173 1433 39426 666 39429 1434 39936 156 39945 2460 40448 158 42250 2725 43520 170 44806 1711 45824 179 45826 691 47383 6073 47624 2234 47873 443 47878 1723 48385 445 49166 3776 49664 194 49926 1731 50188 3268 50437 1477 50444 3269 50693 1478 51209 2504 52235 3020 53005 3535 53249 464 53510 1745 54285 3540 55309 3544 56070 1755 56579 989 56585 2525 56835 990 57347 992 57603 993 57859 994 58115 995 59397 1512 60674 749 62469 1524 62980 1270 64257 507 65040 4350 It seems that sometimes (!!!) getservbyname() will incorrectly return something ... Regards Guido adrian 2009/5/24 Guido Serassio guido.seras...@acmeconsulting.it: Hi, One user has reported a very strange problem using the cache_peer directive on 2.7 STABLE6 running on Windows: When using the following config:

    cache_peer 192.168.0.63 parent 3329 0 no-query
    cache_peer rea.acmeconsulting.loc parent 3328 3130

the result is always:

    2009/05/23 12:35:28| Configuring 192.168.0.63 Parent 192.168.0.63/3329/0
    2009/05/23 12:35:28| Configuring rea.acmeconsulting.loc Parent rea.acmeconsulting.loc/13/3130

Very odd. Debugging the code, I have found where the problem is situated. The following is GetService() from cache_cf.c:

    static u_short
    GetService(const char *proto)
    {
        struct servent *port = NULL;
        char *token = strtok(NULL, w_space);
        if (token == NULL) {
            self_destruct();
            return -1;              /* NEVER REACHED */
        }
        port = getservbyname(token, proto);
        if (port != NULL) {
            return ntohs((u_short) port->s_port);
        }
        return xatos(token);
    }

When the value of port->s_port is 3328, ntohs() always returns 13. Other values seem to work fine. Any idea ?
Regards Guido - Guido Serassio Acme Consulting S.r.l. - Microsoft Certified Partner Via Lucia Savarino, 1 10098 - Rivoli (TO) - ITALY Tel. : +39.011.9530135 Fax. : +39.011.9781115 Email: guido.seras...@acmeconsulting.it WWW: http://www.acmeconsulting.it/
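One observation on the pairs above: every printed value is a 16-bit byte swap of its input (3328 is 0x0D00 and 13 is 0x000D; 2 is 0x0002 and 512 is 0x0200), which is consistent with Windows' getservbyname() sometimes returning the parsed numeric value in host rather than network byte order. In C terms:

    /* each reported pair (i, converted) satisfies converted == swap16(i) */
    unsigned short
    swap16(unsigned short v)
    {
        return (unsigned short) ((v << 8) | (v >> 8));
    }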
Re: Very odd problem running squid 2.7 on Windows
Can you craft a small C program to replicate the behaviour? adrian 2009/5/24 Guido Serassio guido.seras...@acmeconsulting.it: Hi, One user has reported a very strange problem using the cache_peer directive on 2.7 STABLE6 running on Windows: When using the following config:

    cache_peer 192.168.0.63 parent 3329 0 no-query
    cache_peer rea.acmeconsulting.loc parent 3328 3130

the result is always:

    2009/05/23 12:35:28| Configuring 192.168.0.63 Parent 192.168.0.63/3329/0
    2009/05/23 12:35:28| Configuring rea.acmeconsulting.loc Parent rea.acmeconsulting.loc/13/3130

Very odd. Debugging the code, I have found where the problem is situated. The following is GetService() from cache_cf.c:

    static u_short
    GetService(const char *proto)
    {
        struct servent *port = NULL;
        char *token = strtok(NULL, w_space);
        if (token == NULL) {
            self_destruct();
            return -1;              /* NEVER REACHED */
        }
        port = getservbyname(token, proto);
        if (port != NULL) {
            return ntohs((u_short) port->s_port);
        }
        return xatos(token);
    }

When the value of port->s_port is 3328, ntohs() always returns 13. Other values seem to work fine. Any idea ? Regards Guido - Guido Serassio Acme Consulting S.r.l. - Microsoft Certified Partner Via Lucia Savarino, 1 10098 - Rivoli (TO) - ITALY Tel. : +39.011.9530135 Fax. : +39.011.9781115 Email: guido.seras...@acmeconsulting.it WWW: http://www.acmeconsulting.it/
Re: Is it really necessary for fatal() to dump core?
2009/5/19 Mark Nottingham m...@yahoo-inc.com: I'm going to push back on that; the administrator doesn't really have any need to get a core when, for example, append_domain doesn't start with '.'. Squid.conf is bloated as it is; if there are cases where a core could be conceivably useful, they should be converted to fatal_dump. From what I've seen they'll be a small minority at best... Well, I'd be interested in seeing some better-defined characteristics of stuff with some sort of defined expectations and behaviour. Like an API. :) Right now, fatal, assert, etc are all used interchangeably for quite a wide variety of reasons, and the codebase may be much better off if someone starts off by fixing these a bit. Adrian
Re: Is it really necessary for fatal() to dump core?
just make that behaviour configurable? core_on_fatal {on|off} Adrian 2009/5/19 Mark Nottingham m...@yahoo-inc.com: tools.c:fatal() dumps core because it calls abort. Considering that the core can be quite large (esp. on a 64bit system), and that there's fatal_dump() as well if you really want one, can we just make fatal() exit(1) instead of abort()ing? Cheers, -- Mark Nottingham m...@yahoo-inc.com
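A minimal sketch of that knob (the directive name and the plumbing are hypothetical, not existing squid.conf syntax):

    #include <stdio.h>
    #include <stdlib.h>

    static int core_on_fatal = 0;   /* would be set from squid.conf parsing */

    void
    fatal(const char *message)
    {
        fprintf(stderr, "FATAL: %s\n", message);
        if (core_on_fatal)
            abort();    /* keep the old core-dumping behaviour available */
        exit(1);
    }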
Re: 3.0 assertion in comm.cc:572
2009/5/11 Amos Jeffries squ...@treenet.co.nz: We have one user with a fairly serious production machine hitting this assertion. It's an attempted comm_read of a closed FD after reconfigure. Nasty, but I think the asserts can be converted to a no-op return. Does anyone know of a subsystem that would fail badly after a failed read with all its sockets and networking closed anyway? That will bite you later on if/when you want to move to supporting Windows overlapped IO / POSIX AIO style kernel async IO on network sockets. You don't want reads scheduled on FDs that are closed; nor do you want the FD closed during the execution of the read. Figure out what is scheduling a read / what is scheduling the completion incorrectly and fix the bug. Adrian
/dev/poll solaris 10 fixes
I'm giving my /dev/poll (Solaris 10) code a good thrashing on some updated Sun hardware. I've fixed one silly bug of mine in 2.7 and 2.HEAD. If you're running Solaris 10 and not using the /dev/poll code then please try out the current CVS version(s) or wait for tomorrow's snapshots. I'll commit whatever other fixes are needed in this environment here :) Thanks, Adrian
Squid-2/Lusca async io shortcomings..
Hi all, I've been braindumping my thoughts into the Lusca blog during some experimental development to eliminate the data copy in the disk store read path. This shows up as the number 1 CPU abuser in my test CDN deployment - where I see a 99% hit rate on a set of large objects (> 16meg). My first idea was to avoid having to paper over the storage code shortcomings with refcounted buffers, and modify various bits of code to keep the store-supplied read buffer around until the completion of said read IO. This mirrors the requirements of various other underlying async IO implementations such as POSIX AIO and Windows completion IO. Unfortunately the store layer and the async IO code don't handle event cancellation right (ie, you can't do it), but the temporary read buffer in async_io.c + the callback data pointer check papers over that. Store reads and writes may be scheduled and in flight when some other part of the code calls storeClose(), and nothing really tries to wait around for the read IO to complete. So either the store layer needs to be made slightly more sane (which I may attempt later), or the whole mess can stay a mess and be papered over by abusing refcounted buffers all the way down to the IO layer. Anyway, I know there are other developers out there working on filesystem code for Squid-3, and I'm reasonably certain (read: at last check a few months ago) the store layer and IO layers are just as grimy - so hopefully my braindumping will save some more of you a whole lot of headache. :) Adrian
Re: Feature: quota control
I'm looking at implementing this as part of a contract for squid-2. I was going to take a different approach - that is, I'm not going to implement quota control or management in squid; I'm going to provide the hooks to squid to allow external controls to handle the quota. adrian 2009/2/21 Pieter De Wit pie...@insync.za.net: Hi Guys, I would like to offer my time in working on this feature - I have not done any squid dev, but since I would like to see this feature in Squid, I thought I would take it on. I have briefly contacted Amos off-list and we agreed that there is no set-in-stone way of doing this. I would like to propose that we then start throwing around some ideas and let's see if we can get this into squid :) Some ideas that Amos quickly said: - based on delay pools - use of external helpers to track traffic. The way I see this happening is that a quota is like a pool that empties based on 2 classes - bytes and requests. Requests will be for things like the number of requests, i.e. a person is only allowed to download 5 exe's per day, or 5 requests of 1 meg, or something like that (it just popped into my head :) ). Bytes is a pretty straightforward one: the user is only allowed x amount of bytes per y amount of time. Anyways - let the ideas fly :) Cheers, Pieter
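A rough sketch of the "hooks for external controls" idea, in the shape of an external_acl_type style helper: squid writes one lookup key per line (say, the username) and the helper answers OK while the user is under quota and ERR once over. The request-count limit and the in-memory table are purely illustrative:

    #include <stdio.h>
    #include <string.h>

    #define MAXU 1024
    static struct { char user[64]; long count; } tab[MAXU];
    static int nusers = 0;

    /* toy accounting: count requests per user; a real helper would track
     * bytes in shared storage and reset per accounting period */
    static long
    bump(const char *user)
    {
        int i;
        for (i = 0; i < nusers; i++)
            if (strcmp(tab[i].user, user) == 0)
                return ++tab[i].count;
        if (nusers < MAXU) {
            snprintf(tab[nusers].user, sizeof(tab[nusers].user), "%s", user);
            tab[nusers].count = 1;
            return tab[nusers++].count;
        }
        return 1;
    }

    int
    main(void)
    {
        char line[8192];
        const long limit = 100;            /* illustrative: 100 requests */
        setvbuf(stdout, NULL, _IOLBF, 0);  /* flush each reply promptly */
        while (fgets(line, sizeof(line), stdin)) {
            line[strcspn(line, "\r\n")] = '\0';
            puts(bump(line) <= limit ? "OK" : "ERR");
        }
        return 0;
    }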
Resigning from squid-core
Hi all, It's been a tough decision, but I'm resigning from any further active role in the Squid core group and cutting back on contributing towards Squid development. I'd like to wish the rest of the active developers all the best in the future, and thank everyone here for helping me develop and test my performance and feature related Squid work. Adrian
Re: Buffer/String split, take2
2009/1/21 Kinkie gkin...@gmail.com: What I fear from the DC approach is that we'll end up with lots of duplicate code between the 'buffer' classes, to gain a tiny little bit of efficiency and semantic clarity. If that approach has to be taken, then I'd rather take the variant of the note - in fact that's quite in line with what the current (agreeably ugly) code does. The trouble is that the current, agreeably ugly code actually works (for values of "works") right now, and the last thing the project needs is for that "works" bit to be disturbed too much. In my opinion the 'universal buffer' model can be adapted quite easily to address different uses by extending its allocation strategy - it's a self-contained function of code exactly for this purpose, and it could be extended again by using Strategy patterns to do whatever the caller wishes. It would be trivial, for instance, for users to request that the underlying memory be allocated by the pageful, or to request preallocation of a certain amount of memory if they know they'll be using it, etc. Having a wide interface is a drawback of the Universal approach. But you don't know how that memory should be arranged. If it's just for strings, then you know the memory should be arranged in whatever makes sense to minimise memory allocator overheads. In the parsing codepath, that involves parsing and creating references to an already-allocated large chunk of RAM, instead of copying into separately allocated areas. For things like disk IO (and later on, network IO too!) this may not be as obvious a case. In fact, based on the -provider- (anonymous? disk? network? some peer module?) you may want to request pages from -them- to put data into, for various reasons, as simply grabbing an anonymous page from the system allocator and filling it with data may need -another- copy step later on. This is why I'm saying that right now, focusing on -just- the String stuff and the minimum required to do copy-free parsing and copying in and out of the store is probably the best bet. A universal buffer method is probably over-reaching things. There's a lot of code in Squid which needs tidying up, and whatever we come up with, -all- of it -has- to happen -regardless- of what buffer abstraction(s) we choose. Regarding vector i/o, it's almost a no-brainer at a first glance: given UniversalBuffer, implement UniversalBufferList and make MemBuf use the latter to implement producer-consumer semantics. Then use this for writev(). produce and consume then become extremely lightweight calls. Let me remind you that currently MemBuf happily memmoves contents at each consume, and the other producer-consumer classes I could find (BodyPipe and StoreEntry) are entirely different beasts, which would benefit from having their interfaces changed to use UniversalBuffers, but probably not their innards. And again, what I'm saying here is that a conservative, cautious approach now is likely to save a lot of risk in the development path forward. Regarding Adrian's proposal, he and I discussed the issue extensively. I don't agree with him that the current String will give us the best long-term benefits. My expectation is (but we can only know after we have at least some extensive use of it) that the cheap substringing features of the current UniversalBuffer implementation will give us substantial benefits in the long term. I agree with him that fixing the most broken parts of the String interface is a sensible strategy for merging whatever String implementation we end up choosing.
I fear that if we focus too much on the long-term, we may end up losing sight of the medium-term, and thus we risk reaching neither because short-term no one does anything. EVERYONE keeps on asserting that squid (2 and 3) has low-level issues to be fixed, yet at the same time only Adrian does something in squid-2, and I feel I'm the only one trying to do something in squid-3 - PLEASE correct me and prove me wrong. *shrug* I think people keep choosing the wrong bits to bite off. I'm not specifically talking about you, Kinkie; this certainly isn't the only instance where the problem isn't really fully understood. The problem in my eyes is that no one understands the entire Squid-3 codebase well enough to start to understand what needs to happen and begin engineering an actual path forward. Everyone knows their little corner of the codebase. Squid-3 seems to be plagued by little mini-projects which focus on specific areas without much knowledge of how it all holds together, and all kinds of busted behaviour ensues. There's another issue which worries me: the current implementation has been in the works for 5 months; there have been two extensive reviews, two half-rewrites and endless discussions. Now the issue crops up that the basic design - whose blueprint has also been available for 5 months in the wiki - is not good, and that we may end up having to basically start from scratch. How can we as
Re: Ref-counted strings in Squid-2/Cacheboy
I'd like to avoid having to write to those pages if possible. Leaving the incoming data as read-only will save another write-back pass for those pages through the cache/bus, and in the case of tiny objects (ie, where parsing becomes a -big- part of the overhead), that may end up hurting. NUL terminated strings make iteration easier (you only need an address register and a check for 0), but current CPUs with their plenty-of-registers and superscalar execution mostly make that point moot. You can check, increment the pointer and decrement a length value pretty damned quickly. :) There aren't all that many places that assume C buffer semantics for String. Most of it isn't all that hairy (access_log, etc); some of it is only hairy because of the use of _C_ string library functions with String.buf() (ftp); the biggest annoyance is the vary code and the client-side code. Oh, and one has to copy the buffer anyway for regexp lookups (the POSIX regex API requires a NUL terminated string), at least until we convert to PCRE, which can and does take a length parameter to a regex run function. :) The point is, once you've been forced to tidy up the String users by removing the assumption that a NUL will occur, you'll (hopefully) have been forced to write nicer replacement code, and everyone benefits from that. Adrian 2009/1/21 Henrik Nordstrom hen...@henriknordstrom.net: On Fri, 2009-01-16 at 12:53 -0500, Adrian Chadd wrote: So far, so good. It turns out doing this as an intermediary step worked out better than trying to replace the String code in its entirety with replacement code which doesn't assume NUL terminated strings. Just a thought, but is there really any parsing step where we can not just overwrite the next octet with a \0 to get null-terminated strings? This is what the parser does today, right? The HTTP parser certainly can in-place null-terminate everything. Header names always end with a ':' which we always throw away, and the data ends with a newline which is also thrown away. Regards Henrik
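The PCRE point above in miniature: pcre_exec() takes an explicit subject length, so a counted (non-NUL-terminated) buffer can be matched without a terminating copy. A self-contained sketch against the classic libpcre API:

    #include <pcre.h>
    #include <stdio.h>

    int
    main(void)
    {
        const char *err;
        int erroff, ovec[30];
        pcre *re = pcre_compile("^GET", 0, &err, &erroff, NULL);
        /* pretend buf is a counted region with no NUL at buf[16] */
        const char buf[] = "GET / HTTP/1.0\r\n";

        if (!re)
            return 1;
        /* the length is passed explicitly; no terminator needed */
        printf("rc=%d\n", pcre_exec(re, NULL, buf, 16, 0, 0, ovec, 30));
        pcre_free(re);
        return 0;
    }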
Re: IRC Meetup logs up in the wiki
Uhm, guess I go on holiday and miss out on EVERYTHING. I got back on the 17th and would have loved to attend, had I the presence of mind to have checked. :) Hey, someone got a holiday! Quick, he's relaxed enough now to work! :) Sorry guys. In other news I've got some new exposed counters for squid-2 performance - will port to 3.1 and then submit for review. Also planning to extend cachemgr to output in xml as an alternative; it will allow far simpler processing and xsl transforms. Do you have the patches against Squid-2 available? adrian Extended cacti monitoring of all relevant bits is in process and will be available soon. Regardt
Re: Buffer/String split, take2
2009/1/20 Alex Rousskov rouss...@measurement-factory.com: Please voice your opinion: which design would be best for Squid 3.2 and the foreseeable future. [snip] I'm about 2/3rds of the way along the actual implementation path of this in Cacheboy, so I can provide an opinion based on increasing amounts of experience. :) [Warning: long, somewhat rambly post follows, from said experience.] The thing I'm looking at right now is what buffer design is required to adequately handle the problem set. There are a few things which we currently do very stupidly in any Squid related codebase:

* storeClientCopy - which Squid-2.HEAD and Cacheboy avoid the copy on, but it exposes issues (see below);
* storeAppend - the majority of data coming -into- the cache (ie, anything from an upstream server; very applicable today for forward proxies, not as applicable for high-hit-rate reverse proxies) is still memcpy()'ed, and this can use up a whole lot of bus time;
* creating strings - most strings are created during parsing; few are generated themselves, and those which are, are at least half static data which shouldn't be re-generated over and over and over again;
* duplicating strings - httpHeaderClone() and friends - dup'ing happens quite often, and making it cheap for the read-only copies which are made would be fantastic;
* later on, being able to use it for disk buffers, see below;
* later on, being able to properly use it for the memory cache, again see below.

The biggest problems I've hit thus far stem from the data pipeline from server -> memstore -> store client -> client side. At the moment, the storeClientCopy() call aggregates data across the 4k stmem page size (at least in squid-2/cacheboy; I think it's still 4k in squid-3) and thus if your last access gave you half a page, your next access can get data from both the other half of the page and whatever is in the next buffer. Just referencing the stmem pages in 2.HEAD/Cacheboy means that you can (and do) end up with a large number of small reads from the memory store. You save on the referencing, but fail on the work chunk size. You end up having to have a sensible reference counted buffer design -and- a vector list to operate on it with. The string type right now makes sense if it references a contiguous, linear block of memory (ie, a sub-region of a contiguous buffer). This is how it's treated today. For almost all of the lifting inside Squid proper, that may be enough. There may however be a need later on for string-like and buffer-like operations on buffer -vectors- - for example, if you're doing some kind of content scanning over incoming data, you may wish to buffer your incoming data until you have enough data to match that string which is straddling two buffers - and the current APIs don't support it. Well, nothing in Squid supports it currently, but I think it's worth thinking about for the longer term. Certainly though, I think that picking a sensible string API with absolutely no direct buffer access out of a few controlled areas (eg, translating a list of strings or list of buffers into an iovec for writev(), for example) is the way to go. That will equip Squid with a decent enough set of tools to start converting everything else which currently uses C strings over to using Squid Strings, and eventually reap the benefits of the zero-cost string duplication.
Ok, to summarise, and this may not exactly be liked by the majority of fellow developers: I think that augmenting/fixing the current SquidString API and tidying up all the bad places where it's used right now is going to give you the maximum long-term benefit. There's a lot of legacy code right now which absolutely needs to be trashed and rewritten. I think the smartest path forward is to ignore 95% of the decision about which buffering method to use for now, fix the current String API and all the code which uses it so it's sensible (and fixing it so it's sensible won't take long; fixing the code which uses it will take longer), and at that point the codebase will be in much better shape to decide which will be the better path forward. Now, just so people don't think I'm stirring trouble, I've gone through this myself in both a squid-2 branch and Cacheboy, and here's what I found:

* there's a lot of code which uses C strings created from Strings;
* there's a lot of code which init'ed strings from C strings, where the length was already known and thrown out;
* there's a lot of code which init'ed strings from C strings which were once Strings;
* there's even code which init's strings -from- a string, but only by using strBuf(s) (I'm pointing at the http header related code here, ugh);
* all the stuff which directly accesses the string buffer code can and should be tossed, immediately - unfortunately there's a lot of it, the majority being in what I gather is very long-lived code in src/client_side.c (and what it became in squid-3).

So what I'm sort of doing now in Cacheboy-head, combined
Ref-counted strings in Squid-2/Cacheboy
I've just created a branch off of my Cacheboy tree and dumped the first set of changes relating to ref-counted strings into it. They're not as useful and flexible as the end-goal we all want - specifically, this pass just creates ref-counted NUL-terminated C strings, so creating references of regions of other strings / buffers isn't possible. But it does mean that duplicating header sets (ie, httpHeaderClone() I think?) becomes bloody cheap. The next move - removing the requirement for the NUL-termination - is slightly hairer, but still completely doable (and I've done it in a previous branch in sourceforge, so I know what's required.) That's when the real benefits start to appear. So far, so good. It turns out doing this as an intermediary step worked out better than trying to replace the String code in its entirety with replacement code which doesn't assume NUL-terminated strings. http://code.google.com/p/cacheboy/source/list?path=/branches/CACHEBOY_HEAD_strref This, and all the other gunk that's gone into cacheboy over the last few months during the reorganisation and tidyup, still mostly represents where I think the Squid core codebase should have gone / should be going at the present time. Enjoy. :) Adrian
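In rough terms, that first pass looks like this. This is a hand-waved illustration rather than the actual branch code, and names like RefStr are invented:

    /* Sketch (invented names) of the intermediate step described above:
     * reference-counted, NUL-terminated C strings. Duplication becomes
     * an increment; only the final release frees the buffer. */
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        int refcount;
        size_t len;
        char *buf;   /* NUL-terminated for now; that requirement goes later */
    } RefStr;

    RefStr *refstr_create(const char *s, size_t len)
    {
        RefStr *r = malloc(sizeof(*r));   /* error handling omitted */
        r->refcount = 1;
        r->len = len;
        r->buf = malloc(len + 1);
        memcpy(r->buf, s, len);
        r->buf[len] = '\0';
        return r;
    }

    /* "Cloning" a header value is now O(1): no memcpy(). */
    RefStr *refstr_ref(RefStr *r)
    {
        r->refcount++;
        return r;
    }

    void refstr_unref(RefStr *r)
    {
        if (--r->refcount == 0) {
            free(r->buf);
            free(r);
        }
    }

With something like this in place, httpHeaderClone() only bumps refcounts on each header value instead of copying every buffer.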
Re: [PATCH] WCCPv2 documentation and cleanup for bug 2404
Have you tested these changes against various WCCPv2 implementations? I do recall some structure definitions in the draft mis-matching the wide number of IOS versions out there, which is why I'm curious. Adrian 2009/1/10 Amos Jeffries squ...@treenet.co.nz: This patch: - adds a reference to each struct mentioning the exact draft RFC section where that struct is defined. - fixes sent mask structure fields to match draft. (bug 2404) - removes two duplicate useless structs Submitting as a patch to give anyone interested time to double-check the code changes. As a result we are a step closer toward splitting the code into a separate library. It's highlighted some of the WCCPv2 issues and a pathway forward is now clear: - move type definitions to a protocol types header (wccp2_types.h ?) - correct mangled definitions for generic use, including code in that. - add capability handling - add hash/mask service negotiation - add sibling peer discovery through WCCP group details ?? Amos -- Please be using Current Stable Squid 2.7.STABLE5 or 3.0.STABLE11 Current Beta Squid 3.1.0.3
Re: When can we make Squid using multi-CPU?
2009/1/8 Alex Rousskov rouss...@measurement-factory.com: SMP support has been earmarked for Squid v3.2 but there are currently not enough resources to make it happen (AFAICT) so it may have to wait until v3.3 or later. FWIW, I think that multi-core scalability in many environments would not require another Squid rewrite, especially if initial support does not have to do better than running multiple Squids. Well, people are already doing that where it's suitable. What's really missing for those sorts of setups is a simple(!) storage-only backend and some smarts in Squid to be able to push and pull stuff out of a shared storage backend, rather than relaying through it. The trouble, as I've found here, is if you're trying to aggregate a bunch of forward proxy squid instances on one box through one backend squid instance - all of a sudden you end up with lots of RAM wastage and things die at high loads with all the duplicate data floating around in socket buffers. :/ Adrian
Re: When can we make Squid using multi-CPU?
I've been looking into what would be needed to thread squid as part of my cacheboy squid-2 fork. Basically, I've been working on breaking out a bunch of the core code into libraries, which I can then check and verify are thread-safe. I can then use these bits in threaded code. My first goal was probably to break out the ACL and internal URL rewriter code into threads, but the current use of the callback data setup in Squid makes passing cbdata pointers into other threads quite, uhm, tricky. The basic problem is that although a given chunk of memory backing a cbdata pointer will remain valid for as long as the reference exists, the -data itself- may not be valid at any given point. So if thread A creates a cbdata pointer and passes it into thread B to do something (say an ACL lookup), there's no way (at the moment) for thread B to guarantee at any/all points during its execution that the data in B will stay valid without a whole lot of pissing around with locking, which I'd absolutely like to avoid doing in a high-performance network application, even given the apparently wonderful performance current hardware has w/ lots of locking. :) So for the time being, I'm looking at what would be needed for a basic inter-thread batch event/callback message queue, sort of like AsyncCalls in squid-3 but minus 100% of the legacy cruft; and then I'll see what kind of tasks can be pushed out to the threads (a sketch of the idea follows this message). Hopefully a bunch of stuff can be easily pushed out to threads with a minimum amount of effort, such as some/all of the ACL lookups, some URL rewriting, some GZIP and other kinds of basic content manipulation, and the freakishly simple (comparatively) server-side HTTP code (src/http.c). But doing that requires making sure a bunch of the low-level code is suitably re-entrant/thread-safe/etc, and this includes a -lot- of stuff (lib/, debug, logging, memory allocation, some statistics gathering, chunks of the HTTP parsing and packing routines, the packer routines, membufs, etc.) Thankfully (in Cacheboy) I've broken out almost all of the needed stuff into top-level libraries which can be independently audited for thread-happiness. There are just some loose ends which need tidying up. For example, almost all of the code in libhttp/ in cacheboy (ie, basic http header and header entry stuff, parsing, range request headers, cc, headers, etc) is thread-safe, but the functions -they- call (such as the base64 functions) use static buffers which may or may not be thread-safe. Stuff which calls the legacy non-safe inet_* routines, or perhaps the non-thread-safe strtok() and other string.h functions, all needs to be fixed. Threading the rest of it would take a lot, -lot- more time. A thread-aware storage backend (disk, memory, store index) is definitely an integral part of making a threaded Squid, and a whole lot more code modularity and reorganisation would have to take place for that to occur. Want to help? :) Adrian 2009/1/4 ShuXin Zheng zhengshu...@gmail.com: I've done this before, running multiple squids on one machine to use multiple CPUs, but they can't share the same store-fs, and multiple IPs must be configured on the same machine. Can we rewrite squid as follows: thread0 (client side, non-blocking, accepting many connections) plus thread1..threadN (N = number of CPUs), where each worker thread runs the same pipeline: access check -> http header parse -> acl filter -> check local cache -> forward (to neighbor/webserver) -> store fs (ufs/aufs/coss).
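A minimal sketch of the kind of inter-thread batch callback queue described above, using POSIX threads. None of these names exist in Squid or Cacheboy; this illustrates the pattern, not real code:

    /* Sketch of an inter-thread batch callback queue (invented names).
     * Producer threads push calls; a worker pops the whole batch at
     * once, taking the lock once per batch rather than once per call. */
    #include <pthread.h>
    #include <stdlib.h>

    typedef struct Call {
        void (*fn)(void *arg);
        void *arg;
        struct Call *next;
    } Call;

    typedef struct {
        pthread_mutex_t lock;
        pthread_cond_t wakeup;
        Call *head, *tail;
    } CallQueue;

    #define CALLQUEUE_INIT \
        { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL, NULL }

    void callqueue_push(CallQueue *q, void (*fn)(void *), void *arg)
    {
        Call *c = malloc(sizeof(*c));   /* error handling omitted */
        c->fn = fn; c->arg = arg; c->next = NULL;
        pthread_mutex_lock(&q->lock);
        if (q->tail) q->tail->next = c; else q->head = c;
        q->tail = c;
        pthread_cond_signal(&q->wakeup);
        pthread_mutex_unlock(&q->lock);
    }

    /* Blocks until at least one call is queued; returns the whole list. */
    Call *callqueue_pop_all(CallQueue *q)
    {
        pthread_mutex_lock(&q->lock);
        while (!q->head)
            pthread_cond_wait(&q->wakeup, &q->lock);
        Call *batch = q->head;
        q->head = q->tail = NULL;
        pthread_mutex_unlock(&q->lock);
        return batch;
    }

The batching is the point: an ACL or rewriter thread drains a batch, runs each call, and pushes the completion callbacks back onto the main thread's queue the same way.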
2009/1/4 anest...@cisdi.com: I've found the best way is to run multiple copies of squid on a single machine, and use LVS to load balance between the squid processes. -- Joe Quoting Adrian Chadd adr...@squid-cache.org: when someone decides to either help code it up, or donate towards the effort. adrian 2009/1/3 ShuXin Zheng zhengshu...@gmail.com: Hi, Squid can currently only use one CPU, but multi-CPU hardware machines are now common; this is greatly wasteful. How can we use multiple CPUs? Can we separate some CPU-intensive sections to run in parallel on different CPUs? OMP (http://openmp.org/wp/) gives us some ideas about using multiple CPUs; can we use this technology in Squid? Thanks
Re: When can we make Squid using multi-CPU?
when someone decides to either help code it up, or donate towards the effort. adrian 2009/1/3 ShuXin Zheng zhengshu...@gmail.com: Hi, Squid can currently only use one CPU, but multi-CPU hardware machines are now common; this is greatly wasteful. How can we use multiple CPUs? Can we separate some CPU-intensive sections to run in parallel on different CPUs? OMP (http://openmp.org/wp/) gives us some ideas about using multiple CPUs; can we use this technology in Squid? Thanks -- zsxxsz
Re: Introductions
Welcome! 2008/12/30 Regardt van de Vyver sq...@vdvyver.net: Hi Dev Team. My name is Regardt van de Vyver, a technology enthusiast who tinkers with squid on a regular basis. I've been involved in development for around 12 years and am an active participant on numerous open source projects. Right now I'm focussed on improving and extending performance metrics for squid, specifically related to SNMP and the cachemanager. I'd like to take a more active role in the coming year from a dev perspective and feel the 1st step here is to at least get my butt onto the dev mailing list ;-) I look forward to getting involved. Regards, Regardt van de Vyver
src/debug.cc : amos?
Amos, what's this for in src/debug.cc ? //*AYJ:*/if (!Config.onoff.buffered_logs) fflush(debug_log); Adrian
Re: Migrating debug code from src/ to src/debug/
Ok, besides the missing build dependency on src/core and src/debug, I think the first round of changes is finished. That is, the ctx/debug routines and all that they depend on have been shuffled out of src/ and into src/core / src/debug as appropriate. I've pushed the changes to the launchpad URL mentioned previously. I'd like some feedback and some assistance figuring out how/where to convince src/Makefile.am that the two above directories are build prereqs for almost everything. There are a -lot- of build targets in that Makefile under Squid-3 and I'm not sure that I want to add to the mess in a naive way. Thanks, Adrian
Re: X-Vary-Options support
2008/12/20 Mark Nottingham m...@yahoo-inc.com: I agree. My impression was that it's pretty specific to their requirements, not a good general solution. Well, I'm all ears about a slightly more flexible solution. I mean, this is an X-* header; we could simply document it as a Squid specific feature once a few basic concerns have been addressed, and leave nutting out the right solution to the IETF group. :) Adrian
Re: Migrating debug code from src/ to src/debug/
2008/12/18 Adrian Chadd adr...@freebsd.org: I've begun fiddling with migrating the bulk of the debug code out of src/ and into src/debug/; as per the source reorganisation wiki page. The next step is migrating some other stuff out and doing some API hiding hijinx of the debugging logfile code - a bunch of code directly frobs the debug log fd/filehandle for various nefarious purposes. Grr. The other next thing is to sort out where to put the SquidTime stuff, which is used by the debug code. I'll create src/core for now in my branch to put this random stuff; I'll worry about the final destination for it all later. I couldn't tease apart ctx and debug all that much in cacheboy (and I couldn't figure out how it should or may be done as an exercise either) so I'll just lump them together. Adrian
Re: Migrating debug code from src/ to src/debug/
Would someone perhaps enlighten me why Squid-3 is trying to install src/SquidTime.h as part of some build rule, and why moving it out of the way (into src/core/) has resulted in make install completely failing? I'm having some real trouble understanding all of the gunk that's in the Squid-3 src/Makefile.am and it's starting to give me a headache. Thanks, Adrian
Migrating debug code from src/ to src/debug/
I've begun fiddling with migrating the bulk of the debug code out of src/ and into src/debug/; as per the source reorganisation wiki page. The first step is to just relocate the syslog facility code out, which I've done. The next step is to break out the debug code which handles the actual debugging into src/debug/. The changes can be viewed at http://bazaar.launchpad.net/~adrian-squid-cache/squid/adrian_src_reorganise/ . I'll post again when I've finished the debug code shuffle so I can figure out the right way to submit the change request. Adrian
X-Vary-Options support
Hi, I've got a small contract to get Squid going in front of a small group of Mediawiki servers and one of the things which needs adding is the X-Vary-Options support. So is there any reason whatsoever that it can't be committed to Squid-2.HEAD as-is, and at least backported (but not committed to start with) to squid-2.7? I remember the Wiki guys' issues wrt Variant purging, which I'm hoping Y! and Benno have sorted out, and I'm not looking to commit anything relating to that now - just the X-Vary-Options support. Thanks, Adrian
Re: Request for new round of SBuf review
Howdy, As most of you aren't aware, Kinkie, Alex and I had a bit of a discussion about this on IRC rather than on the mailing list, so there's probably some other stuff which should be posted here. Kinkie, are you able to post some updated code + docs after our discussion? My main suggestion to Kinkie was to take his code and see how well it worked with some test use cases - the easiest and most relevant one being parsing HTTP requests and building HTTP replies. I think that a few test case implementations outside of the Squid codebase will be helpful in understanding the issues which this sort of class is trying to solve. I would really be against integrating it into Squid mainline until we've all had a chance to play with it without being burdened by the rest of Squid. :) Adrian 2008/12/4 Kinkie [EMAIL PROTECTED]: Hi all, I feel that SBuf may just be complete enough to be considered a viable replacement for SquidString, as a first step towards integration. I'd appreciate anyone's help in giving it a check to gather feedback and suggestions. Doxygen documentation for the relevant classes is available at http://eu.squid-cache.org/~kinkie/sbuf-docs/ , the code is at lp:~kinkie/squid/stringng (https://code.launchpad.net/~kinkie/squid/stringng). Thanks! -- /kinkie
Re: The cache deny QUERY change... partial rollback?
2008/12/1 Henrik Nordstrom [EMAIL PROTECTED]: After analyzing a large cache with significantly declining hit ratio over the last months I have come to the conclusion that the removal of cache deny QUERY can have a very negative impact on hit ratio, this due to a number of flash video sites (youtube, google, various porno sites etc) which include per-view unique query parameters in the URL and respond with a cachable response. Because of this I suggest that we add back the cache deny rule in the recommended config, but leave the refresh_pattern change as-is. People running reverse proxies or combating these cache busting sites using store rewrites know how to change the cache rules, while many users running general proxy servers are quite negatively impacted by these sites if caching of query urls is allowed. Hm, that's kind of interesting actually. What's it displacing from the cache? Is the drop of hit ratio due to the removal of other cachable large objects, or other cachable small objects? Is it -just- flash video that's exhibiting this behaviour? Are you able to put up some examples and statistics? I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content. Adrian
Re: Rv: Why not BerkeleyDB based object store?
I thought about it a while ago but I'm just out of time to be honest. Writing objects to disk only if they're popular, or if you need the RAM to handle concurrent accesses for large objects for some reason, would probably way, way improve disk performance as the amount of writing would drop drastically. Sponsorship for investigating and developing this is gladly accepted :) Adrian 2008/11/26 Mark Nottingham [EMAIL PROTECTED]: Just a tangential thought; has there been any investigation into reducing the amount of write traffic with the existing stores? E.g., establishing a floor for reference count; if it doesn't have n refs, don't write to disk? This will impact hit rate, of course, but may mitigate in situations where disk caching is desirable, but writing is the bottleneck... On 26/11/2008, at 9:14 AM, Kinkie wrote: On Tue, Nov 25, 2008 at 10:23 PM, Pablo Rosatti [EMAIL PROTECTED] wrote: Amazon uses BerkeleyDB for several critical parts of its website. The Chicago Mercantile Exchange uses BerkeleyDB for backup and recovery of its trading database. And Google uses BerkeleyDB to process Gmail and Google user accounts. Are you sure BerkeleyDB is not a good idea to replace the Squid filesystems, even COSS? Squid3 uses a modular storage backend system, so you're more than welcome to try to code it up and see how it compares. Generally speaking, the needs of a data cache such as squid are very different from those of a general-purpose backend storage. Among the other key differences: - the data in the cache has little or no value. it's important to know whether a file was corrupted, but it can always be thrown away and fetched from the origin server at a relatively low cost - workload is mostly writes: a well-tuned forward proxy will have a hit-rate of roughly 30%, which means 3 writes for every read on average - data is stored in incremental chunks Given these characteristics, a long list of mechanisms database-like systems have such as journaling, transactions etc. are a waste of resources. COSS is explicitly designed to handle a workload of this kind. I would not trust any valuable data to it, but it's about as fast as it gets for a cache. IMHO BDB might be much more useful as a metadata storage engine, as those have a very different access pattern than a general-purpose cache store. But if I had any time to devote to this, my priority would be in bringing 3.HEAD COSS up to speed with the work Adrian has done in 2. -- /kinkie -- Mark Nottingham [EMAIL PROTECTED]
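Mark's reference-count floor is cheap to express in code. A hypothetical sketch follows; the names and the tunable are invented, not actual Squid symbols:

    /* Hypothetical sketch of the idea above: only swap an object out
     * to disk once it has proven popular enough. Invented names. */
    #define MIN_REFS_FOR_DISK 2     /* tunable floor */

    struct CacheEntry {
        int refcount;       /* how many times this object has been requested */
        int swapped_out;    /* already written to disk? */
    };

    void maybe_swap_out(struct CacheEntry *e)
    {
        if (e->swapped_out)
            return;
        if (e->refcount < MIN_REFS_FOR_DISK)
            return;         /* unpopular: stay memory-only, save the write */
        /* ... schedule the disk write here ... */
        e->swapped_out = 1;
    }

As Mark notes, the trade-off is hit rate on the second request versus a drastic cut in write traffic for objects only ever fetched once.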
Re: omit to loop-forever processing some regex acls
G'day! If these are patches against Squid-2 then please put them into the Squid bugzilla so we don't lose them. There's a different process for Squid-3 submissions. Thanks! Adrian 2008/11/26 Matt Benjamin [EMAIL PROTECTED]: -- Matt Benjamin The Linux Box 206 South Fifth Ave. Suite 150 Ann Arbor, MI 48104 http://linuxbox.com tel. 734-761-4689 fax. 734-769-8938 cel. 734-216-5309
Re: access_log acl not observing my_port
G'day! Just create a ticket in the Squid bugzilla and put the patch in there. Thanks for your contribution! Adrian 2008/11/13 Stephen Thorne [EMAIL PROTECTED]: G'day, I've been looking into a problem we've observed where this situation does not work as expected, this is in squid-2.7.STABLE4: acl direct myport 8080 access_log /var/log/squid/direct_proxy.log common direct I did some tracing through the code and established that this chain of events occurs: httpRequestFree calls clientAclChecklistCreate calls aclChecklistCreate But aclChecklistCacheInit is the function that populates the checklist->my_port, which is required for a myport acl to work, and it isn't called. I have attached a patch that fixes this particular problem for me, which simply calls aclChecklistCacheInit in clientAclChecklistCreate. -- Regards, Stephen Thorne Development Engineer NetBox Blue - 1300 737 060
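From the description, the fix amounts to one added call. A rough sketch, with prototypes approximated from the function names mentioned in the message rather than copied from squid-2.7 (the actual attached patch is not reproduced here):

    /* Approximate sketch of the described fix: call
     * aclChecklistCacheInit() from clientAclChecklistCreate() so that
     * checklist->my_port is populated. Prototypes are guesses based on
     * the message above, not the real squid-2.7 source. */
    aclCheck_t *
    clientAclChecklistCreate(const acl_access * acl, clientHttpRequest * http)
    {
        aclCheck_t *checklist = aclChecklistCreate(acl, http->request, NULL);
        aclChecklistCacheInit(checklist);   /* the missing call */
        return checklist;
    }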
delayed forwarding is in Squid-2.HEAD
G'day, I've just committed the delayed forwarding stuff into Squid-2.HEAD. Thanks, Adrian
Re: [PATCH] Check half-closed descriptors at most once per second.
2008/9/25 Alex Rousskov [EMAIL PROTECTED]: This revision resurrects the 1 check/sec limit, but hopefully with fewer bugs. In my limited tests, CPU usage seems to be back to normal. Woo, thanks! The DescriptorSet class has O(1) complexity for search, insertion, and deletion. It uses about 2*sizeof(int)*MaxFD bytes. The splay tree that previously stored half-closed descriptors uses less RAM for a small number of descriptors but has O(log n) complexity. The DescriptorSet code should probably get its own .h and .cc files, especially if it is going to be used by deferred reads. Could you do that sooner rather than later? I'd like to try using this code for deferred reads and delay pools. Thanks! Adrian
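The usual structure behind an O(1) insert/delete/membership set costing 2*sizeof(int)*MaxFD bytes is a dense array of members plus a reverse index. A generic sketch, not the actual DescriptorSet code:

    /* Generic sketch of an O(1) integer set over [0, maxfd): fds[]
     * packs the members densely, pos[fd] maps a descriptor to its slot
     * in fds[] (or -1 if absent). Not the real DescriptorSet code. */
    typedef struct {
        int *fds;     /* dense member list; count entries used */
        int *pos;     /* pos[fd] = index into fds[], or -1 */
        int count;
        int maxfd;
    } IntSet;

    int intset_has(const IntSet *s, int fd) { return s->pos[fd] >= 0; }

    void intset_add(IntSet *s, int fd)
    {
        if (intset_has(s, fd)) return;
        s->pos[fd] = s->count;
        s->fds[s->count++] = fd;
    }

    void intset_del(IntSet *s, int fd)
    {
        if (!intset_has(s, fd)) return;
        int slot = s->pos[fd];
        int last = s->fds[--s->count];
        s->fds[slot] = last;      /* move the last member into the hole */
        s->pos[last] = slot;
        s->pos[fd] = -1;
    }

Iteration over members is also cheap (walk fds[0..count-1]), which matters if deferred reads need to scan the set every event-loop pass.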
Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...
2008/9/23 Martin Langhoff [EMAIL PROTECTED]: Any way we can kludge our way around it for the time being? Does squid take any signal that gets it to shed its index? It'd be pretty trivial to write a few cachemgr hooks to implement that kind of behaviour. 'flush memory cache', 'flush disk cache entirely', etc. The trouble is that the index is -required- at the moment for the disk cache. If you flush the index you flush the disk cache entirely. There's no hard limit for squid and squid (any version) handles memory allocation failures very very poorly (read: crashes.) Is it relatively sane to run it with a tight rlimit and restart it often? Or just monitor it and restart it? It probably won't like that very much if you decide to also use disk caching. You can limit the amount of cache_mem which limits the memory cache size; you could probably modify the squid codebase to start purging objects at a certain object count rather than based on the disk+memory storage size. That wouldn't be difficult. Any chance of having patches that do this? I could probably do that in a week or so once I've finished my upcoming travel. Someone could try beating me to it.. The big problem: you won't get Squid down to 24meg of RAM with the current tuning parameters. Well, I couldn't; and I'm playing around with Squid on OLPC-like hardware (SBC with 500mhz geode, 256/512mb RAM.) Hmmm... It's something which will require quite a bit of development to slim some of the internals down to scale better with restricted memory footprints. It's on my personal TODO list (as it mostly is in line with a bunch of performance work I'm slowly working towards) but as the bulk of that is happening in my spare time, I do not have a fixed timeframe at the moment. Thanks for that -- at whatever pace, progress is progress. I'll stay tuned. I'm not on squid-devel, but generally interested in any news on this track; I'll be thankful if you CC me or rope me into relevant threads. Ok. Is there interest within the squid dev team in moving towards a memory allocation model that is more tunable and/or relies more on the abilities of modern kernels to do memory mgmt? Or an alternative approach to handle scalability (both down to small devices and up to huge kit) more dynamically and predictably? You'll generally find the squid dev team happy to move in whatever directions make sense. The problem isn't direction so much as the coding to make it happen. Making Squid operate well in small memory footprints turns out to be quite relevant to higher performance and scalability; the problem is in the doing. I'm hoping to start work on some stuff to reduce the memory footprint in my squid-2 branch (cacheboy) once the current round of IPv6 preparation is completed and stable. The developers working on Squid-3 are talking about similar stuff. Adrian
Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...
2008/9/24 Martin Langhoff [EMAIL PROTECTED]: Good hint, thanks! If we did have such a control, what is the wired memory that squid will use for each entry? In an email earlier I wrote... sizeof(StoreEntry) per index entry, basically. - Each index entry takes between 56 bytes and 88 bytes, plus additional, unspecified overhead. Is 1KB per entry a reasonable conservative estimate? 1KB per entry is pretty conservative. The per-object overhead includes the StoreEntry, the couple of structures for the memory/disk replacement policies, plus the MD5 URL for the index hash, and whatever other stuff hangs off MemObject for in-memory objects. You'll find that the RAM requirements grow a bit more for things like in-memory cache objects as the full reply headers stay in memory, and are copied whenever anyone wants to request it. - Discussions about compressing or hashing the URL in the index are recurrent - is the uncompressed URL there? That means up to 4KB per index entry? The uncompressed URL and headers are in memory during: * request/reply handling * in-memory objects (objects with MemObjects allocated); on-disk entries just have the MD5 URL hash per StoreEntry. HTH, Oh, and I'll be in the US from October for a few months; I can always do a side-trip out to see you guys if there's enough interest. Adrian
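Turning those figures into a capacity budget is simple arithmetic. A throwaway sketch, using the thread's conservative 1KB-per-entry figure (an estimate, not a measured constant):

    /* Throwaway estimate: how many cache objects fit in a given index
     * RAM budget at ~1KB/entry (the conservative figure from this
     * thread, not a measured constant). */
    #include <stdio.h>

    int main(void)
    {
        const long per_entry = 1024;            /* conservative bytes/entry */
        const long budgets_mb[] = { 24, 96 };   /* the XS targets */
        for (int i = 0; i < 2; i++) {
            long entries = budgets_mb[i] * 1024L * 1024L / per_entry;
            printf("%ld MB of index RAM -> ~%ld cached objects\n",
                   budgets_mb[i], entries);
        }
        return 0;   /* prints ~24576 and ~98304 objects respectively */
    }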
Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...
G'day, I've looked into this a bit (and have a couple of OLPC laptops to do testing with) and .. well, it's going to take a bit of effort to make squid fit. There's no hard limit for squid and squid (any version) handles memory allocation failures very very poorly (read: crashes.) You can limit the amount of cache_mem which limits the memory cache size; you could probably modify the squid codebase to start purging objects at a certain object count rather than based on the disk+memory storage size. That wouldn't be difficult. The big problem: you won't get Squid down to 24meg of RAM with the current tuning parameters. Well, I couldn't; and I'm playing around with Squid on OLPC-like hardware (SBC with 500mhz geode, 256/512mb RAM.) It's something which will require quite a bit of development to slim some of the internals down to scale better with restricted memory footprints. It's on my personal TODO list (as it mostly is in line with a bunch of performance work I'm slowly working towards) but as the bulk of that is happening in my spare time, I do not have a fixed timeframe at the moment. Adrian 2008/9/23 Martin Langhoff [EMAIL PROTECTED]: Hi! I am working on the School Server (aka XS: a Fedora 9 spin, tailored to run on fairly limited hw), I'm preparing the configuration settings for it. It's a somewhat new area for me -- I've set up Squid before on mid-range hardware... but this is... different. So I'm interested in understanding more about the variables affecting memory footprint and how I can set a _hard limit_ on the wired memory that squid allocates. In brief: - The workload is relatively light - 3K clients is the upper bound. - The XS will (in some locations) be hooked to *very* unreliable power... uncontrolled shutdowns are the norm. Is this ever a problem with Squid? - After a bad shutdown, graceful recovery is the most important aspect. If a few cached items are lost, we can cope... - The XS hardware runs many services (mostly web-based), so Squid gets only a limited slice of memory. To make matters worse, I *really* don't want the core working set (Squid, Pg, Apache/PHP) to get paged out. So I am interested in pegging the max memory Squid will take to itself. - The XS hw is varied. In small schools it may have 256MB RAM (likely to be running on XO hardware + usb-connected ext hard-drive). Medium-to-large schools will have the recommended 1GB RAM and a cheap SATA disk. A few very large schools will be graced with more RAM (2 or 4GB). .. so RAM allocation for Squid will prob range between 24MB at the lower-end and 96MB at the 1GB recommended RAM. My main question is: how would you tune Squid 3 so that - it does not allocate directly more than 24MB / 96MB? (Assume that the linux kernel will be smart about mmapped stuff, and aggressive about caching -- I am talking about the memory Squid will claim to itself). - still gives us good throughput? :-) So far Google has turned up very little info, and it seems to be rather old. What I've found can be summarised as follows: - The index is malloc'd, so the number of entries in the index will be the dominant concern WRT memory footprint. - Each index entry takes between 56 bytes and 88 bytes, plus additional, unspecified overhead. Is 1KB per entry a reasonable conservative estimate? - Discussions about compressing or hashing the URL in the index are recurrent - is the uncompressed URL there? That means up to 4KB per index entry?
- The index does not seem to be mmappable or otherwise. We can rely on the (modern) linux kernel doing a fantastic job at caching disk IO and shedding those cached entries when under memory pressure, so I am likely to set Squid's own cache to something really small. Everything I read points to the index being my main concern - is there a way to limit (a) the total memory the index is allowed to take or (b) the number of index entries allowed? Does the above make sense in general? Or am I barking up the wrong tree? cheers, martin -- [EMAIL PROTECTED] [EMAIL PROTECTED] -- School Server Architect - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff
Re: [MERGE] Connection pinning patch
It's a 900-odd line patch; granted, a lot of it is boilerplate for config parsing and management, but I recall the issues connection pinning had when it was introduced and I'd hate to try and be the one debugging whatever crazy stuff pops up in 3.1 combined with the changes to the workflow connection pinning introduces. I don't pretend to completely understand the implications for ICAP either. Is there any documentation for how connection pinning should behave with ICAP and friends? Is there any particular rush to get this in for this release at such a late point in the release cycle? Could we hold off on it until the next release, and just focus on getting what's currently in 3.HEAD released and stable? Adrian 2008/9/21 Tsantilas Christos [EMAIL PROTECTED]: Hi all, This patch fixes bug 1632 (http://www.squid-cache.org/bugs/show_bug.cgi?id=1632) It is based on the original squid2.5 connection pinning patch developed by Henrik (http://devel.squid-cache.org/projects.html#pinning) and the related squid 2.6 connection pinning code. Although I spent many hours looking at pinned connections I am still not absolutely sure that it does not have bugs. However the code is very similar to that in squid2.6 (where the pinning code has run for years) and I hope it will be easy to fix problems and bugs. Regards, Christos
Re: [MERGE] Connection pinning patch
2008/9/22 Alex Rousskov [EMAIL PROTECTED]: It would help if there was a document describing what connection pinning is and what the known pitfalls are. Do we have such a document? Is RFC 4559 enough? I'll take another read. I think we should look at documenting these sorts of features somewhere else though. If not, Christos, can you write one and have Adrian and others contribute pitfalls? It does not have to be long -- just a few paragraphs describing the basics of the feature. We can add that description to code documentation too. I'd be happy to help trawl over the 2.X code and see what it's doing. Henrik and Steven know the code better than I do; I've just spent some time figuring out how it interplays with load balancing to peers and such. ICAP and eCAP do not care about HTTP connections or custom headers. Is connection pinning more than connection management via some custom headers? Nope; it just changes the semantics a little and some code may assume things work a certain way. Since NTLM authentication forwarding appears to be a required feature for many and since the connection pinning patch is not trivial (but is not huge either), I would rather see it added now (after the proper review process, of course). It could be the right icing on the 3.1 cake for many users. I do realize that, like any 900-line patch, it may cause problems even if it is reviewed and tested. *nodnod* I'm just making sure the reasons for pushing it through are recorded somewhere during the process. Adrian
Re: Strategy
Put this stuff on hold, get Squid-3.1 out of the way, sort out the issues surrounding that before you start throwing more code into Squid-3 trunk, and -then- have this discussion. We can sort this stuff out in a short period of time if it's our only focus. Adrian 2008/9/22 Amos Jeffries [EMAIL PROTECTED]: On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote: Alex Rousskov wrote: * Look for simpler warts with localized impact. We have plenty of them and your energy would be well spent there. If you have a choice, do not try to improve something as fundamental and as critical as String. Localized single-use code should receive a lot less scrutiny than fundamental classes. Agreed, but that said: if picking one of the hard ones, as Kinkie did, causes a thorough discussion, as String has, and comes up with a good API, that's not just a step in the right direction but a giant leap. And worth doing if you can spare the time (months in some cases). The follow-on effects will be better and easier code in other areas depending on it. Amos, I think the above work-long-enough-and-you-will-make-it analysis and a few other related comments do not account for one important factor: cost (and the limited resources this project has). Please compare the following estimates (all numbers are very approximate, of course): Kinkie's time to draft a String class: 2 weeks Kinkie's time to fix the String class: 6 weeks Reviewers' time to find bugs and convince Kinkie that they are bugs: 2 weeks Total: 10 weeks Reviewer's time to write a String class: 3 weeks Total: 3 weeks Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the reviewers gain 1 week themselves. So I stand by it, if he feels strongly enough to do it. If you add to the above that one reviewer cannot review and work on something else at the same time, the waste goes well above 200%. Which is wrong. We can review one thing and work on another project. Compare the above with a regular project that does not require writing complex or fundamental classes (again, numbers are approximate): Kinkie's time to complete a regular project: 1 week Reviewer's time to complete a regular project: 1 week After which both face the hard project again. Which remains hard and could have cut off 5 days of the regular project. If we want Squid code to continue to be a playground for half-finished code and ideas, then we should abandon the review process. Let's just commit everything that compiles and that the committer is happy with. I assume you are being sarcastic. Otherwise, let's do our best to find a project for everyone, without sacrificing the quality of the output or wasting resources. For example, if a person wants String to implement his pet project, but cannot make a good String, it may be possible to trade String implementation for a few other pet projects that the person can do. Then that trade needs to be discussed with the person before they start. I get the idea you are trying to manage this FOSS like you would a company project. That approach has been tried and failed miserably in FOSS. This will not be smooth and easy, but it is often doable because most of us share the goal of making the best open source proxy. * When assessing the impact of your changes, do not just compare the old code with the one submitted for review. Consider how your classes stand on their own and how they _will_ be used. Providing a poor but easier-to-abuse interface is often a bad idea even if that interface is, in some aspects, better than the old hard-to-use one.
No one else is tackling the issues that I'm working on. Should they be left alone? Or should I aim for the perfect solution each time? Perfect varies, and will change as the baseline 'worst' code in Squid improves. The perfect API this year may need changing later. Aim for the best you can find to do, and see if it's good enough for inclusion. Right. The problems come when it is not good enough, and you cannot fix it on your own. I do not know how to avoid these ugly situations. Teamwork. Which I thought we were starting to get in the String API after earlier solo attempts by whoever wrote SquidString, and by myself on the BetterString mk1, mk2, mk3. I doubt any of us could do a good job of something so deep without help. Even you needed Henrik to review and find issues with AsyncCalls, maybe others I don't know about before that. The fact remains these things NEED someone to kick us into a team and work on it. For example, Alex had no issues with wordlist when it first came out. This was my first review of the proposed class, but I doubt it would have changed if I reviewed it earlier. Thank you, Alex. Amos
Re: Strategy
And in the meantime, if someone (eg kinkie) wants to work on this stuff some more, I suggest sitting down and writing some of the support code which would use it. Write an HTTP parser, an HTTP response builder, do some benchmarking, perhaps glue it to something like libevent or some other comm framework and do some benchmarking there. See how it performs, how it behaves, see if it does everything y'all want cleanly. _Then_ have this discussion. Adrian 2008/9/22 Adrian Chadd [EMAIL PROTECTED]: Put this stuff on hold, get Squid-3.1 out of the way, sort out the issues surrounding that before you start throwing more code into Squid-3 trunk, and -then- have this discussion. We can sort this stuff out in a short period of time if it's our only focus. Adrian 2008/9/22 Amos Jeffries [EMAIL PROTECTED]: [snip]
Re: Strategy
"only focus" should really have been "our main focus at that short period of time", not "the only thing we care about". Sheesh. :P Adrian 2008/9/22 Alex Rousskov [EMAIL PROTECTED]: On Mon, 2008-09-22 at 10:36 +0800, Adrian Chadd wrote: Put this stuff on hold, get Squid-3.1 out of the way, sort out the issues surrounding that before you start throwing more code into Squid-3 trunk, and -then- have this discussion. If this stuff is WordList, then put this stuff on hold is my suggestion as well. If this stuff is String, then I think the basic design choices can be discussed now, but waiting is even better for me, so I am happy to follow your suggestion :-). If this stuff is how we improve teamwork, then I am happy to continue any _constructive_ discussions since releasing 3.1 can benefit from teamwork as well. We can sort this stuff out in a short period of time if it's our only focus. The only focus? You must be dreaming :-). Alex. 2008/9/22 Amos Jeffries [EMAIL PROTECTED]: [snip]
Re: SBuf review
2008/9/19 Amos Jeffries [EMAIL PROTECTED]: I kind of fuzzily disagree; the point of this is to replace MemBuf + String with SBuf, not implement both again independently duplicating stuff. I'll say it again - ignore MemBuf. Ignore MemBuf for now. Leave it as a NUL-terminated dynamic buffer with some printf-append-like semantics. When you've implemented a non-NUL-terminated ref-counted memory region implementation and you layer some basic string semantics on top of it, you can slowly convert or eliminate the bulk of the MemBuf users. You're going to find plenty of places where the string handling is plain old horrible. Don't try to cater for those situations with things like NULL strings. I tried that; it's ugly. Aim to implement something which'll cater to something narrow to begin with - like parsing HTTP headers - and look to -rewrite- larger parts of the code later on. Don't try to invent things which will somehow seamlessly fit into the existing code and provide the same semantics. Some of said semantics is plain shit. I still don't get why this is again becoming so freakishly complicated. Adrian
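For concreteness, the shape being argued for (a string as a cheap window onto a shared, ref-counted, non-NUL-terminated region) looks roughly like this. All names are invented; this is not SBuf:

    /* Sketch (invented names): a string is an offset/length view over a
     * shared ref-counted region. Substrings and duplicates share the
     * backing storage; bounds checks and release logic omitted. */
    #include <stddef.h>

    typedef struct {
        int refcount;
        size_t size;
        char data[];        /* not NUL-terminated */
    } Region;

    typedef struct {
        Region *region;     /* shared backing store */
        size_t off, len;    /* the window this string sees */
    } StrView;

    /* O(1) substring: bump the region refcount, narrow the window. */
    StrView strview_substr(StrView s, size_t off, size_t len)
    {
        StrView sub = { s.region, s.off + off, len };
        sub.region->refcount++;
        return sub;
    }

Parsing an HTTP header line then becomes carving views out of the single read buffer, with no per-token allocation or copying.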
Re: [MERGE] WCCPv2 Config Cleanup
2008/9/13 Amos Jeffries [EMAIL PROTECTED]: This one was easy and isolated, so I went and did it early. It's back-compatible, so people don't have to use the new names if they don't want to. But it's clearer for the newbies until the big cleanup you mention below is stable. Well, the newbies still need to know about the different kinds of redirection/assignment methods; what would be nice is if it were mostly autonegotiated per-host per-service group, and if wccp2d could set up/tear down the GRE tunnels as required. The WCCPv2 stuff works fine (for what it does); it could do with some better documentation but what it really needs is to be broken out from Squid itself and run as a separate daemon. I've been waiting most of a year for your work in that direction in Squid-2 to be ported over. There does not appear to be any sign of it happening in time for 3.1. The rest of us are largely concentrating on cleaning other components. I still haven't done all that much with the WCCPv2 stuff yet. I'll be breaking out the source code in Cacheboy after I finish the next set of IPv6 changes; the wccp2d code will then use the config registry type stuff we've discussed and reuse the core code for comms, debugging, logging, etc. Adrian
Re: [MERGE] WCCPv2 Config Cleanup
The specification defines them as separate entities and using them in this fashion makes it clearer for people working on the code. Adrian 2008/9/13 Henrik Nordstrom [EMAIL PROTECTED]: On fre, 2008-09-12 at 20:39 +1200, Amos Jeffries wrote: +#define WCCP2_FORWARDING_METHOD_GRE WCCP2_METHOD_GRE +#define WCCP2_FORWARDING_METHOD_L2 WCCP2_METHOD_L2 +#define WCCP2_PACKET_RETURN_METHOD_GRE WCCP2_METHOD_GRE +#define WCCP2_PACKET_RETURN_METHOD_L2 WCCP2_METHOD_L2 Do we still need these? Why not use WCCP2_METHOD_ everywhere if they are the same value? Regards Henrik
Re: [MERGE] WCCPv2 Config Cleanup
Amos, why are you pushing through changes to the WCCP configuration stuff at this point in the game? The WCCPv2 stuff works fine (for what it does); it could do with some better documentation but what it really needs is to be broken out from Squid itself and run as a separate daemon. Adrian 2008/9/13 Henrik Nordstrom [EMAIL PROTECTED]: With the patch the code uses WCCP2_METHOD_.. in some places (config parsing/dumping) and the context-specific ones in other places. This is even more confusing. Very minor detail in any case. On lör, 2008-09-13 at 09:49 +0800, Adrian Chadd wrote: The specification defines them as separate entities and using them in this fashion makes it clearer for people working on the code. Adrian 2008/9/13 Henrik Nordstrom [EMAIL PROTECTED]: On fre, 2008-09-12 at 20:39 +1200, Amos Jeffries wrote: +#define WCCP2_FORWARDING_METHOD_GRE WCCP2_METHOD_GRE +#define WCCP2_FORWARDING_METHOD_L2 WCCP2_METHOD_L2 +#define WCCP2_PACKET_RETURN_METHOD_GRE WCCP2_METHOD_GRE +#define WCCP2_PACKET_RETURN_METHOD_L2 WCCP2_METHOD_L2 Do we still need these? Why not use WCCP2_METHOD_ everywhere if they are the same value? Regards Henrik
Re: squid-2.HEAD: storeCleanup and -F option (foreground rebuild)
I've committed a slightly modified version of this - store_rebuild.c r1.80 . Take a look and see if it works for you. Thanks! Adrian 2008/8/5 Alexander V. Lukyanov [EMAIL PROTECTED]: Hello! I use squid in transparent mode, so I don't want degraded performance while rebuilding and cleanup. Here is a patch I use to make storeCleanup do all the work at once before squid starts processing requests, when the -F option is specified on the command line. Index: store_rebuild.c === RCS file: /squid/squid/src/store_rebuild.c,v retrieving revision 1.80 diff -u -p -r1.80 store_rebuild.c --- store_rebuild.c 1 Sep 2007 23:09:32 - 1.80 +++ store_rebuild.c 5 Aug 2008 05:51:43 - @@ -68,7 +68,8 @@ storeCleanup(void *datanotused) hash_link *link_ptr = NULL; hash_link *link_next = NULL; validnum_start = validnum; -while (validnum - validnum_start < 500) { +int limit = opt_foreground_rebuild ? 1 << 30 : 500; +while (validnum - validnum_start < limit) { if (++bucketnum >= store_hash_buckets) { debug(20, 1) ("  Completed Validation Procedure\n"); debug(20, 1) ("  Validated %d Entries\n", validnum); @@ -147,8 +148,8 @@ storeRebuildComplete(struct _store_rebui debug(20, 1) ("  Took %3.1f seconds (%6.1f objects/sec).\n", dt, (double) counts.objcount / (dt > 0.0 ? dt : 1.0)); debug(20, 1) ("Beginning Validation Procedure\n"); -eventAdd("storeCleanup", storeCleanup, NULL, 0.0, 1); safe_free(RebuildProgress); +storeCleanup(0); } /*
Re: squid-2.HEAD: fwdComplete/Fail before comm_close
Hiya, Could you please verify this is still a problem in the latest 2.HEAD and if so lodge a bugzilla bug report with the patch? Thanks! Adrian 2008/8/5 Alexander V. Lukyanov [EMAIL PROTECTED]: Hello! Some time ago I had core dumps just after these messages: Short response from ... httpReadReply: Excess data from ... I believe this patch fixes these problems. Index: http.c === RCS file: /squid/squid/src/http.c,v retrieving revision 1.446 diff -u -p -r1.446 http.c --- http.c 25 Jun 2008 22:11:20 - 1.446 +++ http.c 5 Aug 2008 06:05:29 - @@ -755,6 +757,7 @@ httpAppendBody(HttpStateData * httpState /* Is it a incomplete reply? */ if (httpState->chunk_size > 0) { debug(11, 2) ("Short response from '%s' on port %d. Expecting %" PRINTF_OFF_T " octets more\n", storeUrl(entry), comm_local_port(fd), httpState->chunk_size); + fwdFail(httpState->fwd, errorCon(ERR_INVALID_RESP, HTTP_BAD_GATEWAY, httpState->fwd->request)); comm_close(fd); return; } @@ -774,6 +777,7 @@ httpAppendBody(HttpStateData * httpState ("httpReadReply: Excess data from \"%s %s\"\n", RequestMethods[orig_request->method].str, storeUrl(entry)); + fwdComplete(httpState->fwd); comm_close(fd); return; }
Re: squid-2.HEAD:
Have you dumped this into bugzilla? Thanks! 2008/9/3 Alexander V. Lukyanov [EMAIL PROTECTED]: Hello! I have noticed lots of 'impossible keep-alive' messages in the log. It appears that httpReplyBodySize incorrectly returns -1 for 304 Not Modified replies. A patch to fix it is attached. -- Alexander.
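The patch itself is attached to the message rather than shown, but the described fix is presumably along these lines; the field names follow squid-2 conventions and should be treated as an approximation:

    /* Sketch of what the attached (not shown) patch presumably does: a
     * 304 Not Modified reply has no message body, so its body size is
     * known to be 0, not "unknown" (-1), and keep-alive stays possible. */
    static squid_off_t
    replyBodySizeSketch(const HttpReply * reply)
    {
        if (reply->sline.status == HTTP_NOT_MODIFIED)
            return 0;       /* 304: never has a body */
        if (reply->content_length >= 0)
            return reply->content_length;
        return -1;          /* genuinely unknown */
    }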
Re: Where to document APIs?
2008/9/11 Alex Rousskov [EMAIL PROTECTED]: To clarify: longer API documents go in a .dox file in docs/, or maybe in src/ next to the .cc; basic rules the code needs to fulfill go in the .h or .cc file, at least until the API documentation grows large. You all have seen the current API notes for Comm and AsyncCalls. Do you think they should go into a .dox or .h file? I think they are big enough (and growing) to justify a .dox file. I will probably add those files to trunk (next to the corresponding .h files) unless there are better ideas. What's wrong with inline documentation again? Adrian
Australian Development Meetup 2008 - Notes
G'day, I've started publishing the notes from the presentations and developer discussions that we held at the Yahoo!7 offices last month. You can find them at http://www.squid-cache.org/Conferences/AustraliaMeeting2008/ . I'm going to try and make sure any further mini-conferences/discussions/etc which happen go up there so people get more of an idea of whats going on. Who knows, eventually there may be enough interest to hold a reasonably formal Squid conference somewhere.. :) Adrian
Re: [MERGE] Config cleanups
You have the WCCPv2 stuff around the wrong way. The redirection has nothing to do with the assignment method. You can and do have L2 redirection with hash assignment. You probably won't have GRE redirection with mask assignment though, but I think it's entirely possible. Keep the options separate, and named whatever they are in the wccp2 draft. I'd also suggest committing each chunk that's different separately - ie, the wccp stuff separate, the ACL tidyup separate, the default storage stuff separate, etc. That makes backing out patches easier if needed. 2c, Adrian 2008/9/10 Amos Jeffries [EMAIL PROTECTED]: This update removes several magic number options in the WCCPv2 configuration, replacing them with user-friendly text options. This should help with a lot of config confusion where these are needed until they are obsoleted properly. # Bazaar merge directive format 2 (Bazaar 0.90) # revision_id: [EMAIL PROTECTED] # target_branch: file:///src/squid/bzr/trunk/ # testament_sha1: 7b319238106ae2926697f85b2ec58c3476abc121 # timestamp: 2008-09-11 03:50:49 +1200 # base_revision_id: [EMAIL PROTECTED] # q5rnfdpug13p94fl # # Begin patch === modified file 'src/cf.data.depend' --- src/cf.data.depend 2008-04-03 05:31:29 + +++ src/cf.data.depend 2008-09-10 15:22:08 + @@ -47,6 +47,7 @@ tristate uri_whitespace ushort +wccp2_method wccp2_service wccp2_service_info wordlist === modified file 'src/cf.data.pre' --- src/cf.data.pre 2008-08-09 06:24:33 + +++ src/cf.data.pre 2008-09-10 15:47:36 + @@ -831,8 +831,8 @@ NOCOMMENT_START #Allow ICP queries from local networks only -icp_access allow localnet -icp_access deny all +#icp_access allow localnet +#icp_access deny all NOCOMMENT_END DOC_END @@ -856,8 +856,8 @@ NOCOMMENT_START #Allow HTCP queries from local networks only -htcp_access allow localnet -htcp_access deny all +#htcp_access allow localnet +#htcp_access deny all NOCOMMENT_END DOC_END @@ -883,7 +883,7 @@ NAME: miss_access TYPE: acl_access LOC: Config.accessList.miss -DEFAULT: none +DEFAULT: allow all DOC_START Use to force your neighbors to use you as a sibling instead of a parent. For example: @@ -897,11 +897,6 @@ By default, allow all clients who passed the http_access rules to fetch MISSES from us. - -NOCOMMENT_START -#Default setting: -# miss_access allow all -NOCOMMENT_END DOC_END NAME: ident_lookup_access @@ -1555,9 +1550,7 @@ icp-port: Used for querying neighbor caches about objects. To have a non-ICP neighbor -specify '7' for the ICP port and make sure the -neighbor machine has the UDP echo port -enabled in its /etc/inetd.conf file. +specify '0' for the ICP port. NOTE: Also requires icp_port option enabled to send/receive requests via this method. @@ -1955,7 +1948,7 @@ NAME: maximum_object_size_in_memory COMMENT: (bytes) TYPE: b_size_t -DEFAULT: 8 KB +DEFAULT: 512 KB LOC: Config.Store.maxInMemObjSize DOC_START Objects greater than this size will not be attempted to kept in @@ -2124,7 +2117,7 @@ which can be changed with the --with-coss-membuf-size=N configure option. NOCOMMENT_START -cache_dir ufs @DEFAULT_SWAP_DIR@ 100 16 256 +# cache_dir ufs @DEFAULT_SWAP_DIR@ 100 16 256 NOCOMMENT_END DOC_END @@ -2291,7 +2284,7 @@ NAME: access_log cache_access_log TYPE: access_log LOC: Config.Log.accesslogs -DEFAULT: none +DEFAULT: @DEFAULT_ACCESS_LOG@ squid DOC_START These files log client request activities. Has a line every HTTP or ICP request. The format is: @@ -2314,9 +2307,9 @@ And priority could be any of: err, warning, notice, info, debug.
-NOCOMMENT_START -access_log @DEFAULT_ACCESS_LOG@ squid -NOCOMMENT_END + + Default: + access_log @DEFAULT_ACCESS_LOG@ squid DOC_END NAME: log_access @@ -2342,14 +2335,17 @@ NAME: cache_store_log TYPE: string -DEFAULT: @DEFAULT_STORE_LOG@ +DEFAULT: none LOC: Config.Log.store DOC_START Logs the activities of the storage manager. Shows which objects are ejected from the cache, and which objects are - saved and for how long. To disable, enter none. There are - not really utilities to analyze this data, so you can safely + saved and for how long. To disable, enter none or remove the line. + There are not really utilities to analyze this data, so you can safely disable it. +NOCOMMENT_START +# cache_store_log @DEFAULT_STORE_LOG@ +NOCOMMENT_END DOC_END NAME: cache_swap_state cache_swap_log @@ -3085,7 +3081,7 @@ NAME: request_header_max_size COMMENT: (KB) TYPE: b_size_t -DEFAULT: 20 KB +DEFAULT: 64 KB LOC: Config.maxRequestHeaderSize DOC_START This specifies the
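A sketch of the separation argued for above, in squid.conf terms. The directive names follow the wccp2 draft's vocabulary; the exact spellings here are illustrative, not necessarily what was committed:

    # packet redirection: how the router forwards packets to the cache
    wccp2_forwarding_method l2      # gre | l2
    # bucket assignment: how traffic is partitioned across caches
    wccp2_assignment_method hash    # hash | mask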
Re: Comm API notes
2008/9/11 Alex Rousskov [EMAIL PROTECTED]: * I/O cancellation. To cancel an interest in a read operation, call comm_read_cancel() with an AsyncCall object. This call guarantees that the passed Call will be canceled (see the AsyncCall API for call cancellation definitions and details). Naturally, the code has to store the original read callback Call pointer to use this interface. This call does not guarantee that the read operation has not already happened. This call guarantees that the read operation will not happen. As I said earlier, you can't guarantee that with asynchronous IO. The call may be in progress and not completed. I'm assuming you'd count "in progress" as "has already happened", but unlike the latter, you can't cancel it at the OS level. As long as the API keeps all the relevant OS-related structures in place to allow the IO to complete, and callers to the cancellation function are prepared to handle the case where the IO is happening versus has already happened, then I'm happy. You cannot reliably cancel an interest in a read operation using the old comm_read_cancel call that uses a function pointer. The handler may even get called after the old comm_read_cancel was called. This old API will be removed. I really did think I had fixed removing the pending callbacks from the callback queue when I implemented this. (Ie, I thought I implemented enough for the POSIX read/write API but not enough for overlapped/POSIX IO.) What were people seeing pre-AsyncCalls? It is OK to call comm_read_cancel (both old and new) at any time, as long as the descriptor has not been closed and there is either no read interest registered or the passed parameters match the registered ones. If the descriptor has been closed, the behavior is undefined. Otherwise, if parameters do not match, you get an assertion. To cancel other operations, close the descriptor with comm_close. I'm still not happy with comm_close() being used in that way; it seems you aren't either and are stipulating that new user code aborts jobs via alternative paths. I'm also not happy with the idea of close handlers to unwind state associated with it; how deep do close handlers actually get? Would we be better off in the long run by stipulating a more rigid shutdown process (eg - shutting down a client-side fd would not involve comm_close(fd), but ConnStateData::close(), which would handle clearing the clientHttpRequests and such, then itself + fd?) Raw socket descriptors may be replaced with unique IDs or small objects that help detect stale descriptor/socket usage bugs and encapsulate access to socket-specific information. New user code should treat descriptor integers as opaque objects. I do agree with this. As Henrik said, this makes Windows porting a bit easier. There are still other problems to tackle to properly abuse overlapped IO in any sensible fashion, mostly surrounding IO scheduling and callback scheduling.. adrian
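A minimal sketch of the semantics being argued for here, with assumed names rather than Squid's actual comm internals: cancellation only guarantees the callback is suppressed, while the OS-level read, and the buffer backing it, stay valid until the I/O actually completes:

    #include <functional>
    #include <memory>

    // Hypothetical stand-ins for an AsyncCall-style read callback.
    struct ReadCall {
        std::function<void(int bytesRead)> handler;
        bool cancelled = false;
    };
    using ReadCallPtr = std::shared_ptr<ReadCall>;

    // Cancelling guarantees the handler won't run; it cannot undo a read
    // the kernel may already be performing.
    void cancelRead(const ReadCallPtr &call) {
        call->cancelled = true;
    }

    // Invoked when the OS read finishes; the buffer backing the read had to
    // stay alive and immutable until this point, cancelled or not.
    void readCompleted(const ReadCallPtr &call, int bytesRead) {
        if (!call->cancelled)
            call->handler(bytesRead);
    }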
Re: Comm API notes
2008/9/11 Alex Rousskov [EMAIL PROTECTED]: Here is a replacement text: The comm_close API will be used exclusively for "stop future I/O, schedule a close callback call, and cancel all other callbacks" purposes. New user code should not use comm_close for the purpose of immediately ending a job via a close handler call. Yup. (As part of another email) I'd also make it completely clear that the underlying socket and IO may not be immediately closed via a comm_close() until pending scheduled IO events occur, and that callers should be prepared for the situation where the underlying buffer(s) and other resources must stay immutable until the completion of the kernel-side stuff. This is partially why I wanted explicit notification, cancellation or not, so the owners of things like buffers would know when they were able to modify/reuse them again - or the immutable semantics must be enforced some other way. Adrian
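One way to get the immutable-until-complete guarantee described above is shared ownership: the I/O layer keeps its own reference to the buffer, so neither comm_close() nor the original owner can release it while the kernel may still touch it. A sketch under that assumption (not Squid's actual types):

    #include <memory>
    #include <vector>

    using IoBuffer = std::shared_ptr<std::vector<char>>;

    struct PendingRead {
        IoBuffer buf;   // this extra reference pins the buffer across a close
    };

    // The owner may reuse the buffer only once it holds the sole reference,
    // i.e. once the pending I/O has completed and dropped its copy.
    bool safeToReuse(const IoBuffer &buf) {
        return buf.use_count() == 1;
    }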
Re: How to buffer a POST request
Well, I've got a proof of concept which works well but it's -very- ugly. This is one of those things that may have been slightly easier to do in Squid-3 with Alex's BodyPipe changes; I haven't stared at the BodyPipe code enough to know whether it does all the right kinds of buffering for this application. The problem is that Squid-2's request body data pipeline doesn't do any of its own buffering - it doesn't do anything at all until a consumer says "give me some more request body data please", at which point the data is copied out of conn->in.buf (the client-side incoming socket buffer), consumed, and passed on to the caller. I thought about a clean implementation, which would involve the request body pipeline code consuming socket buffer data until a certain threshold is reached, then feeding that back up to the request body consumer, but I decided that was too difficult for this particular contract. Instead, the hack here is to just keep reading data into the client-side socket buffer - it's already doing double duty as a request body buffer anyway - until an ACL match fires to begin forwarding. It's certainly not clean but it seems to work in local testing. I haven't yet tested connection aborts and such to make sure that connections are properly cleaned up. I'll look at posting a patch to squid-dev in a day or two once my client has had a look at it. Thanks, Adrian 2008/8/8 Adrian Chadd [EMAIL PROTECTED]: Well I'm still going through the process of planning out what changes need to happen. I know what changes need to happen long-term but this project doesn't have that sort of scope.. Adrian 2008/8/8 Mark Nottingham [EMAIL PROTECTED]: You said you were doing it :) On 08/08/2008, at 4:40 PM, Adrian Chadd wrote: Way to dob me in! Adrian 2008/8/8 Mark Nottingham [EMAIL PROTECTED]: I took a stab at: http://wiki.squid-cache.org/Features/RequestBuffering On 22/07/2008, at 4:40 PM, Henrik Nordstrom wrote: It's not a bug. A feature request in the wiki is more appropriate. wiki.squid-cache.org/Features/ Regards Henrik On mån, 2008-07-21 at 17:50 -0700, Mark Nottingham wrote: I couldn't find an open bug for this, so I opened http://www.squid-cache.org/bugs/show_bug.cgi?id=2420 On 11/06/2008, at 3:29 AM, Henrik Nordstrom wrote: On ons, 2008-06-11 at 12:51 +0300, Mikko Kettunen wrote: Yes, I read something about this on the squid-users list; there seems to be an 8kB buffer for this if I understood right. The buffer is bigger than that, but not unlimited. The big change needed is that there currently isn't anything delaying forwarding of the request headers until a sufficient amount of the request body has been buffered. Regards Henrik -- Mark Nottingham [EMAIL PROTECTED]
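A sketch of the hack as described, with hypothetical names standing in for Squid-2's conn->in.buf plumbing; the real patch triggers forwarding on an ACL match, and a plain byte threshold stands in for it here:

    #include <cstddef>
    #include <string>

    struct ClientConn {
        std::string inBuf;              // stands in for conn->in.buf
        size_t forwardAfter = 8192;     // hypothetical trigger; real code uses an ACL match
        bool forwarding = false;
    };

    void onClientRead(ClientConn &conn, const char *data, size_t len) {
        conn.inBuf.append(data, len);   // socket buffer doubles as body buffer
        if (!conn.forwarding && conn.inBuf.size() >= conn.forwardAfter) {
            conn.forwarding = true;
            // ...start forwarding the request headers + buffered body upstream...
        }
    }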
Squid-2.HEAD URL regression with CONNECT
G'day, Squid-2.HEAD doesn't seem to handle CONNECT URLs anymore; I get something like: [start] The requested URL could not be retrieved While trying to retrieve the URL: www.gmail.com:443 The following error was encountered: * Invalid URL [end] Benno, could you please double/triple check that your method- and URL-related changes to Squid-2.HEAD didn't break CONNECT? Thanks! Adrian
Re: /bzr/squid3/trunk/ r9176: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address.
I've been thinking about doing exactly this after I've been knee-deep in the DNS code. It may not be a bad idea to have generic udp/tcp incoming/outgoing addresses which can then be overridden per-protocol. Adrian 2008/9/9 Amos Jeffries [EMAIL PROTECTED]: revno: 9176 committer: Alex Rousskov [EMAIL PROTECTED] branch nick: trunk timestamp: Mon 2008-09-08 17:52:06 -0600 message: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address. modified: src/htcp.cc I think this is one of those cleanup situations where we wanted to split the protocol away from generic udp_*_address and make it an htcp_outgoing_address. Yes? Amos
Re: /bzr/squid3/trunk/ r9176: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address.
Hah, Amos just exposed the onset of my short-term memory loss! (Time to get a bigger whiteboard..) Adrian 2008/9/9 Amos Jeffries [EMAIL PROTECTED]: I've been thinking about doing exactly this after I've been knee-deep in the DNS code. It may not be a bad idea to have generic udp/tcp incoming/outgoing addresses which can then be overridden per-protocol. WTF? We discussed this months ago and came to the conclusion it would be good to have a two-layered outgoing address/port assignment: a) a base default of a random, system-assigned outbound address/port; b) per-component/protocol override of the in/outbound address/port with individual config options. Amos Adrian 2008/9/9 Amos Jeffries [EMAIL PROTECTED]: revno: 9176 committer: Alex Rousskov [EMAIL PROTECTED] branch nick: trunk timestamp: Mon 2008-09-08 17:52:06 -0600 message: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address. modified: src/htcp.cc I think this is one of those cleanup situations where we wanted to split the protocol away from generic udp_*_address and make it an htcp_outgoing_address. Yes? Amos
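A sketch of the two-layer scheme Amos describes, in squid.conf terms (the directive names are illustrative; only some of them existed at the time):

    # a) base default: random, system-assigned outbound address/port
    udp_outgoing_address 0.0.0.0
    # b) per-component/protocol overrides
    dns_outgoing_address  192.0.2.10
    htcp_outgoing_address 192.0.2.11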
Re: [PATCH] Send 407 on url_rewrite_access/storeurl_access
Thanks! Don't forget to bug me if it's not sorted out in the next week or so. Adrian 2008/9/8 Diego Woitasen [EMAIL PROTECTED]: http://www.squid-cache.org/bugs/show_bug.cgi?id=2455 On Sun, Sep 07, 2008 at 09:28:30AM +0800, Adrian Chadd wrote: It looks fine; could you dump it into bugzilla for the time being? (We're working on the Squid-2 -> bzr merge stuff at the moment!) Adrian 2008/9/7 Diego Woitasen [EMAIL PROTECTED]: This patch applies to Squid 2.7.STABLE4. If we use a proxy_auth acl on {storeurl,url_rewrite}_access and the user isn't authenticated previously, send 407. regards, Diego

diff --git a/src/client_side.c b/src/client_side.c
index 23c4274..4f75ea0 100644
--- a/src/client_side.c
+++ b/src/client_side.c
@@ -448,19 +448,71 @@ clientFinishRewriteStuff(clientHttpRequest * http)
 }
-static void
-clientAccessCheckDone(int answer, void *data)
+void
+clientSendErrorReply(clientHttpRequest * http, int answer)
 {
-    clientHttpRequest *http = data;
     err_type page_id;
     http_status status;
     ErrorState *err = NULL;
     char *proxy_auth_msg = NULL;
+
+    proxy_auth_msg = authenticateAuthUserRequestMessage(http->conn->auth_user_request ? http->conn->auth_user_request : http->request->auth_user_request);
+
+    int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
+
+    debug(33, 5) ("Access Denied: %s\n", http->uri);
+    debug(33, 5) ("AclMatchedName = %s\n",
+        AclMatchedName ? AclMatchedName : "<null>");
+    debug(33, 5) ("Proxy Auth Message = %s\n",
+        proxy_auth_msg ? proxy_auth_msg : "<null>");
+
+    /*
+     * NOTE: get page_id here, based on AclMatchedName because
+     * if USE_DELAY_POOLS is enabled, then AclMatchedName gets
+     * clobbered in the clientCreateStoreEntry() call
+     * just below.  Pedro Ribeiro [EMAIL PROTECTED]
+     */
+    page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, answer != ACCESS_REQ_PROXY_AUTH);
+    http->log_type = LOG_TCP_DENIED;
+    http->entry = clientCreateStoreEntry(http, http->request->method,
+        null_request_flags);
+    if (require_auth) {
+        if (!http->flags.accel) {
+            /* Proxy authorisation needed */
+            status = HTTP_PROXY_AUTHENTICATION_REQUIRED;
+        } else {
+            /* WWW authorisation needed */
+            status = HTTP_UNAUTHORIZED;
+        }
+        if (page_id == ERR_NONE)
+            page_id = ERR_CACHE_ACCESS_DENIED;
+    } else {
+        status = HTTP_FORBIDDEN;
+        if (page_id == ERR_NONE)
+            page_id = ERR_ACCESS_DENIED;
+    }
+    err = errorCon(page_id, status, http->orig_request);
+    if (http->conn->auth_user_request)
+        err->auth_user_request = http->conn->auth_user_request;
+    else if (http->request->auth_user_request)
+        err->auth_user_request = http->request->auth_user_request;
+    /* lock for the error state */
+    if (err->auth_user_request)
+        authenticateAuthUserRequestLock(err->auth_user_request);
+    err->callback_data = NULL;
+    errorAppendEntry(http->entry, err);
+
+}
+
+static void
+clientAccessCheckDone(int answer, void *data)
+{
+    clientHttpRequest *http = data;
+
     debug(33, 2) ("The request %s %s is %s, because it matched '%s'\n",
         RequestMethods[http->request->method].str, http->uri,
         answer == ACCESS_ALLOWED ? "ALLOWED" : "DENIED",
         AclMatchedName ? AclMatchedName : "NO ACL's");
-    proxy_auth_msg = authenticateAuthUserRequestMessage(http->conn->auth_user_request ? http->conn->auth_user_request : http->request->auth_user_request);
     http->acl_checklist = NULL;
     if (answer == ACCESS_ALLOWED) {
         safe_free(http->uri);
@@ -469,47 +521,7 @@ clientAccessCheckDone(int answer, void *data)
         http->redirect_state = REDIRECT_PENDING;
         clientRedirectStart(http);
     } else {
-        int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
-        debug(33, 5) ("Access Denied: %s\n", http->uri);
-        debug(33, 5) ("AclMatchedName = %s\n",
-            AclMatchedName ? AclMatchedName : "<null>");
-        debug(33, 5) ("Proxy Auth Message = %s\n",
-            proxy_auth_msg ? proxy_auth_msg : "<null>");
-        /*
-         * NOTE: get page_id here, based on AclMatchedName because
-         * if USE_DELAY_POOLS is enabled, then AclMatchedName gets
-         * clobbered in the clientCreateStoreEntry() call
-         * just below.  Pedro Ribeiro [EMAIL PROTECTED]
-         */
-        page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, answer != ACCESS_REQ_PROXY_AUTH);
-        http->log_type = LOG_TCP_DENIED;
-        http->entry = clientCreateStoreEntry(http, http->request->method,
-            null_request_flags);
-        if (require_auth
Re: [PATCH] Send 407 on url_rewrite_access/storeurl_access
It looks fine; could you dump it into bugzilla for the time being? (We're working on the Squid-2 -> bzr merge stuff at the moment!) Adrian 2008/9/7 Diego Woitasen [EMAIL PROTECTED]: This patch applies to Squid 2.7.STABLE4. If we use a proxy_auth acl on {storeurl,url_rewrite}_access and the user isn't authenticated previously, send 407. regards, Diego

diff --git a/src/client_side.c b/src/client_side.c
index 23c4274..4f75ea0 100644
--- a/src/client_side.c
+++ b/src/client_side.c
@@ -448,19 +448,71 @@ clientFinishRewriteStuff(clientHttpRequest * http)
 }
-static void
-clientAccessCheckDone(int answer, void *data)
+void
+clientSendErrorReply(clientHttpRequest * http, int answer)
 {
-    clientHttpRequest *http = data;
     err_type page_id;
     http_status status;
     ErrorState *err = NULL;
     char *proxy_auth_msg = NULL;
+
+    proxy_auth_msg = authenticateAuthUserRequestMessage(http->conn->auth_user_request ? http->conn->auth_user_request : http->request->auth_user_request);
+
+    int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
+
+    debug(33, 5) ("Access Denied: %s\n", http->uri);
+    debug(33, 5) ("AclMatchedName = %s\n",
+        AclMatchedName ? AclMatchedName : "<null>");
+    debug(33, 5) ("Proxy Auth Message = %s\n",
+        proxy_auth_msg ? proxy_auth_msg : "<null>");
+
+    /*
+     * NOTE: get page_id here, based on AclMatchedName because
+     * if USE_DELAY_POOLS is enabled, then AclMatchedName gets
+     * clobbered in the clientCreateStoreEntry() call
+     * just below.  Pedro Ribeiro [EMAIL PROTECTED]
+     */
+    page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, answer != ACCESS_REQ_PROXY_AUTH);
+    http->log_type = LOG_TCP_DENIED;
+    http->entry = clientCreateStoreEntry(http, http->request->method,
+        null_request_flags);
+    if (require_auth) {
+        if (!http->flags.accel) {
+            /* Proxy authorisation needed */
+            status = HTTP_PROXY_AUTHENTICATION_REQUIRED;
+        } else {
+            /* WWW authorisation needed */
+            status = HTTP_UNAUTHORIZED;
+        }
+        if (page_id == ERR_NONE)
+            page_id = ERR_CACHE_ACCESS_DENIED;
+    } else {
+        status = HTTP_FORBIDDEN;
+        if (page_id == ERR_NONE)
+            page_id = ERR_ACCESS_DENIED;
+    }
+    err = errorCon(page_id, status, http->orig_request);
+    if (http->conn->auth_user_request)
+        err->auth_user_request = http->conn->auth_user_request;
+    else if (http->request->auth_user_request)
+        err->auth_user_request = http->request->auth_user_request;
+    /* lock for the error state */
+    if (err->auth_user_request)
+        authenticateAuthUserRequestLock(err->auth_user_request);
+    err->callback_data = NULL;
+    errorAppendEntry(http->entry, err);
+
+}
+
+static void
+clientAccessCheckDone(int answer, void *data)
+{
+    clientHttpRequest *http = data;
+
     debug(33, 2) ("The request %s %s is %s, because it matched '%s'\n",
         RequestMethods[http->request->method].str, http->uri,
         answer == ACCESS_ALLOWED ? "ALLOWED" : "DENIED",
         AclMatchedName ? AclMatchedName : "NO ACL's");
-    proxy_auth_msg = authenticateAuthUserRequestMessage(http->conn->auth_user_request ? http->conn->auth_user_request : http->request->auth_user_request);
     http->acl_checklist = NULL;
     if (answer == ACCESS_ALLOWED) {
         safe_free(http->uri);
@@ -469,47 +521,7 @@ clientAccessCheckDone(int answer, void *data)
         http->redirect_state = REDIRECT_PENDING;
         clientRedirectStart(http);
     } else {
-        int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
-        debug(33, 5) ("Access Denied: %s\n", http->uri);
-        debug(33, 5) ("AclMatchedName = %s\n",
-            AclMatchedName ? AclMatchedName : "<null>");
-        debug(33, 5) ("Proxy Auth Message = %s\n",
-            proxy_auth_msg ? proxy_auth_msg : "<null>");
-        /*
-         * NOTE: get page_id here, based on AclMatchedName because
-         * if USE_DELAY_POOLS is enabled, then AclMatchedName gets
-         * clobbered in the clientCreateStoreEntry() call
-         * just below.  Pedro Ribeiro [EMAIL PROTECTED]
-         */
-        page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, answer != ACCESS_REQ_PROXY_AUTH);
-        http->log_type = LOG_TCP_DENIED;
-        http->entry = clientCreateStoreEntry(http, http->request->method,
-            null_request_flags);
-        if (require_auth) {
-            if (!http->flags.accel) {
-                /* Proxy authorisation needed */
-                status = HTTP_PROXY_AUTHENTICATION_REQUIRED;
-            } else {
-                /* WWW authorisation needed */
-                status = HTTP_UNAUTHORIZED;
-            }
-            if (page_id == ERR_NONE)
-                page_id = ERR_CACHE_ACCESS_DENIED;
-        } else {
-
Re: [RFC] COSS removal from 3.0
2008/9/4 Amos Jeffries [EMAIL PROTECTED]: I'm expecting to roll 3.0.STABLE9 sometime over the next 5 days. One update still to be done is the removal of COSS. I had planned on just dead-coding (disabling) it, but with the configure recursion being dynamic that's not easily possible. I'm currently considering dropping an #error abortion into the top of all COSS code files to kill any builds trying to use it. Anyone have a better way? Drop the code entirely from 3.0? I think that's perfectly fine for now. Adrian
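The #error guard Amos is considering would look roughly like this at the top of each COSS source file (file name and wording are illustrative), so any build that still reaches the COSS code fails immediately and loudly at compile time:

    /* src/fs/coss/store_dir_coss.cc -- sketch of the proposed guard */
    #error "COSS has been removed from Squid 3.0; use the ufs/aufs/diskd cache_dir types instead."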
Re: squid-2.HEAD:
2008/9/3 Alexander V. Lukyanov [EMAIL PROTECTED]: Hello! I have noticed lots of 'impossible keep-alive' messages in the log. It appears that httpReplyBodySize incorrectly returns -1 for 304 Not Modified replies. Patch to fix it is attached. Hm, I'd have to eyeball the rest of the code to make sure that's the right fix. Can you throw it into a bugzilla ticket for me? I've got a couple of other Squid-2.HEAD patches to stare at and commit. Adrian
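For context, the HTTP rule behind the report: 304 replies (like 1xx, 204, and any reply to HEAD) carry no message body by definition, so a body-size helper should return 0 for them rather than "unknown". A simplified sketch of that logic, not the attached patch itself:

    // Returns the expected reply body size, or -1 when it must instead be
    // derived from Content-Length / transfer encoding.
    int replyBodySize(int status, bool replyToHead) {
        if (replyToHead)
            return 0;                        // replies to HEAD never carry a body
        if (status == 204 || status == 304)
            return 0;                        // bodyless by definition
        if (status >= 100 && status < 200)
            return 0;                        // 1xx: no body
        return -1;                           // unknown here; consult the headers
    }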
Re: [MERGE] Address Alex and Amos' comments.
I still need to eyeball the relative URL stuff, but.. bb:approve 2008/9/3 Bundle Buggy [EMAIL PROTECTED]: Bundle Buggy has detected this merge request. For details, see: http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E Project: Squid
Re: Using cached headers in ACLs
nope! adrian 2008/9/4 Diego Woitasen [EMAIL PROTECTED]: Hi, As I've explained in my introduction, I'm working on changes to the cache statement and refresh_pattern to allow easy flash video caching, and maybe other things. The first thing I'm trying to change is to let the ACLs used in the cache directive match against cached headers. For example, if the cached headers for some URL contain Content-Type: video/flv, I serve that object from cache. Is there any contraindication to using cached headers in that way? Regards, Diego -- --- Diego Woitasen - XTECH www.xtech.com.ar
Re: [MERGE] Address Alex and Amos' comments.
bb:approve (Sorry, I wasn't set up to vote until now!) 2008/9/4 Adrian Chadd [EMAIL PROTECTED]: I still need to eyeball the relative URL stuff, but.. bb:approve 2008/9/3 Bundle Buggy [EMAIL PROTECTED]: Bundle Buggy has detected this merge request. For details, see: http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E Project: Squid
Re: [MERGE] Address Alex and Amos' comments.
bb:approve (third time lucky?) 2008/9/4 Adrian Chadd [EMAIL PROTECTED]: bb:approve (Sorry, I wasn't setup to vote until now!) 2008/9/4 Adrian Chadd [EMAIL PROTECTED]: I still need to eyeball the relative URL stuff, but.. bb:approve 2008/9/3 Bundle Buggy [EMAIL PROTECTED]: Bundle Buggy has detected this merge request. For details, see: http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E Project: Squid
Re: [MERGE] Address Alex and Amos' comments.
bb:approve 2008/9/3 Bundle Buggy [EMAIL PROTECTED]: Bundle Buggy has detected this merge request. For details, see: http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E Project: Squid
Re: [MERGE] Address Alex and Amos' comments.
bb:approve *sigh!* 2008/9/3 Bundle Buggy [EMAIL PROTECTED]: Bundle Buggy has detected this merge request. For details, see: http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E Project: Squid
Re: [squid-users] state of gzip/transfer-encoding?
So how much exactly is needed to support this when we currently don't support HTTP/1.1? Adrian 2008/9/2 Amos Jeffries [EMAIL PROTECTED]: Chris Woodfield wrote: Squid does not do transfer encoding of objects on its own; Just curious: does Squid plan to develop an object-compression feature like Apache's mod_deflate? Thanks. Yes. As an eCAP module. It should be out shortly after eCAP support is stabilized. Amos
Re: pseudo-specs for a String class: char
2008/9/2 Kinkie [EMAIL PROTECTED]: Read: might be useful for the HTTP parser. Stuff like:

    KBuf reqhdr = (whatever gets in from comm);
    KBuf hdrline = reqhdr.nextToken("\n");
    if (hdrline[hdrline.len()-1] == '\r')
        hdrline.truncate(hdrline.len()-1); // chop \r off

Ok, so what would the underlying manipulations be? Adrian
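One plausible answer to that question, assuming (as the spec later in the thread suggests) that a KBuf is an (offset, length) window onto shared, refcounted storage: tokenising and truncating then reduce to index arithmetic, with no copying of the payload. A sketch, not the proposed implementation:

    #include <cstddef>
    #include <cstring>
    #include <memory>

    struct KBufSketch {
        std::shared_ptr<char[]> store;   // shared, refcounted backing storage
        size_t off = 0, len = 0;         // this buffer's window into it

        // Split off everything up to the next delimiter; both the token and
        // the remainder keep pointing into the same storage.
        KBufSketch nextToken(char delim) {
            KBufSketch tok = *this;
            const char *base = store.get() + off;
            const char *p = static_cast<const char *>(memchr(base, delim, len));
            size_t tlen = p ? static_cast<size_t>(p - base) : len;
            tok.len = tlen;
            size_t consumed = tlen + (p ? 1 : 0);   // also skip the delimiter
            off += consumed;
            len -= consumed;
            return tok;
        }

        // Shrinking is just a length adjustment.
        void truncate(size_t to) { if (to < len) len = to; }
    };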
Re: [squid-users] state of gzip/transfer-encoding?
2008/9/3 Amos Jeffries [EMAIL PROTECTED]: The existing experimental patch for 3.0.pre2 adds a ClientStreams handler for de/encoding as needed. Right. It's really only a matter of caching things properly (ETag from server or squid-generated), and fiddling the headers slightly as they transit Squid in both directions. It should not matter which HTTP/1.x version is used at the header level, since both can handle compressed data objects. 3.1 already does TE decoding. But we can't do the TE encoding until 1.1, only Content-Encoding re-coding. So it's not specifically TE, it's just content-encoding fiddling as appropriate? OK, that makes much more sense. Adrian
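The "fiddling the headers slightly" amounts to keeping the stored variant honest when a compressed copy is served. A hedged illustration using a plain map in place of Squid's header API:

    #include <map>
    #include <string>

    // Mark a cached reply as the gzip variant of the original object.
    void markGzipVariant(std::map<std::string, std::string> &hdr, size_t gzippedLen) {
        hdr["Content-Encoding"] = "gzip";                 // body is now compressed
        hdr["Content-Length"] = std::to_string(gzippedLen);
        hdr["Vary"] = "Accept-Encoding";                  // cache variants separately
        hdr.erase("Content-MD5");                         // digest of the old body no longer matches
    }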
Re: [MERGE] Rework urlAbsolute to be a little more streamlined.
one zeroes, one doesn't. calloc is meant to be x objects of size y, but it's effectively also a bzero(). Adrian 2008/9/2 Amos Jeffries [EMAIL PROTECTED]: On 01/09/2008, at 1:01 PM, Amos Jeffries wrote: Resubmit this patch, including changes based on comments by various people. - Mention RFC text in relation to changing the default behaviour for unknown HTTP methods. - Use safe_free instead of xfree. - Rework urlAbsolute to use snprintf in a slightly better way. snprintf is now used to construct the initial portion of the URL and the rest is added on using POSIX string routines. I'm sure you can still crop the xstrdup() usage by adding a JIT allocation of urlbuf before if (req->protocol == PROTO_URN) and returning urlbuf plain at the end. As in malloc it? Yes, with xmalloc or xcalloc. (I'm still not sure why we have two.) Amos
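The distinction in two lines, using the plain libc calls (Squid's x* wrappers mostly add out-of-memory handling on top):

    #include <cstdlib>

    void example(size_t len) {
        char *a = static_cast<char *>(malloc(len));     // contents indeterminate
        char *b = static_cast<char *>(calloc(1, len));  // zero-filled: malloc + bzero
        free(a);
        free(b);
    }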
Re: binary data
Erk! I just read that patch! man isprint. Oh, and man ctype. I'm sure there's a C++ equivalent somewhere which makes more sense to use. Adrian 2008/9/2 Amos Jeffries [EMAIL PROTECTED]: Henrik, There's been some user interest in porting the binary data hack http://www.squid-cache.org/Versions/v3/3.0/changesets/b8877.patch to 2.7. How say you? Amos
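What "man isprint" buys over an open-coded range check: a standard, locale-aware test for printable bytes. Sketched as a byte-safe dump helper (illustrative, not the patch's code):

    #include <cctype>
    #include <cstdio>

    // Print one byte of possibly-binary data without corrupting a text log.
    void putByteSafely(unsigned char c) {
        if (isprint(c))
            putchar(c);
        else
            printf("\\x%02x", c);    // escape anything non-printable
    }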
Re: pseudo-specs for a String class
Do you really want to provide a 'consume' interface for a low-level representation of memory? I think trying to replace MemBuf with this new buffer is a bit silly. Sure, use it -in- MemBuf, along with all the other places that buffers are used. What about strtok()? Why would you want to tokenise data? Adrian 2008/8/31 Kinkie [EMAIL PROTECTED]: +1. With a view to re-using MemBuf in the final product. Starting from KinkieBuf. (Joke, me taking a dig at the content filterers again). I've gotten a bit further; now I'm a bit at a loss about where to go next. Current interface (ampersands and angle brackets restored where the list archive ate them):

    class KBuf {
        KBuf();
        KBuf(const KBuf &S);
        KBuf(const char *S, u_int32_t Ssize);
        KBuf(const char *S);                    // null-terminated
        ~KBuf();
        bool isNull();
        KBuf &operator = (const KBuf &S);
        KBuf &operator = (char const *S, u_int32_t Ssize);
        KBuf &operator = (char const *S);       // null-terminated
        KBuf &append(const KBuf &S);
        KBuf &append(const char *S, u_int32_t Slen);
        KBuf &append(const char *S);            // null-terminated
        KBuf &append(const char c);             // To be removed?
        KBuf &appendf(const char *fmt, ...);    // to be copied over from membuf
        std::ostream &print(std::ostream &os);  // for operator <<
        void dump(std::ostream &os);            // dump debugging info
        const int operator [] (int pos);
        int cmp(const KBuf &S);                 // strcmp()
        bool operator == (const KBuf &S);
        bool operator < (const KBuf &S);
        bool operator > (const KBuf &S);
        void truncate(u_int32_t to_size);
        KBuf consume(u_int32_t howmuch);        // from MemBuf
        void terminate(void);                   // null-terminate
        static ostream &stats(ostream &os);
        char *exportCopy(void);
        char *exportRefIKnowWhatImDoing(void);
        KBuf nextToken(const char *delim);      // strtok()
        KBuf substr(u_int32_t from, u_int32_t to);
        u_int32_t index(char c);
        u_int32_t rindex(char c);
    }

on x86, sizeof(KBuf)=16. Now I'm a bit at a loss as to how to best integrate with iostream. There are basically four possibilities:

1. KBuf kb; kb.append(stringstream &) - cheapest implementation, but each of those requires two to three copies of the contents (sketched below).
2. KBuf kb; stringstream &ss = kb.stream(); ss << (blah). This seems to be the official way of extending iostreams, performed by making KBuf a subclass of stringbuf. It extends sizeof(KBuf) by 32 bytes, and many of its calls need to housekeep two states.
3. KBuf kb; stringstream &ss = kb.stream(); ss << (blah), performed by using an adapter class. The coding effort starts to be quite noticeable, as keeping the stringbuf and the KBuf in sync is not trivial.
4. KBuf kb(blah). Requires KBuf to be a subclass of an ostream. There's a significant coding effort, AND it balloons the size of KBuf to 156 bytes.

What's your take on how to better address this? -- /kinkie
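For scale, integration option 1 above, sketched with a minimal stand-in for the proposed class (only the two-argument append() from the spec): trivial to implement, but the contents really do get copied twice on the way in:

    #include <cstdint>
    #include <sstream>
    #include <string>

    struct KBufStandIn {                       // stand-in for the proposed KBuf
        std::string storage;
        void append(const char *s, uint32_t n) { storage.append(s, n); }
    };

    void appendStream(KBufStandIn &kb, std::stringstream &ss) {
        const std::string s = ss.str();        // copy #1: out of the stringbuf
        kb.append(s.c_str(), static_cast<uint32_t>(s.size()));  // copy #2: into the buffer
    }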
Re: Refresh patterns and ACLs
2008/8/30 Henrik Nordstrom [EMAIL PROTECTED]: Make sure you can collapse those ACLs down to something sensible for software processing before you go down that path! It's relatively easy to make a unified lookup tree of such structure, and even if you don't it's still as fast or faster than the current acl scheme. Oh, I'm sure we could beat the current way we're using ACLs, the question is whether we can dramatically improve complicated ACL processing in the future. A big problem with the way we're inlining fast-path ACL lookups is that they suddenly become impossible to farm out to other threads. Adrian
Re: Refresh patterns and ACLs
2008/8/29 Kinkie [EMAIL PROTECTED]: YES please.. I'm quite familiar with the JunOS ACL format and it resembles this pretty closely; it's very flexible.. Make sure you can collapse those ACLs down to something sensible for software processing before you go down that path! Adrian