Re: SMP: logging
On 24 February 2010 18:06, Adrian Chadd wrote: > Uhm, is O_APPEND defined as an atomic write? I didn't think so. It may > be under Linux and it may be under certain FreeBSD versions, but it's > more likely a side-effect of VFS locking than of the actual specification. .. and it certainly won't be supported for logging-to-NFS. I'd honestly just investigate a logging layer that implements some kind of IPC mechanism (sockets, sysvshm, etc) that can handle logs from multiple processes. Or you go down the apache path - lock, append, unlock. Eww. adrian
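The single-write O_APPEND pattern being debated can be sketched as follows. This is a minimal illustration, not Squid's logging code: POSIX requires the seek-to-EOF and the write to happen atomically when O_APPEND is set on a local filesystem, which is exactly the guarantee that evaporates over NFS, as noted above.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Append one complete log line with a single write() call. With
 * O_APPEND the kernel repositions to EOF and writes atomically, so
 * concurrent local writers do not clobber each other's offsets.
 * That guarantee does NOT hold over NFS. */
static int log_line(const char *path, const char *line)
{
    int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    /* one write() per record: a record split across two writes could
     * still interleave with another process's record */
    ssize_t n = write(fd, line, strlen(line));
    close(fd);
    return n == (ssize_t)strlen(line) ? 0 : -1;
}
```

The key design point is one write() per log record; two half-record writes from different processes may still interleave even with O_APPEND.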
Re: SMP: logging
On 24 February 2010 06:55, Amos Jeffries wrote: >> Ah, I did not realize cache.log daemon logging is not supported yet. One >> more reason to start with simple O_APPEND. As a side effect, we would be >> able to debug daemon log starting problems as well :-). >> > > Yay. Definitely +3 then. :) Uhm, is O_APPEND defined as an atomic write? I didn't think so. It may be under Linux and it may be under certain FreeBSD versions, but it's more likely a side-effect of VFS locking than of the actual specification. adrian
Re: [squid-users] 'gprof squid squid.gmon' only shows the initial configuration functions
Talk to the freebsd guys (eg me) about pmcstat and support for your hardware. You may just need to find / organise a backport of the particular hardware support for your platform. I've been working on profiling Lusca with pmcstat and some new-ish tools which use and extend it in useful ways. gprof data is almost certainly uselessly unreliable on modern CPUs. Too much can and will happen between profiling ticks. I can hazard a few guesses about where your CPU is going. Likely candidate is poll() if your Squid is too old. First thing to do is organise porting the kqueue() stuff if it isn't already included. I can make more educated guesses about where the likely CPU hog culprits are given workload and configuration file information. Adrian 2009/12/10 Guy Bashkansky : > Is there an oprofile version for FreeBSD? I thought it is limited to > Linux. On FreeBSD I tried pmcstat, but it gives an initialization > error. > > My version of Squid is old and customized (so I can't upgrade) and may > not have builtin timers. Since what version did they appear? > > As for gprof - even with the event loop on top, still the rest of the > table might give some idea why the CPU is overloaded. The problem is > - I see only initial configuration functions: > > called/total parents > index %time self descendents called+self name index > called/total children > > [1] 63.4 0.17 0.00 _mcount [1] > --- > 0.00 0.10 1/1 _start [3] > [2] 36.0 0.00 0.10 1 main [2] > 0.00 0.10 1/1 parseConfigFile [4] > <...> > --- > > [3] 36.0 0.00 0.10 _start [3] > 0.00 0.10 1/1 main [2] > --- > 0.00 0.10 1/1 main [2] > [4] 36.0 0.00 0.10 1 parseConfigFile [4] > 0.00 0.09 1/1 readConfigLines [5] > 0.00 0.00 169/6413 parse_line [6] > .. > > > System info: > > # uname -m -r -s > FreeBSD 6.2-RELEASE-p9 amd64 > > # gcc -v > Using built-in specs. > Configured with: FreeBSD/amd64 system compiler > Thread model: posix > gcc version 3.4.6 [FreeBSD] 20060305 > > > There are 7 fork()s for unlinkd/diskd helpers. 
Can these fork()s > affect profiling info? > > On Wed, Dec 9, 2009 at 2:04 AM, Robert Collins > wrote: >> On Tue, 2009-12-08 at 15:32 -0800, Guy Bashkansky wrote: >>> I've built squid with the -pg flag and run it in the no-daemon mode >>> (-N flag), without the initial fork(). >>> >>> I send it the SIGTERM signal which is caught by the signal handler, to >>> flag graceful exit from main(). >>> >>> I expect to see meaningful squid.gmon, but 'gprof squid squid.gmon' >>> only shows the initial configuration functions: >> >> gprof isn't terribly useful anyway - due to squids callback based model, >> it will see nearly all the time belonging to the event loop. >> >> oprofile and/or squids built in analytic timers will get much better >> info. >> >> -Rob >> > >
Re: your suggestion for range_offset_limit
The trick, at least in squid-2, is to make sure that quick abort isn't occurring. Or it will begin downloading the whole object, return the requested range bit, and then abort the remainder of the fetch. Adrian 2009/11/25 Amos Jeffries : > Matthew Morgan wrote: >> >> On Wed, Nov 25, 2009 at 7:09 PM, Amos Jeffries >> wrote: >>> >>> Matthew Morgan wrote: Sorry it's taking me so long to get this done, but I do have a question. You suggested making getRangeOffsetLimit a member of HttpReply. There are two places where this method currently needs to be called: one is CheckQuickAbort2() in store_client.cc. This one will be easy, as I can just do entry->getReply()->getRangeOffsetLimit(). The other is HttpStateData::decideIfWeDoRanges in http.cc. Here, all we have access to is an HttpRequest object. I looked through the source to see if I could find where a request owned or had access to a reply, but I don't see anything like that. If getRangeOffsetLimit were a member of HttpReply, what do you suggest doing here? I could make a static version of the method, but that wouldn't allow caching the result. >>> >>> Ah. I see. Quite right. >>> >>> After a bit more thought I find my original request a bit weird. >>> >>> Yes it should be a _Request_ member and do its caching there. You can go >>> ahead with that now while we discuss whether to do a slight tweak on top >>> of >>> the basic feature. >>> >>> >>> [cc'ing squid-dev so others can provide input] >>> >>> I'm not certain of the behavior we want here if we do open the ACLs to >>> reply >>> details. Some discussion is in order. >>> >>> Simple way would be to not cache the lookup the first time when reply >>> details are not provided. >>> >>> It would mean making it return potentially two different values across >>> the >>> transaction. >>> >>> 1) based on only request details, to decide if a range request is >>> possible.
>>> and then >>> 2) based on additional reply details to see if the abort could be done. >>> >>> No problem if the reply details cause an increase in the limit. But if >>> they >>> restrict it we enter grounds of potentially making a request then >>> canceling >>> it and being unable to store the results. >>> >>> >>> Or, taking the maximum of the two across two calls? so it can only >>> increase. >>> would be slightly trickier involving a flag a well to short-circuit the >>> reply lookups instead of just a magic cache value. >>> >>> Am I seriously over-thinking things today? >>> >>> >>> Amos >> >> Here's a question, too: is this feature going to benefit anyone? I >> realized later that it will not solve my problem, because all the >> traffic that was getting force downloaded ended up being from windows >> updates. The urls showing up in netstat and such were just weird >> because the windows update traffic was actually coming from limelight. >> My ultimate solution was to write a script that reads access.log, >> checks for windows update urls that are not cached, and manually >> download them one at a time after hours. >> >> If there is anyone at all who would benefit from this I would still be >> *more* than glad to code it (as I said, it would be my first real open >> source contribution...very exciting), but I just wondered if anyone >> will actually use it. > > I believe people will find more control here useful. > > Windows update service packs are a big reason, but there are also similar > range issues with Adobe Reader online PDFs, google maps/earth, and flash > videos when paused/resumed. Potentially other stuff, but I have not heard of > problems. > > This will allow anyone to fine tune the places where ranges are permitted or > forced to fully cache. Avoiding the problems a blanket limit adds. > >> >> As to which approach would be better, I don't know enough about that >> data path to really suggest. 
When I initially made my changes, I just >> replaced each reference to Config.range_offset_limit or whatever. >> Today I went back and read some more of the code, but I'm still >> figuring it out. How often would the limit change based on the >> request vs. the reply? > > Just the once. On first time being checked for the reply. > And most likely on the case of testing for a reply mime type. The other > useful info I can think of are all request data. > > You can ignore if you like. I'm just worrying over a borderline case. > Someone else can code a fix if they find it a problem or need to do mime > checks. > > Amos > -- > Please be using > Current Stable Squid 2.7.STABLE7 or 3.0.STABLE20 > Current Beta Squid 3.1.0.15 > >
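Amos's "taking the maximum of the two across two calls, so it can only increase" idea can be sketched like this. The names are illustrative, not Squid's real API: a first, request-only lookup seeds the cache, and a later lookup that also sees reply details may raise the cached limit but never lower it, so a range fetch already begun cannot be retroactively disallowed.

```c
#include <stdint.h>

/* Per-transaction cache for range_offset_limit, with the
 * "may only increase" rule discussed above. */
struct range_limit_cache {
    int     cached;   /* has any lookup been done yet? */
    int64_t limit;    /* cached limit in bytes */
};

static int64_t range_limit_check(struct range_limit_cache *c, int64_t fresh)
{
    if (!c->cached) {
        c->cached = 1;          /* first (request-only) lookup seeds it */
        c->limit = fresh;
    } else if (fresh > c->limit) {
        c->limit = fresh;       /* reply details may only raise it */
    }
    return c->limit;
}
```

A lower value computed from reply details is simply ignored, which avoids the "request made, then cancelled, then unstorable" corner case Amos worries about.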
Re: squid-smp: synchronization issue & solutions
Right. That's the easy bit. I could even do that in Squid-2 with a little bit of luck. The hard bit is rewriting the relevant code which relies on cbdata style reference counting behaviour. That is the tricky bit. Adrian 2009/11/20 Robert Collins : > On Wed, 2009-11-18 at 10:46 +0800, Adrian Chadd wrote: >> Plenty of kernels nowadays do a bit of TCP and socket processing in >> process/thread context; so you need to do your socket TX/RX in >> different processes/threads to get parallelism in the networking side >> of things. > > Very good point. > >> You could fake it somewhat by pushing socket IO into different threads >> but then you have all the overhead of shuffling IO and completed IO >> between threads. This may be .. complicated. > > The event loop I put together for -3 should be able to do that without > changing the loop - just extending the modules that hook into it. > > -Rob >
Re: squid-smp: synchronization issue & solutions
Plenty of kernels nowadays do a bit of TCP and socket processing in process/thread context; so you need to do your socket TX/RX in different processes/threads to get parallelism in the networking side of things. You could fake it somewhat by pushing socket IO into different threads but then you have all the overhead of shuffling IO and completed IO between threads. This may be .. complicated. Adrian 2009/11/18 Gonzalo Arana : > On Tue, Nov 17, 2009 at 12:45 PM, Alex Rousskov > wrote: >> On 11/17/2009 04:09 AM, Sachin Malave wrote: >> >> >> >>> I AM THINKING ABOUT HYBRID OF BOTH... >>> >>> Somebody might implement process model, Then we would merge both >>> process and thread models .. together we could have a better squid.. >>> :) >>> What do u think? > > In my limited squid experience, cpu usage is hardly a bottleneck. So, > why not just use smp for the cpu/disk-intensive parts? > > The candidates I can think of are: > * evaluating regular expressions (url_regex acls). > * aufs/diskd (squid already has support for this). > > Best regards, > > -- > Gonzalo A. Arana > >
Re: squid-smp
Oh, I can absolutely give you guys food for thought. I was just hoping someone else would already try to do a bit of legwork. Things to think about: * Do you really, -really- want to reinvent the malloc wheel? This is separate from caching results and pre-made class instances. There's been a lot of work in well-performing, thread-aware malloc libraries * Do you want to run things in multiple processes or multiple threads? Or support both? * How much of the application do you want to push out into separate threads? run lots of "copies" of Squid concurrently, with some locking going on? Break up individual parts of the processing pipeline into threads? (eg, what I'm going to be experimenting with soon - handling ICP/HTCP in a separate thread for some basic testing) * Survey the current codebase and figure out what depends upon what - in a way that you can use for figuring out what needs to be made re-entrant and what may need locking. Think about how to achieve all of this. Best example of this - you're going to need to figure out how to do concurrent debug logging and memory allocation - so see what that code uses, what that code's code uses, etc * 10GE cards are dumping individual PCIe channels to CPUs; which means that the "most efficient" way of pumping data around will be to somehow throw individual connections onto specific CPUs, and keep them there. There's no OS support for this yet, but OSes may be "magical" (ie, handing you sockets in specific threads via accept() and hoping that the NIC doesn't reorganise its connection->PCIe channel hash) * Do you think it's worth being able to migrate specific connections between threads? Or once they're in a thread they're there for good? * If you split up squid into "lots of threads running the whole app", what and where would you envisage locking and blocking? What about data sharing? How would that scale given a handful of example workloads? What about in abnormal situations? How well will things degrade?
* What about using message passing and message queues? Where would it be appropriate? Where wouldn't it be appropriate? Why? Here's an example: * Imagine you're doing store lookups using message passing with your "store" being a separate thread with a message queue. Think about how you'd handle, say, ICP peering between two caches doing > 10,000 requests a second. What repercussions does that have for the locking of the message queues between other threads. What are the other threads doing? With that in mind, survey the kinds of ways that current network apps "do" threading: * look at the various ways apache does it - eg, the per-connection thread+process hybrid model, the event-worker thread model, etc * look at memcached - one thread doing accept'ing, farming requests off to other threads that just run a squid-like event loop. Minimal inter-thread communication for the most part * investigate what the concurrency hooks for various frameworks do - eg, the boost asio library stuff has "colours" which you mark thread events with. These colours dictate which events need to be run sequentially and which can run in parallel * look at all of the random blogs written by windows networking coders - they're further ahead of the massively-concurrent network application stack because Windows has had it for a number of years. Now. You've mentioned you've looked at the others and you think major replumbing is going to be needed. Here's a hint - it's going to be needed. Thinking you can avoid it is silly. Figuring out what you can do right now that doesn't lock you into a specific trajectory is -not- silly. For example, figuring out what APIs need to be changed to make them re-entrant is not silly. Most of the stuff in lib/ with static char buffers that they return needs to be changed. That can be done -now- without having to lock yourself into a particular concurrency model.
2c, Adrian 2009/10/15 Amos Jeffries : > Adrian Chadd wrote: >> >> 2009/10/15 Sachin Malave : >> >>> Its not like we want to make project bad. Squid was not deployed on >>> smp before because we did not have shared memory architectures >>> (multi-cores), also the library support for multi-threading was like >>> nightmare for people. Now things are changed, it is very easy to >>> manage threads, people have multi-core machines at their desktops, and >>> as hardware is available now or later somebody has to try and build >>> SMP support. think about future... >>> >>> To cop with internet speed & increase in number of users, Squid must >>> use multi-core architecture and distribute its work >> >> I 100% agree with your comments. I agree 100% that Squid needs to be >> made scalable on
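The message-queue style raised in the questions above can be sketched with a minimal pthread-based bounded queue. This is an illustration of the locking cost in question, not proposed Squid code: every push and pop takes the same mutex, which is exactly the contention the 10,000-requests-a-second ICP example worries about.

```c
#include <pthread.h>

/* Minimal inter-thread message queue: one mutex plus one condition
 * variable guarding a fixed-size ring of opaque pointers. */
#define QCAP 64

struct msgq {
    void *ring[QCAP];
    int head, tail, len;
    pthread_mutex_t mtx;
    pthread_cond_t nonempty;
};

static void msgq_init(struct msgq *q)
{
    q->head = q->tail = q->len = 0;
    pthread_mutex_init(&q->mtx, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

static int msgq_push(struct msgq *q, void *m)
{
    pthread_mutex_lock(&q->mtx);
    if (q->len == QCAP) {              /* full: caller must retry */
        pthread_mutex_unlock(&q->mtx);
        return -1;
    }
    q->ring[q->tail] = m;
    q->tail = (q->tail + 1) % QCAP;
    q->len++;
    pthread_cond_signal(&q->nonempty); /* wake one consumer */
    pthread_mutex_unlock(&q->mtx);
    return 0;
}

static void *msgq_pop(struct msgq *q)
{
    pthread_mutex_lock(&q->mtx);
    while (q->len == 0)
        pthread_cond_wait(&q->nonempty, &q->mtx);
    void *m = q->ring[q->head];
    q->head = (q->head + 1) % QCAP;
    q->len--;
    pthread_mutex_unlock(&q->mtx);
    return m;
}
```

At high message rates the single mutex becomes the bottleneck, which is why the discussion above points at lockless queues and per-thread batching as alternatives.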
Re: squid-smp
2009/10/15 Sachin Malave : > It's not like we want to make project bad. Squid was not deployed on > smp before because we did not have shared memory architectures > (multi-cores), also the library support for multi-threading was like > nightmare for people. Now things are changed, it is very easy to > manage threads, people have multi-core machines at their desktops, and > as hardware is available now or later somebody has to try and build > SMP support. think about future... > > To cope with internet speed & increase in number of users, Squid must > use multi-core architecture and distribute its work I 100% agree with your comments. I agree 100% that Squid needs to be made scalable on multi-core boxes. Writing threaded code may be easier now than in the past, but the ways of screwing stability, debuggability, performance and such -haven't- changed.. This is what I'm trying to get across. :) Adrian
Re: Operating System that give best performance on Squid
I should also say I've pushed the same under Linux in my little cacheboy CDN. The peak loads were around 200mbit worth of ~ 300kbyte cached replies. This was on both 32-bit and 64-bit Intel, and even an IA-64 box. Adrian 2009/10/15 Adrian Chadd : > I'm pushing upwards of 100mbit of small objects on FreeBSD using > Squid-2.7/Lusca and COSS. > > You can push quite a bit more if your workload fits in memory and/or > is large objects. > > 2009/10/14 Paul Khadra : >> >> Dear all, >> >> We are planning to install squid on a server where we expect to push 100 >> mbps - 200 mbps (if possible). >> >> Which OS is best tested with squid and which one can give the highest >> performance ? >> >> Thank you, Paul >> -- >> View this message in context: >> http://www.nabble.com/Operating-System-that-give-best-performance-on-Squid-tp25888502p25888502.html >> Sent from the Squid - Development mailing list archive at Nabble.com. >> >> >
Re: squid-smp
2009/10/14 Amos Jeffries : [snip] I still find it very amusing that no one else has sat down and talked about the last 20+ years of writing threaded, concurrent code and what the pros/cons of them would be here; nor what other projects are doing. Please don't sit down and talk about how to shoehorn SMP into some existing Squid-3 "thing" (be it AsyncCalls, or anything really) before doing this. You'll just be re-inventing the same mistakes made in the past and it will make the project look bad. Adrian
Re: Operating System that give best performance on Squid
I'm pushing upwards of 100mbit of small objects on FreeBSD using Squid-2.7/Lusca and COSS. You can push quite a bit more if your workload fits in memory and/or is large objects. 2009/10/14 Paul Khadra : > > Dear all, > > We are planning to install squid on a server where we expect to push 100 > mbps - 200 mbps (if possible). > > Which OS is best tested with squid and which one can give the highest > performance ? > > Thank you, Paul > -- > View this message in context: > http://www.nabble.com/Operating-System-that-give-best-performance-on-Squid-tp25888502p25888502.html > Sent from the Squid - Development mailing list archive at Nabble.com. > >
Re: Recent Facebook Issues
I've emailed the facebook NOC directly about the issue. Thanks, Adrian 2009/10/9 Kinkie : > You can try to access facebook with konqueror. It complains > rather loudly, drops the excess data and the site generally doesn't > work (has been doing so for a few days, but only NOW I'm connecting the > wires...) > > Kinkie > > On Fri, Oct 9, 2009 at 2:52 AM, Adrian Chadd wrote: >> Ok, this happens for all versions? >> >> I can bring this up with facebook engineering if someone provides me >> with further information. >> >> >> Adrian >> >> 2009/10/9 Amos Jeffries : >>> Thanks to several people I've managed to track down why the facebook issues >>> are suddenly appearing and why it's intermittent. >>> >>> On the sometimes works sometimes doesn't problem. facebook.com does >>> User-Agent header checks and sends back one of four pages. >>> 1) a generic page saying 'please use another browser'. >>> 2) a redirect to login for each of IE, Firefox and Safari >>> 3) a home page (if cookies sent initially) >>> >>> going through the login redirects to the page also presented at (3) above. >>> >>> The home page is the real problem. When no cookies are presented it ships >>> without Content-Length (fine). >>> When they _are_ present, ie after the user has logged in, it ships with >>> Content-Length: 18487 and data size of 18576. >>> >>> Amos >>> >>> >> > > > > -- > /kinkie > >
Re: Recent Facebook Issues
Ok, this happens for all versions? I can bring this up with facebook engineering if someone provides me with further information. Adrian 2009/10/9 Amos Jeffries : > Thanks to several people I've managed to track down why the facebook issues > are suddenly appearing and why it's intermittent. > > On the sometimes works sometimes doesn't problem. facebook.com does > User-Agent header checks and sends back one of four pages. > 1) a generic page saying 'please use another browser'. > 2) a redirect to login for each of IE, Firefox and Safari > 3) a home page (if cookies sent initially) > > going through the login redirects to the page also presented at (3) above. > > The home page is the real problem. When no cookies are presented it ships > without Content-Length (fine). > When they _are_ present, ie after the user has logged in, it ships with > Content-Length: 18487 and data size of 18576. > > Amos > >
Re: Segfault in HTCP CLR request on 64-bit
The whole struct is on the local stack. Hence bzero() or memset() to 0. 2009/10/2 Matt W. Benjamin : > Bzero? Is it an already-allocated array/byte sequence? (Apologies, I > haven't seen the code.) Assignment to NULL/0 is in fact correct for > initializing a sole pointer, and using bzero for that certainly isn't > typical. Also, for initializing a byte range, memset is preferred [see Linux > BZERO(3), which refers to POSIX.1-2008 on that point]. > > STYLE(9) says use NULL rather than 0, and it is clearer. But C/C++ > programmers should know that NULL is 0. And note that at least through 1998, > initialization to 0 was the preferred style in C++, IIRC. > > Matt > > - "Adrian Chadd" wrote: > >> I've just replied to the ticket in question. It should probably just >> be a bzero() rather than setting the pointer to 0. Which should >> really >> be setting it to NULL. >> >> Anyway, please test whether the bzero() works. If it does then I'll >> commit that fix to HEAD and 2.7. >> >> 2009/9/28 Jason Noble : >> > I have opened a bug for this issue here: >> http://bugs.squid-cache.org/show_bug.cgi?id=2788 Also, the previous >> patch was not generated against head so I re-rolled the patch against >> current head and attached to the bug report > > -- > > Matt Benjamin > > The Linux Box > 206 South Fifth Ave. Suite 150 > Ann Arbor, MI 48104 > > http://linuxbox.com > > tel. 734-761-4689 > fax. 734-769-8938 > cel. 734-216-5309 > >
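The fix being discussed, in miniature. The struct here is a stand-in, not the real htcpStuff layout; note also that memset-to-zero yields null pointers on mainstream platforms, although the C standard does not strictly promise that a null pointer is all-bits-zero.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a stack-allocated struct like htcpStuff, where one
 * uninitialized char* later gets handed to strlen(). */
struct stuff_like {
    int op;
    char *req_hdrs;   /* the pointer the 64-bit segfault tripped over */
    char *uri;
};

static void stuff_init(struct stuff_like *s)
{
    /* Zero the whole struct rather than assigning one member; any
     * member added later is covered automatically. On mainstream
     * platforms the all-bits-zero pointers compare equal to NULL. */
    memset(s, 0, sizeof(*s));
}

static size_t hdrs_len(const struct stuff_like *s)
{
    /* defensive: never strlen() a null pointer */
    return s->req_hdrs ? strlen(s->req_hdrs) : 0;
}
```

On 32-bit builds the stack garbage happened to be harmless often enough to mask the bug, which is why it only reliably segfaulted on 64-bit, as described below.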
Re: Segfault in HTCP CLR request on 64-bit
I've just replied to the ticket in question. It should probably just be a bzero() rather than setting the pointer to 0. Which should really be setting it to NULL. Anyway, please test whether the bzero() works. If it does then I'll commit that fix to HEAD and 2.7. 2009/9/28 Jason Noble : > I have opened a bug for this issue here: > http://bugs.squid-cache.org/show_bug.cgi?id=2788 Also, the previous patch > was not generated against head so I re-rolled the patch against current head > and attached to the bug report
Re: Segfault in HTCP CLR request on 64-bit
Could you please create a bugzilla report for this, complete with a patch against Squid-2.HEAD and 2.7? I'll then commit it. 2009/9/26 Jason Noble : > I recently ran into an issue where Squid 2.7 would segfault trying to issue > HTCP CLR requests. I found the segfault only occurred on 64-bit machines. > While debugging, I found that the value of stuff.S.req_hdrs was not > initialized but later, strlen was being called on it. This seems to -- by > chance -- not fail on 32 bit builds, but always segfaults on 64-bit. The > attached patch fixed the problem for me and it seems good programming > practice to properly initialize pointers to prevent issues such as this. As > the htcpStuff struct is used in other places, I have concerns that other > issues may be lurking as well, although I have yet to run into them. > > Regards, > Jason >
Re: Squid-smp : Please discuss
If you want to start looking at -threading- inside Squid, I'd suggest thinking first how you'd create a generic thread "helper" framework that allows Squid to run multiple internal threads that can do "stuff", and then implement some message/data queues and handle notification between threads. You can then push some "stuff" into these worker threads as an experiment and see exactly what the issues are. Building worker threads into Squid is easy. Making them do anything? Not so easy :) Adrian 2009/9/15 Sachin Malave : > On Tue, Sep 15, 2009 at 1:38 AM, Adrian Chadd wrote: >> 2009/9/15 Sachin Malave : >>> On Tue, Sep 15, 2009 at 1:18 AM, Adrian Chadd >>> wrote: >>>> Guys, >>>> >>>> Please look at what other multi-CPU network applications do, how they >>>> work and don't work well, before continuing this kind of discussion. >>>> >>>> Everything that has been discussed has already been done to death >>>> elsewhere. Please don't re-invent the wheel, badly. >> >>> Yes synchronization is always expensive . So we must target only those >>> areas where shared data is updated infrequently. Also if we are making >>> thread then the amount of work done must be more as compared to >>> overheads required in thread creation, synchronization & scheduling. >> >> Current generation CPUs are a lot, lot better at the thread-style sync >> primitives than older CPUs. >> >> There's other things to think about, such as lockless queues, >> transactional memory hackery, atomic instructions in general, etc, >> etc, which depend entirely upon the type of hardware being targetted. >> >>> If we try to provide locks to existing data structures then >>> synchronization factor will definitely affect to our design. >> >>> Redesigning of such structures and there behavior is time consuming >>> and may change whole design of the Squid. 
>> >> >> Adrian >> > > > > And current generation libraries are also far better than older, like > OpenMP, creating threads and handling synchronization issues in OpenMP > is very easy... > > Automatic locks are provided, u need not to design your own locking > mechanisms Just a statement and u can lock the shared > variable... > Then the major work remains is to identify the shared access. > > I WANT TO USE OPENMP library. > > ANY suggestions. > >
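The OpenMP style Sachin advocates looks roughly like this. It is only a sketch, since whether pragma-level parallelism fits Squid's callback model is exactly the open question in this thread; built without the OpenMP flag, the pragma is ignored and the loop runs serially with the same result.

```c
/* The OpenMP usage described above: annotate the shared update and the
 * runtime supplies the synchronization. The reduction clause gives each
 * thread a private copy of "hits" and combines them at the end, so no
 * hand-rolled lock is needed. */
static long shared_count_hits(int n)
{
    long hits = 0;
    #pragma omp parallel for reduction(+:hits)
    for (int i = 0; i < n; i++)
        hits += 1;   /* stand-in for updating a shared statistic */
    return hits;
}
```

The convenience is real, but as the rest of the thread notes, it does nothing about identifying which accesses are actually shared — that survey is still the hard part.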
Re: Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?
But in that case, ACCESS_REQ_PROXY_AUTH would be returned rather than ACCESS_DENIED.. Adrian 2009/9/15 Robert Collins : > On Tue, 2009-09-15 at 15:22 +1000, Adrian Chadd wrote: >> G'day. This question is aimed mostly at Henrik, who I recall replying >> to a similar question years ago but without explaining why. >> >> Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on a denied ACL? >> >> The particular bit in src/client_side.c: >> >> int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || >> aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent; >> >> Is there any particular reason why auth is tried again? it forces a >> pop-up on browsers that already have done authentication via NTLM. > > Because it should? Perhaps you can expand on where you are seeing this - > I suspect a misconfiguration or some such. > > Its entirely appropriate to signal HTTP_PROXY_AUTHENTICATION_REQUIRED > when a user is denied access to a resource *and if they log in > differently they could get access*. > > -Rob >
Re: Squid-smp : Please discuss
2009/9/15 Sachin Malave : > On Tue, Sep 15, 2009 at 1:18 AM, Adrian Chadd wrote: >> Guys, >> >> Please look at what other multi-CPU network applications do, how they >> work and don't work well, before continuing this kind of discussion. >> >> Everything that has been discussed has already been done to death >> elsewhere. Please don't re-invent the wheel, badly. > Yes synchronization is always expensive. So we must target only those > areas where shared data is updated infrequently. Also if we are making > thread then the amount of work done must be more as compared to > overheads required in thread creation, synchronization & scheduling. Current generation CPUs are a lot, lot better at the thread-style sync primitives than older CPUs. There's other things to think about, such as lockless queues, transactional memory hackery, atomic instructions in general, etc, etc, which depend entirely upon the type of hardware being targeted. > If we try to provide locks to existing data structures then > synchronization factor will definitely affect our design. > Redesigning of such structures and their behavior is time consuming > and may change the whole design of the Squid. Adrian
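One of the primitives mentioned above — an atomic instruction instead of a lock — can be sketched as a C11 shared counter. This is illustrative, not Squid code, and needs a C11 compiler with <stdatomic.h>.

```c
#include <stdatomic.h>

/* A shared statistics counter bumped with a single atomic
 * read-modify-write, avoiding a mutex entirely. On current CPUs this
 * compiles to one lock-prefixed (or LL/SC) instruction, which is the
 * "CPUs are a lot better at sync primitives now" point above. */
static atomic_long request_count;   /* zero-initialized */

static long count_request(void)
{
    /* atomic_fetch_add returns the previous value */
    return atomic_fetch_add(&request_count, 1) + 1;
}
```

For a single hot counter this scales far better than a mutex, though heavy cross-CPU contention on one cache line still costs; that hardware dependence is the caveat in the text.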
Re: Squid-smp : Please discuss
Guys, Please look at what other multi-CPU network applications do, how they work and don't work well, before continuing this kind of discussion. Everything that has been discussed has already been done to death elsewhere. Please don't re-invent the wheel, badly. Adrian 2009/9/15 Robert Collins : > On Tue, 2009-09-15 at 14:27 +1200, Amos Jeffries wrote: >> >> >> RefCounting done properly forms a lock on certain read-only types like >> Config. Though we are currently handling that for Config by leaking >> the >> memory out every gap. >> >> SquidString is not thread-safe. But StringNG with its separate >> refcounted >> buffers is almost there. Each thread having a copy of StringNG sharing >> a >> SBuf equates to a lock with copy-on-write possibly causing issues we >> need >> to look at if/when we get to that scope. > > General rule: you do /not/ want thread safe objectse for high usage > objects like RefCount and StringNG. > > synchronisation is expensive; design to avoid synchronisation and hand > offs as much as possible. > > -Rob > >
Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?
G'day. This question is aimed mostly at Henrik, who I recall replying to a similar question years ago but without explaining why. Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on a denied ACL? The particular bit in src/client_side.c: int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent; Is there any particular reason why auth is tried again? it forces a pop-up on browsers that already have done authentication via NTLM. I've written a patch to fix this in Squid-2.7: http://www.creative.net.au/diffs/2009-09-15-squid-2.7-auth_required_on_auth_acl_deny.diff I'll create a bugtraq entry when I have some more background information about this. Thanks, adrian
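The quoted expression from src/client_side.c boils down to the following decision, shown here as a pure function. Names loosely mirror Squid-2's, but this is a sketch of the logic, not the actual code.

```c
/* Reply 407 (Proxy-Authentication Required) only when the denial is
 * auth-related and the request is not transparently intercepted;
 * otherwise reply 403. The aclIsProxyAuth() term is what makes a plain
 * DENY on an auth-related ACL trigger 407 -- the behaviour being
 * questioned above. */
enum access {
    ACCESS_DENIED = 0,
    ACCESS_ALLOWED = 1,
    ACCESS_REQ_PROXY_AUTH = 2
};

static int http_status_for_denial(enum access answer,
                                  int acl_is_proxy_auth,
                                  int transparent)
{
    int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || acl_is_proxy_auth)
                       && !transparent;
    return require_auth ? 407 : 403;
}
```

The last case in the test below is the one Adrian objects to: the ACL match was a plain deny, yet because the matched ACL is auth-related the client still gets a 407 and the browser re-pops its login dialog.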
Re: separate these to new list?: "Build failed..."
2009/8/16 Henrik Nordstrom : > sön 2009-08-16 klockan 10:23 +1000 skrev Robert Collins: > >> If the noise is too disturbing to folk we can investigate these... I >> wouldn't want anyone to leave the list because of these reports. > > I would expect the number of reports to decline significantly as we > learn to check commits better to avoid getting flamed in failed build > reports an hour later.. combined with the filtering just applied which > already reduced it to 1/6. > > But seriously, it would be a sad day if these reports becomes so > frequent compared to other discussions that developers no longer would > like to stay subscribed. We then have far more serious problems.. There are more build failure messages on squid-dev than actual development discussion. Perhaps the build failure email should start spamming the person who did the commit, rather than squid-dev. Adrian
squid-2 - vary and x-accelerator-vary differences?
G'day, I just noticed in src/HttpReply.c that the vary expire option (Config.onoff.vary_ignore_expire) is checked if the reply has HDR_VARY set but it does not check if HDR_X_ACCELERATOR_VARY is set. Everywhere else in the code checks them both consistently and assembles "Vary" header contents consistently from both. Is this an oversight/bug? Is it intentional behaviour? Thanks, Adrian
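The consistent check Adrian says the rest of the code performs can be sketched as follows. The header bits here are illustrative stand-ins for Squid-2's HDR_VARY / HDR_X_ACCELERATOR_VARY enums, not the real header API.

```c
/* Treat a reply as varying if EITHER header is present, matching how
 * the rest of the code assembles "Vary" contents from both. The
 * reported oversight is a site that checks only the first bit. */
enum hdr_bit {
    HDR_VARY_BIT              = 1u << 0,
    HDR_X_ACCELERATOR_VARY_BIT = 1u << 1
};

static int reply_varies(unsigned present)
{
    return (present & HDR_VARY_BIT) != 0
        || (present & HDR_X_ACCELERATOR_VARY_BIT) != 0;
}
```

If the vary_ignore_expire handling used a predicate like this instead of testing HDR_VARY alone, accelerator-vary replies would get the same expiry treatment, which appears to be the intent everywhere else.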
Re: multiple store-dir issues
2009/7/20 Henrik Nordstrom : >> I've fixed a potentially risky situation in Lusca relating to the >> initialisation of the storeIOState cbdata type. Each storedir has a >> different idea of how the allocation should be free()'ed. > > Risky in what sense? Ah. I just re-re-re-read the code again and I now understand what is going on. There are multiple definitions of "storeIOState cbdata" being allocated instead of one. The definitions are local to each module. Ok. Sorry for the noise. I'll commit a fix to COSS for the initialisation issue someone reported during reconfigure. Adrian
multiple store-dir issues
G'day, I've fixed a potentially risky situation in Lusca relating to the initialisation of the storeIOState cbdata type. Each storedir has a different idea of how the allocation should be free()'ed. The relevant commit in Lusca is r14208 - http://code.google.com/p/lusca-cache/source/detail?r=14208 . I'd like this approach to be included in Squid-2.HEAD and backported to Squid-2.7 / Squid-2.6. Thanks, adrian
Re: Hello from Mozilla
2009/7/17 Ian Hickson : >> That way you are still speaking HTTP right until the "protocol change" >> occurs, so any and all HTTP compatible changes in the path(s) will >> occur. > > As mentioned earlier, we need the handshake to be very precisely defined > because otherwise people could trick unsuspecting servers into opting in, > or rather appearing to opt in, and could then send all kinds of commands > down to those servers. Would you please provide an example of where an unsuspecting server is tricked into doing something? >> Ian, don't you see and understand the semantic difference between >> "speaking HTTP" and "speaking a magic bytecode that is intended to look >> HTTP-enough to fool a bunch of things until the upgrade process occurs" >> ? Don't you understand that the possible set of things that can go wrong >> here is quite unbounded ? Don't you understand the whole reason for >> "known ports" and protocol descriptions in the first place? > > Apparently not. Ok. Look at this. The byte sequence "GET / HTTP/1.0\r\nHost: foo\r\nConnection: close\r\n\r\n" is not byte equivalent to the sequence "GET / HTTP/1.0\r\nConnection: close\r\nHost: foo\r\n\r\n" The same byte sequence interpreted as a HTTP protocol exchange is equivalent. There's a mostly-expected understanding that what happens over port 80 is HTTP. The few cases where that has broken (specifically Shoutcast, but I do see other crap on port 80 from time to time..) has been by people who have implemented a mostly HTTP looking protocol, tested that it mostly works via a few gateways/firewalls/proxies, and then deployed it. >> My suggestion is to completely toss the whole "pretend to be HTTP" thing >> out of the window and look at extending or adding a new HTTP mechanism >> for negotiating proper tunneling on port 80. If this involves making >> CONNECT work on port 80 then so be it. > > Redesigning HTTP is really much more work than I intend to take on here. 
> HTTP already has an Upgrade mechanism, reusing it seems the right thing to > do. What you intend to take on here and what should be taken on here is very relevant. You're intending to do stuff over tcp/80 which looks like HTTP but isn't HTTP. Everyone who implements anything HTTP gateway related (be it a transparent proxy, a firewall, a HTTP "router", etc) suddenly may have to implement your websockets stuff as well. So all of a sudden your attempt to not extend HTTP ends up extending HTTP. >> The point is, there may be a whole lot of stuff going on with HTTP >> implementations that you're not aware of. > > Sure, but with the except of man-in-the-middle proxies, this isn't a big > deal -- the people implementing the server side are in control of what the > HTTP implementation is doing. That may be your understanding of how the world works, but out here in the rest of the world, the people who deploy the edge and the people who deploy the core may not be the same people. There may be a dozen layers of red tape, equipment lifecycle, security features, etc, that need to be handled before "websockets happy" stuff can be deployed everywhere it needs to. Please don't discount man-in-the-middle -anything- as being "easy" to deal with. > In all cases except a man-in-the-middle proxy, this seems to be what we > do. I'm not sure how we can do anything in the case of such a proxy, since > by definition the client doesn't know it is present. .. so you're still not speaking HTTP? Ian, are you absolutely certain that everywhere you use "the internet", there is no "man in the middle" between you and the server you're speaking to? Haven't you ever worked at any form of corporate or enterprise environment? What about existing "captive portal" deployments like wifi hotspots, some of which still use squid-2.5 (eww!) as their http firewall/proxy to control access to the internet? 
That stuff is going to need upgrading sure, but I'd rather see the upgrade happen once to a well thought out and reasonably well designed protocol, versus having lots of little upgrades need to occur because your "HTTP but not quite HTTP" exchange on port 80 isn't thought out enough. Adrian
Re: [PATCH] Bug 2680: ** helper errors after -k rotate
Note that winbind has a hard-coded limit that is by default very low. Opening 2n ntlm_auth helpers may make things blow up in horrible ways. Adrian 2009/7/16 Robert Collins : > On Thu, 2009-07-16 at 14:08 +1200, Amos Jeffries wrote: >> >> Both reconfigure and helper recovery use startHelpers() where the >> limit >> needs to take place. >> The DOS bug fix broke *rotate* (reconfigure has an async step added by >> Alex >> that prevents it being a problem). > > s/rotate/reconfigure then :) In my mind one is a subset of the other. > >> > If someone is running hundreds of helpers on openwrt/olpc then >> things >> > are broken already :). I'd really suggest that such environments >> > pipeline through a single helper rather than many concurrent >> helpers. >> > Such platorms are single core and you'll get better usage of memory >> > doing many requests in a single helper than one request each to many >> > helpers. >> >> lol, NTLM concurrent? try it! > > I did. IIRC the winbindd is fully capable of handling multiple > overlapping requests, and each NTLM helper is *solely* a thunk layer > between squid's format and the winbindd *state*. > > ASCII art time, 3 requests: > Multiple helpers: > /--1-helper--\ > squid-*---2-helper---* winbindd [state1, state2, state3] > \--3-helper--/ > One helper: > squid-*--1-helper---* winbindd [state1, state2, state3] > > -Rob > >
Re: Hello from Mozilla
2009/7/16 Ian Hickson : >> Right down to the HTTP/1.1 reserved protocol label (do get that changed >> please). > > If we're faking HTTP, then it has to look like HTTP. The message here is "don't fake HTTP". "Speak HTTP over port 80". > I'm getting very mixed messages here. > > Is there a reliable way to open a bidirectional non-HTTP TCP/IP connection > through a Squid man-in-the-middle proxy over port 80 to a remote server > that normally acts like an HTTP server? If so, what is the sequence of > bytes that will act in this way? That is the wrong question. The whole point of speaking HTTP on port 80 is to be able to speak a variety of sequence of bytes, all which match the HTTP protocol specification, in order to get the job done. At the point you're speaking on port TCP/80, you're not just speaking a sequence of bytes any more. You're speaking HTTP. There are plenty of sequences of bytes that can occur that are -semantically identical-. Adrian
Re: Hello from Mozilla
2009/7/16 Ian Hickson : > We actually used to do that, but we got requests to make it more > compatible with the HTTP Upgrade mechanism so that people could add the > support to their HTTP servers instead of having to put code in front of > their servers. Right. So why not extend the spec a little more to make a tunneling based upgrade process or something over HTTP? That way you are still speaking HTTP right until the "protocol change" occurs, so any and all HTTP compatible changes in the path(s) will occur. This includes things like authentication, which I believe Henrik mentioned. > Well, since Upgrade is a by-hop packet, apparently that's a moot point > anyway, because man-in-the-middle proxies will always break it if they're > present. So I'm not convinced that allowing HTTP modifications matters. Ian, don't you see and understand the semantic difference between "speaking HTTP" and "speaking a magic bytecode that is intended to look HTTP-enough to fool a bunch of things until the upgrade process occurs" ? Don't you understand that the possible set of things that can go wrong here is quite unbounded ? Don't you understand the whole reason for "known ports" and protocol descriptions in the first place? > But the point is that it is a recognisable handshake and so could be > implemented as a switch before hitting the HTTP server, or it could be > implemented in the HTTP server itself (as some people apparently want). It > fails with man-in-the-middle proxies, but then that's what the TLS-over- > port-443 solution is intended for. My suggestion is to completely toss the whole "pretend to be HTTP" thing out of the window and look at extending or adding a new HTTP mechanism for negotiating proper tunneling on port 80. If this involves making CONNECT work on port 80 then so be it. The point is, there may be a whole lot of stuff going on with HTTP implementations that you're not aware of. 
I'd rather invest my time in making certain that what you speak on port 80 is -still HTTP- (and what you speak to proxies which are relaying your websocket data around is also HTTP) right until a well understood protocol upgrade occurs. Frankly, I'm curious how the process got this far inside the websockets community -without- having -anyone- with HTTP experience step up and state all the reasons this is a bad, bad idea. Surely mozilla has some smart HTTP clued up people on board? :) Adrian
Re: Hello from Mozilla
2009/7/15 Amos Jeffries : > a) Getting a dedicated WebSocket port assigned. > * You and the client needing it have an argument to get that port opened > through the firewall. > * Squid and other proxies can be altered to allow CONNECT through to safe > defined ports (80 is not one). Or to do the WebSocket upgrade itself. > > b) accepting that the network being traversed is screwed beyond redemption > by its own policy or admin. I think the fundamental mistake being made here by Ian (and potentially others) is breaking the assumption that specific protocols exist on the well-known ports. Suddenly treating stuff on port 80 as "almost but not quite HTTP" is bound to cause issues, both devices speaking valid HTTP (eg Squid) and firewalls etc which may treat the exchange as "not HTTP" and decide to start dropping things. Or worse - passing it through, "sort of". Ian - I understand your motivations here but I think it shows a fundamental mis-understanding of the glue which keeps the internet mostly functioning together. Here's a question for you - would you run a mythical protocol, call it "foonet", over IP, if it looked almost-but-not-quite like IP so people could run it on their existing IP networks? Can you see any particular issues with that? Other slots in the mythical OSI stack shouldn't be treated any differently. Adrian
Re: Hello from Mozilla
2009/7/15 Ian Hickson : > On Tue, 14 Jul 2009, Alex Rousskov wrote: >> >> WebSocket made the handshake bytes look like something Squid thinks it >> understands. That is the whole point of the argument. You are sending an >> HTTP-looking message that is not really an HTTP message. I think this is >> a recipe for trouble, even though it might solve some problem in some >> environments. > > Could you elaborate on what bytes Squid thinks it should change in the > WebSocket handshake? Anything which it can under the HTTP/1.x RFCs. Maybe I missed it - why exactly again aren't you just talking HTTP on the HTTP port(s), and doing a standard HTTP upgrade? Adrian
squid-2.HEAD hanging with 304 not modified responses
G'day guys, I've fixed a bug in Lusca which was introduced with Benno's method_t stuff. The specific bug is revalidation replies 'hanging' until the upstream socket closes, forcing an end of message to occur. The history and patch are here: http://code.google.com/p/lusca-cache/source/detail?r=14103 Those of you toying with Squid-2.HEAD (eg Mark) - would you mind verifying that you can reproduce it on Squid-2.HEAD and comment on the fix? Thanks, adrian
NTLM authentication popups, etc
I'm working on a couple of paid squid + active directory deployments and they're both seeing the occasional NTLM auth popup happening. The workaround is pretty simple - just enable the IP auth cache. This however doesn't solve the fundamental problem(s), whatever they are. The symptom is logs like this: [2009/06/15 16:20:17, 1] libsmb/ntlmssp.c:ntlmssp_update(334) got NTLMSSP command 1, expected 3 And vice versa (expected 3, got 1.) These correspond to states in samba/source/include/ntlmssp.h - 1 is NTLMSSP_NEGOTIATE; 3 is NTLMSSP_AUTH. The conclusion here is that there's a disconnect between the authentication state of the client -and- the authentication state of ntlm_auth. I'm trying to eliminate the possibilities here. The stateful helper stuff seems correct enough, so requests aren't being queued to already busy stateful helpers. The other two possibilities I can immediately think of: * 1 - authentication is aborted somewhere for whatever reason; an authentication helper is stuck at the wrong point in the state engine; the next request coming along starts at NTLMSSP_NEGOTIATE but the ntlm_auth helper it is handed to is at NTLMSSP_AUTH (from the partial authentication attempt earlier); error * 2 - the web browser is stuffing different phases of the negotiation down different connections to the proxy. Now, debugging (1) shouldn't be difficult at all. I'm going to try and determine the code paths that lead to and from an aborted auth request, add in some debugging and see if the helper is closed. Debugging (2) without full logs (impractical in this environment) and full traffic dump (again, impractical in production) is going to be a bit more difficult. I'm thinking about adding some hacky code to the Squid ntlm auth class which keeps a log of the auth blobs sent/received from/to the client and ntlm_auth. I can then dump the entire conversation out to cache.log whenever authentication fails/errors. This should at least give me a hint as to what is going on. 
(1) can explain the "client state == NTLMSSP_NEGOTIATE but ntlm_auth state == NTLMSSP_AUTH" problem but not vice versa. (2) explains both. It is quite possible it is the combination of both however. Now, the reason this is getting somewhat annoying and why I'd like to try and understand/fix it is that -another- problem seen by one of these clients is negotiate/ntlm authentication from IE (at least IE8) through Squid. I've got packet dumps showing the browser sending different phases of the negotiation down separate proxy connections and then reusing the original one incorrectly. My medium term plan is to take whatever evidence I have of this behaviour and throw it at the IE group(s) at Microsoft but in the short term I'd like to make certain the proxy authentication side of things is completely blameless before I hand off stuff to third parties. Ideas? Comments? adrian
Re: Very odd problem running squid 2.7 on Windows
Actually, it should probably be 1 vs 0; -1 still evaluates to true if you go (if func()). I think the last few hours of fixing bad C and putting in error checking messed me around a bit. I just wrote a quick bit of C to double-check. (Of course in C++ there's native bool types, no? :) Sorry! Adrian 2009/5/25 Kinkie : > On Mon, May 25, 2009 at 2:21 PM, Adrian Chadd wrote: >> int >> isUnsignedNumeric(const char *str) >> { >> for (; *str; str++) { >> if (! isdigit(*str)) >> return -1; >> } >> return 1; >> } > > Wouldn't returning 0 on false instead of -1 be easier? > Just a random thought.. > > > -- > /kinkie > >
Re: Very odd problem running squid 2.7 on Windows
int isUnsignedNumeric(const char *str) { for (; *str; str++) { if (! isdigit(*str)) return -1; } return 1; } 2009/5/25 Adrian Chadd : > strtoul(). But if you want to verify the -whole- thing is numeric, > just write a bit of C which does this: > > int isNumeric(const char *str) > { > > } > > 2009/5/25 Amos Jeffries : >> Guido Serassio wrote: >>> >>> Hi, >>> >>> At 16.17 24/05/2009, Adrian Chadd wrote: >>>> >>>> Well as Amos said, this isn't the way to call getservbyname(). >>>> >>>> getservbyname() doesn't translate ports to ports; it translates >>>> tcp/udp service names to ports. It should be returning NULL if it >>>> can't find the service string in the file. >>>> >>>> Methinks numeric values shouldn't be handed to getservbyname() under >>>> Windows. :) >>> >>> So, we have just found a Squid bug :-) >>> >>> Regards >> >> Yes. Question becomes though, fastest way to detect numeric-only strings. >> >> Amos >> -- >> Please be using >> Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15 >> Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1 >> >> >
Re: Very odd problem running squid 2.7 on Windows
strtoul(). But if you want to verify the -whole- thing is numeric, just write a bit of C which does this: int isNumeric(const char *str) { } 2009/5/25 Amos Jeffries : > Guido Serassio wrote: >> >> Hi, >> >> At 16.17 24/05/2009, Adrian Chadd wrote: >>> >>> Well as Amos said, this isn't the way to call getservbyname(). >>> >>> getservbyname() doesn't translate ports to ports; it translates >>> tcp/udp service names to ports. It should be returning NULL if it >>> can't find the service string in the file. >>> >>> Methinks numeric values shouldn't be handed to getservbyname() under >>> Windows. :) >> >> So, we have just found a Squid bug :-) >> >> Regards > > Yes. Question becomes though, fastest way to detect numeric-only strings. > > Amos > -- > Please be using > Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15 > Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1 > >
Re: Very odd problem running squid 2.7 on Windows
Well as Amos said, this isn't the way to call getservbyname(). getservbyname() doesn't translate ports to ports; it translates tcp/udp service names to ports. It should be returning NULL if it can't find the service string in the file. Methinks numeric values shouldn't be handed to getservbyname() under Windows. :) adrian 2009/5/24 Guido Serassio : > Hi, > > At 04.38 24/05/2009, Adrian Chadd wrote: >> >> Can you craft a small C program to replicate the behaviour? > > Sure, I wrote the following test program: > > #include > #include > > void main(void) > { > u_short i, converted; > WSADATA wsaData; > struct servent *port = NULL; > char token[32]; > const char proto[] = "tcp"; > > WSAStartup(2, &wsaData); > > for (i=1; i<65535; i++) > { > sprintf(token, "%d", i); > port = getservbyname(token, proto); > if (port != NULL) { > converted=ntohs((u_short) port->s_port); > if (i != converted) > printf("%d %d\n", i, converted); > } > } > WSACleanup(); > } > > And this is the result on my Windows XP x64 machine (similar results on > Windows 2000 and Vista): > > 2 512 > 258 513 > 524 3074 > 770 515 > 782 3587 > 1288 2053 > 1792 7 > 1807 3847 > 2050 520 > 2234 47624 > 2304 9 > 2311 1801 > 2562 522 > 2564 1034 > 2816 11 > 3328 13 > 3586 526 > 3853 3343 > 4352 17 > 4354 529 > 4610 530 > 4864 19 > 4866 531 > 5120 20 > 5122 532 > 5376 21 > 5632 22 > 5888 23 > 6400 25 > 7170 540 > 7938 543 > 8194 544 > 8706 546 > 8962 547 > 9472 37 > 10752 42 > 10767 3882 > 11008 43 > 11266 556 > 12054 5679 > 13058 563 > 13568 53 > 13570 565 > 13579 2869 > 14380 11320 > 14856 2106 > 15372 3132 > 15629 3389 > 16165 9535 > 16897 322 > 17920 70 > 18182 1607 > 18183 1863 > 19977 2382 > 20224 79 > 20233 2383 > 20480 80 > 20736 81 > 20738 593 > 21764 1109 > 22528 88 > 22550 5720 > 22793 2393 > 23049 2394 > 23809 349 > 24335 3935 > 25602 612 > 25856 101 > 25858 613 > 26112 102 > 27392 107 > 27655 1900 > 27904 109 > 28160 110 > 28416 111 > 28928 113 > 29952 117 > 30208 118 > 30222 3702 > 30464 119 > 31746 
636 > 34049 389 > 34560 135 > 35072 137 > 35584 139 > 36106 2701 > 36362 2702 > 36608 143 > 36618 2703 > 36874 2704 > 37905 4500 > 38400 150 > 38919 1944 > 39173 1433 > 39426 666 > 39429 1434 > 39936 156 > 39945 2460 > 40448 158 > 42250 2725 > 43520 170 > 44806 1711 > 45824 179 > 45826 691 > 47383 6073 > 47624 2234 > 47873 443 > 47878 1723 > 48385 445 > 49166 3776 > 49664 194 > 49926 1731 > 50188 3268 > 50437 1477 > 50444 3269 > 50693 1478 > 51209 2504 > 52235 3020 > 53005 3535 > 53249 464 > 53510 1745 > 54285 3540 > 55309 3544 > 56070 1755 > 56579 989 > 56585 2525 > 56835 990 > 57347 992 > 57603 993 > 57859 994 > 58115 995 > 59397 1512 > 60674 749 > 62469 1524 > 62980 1270 > 64257 507 > 65040 4350 > > It seems that sometime (!!!) getservbyname() will incorrectly return > something ... > > Regards > > Guido > > >> adrian >> >> 2009/5/24 Guido Serassio : >> > Hi, >> > >> > One user has reported a very strange problem using cache_peer directive >> > on >> > 2.7 STABLE6 running on Windows: >> > >> > When using the following config: >> > >> > cache_peer 192.168.0.63 parent 3329 0 no-query >> > cache_peer rea.acmeconsulting.loc parent 3328 3130 >> > >> > the result is always: >> > >> > 2009/05/23 12:35:28| Configuring 192.168.0.63 Parent 192.168.0.63/3329/0 >> > 2009/05/23 12:35:28| Configuring rea.acmeconsulting.loc Parent >> > rea.acmeconsulting.loc/13/3130 >> > >> > Very odd >> > >> > Debugging the code, I have found where is situated the problem. >> > >> > The following if GetService() from cache_cf.c: >> > >> > static u_short >> > GetService(const char *proto) >> > { >> > struct servent *port = NULL; >> > char *token = strtok(NULL, w_space); >> > if (token == NULL) { >> > self_destruct(); >> > return -1; /* NEVER REACHED */ >> > } >> > port = getservbyname(token, proto); >> > if (port != NULL) { >> > return ntohs((u_short) port->s_port); >> > } >> > return xatos(token); >> > } >> > >> > When the value of port->s_port is 3328, ntohs() always returns 13. 
>> > Other values seems to work fine. >> > >> > Any idea ? >> > >> > Regards >> > >> > Guido >> > >> > >> > >> > - >> > >> > Guido Serassio >> > Acme Consulting S.r.l. - Microsoft Certified Partner >> > Via Lucia Savarino, 1 10098 - Rivoli (TO) - ITALY >> > Tel. : +39.011.9530135 Fax. : +39.011.9781115 >> > Email: guido.seras...@acmeconsulting.it >> > WWW: http://www.acmeconsulting.it/ >> > >> > > > > - > > Guido Serassio > Acme Consulting S.r.l. - Microsoft Certified Partner > Via Lucia Savarino, 1 10098 - Rivoli (TO) - ITALY > Tel. : +39.011.9530135 Fax. : +39.011.9781115 > Email: guido.seras...@acmeconsulting.it > WWW: http://www.acmeconsulting.it/ > >
Re: Very odd problem running squid 2.7 on Windows
Can you craft a small C program to replicate the behaviour? adrian 2009/5/24 Guido Serassio : > Hi, > > One user has reported a very strange problem using cache_peer directive on > 2.7 STABLE6 running on Windows: > > When using the following config: > > cache_peer 192.168.0.63 parent 3329 0 no-query > cache_peer rea.acmeconsulting.loc parent 3328 3130 > > the result is always: > > 2009/05/23 12:35:28| Configuring 192.168.0.63 Parent 192.168.0.63/3329/0 > 2009/05/23 12:35:28| Configuring rea.acmeconsulting.loc Parent > rea.acmeconsulting.loc/13/3130 > > Very odd > > Debugging the code, I have found where is situated the problem. > > The following if GetService() from cache_cf.c: > > static u_short > GetService(const char *proto) > { >struct servent *port = NULL; >char *token = strtok(NULL, w_space); >if (token == NULL) { >self_destruct(); >return -1; /* NEVER REACHED */ >} >port = getservbyname(token, proto); >if (port != NULL) { >return ntohs((u_short) port->s_port); >} >return xatos(token); > } > > When the value of port->s_port is 3328, ntohs() always returns 13. > Other values seems to work fine. > > Any idea ? > > Regards > > Guido > > > > - > > Guido Serassio > Acme Consulting S.r.l. - Microsoft Certified Partner > Via Lucia Savarino, 1 10098 - Rivoli (TO) - ITALY > Tel. : +39.011.9530135 Fax. : +39.011.9781115 > Email: guido.seras...@acmeconsulting.it > WWW: http://www.acmeconsulting.it/ > >
Re: Is it really necessary for fatal() to dump core?
2009/5/19 Mark Nottingham : > I'm going to push back on that; the administrator doesn't really have any > need to get a core when, for example, append_domain doesn't start with .'. > > Squid.conf is bloated as it is; if there are cases where a core could be > conceivably useful, they should be converted to fatal_dump. From what I've > seen they'll be a small minority at best... Well, I'd be interested in seeing some better defined characteristics of "stuff" with some sort of defined expectations and behaviour. Like an API. :) Right now, fatal, assert, etc are all used interchangeably for quite a wide variety of reasons and the codebase may be much better off if someone starts off by fixing these a bit. Adrian
Re: Is it really necessary for fatal() to dump core?
just make that behaviour configurable? core_on_fatal {on|off} Adrian 2009/5/19 Mark Nottingham : > tools.c:fatal() dumps core because it calls abort. > > Considering that the core can be quite large (esp. on a 64bit system), and > that there's fatal_dump() as well if you really want one, can we just make > fatal() exit(1) instead of abort()ing? > > Cheers, > > -- > Mark Nottingham m...@yahoo-inc.com > > >
Re: 3.0 assertion in comm.cc:572
2009/5/11 Amos Jeffries : > We have one user with a fairly serious production machine hitting this > assertion. > It's an attempted comm_read of closed FD after reconfigure. > > Nasty, but I think the asserts can be converted to a nop return. Does anyone > know of a subsystem that would fail badly after a failed read with all its > sockets and networking closed anyway? That will bite you later on if/when you want to move to support Windows overlapped IO / POSIX AIO style kernel async IO on network sockets. You don't want reads scheduled on FDs that are closed; nor do you want the FD closed during the execution of the read. Figure out what is scheduling a read / what is scheduling the completion incorrectly and fix the bug. Adrian
Re: Squid logs into MySQL database
G'day! Thanks for that. Would you like to have it included in Squid-2.HEAD (and thus in the next Squid-2.x release?) thanks, Adrian 2009/5/11 Visolve Squid Team : > Hi All, > > We have released an earlier version of an external program( plug-in ) to log > squid access to MySQL database using logfile_daemon feature in squid 2.7. > The plug-in is available at : > http://www.visolve.com/squid/squid-mysqllog.php > > Do send your comments for the improvement. > > Thanks, > ViSolve Squid Team. > >
/dev/poll solaris 10 fixes
I'm giving my /dev/poll (Solaris 10) code a good thrashing on some updated Sun hardware. I've fixed one silly bug of mine in 2.7 and 2.HEAD. If you're running Solaris 10 and not using the /dev/poll code then please try out the current CVS version(s) or wait for tomorrow's snapshots. I'll commit whatever other fixes are needed in this environment here :) Thanks, Adrian
Squid-2/Lusca async io shortcomings..
Hi all, I've been braindumping my thoughts into the Lusca blog during some experimental development to eliminate the data copy in the disk store read path. This shows up as the number 1 CPU abuser in my test CDN deployment - where I see a 99% hit rate on a set of large objects (> 16meg.) My first idea was to avoid having to paper over the storage code shortcomings with refcounted buffers, and modify various bits of code to keep the store supplied read buffer around until the completion of said read IO. This mirrors the requirements for various other underlying async io implementations such as posix AIO and windows completion IO. Unfortunately the store layer and the async IO code don't handle event cancellation right (ie, you can't do it) but the temporary read buffer in async_io.c + the callback data pointer check papers over that. Store reads and writes may be scheduled and in flight when some other part of code calls storeClose() and nothing really tries to wait around for the read IO to complete. So either the store layer needs to be made slightly more sane (which I may attempt later), or the whole mess can stay a mess and be papered over by abusing refcounted buffers all the way down to the IO layer. Anyway, I know there are other developers out there working on filesystem code for Squid-3 and I'm reasonably certain (read: at last check a few months ago) the store layer and IO layers are just as grimy - so hopefully my braindumping will save some more of you a whole lot of headache. :) Adrian
Re: Feature: quota control
Just to add to this - implementing it as a delay pool inside Squid flattens traffic into one byte pool. Various places may not do this at all - there may be "free" versus "non-free" (which means one set of ACLs inside Squid); there may be "cheap" versus "expensive" (again, possibly requiring multiple delay pools and multiple ACLs to map it all together; again all inside Squid) - things get very messy, very quickly. This is why my proposal (which I hope -finally- gets approved so I can begin work on it ASAP! :) involves passing off the traffic assignment to an external daemon that implements -all- of the traffic assignment and accounting logic. Squid will then just send requests for traffic and interim updates like you've said. 2c, Adrian 2009/2/26 Amos Jeffries : > Robert Collins wrote: >> >> On Fri, 2009-02-27 at 10:00 +1100, Mark Nottingham wrote: >>> >>> Honestly, if I wanted to do byte-based quotas today, I'd have an >>> external ACL helper talking to an external logging helper; that way, you >>> can just log the response sizes to a daemon and then another daemon would >>> use that information to make a decision at access time. The only even >>> mildly hard part about this is sharing state between the daemons, but if >>> you don't need the decisions to be real-time, it's not that bad (especially >>> considering that in any serious deployment, you'll have state issues >>> between multiple boxes anyway). >> >> Sure; I think that would fit with 'ensuring enough hooks' :P >> >> -Rob > > The brief description of what I gave Pieter to start with was: > > A pool based on DelayPools in that Squid decrements live as traffic goes > through. With a helper/ACL hook to retrieve the initial pool size and to > call as needed to check for current quotas. > > How the helper operates is not relevant to Squid. Thats important. 
> > The key things being that; its always called for new visitors to assign the > start quota, and when the quota is nearing empty its called again to see if > they get more. > > Helper would need to send back "UNITS AMOUNT MINIMUM" where UNITS is the > unit of quota (seconds, bytes, requests, misses?, other?), AMOUNT being a > integer count of units the client is allowed to use, and MINIMUM is the > level of units where the helper is to be asked for an update. > > 0 remaining units results in an Error page 'quota exceeded' or somesuch. > > Amos > -- > Please be using > Current Stable Squid 2.7.STABLE6 or 3.0.STABLE13 > Current Beta Squid 3.1.0.5 > >
Re: Feature: quota control
I'm looking at implementing this as part of a contract for squid-2. I was going to take a different approach - that is, i'm not going to implement quota control or management in squid; I'm going to provide the hooks to squid to allow external controls to handle the "quota". adrian 2009/2/21 Pieter De Wit : > Hi Guys, > > I would like to offer my time in working on this feature - I have not done > any squid dev, but since I would like to see this feature in Squid, I > thought I would take it on. > > I have briefly contacted Amos off list and we agreed that there is no "set > in stone" way of doing this. I would like to propose that we then start > throwing around some ideas and let's see if we can get this into squid :) > > Some ideas that Amos quickly said : > > - "Based" on delay pools > - Use of external helpers to track traffic > > > The way I see this happening is that a Quota is like a pool that empties > based on 2 classes - bytes and requests. Requests will be for things like > the number of requests, i.e. a person is only allowed to download 5 exe's > per day or 5 requests of >1meg or something like that (it just popped into > my head :) ) > > Bytes is a pretty straight forward one, the user is only allowed x amount of > bytes per y amount of time. > > Anyways - let the ideas fly :) > > Cheers, > > Pieter > >
Resigning from squid-core
Hi all, It's been a tough decision, but I'm resigning from any further active role in the Squid core group and cutting back on contributing towards Squid development. I'd like to wish the rest of the active developers all the best in the future, and thank everyone here for helping me develop and test my performance and feature related Squid work. Adrian
Re: IRC Meetup logs up in the wiki
> Uhm, guess I go on holiday and miss out on EVERYTHING I got back on the > 17th and would have loved to attend had I the presence of mind to have > checked. :) Hey, someone got a holiday! Quick, he's relaxed enough now to work! :) > > Sorry guys. > > In other news I've got some new exposed counters for squid-2 performance - > will port to 3.1 and then submit for review. Also planning to extend > cachemgr to output in xml as alternative, will allow far simpler processing > and xsl transforms. Do you have the patches against Squid-2 available? adrian > > Extended cacti monitoring of all relevant bits is in process and will be > available soon. > > Regardt > >
Re: Ref-counted strings in Squid-2/Cacheboy
I'd like to avoid having to write to those pages if possible. Leaving the incoming data as read-only will save another write-back pass for those pages through the cache/bus, and in the case of tiny objects (ie, where parsing becomes a -big- part of the overhead), that may end up hurting. NUL terminated strings make iteration easier (you only need an address register and a check for 0) but current CPUs with their plenty-of-registers and superscalar execution mostly make that point moot. You can check, increment the pointer and decrement a length value pretty damned quickly. :) There aren't all that many places that assume C buffer semantics for String. Most of it isn't all that hairy (access_log, etc); some of it is only hairy because of the use of _C_ string library functions with String.buf() (ftp); the biggest annoyance is the vary code and the client-side code. Oh, and one has to copy the buffer anyway for regexp lookups (POSIX regex API requires a NUL terminated string), at least until we convert to PCRE which can and does take a length parameter to a regex run function. :) The point is, once you've been forced to tidy up the String users by removing the assumption that NUL will occur, you'll (hopefully) have been forced to write nicer replacement code, and everyone benefits from that. Adrian 2009/1/21 Henrik Nordstrom : > fre 2009-01-16 klockan 12:53 -0500 skrev Adrian Chadd: > >> So far, so good. It turns out doing this as an intermediary step >> worked out better than trying to replace the String code in its >> entirety with replacement code which doesn't assume NUL terminated >> strings. > > Just a thought, but is there really any parsing step where we can not > just overwrite the next octet with a \0 to get null-terminated strings? > This is what the parser does today, right? > > The HTTP parser certainly can in-place null-terminate everything. 
Header > names always ends with a : which we always throw away, and the data ends > with a newline which is also thrown away. > > Regards > Henrik > >
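[Editor's note: Adrian's point above — that counted strings iterate with "an address register and a check for 0" replaced by a pointer increment and a length decrement — can be shown with a tiny sketch. This is not Squid's String API; the function is illustrative, and is essentially what memchr() does.]

```c
#include <stddef.h>

/* Find a character in a counted (ptr, len) string. No NUL sentinel is
 * needed: iteration is just increment-pointer, decrement-length. */
static const char *
str_find(const char *buf, size_t len, char c)
{
    while (len--) {
        if (*buf == c)
            return buf;
        ++buf;
    }
    return NULL;
}
```

In a parser this lets header names like `Host:` be matched directly inside a shared read-only buffer, without NUL-terminating (and thus writing to) the page first.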
Re: Buffer/String split, take2
2009/1/21 Kinkie : > What I fear from the D&C approach is that we'll end up with lots of > duplicate code between the 'buffer' classes, to gain a tiny little bit > of efficiency and semantic clarity. If that approach has to be taken, > then I'd rather take the variant of the note - in fact that's quite in > line with what the current (agreeably ugly) code does. The trouble is that the current, agreeably ugly code, actually works (for values of "works") right now, and the last thing the project needs is for that "works" bit to be disturbed too much. > In my opinion the 'universal buffer' model can be adapted quite easily > to address different uses by extending its allocation strategy - it's > a self-contained function of code exactly for this purpose, and it > could be extended again by using Strategy patterns to do whatever the > caller wishes. It would be trivial for instance for users to request > that the underlying memory be allocated by the pageful, or to request > preallocation of a certain amount of memory if they know they'll be > using, etc. > Having a wide interface is a drawback of the Universal approach, But you don't know how that memory should be arranged. If its just for strings, then you know the memory should be arranged in whatever makes sense to minimise memory allocator overheads. In the parsing codepath, that involves parsing and creating references to an already-allocated large chunk of RAM, instead of copying into separately allocated areas. For things like disk IO (and later on, network IO too!) this may not be as obvious a case. In fact, based on the -provider- (anonymous? disk? network? some peer module?) you may want to request pages from -them- to put data into for various reasons, as simply grabbing an anonymous page from the system allocator and filling it with data may need -another- copy step later on. 
This is why I'm saying that right now, focusing on -just- the String stuff and the minimum required to do copy-free parsing and copying in and out of the store is probably the best bet. A "universal" buffer method is probably over-reaching things. There's a lot of code in Squid which needs tidying up and whatever we come up and -all- of it -has- to happen -regardless- of what buffer abstraction(s) we choose. > Regarding vector i/o, it's almost a no-brainer at a first glance: > given UniversalBuffer, implement UniversalBufferList and make MemBuf > use the latter to implement producer-consumer semantics. Then use this > for writev(). produce and consume become then extremely lightweight > calls. Let me remind you that currently MemBuf happily memmoves > contents at each consume, and other producer-consumer classes I could > find (BodyPipe and StoreEntry) are entirely different beasts, which > would benefit from having their interfaces changed to use > UniversalBuffers, but probably not their innards. And again, what I'm saying here is that a conservative, cautious approach now is likely to save a lot of risk in the development path forward. > Regarding Adrian's proposal, he and I discussed the issue extensively. > I don't agree with him that the current String will give us the best > long-term benefits. My expectation is (but we can only know after we > have at least some extensive use of it) that the cheap substringing > features of the current UniversalBuffer implementation will give us > substantial benefits in the long term. > I agree with him that fixing the most broken parts of the String > interface is a sensible strategy for merging whatever String > implementation we end up choosing. > I fear that if we focus too much on the long-term, we may end up > losing sight of the medium-term, and thus we risk reaching neither > because short-term noone does anything. 
EVERYONE keeps on asserting > that squid (2 and 3) has low-level issues to be fixed, yet at the same > time only Adrian does something in squid-2, and I feel I'm the only > one trying to do something in squid-3 - PLEASE correct me and prove me > wrong. *shrug* I think people keep choosing the wrong bits to bite off. I'm not specifically talking about you Kinkie, this certainly isn't the only instance where the problem isn't really fully understood. The problem in my eyes is that no one understands the entire Squid-3 codebase enough to start to understand what needs to happen and begin engineering an actual path forward. Everyone knows their little "corner" of the codebase. Squid-3 seems to be plagued by little mini-projects which focus on specific areas without much knowledge of how it all holds together, and all kinds of busted behaviour ensues. > There's another issue which worries me: the current implementation has > been in the works for 5 months; there have been two extensive reviews, > two half-rewrites and endless discussions. Now the issue crops up that > the basic design - whose blueprint has also been available for 5 > months in the wiki - is not good, and that we may end up having to > basical
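[Editor's note: Kinkie remarks above that "MemBuf happily memmoves contents at each consume". The alternative the thread gestures at — tracking a read offset instead of shifting the remainder to the front of the buffer — can be sketched in a few lines. This is an invented illustration, not a Squid class.]

```c
#include <stddef.h>

/* Hypothetical consumer-side buffer that tracks a read offset rather
 * than memmove()ing the unconsumed tail to the front on every consume. */
struct cbuf {
    const char *data;
    size_t len;     /* total bytes in the buffer */
    size_t off;     /* bytes already consumed */
};

static size_t
cbuf_avail(const struct cbuf *b)
{
    return b->len - b->off;
}

/* Consume up to n bytes; returns a pointer to the consumed region and
 * stores the actual count in *got. No data is copied or moved. */
static const char *
cbuf_consume(struct cbuf *b, size_t n, size_t *got)
{
    const char *p = b->data + b->off;
    if (n > cbuf_avail(b))
        n = cbuf_avail(b);
    b->off += n;
    *got = n;
    return p;
}
```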
Re: Buffer/String split, take2
2009/1/20 Alex Rousskov : > Please voice your opinion: which design would be best for Squid 3.2 and > the foreseeable future. [snip] I'm about 2/3rds of the way along the actual implementation path of this in Cacheboy so I can provide an opinion based on increasing amounts of experience. :) [Warning: long, somewhat rambly post follows, from said experience.] The thing I'm looking at right now is what buffer design is required to adequately handle the problem set. There's a few things which we currently do very stupidly in any Squid related codebase: * storeClientCopy - which Squid-2.HEAD and Cacheboy avoid the copy on, but it exposes issues (see below); * storeAppend - the majority of data coming -into- the cache (ie, anything from an upstream server, very applicable today for forward proxies, not as applicable for high-hit-rate reverse proxies) is still memcpy()'ed, and this can use up a whole lot of bus time; * creating strings - most strings are created during parsing; few are generated themselves, and those which are, are at least half static data which shouldn't be re-generated over and over and over again; * duplicating strings - httpHeaderClone() and friends - dup'ing happens quite often, and making it cheap for the read only copies which are made would be fantastic * later on, being able to use it for disk buffers, see below * later on, being able to properly use it for the memory cache, again see below The biggest problems I've hit thus far stem from the data pipeline from server -> memstore -> store client -> client side. At the moment, the storeClientCopy() call aggregates data across the 4k stmem page size (at least in squid-2/cacheboy, I think its still 4k in squid-3) and thus if your last access gave you half a page, your next access can get data from both the other half of the page and whatever is in the next buffer. 
Just referencing the stmem pages in 2.HEAD/Cacheboy means that you can (and do) end up with a large number of small reads from the memory store. You save on the referencing, but fail on the "work chunk size." You end up having to have a sensible reference counted buffer design -and- a vector list to operate on it with. The string type right now makes sense if it references a contiguous, linear block of memory (ie, a sub-region of a contig buffer). This is how its treated today. For almost all of the lifting inside Squid proper, that may be enough. There may however be a need later on for string-like and buffer-like operations on buffer -vectors- - for example, if you're doing some kind of content scanning over incoming data, you may wish to buffer your incoming data until you have enough data to match that string which is straddling two buffers - and the current APIs don't support it. Well, nothing in Squid supports it currently, but I think its worth thinking about for the longer term. Certainly though, I think that picking a sensible string API with absolutely no direct buffer access out of a few controlled areas (eg, translating a list of strings or list of buffers into an iovec for writev(), for example) is the way to go. That will equip Squid with a decent enough set of tools to start converting everything else which currently uses C strings over to using Squid Strings and eventually reap the benefits of the zero-cost string duplication. Ok, to summarise, and this may not exactly be liked by the majority of fellow developers: I think the benefits that augmenting/fixing the current SquidString API and tidying up all the bad places where its used right now is going to give you the maximum long-term benefit. There's a lot of legacy code right now which absolutely needs to be trashed and rewritten. 
I think the smartest path forward is to ignore 95% of the decision about deciding which buffering method to use for now, fix the current String API and all the code which uses it so its sensible (and fixing it so its "sensible" won't take long; fixing the code which uses it will take longer) and at that point the codebase will be in much better shape to decide which will be the better path forward. Now, just so people don't think I'm stirring trouble, I've gone through this myself in both a squid-2 branch and Cacheboy, and here's what I found: * there's a lot of code which uses C strings created from Strings; * there's a lot of code which init'ed strings from C strings, where the length was already known and thrown out; * there's a lot of code which init'ed strings from C strings which were once Strings; * there's even code which init's strings -from- a string, but only by using strBuf(s) (I'm pointing at the http header related code here, ugh) * all the stuff which directly accesses the string buffer code can and should be tossed, immediately - unfortunately there's a lot of it, the majority being in what I gather is very long-lived code in src/client_side.c (and what it became in squid-3) So what I'm sort of doing now in Cacheboy-head, combined with tidying up some of
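[Editor's note: one of the few "controlled areas" Adrian allows for raw buffer access above is "translating a list of strings or list of buffers into an iovec for writev()". A sketch of that translation follows; the list-node layout is invented for illustration.]

```c
#include <stddef.h>
#include <sys/uio.h>

/* A hypothetical singly linked list of counted buffers, e.g. the
 * segments of a reply waiting to be written to a client socket. */
struct bufnode {
    const char *data;
    size_t len;
    struct bufnode *next;
};

/* Flatten the list into at most max iovec entries for writev();
 * returns the number of entries filled. Empty segments are skipped. */
static int
buflist_to_iovec(const struct bufnode *b, struct iovec *iov, int max)
{
    int n = 0;
    for (; b && n < max; b = b->next) {
        if (b->len == 0)
            continue;
        iov[n].iov_base = (void *)b->data;
        iov[n].iov_len = b->len;
        ++n;
    }
    return n;
}
```

The caller would then hand `iov` and the count straight to `writev(fd, iov, n)`, so the segments never need to be copied into one contiguous buffer.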
Ref-counted strings in Squid-2/Cacheboy
I've just created a branch off of my Cacheboy tree and dumped in the first set of changes relating to ref-counted strings into it. They're not as useful and flexible as the end-goal we all want - specifically, this pass just creates ref counted NUL-terminated C strings, so creating references of regions of other strings / buffers isn't possible. But it does mean that duplicating header sets (ie, httpHeaderClone() I think?) becomes bloody cheap. The next move - removing the requirement for the NUL-termination - is slightly hairier, but still completely doable (and I've done it in a previous branch in sourceforge, so I know whats required.) Thats when the real benefits start to appear. So far, so good. It turns out doing this as an intermediary step worked out better than trying to replace the String code in its entirety with replacement code which doesn't assume NUL terminated strings. http://code.google.com/p/cacheboy/source/list?path=/branches/CACHEBOY_HEAD_strref This, and all the other gunk thats gone into cacheboy over the last few months during the reorganisation and tidyup, still mostly represents where I think Squid core codebase should have gone / should be going at the present time. Enjoy. :) Adrian
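[Editor's note: the idea that makes duplication "bloody cheap" in the branch above — a shared, refcounted, NUL-terminated backing buffer — can be modelled in a few lines. This is a toy, not the actual CACHEBOY_HEAD_strref code; names and layout are invented.]

```c
#include <stdlib.h>
#include <string.h>

/* Toy refcounted string: duplicating it just bumps a counter on a
 * shared NUL-terminated buffer instead of copying the bytes. */
struct rstr {
    int refcount;
    size_t len;
    char buf[1];        /* NUL-terminated payload, allocated inline */
};

static struct rstr *
rstr_create(const char *s)
{
    size_t len = strlen(s);
    struct rstr *r = malloc(sizeof(*r) + len);  /* buf[1] covers the NUL */
    if (!r)
        return NULL;
    r->refcount = 1;
    r->len = len;
    memcpy(r->buf, s, len + 1);
    return r;
}

/* The cheap httpHeaderClone()-style case: no allocation, no copy. */
static struct rstr *
rstr_dup(struct rstr *r)
{
    ++r->refcount;
    return r;
}

static void
rstr_unref(struct rstr *r)
{
    if (--r->refcount == 0)
        free(r);
}
```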
Re: [PATCH] WCCPv2 documentation and cleanup for bug 2404
Have you tested these changes against various WCCPv2 implementations? I do recall some structure definitions in the draft mis-matching the wide number of IOS versions out there, this is why I'm curious. Adrian 2009/1/10 Amos Jeffries : > This patch: > - adds a reference to each struct mentioning the exact draft > RFC section where that struct is defined. > - fixes sent mask structure fields to match draft. (bug 2404) > - removes two duplicate useless structs > > Submitting as a patch to give anyone interested time to double-check the > code changes. > > > As a result we are a step closer toward splitting the code into a separate > library. It's highlighted some of the WCCPv2 issues and a pathway forward > now clear: > - move types definitions to a protocol types header (wccp2_types.h ?) > - correct mangled definitions for generic use. including code in that. > - add capability handling > - add hash/mask service negotiation > - add sibling peer discovery through WCCP group details ?? > > > Amos > -- > Please be using > Current Stable Squid 2.7.STABLE5 or 3.0.STABLE11 > Current Beta Squid 3.1.0.3 >
Re: When can we make Squid using multi-CPU?
2009/1/8 Alex Rousskov : > SMP support has been earmarked for Squid v3.2 but there is currently not > enough resources to make it happen (AFAICT) so it may have to wait until > v3.3 or later. > > FWIW, I think that multi-core scalability in many environments would not > require another Squid rewrite, especially if initial support does not > have to do better than running multiple Squids. Well, people are already doing that where its suitable. Whats really missing for those sorts of setups is a simple(!) storage-only backend and some smarts in Squid to be able to push and pull stuff out of a shared storage backend, rather than relaying through it. The trouble, as I've found here, is if you're trying to aggregate a bunch of forward proxy squid instances on one box through one backend squid instance - all of a sudden you end up with lots of RAM wastage and things die at high loads with all the duplicate data floating around in socket buffers. :/ Adrian
Re: When can we make Squid using multi-CPU?
I've been looking into what would be needed to thread squid as part of my cacheboy squid-2 fork. Basically, I've been working on breaking out a bunch of the core code into libraries, which I can then check and verify are thread-safe. I can then use these bits in threaded code. My first goal was probably to break out the ACL and internal URL rewriter code into threads, but the current use of the callback data setup in Squid makes passing cbdata pointers into other threads quite uhm, "tricky". The basic problem is that although a given chunk of memory backing a cbdata pointer will remain valid for as long as the reference exists, the -data itself- may not be valid at any point. So if thread A creates a cbdata pointer and passes it into thread B to do something (say an ACL lookup), there's no way (at the moment) for thread B to guarantee at any/all points during its execution that the data in B will stay valid without a whole lot of pissing around with locking, which I'd absolutely like to avoid doing in a high performance network application, despite the apparently wonderful performance current hardware has with lots of locking. :) So for the time being, I'm looking at what would be needed for a basic inter-thread "batch" event/callback message queue, sort of like AsyncCalls in squid-3 but minus 100% of the legacy cruft; and then I'll see what kind of tasks can be pushed out to the threads. Hopefully a bunch of stuff can be easily pushed out to threads with a minimum amount of effort, such as some/all of the ACL lookups, some URL rewriting, some GZIP and other kinds of basic content manipulation, and the freakishly simple (comparatively) server-side HTTP code (src/http.c). But doing that requires making sure a bunch of the low level code is suitably re-entrant/thread-safe/etc, and this includes a -lot- of stuff (lib/, debug, logging, memory allocation, some statistics gathering, chunks of the HTTP parsing and packing routines, the packer routines, membufs, etc.)
Thankfully (in Cacheboy) I've broken out almost all of the needed stuff into top-level libraries which can be independently audited for thread-happiness. There's just some loose ends which need tidying up. For example, almost all of the code in libhttp/ in cacheboy (ie, basic http header and header entry stuff, parsing, range request headers, cc, headers, etc) are thread-safe, but the functions -they- call (such as the base64 functions) use static buffers which may or may not be thread-safe. Stuff which calls the legacy non-safe inet_* routines, or perhaps the non thread-safe strtok() and other string.h functions, all need to be fixed. Threading the rest of it would take a lot, -lot- more time. A thread-aware storage backend (disk, memory, store index) is definitely an integral part of making a threaded Squid, and a whole lot more code modularity and reorganisation would have to take place for that to occur. Want to help? :) Adrian 2009/1/4 ShuXin Zheng : > I've ever done this to run multi-squid on one machine which can use multi-CPU, > but can't share the same store-fs, and must configure multi-IP on the same > machine. Can we rewrite squid as follows: > > [flattened ASCII diagram: threads 0..n (one per CPU), each running client side -> access check -> http header parse -> acl filter -> check local cache, all feeding a shared forward/store-fs layer (ufs/aufs/coss) toward webservers and neighbours] > > 2009/1/4 : >> I've found the best way is to run multiple copies of squid on a single >> machine, and use LVS to load balance between the squid processes. >> >> -- Joe >> >> Quoting Adrian Chadd : >>> when someone decides to either help code it up, or donate towards the >>> effort. >>> >>> adrian >>> >>> 2009/1/3 ShuXin Zheng : &g
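[Editor's note: the "basic inter-thread batch event/callback message queue" Adrian describes can be sketched with POSIX threads primitives. This is an invented illustration of the shape of such a queue, with no relation to the actual AsyncCalls implementation.]

```c
#include <pthread.h>
#include <stdlib.h>

/* One queued callback: thread A enqueues it, thread B runs it. */
struct job {
    void (*fn)(void *);
    void *arg;
    struct job *next;
};

struct jobq {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    struct job *head, *tail;
};

static void
jobq_init(struct jobq *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->tail = NULL;
}

static void
jobq_push(struct jobq *q, void (*fn)(void *), void *arg)
{
    struct job *j = malloc(sizeof(*j));
    j->fn = fn;
    j->arg = arg;
    j->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = j;
    else
        q->head = j;
    q->tail = j;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Block until a job is available, then run it on the caller's thread. */
static void
jobq_run_one(struct jobq *q)
{
    pthread_mutex_lock(&q->lock);
    while (!q->head)
        pthread_cond_wait(&q->nonempty, &q->lock);
    struct job *j = q->head;
    q->head = j->next;
    if (!q->head)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    j->fn(j->arg);
    free(j);
}

/* Tiny demo callback for the usage example. */
static void
jobq_demo_incr(void *arg)
{
    ++*(int *)arg;
}
```

The point of the "batch" idea is that only jobq_push() and jobq_run_one() touch the lock, so the worker thread's actual callbacks run lock-free; the cbdata-validity problem Adrian raises is what this design deliberately avoids by handing the job plain owned data rather than a cbdata pointer.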
Re: When can we make Squid using multi-CPU?
when someone decides to either help code it up, or donate towards the effort. adrian 2009/1/3 ShuXin Zheng : > Hi, Squid current can only use one CPU, but multi-CPU hardware > machines are so popular. These are so greatly wastely. How can we use > the multi-CPU? Can we separate some parallel sections which are CPU > wasting to run on different CPU? OMP(http://openmp.org/wp/) gives us > some thinking about using multi-CPU, so can we use these technology in > Squid? > > Thanks > > -- > zsxxsz > >
Re: Introductions
Welcome! 2008/12/30 Regardt van de Vyver : > Hi Dev Team. > > My name is Regardt van de Vyver, a technology enthusiast who tinkers with > squid on a regular basis. I've been involved in development for around 12 > years and am an active participant on numerous open source projects. > > Right now I'm focussed on improving and extending performance metrics for > squid, specifically related to SNMP and the cachemanager. > > I'd like to take a more active role in the coming year from a dev > perspective and feel the 1st step here is to at least get my butt onto the > dev mailing list ;-) > > I look forward to getting involved. > > Regards, > > Regardt van de Vyver > >
Re: Migrating debug code from src/ to src/debug/
Ok, besides the lacking build dependency on src/core and src/debug, I think the first round of changes are finished. That is, the ctx/debug routines and all that they depend on have been shuffled out of src/ and into src/core / src/debug as appropriate. I've pushed the changes to the launchpad URL mentioned previously. I'd like some feedback and some assistance figuring out how/where to convince src/Makefile.am that the two above directories are build prereqs for almost everything. There are a -lot- of build targets in that Makefile under Squid-3 and I'm not sure that I want to add to the mess in a naive way. Thanks, Adrian
src/debug.cc : amos?
Amos, whats this for in src/debug.cc ? //*AYJ:*/if (!Config.onoff.buffered_logs) fflush(debug_log); Adrian
Re: Migrating debug code from src/ to src/debug/
Would someone perhaps enlighten me why Squid-3 is trying to install src/SquidTime.h as part of some build rule, and why moving it out of the way (into src/core/) has resulted in "make install" completely failing? I'm having some real trouble understanding all of the gunk thats in the Squid-3 src/Makefile.am and its starting to give me a headache. Thanks, Adrian
Re: Migrating debug code from src/ to src/debug/
2008/12/18 Adrian Chadd : > I've begun fiddling with migrating the bulk of the debug code out of > src/ and into src/debug/; as per the source reorganisation wiki page. The next step is migrating some other stuff out and doing some API hiding hijinx of the debugging logfile code - a bunch of code directly frobs the debug log fd/filehandle for various nefarious purposes. Grr. The other next thing is to sort out where to put the SquidTime stuff, which is used by the debug code. I'll create "src/core" for now in my branch to put this random stuff; I'll worry about the final destination for it all later. I couldn't tease apart ctx and debug all that much in cacheboy (and I couldn't figure out how it should or may be done as an exercise either) so I'll just lump them together. Adrian
Re: X-Vary-Options support
2008/12/20 Mark Nottingham : > I agree. My impression was that it's pretty specific to their requirements, > not a good general solution. Well, I'm all ears about a slightly more flexible solution. I mean, this is an X-* header; we could simply document it as a Squid specific feature once a few basic concerns have been addressed, and leave nutting out the "right" solution to the IETF group. :) Adrian
Migrating debug code from src/ to src/debug/
I've begun fiddling with migrating the bulk of the debug code out of src/ and into src/debug/; as per the source reorganisation wiki page. The first step is to just relocate the syslog facility code out, which I've done. The next step is to break out the debug code which handles the actual debugging into src/debug/. The changes can be viewed at http://bazaar.launchpad.net/~adrian-squid-cache/squid/adrian_src_reorganise/ . I'll post again when I've finished the debug code shuffle so I can figure out the "right way" to submit the change request. Adrian
Re: [MERGE] Polish for ZPH Patch. Creating single-line config
(As I've forgotten my bundlebuggy login!) This patch actually starts breaking code out into src/ip versus -just- implementing the above modification to the zph code. Which I'm all for, but I think the commit line is misleading. Adrian 2008/12/12 Bundle Buggy : > Bundle Buggy has detected this merge request. > > For details, see: > http://bundlebuggy.aaronbentley.com/project/squid/request/%3C20081212080618.3EDD887DA3%40treenet.co.nz%3E > Project: Squid > >
Re: X-Vary-Options support
2008/12/17 Henrik Nordstrom : > I am a bit uneasy about adding features known to be flawed in design. > Once the header format is added it becomes hard to change. > > Sorry, don't remember the details right now what was flawed. See > archives for earlier discussion. From the archives: * not handling q= values, and * not handling cookie names and specific attribute values, versus just arbitrary bits of the cookie (as I think it does now?) Is that it? I can take a look at the patch and see if I can extend it to support the cookie stuff at least.
X-Vary-Options support
Hi, I've got a small contract to get Squid going in front of a small group of Mediawiki servers and one of the things which needs adding is the X-Vary-Options support. So is there any reason whatsoever that it can't be committed to Squid-2.HEAD as-is, and at least backported (but not committed to start with) to squid-2.7? I remember the Wiki guys' issues wrt Variant purging, which I'm hoping Y! and Benno have sorted out, and I'm not looking to commit anything relating to that now - just the X-Vary-Options support. Thanks, Adrian
Re: Request for new round of SBuf review
Howdy, As most of you aren't aware, Kinkie, alex and I had a bit of a discussion about this on IRC rather than on the mailing list, so there's probably some other stuff which should be posted here. Kinkie, are you able to post some updated code + docs after our discussion? My main suggestion to Kinkie was to take his code and see how well it worked with some test use cases - the easiest and most relevant one being parsing HTTP requests and building HTTP replies. I think that a few test case implementations outside of the Squid codebase will be helpful in both understanding the issues which this sort of class is trying to solve. I would really be against integrating it into Squid mainline until we've all had a chance to play with it without being burdened by the rest of Squid. :) Adrian 2008/12/4 Kinkie <[EMAIL PROTECTED]>: > Hi all, > I feel that SBuf may just be complete enough to be considered a > viable replacement for SquidString, as a first step towards > integration. > I'd appreciate anyone's help in giving it a check to gather feedback > and suggestions. > > Doxygen documentation for the relevant classes is available at > http://eu.squid-cache.org/~kinkie/sbuf-docs/ , the code is at > lp:~kinkie/squid/stringng > (https://code.launchpad.net/~kinkie/squid/stringng). > > Thanks! > > -- >/kinkie > >
Re: The cache deny QUERY change... partial rollback?
2008/12/1 Henrik Nordstrom <[EMAIL PROTECTED]>: > After analyzing a large cache with significantly declining hit ratio > over the last months I have came to the conclusion that the removal of > cache deny QUERY can have a very negative impact on hit ratio, this due > to a number of flash video sites (youtube, google, various porno sites > etc) who include per-view unique query parameters in the URL and > responding with a cachable response. > > Because of this I suggest that we add back the cache deny rule in the > recommended config, but leave the refresh_pattern change as-is. > > People running reverse proxies or combating these cache busting sites > using store rewrites know how to change the cache rules, while many > users running general proxy servers are quite negatively impacted by > these sites if caching of query urls is allowed. Hm, thats kind of interesting actually. Whats it displacing from the cache? Is the drop of hit ratio due to the removal of other cachable large objects, or other cachable small objects? Is it -just- flash video thats exhibiting this behaviour? Are you able to put up some examples and statistics? I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content. Adrian
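[Editor's note: the "cache deny QUERY" rule Henrik proposes restoring is the classic pre-2.7 default configuration pairing, kept here alongside the newer refresh_pattern line he suggests leaving as-is. Exact spelling of these defaults varies between Squid versions.]

```
# Mark dynamic (query) URLs and refuse to cache them, as the old
# default config did:
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY

# The newer refresh_pattern change, left unchanged per the proposal:
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
```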
Re: omit to loop-forever processing some regex acls
G'day! If these are patches against Squid-2 then please put them into the Squid bugzilla so we don't lose them. There's a different process for Squid-3 submissions. Thanks! Adrian 2008/11/26 Matt Benjamin <[EMAIL PROTECTED]>: > -- > > Matt Benjamin > > The Linux Box > 206 South Fifth Ave. Suite 150 > Ann Arbor, MI 48104 > > http://linuxbox.com > > tel. 734-761-4689 > fax. 734-769-8938 > cel. 734-216-5309 >
Re: Rv: Why not BerkeleyDB based object store?
I thought about it a while ago but i'm just out of time to be honest. Writing objects to disk only if they're popular or you need the RAM to handle concurrent accesses for large objects for some reason would probably way way improve disk performance as the amount of writing would drop drastically. Sponsorship for investigating and developing this is gladly accepted :) Adrian 2008/11/26 Mark Nottingham <[EMAIL PROTECTED]>: > Just a tangental thought; has there been any investigation into reducing the > amount of write traffic with the existing stores? > > E.g., establishing a floor for reference count; if it doesn't have n refs, > don't write to disk? This will impact hit rate, of course, but may mitigate > in situations where disk caching is desirable, but writing is the > bottleneck... > > > On 26/11/2008, at 9:14 AM, Kinkie wrote: > >> On Tue, Nov 25, 2008 at 10:23 PM, Pablo Rosatti >> <[EMAIL PROTECTED]> wrote: >>> >>> Amazon uses BerkeleyDB for several critical parts of its website. The >>> Chicago Mercatile Exchange uses BerkeleyDB for backup and recovery of its >>> trading database. And Google uses BerkeleyDB to process Gmail and Google >>> user accounts. Are you sure BerkeleyDB is not a good idea to replace the >>> Squid filesystems even COSS? >> >> Squid3 uses a modular storage backend system, so you're more than >> welcome to try to code it up and see how it compares. >> Generally speaking, the needs of a data cache such as squid are very >> different from those of a general-purpose backend storage. >> Among the other key differences: >> - the data in the cache has little or no value. 
>> it's important to know whether a file was corrupted, but it can >> always be thrown away and fetched from the origin server at a >> relatively low cost >> - workload is mostly writes >> a well-tuned forward proxy will have a hit-rate of roughly 30%, >> which means 3 writes for every read on average >> - data is stored in incremental chunks >> >> Given these characteristics, a long list of mechanisms database-like >> systems have such as journaling, transactions etc. are a waste of >> resources. >> COSS is explicitly designed to handle a workload of this kind. I would >> not trust any valuable data to it, but it's about as fast as it gets >> for a cache. >> >> IMHO BDB might be much more useful as a metadata storage engine, as >> those have a very different access pattern than a general-purpose >> cache store. >> But if I had any time to devote to this, my priority would be in >> bringing 3.HEAD COSS up to speed with the work Adrian has done in 2. >> >> -- >> /kinkie > > -- > Mark Nottingham [EMAIL PROTECTED] > > >
Re: Associating accesses with cache.log entries
I like the idea too. 2008/11/27 Kinkie <[EMAIL PROTECTED]>: > On Thu, Nov 27, 2008 at 4:21 AM, Mark Nottingham <[EMAIL PROTECTED]> wrote: >> I've been playing around with associating specific requests with the debug >> output they generate, with a simple patch to _db_print along these lines: >> >>if (Config.Log.accesslogs && Config.Log.accesslogs->logfile) { >> seqnum = LOGFILE_SEQNO(Config.Log.accesslogs->logfile); >>} >>snprintf(f, BUFSIZ, "%s %i| %s", >>debugLogTime(squid_curtime), >>seqnum, >>format); >> >> This leverages the sequence number that's available in custom access logs >> (%sn). >> >> It's really useful for debugging requests that are causing problems, etc; >> rather than having to correlate times and URLs, you can just correlate >> sequence numbers. It also makes it possible to automate debug output (which >> is the direction I want to take this in). > > Looks interesting to me. > >> beyond the obvious cleanup that needs to happen (e.g., outputting '-' or >> blank instead of 0 if there isn't an access log line associated, a few >> questions; >> >> * How do people feel about putting this in cache.log all the time? I don't >> think it'll break any scripts (there aren't many, and those that are tend to >> grep for specific phrases, rather than do an actual parse, AFAICT). Is the >> placement above appropriate? > > I'd avoid the | character, but apart from that it makes sense to me > >> * The sequence number mechanism doesn't guarantee uniqueness in the log >> file; if squid is started between rotates, it will reset the counters. Has >> fixing this been discussed? > > I don't think that uniqueness has much value, correlating seqnum with > the timestamp will address any uncertain cases. > >> * Is it reasonable to hardcode this to associate the numbers with the first >> configured access_log? >> >> * To make this really useful, it would be necessary to be able to trigger >> debug_options (or just all debugging) based upon an ACL match. 
However, this >> looks like it would require changing how debug is #defined. Any comments on >> this? > > YES! It's something I've been thinking about for some time. > Count me in. > > -- >/kinkie > >
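The idea in Mark's patch can be reduced to a minimal sketch: prefix every cache.log line with the access-log sequence number of the request that produced it, so lines correlate by number instead of by timestamp+URL. `debug_print` and `current_seqno` below are illustrative stand-ins, not Squid's real `_db_print()`/`LOGFILE_SEQNO` machinery.

```c
#include <stdio.h>

/* Stand-in for the sequence number of the current request's
   access-log entry; 0 means "no request associated". */
static long current_seqno = 0;

/* Write "<time> <seqno>| <message>" into out, or "-" for the seqno
   when no access-log line is associated (the cleanup Mark mentions). */
void debug_print(char *out, size_t outsz, const char *timestr, const char *msg)
{
    if (current_seqno > 0)
        snprintf(out, outsz, "%s %ld| %s", timestr, current_seqno, msg);
    else
        snprintf(out, outsz, "%s -| %s", timestr, msg);
}
```

A script can then `grep ' 42| '` in cache.log after finding sequence number 42 in the access log, which is the automation direction Mark describes.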
Re: access_log acl not observing my_port
G'day! Just create a ticket in the Squid Bugzilla and attach the patch there. Thanks for your contribution! Adrian 2008/11/13 Stephen Thorne <[EMAIL PROTECTED]>: > G'day, > > I've been looking into a problem we've observed where this situation > does not work as expected, this is in squid-2.7.STABLE4: > > acl direct myport 8080 > access_log /var/log/squid/direct_proxy.log common direct > > I did some tracing through the code and established that this chain of > events occurs: > httpRequestFree calls clientAclChecklistCreate calls aclChecklistCreate > > But aclChecklistCacheInit is the function that populates > checklist->my_port, which is required for a myport acl to work, and it > isn't called. > > I have attached a patch that fixes this particular problem for me, which > simply calls aclChecklistCacheInit in clientAclChecklistCreate. > > -- > Regards, > Stephen Thorne > Development Engineer > NetBox Blue - 1300 737 060 > > Scanned by the NetBox from NetBox Blue > (http://netboxblue.com/) > >
delayed forwarding is in Squid-2.HEAD
G'day, I've just committed the delayed forwarding stuff into Squid-2.HEAD. Thanks, Adrian
Re: [PATCH] Check half-closed descriptors at most once per second.
2008/9/25 Alex Rousskov <[EMAIL PROTECTED]>: > This revision resurrects the 1 check/sec limit, but hopefully with fewer > bugs. In my limited tests, CPU usage seems to be back to normal. Woo, thanks! > The DescriptorSet class has O(1) complexity for search, insertion, > and deletion. It uses about 2*sizeof(int)*MaxFD bytes. The splay tree that > previously stored half-closed descriptors uses less RAM for a small > number of descriptors but has O(log n) complexity. > > The DescriptorSet code should probably get its own .h and .cc files, > especially if it is going to be used by deferred reads. Could you do that sooner rather than later? I'd like to try using this code for deferred reads and delay pools. Thanks! Adrian
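A set with O(1) search/insert/delete in about 2*sizeof(int)*MaxFD bytes matches the classic dense-array-plus-index ("sparse set") trick. The sketch below shows the technique only; names are illustrative and this is not the DescriptorSet class itself.

```c
#include <stdbool.h>

#define MAX_FD 1024

/* Two int arrays: descriptors_ is a dense list of the current members,
   index_[fd] is fd's slot in descriptors_ (or -1 if absent).
   Total cost: 2 * sizeof(int) * MAX_FD bytes. */
static int descriptors_[MAX_FD];
static int index_[MAX_FD];
static int size_ = 0;

void descriptor_set_init(void)
{
    for (int fd = 0; fd < MAX_FD; ++fd)
        index_[fd] = -1;
    size_ = 0;
}

bool descriptor_set_has(int fd)        /* O(1) search */
{
    return index_[fd] >= 0;
}

bool descriptor_set_add(int fd)        /* O(1) insertion */
{
    if (descriptor_set_has(fd))
        return false;
    descriptors_[size_] = fd;
    index_[fd] = size_++;
    return true;
}

bool descriptor_set_del(int fd)        /* O(1) deletion */
{
    if (!descriptor_set_has(fd))
        return false;
    /* move the last member into the vacated slot to keep the array dense */
    int slot = index_[fd];
    int last = descriptors_[--size_];
    descriptors_[slot] = last;
    index_[last] = slot;
    index_[fd] = -1;
    return true;
}
```

The dense array also makes "walk every member" cheap (iterate `descriptors_[0..size_-1]`), which is exactly what a periodic half-closed check needs.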
Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...
2008/9/24 Martin Langhoff <[EMAIL PROTECTED]>: > Good hint, thanks! If we did have such a control, what is the wired > memory that squid will use for each entry? In an email earlier I > wrote... sizeof(StoreEntry) per index entry, basically. > - Each index entry takes between 56 bytes and 88 bytes, plus > additional, unspecified overhead. Is 1KB per entry a reasonable > conservative estimate? 1 KB per entry is pretty conservative. The per-object overhead includes the StoreEntry, a couple of structures for the memory/disk replacement policies, plus the MD5 URL hash for the index, and whatever other stuff hangs off MemObject for in-memory objects. You'll find that the RAM requirements grow a bit more for things like in-memory cache objects, as the full reply headers stay in memory and are copied whenever anyone requests the object. > - Discussions about compressing or hashing the URL in the index are > recurrent - is the uncompressed URL there? That means up to 4KB per > index entry? The uncompressed URL and headers are in memory during: * request/reply handling * in-memory objects (objects with a MemObject allocated); on-disk entries just have the MD5 URL hash per StoreEntry. HTH. Oh, and I'll be in the US from October for a few months; I can always do a side-trip out to see you guys if there's enough interest. Adrian
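The sizing arithmetic above can be wrapped into a rough helper. Note the 1 KB/entry figure is Martin's conservative guess from the thread, not a measured constant, and the function name is invented for illustration.

```c
#include <stdint.h>

/* Index entries ~= disk cache size / mean object size; index RAM ~=
   entries * bytes-per-entry.  All three inputs are estimates. */
uint64_t index_ram_bytes(uint64_t disk_cache_bytes,
                         uint64_t mean_object_bytes,
                         uint64_t bytes_per_index_entry)
{
    uint64_t entries = disk_cache_bytes / mean_object_bytes;
    return entries * bytes_per_index_entry;
}
```

For example, a 2 GB disk cache with a 13 KB mean object size is roughly 160k entries, so roughly 160 MB of index at the conservative 1 KB/entry guess, or ~16 MB at 100 bytes/entry, which is why the index dominates Martin's memory budget.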
Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...
2008/9/23 Martin Langhoff <[EMAIL PROTECTED]>: > Any way we can kludge our way around it for the time being? Does squid > take any signal that gets it to shed its index? It'd be pretty trivial to write a few cachemgr hooks to implement that kind of behaviour. 'flush memory cache', 'flush disk cache entirely', etc. The trouble is that the index is -required- at the moment for the disk cache. if you flush the index you flush the disk cache entirely. >> There's no "hard limit" for squid and squid (any version) handles >> memory allocation failures very very poorly (read: crashes.) > > Is it relatively sane to run it with a tight rlimit and restart it > often? Or just monitor it and restart it? It probably won't like that very much if you decide to also use disk caching. >> You can limit the amount of cache_mem which limits the memory cache >> size; you could probably modify the squid codebase to start purging >> objects at a certain object count rather than based on the disk+memory >> storage size. That wouldn't be difficult. > > Any chance of having patches that do this? I could probably do that in a week or so once I've finished my upcoming travel. Someone could try beating me to it.. > >> The big problem: you won't get Squid down to 24meg of RAM with the >> current tuning parameters. Well, I couldn't; and I'm playing around > > Hmmm... > >> with Squid on OLPC-like hardware (SBC with 500mhz geode, 256/512mb >> RAM.) Its something which will require quite a bit of development to >> "slim" some of the internals down to scale better with restricted >> memory footprints. Its on my personal TODO list (as it mostly is in >> line with a bunch of performance work I'm slowly working towards) but >> as the bulk of that is happening in my spare time, I do not have a >> fixed timeframe at the moment. > > Thanks for that -- at whatever pace, progress is progress. I'll stay > tuned. 
I'm not on squid-devel, but generally interested in any news on > this track; I'll be thankful if you CC me or rope me into relevant > threads. Ok. > Is there interest within the squid dev team in moving towards a memory > allocation model that is more tunable and/or relies more on the > abilities of modern kernels to do memory mgmt? Or an alternative > approach to handle scalability (both down to small devices and up to > huge kit) more dynamically and predictably? You'll generally find the squid dev team happy to move in whatever directions make sense. The problem isn't direction as so much as the coding to make it happen. Making Squid operate well in small memory footprints turns out to be quite relevant to higher performance and scalability; the problem is in the "doing". I'm hoping to start work on some stuff to reduce the memory footprint in my squid-2 branch (cacheboy) once the current round of IPv6 preparation is completed and stable. The developers working on Squid-3 are talking about similar stuff. Adrian
Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...
G'day, I've looked into this a bit (and have a couple of OLPC laptops to do testing with) and... well, it's going to take a bit of effort to make squid "fit". There's no "hard limit" for squid, and squid (any version) handles memory allocation failures very poorly (read: crashes.) You can limit the amount of cache_mem, which limits the memory cache size; you could probably modify the squid codebase to start purging objects at a certain object count rather than based on the disk+memory storage size. That wouldn't be difficult. The big problem: you won't get Squid down to 24 MB of RAM with the current tuning parameters. Well, I couldn't; and I'm playing around with Squid on OLPC-like hardware (SBC with a 500 MHz Geode, 256/512 MB RAM.) It's something that will require quite a bit of development to "slim" some of the internals down to scale better with restricted memory footprints. It's on my personal TODO list (as it mostly is in line with a bunch of performance work I'm slowly working towards) but as the bulk of that is happening in my spare time, I do not have a fixed timeframe at the moment. Adrian 2008/9/23 Martin Langhoff <[EMAIL PROTECTED]>: > Hi! > > I am working on the School Server (aka XS: a Fedora 9 spin, tailored > to run on fairly limited hw), I'm preparing the configuration settings > for it. It's a somewhat new area for me -- I've set up Squid before on > mid-range hardware... but this is... different. > > So I'm interested in understanding more about the variables affecting > memory footprint and how I can set a _hard limit_ on the wired memory > that squid allocates. > > In brief: > > - The workload is relatively "light" - 3K clients is the upper bound. > > - The XS will (in some locations) be hooked to *very* unreliable > power... uncontrolled shutdowns are the norm. Is this ever a problem with > Squid? > > - After a bad shutdown, graceful recovery is the most important > aspect. If a few cached items are lost, we can cope... 
> > - The XS hardware runs many services (mostly web-based), so Squid gets > only a limited slice of memory. To make matters worse, I *really* > don't want the core working set (Squid, Pg, Apache/PHP) to get paged > out. So I am interested in pegging the max memory Squid will take to itself. > > - The XS hw is varied. In small schools it may have 256MB RAM (likely > to be running on XO hardware + usb-connected ext hard-drive). > Medium-to-large schools will have the recommended 1GB RAM and a cheap > SATA disk. A few very large schools will be graced with more RAM (2 or > 4GB). > > .. so RAM allocation for Squid will prob range between 24MB at the > lower-end and 96MB at the 1GB "recommended" RAM. > > My main question is: how would you tune Squid 3 so that > > - it does not allocate directly more than 24MB / 96MB? (Assume that > the linux kernel will be smart about mmapped stuff, and aggressive > about caching -- I am talking about the memory Squid will claim to > itself). > > - still gives us good throughput? :-) > > > > So far Google has turned up very little info, and it seems to be > rather old. What I've found can be summarised as follows: > > - The index is malloc'd, so the number of entries in the index will > be the dominant concern WRT memory footprint. > > - Each index entry takes between 56 bytes and 88 bytes, plus > additional, unspecified overhead. Is 1KB per entry a reasonable > conservative estimate? > > - Discussions about compressing or hashing the URL in the index are > recurrent - is the uncompressed URL there? That means up to 4KB per > index entry? > > - The index does not seem to be mmappable or otherwise > > We can rely on the (modern) linux kernel doing a fantastic job at > caching disk IO and shedding those cached entries when under memory > pressure, so I am likely to set Squid's own cache to something really > small. 
Everything I read points to the index being my main concern - > is there a way to limit (a) the total memory the index is allowed to > take or (b) the number of index entries allowed? > > Does the above make sense in general? Or am I barking up the wrong tree? > > > cheers, > > > > martin > -- > [EMAIL PROTECTED] > [EMAIL PROTECTED] -- School Server Architect > - ask interesting questions > - don't get distracted with shiny stuff - working code first > - http://wiki.laptop.org/go/User:Martinlanghoff > ___ > Server-devel mailing list > [EMAIL PROTECTED] > http://lists.laptop.org/listinfo/server-devel > >
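For the low-memory end of Martin's range, a starting-point configuration might look like the fragment below. This is an untested sketch: directive availability and defaults vary by Squid version, none of these set a hard cap, and the index size (driven mostly by the `cache_dir` size) still dominates the real footprint.

```
# Illustrative squid.conf fragment for a ~24 MB budget (untested sketch)
cache_mem 8 MB                               # memory cache only, not a hard cap
maximum_object_size_in_memory 32 KB          # keep large objects out of RAM
cache_dir aufs /var/spool/squid 2000 16 256  # small disk cache => small index
cache_swap_low 90                            # start purging before the cache fills
cache_swap_high 95
memory_pools off                             # let freed memory return to the OS
```

Shrinking the `cache_dir` size is the most direct lever here, since every cached object costs index RAM whether or not it is ever hit again.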
Re: Strategy
"only focus" should really have been "our main focus at that short period of time", not "the only thing we care about." Sheesh. :P Adrian 2008/9/22 Alex Rousskov <[EMAIL PROTECTED]>: > On Mon, 2008-09-22 at 10:36 +0800, Adrian Chadd wrote: >> Put this stuff on hold, get Squid-3.1 out of the way, sort out the >> issues surrounding that before you start throwing more code into >> Squid-3 trunk, and -then- have this discussion. > > If "this stuff" is WordList, then "put this stuff on hold" is my > suggestion as well. > > If "this stuff" is String, then I think the basic design choices can be > discussed now, but waiting is even better for me, so I am happy to > follow your suggestion :-). > > If "this stuff" is how we improve "teamwork", then I am happy to > continue any _constructive_ discussions since releasing 3.1 can benefit > from teamwork as well. > >> We can sort this stuff out in a short period of time if it's our only focus. > > The only focus? You must be dreaming :-). > > Alex. > > >> 2008/9/22 Amos Jeffries <[EMAIL PROTECTED]>: >> >> On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote: >> >>> Alex Rousskov wrote: >> >>> >> >>> > * Look for simpler warts with localized impact. We have plenty of them >> >>> > and your energy would be well spent there. If you have a choice, do >> >>> not >> >>> > try to improve something as fundamental and as critical as String. >> >>> > Localized single-use code should receive a lot less scrutiny than >> >>> > fundamental classes. >> >>> > >> >>> >> >>> Agreed, but that said: if you, Kinkie, pick one of the hard ones, it causes >> >>> a thorough discussion, as String has, and comes up with a good API. That's >> >>> not just a step in the right direction but a giant leap. And worth doing >> >>> if you can spare the time (months in some cases). >> >>> The follow-on effects will be better and easier code in other areas >> >>> depending on it. 
>> >> >> >> Amos, >> >> >> >> I think the above work-long-enough-and-you-will-make-it analysis and >> >> a few other related comments do not account for one important factor: >> >> cost (and the limited resources this project has). Please compare the >> >> following estimates (all numbers are very approximate, of course): >> >> >> >> Kinkie's time to draft a String class: 2 weeks >> >> Kinkie's time to fix the String class: 6 weeks >> >> Reviewers' time to find bugs and >> >> convince Kinkie that they are bugs: 2 weeks >> >> Total: 10 weeks >> >> >> >> Reviewer's time to write a String class: 3 weeks >> >> Total: 3 weeks >> >> >> > >> > Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the >> > reviewers gain 1 week themselves. So I stand by, if he feels strongly >> > enough to do it. >> > >> >> If you add to the above that one reviewer cannot review and work on >> >> something else at the same time, the waste goes well above 200%. >> > >> > Which is wrong. We can review one thing and work on another project. >> > >> >> >> >> Compare the above with a regular project that does not require writing >> >> complex or fundamental classes (again, numbers are approximate): >> >> >> >> Kinkie's time to complete a regular project: 1 week >> >> Reviewer's time to complete a regular project: 1 week >> > >> > After which both face the hard project again. Which remains hard and could >> > have cut off 5 days of the regular project. >> > >> >> >> >> If we want Squid code to continue to be a playground for half-finished >> >> code and ideas, then we should abandon the review process. Let's just >> >> commit everything that compiles and that the committer is happy with. >> > >> > I assume you are being sarcastic. >> > >> >> Otherwise, let's do our best to find a project for everyone, without >> >> sacrificing the quality of the output or wasting resources. For example, >> >> if a person wants String to implement his pet project, but cannot make a
Re: Strategy
And in the meantime, if someone (e.g. Kinkie) wants to work on this stuff some more, I suggest sitting down and writing some of the support code which would use it. Write an HTTP parser, an HTTP response builder, do some benchmarking, perhaps glue it to something like libevent or some other comm framework and do some benchmarking there. See how it performs, how it behaves, see if it does everything y'all want cleanly. _Then_ have this discussion. Adrian 2008/9/22 Adrian Chadd <[EMAIL PROTECTED]>: > Put this stuff on hold, get Squid-3.1 out of the way, sort out the > issues surrounding that before you start throwing more code into > Squid-3 trunk, and -then- have this discussion. > > We can sort this stuff out in a short period of time if it's our only focus. > > > > Adrian > > 2008/9/22 Amos Jeffries <[EMAIL PROTECTED]>: >>> On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote: >>>> Alex Rousskov wrote: >>>> >>>> > * Look for simpler warts with localized impact. We have plenty of them >>>> > and your energy would be well spent there. If you have a choice, do >>>> not >>>> > try to improve something as fundamental and as critical as String. >>>> > Localized single-use code should receive a lot less scrutiny than >>>> > fundamental classes. >>>> > >>>> >>>> Agreed, but that said: if you, Kinkie, pick one of the hard ones, it causes >>>> a thorough discussion, as String has, and comes up with a good API. That's >>>> not just a step in the right direction but a giant leap. And worth doing >>>> if you can spare the time (months in some cases). >>>> The follow-on effects will be better and easier code in other areas >>>> depending on it. >>> >>> Amos, >>> >>> I think the above work-long-enough-and-you-will-make-it analysis and >>> a few other related comments do not account for one important factor: >>> cost (and the limited resources this project has). 
Please compare the >>> following estimates (all numbers are very approximate, of course): >>> >>> Kinkie's time to draft a String class: 2 weeks >>> Kinkie's time to fix the String class: 6 weeks >>> Reviewers' time to find bugs and >>> convince Kinkie that they are bugs: 2 weeks >>> Total: 10 weeks >>> >>> Reviewer's time to write a String class: 3 weeks >>> Total: 3 weeks >>> >> >> Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the >> reviewers gain 1 week themselves. So I stand by, if he feels strongly >> enough to do it. >> >>> If you add to the above that one reviewer cannot review and work on >>> something else at the same time, the waste goes well above 200%. >> >> Which is wrong. We can review one thing and work on another project. >> >>> >>> Compare the above with a regular project that does not require writing >>> complex or fundamental classes (again, numbers are approximate): >>> >>> Kinkie's time to complete a regular project: 1 week >>> Reviewer's time to complete a regular project: 1 week >> >> After which both face the hard project again. Which remains hard and could >> have cut off 5 days of the regular project. >> >>> >>> If we want Squid code to continue to be a playground for half-finished >>> code and ideas, then we should abandon the review process. Let's just >>> commit everything that compiles and that the committer is happy with. >> >> I assume you are being sarcastic. >> >>> Otherwise, let's do our best to find a project for everyone, without >>> sacrificing the quality of the output or wasting resources. For example, >>> if a person wants String to implement his pet project, but cannot make a >>> good String, it may be possible to trade String implementation for a few >>> other pet projects that the person can do. >> >> Then that trade needs to be discussed with the person before they start. >> I get the idea you are trying to manage this FOSS like you would a company >> project. 
That approach has been tried and failed miserably in FOSS. >> >>> This will not be smooth and >>> easy, but it is often doable because most of us share the goal of making >>> the best open source proxy. >>> >>>> > * When assessing the impact of your changes, do not just compare the
Re: Strategy
Put this stuff on hold, get Squid-3.1 out of the way, sort out the issues surrounding that before you start throwing more code into Squid-3 trunk, and -then- have this discussion. We can sort this stuff out in a short period of time if it's our only focus. Adrian 2008/9/22 Amos Jeffries <[EMAIL PROTECTED]>: >> On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote: >>> Alex Rousskov wrote: >>> >>> > * Look for simpler warts with localized impact. We have plenty of them >>> > and your energy would be well spent there. If you have a choice, do >>> not >>> > try to improve something as fundamental and as critical as String. >>> > Localized single-use code should receive a lot less scrutiny than >>> > fundamental classes. >>> > >>> >>> Agreed, but that said: if you, Kinkie, pick one of the hard ones, it causes >>> a thorough discussion, as String has, and comes up with a good API. That's >>> not just a step in the right direction but a giant leap. And worth doing >>> if you can spare the time (months in some cases). >>> The follow-on effects will be better and easier code in other areas >>> depending on it. >> >> Amos, >> >> I think the above work-long-enough-and-you-will-make-it analysis and >> a few other related comments do not account for one important factor: >> cost (and the limited resources this project has). Please compare the >> following estimates (all numbers are very approximate, of course): >> >> Kinkie's time to draft a String class: 2 weeks >> Kinkie's time to fix the String class: 6 weeks >> Reviewers' time to find bugs and >> convince Kinkie that they are bugs: 2 weeks >> Total: 10 weeks >> >> Reviewer's time to write a String class: 3 weeks >> Total: 3 weeks >> > > Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the > reviewers gain 1 week themselves. So I stand by, if he feels strongly > enough to do it. 
> >> If you add to the above that one reviewer cannot review and work on >> something else at the same time, the waste goes well above 200%. > > Which is wrong. We can review one thing and work on another project. > >> >> Compare the above with a regular project that does not require writing >> complex or fundamental classes (again, numbers are approximate): >> >> Kinkie's time to complete a regular project: 1 week >> Reviewer's time to complete a regular project: 1 week > > After which both face the hard project again. Which remains hard and could > have cut off 5 days of the regular project. > >> >> If we want Squid code to continue to be a playground for half-finished >> code and ideas, then we should abandon the review process. Let's just >> commit everything that compiles and that the committer is happy with. > > I assume you are being sarcastic. > >> Otherwise, let's do our best to find a project for everyone, without >> sacrificing the quality of the output or wasting resources. For example, >> if a person wants String to implement his pet project, but cannot make a >> good String, it may be possible to trade String implementation for a few >> other pet projects that the person can do. > > Then that trade needs to be discussed with the person before they start. > I get the idea you are trying to manage this FOSS like you would a company > project. That approach has been tried and failed miserably in FOSS. > >> This will not be smooth and >> easy, but it is often doable because most of us share the goal of making >> the best open source proxy. >> >>> > * When assessing the impact of your changes, do not just compare the >>> old >>> > code with the one submitted for review. Consider how your classes >>> stand >>> > on their own and how they _will_ be used. Providing a poor but >>> > easier-to-abuse interface is often a bad idea even if that interface >>> is, >>> > in some aspects, better than the old hard-to-use one. 
>>> > >>> >> No one else is tackling the issues that I'm working on. Should they be >>> >> left alone? Or should I aim for the "perfect" solution each time? >>> >>> Perfect varies, and will change as the baseline 'worst' code in Squid >>> improves. The perfect API this year may need changing later. Aim for the >>> best you can find to do, and see if it's good enough for inclusion. >> >> Right. The problems come when it is not good enough, and you cannot fix >> it on your own. I do not know how to avoid these ugly situations. > > Teamwork. Which I thought we were starting to get in the String API after > earlier attempts at solo by whoever wrote SquidString and myself on the > BetterString mk1, mk2, mk3. > > I doubt any of us could do a good job of something so deep without help. > Even you needed Henrik to review and find issues with AsyncCalls, maybe > others I don't know about before that. > > The fact remains these things NEED someone to kick us into a team and work > on it. > >> >>> for example, Alex had no issues with wordlist when it first came out. >> >> This
Re: [MERGE] Connection pinning patch
2008/9/22 Alex Rousskov <[EMAIL PROTECTED]>: > > It would help if there was a document describing what connection pinning > is and what the known pitfalls are. Do we have such a document? Is RFC > 4559 enough? I'll take another read. I think we should look at documenting these sorts of features somewhere else though. > If not, Christos, can you write one and have Adrian and others > contribute pitfalls? It does not have to be long -- just a few > paragraphs describing the basics of the feature. We can add that > description to code documentation too. I'd be happy to help trawl through the 2.X code and see what it's doing. Henrik and Steven know the code better than I do; I've just spent some time figuring out how it interplays with load balancing to peers and such. > ICAP and eCAP do not care about HTTP connections or custom headers. Is > connection pinning more than connection management via some custom > headers? Nope; it just changes the semantics a little and some code may assume things work a certain way. > Since NTLM authentication forwarding appears to be a required feature for > many and since the connection pinning patch is not trivial (but is not huge > either), I would rather see it added now (after the proper review > process, of course). It could be the right icing on the 3.1 cake for many > users. I do realize that, like any 900-line patch, it may cause problems > even if it is reviewed and tested. *nodnod* I'm just making sure the reasons for pushing it through are recorded somewhere during the process. Adrian
Re: [MERGE] Connection pinning patch
It's a 900-odd line patch; granted, a lot of it is boilerplate for config parsing and management, but I recall the issues connection pinning had when it was introduced and I'd hate to be the one debugging whatever crazy stuff pops up in 3.1 combined with the changes to the workflow connection pinning introduces. I don't pretend to completely understand the implications for ICAP either. Is there any documentation for how connection pinning should behave with ICAP and friends? Is there any particular rush to get this in for this release at such a late point in the release cycle? Could we hold off on it until the next release, and just focus on getting what's currently in 3.HEAD released and stable? Adrian 2008/9/21 Tsantilas Christos <[EMAIL PROTECTED]>: > Hi all, > > This patch fixes bug 1632 > (http://www.squid-cache.org/bugs/show_bug.cgi?id=1632) > It is based on the original squid2.5 connection pinning patch developed by > Henrik (http://devel.squid-cache.org/projects.html#pinning) and the related > squid 2.6 connection pinning code. > > Although I spent many hours looking at pinned connections, I am still not > absolutely sure it does not have bugs. However, the code is very similar > to that in squid2.6 (where the pinning code has run for years) and I hope > it will be easy to fix problems and bugs. > > Regards, >Christos >
Re: SBuf review
2008/9/19 Amos Jeffries <[EMAIL PROTECTED]>: > I kind of fuzzily disagree, the point of this is to replace MemBuf + String > with SBuf. Not implement both again independently duplicating stuff. I'll say it again - ignore MemBuf. Ignore MemBuf for now. Leave it as a NUL-terminated dynamic buffer with some printf-append-like semantics. When you've implemented a non-NUL-terminated ref-counted memory region and layered some basic string semantics on top of it, you can slowly convert or eliminate the bulk of the MemBuf users. You're going to find plenty of places where the string handling is plain old horrible. Don't try to cater for those situations with things like "NULL strings". I tried that; it's ugly. Aim to implement something that'll cater to something narrow to begin with - like parsing HTTP headers - and look to -rewrite- larger parts of the code later on. Don't try to invent things which will somehow seamlessly fit into the existing code and provide the same semantics. Some of said semantics is plain shit. I still don't get why this is again becoming so freakishly complicated. Adrian
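One way to read the suggestion above (a ref-counted memory region with cheap string views layered on top) is sketched below. All names are illustrative; this is not the SBuf/MemBlob design that any particular Squid branch shipped, just the shape of the idea.

```c
#include <stdlib.h>
#include <string.h>

/* A shared, non-NUL-terminated memory region with a reference count. */
struct region {
    char *mem;
    size_t len;
    int refs;
};

/* A string view: offset+length into a shared region.  Creating views
   and substrings never copies the bytes. */
struct sbuf {
    struct region *r;
    size_t off, len;
};

struct region *region_new(const char *data, size_t len)
{
    struct region *r = malloc(sizeof *r);
    r->mem = malloc(len);
    memcpy(r->mem, data, len);
    r->len = len;
    r->refs = 1;
    return r;
}

struct sbuf sbuf_of(struct region *r)
{
    struct sbuf s = { r, 0, r->len };
    r->refs++;
    return s;
}

/* O(1) substring: just another view on the same region -- this is what
   makes zero-copy header parsing cheap. */
struct sbuf sbuf_sub(struct sbuf s, size_t off, size_t len)
{
    struct sbuf t = { s.r, s.off + off, len };
    t.r->refs++;
    return t;
}

void sbuf_release(struct sbuf s)
{
    if (--s.r->refs == 0) {
        free(s.r->mem);
        free(s.r);
    }
}
```

Parsing an HTTP request line then becomes slicing views out of one read buffer; the underlying region lives until the last view is released.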
Re: [MERGE] WCCPv2 Config Cleanup
2008/9/13 Amos Jeffries <[EMAIL PROTECTED]>: > This one was easy and isolated, so I went and did it early. > It's back-compatible, so people don't have to use the new names if they > like. But it's clearer for the newbies until the big cleanup you mention > below is stable. Well, the newbies still need to know about the different kinds of redirection/assignment methods; what would be nice is if it were mostly autonegotiated per-host per-service-group, and if wccp2d could set up and tear down the GRE tunnels as required. >> The WCCPv2 stuff works fine (for what it does); it could do with some >> better documentation but what it really needs is to be broken out from >> Squid itself and run as a separate daemon. >> > > I've been waiting most of a year for your work in that direction in Squid-2 > to be ported over. There does not appear to be any sign of it happening in > time for 3.1. > The rest of us are largely concentrating on cleaning other components. I still haven't done all that much with the WCCPv2 stuff yet. I'll be breaking out the source code in Cacheboy after I finish the next set of IPv6 changes; the wccp2d code will then use the config registry stuff we've discussed and reuse the core code for comms, debugging, logging, etc. Adrian
Re: squid-2.HEAD:
Have you dumped this into Bugzilla? Thanks! 2008/9/3 Alexander V. Lukyanov <[EMAIL PROTECTED]>: > Hello! > > I have noticed lots of 'impossible keep-alive' messages in the log. > It appears that httpReplyBodySize incorrectly returns -1 for "304 Not > Modified" replies. A patch to fix it is attached. > > -- > Alexander. >
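For context, the rule the fix relies on: RFC 2616 (section 4.3) defines 304 replies, like 204 and 1xx, as carrying no message body, so their body size is known to be 0 rather than "unknown, read until connection close" (-1), and keep-alive stays possible. A minimal sketch of that rule follows; it is illustrative only, not Squid's httpReplyBodySize() (which also considers the request method, e.g. HEAD).

```c
/* Returns the known reply body size, or -1 when the body length is
   unknown and must be read until close (which defeats keep-alive). */
long reply_body_size(int status, long content_length)
{
    /* 1xx, 204 and 304 replies never carry a body (RFC 2616 4.3) */
    if ((status >= 100 && status < 200) || status == 204 || status == 304)
        return 0;
    return content_length;  /* may legitimately be -1 for other statuses */
}
```

Returning -1 for a 304, as in the bug, makes the reply look unbounded, which is exactly the 'impossible keep-alive' symptom Alexander reports.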
Re: squid-2.HEAD: fwdComplete/Fail before comm_close
Hiya, Could you please verify this is still a problem in the latest 2.HEAD and if so, lodge a bugzilla bug report with the patch? Thanks! Adrian 2008/8/5 Alexander V. Lukyanov <[EMAIL PROTECTED]>: > Hello! > > Some time ago I had core dumps just after these messages: >Short response from ... >httpReadReply: Excess data from ... > > I believe this patch fixes these problems. > > Index: http.c > === > RCS file: /squid/squid/src/http.c,v > retrieving revision 1.446 > diff -u -p -r1.446 http.c > --- http.c 25 Jun 2008 22:11:20 - 1.446 > +++ http.c 5 Aug 2008 06:05:29 - > @@ -755,6 +757,7 @@ httpAppendBody(HttpStateData * httpState > /* Is it a incomplete reply? */ > if (httpState->chunk_size > 0) { >debug(11, 2) ("Short response from '%s' on port %d. Expecting %" > PRINTF_OFF_T " octets more\n", storeUrl(entry), comm_local_port(fd), > httpState->chunk_size); > + fwdFail(httpState->fwd, errorCon(ERR_INVALID_RESP, HTTP_BAD_GATEWAY, > httpState->fwd->request)); >comm_close(fd); >return; > } > @@ -774,6 +777,7 @@ httpAppendBody(HttpStateData * httpState >("httpReadReply: Excess data from \"%s %s\"\n", >RequestMethods[orig_request->method].str, >storeUrl(entry)); > + fwdComplete(httpState->fwd); >comm_close(fd); >return; > } > >
Re: squid-2.HEAD: storeCleanup and -F option (foreground rebuild)
I've committed a slightly modified version of this - store_rebuild.c r1.80 . Take a look and see if it works for you. Thanks! Adrian 2008/8/5 Alexander V. Lukyanov <[EMAIL PROTECTED]>: > Hello! > > I use squid in transparent mode, so I don't want degraded performance > while rebuilding and cleanup. Here is a patch I use to make storeCleanup > do all the work at once before squid starts processing requests, when > -F option is specified on command line. > > Index: store_rebuild.c > === > RCS file: /squid/squid/src/store_rebuild.c,v > retrieving revision 1.80 > diff -u -p -r1.80 store_rebuild.c > --- store_rebuild.c 1 Sep 2007 23:09:32 - 1.80 > +++ store_rebuild.c 5 Aug 2008 05:51:43 - > @@ -68,7 +68,8 @@ storeCleanup(void *datanotused) > hash_link *link_ptr = NULL; > hash_link *link_next = NULL; > validnum_start = validnum; > -while (validnum - validnum_start < 500) { > +int limit = opt_foreground_rebuild ? 1 << 30 : 500; > +while (validnum - validnum_start < limit) { >if (++bucketnum >= store_hash_buckets) { >debug(20, 1) (" Completed Validation Procedure\n"); >debug(20, 1) (" Validated %d Entries\n", validnum); > @@ -147,8 +148,8 @@ storeRebuildComplete(struct _store_rebui > debug(20, 1) (" Took %3.1f seconds (%6.1f objects/sec).\n", dt, >(double) counts.objcount / (dt > 0.0 ? dt : 1.0)); > debug(20, 1) ("Beginning Validation Procedure\n"); > -eventAdd("storeCleanup", storeCleanup, NULL, 0.0, 1); > safe_free(RebuildProgress); > +storeCleanup(0); > } > > /* > >
Re: [MERGE] WCCPv2 Config Cleanup
Amos, why are you pushing through changes to the WCCP configuration stuff at this point in the game? The WCCPv2 stuff works fine (for what it does); it could do with some better documentation but what it really needs is to be broken out from Squid itself and run as a separate daemon. Adrian 2008/9/13 Henrik Nordstrom <[EMAIL PROTECTED]>: > With the patch the code uses WCCP2_METHOD_.. in some places (config > parsing/dumping) and the context specific ones in other places. This is > even more confusing. > > Very minor detail in any case. > > > On lör, 2008-09-13 at 09:49 +0800, Adrian Chadd wrote: >> The specification defines them as separate entities and using them in >> this fashion makes it clearer for people working on the code. >> >> >> >> Adrian >> >> 2008/9/13 Henrik Nordstrom <[EMAIL PROTECTED]>: >> > On fre, 2008-09-12 at 20:39 +1200, Amos Jeffries wrote: >> > >> >> +#define WCCP2_FORWARDING_METHOD_GRE WCCP2_METHOD_GRE >> >> +#define WCCP2_FORWARDING_METHOD_L2 WCCP2_METHOD_L2 >> > >> >> +#define WCCP2_PACKET_RETURN_METHOD_GRE WCCP2_METHOD_GRE >> >> +#define WCCP2_PACKET_RETURN_METHOD_L2 WCCP2_METHOD_L2 >> > >> > Do we still need these? Why not use WCCP2_METHOD_ everywhere if they are >> > the same value? >> > >> > Regards >> > Henrik >> > >> > > >
Re: [MERGE] WCCPv2 Config Cleanup
The specification defines them as separate entities and using them in this fashion makes it clearer for people working on the code. Adrian 2008/9/13 Henrik Nordstrom <[EMAIL PROTECTED]>: > On fre, 2008-09-12 at 20:39 +1200, Amos Jeffries wrote: > >> +#define WCCP2_FORWARDING_METHOD_GRE WCCP2_METHOD_GRE >> +#define WCCP2_FORWARDING_METHOD_L2 WCCP2_METHOD_L2 > >> +#define WCCP2_PACKET_RETURN_METHOD_GRE WCCP2_METHOD_GRE >> +#define WCCP2_PACKET_RETURN_METHOD_L2WCCP2_METHOD_L2 > > Do we still need these? Why not use WCCP2_METHOD_ everywhere if ther are > the same value? > > Regards > Henrik > >
Australian Development Meetup 2008 - Notes
G'day, I've started publishing the notes from the presentations and developer discussions that we held at the Yahoo!7 offices last month. You can find them at http://www.squid-cache.org/Conferences/AustraliaMeeting2008/ . I'm going to try and make sure any further mini-conferences/discussions/etc which happen go up there so people get more of an idea of what's going on. Who knows, eventually there may be enough interest to hold a reasonably formal Squid conference somewhere.. :) Adrian
Re: Where to document APIs?
2008/9/11 Alex Rousskov <[EMAIL PROTECTED]>: >> To clarify: >> >> Longer API documents, .dox file in docs/, or maybe src/ next to the .cc >> >> Basic rules the code needs to fulfill, or until the API documentation >> grows large, in the .h or .cc file. > > You all have seen the current API notes for Comm and AsyncCalls. Do you > think they should go into a .dox or .h file? > > I think they are big enough (and growing) to justify a .dox file. I will > probably add those files to trunk (next to the corresponding .h files) > unless there are better ideas. What's wrong with inline documentation again? Adrian
Re: Comm API notes
2008/9/11 Alex Rousskov <[EMAIL PROTECTED]>: > Here is a replacement text: > > The comm_close API will be used exclusively for "stop future I/O, > schedule a close callback call, and cancel all other callbacks" > purposes. New user code should not use comm_close for the purpose of > immediately ending a job via a close handler call. Yup. (As part of another email) I'd also make it completely clear that the underlying socket and IO may not be immediately closed via a comm_close() until pending scheduled IO events occur; and that the callers should be prepared for the situation where the underlying buffer(s) and other resources must stay immutable until the completion of the kernel-side stuff. This is partially why I wanted explicit notification, cancellation or not, so the owners of things like buffers would know when they were able to modify/reuse them again - or the "immutable" semantics must be enforced some other way. Adrian
Re: Comm API notes
2008/9/11 Alex Rousskov <[EMAIL PROTECTED]>: > * I/O cancellation. > > To cancel an interest in a read operation, call comm_read_cancel() > with an AsyncCall object. This call guarantees that the passed Call > will be canceled (see the AsyncCall API for call cancellation > definitions and details). Naturally, the code has to store the > original read callback Call pointer to use this interface. This call > does not guarantee that the read operation has not already happened. > This call guarantees that the read operation will not happen. As I said earlier, you can't guarantee that with asynchronous IO. The call may be in progress and not completed. I'm assuming you'd count "in progress" as "has already happened" but unlike the latter, you can't cancel it at the OS level. As long as the API keeps all the relevant OS-related structures in place to allow the IO to complete, and callers to the cancellation function are prepared to handle the case where the IO is happening versus has already happened, then I'm happy. > You cannot reliably cancel an interest in a read operation using the old > comm_read_cancel call that uses a function pointer. The handler may > even get called after old comm_read_cancel was called. This old API > will be removed. I really did think I had fixed removing the pending callbacks from the callback queue when I implemented this. (I.e., I thought I implemented enough for the POSIX read/write API but not enough for overlapped/POSIX IO.) What were people seeing pre-AsyncCalls? > It is OK to call comm_read_cancel (both old and new) at any time as > long as the descriptor has not been closed and there is either no read > interest registered or the passed parameters match the registered > ones. If the descriptor has been closed, the behavior is undefined. > Otherwise, if parameters do not match, you get an assertion. > > To cancel other operations, close the descriptor with comm_close. 
I'm still not happy with comm_close() being used in that way; it seems you aren't either and are stipulating that new user code aborts jobs via alternative paths. I'm also not happy with the idea of using close handlers to unwind state associated with it; how "deep" do close handlers actually get? Would we be better off in the long run by stipulating a more rigid shutdown process (e.g. shutting down a client-side fd would not involve comm_close(fd), but ConnStateData::close(), which would handle clearing the clientHttpRequests and such, then itself + fd?) > Raw socket descriptors may be replaced with unique IDs or small > objects that help detect stale descriptor/socket usage bugs and > encapsulate access to socket-specific information. New user code > should treat descriptor integers as opaque objects. I do agree with this. As Henrik said, this makes Windows porting a bit easier. There are still other problems to tackle to properly abuse overlapped IO in any sensible fashion, mostly surrounding IO scheduling and callback scheduling.. adrian
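The cancellation semantics being negotiated in this thread - "the callback will not run" versus "the I/O will not happen" - can be sketched with a guard flag. This is an illustrative model only, not the real AsyncCall API: cancelling marks the queued call, the dispatcher skips marked calls, and the kernel-side read may still complete, which is why the buffer handed to the OS must stay untouched until the operation drains:

```c
#include <assert.h>
#include <stddef.h>

typedef void (*read_handler)(void *data, int nread);

/* A queued callback for a pending read (hypothetical, simplified). */
typedef struct {
    read_handler handler;
    void *data;
    int nread;      /* result to deliver */
    int canceled;   /* set by call_cancel(); checked at dispatch time */
} async_call;

/* Cancellation only suppresses the callback; any in-flight OS-level
 * read proceeds regardless. */
static void call_cancel(async_call *c)
{
    c->canceled = 1;
}

/* Dispatcher: returns 1 if the handler fired, 0 if it was suppressed. */
static int call_dispatch(async_call *c)
{
    if (c->canceled)
        return 0;
    c->handler(c->data, c->nread);
    return 1;
}

/* Tiny demonstration handler: records the byte count it was given. */
static int demo_fired;
static void demo_handler(void *data, int nread)
{
    (void)data;
    demo_fired = nread;
}
```

The guarantee this buys is exactly the weaker one Adrian is asking callers to accept: after `call_cancel()`, your handler will never run, but you cannot conclude the read itself did not happen.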
Re: [MERGE] Config cleanups
You have the WCCPv2 stuff around the wrong way. The redirection has nothing to do with the assignment method. You can and do have L2 redirection with hash assignment. You probably won't have GRE redirection with mask assignment though, but I think it's entirely possible. Keep the options separate, and named whatever they are in the wccp2 draft. I'd also suggest committing each chunk that's "different" separately - i.e., the wccp stuff separate, the ACL tidyup separate, the default storage stuff separate, etc. That makes backing out patches easier if needed. 2c, Adrian 2008/9/10 Amos Jeffries <[EMAIL PROTECTED]>: > This update removes several magic number options in the WCCPv2 > configuration. Replacing them with user-friendly text options. > > This should help with a lot of config confusion where these are needed until > they are obsoleted properly. > > # Bazaar merge directive format 2 (Bazaar 0.90) > # revision_id: [EMAIL PROTECTED] > # target_branch: file:///src/squid/bzr/trunk/ > # testament_sha1: 7b319238106ae2926697f85b2ec58c3476abc121 > # timestamp: 2008-09-11 03:50:49 +1200 > # base_revision_id: [EMAIL PROTECTED] > # q5rnfdpug13p94fl > # > # Begin patch > === modified file 'src/cf.data.depend' > --- src/cf.data.depend 2008-04-03 05:31:29 + > +++ src/cf.data.depend 2008-09-10 15:22:08 + > @@ -47,6 +47,7 @@ > tristate > uri_whitespace > ushort > +wccp2_method > wccp2_service > wccp2_service_info > wordlist > > === modified file 'src/cf.data.pre' > --- src/cf.data.pre 2008-08-09 06:24:33 + > +++ src/cf.data.pre 2008-09-10 15:47:36 + > @@ -831,8 +831,8 @@ > > NOCOMMENT_START > #Allow ICP queries from local networks only > -icp_access allow localnet > -icp_access deny all > +#icp_access allow localnet > +#icp_access deny all > NOCOMMENT_END > DOC_END > > @@ -856,8 +856,8 @@ > > NOCOMMENT_START > #Allow HTCP queries from local networks only > -htcp_access allow localnet > -htcp_access deny all > +#htcp_access allow localnet > +#htcp_access deny all > NOCOMMENT_END > 
DOC_END > > @@ -883,7 +883,7 @@ > NAME: miss_access > TYPE: acl_access > LOC: Config.accessList.miss > -DEFAULT: none > +DEFAULT: allow all > DOC_START >Use to force your neighbors to use you as a sibling instead of >a parent. For example: > @@ -897,11 +897,6 @@ > >By default, allow all clients who passed the http_access rules >to fetch MISSES from us. > - > -NOCOMMENT_START > -#Default setting: > -# miss_access allow all > -NOCOMMENT_END > DOC_END > > NAME: ident_lookup_access > @@ -1555,9 +1550,7 @@ > > icp-port: Used for querying neighbor caches about > objects. To have a non-ICP neighbor > -specify '7' for the ICP port and make sure the > -neighbor machine has the UDP echo port > -enabled in its /etc/inetd.conf file. > +specify '0' for the ICP port. >NOTE: Also requires icp_port option enabled to send/receive > requests via this method. > > @@ -1955,7 +1948,7 @@ > NAME: maximum_object_size_in_memory > COMMENT: (bytes) > TYPE: b_size_t > -DEFAULT: 8 KB > +DEFAULT: 512 KB > LOC: Config.Store.maxInMemObjSize > DOC_START >Objects greater than this size will not be attempted to kept in > @@ -2124,7 +2117,7 @@ >which can be changed with the --with-coss-membuf-size=N configure >option. > NOCOMMENT_START > -cache_dir ufs @DEFAULT_SWAP_DIR@ 100 16 256 > +# cache_dir ufs @DEFAULT_SWAP_DIR@ 100 16 256 > NOCOMMENT_END > DOC_END > > @@ -2291,7 +2284,7 @@ > NAME: access_log cache_access_log > TYPE: access_log > LOC: Config.Log.accesslogs > -DEFAULT: none > +DEFAULT: @DEFAULT_ACCESS_LOG@ squid > DOC_START >These files log client request activities. Has a line every HTTP or >ICP request. The format is: > @@ -2314,9 +2307,9 @@ > >And priority could be any of: >err, warning, notice, info, debug. 
> -NOCOMMENT_START > -access_log @DEFAULT_ACCESS_LOG@ squid > -NOCOMMENT_END > + > + Default: > + access_log @DEFAULT_ACCESS_LOG@ squid > DOC_END > > NAME: log_access > @@ -2342,14 +2335,17 @@ > > NAME: cache_store_log > TYPE: string > -DEFAULT: @DEFAULT_STORE_LOG@ > +DEFAULT: none > LOC: Config.Log.store > DOC_START >Logs the activities of the storage manager. Shows which >objects are ejected from the cache, and which objects are > - saved and for how long. To disable, enter "none". There are > - not really utilities to analyze this data, so you can safely > + saved and for how long. To disable, enter "none" or remove the > line. > + There are not really utilities to analyze this data, so you can > safely >disable it. > +NOCOMMENT_START > +# cache_store_log @DEFAULT_STORE_LOG@ > +NOCOMMENT_END > DOC_END > > NAME: cache_swap_state cache_swap_log > @@ -3085,7 +3081,7 @@ > NAME: request_heade
Squid-2.HEAD URL regression with CONNECT
G'day, Squid-2.HEAD doesn't seem to handle CONNECT URLs anymore; I get something like: [start] The requested URL could not be retrieved While trying to retrieve the URL: www.gmail.com:443 The following error was encountered: * Invalid URL [end] Benno, could you please double/triple check that your method and url related changes to Squid-2.HEAD didn't break CONNECT? Thanks! Adrian
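The symptom above is consistent with CONNECT targets being fed through a generic URL parser: HTTP gives CONNECT an authority-form request-target - bare `host:port`, with no scheme or path - so "www.gmail.com:443" is valid even though it is not a URL. A hedged sketch of the distinction (hypothetical helper, not Squid's actual parser, and it ignores IPv6 bracket syntax):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse an authority-form CONNECT target ("host:port").
 * Returns 1 on success, filling host and port; 0 otherwise. */
static int parse_connect_target(const char *target,
                                char *host, size_t hostlen, int *port)
{
    const char *colon = strrchr(target, ':');
    if (!colon || colon == target)
        return 0;                   /* authority form requires host:port */
    size_t n = (size_t)(colon - target);
    if (n + 1 > hostlen)
        return 0;                   /* host doesn't fit */
    memcpy(host, target, n);
    host[n] = '\0';
    *port = atoi(colon + 1);
    return *port > 0 && *port <= 65535;
}
```

A method/URL refactor that starts treating every request-target as a scheme-qualified URL would reject exactly this form, producing the "Invalid URL" error shown.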
Re: How to buffer a POST request
Well, I've got a proof of concept which works well but it's -very- ugly. This is one of those things that may have been slightly easier to do in Squid-3 with Alex's BodyPipe changes. I haven't stared at the BodyPipe code to know whether it's doing all the right kinds of buffering for this application. The problem is that Squid-2's request body data pipeline doesn't do any of its own buffering - it doesn't do anything at all until a consumer says "give me some more request body data please" at which point it's copied out of conn->in.buf (the client-side incoming socket buffer), consumed, and passed on to the caller. I thought about a "clean" implementation which would involve the request body pipeline code consuming socket buffer data until a certain threshold is reached, then feeding that back up to the request body consumer, but I decided that was too difficult for this particular contract. Instead, the "hack" here is to just keep reading data into the client-side socket buffer - it's already doing double duty as a request body buffer anyway - until an ACL match fires to begin forwarding. It's certainly not clean but it seems to work in local testing. I haven't yet tested connection aborts and such to make sure that connections are properly cleaned up. I'll look at posting a patch to squid-dev in a day or two once my client has had a look at it. Thanks, Adrian 2008/8/8 Adrian Chadd <[EMAIL PROTECTED]>: > Well I'm still going through the process of planning out what changes > need to happen. > > I know what changes need to happen long-term but this project doesn't > have that sort of scope.. > > > > Adrian > > 2008/8/8 Mark Nottingham <[EMAIL PROTECTED]>: >> You said you were doing it :) >> >> >> On 08/08/2008, at 4:40 PM, Adrian Chadd wrote: >> >>> Way to dob me in! 
>>> >>> >>> Adrian >>> >>> 2008/8/8 Mark Nottingham <[EMAIL PROTECTED]>: >>>> >>>> I took at stab at: >>>> http://wiki.squid-cache.org/Features/RequestBuffering >>>> >>>> >>>> On 22/07/2008, at 4:40 PM, Henrik Nordstrom wrote: >>>> >>>>> It's not a bug. A feature request in the wiki is more appropriate. >>>>> >>>>> wiki.squid-cache.org/Features/ >>>>> >>>>> Regards >>>>> Henrik >>>>> >>>>> On mån, 2008-07-21 at 17:50 -0700, Mark Nottingham wrote: >>>>>> >>>>>> I couldn't find an open bug for this, so I opened >>>>>> http://www.squid-cache.org/bugs/show_bug.cgi?id=2420 >>>>>> >>>>>> >>>>>> On 11/06/2008, at 3:29 AM, Henrik Nordstrom wrote: >>>>>> >>>>>>> On ons, 2008-06-11 at 12:51 +0300, Mikko Kettunen wrote: >>>>>>> >>>>>>>> Yes, I read something about this on squid-users list, there seems >>>>>>>> to be >>>>>>>> 8kB buffer for this if I understood right. >>>>>>> >>>>>>> The buffer is bigger than that. But not unlimited. >>>>>>> >>>>>>> The big change needed is that there currently isn't anything delaying >>>>>>> forwarding of the request headers until sufficient amount of the >>>>>>> request >>>>>>> body has been buffered. >>>>>>> >>>>>>> Regards >>>>>>> Henrik >>>>>> >>>>>> -- >>>>>> Mark Nottingham [EMAIL PROTECTED] >>>>>> >>>> >>>> -- >>>> Mark Nottingham [EMAIL PROTECTED] >>>> >>>> >>>> >> >> -- >> Mark Nottingham [EMAIL PROTECTED] >> >> >> >
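The "hack" Adrian describes - keep appending body reads into the client-side buffer and only start forwarding once a condition fires - reduces to a small state machine. A sketch under stated assumptions: names, the fixed buffer size, and using a byte threshold as a stand-in for the ACL match are all illustrative, not Squid-2's actual conn->in.buf handling:

```c
#include <assert.h>
#include <string.h>

#define IN_BUF_MAX 8192  /* stand-in for the socket buffer's capacity */

typedef struct {
    char buf[IN_BUF_MAX];
    size_t used;
    size_t threshold;    /* stand-in for "ACL match fires" */
    int forwarding;
} client_conn;

/* Append one socket read's worth of body data.
 * Returns 1 once forwarding should begin, 0 while still buffering. */
static int conn_append_body(client_conn *c, const char *data, size_t n)
{
    if (n > IN_BUF_MAX - c->used)
        n = IN_BUF_MAX - c->used;  /* real code must abort/503 on overflow */
    memcpy(c->buf + c->used, data, n);
    c->used += n;
    if (!c->forwarding && c->used >= c->threshold)
        c->forwarding = 1;
    return c->forwarding;
}
```

The untested abort path Adrian mentions matters here: if the client disconnects while `forwarding` is still 0, the buffered bytes must be discarded and nothing may reach the origin.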
Re: /bzr/squid3/trunk/ r9176: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address.
Hah, Amos just exposed my onset short-term memory loss! (Time to get a bigger whiteboard..) Adrian 2008/9/9 Amos Jeffries <[EMAIL PROTECTED]>: >> I've been thinking about doing exactly this after I've been knee-deep >> in the DNS code. >> It may not be a bad idea to have generic udp/tcp incoming/outgoing >> addresses which can then be overridden per-"protocol". >> > > WTF? We discussed this months ago and came to the conclusion it would be > good to have a two layered outgoing address/port assignment. > > a) base default of random system-assigned outbound address port. > > b) override per-component/protocol in/out bound address/port with > individual config options. > > Amos > >> >> Adrian >> >> 2008/9/9 Amos Jeffries <[EMAIL PROTECTED]>: revno: 9176 committer: Alex Rousskov <[EMAIL PROTECTED]> branch nick: trunk timestamp: Mon 2008-09-08 17:52:06 -0600 message: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address. modified: src/htcp.cc >>> >>> I think this is one of those cleanup situations where we wanted to split >>> the protocol away from generic udp_*_address and make it an >>> htcp_outgoing_address. Yes? >>> >>> Amos >>> >>> >>> >> > > >
Re: /bzr/squid3/trunk/ r9176: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address.
I've been thinking about doing exactly this after I've been knee-deep in the DNS code. It may not be a bad idea to have generic udp/tcp incoming/outgoing addresses which can then be overridden per-"protocol". Adrian 2008/9/9 Amos Jeffries <[EMAIL PROTECTED]>: >> >> revno: 9176 >> committer: Alex Rousskov <[EMAIL PROTECTED]> >> branch nick: trunk >> timestamp: Mon 2008-09-08 17:52:06 -0600 >> message: >> Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming >> address. >> modified: >> src/htcp.cc >> > > I think this is one of those cleanup situations where we wanted to split > the protocol away from generic udp_*_address and make it an > htcp_outgoing_address. Yes? > > Amos > > >
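The layered scheme being proposed - a per-protocol address that overrides a generic default, which itself falls back to system assignment - amounts to a two-step lookup. A hedged sketch; strings stand in for real sockaddr structures and the names (e.g. an htcp_outgoing_address overriding udp_outgoing_address) are only illustrative of the idea:

```c
#include <assert.h>
#include <string.h>

/* Resolve the outgoing bind address for one protocol:
 * per-protocol override > generic default > system-assigned ("0.0.0.0"). */
static const char *resolve_outgoing(const char *proto_addr,
                                    const char *generic_addr)
{
    if (proto_addr && *proto_addr)
        return proto_addr;      /* e.g. a per-protocol *_outgoing_address */
    if (generic_addr && *generic_addr)
        return generic_addr;    /* e.g. a generic udp_outgoing_address */
    return "0.0.0.0";           /* let the kernel pick */
}
```

The bug fixed in r9176 is the kind this lookup prevents: with one resolution path per protocol, an outgoing default cannot silently leak into an incoming-address slot.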
Re: squid-2.HEAD: some changes to client_side.c for invalid requests.
G'day, Please make sure you put these patches into bugzilla so they're not lost. adrian 2008/9/8 Alexander V. Lukyanov <[EMAIL PROTECTED]>: > On Mon, Sep 08, 2008 at 02:49:50PM +0400, Alexander V. Lukyanov wrote: >> 3. create method object even for invalid requests (this fixes null pointer >> dereferences in many other places). > > I also suggest this patch to detect attempts to create method-less requests. > > -- > Alexander. >
Re: [PATCH] Send 407 on url_rewrite_access/storeurl_access
Thanks! Don't forget to bug me if its not sorted out in the next week or so. Adrian 2008/9/8 Diego Woitasen <[EMAIL PROTECTED]>: > http://www.squid-cache.org/bugs/show_bug.cgi?id=2455 > > On Sun, Sep 07, 2008 at 09:28:30AM +0800, Adrian Chadd wrote: >> It looks fine; could you dump it into bugzilla for the time being? >> (We're working on the Squid-2 -> bzr merge stuff at the moment!) >> >> >> >> Adrian >> >> 2008/9/7 Diego Woitasen <[EMAIL PROTECTED]>: >> > This patch apply to Squid 2.7.STABLE4. >> > >> > If we use a proxy_auth acl on {storeurl,url_rewrite}_access and the user >> > isn't authenticated previously, send 407. >> > >> > regards, >> >Diego >> > >> > >> > diff --git a/src/client_side.c b/src/client_side.c >> > index 23c4274..4f75ea0 100644 >> > --- a/src/client_side.c >> > +++ b/src/client_side.c >> > @@ -448,19 +448,71 @@ clientFinishRewriteStuff(clientHttpRequest * http) >> > >> > } >> > >> > -static void >> > -clientAccessCheckDone(int answer, void *data) >> > +void >> > +clientSendErrorReply(clientHttpRequest * http, int answer) >> > { >> > -clientHttpRequest *http = data; >> > err_type page_id; >> > http_status status; >> > ErrorState *err = NULL; >> > char *proxy_auth_msg = NULL; >> > + >> > +proxy_auth_msg = >> > authenticateAuthUserRequestMessage(http->conn->auth_user_request ? >> > http->conn->auth_user_request : http->request->auth_user_request); >> > + >> > +int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || >> > aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent; >> > + >> > +debug(33, 5) ("Access Denied: %s\n", http->uri); >> > +debug(33, 5) ("AclMatchedName = %s\n", >> > + AclMatchedName ? AclMatchedName : ""); >> > +debug(33, 5) ("Proxy Auth Message = %s\n", >> > + proxy_auth_msg ? 
proxy_auth_msg : ""); >> > + >> > +/* >> > + * NOTE: get page_id here, based on AclMatchedName because >> > + * if USE_DELAY_POOLS is enabled, then AclMatchedName gets >> > + * clobbered in the clientCreateStoreEntry() call >> > + * just below. Pedro Ribeiro <[EMAIL PROTECTED]> >> > + */ >> > +page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, >> > answer != ACCESS_REQ_PROXY_AUTH); >> > +http->log_type = LOG_TCP_DENIED; >> > +http->entry = clientCreateStoreEntry(http, http->request->method, >> > + null_request_flags); >> > +if (require_auth) { >> > + if (!http->flags.accel) { >> > + /* Proxy authorisation needed */ >> > + status = HTTP_PROXY_AUTHENTICATION_REQUIRED; >> > + } else { >> > + /* WWW authorisation needed */ >> > + status = HTTP_UNAUTHORIZED; >> > + } >> > + if (page_id == ERR_NONE) >> > + page_id = ERR_CACHE_ACCESS_DENIED; >> > +} else { >> > + status = HTTP_FORBIDDEN; >> > + if (page_id == ERR_NONE) >> > + page_id = ERR_ACCESS_DENIED; >> > +} >> > +err = errorCon(page_id, status, http->orig_request); >> > +if (http->conn->auth_user_request) >> > + err->auth_user_request = http->conn->auth_user_request; >> > +else if (http->request->auth_user_request) >> > + err->auth_user_request = http->request->auth_user_request; >> > +/* lock for the error state */ >> > +if (err->auth_user_request) >> > + authenticateAuthUserRequestLock(err->auth_user_request); >> > +err->callback_data = NULL; >> > +errorAppendEntry(http->entry, err); >> > + >> > +} >> > + >> > +static void >> > +clientAccessCheckDone(int answer, void *data) >> > +{ >> > +clientHttpRequest *http = data; >> > + >> > debug(33, 2) ("The request %s %s is %s, because it matched '%s'\n", >> >RequestMethods[http->request->method].str, http->uri, >> >answer == ACCESS_ALLOWED ? "ALLOWED" : "DENIED", >> >AclMatchedName ? AclMatchedName : "