Re: [code] [for discussion] map-trie
FWIW my intent with libTrie was always to head for an LPC Trie implementation as the only implementation. This should be faster still than the current naive trie, and faster than the std::map compression technique you've implemented.

[ IIRC the LPC paper was Implementing a Dynamic Compressed Trie from Stefan Nilsson, Matti Tikkanen ] - I have the pdf floating around here somewhere, but a quick google just hit paywalls these days :(.

-Rob

On 12 June 2014 10:43, Kinkie gkin...@gmail.com wrote:

Hi, I've done some benchmarking, here are the results so far:

The proposal I'm suggesting for dstdomain acl is at lp:~kinkie/squid/flexitrie . It uses the level-compact trie approach I've described in this thread (NOT a Patricia trie). As a comparison, lp:~kinkie/squid/domaindata-benchmark implements the same benchmark using the current splay-based implementation.

I've implemented a quick-n-dirty benchmarking tool (src/acl/testDomainDataPerf); it takes as input an acl definition - one dstdomain per line, as if it was included in squid.conf, and a hostname list file (one destination hostname per line).

I've run both variants of the code against the same dataset: a 4k entries domain list, containing both hostnames and domain names, and a 18M entries list of destination hostnames, both matching and not matching entries in the domain list (about 7.5M hits, 10.5M misses). Tested 10 times on a Core 2 PC with plenty of RAM - source datasets are in the fs cache.

- level-compact-trie: the mean time is 11 sec; all runs take between 10.782 and 11.354 secs; 18 MB of core used
- full-trie: mean is 7.5 secs +- 0.2 secs; 85 MB of core
- splay-based: mean time is 16.3 sec; all runs take between 16.193 and 16.427 secs; 14 MB of core

I expect compact-trie to scale better as the number of entries in the list grows and with the number of clients and requests per second; furthermore using it removes 50-100 LOC, and makes code more readable.
IMO it is the best compromise in terms of performance, resource usage and expected scalability; before pursuing this further however I'd like to have some feedback. Thanks

On Wed, Jun 4, 2014 at 4:11 PM, Alex Rousskov rouss...@measurement-factory.com wrote:

On 06/04/2014 02:06 AM, Kinkie wrote:

there are use cases for using a Trie (e.g. prefix matching for dstdomain ACL); these may be served by other data structures, but none we could come up with on the spot.

std::map and StringIdentifier are the ones we came up with on the spot. Both may require some adjustments to address the use case you are after. Alex.

A standard trie uses quite a lot of RAM for those use cases. There are quite a lot of areas where we can improve so this one is not urgent. Still I'd like to explore it as it's synchronous code (thus easier for me to follow) and it's a nice area to tinker with.

On Tue, Jun 3, 2014 at 10:12 PM, Alex Rousskov rouss...@measurement-factory.com wrote:

On 06/03/2014 08:40 AM, Kinkie wrote:

Hi all, as an experiment and to encourage some discussion I prepared an alternate implementation of TrieNode which uses a std::map instead of an array to store a node's children. The expected result is a worst case performance degradation on insert and delete from O(N) to O(N log R) where N is the length of the c-string being looked up, and R is the size of the alphabet (as R = 256, we're talking about 8x worse). The expected benefit is a noticeable reduction in terms of memory use, especially for sparse key-spaces; it'd be useful e.g. in some lookup cases. Comments?

To evaluate these optimizations, we need to know the targeted use cases. Amos mentioned ESI as the primary Trie user. I am not familiar with ESI specifics (and would be surprised to learn you want to optimize ESI!), but would recommend investigating a different approach if your goal is to optimize search/identification of strings from a known-in-advance set. Cheers, Alex.
-- Francesco -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud
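For readers following the thread, the std::map-per-node idea discussed above can be sketched roughly as below. This is a minimal illustration only, not the code in lp:~kinkie/squid/flexitrie and not libTrie's TrieNode; the class name MapTrie and its methods are hypothetical. Each node keeps its children in a std::map, so a node costs memory only for the bytes actually present, at the price of an O(log R) step per character.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical sketch of a trie whose nodes store children in a std::map
// instead of a 256-slot array. Names are illustrative, not Squid's.
class MapTrie
{
private:
    struct Node {
        std::map<unsigned char, std::unique_ptr<Node>> children;
        bool terminal = false; // true if a stored key ends at this node
    };
    Node root_;

public:
    // Insert a key: O(N log R) for key length N and alphabet size R.
    void insert(const std::string &key) {
        Node *n = &root_;
        for (unsigned char c : key) {
            std::unique_ptr<Node> &child = n->children[c];
            if (!child)
                child = std::make_unique<Node>();
            n = child.get();
        }
        n->terminal = true;
    }

    // Exact-match lookup.
    bool contains(const std::string &key) const {
        const Node *n = &root_;
        for (unsigned char c : key) {
            auto it = n->children.find(c);
            if (it == n->children.end())
                return false;
            n = it->second.get();
        }
        return n->terminal;
    }

    // Length of the longest stored key that is a prefix of text,
    // or std::string::npos if no stored key is a prefix.
    std::size_t longestPrefix(const std::string &text) const {
        const Node *n = &root_;
        std::size_t best = n->terminal ? 0 : std::string::npos;
        for (std::size_t i = 0; i < text.size(); ++i) {
            auto it = n->children.find(static_cast<unsigned char>(text[i]));
            if (it == n->children.end())
                break;
            n = it->second.get();
            if (n->terminal)
                best = i + 1;
        }
        return best;
    }
};

// Small usage example; the keys are made up.
inline void mapTrieExample()
{
    MapTrie t;
    t.insert("com.example");
    assert(t.contains("com.example"));
    assert(!t.contains("com.exam"));
}
```

A dstdomain-style matcher would presumably store reversed domain names so that suffix matching becomes prefix matching; that step is left out here.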
Re: BZR local server?repo?
Put bzr update in cron? Or if you want an exact copy of trunk, use 'bzr branch' then 'bzr pull' to keep it in sync. On 23 January 2014 20:53, Eliezer Croitoru elie...@ngtech.co.il wrote: Since I do have a local server I want to have an up-to-date bzr replica. I can just use checkout or whatever but I want it to be updated etc. I am no bzr expert so any help about the subject is more than welcome. Thanks, Eliezer
Re: [RFC] Tokenizer API
On 10 December 2013 19:13, Amos Jeffries squ...@treenet.co.nz wrote: The problem with comparing input strings to a SBuf of characters is that parsing an input of length N against a charset of size M takes O(N*M) time. Huh? There are linear time parsers with PEGs. Or maybe I don't understand one of your preconditions to come up with an N*M complexity here. -Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud
Re: Dmitry Kurochkin
On 24 July 2013 05:18, Alex Rousskov rouss...@measurement-factory.com wrote: It is with great sadness I inform you that Dmitry Kurochkin died in a skydiving accident a few days ago. Dmitry was an avid skydiver, with more than 360 jumps and some regional records behind him. He loved that sport. Dmitry's recent contributions to Squid include code related to HTTP/1.1 compliance, SMP scalability, SMP Rock Store, Large Rock, Collapsed Forwarding, and FTP gateway features. He also worked on automating Squid compliance testing with Co-Advisor and Squid performance testing with Web Polygraph. Dmitry was a wonderful person, a talented developer, and a key member of the Factory team. He was a pleasure to work with. We miss him badly. That's extremely sad news. My condolences and sympathies to his colleagues, family, and friends. -Rob
rackspace offering free compute for open source projects
https://twitter.com/jessenoller/status/355757374906183680 - this might be useful for jenkins. -Rob
Re: Should we remove ESI?
On 11 June 2013 20:23, Kinkie gkin...@gmail.com wrote:

On Mon, Jun 10, 2013 at 7:22 PM, Alex Rousskov rouss...@measurement-factory.com wrote:

From what I understand (Robert, can you come to the rescue?) libTrie is a very optimized key- and prefix- lookup engine, trading memory usage for speed. It would be great to use in the Http parser to look up header keys, for instance.

It is a generic trie implementation, it is very good at some forms of lookup, and it's used in ESI yes; I had planned to try it in the HTTP parser, but ETIME.

I do not know much about ESI, but IMHO, if somebody has cycles to work on this, it would be best to spend them removing ESI (together with libTrie) from Squid sources while converting ESI into an eCAP adapter. This will be a big step forward towards making client side code sane (but removing ESI itself does not require making complex changes to the client side code itself).

Robert is the expert on this. My question right now is, is anyone using ESI? ESI requires a specifically-crafted mix of infrastructure and application; there are nowadays simpler ways to obtain similar results. For this reason I would launch an inquiry to our users and to the original ESI sponsors to understand whether to simply stop supporting ESI. It is ~10kLOC that no one really looks after, and they imply dependencies (e.g. on the xml libraries).

We get occasional queries about it on IRC and the lists; I don't know if it's in use in production or not. I think it would be sad to remove working code, but if no one is using it, no one is using it. I think refactoring it to use eCAP rather than clientStreams would be fine, but I can't volunteer to do that myself.

-Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Cloud Services
Re: Should we integrate libTrie into our build system?
On 10 June 2013 08:40, Kinkie gkin...@gmail.com wrote: Hi all, while attempting to increase portability to recent clang releases, I noticed that libTrie hasn't benefited from the portability work that was done in the past few years. I can see three ways to move forward: 1- replicate these changes into libTrie 2- change libTrie to piggyback squid's configuration variables 3- fully integrate libTrie into squid's build system. Unless Robert knows otherwise, squid is the only user of this library. I'm not aware of it being shipped/used separately. Probably want to replace it by now, must be a tuned equivalent somewhere :) -Rob Comments? -- /kinkie -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Cloud Services
Re: Squid SMP on MacOS
On 25 February 2013 18:24, Alex Rousskov rouss...@measurement-factory.com wrote: On 02/24/2013 10:02 PM, Amos Jeffries wrote: I'm trying to get the MacOS builds of Squid going again but having some problems with shm_open() in the Rock storage unit-tests. 1) MacOS defines the max name length we can pass to shm_open() at 30 bytes. /squid-testRock__testRockSearch being 35 or so bytes. Cutting the definition in testRock.cc down so it becomes /squid-testRock_Search resolves that, but then we hit (2). That TESTDIR name is wrong because it is used for more than just search testing. I bet the Rock name mimicked the UFS test name, but the UFS name is wrong too, for the same reason. We should use cppUnitTestRock and cppUnitTestUfs or something similarly unique and short, I guess. We should use a random name; squidtest-10-bytes-of-entropy should do it. Random because we don't want tests running in parallel to step on each other on jenkins slaves. 2) With the short string above and the current settings sent to shm_open() in src/ipc/mem/Segment.cc line 73 MacOS shm_open() starts responding with EINVAL. theFD = shm_open(theName.termedBuf(), O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR); Sounds like some of the five shm_open() flags we are using successfully elsewhere do not work on MacOS. I do not know which flag(s) do not work, and we have no MacOS boxes in the lab, so we cannot experiment or read documentation. I assume shared segment opening fails with similar symptoms when used outside of unit tests (e.g., with a shared memory cache)? If so, please feel free to disable shared memory support on MacOS (do not define HAVE_SHM?) until somebody who needs it can find the right combination of flags. +1 -Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Cloud Services
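The random-name suggestion above could look something like the sketch below. It is an assumption-laden illustration, not Squid code: randomShmName, the "/squidtest-" prefix and the 30-byte constant (taken from the MacOS limit reported in this thread) are all hypothetical names and values. Ten characters drawn from a 36-symbol alphabet give roughly 51 bits of randomness rather than a literal ten bytes of entropy, which should still be plenty to keep parallel test runs apart.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Name length limit reported for MacOS shm_open() in this thread;
// treated conservatively here (the name must be strictly shorter).
static const std::size_t kReportedShmNameLimit = 30;

// Hypothetical helper, not part of Squid: builds a short random POSIX
// shared-memory name so that tests running in parallel do not collide.
std::string randomShmName(const std::string &prefix = "/squidtest-",
                          std::size_t suffixLen = 10)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz0123456789";
    std::random_device rd;
    // sizeof(alphabet) counts the trailing NUL, hence the "- 2".
    std::uniform_int_distribution<std::size_t> pick(0, sizeof(alphabet) - 2);

    std::string name = prefix;
    for (std::size_t i = 0; i < suffixLen; ++i)
        name += alphabet[pick(rd)];

    // Guard against prefixes that would push the name over the limit.
    assert(name.size() < kReportedShmNameLimit);
    return name;
}
```

With the defaults the result is 21 bytes long, e.g. a name of the form /squidtest-xxxxxxxxxx, which a test could then pass to shm_open().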
Re: some notes and help to think is needed + Test results of testing basics in store_url_rewrite.
On Thu, Sep 27, 2012 at 4:09 PM, Eliezer Croitoru elie...@ngtech.co.il wrote: Well I was reading here and there the store code and other things then started testing some theories about how store_url can be and should be implemented. If you do remember or not I refactored the source in couple places to use originalUrl and original_url to check all the places where original url is used and then decide where the store_url should be placed if needed. (memobject-original_url and has memobject-store_url) the main place of all I have seen that should be used with the store_url is the http://bazaar.launchpad.net/~squid/squid/3-trunk/view/head:/src/store_key_md5.cc#L144 which gets requests from couple places and mainly the setPublic for store entry (Does bazaar has search option in it ?) bzr grep (if you install the bzr-grep plugin) will search all your source code. bzr search (if you install the bzr-search plugin) finds references anywhere in the project history. -Rob
Re: [RFC] or ACLs
On Tue, Sep 25, 2012 at 10:06 AM, Alex Rousskov rouss...@measurement-factory.com wrote: Hello, I would like to add support for explicit OR ACLs: On the upside I think that this would indeed give us much more flexibility, and it's a decent stopgap between this and a more understandable ACL language. On the downside, I think it will make things harder to explain, the current system being itself rather arcane. I'd like to see, long term, some more complete language - e.g. something with macros so you can define and reuse complex constructs without having to repeat them, which I think your OR support will only partially mitigate against. -Rob
Re: [RFC] One helper to rewrite them all
On Wed, Sep 12, 2012 at 11:05 AM, Amos Jeffries squ...@treenet.co.nz wrote: IMO the backward compatibility and easy upgrade from 2.7 overrides doing this now. It is possible to do a migration to this design later easily enough not to worry. FWIW I wouldn't worry about 2.7 compat. As of precise, squid2 isn't in Ubuntu at all. It's unlikely that folk will need to write one helper to support both 3.x and 2.7 installs. -Rob
Re: bzr unmerge
On Sat, Aug 18, 2012 at 6:58 AM, Henrik Nordström hen...@henriknordstrom.net wrote: fre 2012-08-17 klockan 19:59 +0200 skrev Kinkie: Have you considered bzr uncommit? uncommit do not cut it. uncommit simply moves the branch head to a given revision discarding any later revisions, same as git reset --hard for those familiar with git. It's ok to use on a private branch to clean up mistakes, but MUST NOT, repeat MUST NOT be used after the revisions have been pushed to a shared repository. what I want is a unmerge operation that is a commit in itself much like revert, preserving full history, but which requires attention to resolve when propagated to other branches, enabling other branches to keep the changes or revert them per user choice. bzr merge N..N-1 . will back out revision N from this branch in the way you describe.
Re: Generic helper I/O format
On Thu, Jul 5, 2012 at 4:00 PM, Amos Jeffries squ...@treenet.co.nz wrote: Why do we need backwards compat in the new protocol? As an alternative, consider setting a protocol= option on the helpers, making the default our latest-and-greatest, and folk running third-party helpers can set protocol=v1 or whatever to get backwards compat. This lets us also warn when we start deprecating the old protocol, that it's going away. -Rob
Re: Geek fun in Squid's source code
On Fri, Jun 29, 2012 at 12:53 PM, Amos Jeffries squ...@treenet.co.nz wrote: On 29/06/2012 12:01 a.m., Kinkie wrote: from support_netbios.cc: if (p == np) { (stumbled into this while sweeping the sources changing postincrement to preincrement operators). -- /kinkie Why do you bring this up? http://en.wikipedia.org/wiki/P_versus_NP_problem
Re: squid md5 and fips mode
On Tue, Jun 19, 2012 at 9:10 AM, Paul Wouters pwout...@redhat.com wrote:

Hi, I have been looking at FIPS issues with squid in both RHEL5 and RHEL6. In fips mode, MD5 is not allowed for securing web traffic (with some TLS exceptions) nor user authentication. It is allowed for other things, such as hashes for disk objects.

The problem in older versions of squid was that the cache object code used md5, and since openssl did not allow it, it would die. This was fixed with a patch by using the squid built-in md5 functions for the cache object hashes. See https://bugzilla.redhat.com/show_bug.cgi?id=705097

Now for recent versions of squid, I have the reverse problem. The openssl md5 code is never used, so I had to patch it back to using openssl with the exception for the cache object id where I used private_MD5_Init(). It basically undoes most of these commits:

- Changing 'xMD5' function name to 'SquidMD5'
- Changing 'MD5_CTX' typedef to 'SquidMD5_CTX'
- Changing 'MD5_DIGEST_CHARS' define to 'SQUID_MD5_DIGEST_LENGTH'
- Changing 'MD5_DIGEST_LENGTH' define to 'SQUID_MD5_DIGEST_LENGTH'
- Removing messy #ifdef logic in include/md5.h that tries to use the system libraries if available. We'll always use the Squid MD5 routines.

My request is to basically undo this change, and to use openssl again where possible, so that fips enforcement does not fail with the custom crypto code that goes undetected. A rough patch (that does not take into account systems with no crypt()) is attached at: https://bugzilla.redhat.com/show_bug.cgi?id=833086

A few quick thoughts:

- We'd love patches that make squid explicitly aware of e.g. FIPS mode, so that we can enforce it ourselves. We've no idea today when we change something whether it's going to impact on such external needs, and frankly, tracking it is going to be tedious and error prone, leading to the sort of flip-flop situation you have.
- as we don't have an OpenSSL exception in our copyright (and it's -hard- to add one today), you can't legally ship a squid binary linked against openSSL anyway. http://gplv3.fsf.org/wiki/index.php/Compatible_licenses#GPLv2-incompatible_licenses. Note that our COPYING file says 'version 2' not 'version 2 or later', though at least some source files say 'version 2 or later' - we'll need to figure out what to do about *that* in due course.

So, basically every build of squid that uses openSSL has to have been built by the end user locally, anyway. Yes, this kind of sucks.

That said, if the FIPS standard doesn't like MD5, there is no need for us to use it at all, we could use sha1 as a build time option for cache keys (or take the first N bits of sha1), if that helps: that would allow us to be entirely MD5 free when desired.

-Rob
Re: [PATCH] fix up external acl type dumping
On Fri, Jun 15, 2012 at 4:23 AM, Alex Rousskov rouss...@measurement-factory.com wrote: On 06/14/2012 03:06 AM, Robert Collins wrote: +#define DUMP_EXT_ACL_TYPE_FMT(a, fmt, ...) \ + case _external_acl_format::EXT_ACL_##a: \ + storeAppendPrintf(sentry, fmt, ##__VA_ARGS__); \ + break I do not see Squid using __VA_ARGS__ currently. Are you sure it is portable, especially in the ## context? If you have not tested this outside Linux, are you OK with us pulling out this patch if it breaks the build? Yes, that would be fine. I'm mainly fixing this because I noticed it in passing: it has no concrete effect on me. If you are not sure, you can rework the macro to always use a single argument instead. The calls without a parameter in the format can add one and use an empty string (or you can have two macros). I do not like how this macro butchers the format code names. I could borderline agree with stripping the namespace, but stripping EXT_ACL_ prefix seems excessive. The prefix itself looks excessive (because the namespace already has external_acl in it!) but that is a different issue. I would prefer that the macro takes a real code name constant, without mutilating the name. There is an existing similar macro that does the same kind of mutilation, but there it is kind of justified because the name suffix is actually used. The true solution to this mess is to fix the names themselves, I guess. I think we should make external acls a class, no base class, and ditch the whole big case statement etc, and we can lose the macro at the same time. Thats a rather bigger change, and I wanted to fix the defect in the first instance. -Rob
Re: [PATCH] fix up external acl type dumping
On Sat, Jun 16, 2012 at 7:33 AM, Alex Rousskov rouss...@measurement-factory.com wrote: IMHO, ##__VA_ARGS__ is not worth the trouble in this particular case. However, even if you disagree, please use at least one argument (empty string with a corresponding %s if needed to prevent compiler warnings?). The ## is meant to compensate for that, according to the reading I did. I'm happy to back it out if it causes jenkins failures, but would like to at least try it as-is. -Rob
[PATCH] fix up external acl type dumping
This patch does four things:
- adds a helper macro for format strings with external acl type config dumping.
- uses that to add a missing type dumper for %%, which currently causes squid to FATAL if mgr:config is invoked.
- refactors the SSL type dumping to use the macro as well, saving some redundant code
- fixes a typo
-case _external_acl_format::EXT_ACL_CA_CERT:
-storeAppendPrintf(sentry, %%USER_CERT_%s, format-header);

Seeking review, will land in a couple days if there is none :)

-Rob

=== modified file 'src/external_acl.cc'
--- src/external_acl.cc 2012-05-08 01:21:10 +
+++ src/external_acl.cc 2012-06-14 08:58:33 +
@@ -568,6 +568,10 @@
 case _external_acl_format::EXT_ACL_##a: \
 storeAppendPrintf(sentry, %%%s, #a); \
 break
+#define DUMP_EXT_ACL_TYPE_FMT(a, fmt, ...) \
+case _external_acl_format::EXT_ACL_##a: \
+storeAppendPrintf(sentry, fmt, ##__VA_ARGS__); \
+break
 #if USE_AUTH
 DUMP_EXT_ACL_TYPE(LOGIN);
 #endif
@@ -592,28 +596,17 @@
 DUMP_EXT_ACL_TYPE(PATH);
 DUMP_EXT_ACL_TYPE(METHOD);
 #if USE_SSL
-
-case _external_acl_format::EXT_ACL_USER_CERT_RAW:
-storeAppendPrintf(sentry, %%USER_CERT);
-break;
-
-case _external_acl_format::EXT_ACL_USER_CERTCHAIN_RAW:
-storeAppendPrintf(sentry, %%USER_CERTCHAIN);
-break;
-
-case _external_acl_format::EXT_ACL_USER_CERT:
-storeAppendPrintf(sentry, %%USER_CERT_%s, format-header);
-break;
-
-case _external_acl_format::EXT_ACL_CA_CERT:
-storeAppendPrintf(sentry, %%USER_CERT_%s, format-header);
-break;
+DUMP_EXT_ACL_TYPE_FMT(USER_CERT_RAW, %%USER_CERT_RAW);
+DUMP_EXT_ACL_TYPE_FMT(USER_CERTCHAIN_RAW, %%USER_CERTCHAIN_RAW);
+DUMP_EXT_ACL_TYPE_FMT(USER_CERT, %%USER_CERT_%s, format-header);
+DUMP_EXT_ACL_TYPE_FMT(CA_CERT, %%CA_CERT_%s, format-header);
 #endif
 #if USE_AUTH
 DUMP_EXT_ACL_TYPE(EXT_USER);
 #endif
 DUMP_EXT_ACL_TYPE(EXT_LOG);
 DUMP_EXT_ACL_TYPE(TAG);
+DUMP_EXT_ACL_TYPE_FMT(PERCENT, );
 default:
 fatal(unknown external_acl format error);
 break;
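The ##__VA_ARGS__ idiom used by DUMP_EXT_ACL_TYPE_FMT can be tried in isolation with something like the sketch below. It is only an illustration: APPEND_FMT and dumpExample are hypothetical names, std::snprintf into a std::string stands in for storeAppendPrintf(), and the format strings (including their leading spaces) are made up. The `##` before `__VA_ARGS__` is a compiler extension (supported by GCC and Clang) that drops the preceding comma when no variadic arguments are given, which is the portability question raised in the review.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for storeAppendPrintf(): formats into a small
// buffer and appends the result to a std::string.
// The "##" makes the comma before __VA_ARGS__ disappear when the macro
// is invoked with a format string only (GNU extension).
#define APPEND_FMT(out, fmt, ...) \
    do { \
        char buf_[128]; \
        std::snprintf(buf_, sizeof(buf_), fmt, ##__VA_ARGS__); \
        (out) += buf_; \
    } while (0)

// Builds a dump line using the macro both with and without extra arguments.
std::string dumpExample(const char *header)
{
    assert(header != nullptr);
    std::string out;
    APPEND_FMT(out, " %%USER_CERT_RAW");        // no variadic arguments
    APPEND_FMT(out, " %%USER_CERT_%s", header); // one variadic argument
    return out;
}
```

The alternative raised in the review, always passing at least one argument (for example an empty string matched by a %s), avoids relying on the extension at the cost of slightly noisier call sites.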
Re: Multiple outgoing addresses for squid?
On Fri, Mar 30, 2012 at 4:18 AM, Chris Ross cr...@markmonitor.com wrote: So, I suspect someone has looked at this before, but I have an edge device that is multi-homed. I have multiple WAN connections available, and what I'd really like to do is have a squid that's smart enough to learn which web sites are better out of which WAN connection. But, shy of something that advanced, is it possible to have squid know to bind to N outside addresses, and then either round-robin them, or try one always, and then try the other if there is a failure on the first? I'd be happy to help implement such a thing if it doesn't already exist, but I assume this is the sort of problem that has already been faced and hopefully solved. 'tcp_outgoing_address' in the config ;) -Rob
Re: Multiple outgoing addresses for squid?
2012/3/30 Henrik Nordström hen...@henriknordstrom.net: Can tcp_outgoing_address take multiple addresses now? Does it just round-robin through them? It can only select one per request at the moment. That's probably something we should fix. For now though an external ACL could deliver round robin answers, one per request - and it could look at the log file to learn about size of objects / estimate bandwidth etc. -Rob
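The round-robin part of that suggestion is simple enough to sketch. The class below is hypothetical and is not Squid code or an actual external ACL helper; it only shows the selection logic such a helper might wrap, handing out one configured address per request in turn. The addresses in the usage note are documentation-range placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: cycles through a fixed list of outgoing
// addresses, returning the next one on every call.
class RoundRobinAddresses
{
public:
    explicit RoundRobinAddresses(std::vector<std::string> addrs)
        : addrs_(std::move(addrs))
    {
        assert(!addrs_.empty()); // at least one address is required
    }

    // Returns the address to use for the next request.
    const std::string &next() {
        const std::string &chosen = addrs_[cursor_];
        cursor_ = (cursor_ + 1) % addrs_.size();
        return chosen;
    }

private:
    std::vector<std::string> addrs_;
    std::size_t cursor_ = 0;
};
```

Constructed with 192.0.2.1 and 192.0.2.2, successive next() calls alternate between the two. Failover on error, or weighting by observed bandwidth as hinted above, would need state that this sketch does not keep.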
Re: Which projects are most important?
Performance is a hot topic at the moment; I would love to see more time going into that through any of the performance related items you listed (or other ones you may come up with). -Rob
Re: Question regarding ESI Implementation
On Fri, Sep 30, 2011 at 7:38 PM, Jan Algermissen jan.algermis...@nordsc.com wrote: There's no specific meta documentation that I recall. It should be pretty straightforward (but remember that squid is a single threaded non-blocking program - so it's got to work with that style of programming). Ah, I did not yet know Squid had an async programming model. Even better. The current ESI Implementation is non-blocking, too? Yup, for sure. -Rob
Re: Question regarding ESI Implementation
On Thu, Sep 29, 2011 at 10:37 AM, Jan Algermissen jan.algermis...@nordsc.com wrote: Hi, I am thinking about trying out some ideas building upon ESI 1.0 and would like to extend the ESI implementation of Squid. For personal use right now, but if it turns out to be valuable I am happy to share it. I downloaded the source yesterday and took a short look at the ESI parts. Is there any form of documentation about the rationale behind the code snippets and the supported parts of ESI 1.0? What is the best way to get up to speed? There's no specific meta documentation that I recall. It should be pretty straightforward (but remember that squid is a single threaded non-blocking program - so it's got to work with that style of programming). Can you remember when the development roughly took place to help me dig through the developer archives? Early 2000s, uhm, I think 2003. -Rob
Re: hit a serious blocker on windows
DuplicateHandle is how it's done. http://msdn.microsoft.com/en-us/library/ms724251(v=vs.85).aspx -Rob
Re: [RFC] Have-Digest and duplicate transfer suppression
On Thu, Aug 11, 2011 at 10:59 AM, Alex Rousskov rouss...@measurement-factory.com wrote:

On 08/10/2011 04:18 PM, Robert Collins wrote: How is case B different to If-None-Match?

The origin server may not supply enough information for If-None-Match request to be possible OR it may lie when responding to If-None-Match requests.

The parent squid could handle the case when the origin lies though, couldn't it? So: client - child - parent - origin. If client asks for url X, child has an old copy, child could add If-None-Match, parent could detect that origin sends the same bytes (presumably by spooling the entire response) and then satisfy the If-None-Match, and child can give an unconditional reply to client, which hadn't had the original bytes.

That doesn't help with the not-enough-information case, which is I presume the lack of a strong validator. So, perhaps we could consider this 'how can intermediaries add strong validators' - if we do that, and then (in squid - no http standards violation) honour If-None-Match on those additional validators, it seems like we'll get the functionality you want, in a slightly more generic (and thus reusable) way?

-Rob
Re: [RFC] Have-Digest and duplicate transfer suppression
(But for clarity - I'm fine with what you proposed, I just wanted to consider whether the standards would let us do it more directly, which they -nearly- do AFAICT). -Rob
Re: [RFC] Have-Digest and duplicate transfer suppression
How is case B different to If-None-Match ? -Rob
Re: FYI: http timeout headers
On Fri, Mar 11, 2011 at 6:22 AM, Mark Nottingham m...@yahoo-inc.com wrote: http://tools.ietf.org/html/draft-thomson-hybi-http-timeout-00 In a nutshell, this draft introduces two new headers: Request-Timeout, which is an end-to-end declaration of how quickly the client wants the response, and Connection-Timeout, which is a hop-by-hop declaration of how long an idle conn can stay open. I'm going to give feedback to Martin about this (I can see a few places where there may be issues, e.g., it doesn't differentiate between a read timeout on an open request and an idle timeout), but I wanted to get a sense of what Squid developers thought; in particular - 1) is this interesting enough that you'd implement if it came out? Not personally, because the sites I'm involved with set timeouts on the backend as policy: dropping reads early won't save backend computing overhead (because request threads aren't cleanly interruptible), and permitting higher timeouts would need guards to prevent excessive resource consumption being permitted inappropriately. 2) if someone submitted a patch for this, would you include it? If it was clean, sure. But again there will be an interaction with site policy. e.g. can a client ask for a higher timeout than the squid admin has configured, or can they solely lower it to give themselves a snappier responsiveness. (and if the latter, why not just drop the connection if they don't get an answer soon enough). 3) do you see any critical issues, esp. regarding performance impact? I would worry a little about backend interactions: unless this header is honoured all the way through it would be easy for multiple backend workers to be calculating expensive resources for the same client repeatedly trying something with an inappropriately low timeout. I guess this just seems a bit odd overall : servers generally have a very good idea about the urgency of things it can serve based on what they are delivering. -Rob
Re: FYI: http timeout headers
On Fri, Mar 11, 2011 at 11:16 AM, Mark Nottingham m...@yahoo-inc.com wrote: Right. I think the authors hope that intermediaries (proxies and gateways) will adapt their policies (within configured limits) based upon what they see in incoming connection-timeout headers, and rewrite the outgoing connection-timeout headers appropriately. I'm not sure whether that will happen, hence my question. I'm even less sure about the use cases for request-timeout, for the reasons you mention. I suspect that this is overengineering : scalable proxies really shouldn't need special handling for very long requests, and proxies that need special handling will still want timeouts to guard against e.g. roaming clients leaving stale connections. Simply recommending that intermediaries depend on TCP to time out connections might be sufficient, simpler, and allow room for clean layering to deal with roaming clients and the like without making requests larger. -Rob
Re: Early pre-HEAD patch testing
On Tue, Feb 8, 2011 at 10:52 AM, Alex Rousskov rouss...@measurement-factory.com wrote: The problem with branches is that you have to commit changes (and, later, fixes) _before_ you test. Sometimes, that is not a good idea because you may want to simply _reject_ the patch if it fails the test instead of committing/fixing it. I suspect it would be OK to abuse lp a little and create a garbage branch not associated with any specific long-term development but used when I need to test a patch instead. You can certainly do that, and LP won't mind at all. Note though that a parameterised build in hudson can trivially build *any* branch off of LP, so you can equally push your experiment to ...$myexistingfeature-try-$thing-out. -Rob
Re: Sharing DNS cache among Squid workers
Have you considered just having a caching-only local DNS server colocated on the same machine? -Rob
Re: Sharing DNS cache among Squid workers
On Fri, Jan 14, 2011 at 11:13 AM, Alex Rousskov rouss...@measurement-factory.com wrote:

On 01/13/2011 02:18 PM, Robert Collins wrote: Have you considered just having a caching-only local DNS server colocated on the same machine?

I am sure that would be an appropriate solution in some environments. On the other hand, sometimes the box has no capacity for another server. Sometimes the traffic from 8-16 Squids can be too much for a single DNS server to handle. And sometimes administration/policy issues would prevent using external caching DNS servers on the Squid box.

This surprises me - surely the CPU load for a dedicated caching DNS server is equivalent to the CPU load for squid maintaining a DNS cache itself; and DNS servers are also multithreaded? Anyhow, I've no particular objection to it being in the code base, but it does seem like something we'd get better results by not doing (or by having a defined IPC mechanism to a single, possibly multi-core, cache process which isn't a 'squid', even if it is compiled in the squid tree).

-Rob
Re: Fwd: OpenSolaris build node for Squid to be updated
Their jobs report in subunit however, which we can thunk into hudson using subunit2junitxml, so I think needing a duplicate build farm is much more than we'd need. We would need some glue, but that's about all. -Rob
Re: Feature branch launched: deheader
On Mon, Dec 6, 2010 at 3:28 AM, Kinkie gkin...@gmail.com wrote: Hi all, Eric Raymond recently released a tool named deheader (http://www.catb.org/esr/deheader/) which goes through c/c++ project looking for unneeded includes. It does so by trying to compile each source file after removing one #include statement at a time, and seeing if it builds. That's a fairly flawed approach. Two reasons: - some headers when present affect correctness, not compilation. - on some platforms headers are mandatory, on others optional. So, if you do this, be sure to take the minimal approach after building on *all* platforms, and then still be conservative. -Rob
Re: [PATCH] [RFC] custom error pages
Also, symlinks fail on windows :(. -Rob
Re: NULL vs 0
On Wed, Sep 22, 2010 at 4:58 AM, Alex Rousskov rouss...@measurement-factory.com wrote: Squid will most likely not work if NULL is not false. 0xCDCDCDCD is not false. Consider: some_pointer = some_function_that_may_return_NULL(); if (!some_pointer) ... When compilers do that, they also translate the if expression appropriately. But they are also meant to handle NULL vs 0 transparently in that case, AIUI. -Rob
Re: new/delete overloading, why?
2010/8/21 Henrik Nordström hen...@henriknordstrom.net: Why are we overloading new/delete with xmalloc/xfree? include/SquidNew.h this is causing random linking issues every time some piece of code forgets to include SquidNew.h, especially when building helpers etc. And I fail to see what benefit we get from overloading the new/delete operators in this way. it was to stop crashes with code that had been cast and was freed with xfree(); if you don't alloc with a matching allocator, and the platform has a different default new - *boom*. There may be nothing like that left to worry about now. -Rob
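A rough sketch of the allocator-matching concern described above, under stated assumptions: `sketch_xmalloc` and `sketch_xfree` are hypothetical stand-ins for Squid's xmalloc/xfree, and the overload is scoped to a single class here to keep the example self-contained, whereas the thread is about the overloads in include/SquidNew.h. It is not the actual Squid code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Counts blocks handed out by the sketch allocator and not yet released.
static std::size_t liveBlocks = 0;

// Hypothetical stand-in for xmalloc(): malloc-backed, never returns null.
static void *sketch_xmalloc(std::size_t sz)
{
    void *p = std::malloc(sz ? sz : 1);
    if (!p)
        throw std::bad_alloc();
    ++liveBlocks;
    return p;
}

// Hypothetical stand-in for xfree().
static void sketch_xfree(void *p)
{
    if (!p)
        return;
    --liveBlocks;
    std::free(p);
}

// Routing new/delete through the same allocator means a block obtained
// with `new` and later released through sketch_xfree() (for example after
// being cast to void*) is still freed by the allocator that created it.
struct Record {
    int id;
    static void *operator new(std::size_t sz) { return sketch_xmalloc(sz); }
    static void operator delete(void *p) { sketch_xfree(p); }
};

// Allocates with new, releases through the C-style free path, and reports
// how many blocks remain outstanding.
static std::size_t liveAfterMixedRelease()
{
    Record *r = new Record;
    r->id = 1;
    void *opaque = r; // the "cast" case mentioned in the message
    sketch_xfree(opaque);
    return liveBlocks;
}
```

Record is trivially destructible, so skipping the destructor in the mixed path is harmless here; with a non-trivial type the mixed path would skip the destructor regardless of which allocator is used.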
Re: FYI: github
It's fine by me; we could push squid3 up as well using bzr-git, if folk are interested. -Rob
Re: Compliance: Improved HTTP Range header field validation.
+1 -Rob
Re: Marking uncached packets with a netfilter mark value
On Tue, Jun 22, 2010 at 8:52 AM, Andrew Beverley a...@andybev.com wrote: 1. Because the marking process needs to be run as root, can this only be achieved by putting the mark function within the squid process that originally starts up, and stipulate that this has to be run as root?
Consider a dedicated helper like the diskd helper - send it a fd using shm, and a mark to place, and have it make the call. This can be started up before squid drops privileges. Better still, do a patch to netfilter to allow non root capabilities here.
2. Is any such patch likely to be accepted?
Yes, modulo code quality, testing, cleanliness etc etc - all the usual concerns. -Rob
Re: food for thought?
2010/6/16 Kinkie gkin...@gmail.com: Actually the thing I found the most interesting is that it suggests to use page-aware object placements so that big structures traversal is easier on the VM. Could it be useful to adopt that for some of our low-level indexes? We do have a few hashes and trees laying around which maybe could benefit from this; and adopting an alternate algorithm for those may not have a big impact code-wise.. page-aware is only part of it - really, dig up cache oblivious algorithms. Lots of use and benefits :). -Rob
Re: Bug 2957 - only-if-cached shouldn't count when we're not caching
Well it sounds totally fine in principle; I'm wondering (without reading the patch) how you define 'we are not caching' - just no cachedirs? That excludes mem-only caching (or perhaps that's not supported now). -Rob
Re: food for thought?
Well it's written in an entertaining and a little condescending style. The class of algorithmic analysis that is relevant is 'cache oblivious algorithms' and is a hot topic at the moment. Well worth reading and thinking about. -Rob
Re: How to review a remote bzr branch
2010/5/24 Alex Rousskov rouss...@measurement-factory.com: On 05/22/2010 02:41 AM, Robert Collins wrote: What I do to review is usually 'bzr merge BRANCH'; bzr diff - and then eyeball. That's roughly what a 'proposed merge' in launchpad will show too. You could do that as an alternative. Noted. The merge command may produce conflicts that will not be immediately visible in bzr diff, right? Yes indeed. -Rob
Re: How to review a remote bzr branch
What I do to review is usually 'bzr merge BRANCH'; bzr diff - and then eyeball. That's roughly what a 'proposed merge' in launchpad will show too. You could do that as an alternative. -Rob
Re: Poll: Which bzr versions are you using?
What OS are you using? Upgrading to 2.0.x or newer would be advantageous. If there aren't packages for it, I'm sure we can figure out who to tickle to get some. -Rob
Re: Poll: Which bzr versions are you using?
2.2 :
Re: Poll: Which bzr versions are you using?
in general No; the .bzr.log in a users homedir will contain some info, possibly the client version: the server does receive that, I think. Over HTTP bzr is just another client - the user agent field includes the bzr version, I think, if you were to look in apache logs. -Rob
Re: Upgrade repository format for trunk?
On Thu, 2010-03-25 at 15:29 -0600, Alex Rousskov wrote: Sigh. I would rather not upgrade then. I do not know how to move from bzr 1.3 to bzr 2.0.x on Red Hat box that I have to use for some of the development, and I doubt somebody here would enjoy educating me on that process... Besides, even Ubuntu 9.10 only has bzr v2.0.2 by default. Thus, we would be cutting it pretty close to bleeding edge for many. Yes, 2.0.0 was relatively recent. Bzr folks are very good at making lots of releases but the world is apparently incapable of moving with the same speed! Well, we try to balance things; and we expect to stay with 2a for quite some time as a default - probably several years, as we did with 1.0. 2a is much more compact on disk, and faster across the board. But everyone will need to upgrade their own repositories, which can take a bit of time (or delete them and pull anew). If nothing else, this will require instruction on how to upgrade everyone's repositories. I can support the upgrade once those instructions work for me :-). http://doc.bazaar.canonical.com/bzr.2.1/en/upgrade-guide/index.html -Rob
Re: [PATCH] immortal helpers
On Sun, 2010-02-21 at 22:27 +0100, Henrik Nordström wrote: On Sat, 2010-02-20 at 18:25 -0700, Alex Rousskov wrote: The reasons you mention seem like a good justification for this option's official existence. I do not quite get the fork bomb analogy because we are not creating more than a configured number of concurrent forks, are we? We may create processes at a high rate but there is nothing exploding here, is there?
With our large in-memory cache index even two concurrent forks is kind of exploding on a large server. Consider for example the not unrealistic case of a 8GB cache index.. I actually have some clients with such indexes.
I have an idea about this. Consider a 'spawn_helper'. The spawn helper would be started up early, before index parsing. Never killed and never started again. It would have, oh, several hundred K footprint, at most. command protocol for it would be pretty similar to the SHM disk IO helper, but for processes. Something like:
squid->helper: spawn stderrfd argv(escaped/encoded to be line NULLZ string safe)
helper->squid: pid, stdinfd, stdoutfd
This would permit several interesting things:
- starting helpers would no longer need massive VM overhead
- we won't need to worry about vfork, at least for a while
- starting helpers can be really async from squid core processing (at the moment everything gets synchronised)
-Rob
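To make the proposed request line more concrete, here is a small C++ sketch of how the squid side might serialise a spawn request. Everything beyond the "spawn stderrfd argv" shape is an assumption: the message does not specify the escaping scheme, so percent-encoding is used purely as a placeholder, and `escapeArg` and `buildSpawnRequest` are hypothetical names. The reply (pid, stdinfd, stdoutfd) and the actual descriptor passing are not covered.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Assumed escaping: percent-encode any byte that would break a
// space-separated, newline-terminated request (controls, space, '%',
// DEL and non-ASCII bytes). The real encoding was left open in the thread.
static std::string escapeArg(const std::string &arg)
{
    std::string out;
    for (unsigned char c : arg) {
        if (c <= ' ' || c == '%' || c >= 0x7f) {
            char buf[4];
            std::snprintf(buf, sizeof(buf), "%%%02X", static_cast<unsigned>(c));
            out += buf;
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Builds one request line: "spawn <stderrfd> <arg0> <arg1> ...\n".
static std::string buildSpawnRequest(int stderrFd, const std::vector<std::string> &argv)
{
    std::string line = "spawn " + std::to_string(stderrFd);
    for (const std::string &arg : argv) {
        line += ' ';
        line += escapeArg(arg);
    }
    line += '\n';
    return line;
}
```

An empty argument would encode to an empty token in this sketch, so a real protocol would need an explicit representation for it.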
Re: [PATCH] immortal helpers
On Mon, 2010-02-22 at 02:03 +0100, Henrik Nordström wrote: On Mon, 2010-02-22 at 11:44 +1100, Robert Collins wrote: command protocol for it would be pretty similar to the SHM disk IO helper, but for processes. Something like: squid->helper: spawn stderrfd argv(escaped/encoded to be line NULLZ string safe) helper->squid: pid, stdinfd, stdoutfd
Which requires UNIX domain sockets for fd passing, and unknown implementation on Windows..
I thought SHM could pass around fd's. Anyhow, it's doable on unix. On windows you can supply a HANDLE cross process:
BOOL WINAPI DuplicateHandle( __in HANDLE hSourceProcessHandle, __in HANDLE hSourceHandle, __in HANDLE hTargetProcessHandle, __out LPHANDLE lpTargetHandle, __in DWORD dwDesiredAccess, __in BOOL bInheritHandle, __in DWORD dwOptions )
So you call that, and then the other process can use the handle. -Rob
Re: SMP: inter-process communication
On Sun, 2010-02-21 at 20:18 -0700, Alex Rousskov wrote: On 02/21/2010 06:10 PM, Henrik Nordström wrote: On Sun, 2010-02-21 at 17:10 -0700, Alex Rousskov wrote: The only inter-process cooperation I plan to support initially is N processes monitoring the same http_port (and doing everything else). I guess there will be no shared cache then? Not initially, but that is the second step goal. I suggest using CARP then, to route to backends. -Rob
Re: [PATCH] icap_oldest_service_failure option
+1
Re: Initial SMP implementation plan
JFDI :) -Rob
Re: [PATCH] Plain Surrogate/1.0 support
On Sun, 2010-02-07 at 00:52 +1300, Amos Jeffries wrote: According to the W3C documentation the Surrogate/1.0 capabilities and the Surrogate-Control: header are distinct from the ESI capabilities. This patch makes Squid always advertise and perform the Surrogate/1.0 capabilities for reverse-proxy requests. Full ESI support is no longer required to use the bare-bones Surrogate-Control: capabilities. A quick check though - is it still only enabled for accel ports? (It shouldn't be enabled for forward proxying). -Rob
Re: [PATCH] log virgin HTTP request headers
On Thu, 2010-01-28 at 09:09 -0700, Alex Rousskov wrote: On 01/28/2010 06:07 AM, Robert Collins wrote: Just a small thing: can I suggest s/virgin/pristine/ ? Or s/virgin/received/ ?
Pristine may work, but we (and other adaptation related documents) use virgin in many places already, including APIs. Received is a bad idea because an adapted message is also received.
virgin has a sexual connotation in some cultures, and can be confusing in a way that is avoidable.
Tell that to the Virgin Islands folks. :-)
It was a passing thought, I'm not like 'omg must be done' for this: I certainly knew what it meant, but as it's (AFAIK) the first config option in squid to specify virgins, I see potential for confusion :)
Other possibilities:
- source
- external
- unaltered
- original
I don't know that the config option has to meet the code, though obviously it is nice if it does. -Rob
Re: Squid-3 and HTTP/1.1
On Wed, 2010-01-27 at 22:49 -0700, Alex Rousskov wrote: c) Co-Advisor currently only tests MUST-level requirements. Old Robert's checklist contained some SHOULD-level requirements as well. I see that Sheet1 on the spreadsheet has SHOULDs. Are we kind of ignoring them (and Sheet1) for now, until all MUSTs on Sheet2 are satisfied? d) I do not know who created the spreadsheet. Whoever it was, thank you! Is there a script that takes Co-Advisor results and produces a spreadsheet column for cut-and-pasting? It looks nice. It might be based on the xls spreadsheet I made, but I don't know ;) I would not worry about SHOULD's until the MUSTs are done (but if a SHOULD is in reach while doing a MUST, doing it would be good). -Rob
Re: [PATCH] log virgin HTTP request headers
Just a small thing: can I suggest s/virgin/pristine/ ? Or s/virgin/received/ ? virgin has a sexual connotation in some cultures, and can be confusing in a way that is avoidable. -Rob
Re: [PATCH] Cloned Range memory leak
Looks good to me. -Rob
Re: patch to fix debugging output
Thanks, applied to trunk. -Rob
Re: [RFC] Micro Benchmarking
On Tue, 2009-12-22 at 16:05 +1300, Amos Jeffries wrote: The tricky bit appears to be recovering the benchmark output and handling it after a run. If you make each thing you want to get a measurement on a separate test, you could trivially install libcppunit-subunit-dev and use subunit's timing and reporting mechanisms; saving the stream provides a simple persistence mechanism. -Rob
Re: SMB help needed
On Sat, 2009-12-12 at 23:33 +0100, Henrik Nordstrom wrote: On Thu, 2009-12-10 at 00:34 +1300, Amos Jeffries wrote: A few months ago we had this argument out and decided to keep them for the people who still don't want to or can't install Samba.
Indeed. The SMB helpers are easier to get going as one does not need to join the domain or anything, just being able to speak to the SMB port of a server in the domain. But other than that the helpers are in quite crappy shape..
They are easier in that sense, but worse in the following:
- they put more load on the domain
- can't do NTLM reliably
- very old, very crufty code
I know we had the argument, but I'm not at all convinced that keeping them is the right answer. I think a better answer is to talk to samba to find out if winbindd can be used outside a domain, which is the only usecase these helpers are 'better' at, and if it can - or if it could be changed to do so, then do that, and get rid of the cruft as at that point it won't offer anything. -Rob
Re: [squid-users] 'gprof squid squid.gmon' only shows the initial configuration functions
On Tue, 2009-12-08 at 15:32 -0800, Guy Bashkansky wrote: I've built squid with the -pg flag and run it in the no-daemon mode (-N flag), without the initial fork(). I send it the SIGTERM signal which is caught by the signal handler, to flag graceful exit from main(). I expect to see meaningful squid.gmon, but 'gprof squid squid.gmon' only shows the initial configuration functions: gprof isn't terribly useful anyway - due to squid's callback based model, it will see nearly all the time belonging to the event loop. oprofile and/or squid's built in analytic timers will get much better info. -Rob
Re: SMB help needed
On Wed, 2009-12-09 at 17:40 +1300, Amos Jeffries wrote: During the helper conversion to C++ I found that the various SMB lookup helpers had a lot of duplicate code as each included the entire smbval/smblib validation library as inline code. Delete them. Samba project ships helpers that speak to winbindd and do a hellishly better job :-) -Rob
Re: Link convention fixes
Can you give an example of what you're talking about, and show the compiler warning you're getting too? (And what flags are needed to get it)? -Rob
Re: squid-smp: synchronization issue solutions
On Tue, 2009-11-24 at 16:13 -0700, Alex Rousskov wrote: For example, I do not think it is a good idea to allow a combination of OpenMP, ACE, and something else as a top-level design. Understanding, supporting, and tuning such a mix would be a nightmare, IMO.
I think that would be hard, yes.
See Henrik's email on why it is difficult to use threads at highest levels. I am not convinced yet, but I do see Henrik's point, and I consider the dangers he cites critical for the right Q1 answer.
- If we do *not* permit multiple approaches, then what approach do we want for parallelisation. E.g. a number of long lived threads that take on work, or many transient threads as particular bits of the code need threads. I favour the former (long lived 'worker' threads).
For highest-level models, I do not think that one job per thread/process, one call per thread/process, or any other one little short-lived something per thread/process is a good idea.
Neither do I. Short lived things have a high overhead. But consider that a queue of tasks in a single long lived thread doesn't have the high overhead of making a new thread or process per item in the queue. Using ACLs as an example, ACL checking is callback based nearly everywhere; we could have a thread that does ACL checking and free up the main thread to continue doing work. Later on, with more auditing we could have multiple concurrent ACL checking threads. -Rob
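As a generic illustration of the "queue of tasks in a single long lived thread" idea, here is a minimal C++11 sketch. It is not Squid code and makes no claim about how ACL checks would actually be handed off; `Worker` and `sumOnWorker` are hypothetical names, and the tasks are plain callables rather than ACL lookups.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// One long-lived thread that drains a queue of tasks. Posting a task costs
// a lock and a notify, not the creation of a new thread or process.
class Worker
{
public:
    Worker() : stopping_(false), thread_(&Worker::run, this) {}

    // Lets the thread finish every queued task, then joins it.
    ~Worker()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        thread_.join();
    }

    void post(std::function<void()> task)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        wake_.notify_one();
    }

private:
    void run()
    {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty())
                    return; // stopping was requested and the queue is drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task(); // run outside the lock so posters are not blocked
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_;
    std::thread thread_; // declared last so the other members exist before it starts
};

// Posts n small tasks to one worker and returns their combined result.
static int sumOnWorker(int n)
{
    int total = 0;
    {
        Worker worker;
        for (int i = 1; i <= n; ++i)
            worker.post([&total, i] { total += i; });
    } // the destructor drains the queue and joins the thread
    return total;
}
```

Only the worker thread touches `total` while it runs, and the join in the destructor orders those writes before the final read, so no extra locking is needed for the result.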
Re: [PATCH] logdaemon feature import from Squid-2.7
+1
Re: /bzr/squid3/trunk/ r10149: skip performing C libTrie unit tests
On Fri, 2009-11-20 at 21:02 +0100, Francesco Chemolli wrote: revno: 10149 committer: Francesco Chemolli kin...@squid-cache.org branch nick: trunk timestamp: Fri 2009-11-20 21:02:00 +0100 message: skip performing C libTrie unit tests Please include motivation in commit messages. The diff shows that you skipped the C tests, but not why. And because the motivation is missing, I'm left asking 'why?' Untested code is broken code, so this really can't be the right answer. -Rob
Re: /bzr/squid3/trunk/ r10149: skip performing C libTrie unit tests
On Sat, 2009-11-21 at 01:19 +0100, Kinkie wrote: And because the motivation is missing, I'm left asking 'why?' Untested code is broken code, so this really can't be the right answer. Ok, I'll be more detailed in the future. As for this case: the autoconf environment is pretty messy in non-canonicalized environments such as Solaris. In particular, the C tests require linking against libstdc++, which causes two kind of issues: finding it (there's at least 6 of the buggers in the test zone), and finding the one with the right ABI (gcc/sunstudio cc). Since we're not using the C interface anyways, might as well skip the checks. I'd much rather you delete code that isn't being tested. -Rob
Re: [PATCH] replace RFC2181 magic numbers with POSIX definitions
Seems ok at the code level. I worry a little that on damaged systems this will reduce our functionality. At the moment we have our own DNS lookup so the system host length shouldn't matter at all. -Rob
Re: squid-smp: synchronization issue solutions
On Wed, 2009-11-18 at 10:46 +0800, Adrian Chadd wrote: Plenty of kernels nowadays do a bit of TCP and socket process in process/thread context; so you need to do your socket TX/RX in different processes/threads to get parallelism in the networking side of things. Very good point. You could fake it somewhat by pushing socket IO into different threads but then you have all the overhead of shuffling IO and completed IO between threads. This may be .. complicated. The event loop I put together for -3 should be able to do that without changing the loop - just extending the modules that hook into it. -Rob
Re: squid-smp: synchronization issue solutions
On Tue, 2009-11-17 at 15:49 -0300, Gonzalo Arana wrote: In my limited squid experience, cpu usage is hardly a bottleneck. So, why not just use smp for the cpu/disk-intensive parts? The candidates I can think of are: * evaluating regular expressions (url_regex acls). * aufs/diskd (squid already has support for this). So, we can drive squid to 100% CPU in production high load environments. To scale further we need: - more cpus - more performance from the cpu's we have Adrian is working on the latter, and the SMP discussion is about the former. Simply putting each request in its own thread would go a long way towards getting much more bang for buck - but that's not actually trivial to do :) -Rob
Re: squid-smp: synchronization issue solutions
On Mon, 2009-11-16 at 00:29 +0530, Sachin Malave wrote: Hello, Since last few days i am analyzing squid code for smp support, I found one big issue regarding debugs() function, It is very hard get rid of this issue as it is appearing at almost everywhere in the code. So for testing purpose i have disable the debug option in squid.conf as follows --- debug_options 0,0 --- Well this was only way, as did not want to spend time on this issue.
It's very important that debugs works.
1. hash_link LOCKED
Bad idea, not all hashes will be cross-thread, so making the primitive lock incurs massive overhead for all threads.
2. dlink_list LOCKED
Ditto.
3. ipcache, fqdncache LOCKED,
Probably important.
4. FD / fde handling ---WELL, SEEMS NOT CREATING PROBLEM, If any then please discuss.
we need analysis and proof, not 'seems to work'.
5. statistic counters --- NOT LOCKED ( I know this is very important, But these are scattered all around squid code, Right now they may be holding wrong values)
Will need to be fixed.
6. memory manager --- DID NOT FOLLOW
Will need attention, e.g. per thread allocators.
7. configuration objects --- DID NOT FOLLOW
ACL's are not threadsafe.
AND FINALLY, Two sections in EventLoop.cc are separated and executed in two threads simultaneously as follows (#pragma lines added in existing code, no other changes)
I'm not at all sure that splitting the event loop like that is sensible. Better to have the dispatcher dispatch to threads. -Rob
Re: [RFC] Libraries usage in configure.in and Makefiles
On Wed, 2009-11-11 at 18:38 +1300, Amos Jeffries wrote: Why? A: The squid binary is topping 3.5MB in footprint with many of the small tools topping 500KB each. A small but substantial amount of it is libraries linked but unused. Really? dynamically linked libraries should be tiny. -Rob
Re: [RFC] Libraries usage in configure.in and Makefiles
On Wed, 2009-11-11 at 19:43 +1300, Amos Jeffries wrote: Robert Collins wrote: On Wed, 2009-11-11 at 18:38 +1300, Amos Jeffries wrote: Why? A: The squid binary is topping 3.5MB in footprint with many of the small tools topping 500KB each. A small but substantial amount of it is libraries linked but unused. Really? dynamically linked libraries should be tiny. -Rob In the on-disk binary itself yes ... yet they load into memory under the parent apps footprint. The pages for the libraries will be shared though, so it really doesn't matter. -Rob
Re: /bzr/squid3/trunk/ r10069: Fixed OpenSolaris build issues.
On Tue, 2009-11-03 at 13:19 +1300, Amos Jeffries wrote: Just a note on this change... I'm trying to get rid of XTRA_LIBS entirely. It's adding to the code bloat and external library dependencies in Squid. This isn't clear to me, can you expand on it please. Specifically how XTRA_LIBS - the list of additional libraries to link against - is different to COMMON_LIBS. -Rob
Re: WebSockets negotiation over HTTP
On Wed, 2009-10-14 at 09:59 +1100, Mark Nottingham wrote: On 13/10/2009, at 10:23 PM, Ian Hickson i...@hixie.ch wrote: I want to just use port 80, and I want to make it possible for a suitably configured HTTP server to pass connections over to WebSocket servers. It seems to me that using something that looks like an HTTP Upgrade is better than just having a totally unrelated handshake, but I guess maybe we should just reuse port 80 without doing anything HTTP-like at all. To be clear, upgrade is appropriate for changing an existing connection over to a new protocol (ie reusing it). To pass a request over to a different server, a redirect would be more appropriate (and is facilitated by the new uri scheme). Yup; and the major issue here is that websockets *does not want* the initial handshake to be HTTP. Rather it wants to be something not-quite HTTP, specifically reject a number of behaviours and headers that are legitimate HTTP. -Rob
wiki, bugzilla, feature requests
AIUI we use the wiki [over and above being a source of docs] to /design features/ and manage [the little] scheduling that we, as volunteers can do. I think that's great. However, we also have many bugs that are not strictly-current-defects. They are wishlist items. What should we do here? I've spent quite some time using wikis for trying to manage such things, I think it's a lost cause. Use them for design and notes and so forth, but not for managing metadata. I suggest that when there is a bug for a feature that is being designed in the wiki, just link to bugzilla from that wiki page. And for management of dependencies and todos, assignees and so forth, we should do it in bugzilla, which is *designed* for that. -Rob
Re: wiki, bugzilla, feature requests
I'm proposing: - if there is a bug for something, and a wiki page, link them together. - scheduling, assignment, and dependency data should be put in bugs - whiteboards to sketch annotate document etc should always be in the wiki -Rob
Re: [MERGE] Clean up htcp cache_peer options collapsing them into a single option with arguments
On Fri, 2009-09-18 at 00:22 +0200, Henrik Nordstrom wrote: the list of HTCP mode options had grown a bit too large. Collapse them all into a single htcp= option taking a list of mode flags. It's not clear from the docs whether folk should do htcp=foo htcp=bar or htcp=foo,bar -Rob
Re: [MERGE] Clean up htcp cache_peer options collapsing them into a single option with arguments
+1 then
Re: Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?
On Tue, 2009-09-15 at 16:09 +1000, Adrian Chadd wrote: But in that case, ACCESS_REQ_PROXY_AUTH would be returned rather than ACCESS_DENIED.. Right... so can we have some more details about what is happening and what you expect? deny !proxy_auth_group != allow proxy_auth_group and deny proxy_auth_group != allow !proxy_auth_group -Rob
Re: compute swap_file_sz before packing it
Just a small meta point: The new function you're adding looks like it should be a method to me. -Rob
Re: Squid-smp : Please discuss
On Tue, 2009-09-15 at 14:27 +1200, Amos Jeffries wrote: RefCounting done properly forms a lock on certain read-only types like Config. Though we are currently handling that for Config by leaking the memory out every gap. SquidString is not thread-safe. But StringNG with its separate refcounted buffers is almost there. Each thread having a copy of StringNG sharing a SBuf equates to a lock with copy-on-write possibly causing issues we need to look at if/when we get to that scope. General rule: you do /not/ want thread safe objects for high usage objects like RefCount and StringNG. synchronisation is expensive; design to avoid synchronisation and hand offs as much as possible. -Rob
Re: Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?
On Tue, 2009-09-15 at 15:22 +1000, Adrian Chadd wrote: G'day. This question is aimed mostly at Henrik, who I recall replying to a similar question years ago but without explaining why. Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on a denied ACL? The particular bit in src/client_side.c: int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent; Is there any particular reason why auth is tried again? it forces a pop-up on browsers that already have done authentication via NTLM. Because it should? Perhaps you can expand on where you are seeing this - I suspect a misconfiguration or some such. It's entirely appropriate to signal HTTP_PROXY_AUTHENTICATION_REQUIRED when a user is denied access to a resource *and if they log in differently they could get access*. -Rob
Re: Build failed in Hudson: 3.1-amd64-CentOS-5.3 #14
Amos Jeffries wrote: You misunderstood me. On FreeBSD from what I've seen of squid-cache the md5sum 'binary' is: /path/to/pythonversion /path/to/md5sum.py 'md5' -Rob
Re: Build failed in Hudson: 3.1-amd64-CentOS-5.3 #14
On Sun, 2009-09-06 at 22:52 +0200, Henrik Nordstrom wrote: Yes. With no dependencies to guide it make reorders as it sees fit for the day, especially if running parallel jobs. We should be fine if we just list the dependencies. -Rob
Re: Build failed in Hudson: 3.1-amd64-CentOS-5.3 #14
On Sun, 2009-09-06 at 23:25 +0200, Henrik Nordstrom wrote: Not sure I am comfortable with adding dependencies to automakes own targets.. and no, we are not fine with just that. See the rest of previous response.. To repeat there is more issues here than uninstall racing, we don't even install properly if someone tries to override the default locations by make DEFAULT_CONFIG_FILE=... DEFAULT_MIME_TABLE=... We should definitely fix that :). If it's plausible that automake should know and avoid the problem itself, let's file a bug. Otherwise we'll have to work around it I guess. -Rob
Re: WebSockets negotiation over HTTP
On Fri, 2009-09-04 at 01:44 +, Ian Hickson wrote: One very real example of this would be the web server or an fully WebSocket capable intermediary sending back bytes ... example #1 suppose there was an intermediary translating websockets-over-http to websockets-port-81 which used HTTP to format said headers of confirmation. Here is the problem I have with this: Why would we suppose the existence of such an intermediary? Why would anyone ever implement a WebSocket-specific proxy? Welcome to the internet :). Seriously, Why would we suppose the existence of an intermediary that intercepts and MITM's SSL connections? Or HTTP - it's even got a defined proxy facility, there is no need to take over and insert different behaviour, is there? We *should* assume that firewall vendors and ISP's will do arguably insane things, because they have time and again in the past. example #2 is where the traffic is processed by an HTTP-only intermediary which sees the 'Upgrade:' header and flags the connection for transparent pass-thru (This by the way is the desirable method of making Squid support WebSockets). Being a good HTTP relay it accepts these bytes: HTTP/1.1 101 Web Socket Protocol Handshake Upgrade: WebSocket Connection: Upgrade It violates HTTP by omitting the Via and other headers your spec omits to handle. And passes these on: HTTP/1.1 101 Web Socket Protocol Handshake Connection: Upgrade Upgrade: WebSocket then moves to tunnel mode for you. Why would it violate HTTP in all the ways you mention? If it goes to such extreme lengths to have transparent pass-through to the point of violating HTTP, why would it then go out of its way to reorder header lines? Because sysadmins do this! Don't ask us to justify the weird and wonderful things we run into. Nearly daily we have users asking for help doing similar things in #squid. From my perspective, such a proxy would raise all kinds of alarm bells to me, and I would be _glad_ that the connection failed. 
If it didn't, I wouldn't be sure we could trust the rest of the data. Again, welcome to the internet :P. Seriously, if it's not digitally signed, you *can't* trust the rest of the data. The MITM isn't the WebSocket client. In this situation, it's a (non-compliant, since it forwarded hop-by-hop headers) HTTP proxy. What's more, in this scenario the server isn't a WebSocket server, either, it's an HTTP server. So what Web Socket says is irrelevant. Note that there are *still* HTTP/1.0 proxies in deployment that don't know about hop by hop headers. ... WebSockets, even when initiating the connection by talking to an HTTP server and then switching to WebSockets, isn't layered on HTTP. It's just doing the bare minimum required to allow the HTTP server to understand the request and get out of the way. Once the handshake is complete, there is no HTTP anywhere on the stack at all. It's not doing the bare minimum. The bare minimum would be to accept the valid HTTP transforms which the Internet _will_ perform on the handshake. Discard those useless transforms and validate the handshake status line. The bare minimum is the least amount of processing possible. What you describe is _more_ processing, not less. Therefore it's not the minimum. The bare minimum would be to just start the tcp connection with 'websocket/1.0\r\n' Stop pretending to be HTTP: use port 80. Our whole point has been, if you want to Use HTTP Upgrade, Do It Correctly. If you want to Use port 80, just do that. On Thu, Jul 30 2009, Henrik Nordstrom wrote: But for 2 you need to use HTTP, which essentially boils down to defining that one may switch a webserver to use WebSockets by using the HTTP Upgrade mechanism as defined by HTTP. I understand that you disagree with my interpretation, but my interpretation is that this is exactly what the spec does already. At this point I think we need to go to the HTTP-WG and discuss further. Most of us are already there... 
5) Specific mention is made to ignore non-understood headers added randomly by intermediaries.

So long as that happens after the handshake, that's ok, but we can't allow that inside the handshake, it would allow for smuggling data through,

If having this view then you CAN NOT use HTTP for the Upgrade handshake, and MUST use another port and other protocol signatures.

IANA expert review has informed me that I must use ports 80 and 443, so I don't have a choice here.

What's the message ID, and on what list? I'm extremely happy to jump into that conversation.

-Rob

signature.asc Description: This is a digitally signed message part
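A minimal sketch of the "accept the valid HTTP transforms" position argued in the message above: check the status line, then look headers up by name, so that reordering, case changes and extra headers such as Via do not break the handshake. The function name `looksLikeWebSocketUpgrade` and the helpers are hypothetical; this is not code from Squid or from the WebSocket draft, and it assumes the response has already been split into lines.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Lower-case an ASCII string so names and tokens compare case-insensitively.
static std::string lower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Strip leading and trailing spaces and tabs.
static std::string trim(const std::string &s)
{
    const std::string::size_type first = s.find_first_not_of(" \t");
    if (first == std::string::npos)
        return "";
    const std::string::size_type last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: true if the lines form a "101 + Upgrade: WebSocket"
// response, regardless of header order or of headers added by intermediaries.
bool looksLikeWebSocketUpgrade(const std::vector<std::string> &lines)
{
    // Validate the status line first; the reason phrase is not inspected.
    if (lines.empty() || lines[0].compare(0, 13, "HTTP/1.1 101 ") != 0)
        return false;

    // Collect headers by lower-cased name; lines without a colon are skipped.
    std::map<std::string, std::string> headers;
    for (std::size_t i = 1; i < lines.size(); ++i) {
        const std::string::size_type colon = lines[i].find(':');
        if (colon == std::string::npos)
            continue;
        headers[lower(trim(lines[i].substr(0, colon)))] =
            lower(trim(lines[i].substr(colon + 1)));
    }

    // Simplification: Connection is treated as a single token, not a list.
    return headers.count("upgrade") && headers["upgrade"] == "websocket" &&
           headers.count("connection") && headers["connection"] == "upgrade";
}
```

With this shape, both header orders quoted in example #2 are accepted, and a response carrying an extra Via header still passes, while a non-101 status line is rejected.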
Re: R: Squid 3 build errors on Visual Studio - problem still present
On Sun, 2009-08-30 at 09:48 +0200, Guido Serassio wrote:

c:\work\nt-3.0\src\SquidString.h(98) : error C2057: expected constant expression

The offending code is:

const static size_type npos = std::string::npos;

Can you find out what std::string::npos is defined as in your compiler's headers?

Thanks, Rob
Re: R: R: Squid 3 build errors on Visual Studio - problem still present
On Sun, 2009-08-30 at 18:13 +0200, Guido Serassio wrote:

Hi, I don't know what std::string::npos is, and so I don't know what to look for.

http://www.cplusplus.com/reference/string/string/npos/

It should be a static const, which is why I'm so surprised you're getting an error about it.

-Rob
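For reference, a reduced sketch of the pattern under discussion. An in-class initializer for a static const integral member has to be an integral constant expression, so the reported line compiles only if the library's own std::string::npos is usable as one. Whether the Visual Studio headers in question define it in a way that is not is an assumption here, not something confirmed in this thread. `MiniString` is a hypothetical stand-in for the class in SquidString.h, and spelling the value out is one possible workaround, not necessarily the fix that was adopted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the String class in SquidString.h.
class MiniString
{
public:
    typedef std::size_t size_type;

    // The form from the error report; it needs std::string::npos to be
    // usable in a constant expression:
    //   const static size_type npos = std::string::npos;

    // Workaround sketch: spell out the "largest size_type" value directly.
    static const size_type npos = static_cast<size_type>(-1);

    // Toy search that reports "not found" through npos.
    static size_type find(const std::string &haystack, char needle)
    {
        for (size_type i = 0; i < haystack.size(); ++i)
            if (haystack[i] == needle)
                return i;
        return npos;
    }
};

// Out-of-class definition, needed if npos is ever bound to a reference
// or has its address taken.
const MiniString::size_type MiniString::npos;
```

The cast relies on unsigned wrap-around, which is the same value the standard specifies for std::basic_string::npos.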
Re: [PATCH] DiskIO detection cleanup.
I haven't read the patch yet, but I concur with Henrik. We should default-enable as much as possible: An unused DiskIO module has almost no footprint - simply a registration entry and a link dependency on $whatever was needed. Rebuilding your squid because it doesn't have what you need is _much_ more painful. Even the smallest embedded devices are pretty huge these days, so the extra libs don't concern me. -Rob
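To illustrate the "registration entry" point, here is a minimal sketch of a self-registering module table. Every name in it (`IoModule`, `Registrar`, `registry`, `findModule`) is hypothetical and the module names are only illustrative; this is not Squid's actual DiskIO code. It only shows that a compiled-in module nobody selects costs one table entry until something looks it up.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical module interface; real modules would add open/read/write hooks.
struct IoModule {
    virtual ~IoModule() {}
    virtual const char *type() const = 0;
};

// Function-local static, so the table exists before any registrar uses it.
std::map<std::string, IoModule *> &registry()
{
    static std::map<std::string, IoModule *> modules;
    return modules;
}

// Registers a module under its type name at static-initialization time.
struct Registrar {
    explicit Registrar(IoModule *m) { registry()[m->type()] = m; }
};

// Two illustrative modules; each costs one object and one table entry.
struct BlockingModule : IoModule {
    const char *type() const override { return "Blocking"; }
};
struct ThreadsModule : IoModule {
    const char *type() const override { return "DiskThreads"; }
};

static BlockingModule blockingInstance;
static Registrar blockingRegistrar(&blockingInstance);
static ThreadsModule threadsInstance;
static Registrar threadsRegistrar(&threadsInstance);

// Look a module up by name; returns nullptr if it was not compiled in.
IoModule *findModule(const std::string &name)
{
    const std::map<std::string, IoModule *>::iterator it = registry().find(name);
    return it == registry().end() ? nullptr : it->second;
}
```

A configuration that asks for a module missing from the table gets a null result at run time, which is the "rebuild because it doesn't have what you need" case the message wants to avoid.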
Re: Alternate http repository for squid3
On Fri, 2009-08-21 at 09:35 +0200, Kinkie wrote:

On Fri, Aug 21, 2009 at 7:42 AM, Robert Collins robe...@robertcollins.net wrote:

On Thu, 2009-08-20 at 14:07 +0200, Kinkie wrote:

As part of the ongoing buildfarm work, I've published onto http the (read-only) repository mirror that's hosted on eu. It's available at http://www.eu.squid-cache.org/bzr/squid3/

How often is this synced?

6 hours. rsync-based cron job on eu, user bzr.

If we're going to build test from it, it probably needs to be synced on-demand.

-Rob
Re: Alternate http repository for squid3
On Tue, 2009-08-25 at 11:09 +1200, Amos Jeffries wrote:

On Tue, 25 Aug 2009 04:20:31 +1000, Robert Collins robe...@robertcollins.net wrote:

On Fri, 2009-08-21 at 09:35 +0200, Kinkie wrote:

On Fri, Aug 21, 2009 at 7:42 AM, Robert Collins robe...@robertcollins.net wrote:

On Thu, 2009-08-20 at 14:07 +0200, Kinkie wrote:

As part of the ongoing buildfarm work, I've published onto http the (read-only) repository mirror that's hosted on eu. It's available at http://www.eu.squid-cache.org/bzr/squid3/

How often is this synced?

6 hours. rsync-based cron job on eu, user bzr.

If we're going to build test from it, it probably needs to be synced on-demand.

Or at least hourly before the SCM polling happens. That frequency has worked so far for the SourceForge CVS copy.

The lower the latency the better. That allows change-test cycles to be short, when dealing with build environments we don't locally have.

-Rob
Re: Alternate http repository for squid3
On Thu, 2009-08-20 at 14:07 +0200, Kinkie wrote:

As part of the ongoing buildfarm work, I've published onto http the (read-only) repository mirror that's hosted on eu. It's available at http://www.eu.squid-cache.org/bzr/squid3/

How often is this synced?

-Rob
RFC: infrastructure product in bugzilla
I think we should have an infrastructure product in bugzilla, for tracking list/server/buildfarm etc issues. -Rob
Re: RFC: infrastructure product in bugzilla
On Thu, 2009-08-20 at 11:33 +1200, Amos Jeffries wrote:

On Thu, 20 Aug 2009 09:00:15 +1000, Robert Collins robe...@robertcollins.net wrote:

I think we should have an infrastructure product in bugzilla, for tracking list/server/buildfarm etc issues.

What sort of extra issues exactly are you thinking need to be bug-tracked? We already have websites as a separate 'product' for tracking content errors.

Oh hmm, perhaps just renaming websites -> infrastructure.

We have a bunch of services:
- smtp
- lists
- backups?
- user accounts on squid-cache.org machines (eu, us, test vms, others?)
- VCS
- code review

And a wide range of webbish services:
- the CDN
- bugzilla (currently xmlrpc doesn't work)
- the main site content
- patch set generation
- wiki

-Rob
Re: separate these to new list?: Build failed...
On Sun, 2009-08-16 at 04:05 +0200, Henrik Nordstrom wrote:

On Sun, 2009-08-16 at 10:23 +1000, Robert Collins wrote:

If the noise is too disturbing to folk we can investigate these... I wouldn't want anyone to leave the list because of these reports.

I would expect the number of reports to decline significantly as we learn to check commits better to avoid getting flamed in failed build reports an hour later.. combined with the filtering just applied which already reduced it to 1/6. But seriously, it would be a sad day if these reports become so frequent compared to other discussions that developers no longer would like to stay subscribed. We then have far more serious problems..

Discussion on this list can be quite sporadic, and it's easy for build message volume to be a significant overhead - at least that's my experience in other projects - lists which have unintentional traffic feel hard to deal with. This includes bug mail, build mail, automated status reports and so on.

Secondly, I wager that many folk on this list are not regular committers and are unlikely to hop up and fix a build failure; so it's not really the right balance for them to be hearing about failures.

I think it makes sense to have a dedicated list (squild-bui...@squid-cache.org) for build status activity. I probably won't be on it, for instance. (I prefer to track such data via rss feeds - they don't grab my attention when I'm in the middle of something else, but the data is there and I can still look at and fix things).

-Rob
buildfarm builds for squid-2?
I thought I'd just gauge interest in having 2.HEAD and 2.CURRENT tested in the buildfarm. For all that most development is focused on 3, there are still commits being done to 2.x, and most of the hard work in the buildfarm is setup - which is now done. -Rob
Re: separate these to new list?: Build failed...
We do have other options though:
- we could have a separate list
- we have the RSS feed anyone can subscribe to
- we could cause failing builds to file bugs.

If the noise is too disturbing to folk we can investigate these... I wouldn't want anyone to leave the list because of these reports.

-Rob