Re: SBuf review at r9370
On 02/26/2009 10:24 AM, Kinkie wrote: On Thu, Feb 26, 2009 at 1:46 AM, Alex Rousskov rouss...@measurement-factory.com wrote: http://bazaar.launchpad.net/~kinkie/squid/stringng at r9370. * Remove isLiteral and the corresponding code/logic. We might add this very minor performance optimization as a non-public Blob member once the code is known to work well. For now, the code is confusing enough without it. The Literal thing was actually added to answer to a very specific need. IIRC it was because without it valgrind complained about leaked memory in case of a global static SBuf. If you're sure, I'll remove it. I am even more sure now that I know the reason you added it. :-) * Remove isImported. Copy and then free the buffer when importing instead. Same motivation as in the isLiteral comment above. This too has a IMO a very practical use: it allows us an easy path to get into the SBuf world i/o buffer which have been created by the i/o layer. If you're sure, I'll remove it. I understand that it can be a useful optimization. I am sure we should replace that optimization with copying (in absorb()) for now because we already have enough bugs to worry about without this special case. Special cases like this significantly increase the probability of a bug left unnoticed by a reviewer, IMO. * Some throw() declarations are still left in the code. Grep for them. Are you sure? There's only one that I could find, in the virtual destructor for OutOfBoundsException, and in that case it's needed to match the signature of TextException. I forgot that standard exception class forces us to use throw(). I guess this one will have to stay then. * cow() logic feels wrong because grow() does not just grow(). Grow is discussed below. The cow() code should be something very simple like: if (we are alone) return false; // no need to copy store_ = new MemBuf(*store_); // get ourselves an exclusive copy of the blob return false; No grow() magic -- we are getting an exclusive copy, not growing something. MemBlob has disabled copy-constructor and assignment operator by choice, as it gets handled through (refcount) pointers. (BTW: is this a violation of the coding guidelines?) It's certainly possible to add it, it's just a different style. The only extra operation performed by the current implementation is to run a re-evaluation of the blob store sizing heuristics: at the end it's a cheap operation compared to the copying, and since we're copying we may as well check it. I'll change if you confirm me that you have also considered this side of the argument. Here, I am not complaining about what happens under the hood. I am complaining about the on-the-surface logic of cow(). It is broken. Whether the implementation actually works is irrelevant. There should be no grow() in cow(). So you need to replace grow() with copying. If we do not want to expose a MemBuf copy constructor, you can: - Implement MemBuf copy constructor as a private member. - Add MemBuf::Copy(const MemBuf mb) that returns a Pointer to new MemBuf (mb). This approach is both clean and efficient. The copy constructor can optimize, but please do keep it simple. My suggestion: reintroduce reAlloc() which gets called by cow() and does the low-level work, and have both cow() and grow() perform checks, heuristics and call reAlloc(). How would you like this? I believe the Copy approach above is superior as far as cow() is concerned. I will have to revisit grow() separately. * estimateCapacity() is too complex to understand its performance effects, especially since nreallocs is a very strange animal. I suggest that we keep it very simple and add optimizations later, if needed. Something like granularity = desired PageSize ? 16 : PageSize; return roundto(desired, granularity) is much easier to comprehend, analyze, and optimize. Let's postpone complex optimizations. I agree, with a SLIGHTLY more complex design to match the current standard MemPool String sizes. Stuff like: desired *= constant_growth_factor; if (desired ShortStringsSize) return ShortStringsSize; if (desired MediumStringsSize) return ShortStringsSize; if (desired BigStringsSize) return BigStringsSize; if (desired 2kBuffersSize) return BigStringsSize; Can you add some static(?) SuggestMemSize() method to the String pool and call it from SBuf? We should not duplicate the logic of string-related memory size calculations, IMO. return roundto(desired,PageSize); We may want to discuss whether squidSystemPageSize should be static to this module (as it is now) or belongs to the global namespace. Does any other Squid code use it? If yes, it would be nice to store/calculate it in a single, shared location. * startsWith() is buggy -- it ignores the contents of its parameter's buffer. Does it? It does. s/*this/S/. * SBuf::consume()
Re: Feature: quota control
I'm looking at implementing this as part of a contract for squid-2. I was going to take a different approach - that is, i'm not going to implement quota control or management in squid; I'm going to provide the hooks to squid to allow external controls to handle the quota. adrian 2009/2/21 Pieter De Wit pie...@insync.za.net: Hi Guys, I would like to offer my time in working on this feature - I have not done any squid dev, but since I would like to see this feature in Squid, I thought I would take it on. I have briefly contacted Amos off list and we agreed that there is no set in stone way of doing this. I would like to propose that we then start throwing around some ideas and let's see if we can get this into squid :) Some ideas that Amos quickly said : - Based on delay pools - Use of external helpers to track traffic The way I see this happening is that a Quota is like a pool that empties based on 2 classes - bytes and requests. Requests will be for things like the number of requests, i.e. a person is only allowed to download 5 exe's per day or 5 requests of 1meg or something like that (it just popped into my head :) ) Bytes is a pretty straight forward one, the user is only allowed x amount of bytes per y amount of time. Anyways - let the ideas fly :) Cheers, Pieter
Re: Feature: quota control
Isn't requests really just an external acl helper? On 27/02/2009, at 8:36 AM, Adrian Chadd wrote: I'm looking at implementing this as part of a contract for squid-2. I was going to take a different approach - that is, i'm not going to implement quota control or management in squid; I'm going to provide the hooks to squid to allow external controls to handle the quota. adrian 2009/2/21 Pieter De Wit pie...@insync.za.net: Hi Guys, I would like to offer my time in working on this feature - I have not done any squid dev, but since I would like to see this feature in Squid, I thought I would take it on. I have briefly contacted Amos off list and we agreed that there is no set in stone way of doing this. I would like to propose that we then start throwing around some ideas and let's see if we can get this into squid :) Some ideas that Amos quickly said : - Based on delay pools - Use of external helpers to track traffic The way I see this happening is that a Quota is like a pool that empties based on 2 classes - bytes and requests. Requests will be for things like the number of requests, i.e. a person is only allowed to download 5 exe's per day or 5 requests of 1meg or something like that (it just popped into my head :) ) Bytes is a pretty straight forward one, the user is only allowed x amount of bytes per y amount of time. Anyways - let the ideas fly :) Cheers, Pieter -- Mark Nottingham m...@yahoo-inc.com
Re: Feature: quota control
Key things I'd look for: - quota's shouldn't be lost when squid restarts - people should be able to use external quota systems (they may have e.g. netflow or other systems tracking direct b/w use, and squid is pulling from those allowances when downloads are caused by a given user). Both of which are nicely solved by Adrians suggestion of making sure there are appropriate hooks in squid to let other software actually do quotas. It would be nice to ship a default client for those hooks that does something simple (like monthly quotas with a simple script to report on current totals/reset). But even that is probably too much in core - it should probably be a parallel project, like the reference icap client is. (And isn't iCap enough to do quotas?, or is it too heavyweight? Perhaps look at the in-squid iCap-alike Alex has been working on?) -Rob signature.asc Description: This is a digitally signed message part
Re: Feature: quota control
Honestly, if I wanted to do byte-based quotas today, I'd have an external ACL helper talking to an external logging helper; that way, you can just log the response sizes to a daemon and then another daemon would use that information to make a decision at access time. The only even mildly hard part about this is sharing state between the daemons, but if you don't need the decisions to be real-time, it's not that bad (especially considering that in any serious deployment, you'll have state issues between multiple boxes anyway). Squid modifications that would help this (and similar use cases): 1) Allow multiple, different external loggers to be defined and used. 2) Harmonise the interfaces/configuration of *all* external helpers, so that you can pass arbitrary args to each, using the same syntax. I'm looking at you, redirectors. 3) Reduce the overhead of using external helpers, if possible. A *lot* of the customisation that I do for Squid is based on uses of helpers like this; they're actually very powerful and flexible, if you know what you're doing with them. I would very much like to see Squid turn them into more of a strength... On 27/02/2009, at 9:30 AM, Robert Collins wrote: Key things I'd look for: - quota's shouldn't be lost when squid restarts - people should be able to use external quota systems (they may have e.g. netflow or other systems tracking direct b/w use, and squid is pulling from those allowances when downloads are caused by a given user). Both of which are nicely solved by Adrians suggestion of making sure there are appropriate hooks in squid to let other software actually do quotas. It would be nice to ship a default client for those hooks that does something simple (like monthly quotas with a simple script to report on current totals/reset). But even that is probably too much in core - it should probably be a parallel project, like the reference icap client is. (And isn't iCap enough to do quotas?, or is it too heavyweight? Perhaps look at the in-squid iCap-alike Alex has been working on?) -Rob -- Mark Nottingham m...@yahoo-inc.com
Re: Feature: quota control
I was talking about request number quotas (e.g., you can make n requests in m minutes). Regarding byte quotes -- see subsequent message -- I disagree that you need eCAP :) On 27/02/2009, at 10:00 AM, Kinkie wrote: On Thu, Feb 26, 2009 at 11:23 PM, Mark Nottingham m...@yahoo- inc.com wrote: Isn't requests really just an external acl helper? Not really.. an external ACL helper would need to do real-time parsing of the logs to really know how much each client downloaded, as AFAIK both request and reply acl's are evaluated BEFORE the actual transfer takes place. Only eCAP probably has the right hooks. -- /kinkie -- Mark Nottingham m...@yahoo-inc.com
Re: Feature: quota control
On 02/26/2009 04:04 PM, Mark Nottingham wrote: I was talking about request number quotas (e.g., you can make n requests in m minutes). Regarding byte quotes -- see subsequent message -- I disagree that you need eCAP :) eCAP is one way of doing byte- and/or count- quotas, but there are, of course, other approaches as well. I am still not very excited about the slow bloat of redirectorsCo API towards duplication of *CAP functionality, but perhaps it is inevitable. If it is, a clean unified interface among all such one-line external helpers would make that duplication a lot less painful. Cheers, Alex. On 27/02/2009, at 10:00 AM, Kinkie wrote: On Thu, Feb 26, 2009 at 11:23 PM, Mark Nottingham m...@yahoo-inc.com wrote: Isn't requests really just an external acl helper? Not really.. an external ACL helper would need to do real-time parsing of the logs to really know how much each client downloaded, as AFAIK both request and reply acl's are evaluated BEFORE the actual transfer takes place. Only eCAP probably has the right hooks. -- /kinkie -- Mark Nottingham m...@yahoo-inc.com
Re: Feature: quota control
On Thu, Feb 26, 2009 at 11:23 PM, Mark Nottingham m...@yahoo-inc.com wrote: Isn't requests really just an external acl helper? Not really.. an external ACL helper would need to do real-time parsing of the logs to really know how much each client downloaded, as AFAIK both request and reply acl's are evaluated BEFORE the actual transfer takes place. Only eCAP probably has the right hooks. -- /kinkie
Re: SBuf review at r9370
Kinkie wrote: On Thu, Feb 26, 2009 at 1:46 AM, Alex Rousskov rouss...@measurement-factory.com wrote: Hi Kinkie, Here are a few corrections for http://bazaar.launchpad.net/~kinkie/squid/stringng at r9370. * Remove isLiteral and the corresponding code/logic. We might add this very minor performance optimization as a non-public Blob member once the code is known to work well. For now, the code is confusing enough without it. The Literal thing was actually added to answer to a very specific need. IIRC it was because without it valgrind complained about leaked memory in case of a global static SBuf. If you're sure, I'll remove it. Once isLiteral is removed, compare with the prototype pointer instead of testing isLiteral in cow(). * Remove isImported. Copy and then free the buffer when importing instead. Same motivation as in the isLiteral comment above. This too has a IMO a very practical use: it allows us an easy path to get into the SBuf world i/o buffer which have been created by the i/o layer. If you're sure, I'll remove it. * SBuf copy constructor and assignment operator still initialize members differently. Fixed. nreallocs is now imported by copy-constructor. * Some throw() declarations are still left in the code. Grep for them. Are you sure? There's only one that I could find, in the virtual destructor for OutOfBoundsException, and in that case it's needed to match the signature of TextException. * cow() logic feels wrong because grow() does not just grow(). Grow is discussed below. The cow() code should be something very simple like: if (we are alone) return false; // no need to copy The only difference I could find is an isLiteral check.. store_ = new MemBuf(*store_); // get ourselves an exclusive copy of the blob return false; No grow() magic -- we are getting an exclusive copy, not growing something. MemBlob has disabled copy-constructor and assignment operator by choice, as it gets handled through (refcount) pointers. (BTW: is this a violation of the coding guidelines?) It's certainly possible to add it, it's just a different style. The only extra operation performed by the current implementation is to run a re-evaluation of the blob store sizing heuristics: at the end it's a cheap operation compared to the copying, and since we're copying we may as well check it. I'll change if you confirm me that you have also considered this side of the argument. I however agree with you that the current cow()/grow() logic is flawed in that a certain amout of growing is always performed, regardless the fact that the MemBlob is used or not. Which means that a SBuf which gets aliased and cow()ed a lot will grow and waste memory. My suggestion: reintroduce reAlloc() which gets called by cow() and does the low-level work, and have both cow() and grow() perform checks, heuristics and call reAlloc(). How would you like this? * roundto(5,5) should return 5 but returns 10. The comment and implementation should be fixed. The method probably does not belong here though. Reimplemented and re-checked. Now works as advertised. Moved to libmiscutil. * estimateCapacity() is too complex to understand its performance effects, especially since nreallocs is a very strange animal. I suggest that we keep it very simple and add optimizations later, if needed. Something like granularity = desired PageSize ? 16 : PageSize; return roundto(desired, granularity) is much easier to comprehend, analyze, and optimize. Let's postpone complex optimizations. I agree, with a SLIGHTLY more complex design to match the current standard MemPool String sizes. Stuff like: desired *= constant_growth_factor; if (desired ShortStringsSize) return ShortStringsSize; if (desired MediumStringsSize) return ShortStringsSize; if (desired BigStringsSize) return BigStringsSize; if (desired 2kBuffersSize) return BigStringsSize; return roundto(desired,PageSize); We may want to discuss whether squidSystemPageSize should be static to this module (as it is now) or belongs to the global namespace. * To me, compare() returns the opposite of what strcmp() does. This should be treated as the first parameter of strcmp() and not the second one, IMO. Gah! Fixed and added explicit cross-checks to unit tests. * startsWith() is buggy -- it ignores the contents of its parameter's buffer. Does it? Its implementation reads: if *this is too short to match, then it can't match. Otherwise does a SBuf equality test on a a substr() of *this long just as the parameter.. * SBuf::consume() returns a copy of SBuf. It should return a reference to this or void. Er.. no. Consume shortens the SBuf by chopping off the head and returns the chopped part. See how it's used in the tokenizer. Given a: SBuf data=some very long string; then: SBuf chopped=data.consume(5); is equivalent to: SBuf chopped=data.substr(0,5); data.substrSelf(5,SBuf::npos); or to: SBuf chopped=data.substr(0,5);
Re: Feature: quota control
Robert Collins wrote: On Fri, 2009-02-27 at 10:00 +1100, Mark Nottingham wrote: Honestly, if I wanted to do byte-based quotas today, I'd have an external ACL helper talking to an external logging helper; that way, you can just log the response sizes to a daemon and then another daemon would use that information to make a decision at access time. The only even mildly hard part about this is sharing state between the daemons, but if you don't need the decisions to be real-time, it's not that bad (especially considering that in any serious deployment, you'll have state issues between multiple boxes anyway). Sure; I think that would fit with 'ensuring enough hooks' :P -Rob The brief description of what I gave Pieter to start with was: A pool based on DelayPools in that Squid decrements live as traffic goes through. With a helper/ACL hook to retrieve the initial pool size and to call as needed to check for current quotas. How the helper operates is not relevant to Squid. Thats important. The key things being that; its always called for new visitors to assign the start quota, and when the quota is nearing empty its called again to see if they get more. Helper would need to send back UNITS AMOUNT MINIMUM where UNITS is the unit of quota (seconds, bytes, requests, misses?, other?), AMOUNT being a integer count of units the client is allowed to use, and MINIMUM is the level of units where the helper is to be asked for an update. 0 remaining units results in an Error page 'quota exceeded' or somesuch. Amos -- Please be using Current Stable Squid 2.7.STABLE6 or 3.0.STABLE13 Current Beta Squid 3.1.0.5
Re: SBuf review at r9370
On 02/26/2009 08:46 PM, Amos Jeffries wrote: Kinkie wrote: * Please use .length() everywhere you can. Do not mix .len_ and .length(). Ok, checked. I've left len_ where it's being assigned to another object's len_ or in very low-level stuff (e.g. object dumping, stuff doing pointer maths). Um, IMO its better to be liberal with length() and very,very conservative with len_. Dumping stuff and pointer maths particularly is one place its often better/safer to use const length() const instead of len_, unless the math is actually intended to change len_. Yeah. For zero-overhead wrapper methods like length(), the rule of thumb is very simple: use the wrapper method unless you modify the data member (or take its address). Cheers, Alex.
[PATCH] Uninstall configuration files
Hello, Please review the attached patch. The patched Makefiles delete installed configuration files, restoring functionality removed from Squid some time ago, but hopefully in a safer manner. This is the cleanest way I could find to make make distcheck work again (with the other pending SourceLayout changes). change log -- Made make distuninstallcheck work: Fixed make distuninstallcheck by removing installed configuration files iff they are identical to the installed default configuration files. Added scripts/remove-cfg.sh to do the safe removal because we need that functionality in many Makefiles. Made installed mime.conf removal safe. We were removing it without checking for modifications. Added commands to remove the following installed default configuration files: cachemgr.conf.default and msntauth.conf.default. Thank you, Alex. bb:approve === modified file 'errors/Makefile.am' --- errors/Makefile.am 2009-02-25 04:08:59 + +++ errors/Makefile.am 2009-02-26 22:30:53 + @@ -93,7 +93,8 @@ rm -f $(DESTDIR)$(DEFAULT_ERROR_DIR)/$$l/`basename $$f`; \ done; \ fi \ - done; \ + done; + @$(SHELL) $(top_srcdir)/scripts/remove-cfg.sh $(RM) $(DESTDIR)$(DEFAULT_STYLESHEET) rm -f $(DESTDIR)$(DEFAULT_STYLESHEET).default # undocumented hack. You can use this target to create multi-lingual === modified file 'helpers/basic_auth/MSNT/Makefile.am' --- helpers/basic_auth/MSNT/Makefile.am 2009-02-18 12:26:04 + +++ helpers/basic_auth/MSNT/Makefile.am 2009-02-26 22:41:20 + @@ -6,6 +6,7 @@ # Uncomment and customize the following to suit your needs: # +MSNTAUTH_CONF = $(sysconfdir)/msntauth.conf libexec_PROGRAMS = msnt_auth @@ -38,9 +39,14 @@ $(COMPILE) -DSYSCONFDIR=\$(sysconfdir)\ -c $(srcdir)/confload.c -o $@ install-data-local: - @if test -f $(DESTDIR)$(sysconfdir)/msntauth.conf ; then \ - echo $@ will not overwrite existing $(DESTDIR)$(sysconfdir)/msntauth.conf ; \ + @if test -f $(DESTDIR)$(MSNTAUTH_CONF) ; then \ + echo $@ will not overwrite existing $(DESTDIR)$(MSNTAUTH_CONF) ; \ else \ - echo $(INSTALL_DATA) $(srcdir)/msntauth.conf.default $(DESTDIR)$(sysconfdir)/msntauth.conf ; \ - $(INSTALL_DATA) $(srcdir)/msntauth.conf.default $(DESTDIR)$(sysconfdir)/msntauth.conf ; \ + echo $(INSTALL_DATA) $(srcdir)/msntauth.conf.default $(DESTDIR)$(MSNTAUTH_CONF) ; \ + $(INSTALL_DATA) $(srcdir)/msntauth.conf.default $(DESTDIR)$(MSNTAUTH_CONF) ; \ fi + +uninstall-local: + @$(SHELL) $(top_srcdir)/scripts/remove-cfg.sh $(RM) $(DESTDIR)$(MSNTAUTH_CONF) + $(RM) -f $(DESTDIR)$(MSNTAUTH_CONF).default + === modified file 'scripts/Makefile.am' --- scripts/Makefile.am 2001-08-31 17:19:04 + +++ scripts/Makefile.am 2009-02-26 23:11:14 + @@ -1,11 +1,7 @@ -# -# This file is a Makefile for compiling and installing Cache Manager. -# Cache Manager is a manager program for Internet Object Cache. -# - - bin_SCRIPTS = RunCache RunAccel EXTRA_DIST = AnnounceCache.pl access-log-matrix.pl cache-compare.pl \ cachetrace.pl check_cache.pl convert.configure.to.os2 \ fileno-to-pathname.pl flag_truncs.pl icp-test.pl \ icpserver.pl tcp-banger.pl udp-banger.pl upgrade-1.0-store.pl + +dist_noinst_SCRIPTS = remove-cfg.sh === added file 'scripts/remove-cfg.sh' --- scripts/remove-cfg.sh 1970-01-01 00:00:00 + +++ scripts/remove-cfg.sh 2009-02-26 22:45:09 + @@ -0,0 +1,28 @@ +#!/bin/sh + +# Removes an configuration file if it is identical to the default file, +# preventing make distcheck failures due to configuration leftovers. +# Intended to be used for installed configuration files. + +remover=$1 # the program to remove a file +prime=$2 # the configuration file to be removed, including path +default=$3 # the default configuration filename, including path + +# by default, use .default default extension +if test -z $default +then + default=$prime.default +fi + +# is the primary configuration file present? +if test -f $prime +then + # is the primary config identical to the default? + if diff -q $default $prime /dev/null + then + echo $remover -f $prime; +$remover -f $prime; +fi +fi + +exit 0 === modified file 'src/Makefile.am' --- src/Makefile.am 2009-02-20 00:14:30 + +++ src/Makefile.am 2009-02-26 21:49:29 + @@ -1118,16 +1118,8 @@ $(mkinstalldirs) $(DESTDIR)$(DEFAULT_LOG_PREFIX) uninstall-local: - @if test -f $(DESTDIR)$(DEFAULT_MIME_TABLE) ; then \ - echo rm -f $(DESTDIR)$(DEFAULT_MIME_TABLE); \ - $(RM) -f $(DESTDIR)$(DEFAULT_MIME_TABLE); \ - fi - -# Don't automatically uninstall config files -# @if test -f $(DESTDIR)$(DEFAULT_CONFIG_FILE) ; then \ -# echo rm -f $(DESTDIR)$(DEFAULT_CONFIG_FILE); \ -# $(RM) -f $(DESTDIR)$(DEFAULT_CONFIG_FILE); \ -# fi + @$(SHELL) $(top_srcdir)/scripts/remove-cfg.sh $(RM) $(DESTDIR)$(DEFAULT_MIME_TABLE) + @$(SHELL) $(top_srcdir)/scripts/remove-cfg.sh $(RM) $(DESTDIR)$(DEFAULT_CONFIG_FILE) CLEANFILES =
Re: [PATCH] Uninstall configuration files
Bundle Buggy has detected this merge request. For details, see: http://bundlebuggy.aaronbentley.com/project/squid/request/%3C49A77084.20304%40measurement-factory.com%3E Project: Squid
RFC: add distcheck to test-builds.sh
Hello, With the recent SourceLayout and distclean cleanup, make distcheck is working again. I would like to decrease the chance that it gets broken. Any objections to adding that check to test-builds.sh? It will double the script running time, but I think it is worth it and appropriate for a developer-only test-builds.sh script that is not run often. If you think that doubling the script runtime is a big issue, then I would suggest replacing all current targets with distcheck because the latter, in most cases, verifies more than the current set of targets (all and check). Thank you, Alex.