[squid-users] Pass the username to the cache parent
Dear list, I'm using Squid connected to an Active Directory server in front of users. We have a central Squid server that acts as a parent. This parent Squid serves as a cache and does not use any authentication method. I would like the child Squid to send usernames to the Squid parent so that user IDs are logged in its access.log. Has somebody successfully set up this kind of architecture? Is it possible? Best regards
[squid-users] Re: Automatic StoreID ?
Amos Jeffries-2 wrote: You just described how the Store-ID feature works today. The map of urlA == urlB == urlC is inside the helper. You can make it a static list of regex patterns like the original Squid-2 helpers, a DB text file of patterns like the bundled Squid-3 helper, or anything else you like inside the helper. Squid learns the mappings by asking the helper about each URL. There is a helper response cache on these lookups, same as for other helpers, which prevents complex/slow mappings from having much impact on hot objects. Amos

Really? Squid has its own learning mechanism without needing a human hand? And it can guess new URLs that it was not aware of until now? One more question: will Squid delete the current duplicate objects? -- View this message in context: http://squid-web-proxy-cache.1019090.n4.nabble.com/Automatic-StoreID-tp4665140p4665189.html Sent from the Squid - Users mailing list archive at Nabble.com.
Re: [squid-users] Re: Automatic StoreID ?
On 13/03/2014 22:21, Amos Jeffries wrote: You just described how the Store-ID feature works today. The map of urlA == urlB == urlC is inside the helper. You can make it a static list of regex patterns like the original Squid-2 helpers, a DB text file of patterns like the bundled Squid-3 helper, or anything else you like inside the helper. Squid learns the mappings by asking the helper about each URL. There is a helper response cache on these lookups, same as for other helpers, which prevents complex/slow mappings from having much impact on hot objects. Amos

Adding a domain or ACL test for an internal Squid StoreID feature would allow it to run faster, but that needs a patch to the sources. I was thinking about adding the code to the StoreID reply section on an ERR case, with another flag used to enable this option; note that it will not work when using an external helper. What do we think about an idea to this effect? What exists inside the Squid code that can help me with regex extraction and matching? Maybe ACL-like code? How about reading the Perl DB into Squid internals? Pointers are welcome. Eliezer
Re: [squid-users] Pass the username to the cache parent
On 14/03/2014 9:18 p.m., David Touzeau wrote: Dear list, I'm using Squid connected to an Active Directory server in front of users. We have a central Squid server that acts as a parent. This parent Squid serves as a cache and does not use any authentication method. I would like the child Squid to send usernames to the Squid parent so that user IDs are logged in its access.log. Has somebody successfully set up this kind of architecture? Is it possible?

Supported since Squid-3.2: cache_peer ... login=PASSTHRU Amos
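For reference, a minimal child-side configuration sketch of Amos's suggestion. The parent hostname, port, and extra peer options here are illustrative assumptions, not from the thread:

```
# child squid.conf -- login=PASSTHRU requires Squid 3.2 or later.
# It relays each client's Proxy-Authorization header (and thus the
# username) unchanged to the parent.
cache_peer parent.example.com parent 3128 0 no-query default login=PASSTHRU
never_direct allow all   # force all requests through the parent
```

Note that the parent still needs some way to extract the username from the forwarded header before it can appear in its access.log; that part is not covered in this thread.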
Re: [squid-users] Re: Automatic StoreID ?
On 14/03/2014 9:20 p.m., Omid Kosari wrote: Amos Jeffries-2 wrote: You just described how the Store-ID feature works today. The map of urlA == urlB == urlC is inside the helper. You can make it a static list of regex patterns like the original Squid-2 helpers, a DB text file of patterns like the bundled Squid-3 helper, or anything else you like inside the helper. Squid learns the mappings by asking the helper about each URL. There is a helper response cache on these lookups, same as for other helpers, which prevents complex/slow mappings from having much impact on hot objects. Amos

Really? Squid has its own learning mechanism without needing a human hand? And it can guess new URLs that it was not aware of until now?

You did not describe any learning mechanism. I just stated the parts Squid already does: when three URLs are known by the helper to be identical, the first fetch for any one of them causes that object to be cached; later requests for any of them then use that cached version. urlB/urlC need not have been fetched at all. Squid just asks the helper for information about each URL; the helper could be made to contain any learning mechanism you want. The ones bundled with Squid leverage human knowledge in the form of a database list of patterns. In short: it is ready for you to figure out how that learning should be done and write a helper that does it.

One more question: will Squid delete the current duplicate objects?

Squid does lazy deletion. It deletes just before re-using the cache position for another file, or when needing to free up space. Amos
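To make the helper's role concrete, here is a minimal StoreID helper sketch in Python. The single URL pattern and the `squid.internal` key are illustrative assumptions only; the reply format (`OK store-id=...` / `ERR`, with an optional concurrency channel ID) follows the Squid 3.4+ StoreID helper protocol.

```python
#!/usr/bin/env python3
# Minimal StoreID helper sketch: folds URL variants that serve the same
# object onto one canonical store key. The one pattern below (a
# YouTube-style videoplayback URL) is an example, not a production rule.
import re
import sys

# (regex, canonical-key template) pairs -- the helper's "database"
RULES = [
    (re.compile(r'^https?://[^/]+/videoplayback\?.*\bid=([\w-]+)'),
     r'http://video-cdn.squid.internal/\1'),
]

def store_id(url):
    """Return a canonical store key for url, or None if no rule matches."""
    for pattern, template in RULES:
        m = pattern.match(url)
        if m:
            return m.expand(template)
    return None

def main():
    # Squid sends one request per line: [channel-ID] URL [extras]
    for line in sys.stdin:
        parts = line.split()
        if not parts:
            continue
        if len(parts) > 1 and parts[0].isdigit():
            channel, url = parts[0] + ' ', parts[1]
        else:
            channel, url = '', parts[0]
        key = store_id(url)
        sys.stdout.write('%sOK store-id=%s\n' % (channel, key) if key
                         else '%sERR\n' % channel)
        sys.stdout.flush()

if __name__ == '__main__' and not sys.stdin.isatty():
    main()
```

It would be wired up with something like `store_id_program /usr/local/bin/storeid_helper.py` plus a `store_id_children` line in squid.conf.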
Re: [squid-users] Pass the username to the cache parent
On 14/03/2014 9:18 p.m., David Touzeau wrote: Dear list, I'm using Squid connected to an Active Directory server in front of users. We have a central Squid server that acts as a parent. This parent Squid serves as a cache and does not use any authentication method. I would like the child Squid to send usernames to the Squid parent so that user IDs are logged in its access.log. Has somebody successfully set up this kind of architecture? Is it possible?

Supported since Squid-3.2: cache_peer ... login=PASSTHRU Amos

Thanks Amos, I will try this feature...
Re: [squid-users] Re: Automatic StoreID ?
On Fri, Mar 14, 2014 at 4:20 PM, Omid Kosari omidkos...@yahoo.com wrote: Really? Squid has its own learning mechanism without needing a human hand? And it can guess new URLs that it was not aware of until now?

Squid doesn't have its own learning mechanism; it simply does what the helper *you* wrote tells it to do. However, theoretically, it should be possible to compute the checksum of the object *after* it has been fetched, and store that in a URL/checksum database. But this requires more than just a StoreID helper, as the helper's role is before the fetch; something else needs to compute the checksum after the fetch. And the database needs to build up its entries before it becomes useful. The question is whether the additional percentage of hits you get is worth the effort.
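The post-fetch learning pipeline sketched above could look like this (all names and the storage scheme are hypothetical; the hashing pass would have to run outside the StoreID helper, once the object body is available):

```python
# Sketch of the learning scheme described above: a post-fetch pass hashes
# each object's body; URLs whose bodies hash the same are folded onto one
# canonical store key that a StoreID helper could later serve.
import hashlib

checksum_to_key = {}   # body digest -> canonical store key
url_to_key = {}        # learned URL -> canonical store key

def learn(url, body):
    """Record url's content hash; reuse an existing key on a duplicate."""
    digest = hashlib.sha256(body).hexdigest()
    key = checksum_to_key.setdefault(digest, url)
    url_to_key[url] = key
    return key

def lookup(url):
    """What a StoreID helper would answer for url (None = not learned)."""
    return url_to_key.get(url)
```

As noted in the thread, the map only pays off once enough duplicates have been observed, and it can only fold URLs it has already seen fetched at least once.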
Re: [squid-users] Automatic StoreID ?
On Tue, Mar 11, 2014 at 9:43 PM, Alex Rousskov rouss...@measurement-factory.com wrote: On 03/11/2014 01:18 PM, Nikolai Gorchilov wrote: On Tue, Mar 11, 2014 at 6:10 PM, Alex Rousskov wrote: On 03/11/2014 08:05 AM, Omid Kosari wrote: Is it possible for Squid to automatically find every similar object, based on something like an MD5 of the objects, and serve them to clients without needing a custom DB?

No, because clients do not tell Squid what checksum they are looking for. It is possible to avoid caching duplicate content, but that only allows you to handle cache hits more efficiently. It does not help with cache misses (when the URL requested by the client has not been seen before).

Actually, two commercial vendors - PeerApp and ThunderCache - claim their products don't use URLs to identify objects, and thus they don't have to maintain a StoreID-like de-duplication database manually. Any ideas how they do it?

Most likely they do not, and you are simply being misled by their marketing claims. In general, it is not possible to ignore the request URL and still produce the right response (think about it!). They probably do not store duplicate cache objects, but, as discussed above, that is far from the automatic StoreID functionality the original poster is asking about.

I also suspected it is just marketing. But I wanted to check whether I was missing something :)

In other words, there are at least two de-duplication layers:
* The higher-level one is based on URLs and essentially requires manual URL mapping. It helps turn cache misses into hits.
* The lower-level one is based on checksums and can be automated. It helps spend less cache space to serve cache hits. Some commercial products have implemented this lower-level optimization.

I was thinking about this second option some time back. It doesn't seem very complicated, and I see clear benefits if implemented in Squid, thus having the best of both worlds.
Lower-level checksum-based deduplication, combined with some form of feedback mechanism (logging, a helper, etc.), could be used by either humans or heuristic algorithms to create/update StoreID patterns. Best, Niki
[squid-users] logrotate only instead (all) squid rotate
Using squid -k rotate, Squid rotates the logs but also closes and reopens the cache_dirs and url_rewrite_programs. Is there a way to signal only the (logfile-daemon) processes to rotate the logs, and only the logs? -- Alfrenovsky
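One commonly used workaround, assuming logrotate is available, is to skip `squid -k rotate` entirely and let logrotate truncate the files in place (the paths below are examples):

```
# /etc/logrotate.d/squid -- rotate without signalling Squid at all.
# copytruncate copies the live file and then truncates it, so Squid
# keeps writing to the same open descriptor; no cache_dir revalidation
# and no url_rewrite_program restart is triggered.
/var/log/squid/access.log /var/log/squid/cache.log {
    daily
    rotate 7
    compress
    copytruncate
    missingok
}
```

With this setup it is also worth setting `logfile_rotate 0` in squid.conf so that an accidental `squid -k rotate` does not rename the files out from under logrotate. The small window between copy and truncate can lose a few log lines; whether that is acceptable depends on your logging requirements.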
Re: [squid-users] Re: ICP and HTCP and StoreID
On Thu, Mar 13, 2014 at 5:44 PM, Alex Rousskov rouss...@measurement-factory.com wrote: On 03/13/2014 07:24 AM, Nikolai Gorchilov wrote: On Wed, Mar 12, 2014 at 1:27 AM, Alex Rousskov wrote: Just to make sure we are on the same page, here is a list of options I recall being discussed:

1. Using the ICP reqnum field as a cache key.

I don't understand how this option is going to work. AFAIK reqnum is just 4 octets long; how is it supposed to accommodate the StoreID?

By using StoreIDs that are 31 bits long. Recall that you control the StoreID map and, in most cases, there are fewer than 2^31 mapped/altered URLs in the cache, so one could use positive reqnums as regular reqnums and negative reqnums as "this is my special StoreID" reqnums. There are other caveats or optimizations that may make sense with this scheme. And, as I said earlier, this is a hack (that may work well in some environments).

I can't think of a reliable checksum algorithm that can fit in 31 bits :) This means some form of DB-based StoreID-to-URL mapping that has to be shared between cache peers. It adds too much complexity and reduces reliability in the helpers... Using MD5 as StoreID could do the job, but that is option 2.

2. Adding StoreID to ICP/HTCP requests as an optional field.
3. Computing StoreID upon receiving a regular ICP/HTCP request.

Out of those three, do you prefer #3? Note that #1 is a little hackish, but may be easier to implement (and is a lot cheaper CPU-wise) than #3. Neither #1 nor #3 makes the ICP packets bigger, unlike #2.

Option 3 is the only universal solution that works in all scenarios. Sharing a StoreID string or a derivative of it (checksum/hash/digest/whatever) will only do for peers using the same StoreID rewriting logic.

Yes, of course. And with a StoreID cache or, in the worst case, a loaded module computing StoreIDs, it will be fast enough too.
To sum it up, the above list ordered by preference:
- Option 3 with a StoreID helper and StoreID caching
- Option 2 (using MD5 to minimize the packet size)
- Option 3 with a StoreID helper, but without StoreID caching
- Option 1
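For illustration, option 1 (the 31-bit hack) amounts to bit-packing along these lines; this is a sketch of the encoding idea only, not Squid code:

```python
# Sketch of option 1: squeeze a 31-bit StoreID into the 32-bit ICP
# reqnum field, using the top (sign) bit to flag "this reqnum carries
# a StoreID" versus an ordinary request number.
STOREID_FLAG = 0x80000000

def pack_reqnum(value, is_store_id):
    """Encode a plain reqnum or a 31-bit StoreID into one 32-bit field."""
    if value >= 0x80000000:
        raise ValueError("only 31 bits are available per value")
    return value | (STOREID_FLAG if is_store_id else 0)

def unpack_reqnum(field):
    """Return (value, is_store_id) decoded from the 32-bit field."""
    return field & 0x7FFFFFFF, bool(field & STOREID_FLAG)
```

The caveat from the thread stands: 31 bits is too small for a content checksum, so the value would have to index some shared StoreID-to-URL map.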
[squid-users] Re: Automatic StoreID ?
Actually, two commercial vendors - PeerApp and ThunderCache - claim their products don't use URLs to identify objects, and thus they don't have to maintain a StoreID-like de-duplication database manually. Any ideas how they do it?

Instead of first mapping the URL through a memory-resident table that keeps pointers (file-id, bucket no.) to the real location of the object on disk, a hash value derived from the URL could directly designate the storage location on disk, avoiding the translation table Squid uses. This is the principle of every hashed table in a fast database system. The drawback is that you have to deal with collisions and overflows on disk: hashes for different URLs point to the same storage location. Different solutions to this problem are available, though (chaining, sequential storage, a secondary storage area, etc.). And you have to manage the variable-sized buckets that the hashing points to. A positive consequence: no rebuild of the in-memory table is necessary, as there is none. This avoids the time-consuming rebuild of the rock storage table from disk. I can imagine that, for historical reasons (much simpler to implement), Squid uses the translation table instead of direct hashing, whereas ThunderCache etc. can rely on some low-level DB system with direct hashing ready to be used. -- View this message in context: http://squid-web-proxy-cache.1019090.n4.nabble.com/Automatic-StoreID-tp4665140p4665198.html Sent from the Squid - Users mailing list archive at Nabble.com.
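The direct-hashing idea above can be modeled in a few lines. This is a toy sketch with in-memory "buckets" and chaining for collisions; a real store would persist the buckets on disk and handle variable sizes:

```python
# Toy model of direct hashing: the URL's hash picks a disk bucket
# directly (no translation table), with chaining to resolve collisions.
# Fetching must still inspect the bucket to verify which URL is actually
# stored there -- the disk I/O cost discussed in the thread.
import hashlib

NUM_BUCKETS = 8  # deliberately tiny so collisions actually happen

buckets = [[] for _ in range(NUM_BUCKETS)]  # bucket -> [(url, object)]

def bucket_of(url):
    """Map a URL straight to a bucket number."""
    return int(hashlib.md5(url.encode()).hexdigest(), 16) % NUM_BUCKETS

def store(url, obj):
    chain = buckets[bucket_of(url)]
    chain[:] = [(u, o) for u, o in chain if u != url]  # overwrite old copy
    chain.append((url, obj))

def fetch(url):
    """Walk the collision chain to check for a real hit."""
    for u, o in buckets[bucket_of(url)]:
        if u == url:
            return o
    return None
```

The sketch makes the trade-off visible: `fetch()` always has to read the bucket, even on a miss, because every URL hashes to some valid bucket.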
Re: [squid-users] Re: SquidGuard redirect to parent proxy (Off-Topic)
On 2014-03-13 21:32, Amos Jeffries wrote: On 2014-03-14 05:21, Christian Scholz wrote: Hi, I know that my question is a little bit off-topic, but nevertheless I hope that someone can help me :-) I've configured squid3 with squidGuard and one parent proxy. In the case of an access violation, squidGuard redirects the user to a customized block page hosted by the proxy itself. Unfortunately, the proxy tries to access the local block page over its parent proxy. Does someone have an idea why?

1) This is a re-write, not a redirect. HTTP redirects have a 3xx status code prefixing the URL in the squidguard config:

redirect 302:http://example.com/ # redirect client to example.com
redirect http://example.com/ # re-write URL to http://example.com and fetch

2) You probably also have no cache_peer_access rules preventing the parent from being used as a source for these http://proxyname.localsuffix/... URLs. Amos

Okay, I've fixed it with the following lines:

acl local-domain dstdomain proxyname.localsuffix
always_direct allow local-domain

Thanks!
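An alternative sketch of Amos's point 2 keeps the parent in the request path but denies it for the local domain; `parentproxy` here stands for whatever name your cache_peer line actually uses:

```
# squid.conf sketch -- never ask the parent for the local block page
acl local-domain dstdomain proxyname.localsuffix
cache_peer_access parentproxy deny local-domain
```

Compared with always_direct, cache_peer_access scales more naturally when there are multiple peers with different routing rules, though both approaches solve this particular case.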
Re: [squid-users] Re: Automatic StoreID ?
On 03/14/2014 06:34 AM, babajaga wrote: Instead of first mapping the URL through a memory-resident table that keeps pointers (file-id, bucket no.) to the real location of the object on disk, a hash value derived from the URL could directly designate the storage location on disk, avoiding the translation table Squid uses.

This is essentially how Rock store does it: the Rock store index does not store the real location of the object on disk but computes it based on the hash value.

A positive consequence: no rebuild of the in-memory table is necessary, as there is none. This avoids the time-consuming rebuild of the rock storage table from disk.

While Rock store can avoid building the memory-resident index, you actually want that table in most cases: if you do not build the index, you have to do a disk I/O to fetch the first slot of the candidate object on _every_ request. Without that disk I/O, Squid would not know whether it has a hit or a miss, because _every_ URL corresponds to a valid location on disk. You have an infinite number of URLs pointing to the same location and, without a memory-resident table, you do not know what is actually stored there (if anything at all) until you do that disk I/O.

For reverse proxy caches with very high hit ratios, avoiding the rebuild may indeed be a good optimization where warmup time is more important than speed. Most such proxies have small caches that do not require a long rebuild, though, making the need for that optimization moot. The building of the index needs to be optimized, but that is a different story. Note that Rock store can cache new objects while building the index (because the index does not store the object location). Cheers, Alex.
Re: [squid-users] Re: Automatic StoreID ?
On 03/14/2014 02:36 AM, Eliezer Croitoru wrote: Adding a domain or ACL test for an internal Squid StoreID feature would allow it to run faster, but that needs a patch to the sources. I was thinking about adding the code to the StoreID reply section on an ERR case, with another flag used to enable this option; note that it will not work when using an external helper.

You can add a new store_id_map directive. I do not think it should depend on store_id_program actions. The two options do not even have to be mutually exclusive: if store_id_map does not match, check store_id_access.

store_id_map filename with a regex map acl1 acl2 ...

How about reading the Perl DB into Squid internals?

If you do something like that, I urge you to revise the current regex map file syntax used by a popular StoreID script to allow for comments (if not already allowed) and to contain complete substitution patterns instead of space-separated from/to tokens:

# comment
s/replace this/with that/flags
s@can use custom delimiters and other regex features@as needed@g
^this line is an invalid line example$

To minimize confusion, you can even require that the map file start with some well-defined prefix. For example:

#Store ID Map
#Version: 1.0

This approach will allow the feature to evolve as needed.

What exists inside the Squid code that can help me with regex extraction and matching? Maybe ACL-like code? Pointers are welcome.

$ fgrep -RI regcomp src

HTH, Alex.
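A sketch of how such a map file could be loaded and applied. The format details beyond Alex's examples (flag handling, how multiple rules compose) are assumptions:

```python
# Loader sketch for the proposed map-file format: '#' comments (which
# also cover the "#Store ID Map" / "#Version:" header), sed-style
# substitutions with arbitrary delimiters, and an error on anything else.
import re

def load_map(lines):
    """Parse map lines into (compiled-regex, replacement, count) rules."""
    rules = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # blank lines, comments, and the version header
        if line[0] != 's' or len(line) < 4:
            raise ValueError('invalid line: %r' % line)
        delim = line[1]
        parts = line[2:].split(delim)
        if len(parts) < 2:
            raise ValueError('invalid line: %r' % line)
        pattern, repl, flags = parts[0], parts[1], ''.join(parts[2:])
        count = 0 if 'g' in flags else 1  # 0 means "replace all" in re.sub
        rules.append((re.compile(pattern), repl, count))
    return rules

def apply_map(rules, url):
    """Run every substitution over the URL, in file order."""
    for pattern, repl, count in rules:
        url = pattern.sub(repl, url, count=count)
    return url
```

The point of the full s/// form is visible here: delimiters, back-references, and per-rule flags all come for free, whereas space-separated from/to tokens cannot express patterns that themselves contain spaces.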
[squid-users] Cygwin SSL bumping
Hi, I am trying to run Squid on a Windows Server 2008 R2 Standard machine as a "Squid in the middle". I need to do SSL bumping, and I need to block access to certain websites (e.g. sites with the word "games" in the URL). I've installed Cygwin on the server and included Squid in the installation. Where do I go from here? From my understanding, I am not able to run ./configure and then make / make install to enable features such as ssl-crtd. Any help would be greatly appreciated. Thanks! Derek
[squid-users] Is it possible to mark tcp_outgoing_mark (server side) with SAME MARK as incoming packet (client side)?
Hello, I would like to mark outgoing packets (on the server side) with the SAME MARK as on the incoming (NATed or CONNECTed) packet. There is the tcp_outgoing_mark option, with which I can mark packets, but there is no ACL option to check the incoming mark. If there is already a way to do this, then please guide me. Otherwise I would like to suggest:

Option 1)
Syntax: tcp_outgoing_mark SAMEMARK [!]aclname
where SAMEMARK is a special (literal) word: packets matching the ACL get the same mark as the incoming packet. For example, I can do:
tcp_outgoing_mark SAMEMARK all
and all outgoing packets will get the same mark as the incoming packet's mark.

Option 2)
Have an ACL:
Syntax: acl aclname nfmark mark-value
Then I can do something like this:
acl mark101 nfmark 0x101
tcp_outgoing_mark 0x101 mark101

If both options above can be combined, it would be even better. Thanks in advance, Amm.