Re: regular expressions was: Kernel Oops
on Wed, Mar 09, 2011 at 11:00:34AM +1100, Erik de Castro Lopo wrote:
> My idea was to autogenerate the complex regexes using something like this:
>
> 178.183.237.0.dsl.dynamic.eranet.pl
> 183.246.69.111.dynamic.snap.net.nz
> 188.146.109.136.nat.umts.dynamic.eranet.pl
>
> as input.

FWIW, of 74510 patterns in the most recent Enemieslist patterns release, 9779 match four leading numbers separated by dots (2447), by dashes (589), or by a mix of dots and dashes (the rest). You will have your hands full coming up with groups of same.

--
hesketh.com/inc. v: +1(919)834-2552 f: +1(919)834-2553 w: http://hesketh.com/
antispam news and intelligence to help you stop spam: http://enemieslist.com/
Re: regular expressions was: Kernel Oops
mouss put forth on 3/8/2011 5:03 PM:
> [WARNING: Steven CC'd]

> things. so I'd say, do not consider performance a primary target. go for catching spammers first. only tune after you get the right rules, and only if needed (I personally don't tune anything here. I'm happy to focus on catching spammers).

Likewise. In my particular case execution time of the table is irrelevant. However, the execution latency of very large tables on busy systems piqued my curiosity, giving me a desire to learn more, so I can avoid adopting potentially bad habits now that may come back to haunt me, performance-wise, in the future. Also, it's very possible, maybe more likely than not, that I misunderstood some of Steven's advice, or took it out of context. (Steven, sorry for inadvertently dragging you into the mosh pit) :)

Some who have been working with regular expressions for a long time may feel otherwise, but at this point I find them fascinating. From a spam fighting standpoint they can be extremely powerful. Again, I just want to make sure I develop good habits now.

WRT Viktor's earlier post, I have seen examples of the grouping with if/endif blocks. In fact, the fqrdns.pcre file makes use of them, although I'm not sure it's well optimized in this case. There seem to be an enormous number of expressions within a single if/endif block, and IIRC, there are only three such groupings in the set of 1600+ expressions. So there's probably room for more performance optimization. At the table's current size though, I'm guessing the potential performance gain wouldn't be worth the tweaking labor.

--
Stan
Re: regular expressions was: Kernel Oops
Steve put forth on 3/8/2011 5:12 PM:
> Maybe using if/endif conditions like Stan Hoeppner has done on his pcre map could speed things up even more?
> - http://www.hardwarefreak.com/fqrdns.pcre

You're giving me too much credit. ;) Again, I'm not the original author of that table. That person created the if/endif structure. I was ignorant of exactly how it works in a PCRE table until the last 24 hours. I've simply made some additions, and fixed some minor errors I found, as have others.

My current role WRT the table is simply making it freely available for others, adding an expression now and then, incorporating contributions from others so all changes hit a master copy, and spreading the word a little now and then, as I think it's a pretty useful A/S tool.

--
Stan
Re: regular expressions was: Kernel Oops
mouss put forth on 3/7/2011 5:45 PM:
> On 07/03/2011 15:13, Stan Hoeppner wrote:
> > Ok, so if I'm doing what I've heard called a fully qualified regular expression, WRT FQrDNS matching, should I use the anchors or not? postmap -q says these all work (the actuals with action and text that is).
> >
> > /^(\d{1,3}-){3}\d{1,3}\.dynamic\.chello\.sk$/
>
> .dynamic.chello.sk REJECT blah blah
>
> > /^(\d{1,3}\.){4}dsl\.dyn\.forthnet\.gr$/
>
> .dyn.forthnet.gr REJECT blah blah
>
> > /^(\d{1,3}-){4}adsl-dyn\.4u\.com\.gh$/
>
> /dyn\.4u\.com\.gh$/ REJECT blah
>
> assuming you get real mail from there. otherwise
>
> .4u.com.gh REJECT blah

Yes, these can all be done with a hash/cdb. But these are being added to my fqrdns.pcre file. As the name implies the goal is to exactly match fully qualified reverse DNS strings, at least, that's part of the goal. The other part is the exact opposite: _not_ matching them. I'll explain that a little later.

> > /^[\d\w]{8}\.[\w]{2}-[\d]-[\d\w]{2}\.dynamic\.ziggo\.nl$/
>
> ahem? I fail to see what you're trying to match here. \d is a \w, so [\d\w] is the same as \w. do you mean \W (capital letter)? anyway:

I tried \d alone in those places and postmap -q wouldn't match it. I scoured my regex cheat sheet and it said \d is for digits, and \w is for alphas. I added \d\w and it worked. I was trying to match this oddball FQrDNS:

541ABE2E.cm-5-3c.dynamic.ziggo.nl

> well, that's what regular expressions are about by default:
>
> /foo/ means contains foo
> /^foo/ means starts with foo
> /foo$/ means ends with foo

Got it. You (or Noel) already explained this, and it really helps understanding.

> so /^bart.*homer.*marge$/ means: starts with bart, ends with marge and somewhere between these contains homer.

Also good to understand.

Ok, to explain the not matching goal. The PCRE file is almost 1700 expressions, and growing. In a couple of years it could be double that size. Over a longer period of time it could hit 5000 expressions.

For users of this file, it is usually the first table checked against a connecting smtp client. That client rDNS will match 1 of 1700 expressions, or none. Thus, we want the fastest processing of the does not match case, as this is the common case. A match is rare from a mathematical and cycles consumed standpoint.

Modern processors are extremely fast. But if our expressions aren't speed optimized for the does not match case, we're slowing our system down. For most systems this is irrelevant. But for an extremely high volume MX gateway system, receiving say, 3000 connects/second, consisting of 2700 spam bots and snowshoe servers, with 300 legit mails to be relayed to downstream mailbox servers, a few extra milliseconds of table processing time per connection adds up quickly. Assuming this host is running the full gamut of anti spam checks, policy daemons, content filters, etc, we need to keep each as lean as possible. If this example MX gateway sees spikes of 5000 connections/second due to a large botnet targeting multiple users, any extra delay this PCRE table imposes may contribute to bogging the system down, and cause unwanted delays.

So, the question is, which form of expression processes the does not match case faster? The fully qualified expression, or the simple expression? Noel mentioned that the fully qualified expressions will tend to process faster. Is this true? Is it true for both the matches and does not match case?

Thanks again for continuing my regex education guys. :) This new knowledge and understanding is already paying dividends, mostly in time savings, and I'm knocking expressions out more easily without having to reference help docs. :)

--
Stan
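The \d versus \w exchange above can be checked directly. The following is a minimal sketch that uses Python's re module as a stand-in for PCRE (an assumption; for these particular constructs the two behave the same way). The last pattern, hex_variant, is a hypothetical tightening that does not appear in the thread.

```python
import re

# The oddball rDNS name from the message above.
host = "541ABE2E.cm-5-3c.dynamic.ziggo.nl"

# The expression as posted: [\d\w] is redundant because \w already includes digits.
original = re.compile(r"^[\d\w]{8}\.[\w]{2}-[\d]-[\d\w]{2}\.dynamic\.ziggo\.nl$")

# Equivalent expression with the redundancy removed.
simplified = re.compile(r"^\w{8}\.\w{2}-\d-\w{2}\.dynamic\.ziggo\.nl$")

# Using \d alone for the first label fails, because that label contains
# the hex letters A, B and E, which are not digits.
digits_only = re.compile(r"^\d{8}\.\w{2}-\d-\w{2}\.dynamic\.ziggo\.nl$")

# Hypothetical tighter variant (not from the thread): hex digits only.
hex_variant = re.compile(r"^[0-9A-F]{8}\.cm-\d-[0-9a-f]{2}\.dynamic\.ziggo\.nl$")

for name, rx in [("original", original), ("simplified", simplified),
                 ("digits_only", digits_only), ("hex_variant", hex_variant)]:
    print(name, bool(rx.match(host)))
```

This is why adding \w made the expression work: the letters in the first and third labels needed a class that covers more than digits, and \w on its own would have been enough.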
Re: regular expressions was: Kernel Oops
Stan Hoeppner:
> So, the question is, which form of expression processes the does not match case faster? The fully qualified expression, or the simple expression? Noel mentioned that the fully qualified expressions will tend to process faster. Is this true? Is it true for both the matches and does not match case?

I would expect better performance when patterns only match the text that needs to be matched.

If you must match a very large number of patterns, you need an implementation that transforms N patterns into one deterministic automaton. This can match 1 pattern in the same time as N patterns. Once the automaton is built (which takes some time) it is blindingly fast. An example of such an implementation is flex.

Similar optimizations are needed for large CIDR maps. Right now, Postfix's linear search does 10^8 patterns/s. With this, postscreen can search the largest ipdeny.com file in 1ms on a modern CPU, which is sufficient for the moment. To make it fast, the CIDR entries need to be arranged into a tree that can be traversed in log(N) time.

Wietse
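To illustrate the tree idea for CIDR maps, here is a minimal sketch of a binary trie in Python. It is not Postfix code: the class name CidrTrie and its methods are made up for illustration, it handles IPv4 only, and it returns the longest matching prefix, whereas a Postfix cidr table returns the first matching line in file order. Lookup cost is bounded by the 32 address bits, regardless of how many entries are loaded.

```python
import ipaddress


class CidrTrie:
    """Hypothetical IPv4 prefix trie; one node per prefix bit."""

    def __init__(self):
        self.root = {}

    def insert(self, cidr, action):
        net = ipaddress.ip_network(cidr)
        # Keep only the network bits, e.g. 24 of them for a /24.
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for bit in bits:
            node = node.setdefault(bit, {})
        node["action"] = action

    def lookup(self, addr):
        bits = format(int(ipaddress.ip_address(addr)), "032b")
        node = self.root
        best = node.get("action")  # action of a 0.0.0.0/0 entry, if any
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            # Remember the most specific action seen so far (longest prefix).
            best = node.get("action", best)
        return best


trie = CidrTrie()
trie.insert("192.0.2.0/24", "REJECT")
trie.insert("192.0.2.128/25", "OK")
print(trie.lookup("192.0.2.1"), trie.lookup("192.0.2.200"), trie.lookup("198.51.100.1"))
```

With 100,000 entries a linear scan examines up to 100,000 lines per lookup, while this walk visits at most 32 nodes.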
Re: regular expressions was: Kernel Oops
Wietse Venema put forth on 3/8/2011 10:39 AM:
> Stan Hoeppner:
> > So, the question is, which form of expression processes the does not match case faster? The fully qualified expression, or the simple expression? Noel mentioned that the fully qualified expressions will tend to process faster. Is this true? Is it true for both the matches and does not match case?
>
> I would expect better performance when patterns only match the text that needs to be matched.

So this would mean the simpler expressions would be faster? That makes me wonder why Enemies List[1] uses complex expressions, each one precisely matching a specific rDNS pattern, given EL matches 65k+ patterns total. Likewise, the original author of my fqrdns.pcre table also used mostly expressions that exactly match a specific rDNS pattern, although in this case we have only 1600+ expressions so speed isn't as critical.

I've not made 1 to 1 equivalent simpler expressions and run timing tests. It would be rather time consuming to copy the current table and simplify the expressions in the copy. I'm wondering now if execution times would show any meaningful difference. I wonder if testing just a small subset, say 100 expressions, would be sufficient to show meaningful execution time differences.

> If you must match a very large number of patterns, you need an implementation that transforms N patterns into one deterministic automaton. This can match 1 pattern in the same time as N patterns. Once the automaton is built (which takes some time) it is blindingly fast. An example of such an implementation is flex.

This sounds really interesting. Do you have a link to info about this flex software? I'd like to read about it.

> Similar optimizations are needed for large CIDR maps. Right now, Postfix's linear search does 10^8 patterns/s. With this, postscreen can search the largest ipdeny.com file in 1ms on a modern CPU, which is sufficient for the moment.
>
> To make it fast, the CIDR entries need to be arranged into a tree that can be traversed in log(N) time.

I recall you and Viktor discussing this a while ago. I don't really understand how an OP (myself) would go about creating a tree of our CIDR tables. Or is this something that the Postfix CIDR code would handle?

[1] Enemies List is not available for Postfix, yet, and the intelligence dataset is not free, although the source code is open. EL is integrated in some commercial AS appliances and commercial mail software. I mention it frequently here because it is the only antispam tool I'm aware of that makes almost exclusive use of regexes to identify likely spam sources, and it uses 10s of thousands of regexes.

--
Stan
Re: regular expressions was: Kernel Oops
On Tue, Mar 08, 2011 at 02:29:23PM -0600, Stan Hoeppner wrote:
> So this would mean the simpler expressions would be faster? That makes me wonder why Enemies List[1] uses complex expressions, each one precisely matching a specific rDNS pattern,

To avoid false positives by matching in the wrong context. The performance can be improved by grouping:

/^\d+\.\d+\.\d+\.\d+$/ DUNNO only hostnames matched below
if /\.net$/
# patterns for .net hosts
...
/^/ DUNNO done with .net
endif
if /\.net\.au$/
# patterns for .net.au hosts
...
/^/ DUNNO done with .net.au
endif
if /\.com$/
# patterns for .com hosts
...
/^/ DUNNO done with .com
endif
if /\.edu$/
# patterns for .edu hosts
...
/^/ DUNNO done with .edu
endif

--
Viktor.
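The effect of the grouping above can be modelled outside Postfix. The sketch below is a simplified Python model, not the pcre_table implementation: the rules under example.net and example.com are hypothetical placeholders, and the function only counts how many expressions get evaluated before a verdict is reached.

```python
import re

# First rule of the table: bare IP literals are answered immediately.
IP_LITERAL = re.compile(r"^\d+\.\d+\.\d+\.\d+$")

# Each group is (guard, rules). The rules are hypothetical examples.
GROUPS = [
    (re.compile(r"\.net$"), [
        (re.compile(r"^(?:\d{1,3}\.){4}dynamic\.example\.net$"), "REJECT generic rDNS"),
        (re.compile(r"^dsl-\d+\.pool\.example\.net$"), "REJECT generic rDNS"),
    ]),
    (re.compile(r"\.com$"), [
        (re.compile(r"^(?:\d{1,3}-){3}\d{1,3}\.cable\.example\.com$"), "REJECT generic rDNS"),
    ]),
]


def lookup(name):
    """Return (action, number of expressions evaluated)."""
    evaluated = 1
    if IP_LITERAL.search(name):
        return "DUNNO", evaluated
    for guard, rules in GROUPS:
        evaluated += 1
        if not guard.search(name):
            continue  # the whole if/endif block is skipped
        for pattern, action in rules:
            evaluated += 1
            if pattern.search(name):
                return action, evaluated
        # Models the trailing "/^/ DUNNO" line: stop here, skip later groups.
        return "DUNNO", evaluated
    return None, evaluated


print(lookup("1.2.3.4.dynamic.example.net"))
print(lookup("mail.example.org"))
```

A name that belongs to none of the groups costs one evaluation per guard instead of one per rule, which is where the saving for the common non-matching case comes from.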
Re: regular expressions was: Kernel Oops
Wietse Venema wrote:
> If you must match a very large number of patterns, you need an implementation that transforms N patterns into one deterministic automaton. This can match 1 pattern in the same time as N patterns. Once the automaton is built (which takes some time) it is blindingly fast. An example of such an implementation is flex.

Is there a limit on the pattern length in the pcre tables? If not, it would be possible to convert this (3 only, but could be hundreds or even thousands):

/^([0-9]{1,3}\.){4}\.dsl\.dynamic\.eranet\.pl$/
/^([0-9]{1,3}\.){4}\.dynamic\.snap\.net\.nz$/
/^([0-9]{1,3}\.){4}\.nat\.umts\.dynamic\.eranet\.pl$/

to this:

/^([0-9]{1,3}\.){4}\.(dsl\.dynamic\.eranet\.pl|dynamic\.snap\.net\.nz|nat\.umts\.dynamic\.eranet\.pl)$/

and that should reject 1.1.1.1.not-found in 1/3 the time of the three original regexes while also matching quicker than the original.

Obviously, a conversion from the first three to the optimised version has to be done mechanically to avoid errors.

Cheers,
Erik
--
Erik de Castro Lopo http://www.mega-nerd.com/
Re: regular expressions was: Kernel Oops
[WARNING: Steven CC'd]

On 08/03/2011 21:29, Stan Hoeppner wrote:
> Wietse Venema put forth on 3/8/2011 10:39 AM:
> > Stan Hoeppner:
> > > So, the question is, which form of expression processes the does not match case faster? The fully qualified expression, or the simple expression? Noel mentioned that the fully qualified expressions will tend to process faster. Is this true? Is it true for both the matches and does not match case?
> >
> > I would expect better performance when patterns only match the text that needs to be matched.

to get better performance, one would use patterns that fail to match as soon as possible. I mean if you have /^a/, then the check would stop as soon as the first char isn't an a. but the expressions we would like to match and the expressions we see are completely different things.

so I'd say, do not consider performance a primary target. go for catching spammers first. only tune after you get the right rules, and only if needed (I personally don't tune anything here. I'm happy to focus on catching spammers).

> So this would mean the simpler expressions would be faster?

No. /^a(complex blah)/ is faster than /joe/ because the first will stop if the first char isn't an a, whatever the rest of the expression is.

> That makes me wonder why Enemies List[1] uses complex expressions, each one precisely matching a specific rDNS pattern, given EL matches 65k+ patterns total.

as said above, the goal isn't performance (to improve performance, buy better hardware or run multiple instances). The goal of Steven is to maximize hit rate while minimizing false positives. many of us have created rules to block generic/dynamic/silly senders. when doing so, you can start by being precise at the risk of doing a lot of work because your rules minimise FPs, or go the other way by using expressions that block a lot of senders including legitimate ones, that is increasing the FP rate. it takes time and effort to get a good balance, and that's what Steven's work is about.

[snip]

> > If you must match a very large number of patterns, you need an implementation that transforms N patterns into one deterministic automaton. This can match 1 pattern in the same time as N patterns. Once the automaton is built (which takes some time) it is blindingly fast. An example of such an implementation is flex.
>
> This sounds really interesting. Do you have a link to info about this flex software? I'd like to read about it.

[note: it wasn't me who said the text above. I however studied the problem, in a completely different context. I can tell you one thing: forget about optimizing your pcre rules. optimisation is useful in DNA matching problems and the like. and even then...]

> > Similar optimizations are needed for large CIDR maps. Right now, Postfix's linear search does 10^8 patterns/s. With this, postscreen can search the largest ipdeny.com file in 1ms on a modern CPU, which is sufficient for the moment. To make it fast, the CIDR entries need to be arranged into a tree that can be traversed in log(N) time.
>
> I recall you and Viktor discussing this a while ago. I don't really understand how an OP (myself) would go about creating a tree of our CIDR tables. Or is this something that the Postfix CIDR code would handle?

if cidr is to be enhanced, then it would be done inside the cidr implementation.

the problem is the usual one: algorithms are often said to be k*O(f(n)). so you generally prefer f(n)=log(n) over f(n)=n^2. but this is only good for large n, and n is never large, so you need to remember about the k constant. said otherwise: k1 * n^2 < k2 * log(n) for small n under some conditions.

> [1] Enemies List is not available for Postfix, yet, and the intelligence dataset is not free, although the source code is open. EL is integrated in some commercial AS appliances and commercial mail software. I mention it frequently here because it is the only antispam tool I'm aware of that makes almost exclusive use of regexes to identify likely spam sources, and it uses 10s of thousands of regexes.

I don't use EL, but I think it is usable with postfix. Steven, can you confirm this? (some of the features may be sendmail oriented, but it would be easy to generalize them).
Re: regular expressions was: Kernel Oops
Original Message
Date: Wed, 9 Mar 2011 09:49:21 +1100
From: Erik de Castro Lopo mle+to...@mega-nerd.com
To: postfix-users@postfix.org
Subject: Re: regular expressions was: Kernel Oops

> Wietse Venema wrote:
> > If you must match a very large number of patterns, you need an implementation that transforms N patterns into one deterministic automaton. This can match 1 pattern in the same time as N patterns. Once the automaton is built (which takes some time) it is blindingly fast. An example of such an implementation is flex.
>
> Is there a limit on the pattern length in the pcre tables?

I think there is one (if memory does not fool me then it is somewhere around 1000 characters). But I am not 100% sure.

> If not, it would be possible to convert this (3 only, but could be hundreds or even thousands):
>
> /^([0-9]{1,3}\.){4}\.dsl\.dynamic\.eranet\.pl$/
> /^([0-9]{1,3}\.){4}\.dynamic\.snap\.net\.nz$/
> /^([0-9]{1,3}\.){4}\.nat\.umts\.dynamic\.eranet\.pl$/

Are you sure the above is correct? You have a double dot there and I think that is not correct.

> to this:
>
> /^([0-9]{1,3}\.){4}\.(dsl\.dynamic\.eranet\.pl|dynamic\.snap\.net\.nz|nat\.umts\.dynamic\.eranet\.pl)$/

Or even shorter:

/^([0-9]{1,3}\.){4}((dsl|nat\.umts)\.dynamic\.eranet\.pl|dynamic\.snap\.net\.nz)$/

Maybe using if/endif conditions like Stan Hoeppner has done on his pcre map could speed things up even more?
- http://www.hardwarefreak.com/fqrdns.pcre

> and that should reject 1.1.1.1.not-found in 1/3 the time of the three original regexes while also matching quicker than the original.
>
> Obviously, a conversion from the first three to the optimised version has to be done mechanically to avoid errors.

Well... if the source is already buggy (double dot issue) then automating that transformation is not going to help you much.

> Cheers,
> Erik
> --
> Erik de Castro Lopo http://www.mega-nerd.com/

// Steve
Re: regular expressions was: Kernel Oops
mouss:
> On 08/03/2011 23:49, Erik de Castro Lopo wrote:
> > Wietse Venema wrote:
> > > If you must match a very large number of patterns, you need an implementation that transforms N patterns into one deterministic automaton. This can match 1 pattern in the same time as N patterns. Once the automaton is built (which takes some time) it is blindingly fast. An example of such an implementation is flex.
> >
> > Is there a limit on the pattern length in the pcre tables? If not, it would be possible to convert this (3 only, but could be hundreds or even thousands):
> >
> > /^([0-9]{1,3}\.){4}\.dsl\.dynamic\.eranet\.pl$/
> > /^([0-9]{1,3}\.){4}\.dynamic\.snap\.net\.nz$/
> > /^([0-9]{1,3}\.){4}\.nat\.umts\.dynamic\.eranet\.pl$/
> >
> > to this:
> >
> > /^([0-9]{1,3}\.){4}\.(dsl\.dynamic\.eranet\.pl|dynamic\.snap\.net\.nz|nat\.umts\.dynamic\.eranet\.pl)$/
> >
> > and that should reject 1.1.1.1.not-found in 1/3 the time of the three original regexes while also matching quicker than the original.
>
> your speculations are wrong. /(joe|foo|bar)/ isn't 3 times faster than individual tests. but before all, premature optimisation is the root of all evil. one should not convert readable stuff to unmaintainable hieroglyphs without measuring the real benefits.

In the Postfix implementation, each regexp/pcre pattern is executed separately, therefore (a|b|c) is faster than separate rules for a, b and c. The savings are noticeable only in body_checks.

As for large numbers of CIDR patterns, I was referring to files with 100,000 patterns. That is a non-trivial number, and I took care to implement this such that postscreen could handle them.

I do agree with all the comments about skipping patterns with IF/ENDIF or terminating matches early (which PCRE is very good at if you use look-ahead and look-behind).

Wietse
Re: regular expressions was: Kernel Oops
Noel Jones wrote:
> The pattern length limit is controlled by the pcre library you're using. I think most implementations limit single expressions to 64k characters.

Obviously something that needs testing.

> It's unclear to me if a single huge complex expression will evaluate faster than multiple less complex expressions.

I'm not exactly sure how the pcre regex engine works in Postfix. My assumption below is that each pattern is matched individually, which is why I am suggesting that patterns can be combined for speed improvements.

If the multiple complex expressions have the same prefix, then combining the prefix test into a single expression will definitely be faster to fail some non-matching strings than using multiple less complex expressions.

Consider the input string '123-234-32-12.whatever' and now compare matching against three rules:

/^([0-9]{1,3}\.){4}foo$/
/^([0-9]{1,3}\.){4}bar$/
/^([0-9]{1,3}\.){4}baz$/

In this case, there will be three attempts (one on each pattern) that fail on the fourth character ('-') of the input string. That means that to fail all three patterns, there will be 12 character comparisons.

Now compare that against:

/^([0-9]{1,3}\.){4}(foo|bar|baz)$/

which will again fail on the fourth character, but there is only one pattern, which matches the same strings as the 3 patterns above.

> (your sample expression looks a little wonky to me. You sure it works?)

No, this was a poorly checked paper example.

> Improving performance would be better accomplished by enclosing the similar lines in an IF..ENDIF statement. Performance should be improved for non-matching input, readability and maintainability is dramatically improved.

Personally I find reading regexes a pita even though I've been doing it for about 2 decades.

My idea was to autogenerate the complex regexes using something like this:

178.183.237.0.dsl.dynamic.eranet.pl
183.246.69.111.dynamic.snap.net.nz
188.146.109.136.nat.umts.dynamic.eranet.pl

as input.

> Skipping rules always beats evaluating rules.

Agreed.

> Unreadable rules should be avoided.

Unless those rules were never intended to be read or modified by hand.

Erik
--
Erik de Castro Lopo http://www.mega-nerd.com/
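The autogeneration idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it handles sample names that start with a dotted quad and nothing else, and the names LEADING_QUAD and combine are made up for this example. Because the pattern is assembled mechanically, the double-dot slip from the hand-written version does not arise.

```python
import re

# Sample rDNS names given as input in the message above.
SAMPLES = [
    "178.183.237.0.dsl.dynamic.eranet.pl",
    "183.246.69.111.dynamic.snap.net.nz",
    "188.146.109.136.nat.umts.dynamic.eranet.pl",
]

# Splits a sample into its leading dotted quad and the remaining suffix.
LEADING_QUAD = re.compile(r"^(?:\d{1,3}\.){4}(.+)$")


def combine(samples):
    """Build one anchored expression covering every sample's suffix."""
    suffixes = []
    for sample in samples:
        m = LEADING_QUAD.match(sample)
        if m is None:
            raise ValueError("no leading dotted quad: " + sample)
        if m.group(1) not in suffixes:
            suffixes.append(m.group(1))
    # re.escape protects the literal dots in each suffix.
    alternation = "|".join(re.escape(s) for s in suffixes)
    return r"^(?:[0-9]{1,3}\.){4}(?:" + alternation + r")$"


pattern = combine(SAMPLES)
combined = re.compile(pattern)
print(pattern)
print(all(combined.match(s) for s in SAMPLES))
print(combined.match("1.1.1.1.not-found"))
```

Whether the combined form is actually faster in a given setup is the open question in the thread, so it would still need to be measured.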
Re: regular expressions was: Kernel Oops
on Wed, Mar 09, 2011 at 12:03:27AM +0100, mouss wrote:
> [WARNING: Steven CC'd]

:-)

> On 08/03/2011 21:29, Stan Hoeppner wrote:
> > That makes me wonder why Enemies List[1] uses complex expressions, each one precisely matching a specific rDNS pattern, given EL matches 65k+ patterns total.

Eh, it varies quite a bit, some of them are complex groups like this:

[0-9]+\-[0-9]+\-[0-9]+\-[0-9]+\.dynamic\.(brasov|craiova|fagaras|resita|sfantugheorghe|victoria|zarnesti)\.rdsnet\.ro

because for whatever reason I can't just use a [0-9a-z\-]+ in place of the group, or because they just grew over time as I saw more hosts. But some are relatively simple:

[0-9a-z\-]+\-[0-9]+\.fiberlink\.[a-z]+\.rdsnet\.ro

wherever I can get away with it. You have to be careful with blanket alphanumeric token host parts, because sometimes you're matching a city or town or state or abbreviation and everything's fine, and then the ISP starts putting 'mail' or 'static' in that token's position in a similar hostname and suddenly you're blocking more than residential dynamic cable modems. :-/ eg

[0-9]+\-[0-9]+\-[0-9]+\-[0-9]+\.mail[0-9]+\.fft\.com\.au
[0-9]+\-[0-9]+\-[0-9]+\-[0-9]+\.mail\.eletti\.com\.br
[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\.mail\.sistemairis\.com\.br

I haven't really tried to optimize the regular expressions, because of the way our library processes them - by walking down a tree from '.' (so, '.' - ro - rdsnet - all the patterns for rdsnet.ro) - so perf is acceptable (several hundred thousand matches/sec on decent hardware; ~225K lookups/s on my old Macbook via C program).

Oh, and we're long past 65K - last build was 74494 patterns. I keep forgetting to update the Web site. :-)

> as said above, the goal isn't performance (to improve performance, buy better hardware or run multiple instances).

Well, no, the goal is acceptable performance, but also manageable update mechanisms that allow for rapid correction of FP classifications.

> The goal of Steven is to maximize hit rate while minimizing false positives. many of us have created rules to block generic/dynamic/silly senders. when doing so, you can start by being precise at the risk of doing a lot of work because your rules minimise FPs, or go the other way by using expressions that block a lot of senders including legitimate ones, that is increasing the FP rate. it takes time and effort to get a good balance, and that's what Steven's work is about.

Yup. And it took me a few months to really understand that the useful concept of a 'generic' hostname unfortunately also applied to large mail farms that we wanted mail from. (Now we track 'outmx' patterns, too, and they account for around an eighth of all the patterns we have. Same goes for 'webhost' - we mostly just see phishing scams from most of them, but when you're analyzing someone's mailflow it helps to be able to tell them which of their mail is coming from legit or quasi-legit mail sources.)

I used to have a few hundred compact expressions, like this, which were left-anchored but not fully qualified:

%compact = (
'duN' => 'du[0-9]+',
'dynN' => 'dyn[0-9]+',
'pppN' => 'ppp[0-9]+',
'N-N-N' => '[0-9]+\-[0-9]+\-[0-9]+',
'dhcpH' => 'dhcp[0-9a-f]+',
'dhcpN' => 'dhcp[0-9]+',
'dialN' => 'dial[0-9]+',
'duN-N' => 'du[0-9]+\-[0-9]+',
'dyn-N' => 'dyn\-[0-9]+',
'portN' => 'port[0-9]+',
'ppp-N' => 'ppp\-[0-9]+',
'dhcp-N' => 'dhcp\-[0-9]+',
'dial-N' => 'dial\-[0-9]+',
'dialup' => 'dialup',
'du-N-N' => 'du\-[0-9]+\-[0-9]+',
'dynN-N' => 'dyn[0-9]+\-[0-9]+',
'port-N' => 'port\-[0-9]+',
[...]

but frankly the FP rate was so awful I ditched them. And not just because of silly people like whoever set up Marriott's reservations transactional servers with names like host184.marriott.com, but they were one very big reason why I ditched them.

> [snip]
> > > If you must match a very large number of patterns, you need an implementation that transforms N patterns into one deterministic automaton. This can match 1 pattern in the same time as N patterns. Once the automaton is built (which takes some time) it is blindingly fast. An example of such an implementation is flex.
> >
> > This sounds really interesting. Do you have a link to info about this flex software? I'd like to read about it.

Oh, that was what we tried first. Matt Sergeant wrote a perl wrapper around a hunk of C object code that we generated using re2c. Worked fine, you feed it regexes, it generates C code, you compile it into an object and call it from a simple perl DNS server, voila. That was how I provided the first instance of the Enemieslist via DNSBL, for a year or so, on a Mac Mini. As far as the code went, it worked great. Unfortunately, it took almost an hour to compile, and that was back when I only had a few thousand patterns. Oh, and you had to recompile every
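The tree walk mentioned earlier in this message ('.' - ro - rdsnet - patterns for rdsnet.ro) might look roughly like the Python sketch below. This is a guess at the general shape, not the Enemieslist library: the class SuffixTree, its methods, and the choice to match each pattern against the whole hostname are all assumptions, and the two test hostnames are invented.

```python
import re


class SuffixTree:
    """Hypothetical index of patterns keyed by domain labels, right to left."""

    def __init__(self):
        self.root = {"children": {}, "patterns": []}

    def add(self, literal_suffix, regex):
        # Walk (or create) one node per label: 'ro', then 'rdsnet', ...
        node = self.root
        for label in reversed(literal_suffix.split(".")):
            node = node["children"].setdefault(
                label, {"children": {}, "patterns": []})
        node["patterns"].append(re.compile(regex))

    def lookup(self, host):
        """Return (matching pattern text or None, patterns tried)."""
        host = host.lower()
        node = self.root
        tried = 0
        for label in reversed(host.split(".")):
            node = node["children"].get(label)
            if node is None:
                break  # nothing is indexed below this point
            for pattern in node["patterns"]:
                tried += 1
                if pattern.fullmatch(host):
                    return pattern.pattern, tried
        return None, tried


tree = SuffixTree()
# Both expressions are quoted from the message; the index key is their literal tail.
tree.add("rdsnet.ro",
         r"[0-9]+\-[0-9]+\-[0-9]+\-[0-9]+\.dynamic\.(brasov|craiova|fagaras|resita"
         r"|sfantugheorghe|victoria|zarnesti)\.rdsnet\.ro")
tree.add("rdsnet.ro", r"[0-9a-z\-]+\-[0-9]+\.fiberlink\.[a-z]+\.rdsnet\.ro")

print(tree.lookup("10-20-30-40.dynamic.brasov.rdsnet.ro"))
print(tree.lookup("mail.example.com"))
```

A hostname under an unrelated domain leaves the tree at the first label and never touches a regular expression, which fits the remark that the individual expressions did not need tuning.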
Re: regular expressions was: Kernel Oops
On 3/8/2011 6:00 PM, Erik de Castro Lopo wrote:
> Noel Jones wrote:
> > The pattern length limit is controlled by the pcre library you're using. I think most implementations limit single expressions to 64k characters.
>
> Obviously something that needs testing.

Many years ago I worked on a system with a 32k limit on pcre expressions. Ever since then, everything I've checked has been 64k, and then I gave up checking. I expect any non-ancient system will support 64k, and some maybe even more.

(To clarify for others following along, this is a characters per single expression limit, not a filesize or number of expressions per file limit)

> Consider the input string '123-234-32-12.whatever' and now compare matching against three rules:
>
> /^([0-9]{1,3}\.){4}foo$/
> /^([0-9]{1,3}\.){4}bar$/
> /^([0-9]{1,3}\.){4}baz$/
>
> In this case, there will be three attempts (one on each pattern) that fail on the fourth character ('-') of the input string. That means that to fail all three patterns, there will be 12 character comparisons.
>
> Now compare that against:
>
> /^([0-9]{1,3}\.){4}(foo|bar|baz)$/
>
> which will again fail on the fourth character, but there is only one pattern, which matches the same strings as the 3 patterns above.

In this example it is pretty easy to see that combining is better. It's not so clear whether 32k of complex gibberish will actually operate faster, as there may be significant startup times. YMMV and all that.

BTW, with pcre you should use the non-capturing form, (?:...), for parentheses if you're not doing $n substitutions. This saves another smidgen of time and memory.

/^(?:[0-9]{1,3}\.){4}(?:foo|bar|baz)$/

--
Noel Jones
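The difference between the two forms is easy to see in any Perl-style engine. The sketch below uses Python's re module as a stand-in for PCRE (an assumption; the (?:...) syntax is the same in both). It shows that the two expressions accept and reject the same strings, and that only the first one records groups for later $n style references.

```python
import re

# Capturing parentheses: the engine stores what each group matched.
capturing = re.compile(r"^([0-9]{1,3}\.){4}(foo|bar|baz)$")

# Non-capturing parentheses: same grouping, nothing is stored.
non_capturing = re.compile(r"^(?:[0-9]{1,3}\.){4}(?:foo|bar|baz)$")

for text in ["1.2.3.4.foo", "10.20.30.40.baz", "123-234-32-12.whatever"]:
    # Both forms agree on every input.
    print(text, bool(capturing.match(text)), bool(non_capturing.match(text)))

print(capturing.groups, non_capturing.groups)   # number of capture groups
print(capturing.match("1.2.3.4.foo").groups())  # a repeated group keeps its last repetition
```

The saving comes from not having to record group boundaries during matching, so it only applies when the action text does not refer back to the groups.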
Re: regular expressions was: Kernel Oops
Noel Jones wrote:
> Many years ago I worked on a system with a 32k limit on pcre expressions. Ever since then, everything I've checked has been 64k, and then I gave up checking. I expect any non-ancient system will support 64k, and some maybe even more.
>
> (To clarify for others following along, this is a characters per single expression limit, not a filesize or number of expressions per file limit)

Thanks for the info.

> > Now compare that against:
> >
> > /^([0-9]{1,3}\.){4}(foo|bar|baz)$/
> >
> > which will again fail on the fourth character, but there is only one pattern, which matches the same strings as the 3 patterns above.
>
> In this example it is pretty easy to see that combining is better.

Exactly. Fortunately this is the very common example that will very easily lend itself to this optimisation.

> It's not so clear whether 32k of complex gibberish will actually operate faster, as there may be significant startup times. YMMV and all that.

I agree completely.

> BTW, with pcre you should use the non-capturing form, (?:...), for parentheses if you're not doing $n substitutions. This saves another smidgen of time and memory.
>
> /^(?:[0-9]{1,3}\.){4}(?:foo|bar|baz)$/

Good tip, thanks.

Erik
--
Erik de Castro Lopo http://www.mega-nerd.com/
Re: Kernel Oops
mouss put forth on 3/6/2011 7:03 PM: /^.*foo/ means it starts with something followed by foo. and this is the same thing as it contains foo, which is represented by /foo/ I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? Steven being the author of the Enemies List: http://enemieslist.com/ which contains over 65,000 regexes matching FQrDNS patterns. well, you know I know these:) we all got spam from these... As with most/all dynamic ranges. 1) first use IP ranges. 2) then domains (hash/cdb) for example: .alshamil.net.ae REJECT blah blah because there is no point to try to match something like auh-b113917.alshamil.net.ae 3) then use regular expressions, but only when IPs and domains aren't the way to go. Well, you know I know these mouss. :) Have ever been locked in a certain train of thought and simply forgot to consider something related, later putting hand to forehead and saying Duh!. My mindset was focused on showing how a single PCRE can block the same number of hosts as using IP addresses in a CIDR or hash table. I just didn't consider the domain blocking aspect of hash tables at the time. That's the Duh!. I've been blocking domains with my hash table for something like 6 years now... I think some folks call this a brain fart. ;) no. IPs and domains are different things. cidr is about IPs. hash/cdb/pcre is about names. these are different things and you know that. use each as appropriate. Of course. But IPs are valid in a hash table. You can even list them by the equivalent of a /24, /16, and /8 if you like, simply by omitting the last 1, 2, or 3 octets of the dotted quad. Just as I brain farted WRT using domains in a hash table, it appears you have done the same WRT to using IP addresses in a hash table. :) I agree it makes more sense to block domains with hash/cdb and IPs with CIDR. I've been doing exactly that for 5 of the 6 years I've been running Postfix. 
The first year (maybe less) I blocked IPs with a hash table, until I joined this list and learned about CIDR tables. I'm guessing most other new Postfix OPs go through the same progression--most beginners docs returned via Google teach the hash table and nothing else. if the ISP makes it too much, then you should reduce it: .embarqhsd.net REJECT blah blah Yeah, but then you end up potentially blocking large numbers of ham servers in SOHO land, in this case *.sta.embarqhsd.net. Even in 2011 there are still hundreds of thousands or more SOHO MTAs on static IP aDSL and cable circuits with generic rDNS. I should know as I'm one of them. (Please let's not allow this to turn into yet another flame war WRT generic rDNS, real OPs rent a VPS/colo, yada yada--I'm not directing this at you mouss but to those predisposed to flog this dead, stripped to the bone, horse carcass). a better example would be /(\W\d+){4}\..*\.embarqhsd\.net$/ REJECT ... Better in what way? in the sense that this can't be represented using hash or the like. Ok. So you're not showing this PCRE above because it better matches the target rDNS string, or that the engine executes it faster or something, etc. You're simply saying don't use a PCRE for something you can match using a simpler table, such as hash/cdb. Correct? -- Stan
Re: Kernel Oops
On 2011-03-07 Stan Hoeppner wrote: mouss put forth on 3/6/2011 7:03 PM: /^.*foo/ means it starts with something followed by foo. and this is the same thing as it contains foo, which is represented by /foo/ I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? I wouldn't know what his rationale was, but Noel and mouss are certainly right. Anchoring something between wildcard matches is utterly pointless. As mouss explained above, /^.*foo/, /.*foo/ and /foo/ produce the same results. That is, unless your regexp processor implicitly anchors an expression at the beginning of the string, in which case you'd need the leading .*, but still won't need to explicitly anchor it with a ^. Regards Ansgar Wiechers -- Abstractions save us time working, but they don't save us time learning. --Joel Spolsky
Re: Kernel Oops
On 3/7/2011 4:47 AM, Stan Hoeppner wrote: I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? That's good advice when you're actually matching something. The special case of .* means, as you know, anything or nothing. There's never a case where it's necessary to explicitly match a leading or trailing anything or nothing. Consider: /^.*foo$/ match the string beginning with anything or nothing, ending with foo. can always be simplified to: /foo$/ match the string ending with foo. This works the same without the ending $ anchor (contains foo, rather than ends with foo), but helps the illustration. (In the other special case where you're using $1, $2, etc. substitution in the result, you might need some form of /^(.*foo)$/ to fill the substitution buffer, but that's about substitution, not about matching.) -- Noel Jones
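The simplification above is easy to check mechanically. A minimal sketch, again using Python's re module as a stand-in for PCRE (an assumption; the sample strings are made up and contain no newlines, which matters because . does not match a newline by default):

```python
import re

leading_wildcard = re.compile(r"^.*foo$")   # "anything or nothing", then foo at the end
end_anchor_only = re.compile(r"foo$")       # ends with foo
contains = re.compile(r"foo")               # contains foo anywhere

samples = ["foo", "xfoo", "a.b.foo", "foobar", "bar", ""]

# Record (leading_wildcard, end_anchor_only, contains) for each sample.
results = {
    s: (
        bool(leading_wildcard.search(s)),
        bool(end_anchor_only.search(s)),
        bool(contains.search(s)),
    )
    for s in samples
}

for s, (with_wildcard, without_wildcard, _) in results.items():
    # The leading ^.* never changes the outcome.
    assert with_wildcard == without_wildcard
```

The third column differs from the first two only for "foobar", which contains foo but does not end with it; that is the difference the trailing $ makes.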
Re: Kernel Oops
Noel Jones put forth on 3/7/2011 7:00 AM: On 3/7/2011 4:47 AM, Stan Hoeppner wrote: I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? That's good advice when you're actually matching something. Ok, so if I'm doing what I've heard called a fully qualified regular expression, WRT FQrDNS matching, should I use the anchors or not? postmap -q says these all work (the actuals with action and text that is). /^(\d{1,3}-){3}\d{1,3}\.dynamic\.chello\.sk$/ /^(\d{1,3}\.){4}dsl\.dyn\.forthnet\.gr$/ /^(\d{1,3}-){4}adsl-dyn\.4u\.com\.gh$/ /^[\d\w]{8}\.[\w]{2}-[\d]-[\d\w]{2}\.dynamic\.ziggo\.nl$/ /^(\d{1,3}\.){4}dynamic\.snap\.net\.nz$/ /^pppoe-dyn(-\d{1,3}){4}\.kosnet\.ru$/ The special case of .* means, as you know, anything or nothing. There's never a case where it's necessary to explicitly match a leading or trailing anything or nothing. What of the case where you want to match something in the middle of the input string, with extra junk on both ends? Consider: /^.*foo$/ match the string beginning with anything or nothing, ending with foo. can always be simplified to: /foo$/ match the string ending with foo. This works the same without the ending $ anchor (contains foo, rather than ends with foo), but helps the illustration. So, in my examples above, given we're matching rDNS patterns, are the anchors necessary, or helpful? If not using them means contains, then they should still match. What advantage is there to using the anchors when matching rDNS patterns? Any? (In the other special case where you're using $1, $2, etc. substitution in the result, you might need some form of /^(.*foo)$/ to fill the substitution buffer, but that's about substitution, not about matching.) Thank you for the continuing PCRE education Noel, and Ansgar. :) -- Stan
Re: Kernel Oops
On 3/7/2011 8:13 AM, Stan Hoeppner wrote: Noel Jones put forth on 3/7/2011 7:00 AM: On 3/7/2011 4:47 AM, Stan Hoeppner wrote: I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? That's good advice when you're actually matching something. Ok, so if I'm doing what I've heard called a fully qualified regular expression, WRT FQrDNS matching, should I use the anchors or not? postmap -q says these all work (the actuals with action and text that is). /^(\d{1,3}-){3}\d{1,3}\.dynamic\.chello\.sk$/ /^(\d{1,3}\.){4}dsl\.dyn\.forthnet\.gr$/ /^(\d{1,3}-){4}adsl-dyn\.4u\.com\.gh$/ /^[\d\w]{8}\.[\w]{2}-[\d]-[\d\w]{2}\.dynamic\.ziggo\.nl$/ /^(\d{1,3}\.){4}dynamic\.snap\.net\.nz$/ /^pppoe-dyn(-\d{1,3}){4}\.kosnet\.ru$/ In these examples, you're explicitly matching something at the start and/or end of the string. Using the anchors is correct and recommended. The special case of .* means, as you know, anything or nothing. There's never a case where it's necessary to explicitly match a leading or trailing anything or nothing. What of the case where you want to match something in the middle of the input string, with extra junk on both ends? If you're looking for a string that contains foo anywhere, simply /foo/ with no anchors. Consider: /^.*foo$/ match the string beginning with anything or nothing, ending with foo. can always be simplified to: /foo$/ match the string ending with foo. This works the same without the ending $ anchor (contains foo, rather than ends with foo), but helps the illustration. So, in my examples above, given we're matching rDNS patterns, are the anchors necessary, or helpful? If not using them means contains, then they should still match. What advantage is there to using the anchors when matching rDNS patterns? Any? You use anchors to reduce the chance of a false positive. A side benefit is improved performance. 
Any pattern that matches with the anchors will still match without the anchors, but may match additional input that you don't intend to match. In the case of the rDNS patterns, a FP is unlikely (but possible, more so with the shorter patterns). In other cases, such as matching a short bare domain name, a FP may be very likely without anchors. Best practice is to use the anchors when you can, i.e. when what you're matching will always be at the beginning and/or end of the input string. Never use ^.* or .*$. -- Noel Jones
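A small illustration of the false-positive point, using one of the rDNS patterns quoted earlier in the thread. This is a sketch with Python's re module standing in for PCRE (an assumption); the second host name is hypothetical and was constructed to embed the first one.

```python
import re

body = r"(\d{1,3}\.){4}dynamic\.snap\.net\.nz"
anchored = re.compile("^" + body + "$")
unanchored = re.compile(body)

# A name of the kind the rule is meant to catch.
intended = "183.246.69.111.dynamic.snap.net.nz"
# Hypothetical name that merely contains the same text in the middle.
unintended = "mx.183.246.69.111.dynamic.snap.net.nz.example.com"

def hits(pattern, name):
    """Return True if the pattern matches anywhere in name."""
    return bool(pattern.search(name))

for name in (intended, unintended):
    print(name, hits(anchored, name), hits(unanchored, name))
```

Both variants catch the intended name; only the unanchored one also catches the constructed name, which is the extra input Noel is warning about.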
Re: Kernel Oops
Noel Jones put forth on 3/7/2011 9:49 AM: On 3/7/2011 8:13 AM, Stan Hoeppner wrote: Noel Jones put forth on 3/7/2011 7:00 AM: On 3/7/2011 4:47 AM, Stan Hoeppner wrote: I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? That's good advice when you're actually matching something. Ok, so if I'm doing what I've heard called a fully qualified regular expression, WRT FQrDNS matching, should I use the anchors or not? postmap -q says these all work (the actuals with action and text that is). /^(\d{1,3}-){3}\d{1,3}\.dynamic\.chello\.sk$/ /^(\d{1,3}\.){4}dsl\.dyn\.forthnet\.gr$/ /^(\d{1,3}-){4}adsl-dyn\.4u\.com\.gh$/ /^[\d\w]{8}\.[\w]{2}-[\d]-[\d\w]{2}\.dynamic\.ziggo\.nl$/ /^(\d{1,3}\.){4}dynamic\.snap\.net\.nz$/ /^pppoe-dyn(-\d{1,3}){4}\.kosnet\.ru$/ In these examples, you're explicitly matching something at the start and/or end of the string. Using the anchors is correct and recommended. The special case of .* means, as you know, anything or nothing. There's never a case where it's necessary to explicitly match a leading or trailing anything or nothing. What of the case where you want to match something in the middle of the input string, with extra junk on both ends? If you're looking for a string that contains foo anywhere, simply /foo/ with no anchors. Consider: /^.*foo$/ match the string beginning with anything or nothing, ending with foo. can always be simplified to: /foo$/ match the string ending with foo. This works the same without the ending $ anchor (contains foo, rather than ends with foo), but helps the illustration. So, in my examples above, given we're matching rDNS patterns, are the anchors necessary, or helpful? If not using them means contains, then they should still match. What advantage is there to using the anchors when matching rDNS patterns? Any? You use anchors to reduce the chance of a false positive. A side benefit is improved performance. 
Any pattern that matches with the anchors will still match without the anchors, but may match additional input that you don't intend to match. In the case of the rDNS patterns, a FP is unlikely (but possible, more so with the shorter patterns). In other cases, such as matching a short bare domain name, a FP may be very likely without anchors. Best practice is to use the anchors when you can, i.e. when what you're matching will always be at the beginning and/or end of the input string. Never use ^.* or .*$. Excellent explanations. Thank you Noel. -- Stan
Re: Kernel Oops
On 07/03/2011 11:47, Stan Hoeppner wrote: mouss put forth on 3/6/2011 7:03 PM: /^.*foo/ means it starts with something followed by foo. and this is the same thing as it contains foo, which is represented by /foo/ I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? Steven being the author of the Enemies List: http://enemieslist.com/ which contains over 65,000 regexes matching FQrDNS patterns. well, you know I know these:) we all got spam from these... As with most/all dynamic ranges. 1) first use IP ranges. 2) then domains (hash/cdb) for example: .alshamil.net.ae REJECT blah blah because there is no point to try to match something like auh-b113917.alshamil.net.ae 3) then use regular expressions, but only when IPs and domains aren't the way to go. Well, you know I know these mouss. :) yes, but we're talking on a public list, so it's good to say it all. coz' all this stuff is archived and used in ways we can't imagine. Have you ever been locked in a certain train of thought and simply forgot to consider something related, later putting hand to forehead and saying Duh!? My mindset was focused on showing how a single PCRE can block the same number of hosts as using IP addresses in a CIDR or hash table. I just didn't consider the domain blocking aspect of hash tables at the time. That's the Duh!. I've been blocking domains with my hash table for something like 6 years now... I think some folks call this a brain fart. ;) no. IPs and domains are different things. cidr is about IPs. hash/cdb/pcre is about names. these are different things and you know that. use each as appropriate. Of course. But IPs are valid in a hash table. You can even list them by the equivalent of a /24, /16, and /8 if you like, simply by omitting the last 1, 2, or 3 octets of the dotted quad. 
Just as I brain farted WRT using domains in a hash table, it appears you have done the same WRT to using IP addresses in a hash table. :) not really. I never put IPs in hash tables. more precisely, I never mix domains and IPs. be it just for the fact that postfix first looks up domains/hostnames before looking up IPs, which is the opposite of what I want. the /24, /16, /8 in postfix is a sendmail compat thing. something I don't need. I agree it makes more sense to block domains with hash/cdb and IPs with CIDR. I've been doing exactly that for 5 of the 6 years I've been running Postfix. The first year (maybe less) I blocked IPs with a hash table, until I joined this list and learned about CIDR tables. I'm guessing most other new Postfix OPs go through the same progression--most beginners docs returned via Google teach the hash table and nothing else. if the ISP makes it too much, then you should reduce it: .embarqhsd.net REJECT blah blah Yeah, but then you end up potentially blocking large numbers of ham servers in SOHO land, in this case *.sta.embarqhsd.net. Even in 2011 there are still hundreds of thousands or more SOHO MTAs on static IP aDSL and cable circuits with generic rDNS. I should know as I'm one of them. (Please let's not allow this to turn into yet another flame war WRT generic rDNS, real OPs rent a VPS/colo, yada yada--I'm not directing this at you mouss but to those predisposed to flog this dead, stripped to the bone, horse carcass). believe it or not, I have nothing against dynamic IPs. my approach is as follows: - whitelisted IPs get whitelisted. this includes public whitelists and local whitelists - I do not include an expression for generic rdns until I get spam - after N spam, I add an expression. well, I do check if it's ok to add a blocking rule - I do not care if it's static, .sta or whatever. as I said above, it's not about dynamic, it's about accountability. 
if I get spam from joe.example, I know I can complain to (abuse|postmaster)@joe.example. if I get junk from 1.2.3.4.largeisp.example, I know I have no right to complain, because I'm not part of the money circuit. a better example would be /(\W\d+){4}\..*\.embarqhsd\.net$/ REJECT ... Better in what way? in the sense that this can't be represented using hash or the like. Ok. So you're not showing this PCRE above because it better matches the target rDNS string, or that the engine executes it faster or something, etc. You're simply saying don't use a PCRE for something you can match using a simpler table, such as hash/cdb. Correct? yep. but that said, if you don't have performance problems, using a single map is probably better than splitting it into a pcre and a hash/cdb map. so what I said doesn't apply to _you_. it was about the example (showing a better example).
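For readers wondering how a hash table can cover a /24, /16 or /8 by omitting octets: as I understand the access(5) manual page, an indexed table is queried with the full address first and then with the last octet repeatedly stripped. The sketch below only imitates that search order in Python; the function names and the table contents are hypothetical, and this is not Postfix code.

```python
def ipv4_lookup_keys(address):
    """Return the keys tried for an IPv4 address, most specific first."""
    octets = address.split(".")
    # 4 octets, then 3, then 2, then 1.
    return [".".join(octets[:n]) for n in range(len(octets), 0, -1)]

def lookup(table, address):
    """Return the action for the first key present in table, else None."""
    for key in ipv4_lookup_keys(address):
        if key in table:
            return table[key]
    return None

# Hypothetical table: a three-octet key acts like a /24 entry.
table = {"192.0.2": "REJECT hypothetical /24 entry"}

print(ipv4_lookup_keys("192.0.2.55"))
print(lookup(table, "192.0.2.55"))
```

This also shows why a hash table cannot express prefixes that do not fall on octet boundaries, which is where CIDR tables come in.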
regex anchoring (Was: Kernel Oops)
On 07/03/2011 11:47, Stan Hoeppner wrote: mouss put forth on 3/6/2011 7:03 PM: /^.*foo/ means it starts with something followed by foo. and this is the same thing as it contains foo, which is represented by /foo/ I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? Steven being the author of the Enemies List: http://enemieslist.com/ which contains over 65,000 regexes matching FQrDNS patterns. You misunderstood what Steven meant. What Steven meant is to avoid things like /adsl/ REJECT blah so he recommends anchoring expressions, right and left: /^cpe\..*\.joe\.example$/ ... contrast this with /^cpe/ ... and /adsl/ ... which could match a lot of places you wouldn't want to match. /^.*foo/ means: starts with anything followed by foo. this is the same as contains foo, which can be represented by /foo/ and /foo.*$/ means contains foo followed by anything. this is the same as contains foo, which can be represented by /foo/ of course, I appreciate Steven and I agree with what he says here, to some extent (obviously, I'm paid by my employer so it's easy for me to push for freely available stuff). [snip]
Re: Kernel Oops
On 07/03/2011 15:13, Stan Hoeppner wrote: Noel Jones put forth on 3/7/2011 7:00 AM: On 3/7/2011 4:47 AM, Stan Hoeppner wrote: I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? That's good advice when you're actually matching something. Ok, so if I'm doing what I've heard called a fully qualified regular expression, WRT FQrDNS matching, should I use the anchors or not? postmap -q says these all work (the actuals with action and text that is). /^(\d{1,3}-){3}\d{1,3}\.dynamic\.chello\.sk$/ .dynamic.chello.sk REJECT blah blah /^(\d{1,3}\.){4}dsl\.dyn\.forthnet\.gr$/ .dyn.forthnet.gr REJECT blah blah /^(\d{1,3}-){4}adsl-dyn\.4u\.com\.gh$/ /dyn\.4u\.com\.gh$/ REJECT blah assuming you get real mail from there. otherwise .4u.com.gh REJECT blah /^[\d\w]{8}\.[\w]{2}-[\d]-[\d\w]{2}\.dynamic\.ziggo\.nl$/ ahem? I fail to see what you're trying to match here. \d is a \w, so [\d\w] is the same as \w. do you mean \W (capital letter)? anyway: .dynamic.ziggo.nl REJECT blah blah /^(\d{1,3}\.){4}dynamic\.snap\.net\.nz$/ .dynamic.snap.net.nz REJECT blah /^pppoe-dyn(-\d{1,3}){4}\.kosnet\.ru$/ /\Wdyn\W.*\.kosnet\.ru$/ REJECT blah The special case of .* means, as you know, anything or nothing. There's never a case where it's necessary to explicitly match a leading or trailing anything or nothing. What of the case where you want to match something in the middle of the input string, with extra junk on both ends? well, that's what regular expressions are about by default: /foo/ means contains foo /^foo/ means starts with foo /foo$/ means ends with foo so /^bart.*homer.*marge$/ means: starts with bart, ends with marge and somewhere between these contains homer. Consider: /^.*foo$/ match the string beginning with anything or nothing, ending with foo. can always be simplified to: /foo$/ match the string ending with foo. 
This works the same without the ending $ anchor (contains foo, rather than ends with foo), but helps the illustration. So, in my examples above, given we're matching rDNS patterns, are the anchors necessary, or helpful? If not using them means contains, then they should still match. What advantage is there to using the anchors when matching rDNS patterns? Any? (In the other special case where you're using $1, $2, etc. substitution in the result, you might need some form of /^(.*foo)$/ to fill the substitution buffer, but that's about substitution, not about matching.) Thank you for the continuing PCRE education Noel, and Ansgar. :)
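Two of mouss's remarks above can be checked in a few lines: that [\d\w] adds nothing over \w, and what /^bart.*homer.*marge$/ accepts. A minimal sketch with Python's re module standing in for PCRE (an assumption); re.ASCII is used so that \d and \w cover only ASCII characters, and the test strings are made up.

```python
import re
import string

with_digit_class = re.compile(r"[\d\w]", re.ASCII)
word_only = re.compile(r"\w", re.ASCII)

# Compare the two classes character by character over printable ASCII.
same = all(
    bool(with_digit_class.fullmatch(ch)) == bool(word_only.fullmatch(ch))
    for ch in string.printable
)
print(same)

# Starts with bart, ends with marge, contains homer somewhere in between.
simpsons = re.compile(r"^bart.*homer.*marge$")
print(bool(simpsons.search("bart-and-homer-and-marge")))
print(bool(simpsons.search("homer-bart-marge")))
```

The second test string contains all three names but does not start with bart, so the anchored pattern rejects it.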
Re: Kernel Oops
It is necessary to consider the option parent_domain_matches_subdomains = On Tuesday 08 March 2011 at 00:45 +0100, mouss wrote: On 07/03/2011 15:13, Stan Hoeppner wrote: Noel Jones put forth on 3/7/2011 7:00 AM: On 3/7/2011 4:47 AM, Stan Hoeppner wrote: I was taught to always start my expressions with /^ and end them with $/. Why did Steven teach me to do this if it's not necessary? That's good advice when you're actually matching something. Ok, so if I'm doing what I've heard called a fully qualified regular expression, WRT FQrDNS matching, should I use the anchors or not? postmap -q says these all work (the actuals with action and text that is). /^(\d{1,3}-){3}\d{1,3}\.dynamic\.chello\.sk$/ .dynamic.chello.sk REJECT blah blah /^(\d{1,3}\.){4}dsl\.dyn\.forthnet\.gr$/ .dyn.forthnet.gr REJECT blah blah /^(\d{1,3}-){4}adsl-dyn\.4u\.com\.gh$/ /dyn\.4u\.com\.gh$/ REJECT blah assuming you get real mail from there. otherwise .4u.com.gh REJECT blah /^[\d\w]{8}\.[\w]{2}-[\d]-[\d\w]{2}\.dynamic\.ziggo\.nl$/ ahem? I fail to see what you're trying to match here. \d is a \w, so [\d\w] is the same as \w. do you mean \W (capital letter)? anyway: .dynamic.ziggo.nl REJECT blah blah /^(\d{1,3}\.){4}dynamic\.snap\.net\.nz$/ .dynamic.snap.net.nz REJECT blah /^pppoe-dyn(-\d{1,3}){4}\.kosnet\.ru$/ /\Wdyn\W.*\.kosnet\.ru$/ REJECT blah The special case of .* means, as you know, anything or nothing. There's never a case where it's necessary to explicitly match a leading or trailing anything or nothing. What of the case where you want to match something in the middle of the input string, with extra junk on both ends? well, that's what regular expressions are about by default: /foo/ means contains foo /^foo/ means starts with foo /foo$/ means ends with foo so /^bart.*homer.*marge$/ means: starts with bart, ends with marge and somewhere between these contains homer. Consider: /^.*foo$/ match the string beginning with anything or nothing, ending with foo. 
can always be simplified to: /foo$/ match the string ending with foo. This works the same without the ending $ anchor (contains foo, rather than ends with foo), but helps the illustration. So, in my examples above, given we're matching rDNS patterns, are the anchors necessary, or helpful? If not using them means contains, then they should still match. What advantage is there to using the anchors when matching rDNS patterns? Any? (In the other special case where you're using $1, $2, etc. substitution in the result, you might need some form of /^(.*foo)$/ to fill the substitution buffer, but that's about substitution, not about matching.) Thank you for the continuing PCRE education Noel, and Ansgar. :) -- gpg --keyserver pgp.mit.edu --recv-key 092164A7 http://pgp.mit.edu:11371/pks/lookup?op=getsearch=0x092164A7
Re: Kernel Oops
On Fri, Mar 04, 2011 at 03:43:11PM +0300, Denis Shulyaka wrote: Mar 4 14:46:29 shulyaka kern.alert kernel: CPU 0 Unable to handle kernel paging request at virtual address 0050, epc == 800fbdb4, ra == 800fbdf8 This kernel is broken beyond repair. Get a fixed one. Mar 4 14:46:29 shulyaka kern.warn kernel: Tainted: G D This is _not_ the first oops in the log. Bastian -- Emotions are alien to me. I'm a scientist. -- Spock, This Side of Paradise, stardate 3417.3
Re: Kernel Oops
On Sat, Mar 05, 2011 at 06:24:57PM +0300, Denis Shulyaka wrote: If I pass change `fsspace(., fsbuf);' to `fsspace(/, fsbuf);' it works, no oopses, and the messages are received without problems. I will make some stress tests later. So the remaining question is what . in smtpd context mean? Is it the dir postfix has been started from? Services spawned from master.cf run with cwd == $queue_directory (typically /var/spool/postfix). -- Viktor.
Re: Kernel Oops
Hi Viktor, You are right, for some reason my system has some trouble with fsspace(/var/spool/postfix, fsbuf). Possibly, Bastian is right about my kernel. But I just don't know how to fix it. Anyway, the Postfix code is OK, and the workaround with `fsspace(/overlay, fsbuf)` satisfies me so far. Best regards, Denis Shulyaka 2011/3/6 Victor Duchovni victor.ducho...@morganstanley.com: On Sat, Mar 05, 2011 at 06:24:57PM +0300, Denis Shulyaka wrote: If I pass change `fsspace(., fsbuf);' to `fsspace(/, fsbuf);' it works, no oopses, and the messages are received without problems. I will make some stress tests later. So the remaining question is what . in smtpd context mean? Is it the dir postfix has been started from? Services spawned from master.cf run with cwd == $queue_directory (typically /var/spool/postfix). -- Viktor.
Re: Kernel Oops
Hi Viktor, I have tried both statfs() and statvfs() and they show similar behaviour. 2011/3/6 Victor Duchovni victor.ducho...@morganstanley.com: The fsspace function is a Postfix utility function, the underlying system interface is either statfs() or statvfs(). You should find out which is used on your system and test that... -- Viktor.
Re: Kernel Oops
Victor Duchovni: The fsspace function is a Postfix utility function, the underlying system interface is either statfs() or statvfs(). You should find out which is used on your system and test that... Denis Shulyaka: I have tried both statfs() and statvfs() and they show similar behaviour. Postfix uses statfs/statvfs as part of a safety net. If you delete the call, then Postfix would waste more bandwidth receiving mail that it can't store. However, if statfs/statvfs are broken, then there are likely to be more problems. I would recommend against using the file system for the email queue. Wietse
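For anyone who wants to exercise statvfs() on a suspect file system without involving Postfix at all, a few lines of Python will do, since os.statvfs() calls the same system interface on Unix. This is a minimal sketch; the helper names and the 1.5 headroom factor are illustrative assumptions and this is not how Postfix's fsspace() is written.

```python
import os

def free_bytes(path):
    """Bytes available to unprivileged users on the file system holding path."""
    st = os.statvfs(path)
    # f_bavail counts free blocks in units of f_frsize bytes.
    return st.f_bavail * st.f_frsize

def has_room(path, message_size, headroom=1.5):
    """Safety-net style check: is there room for a message plus some slack?"""
    return free_bytes(path) >= message_size * headroom

# "." stands in for the queue directory; on the system discussed in this
# thread that would be /var/spool/postfix.
print(free_bytes("."))
print(has_room(".", 10 * 1024 * 1024))
```

If a call like this triggers the same oops when pointed at the queue directory, that would point at the kernel or file system rather than at Postfix.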
Re: Kernel Oops
Wietse Venema put forth on 3/6/2011 3:29 PM: Postfix uses statfs/statvfs as part of a safety net. If you delete the call, then Postfix would waste more bandwidth receiving mail that it can't store. However, if statfs/statvfs are broken, then there are likely to be more problems. I would recommend against using the file system for the email queue. ^ What?!?!? What?! Seeing you state this Wietse prompts me to run for the bomb shelter, for the world as we know it will soon end. :) Would that not make his only other option, assuming he sticks with his current kernel, a ramdisk? In a scenario where the target machine has only 64MB RAM? And considering you've expended countless keystrokes over the years telling OPs to _never_ _ever_ put the queue on a ramdisk? Or, are you suggesting, in a creative Wietse'esque dead pan humorous way, that he fix the problem with his current kernel, as I did far back in this thread, and others have since? -- Stan
Re: Kernel Oops
Wietse: However, if statfs/statvfs are broken, then there are likely to be more problems. I would recommend against using the file system for the email queue. Instead, use a better file system. Wietse
Re: Kernel Oops
On 05/03/2011 00:18, Stan Hoeppner wrote: lst_ho...@kwsoft.de put forth on 3/4/2011 3:33 PM: BTW, is there any how-to for getting the least possible memory footprint for Postfix? - don't use regex/pcre maps This isn't necessarily true, is it? In some cases I would think it's dramatically reversed in favor of PCRE tables (unless the Postfix PCRE processing code overhead eats up a massive amount of memory). For example, with the following single PCRE I can block a few million, literally, residential hosts in the Centurylink (formerly Embarq) consumer broadband aDSL network: /^.*\.(dyn|dhcp)\.embarqhsd\.net$/ REJECT Please use ISP relay you can simplify that: /\.(dyn|dhcp)\.embarqhsd\.net$/ REJECT Please use ISP relay more generally /^.* is never needed. anyway, this example is too simple and can be replaced with 2 cdb entries: .dyn.embarqhsd.net REJECT ... .dhcp.embarqhsd.net REJECT ... a better example would be /(\W\d+){4}\..*\.embarqhsd\.net$/ REJECT ... To do this with a CIDR would take at least 100 entries to cover all the subnets, probably many many more, due to the way they assign blocks by state, and rDNS by customer type, with (dyn|dhcp|sta) all existing within each of the top level parents. To do this with a hash table would require multiple hundreds of entries as you'd be limited to using /24s.
Re: Kernel Oops
Hi all, I have investigated the problem a little, and here are some results: First of all, it has nothing to do with memory consumption. smtpd crashes on statfs() in the fsspace() function, which is called from smtpd_check_queue() to check the available free space on the current filesystem for the queue. In the suggested System.map file the closest entry is 'alloc_page_buffers'. The default_process_limit, qmgr_message_active_limit and qmgr_message_recipient_limit tweaks have no effect at all. Any thoughts why statfs() may trigger a kernel oops? Best regards, Denis Shulyaka 2011/3/4 Wietse Venema wie...@porcupine.org: Wietse: Postfix asks the kernel for memory. If the kernel oopses and crashes Postfix, then that can't be fixed by changing Postfix. Denis Shulyaka: How much memory does smtpd need to receive a message, approximately? Can I tweak this value somehow? First, you can't run Postfix on a kernel that oopses and sends signal 11 when Postfix asks for memory. It should report the memory shortage to Postfix instead. The amount of memory depends on libc, and on what else you linked into Postfix: OpenSSL, PCRE, LDAP, and so on quickly add up to the memory footprint. The biggest tweak is reducing default_process_limit by a factor 10 or more. Other tweaks are reducing qmgr_message_active_limit and qmgr_message_recipient_limit by a factor 10 or more. Wietse
Re: Kernel Oops
Well, I found it! If I change `fsspace(., fsbuf);' to `fsspace(/, fsbuf);' it works, no oopses, and the messages are received without problems. I will make some stress tests later. So the remaining question is what does . mean in smtpd context? Is it the dir postfix has been started from? 2011/3/5 Denis Shulyaka shuly...@gmail.com: Hi all, I have investigated the problem a little, and here are some results: First of all, it has nothing to do with memory consumption. smtpd crashes on statfs() in the fsspace() function, which is called from smtpd_check_queue() to check the available free space on the current filesystem for the queue. In the suggested System.map file the closest entry is 'alloc_page_buffers'. The default_process_limit, qmgr_message_active_limit and qmgr_message_recipient_limit tweaks have no effect at all. Any thoughts why statfs() may trigger a kernel oops? Best regards, Denis Shulyaka
Re: Kernel Oops
mouss put forth on 3/5/2011 7:20 AM: On 05/03/2011 00:18, Stan Hoeppner wrote: /^.*\.(dyn|dhcp)\.embarqhsd\.net$/ REJECT Please use ISP relay you can simplify that: /\.(dyn|dhcp)\.embarqhsd\.net$/ REJECT Please use ISP relay more generally /^.* is never needed. Does this expression correctly match a longer string when used as: check_reverse_client_hostname_access pcre:/etc/postfix/foo.pcre The actual FQrDNS strings in my example network will be of the form: fl-65-40-2-201.dyn.embarqhsd.net tx-67-232-101-101.dhcp.embarqhsd.net I was of the impression that a preceding wild card is required if not using fully qualified expressions, but simply trying to match only a substring at the back end of the line. anyway, this example is too simple and can be replaced with 2 cdb entries: .dyn.embarqhsd.net REJECT ... .dhcp.embarqhsd.net REJECT ... I just realized I erred in my original thought process leading to my example. I started out thinking of banning blocks of IPs, and how using a PCRE matching rDNS patterns can shrink an equivalent IP subnet hash table or CIDR table dramatically. I was strictly thinking of a hash table full of IP subnets. For some reason using host names in a hash table slipped my mind (hand to forehead). One could just as easily do this with a hash table. So yes, this wasn't the greatest example. A better example would have been an ISP that uses goofy multiple rDNS conventions, possibly due to mergers, etc, such as: 10-1-2-3.dhcp.[state-abbr].isp.net 10-2-3-4.dyn.[city-name].isp.net 10-3-4-5.res.[state-abbr].isp.net 10-4-5-6-dynamic.[city-name].isp.net etc A PCRE table would definitely have a smaller memory footprint (the current thread focus) in this example than an equivalent hash or cdb table. And doing this with a CIDR would likely be smaller than hash or cdb as well, given the number of cities and states that such an ISP would be operating in, which would kick the total number of rDNS patterns into the hundreds. 
a better example would be /(\W\d+){4}\..*\.embarqhsd\.net$/ REJECT ... Better in what way? Does this get processed using significantly less cycles or with significant memory footprint savings? Your example is incomprehensible to non regex experts (myself included). I had to hit my regex docs to understand this syntax choice. Non experts at least have a fighting chance at deciphering my original example mouss. :) Thanks in advance for the anticipated forthcoming regex education. -- Stan
Re: Kernel Oops
On 3/5/2011 9:32 AM, Stan Hoeppner wrote: mouss put forth on 3/5/2011 7:20 AM: On 05/03/2011 00:18, Stan Hoeppner wrote: /^.*\.(dyn|dhcp)\.embarqhsd\.net$/ REJECT Please use ISP relay you can simplify that: /\.(dyn|dhcp)\.embarqhsd\.net$/ REJECT Please use ISP relay more generally /^.* is never needed. Does this expression correctly match a longer string when used as: check_reverse_client_hostname_access pcre:/etc/postfix/foo.pcre (Why would the string be longer?) Regardless, it's not required to anchor the beginning because it's anchored at the end. The actual FQrDNS strings in my example network will be of the form: fl-65-40-2-201.dyn.embarqhsd.net tx-67-232-101-101.dhcp.embarqhsd.net I was of the impression that a preceding wild card is required if not using fully qualified expressions, but simply trying to match only a substring at the back end of the line. A wildcard anchored to the beginning (or the end) is always useless -- think about it a minute and you'll see why. a better example would be /(\W\d+){4}\..*\.embarqhsd\.net$/ REJECT ... Better in what way? This example shows something that would be impossible to reproduce in a hash/cdb table. -- Noel Jones
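For those who, like Stan, had to reach for the regex docs: mouss's pattern reads as "four groups of one non-word character followed by digits, then a dot, then anything, then .embarqhsd.net at the end". The sketch below runs it against the two host names quoted above, using Python's re module as a stand-in for PCRE (an assumption); the third host name is hypothetical.

```python
import re

# (\W\d+){4}  four times: a non-word character (here "-") followed by digits
# \..*        a literal dot, then any label(s)
# \.embarqhsd\.net$   the domain, anchored at the end
pattern = re.compile(r"(\W\d+){4}\..*\.embarqhsd\.net$")

hosts = [
    "fl-65-40-2-201.dyn.embarqhsd.net",       # from the thread
    "tx-67-232-101-101.dhcp.embarqhsd.net",   # from the thread
    "mail.embarqhsd.net",                     # hypothetical, no embedded address
]

matches = {host: bool(pattern.search(host)) for host in hosts}
for host, matched in matches.items():
    print(host, matched)
```

The condition "the name embeds four numeric fields" depends on the shape of the host name rather than on a fixed parent domain, which is the part a hash/cdb key cannot express.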
Kernel Oops
Hi list! I'm trying to run postfix on my OpenWrt system. I have successfully compiled it and now I can send mails, but when I try to receive a mail, smtpd crashes and I can see this in the system log: Mar 4 14:46:29 shulyaka mail.info postfix/smtpd[18020]: connect from mail-bw0-f52.google.com[209.85.214.52] Mar 4 14:46:29 shulyaka kern.alert kernel: CPU 0 Unable to handle kernel paging request at virtual address 0050, epc == 800fbdb4, ra == 800fbdf8 Mar 4 14:46:29 shulyaka mail.warn postfix/master[16781]: warning: process /usr/libexec/postfix/smtpd pid 18020 killed by signal 11 Mar 4 14:46:29 shulyaka mail.warn postfix/master[16781]: warning: /usr/libexec/postfix/smtpd: bad command startup -- throttling Mar 4 14:46:29 shulyaka kern.warn kernel: Oops[#23]: Mar 4 14:46:29 shulyaka kern.warn kernel: Cpu 0 Mar 4 14:46:29 shulyaka kern.warn kernel: $ 0 : 0001 820b3280 8012c43c Mar 4 14:46:29 shulyaka kern.warn kernel: $ 4 : 810c7e60 Mar 4 14:46:29 shulyaka kern.warn kernel: $ 8 : 0018 800643f8 802f fff4 Mar 4 14:46:29 shulyaka kern.warn kernel: $12 : f000 0001 0400 0043c994 Mar 4 14:46:29 shulyaka kern.warn kernel: $16 : 810c7e60 83577580 0003 7fcf9ec8 Mar 4 14:46:29 shulyaka kern.warn kernel: $20 : 0003 00409740 0046eaf0 004560a0 Mar 4 14:46:29 shulyaka kern.warn kernel: $24 : 0070 Mar 4 14:46:29 shulyaka kern.warn kernel: $28 : 810c6000 810c7df0 0047 800fbdf8 Mar 4 14:46:29 shulyaka kern.warn kernel: Hi: 03b8 Mar 4 14:46:29 shulyaka kern.warn kernel: Lo: 0001e74d Mar 4 14:46:29 shulyaka kern.warn kernel: epc : 800fbdb4 0x800fbdb4 Mar 4 14:46:29 shulyaka kern.warn kernel: Tainted: G D Mar 4 14:46:29 shulyaka kern.warn kernel: ra: 800fbdf8 0x800fbdf8 Mar 4 14:46:29 shulyaka kern.warn kernel: Status: 1000fc03KERNEL EXL IE Mar 4 14:46:29 shulyaka kern.warn kernel: Cause : 0088 Mar 4 14:46:29 shulyaka kern.warn kernel: BadVA : 0050 Mar 4 14:46:29 shulyaka kern.warn kernel: PrId : 00019374 (MIPS 24Kc) Mar 4 14:46:29 shulyaka kern.warn kernel: [truncated] Modules linked in: 
ums_usbat ums_sddr55 ums_sddr09 ums_karma ums_jumpshot ums_isd200 ums_freecom sch_red sch_sfq ums_datafab sch_hfsc ums_cypress cls_fw ums_alauda sch_ingress act_mirred act_connmark em_u32 ledtrig_u Mar 4 14:46:29 shulyaka kern.warn kernel: Process smtpd (pid: 18020, threadinfo=810c6000, task=82c49dc0, tls=2b7cb2f0) Mar 4 14:46:29 shulyaka kern.warn kernel: Stack : 82016dc0 0001 83b35480 83577580 810c7e60 83577580 800fbdf8 Mar 4 14:46:29 shulyaka kern.warn kernel: 80a5e000 800e4544 7fcf9ec8 800e36f8 810c7ed0 810c7e60 800fbe48 Mar 4 14:46:29 shulyaka kern.warn kernel: 80a5e000 80a5e000 800e4928 80c48300 810c7ed8 00453af8 800fbfb4 Mar 4 14:46:29 shulyaka kern.warn kernel: 83b35480 83577580 0001 82016dc0 Mar 4 14:46:29 shulyaka kern.warn kernel: Mar 4 14:46:29 shulyaka kern.warn kernel: ... Mar 4 14:46:29 shulyaka kern.warn kernel: Call Trace:[800fbdf8] 0x800fbdf8 Mar 4 14:46:29 shulyaka kern.warn kernel: [800e4544] 0x800e4544 Mar 4 14:46:29 shulyaka kern.warn kernel: [800e36f8] 0x800e36f8 Mar 4 14:46:29 shulyaka kern.warn kernel: [800fbe48] 0x800fbe48 Mar 4 14:46:29 shulyaka kern.warn kernel: [800e4928] 0x800e4928 Mar 4 14:46:29 shulyaka kern.warn kernel: [800fbfb4] 0x800fbfb4 Mar 4 14:46:29 shulyaka kern.warn kernel: [800fc0dc] 0x800fc0dc Mar 4 14:46:29 shulyaka kern.warn kernel: [8009d0c8] 0x8009d0c8 Mar 4 14:46:29 shulyaka kern.warn kernel: [800d9c84] 0x800d9c84 Mar 4 14:46:29 shulyaka kern.warn kernel: [80081744] 0x80081744 Mar 4 14:46:29 shulyaka kern.warn kernel: [80062544] 0x80062544 Mar 4 14:46:29 shulyaka kern.warn kernel: Code: afb10018 afb00014 afbf001c 8c820050 00808821 00a08021 8c420024 8c43002c 10600012 This happens every time I receive a mail. I also tried to telnet to the smtp port and found out that postfix correctly responds to HELO and crashes right after I send MAIL command. Besides that, the whole system is very stable, so I don't believe it is a hardware fault. 
Postfix version 2.8.0

# uname -r
2.6.37.1
# uname -m
mips
# free
             total       used       free     shared    buffers
Mem:         62040      48348      13692          0       5916
Swap:       524284          0     524284
Total:      586324      48348     537976

Best regards, Denis Shulyaka
Re: Kernel Oops
* Denis Shulyaka shuly...@gmail.com: Hi list! I'm trying to run postfix on my OpenWrt system. I have successfully compiled it and now I can send mails, but when I try to receive a mail, smtpd crashes and I can see this in the system log: Mar 4 14:46:29 shulyaka mail.info postfix/smtpd[18020]: connect from mail-bw0-f52.google.com[209.85.214.52] Mar 4 14:46:29 shulyaka kern.alert kernel: CPU 0 Unable to handle kernel paging request at virtual address 0050, epc == 800fbdb4, ra == 800fbdf8 Mar 4 14:46:29 shulyaka mail.warn postfix/master[16781]: warning: process /usr/libexec/postfix/smtpd pid 18020 killed by signal 11 Mar 4 14:46:29 shulyaka mail.warn postfix/master[16781]: warning: /usr/libexec/postfix/smtpd: bad command startup -- throttling Sounds like you run out of memory. But let's see what the others say... # free total used free shared buffers Mem: 62040 48348 13692 0 5916 Swap: 524284 0 524284 Total: 586324 48348 537976 Best regards, Denis Shulyaka -- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebra...@charite.de | http://www.charite.de
Re: Kernel Oops
What hardware are you running OpenWrt on?
Re: Kernel Oops
* john j...@klam.ca: What hardware are running openwrt on? Sounds like a MIPS based OpenWRT system, e.g. a WRT54g (am I correct?) -- Ralf Hildebrandt
Re: Kernel Oops
On 04/03/2011 8:58 AM, Denis Shulyaka wrote: Hi John, It's a D-Link DIR-825 router, CPU Atheros AR7161@680MHz (mips) 2011/3/4 john j...@klam.ca: What hardware are running openwrt on? I think that you are being a little ambitious; that box has 8M flash and 64M RAM. All that is necessary for the triumph of evil is that good men do nothing. (Edmund Burke)
Re: Kernel Oops
I think you should listen to the advice you were given on the OpenWRT developers forum by Philip.
Re: Kernel Oops
Hi Ralf, Thanks for the response. I think 13 MB should be more than enough for receiving a message, and I would also expect a different error message if it were a memory allocation problem. 2011/3/4 Ralf Hildebrandt ralf.hildebra...@charite.de: Sounds like you run out of memory. But let's see what the others say... # free total used free shared buffers Mem: 62040 48348 13692 0 5916 Swap: 524284 0 524284 Total: 586324 48348 537976
Re: Kernel Oops
Denis Shulyaka: Hi Ralf, Thanks for the response. I think 13 Mb should be well enough for receiving a message, and I also expect some different error message if it is a memory allocation problem. Postfix asks the kernel for memory. If the kernel oopses and crashes Postfix, then that can't be fixed by changing Postfix. Wietse
Re: Kernel Oops
Hi Wietse, How much memory does smtpd need to receive a message, approximately? Can I tweak this value somehow? 2011/3/4 Wietse Venema wie...@porcupine.org: Denis Shulyaka: Hi Ralf, Thanks for the response. I think 13 Mb should be well enough for receiving a message, and I also expect some different error message if it is a memory allocation problem. Postfix asks the kernel for memory. If the kernel oopses and crashes Postfix, then that can't be fixed by changing Postfix. Wietse
Re: Kernel Oops
Hi John, I don't agree with Philip, but the only way to prove my point is to get it running. I will need to see it myself to believe that 64M RAM + swap is not enough. 2011/3/4 john j...@klam.ca: I think you should listen to the advice you were given on the OpenWRT developers forum by Philip.
Re: Kernel Oops
On 3/4/2011 9:13 AM, Denis Shulyaka wrote: Hi John, I don't agree with Philip, but the only way to prove my point is to get it running. I will need to see it myself to believe that 64M RAM + swap is not enough.

Things to try:
- Don't use any lookup tables.
- Comment out all unused entries in master.cf.
- Set in main.cf: default_process_limit = 1

Even so, I doubt it will work. -- Noel Jones
Re: Kernel Oops
Wietse: Postfix asks the kernel for memory. If the kernel oopses and crashes Postfix, then that can't be fixed by changing Postfix. Denis Shulyaka: How much memory does smtpd need to receive a message, approximately? Can I tweak this value somehow? First, you can't run Postfix on a kernel that oopses and sends signal 11 when Postfix asks for memory. It should report the memory shortage to Postfix instead. The amount of memory depends on libc, and on what else you linked into Postfix: OpenSSL, PCRE, LDAP, and so on quickly add to the memory footprint. The biggest tweak is reducing default_process_limit by a factor of 10 or more. Other tweaks are reducing qmgr_message_active_limit and qmgr_message_recipient_limit by a factor of 10 or more. Wietse
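To make the "factor of 10" suggestion concrete, here is a minimal sketch that scales the three parameters Wietse names and prints the resulting main.cf lines. The starting values are what I understand to be the stock defaults documented in postconf(5); treat them as an assumption and check your own build with "postconf -d". The helper names are made up for this example.

```python
# Assumed stock defaults (see postconf(5)); verify with `postconf -d`.
DEFAULTS = {
    "default_process_limit": 100,
    "qmgr_message_active_limit": 20000,
    "qmgr_message_recipient_limit": 20000,
}

def scaled_settings(factor=10):
    """Divide each default by `factor`, never going below 1."""
    return {name: max(1, value // factor) for name, value in DEFAULTS.items()}

def render_main_cf(settings):
    """Format the settings as main.cf 'name = value' lines."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items())

print(render_main_cf(scaled_settings(10)))
# default_process_limit = 10
# qmgr_message_active_limit = 2000
# qmgr_message_recipient_limit = 2000
```

A larger factor fits Wietse's "or more"; with factor=100 the process limit drops to 1, the value Noel suggested earlier in the thread.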
Re: Kernel Oops
Hi Noel, Wietse, Thanks! I will try to do this and will update you with the result. Best regards, Denis Shulyaka
Re: Kernel Oops
Wietse Venema: The biggest tweak is reducing default_process_limit by a factor of 10 or more. Other tweaks are reducing qmgr_message_active_limit and qmgr_message_recipient_limit by a factor of 10 or more. And don't use Berkeley DB. Use CDB instead. Wietse
Re: Kernel Oops
On Fri, Mar 4, 2011 at 8:01 AM, Denis Shulyaka shuly...@gmail.com wrote: Thanks! I will try to do this and will update you with the result. When I read Denis' first post I thought WHAT? Postfix on a WRT54G? He's crazy! But now I'm rooting for you, Denis! I hope you get it working! :) SteveJ
Re: Kernel Oops
Ralf Hildebrandt put forth on 3/4/2011 6:53 AM: * Denis Shulyaka shuly...@gmail.com: Hi list! I'm trying to run postfix on my OpenWrt system. I have successfully compiled it and now I can send mails, but when I try to receive a mail, smtpd crashes and I can see this in the system log: Mar 4 14:46:29 shulyaka mail.info postfix/smtpd[18020]: connect from mail-bw0-f52.google.com[209.85.214.52] Mar 4 14:46:29 shulyaka kern.alert kernel: CPU 0 Unable to handle kernel paging request at virtual address 0050, epc == 800fbdb4, ra == 800fbdf8 Mar 4 14:46:29 shulyaka mail.warn postfix/master[16781]: warning: process /usr/libexec/postfix/smtpd pid 18020 killed by signal 11 Mar 4 14:46:29 shulyaka mail.warn postfix/master[16781]: warning: /usr/libexec/postfix/smtpd: bad command startup -- throttling Sounds like you run out of memory. But let's see what the others say... AFAIK OOM will throw a different error. More than likely his problem is a MIPS kernel compile issue or a problem with his RAM. Googling Unable to handle kernel paging request turns up some interesting results, this one on the first page likely being the most relevant, though 6 years old. http://www.linux-mips.org/archives/linux-mips/2004-10/msg00314.html The OP needs to follow the troubleshooting procedure in the above thread, and if he can't solve it alone, take it up on lkml. -- Stan
Re: Kernel Oops
Steve Jenkins: On Fri, Mar 4, 2011 at 8:01 AM, Denis Shulyaka shuly...@gmail.com wrote: Thanks! I will try to do this and will update you with the result. When I read Denis' first post I thought WHAT? Postfix on a WRT54G? He's crazy! But now I'm rooting for you, Denis! I hope you get it working! :) +1. It's fun to find out how small Postfix can get. Postfix has been running since late 1998 on a 64MB box, 24/7. I replaced the few parts that break, and blow out the dust once a year or so. Good hardware does not die. Wietse
Re: Kernel Oops
On 3/4/2011 2:01 PM, Wietse Venema wrote: Steve Jenkins: On Fri, Mar 4, 2011 at 8:01 AM, Denis Shulyaka shuly...@gmail.com wrote: Thanks! I will try to do this and will update you with the result. When I read Denis' first post I thought WHAT? Postfix on a WRT54G? He's crazy! But now I'm rooting for you, Denis! I hope you get it working! :) +1. It's fun to find out how small Postfix can get. Postfix has been running since late 1998 on a 64MB box, 24/7. I replaced the few parts that break, and blow out the dust once a year or so. Good hardware does not die. Wietse A cheers from this corner as well. A light just went on. Did not even realize until now the referent was an old fashioned, jailbroken blue-box Linksys router. Talk about consolidation! "Oh, that's your home router?" -- "No, corporate mailhub." Please, post a detailed blog and link to it when you're done! -Daniel
Re: Kernel Oops
Hi Daniel, Actually it's a D-Link DIR-825 with an attached USB hard drive, and it's white and stylish! 2011/3/4 Daniel Bromberg dan...@basezen.com: On 3/4/2011 2:01 PM, Wietse Venema wrote: Steve Jenkins: On Fri, Mar 4, 2011 at 8:01 AM, Denis Shulyaka shuly...@gmail.com wrote: Thanks! I will try to do this and will update you with the result. When I read Denis' first post I thought WHAT? Postfix on a WRT54G? He's crazy! But now I'm rooting for you, Denis! I hope you get it working! :) +1. It's fun to find out how small Postfix can get. Postfix has been running since late 1998 on a 64MB box, 24/7. I replaced the few parts that break, and blow out the dust once a year or so. Good hardware does not die. Wietse A cheers from this corner as well. A light just went on. Did not even realize until now the referent was an old fashioned, jailbroken blue-box Linksys router. Talk about consolidation! "Oh, that's your home router?" -- "No, corporate mailhub." Please, post a detailed blog and link to it when you're done! -Daniel
Re: Kernel Oops
Quoting Wietse Venema wie...@porcupine.org: Steve Jenkins: On Fri, Mar 4, 2011 at 8:01 AM, Denis Shulyaka shuly...@gmail.com wrote: Thanks! I will try to do this and will update you with the result. When I read Denis' first post I thought WHAT? Postfix on a WRT54G? He's crazy! But now I'm rooting for you, Denis! I hope you get it working! :) +1. It's fun to find out how small Postfix can get. Postfix has been running since late 1998 on a 64MB box, 24/7. I replaced the few parts that break, and blow out the dust once a year or so. Good hardware does not die. Wietse You must have solid caps, don't you? BTW, is there any how-to for getting the least possible memory footprint for Postfix? As learned, some points are:
- reduce either the global default process limit or the relevant process limits in master.cf
- use a small footprint lookup table like cdb and the least possible count of tables
- don't use regex/pcre maps
- reduce active limit for qmgr

Any other knobs/screws to adjust? Many Thanks, Andreas
Re: Kernel Oops
On Fri, Mar 04, 2011 at 10:33:30PM +0100, lst_ho...@kwsoft.de wrote: BTW, is there any how-to for getting the least possible memory footprint for Postfix. As learned some points are - reduce either the global default process limit or the relevant process limits in master.cf - use a small footprint lookup table like cdb and the least possible count of tables - don't use regex/pcre maps Nothing wrong with small regexp/pcre maps. - reduce active limit for qmgr any other knobs/screws to adjust? Use postscreen, to reduce demand for connections to the real SMTP service. Potentially compile-in fewer features (TLS, SASL, LDAP, ...), but Berkeley DB is still needed for dynamic databases (e.g. postscreen dynamic whitelist), just don't use read-only Berkeley DB tables, use CDB for that. -- Viktor.
Re: Kernel Oops
lst_ho...@kwsoft.de put forth on 3/4/2011 3:33 PM: Quoting Wietse Venema wie...@porcupine.org: Postfix has been running since late 1998 on a 64MB box, 24/7. I replaced the few parts that break, and blow out the dust once a year or so. Good hardware does not die. Wietse You must have solid caps, don't you? While film capacitors do have lifespan issues compared to solid capacitors, they can last 10-20 years if operating at a relatively low temperature, i.e. sufficient case cooling w/ system in a temp controlled environment. One of my personal servers contains an 11 year old Abit BP6 dual Celery mobo: http://www.hardwarefreak.com/web/server_pics/gallery/ A couple of caps are mildly bulging but the system is rock solid, even under burnp6 load on each CPU for 10+ minutes. -- Stan
Re: Kernel Oops
lst_ho...@kwsoft.de put forth on 3/4/2011 3:33 PM: BTW, is there any how-to for getting the least possible memory footprint for Postfix. - don't use regex/pcre maps This isn't necessarily true, is it? In some cases I would think it's dramatically reversed in favor of PCRE tables (unless the Postfix PCRE processing code overhead eats up a massive amount of memory). For example, with the following single PCRE I can block a few million, literally, residential hosts in the Centurylink (formerly Embarq) consumer broadband aDSL network: /^.*\.(dyn|dhcp)\.embarqhsd\.net$/ REJECT Please use ISP relay To do this with a CIDR would take at least 100 entries to cover all the subnets, probably many many more, due to the way they assign blocks by state, and rDNS by customer type, with (dyn|dhcp|sta) all existing within each of the top level parents. To do this with a hash table would require multiple hundreds of entries as you'd be limited to using /24s. -- Stan
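The entry-count argument can be illustrated with a little arithmetic. This is only a sketch that takes Stan's premise at face value (one hash entry per /24, versus one CIDR line per allocated block, versus a single PCRE line for the whole ISP). The address blocks below are hypothetical private-range placeholders, not Embarq's real allocations.

```python
import ipaddress

# Hypothetical allocations, for illustration only.
blocks = [
    ipaddress.ip_network("10.16.0.0/20"),
    ipaddress.ip_network("10.32.0.0/18"),
    ipaddress.ip_network("10.64.0.0/16"),
]

def count_slash24(network):
    """Number of /24 networks needed to cover `network`."""
    return sum(1 for _ in network.subnets(new_prefix=24))

cidr_entries = len(blocks)                          # one line per block
hash_entries = sum(count_slash24(b) for b in blocks)  # one line per /24
pcre_entries = 1                                    # one rDNS pattern

print(cidr_entries, hash_entries, pcre_entries)
# 3 336 1
```

The exact numbers depend entirely on the real allocations; the sketch only shows how quickly the per-/24 count grows relative to the other two table types.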