On Tue, 2009-02-24 at 12:47 -0500, Gene Heskett wrote:
> On Tuesday 24 February 2009, Karsten Bräckelmann wrote:

> > > Would this work?
> > >
> > > :0:
> > > *^*no To-header on input*
> > > /dev/null
> >
> > Nope, it wouldn't. Procmail uses REs, not shell-style globbing.
> 
> I never claimed to understand regex's.  I know the ^ anchors the start of the 
> search to the start of the line, and that the first * is needed to into a 
> recipe, but how does one go about allowing it to search the whole line for 
> the given character sequence, triggering on finding it at some arbitrary 
> location in that line?  If grep can do it, why can't procmail?

*sigh*  OK, slow down a bit. You're mixing procmail and REs here.

Yes, the very first * denotes a condition in the procmail recipe. Also
yes, ^ anchors your RE at the beginning of the string. Procmail is no
different than grep in this regard.

  grep undisclosed raw-message-headers-only.file

The input is limited to the headers only, which is the procmail default,
just to prevent grep matches in the body. Now that trivial grep will
check ALL headers -- including the Subject, for example. If you want to
limit your grep to a specific header, you need to anchor your RE.
Exactly the same with procmail. (With the notable exception that
procmail properly handles multi-line headers, which grep of course
doesn't. Add formail to the mix for that. ;)
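
To illustrate the multi-line header point, here is a minimal sketch in
Python rather than procmail or formail. The sample headers and the
unfold() helper are made up for illustration, and this is not how either
tool is implemented. It only shows why a line-by-line match misses a
string that sits on a continuation line, and that joining the folded
lines first fixes it:

```python
import re

# Hypothetical raw headers; the To header is folded onto a second line.
raw = (
    "To: first@example.com,\n"
    "\tundisclosed-recipients:;\n"
    "Subject: hello\n"
)

def unfold(header_text):
    """Join continuation lines (starting with space or tab) onto the previous line."""
    return re.sub(r"\r?\n[ \t]+", " ", header_text)

# Line-anchored RE: a line starting with "To: ", then anything, then the string.
pattern = re.compile(r"^To: .*undisclosed", re.MULTILINE)

found_raw = bool(pattern.search(raw))               # continuation line is not seen
found_unfolded = bool(pattern.search(unfold(raw)))  # header is one line again
print(found_raw, found_unfolded)
```

The first search fails because "." does not cross the line break, which
is the same limitation a plain grep has.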

So you want to match a specific header (read: the beginning of a given
line), and match a string that might appear anywhere in that header.
That's basically "whatever there is between the two". In RE, that's "any
char, any number of times". The good old /.*/, and the RE becomes:

  ^Header: .*Catch Phrase
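
A small sketch of the difference, in Python rather than procmail or
grep. The header block is invented for illustration, and Python's re is
case-sensitive by default whereas procmail is not, so take it as an
analogy only:

```python
import re

# Hypothetical header block, made up for illustration.
headers = (
    "Subject: mail for undisclosed recipients\n"
    "To: undisclosed-recipients:;\n"
    "From: someone@example.com\n"
)

# Unanchored: behaves like a plain `grep undisclosed`, hits any header line.
unanchored = re.compile(r"undisclosed")

# Anchored to the To header; ".*" skips whatever sits in between.
anchored = re.compile(r"^To: .*undisclosed")

def matching_lines(pattern, text):
    """Return the lines of text on which pattern matches."""
    return [line for line in text.splitlines() if pattern.search(line)]

unanchored_hits = matching_lines(unanchored, headers)
anchored_hits = matching_lines(anchored, headers)
print(unanchored_hits)
print(anchored_hits)
```

The unanchored RE also catches the Subject line, the anchored one only
the To header.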

Procmail, just like grep, is line oriented, so the header name and the
stuff in between have to be spelled out in the RE itself. SA rules make
the syntax for dealing with that easier. For reference, and in a vain
attempt to get back on-topic:

  header MY_CATCH_PHRASE  Header =~ /Catch Phrase/

The rule is limited to header Header, but the RE itself is not anchored
to the beginning, thus we don't need to care about stuff between them.
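
The same idea, select the header by name first and then search its value
unanchored, can be sketched in Python. This is only an analogy and not
how SA evaluates header rules; header_matches() and the sample message
are hypothetical:

```python
import re
from email import message_from_string

# Hypothetical message, made up for illustration.
raw_message = (
    "To: undisclosed-recipients:;\n"
    "Subject: nothing undisclosed here\n"
    "\n"
    "Body text.\n"
)

def header_matches(message_text, header_name, pattern):
    """True if any header named header_name has a value matching pattern."""
    msg = message_from_string(message_text)
    values = msg.get_all(header_name, [])  # empty list if the header is absent
    return any(re.search(pattern, value) for value in values)

# The RE is not anchored; the header name does the limiting.
to_hit = header_matches(raw_message, "To", r"undisclosed")
from_hit = header_matches(raw_message, "From", r"undisclosed")
print(to_hit, from_hit)
```

The Subject also contains the string, but it never comes into play,
since only the named header is looked at.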


Sorry, Gene -- this is most *basic* stuff.


> IMO the Docs suck a deep space quality vacuum in re these details.  If there 
> exists a decent tut on this subject, please point me at it.

man procmailex

For RE docs, see http://perldoc.perl.org/perlre.html, in particular the
Quick-Start and Tutorial introductions mentioned early on. Note, though,
that procmail REs differ quite a bit from Perl's in anything even mildly
advanced.


> > If you don't want to anchor your condition REs at the beginning of the
> > line, don't. IMHO you'd better do though, for multiple reasons -- speed,
> > and not to match any arbitrary header but the To header only.
> 
> Are you saying that if I remove the ^ and second *, then it will search the 
> whole header?  Testing that now...

Yes. It will search ALL headers, since you didn't specify one, and the
string can appear anywhere in the line. See above for properly limiting
it to a specific header.

FWIW, the "second *" (actually the first one in your RE) is just bad. In
an RE, * repeats whatever precedes it, and right after the ^ there is
nothing to repeat. I'd expect procmail to complain about that.


> > That said, I do agree with Martin and John. The absence of a real
> > recipient in the To header is NOT sufficient to silently discard mail.
> > Even more so, since the POP3 server appears to have rewritten that
> > stuff.
> 
> If I was an ISP, maybe.  But I'm just sick of junk mail & if I miss a free 
> offer for 20 boxes of viagra, well... :)

Or the wedding of a friend. [1] Coincidentally, I sent some mail to
Bcc-only recipients the other day. You would discard it. Good one...


Again, I *strongly* suggest scoring it a point or two in SA, if you are
so inclined. That leaves legitimate mail at least a chance to survive.

  guenther


[1] No, not me. :)

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
