On Wed, 2010-02-17 at 08:19 -0800, John Hardin wrote:
> On Wed, 17 Feb 2010, Karsten Bräckelmann wrote:
>
> >>>> rawbody STYLE_GIBBERISH /<style[^>]{0,30}>(?:\s{1,20}|[^\s:;<]){175}/im
>
> > The problem is nested quantifiers with an alternation.
> >
> > An alternative approach that should match the desired would look like
> > this -- eliminating the alternation with quantifiers inside.
> > John, does the above example help? :)
>
> Not enough. What if there are more than 20 spaces? Or no spaces in a block
> of more than 80 non-punctuation characters?

It was merely an example showing a different approach to a similar RE.
You're free to adjust the numbers -- I pretty much pulled them out of
the air anyway.

> I don't think there's any really _good_ way to do what I'm trying to do in a
> rawbody rule. I'm now thinking a plugin that pulls out specified HTML tags
> and their contents and allows rules on them is the best way to approach
> this, for example:
>
> tagbody STYLE_GIBBERISH style =~ /^[^:;]{200}/

That actually looks useful and quite elegant, but it should be easy to
convert into a simple rawbody rule, no? Without any substantial amount
of thinking about it:

  rawbody STYLE_GIBBERISH /<style[^>]*>[^:;<]{200}/im

Of course, it isn't limited to text/html parts.
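
For anyone who wants to poke at that RE outside of SpamAssassin, here is
a rough sketch in Python. It is not how rawbody rules are evaluated
internally; Python's re module merely stands in for Perl here, which
should be fine for this particular pattern. The function name and the
sample strings are made up for illustration.

```python
import re

# Python translation of the rawbody RE above. Perl's /im modifiers map to
# re.IGNORECASE | re.MULTILINE (the latter is harmless, there are no anchors).
STYLE_GIBBERISH = re.compile(r"<style[^>]*>[^:;<]{200}",
                             re.IGNORECASE | re.MULTILINE)


def looks_like_style_gibberish(raw: str) -> bool:
    """True if a <style> tag is followed by 200 chars without ':', ';' or '<'."""
    return STYLE_GIBBERISH.search(raw) is not None


# Hypothetical samples, not taken from real mail.
gibberish = ("<style type='text/css'>"
             + "lorem ipsum dolor sit amet " * 10
             + "</style>")
real_css = ("<STYLE>\nbody { color: red; margin: 0; }\n"
            + "p { font-size: 12px; }\n" * 20
            + "</STYLE>")
short_block = "<style>" + "x" * 199 + "</style>"
# The RE knows nothing about MIME types, so plain text can trigger it, too.
plain_text_part = "Someone wrote <style> in a plain text mail " + "blah " * 50

for name, sample in [("gibberish", gibberish), ("real_css", real_css),
                     ("short_block", short_block),
                     ("plain_text_part", plain_text_part)]:
    print(name, looks_like_style_gibberish(sample))
```

Real CSS hits a colon or semicolon long before 200 characters, so it
does not match, while the plain text sample does. That is the "not
limited to text/html parts" caveat in action.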
--
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}