On Wed, 2010-02-17 at 08:19 -0800, John Hardin wrote:
> On Wed, 17 Feb 2010, Karsten Bräckelmann wrote:
> 
> >>>> rawbody STYLE_GIBBERISH /<style[^>]{0,30}>(?:\s{1,20}|[^\s:;<]){175}/im
> 
> > The problem is nested quantifiers with an alternation.
> >
> > An alternative approach that should match the desired would look like
> > this -- eliminating the alternation with quantifiers inside.

> > John, does the above example help? :)
> 
> Not enough. What if there are more than 20 spaces? Or no spaces in a block 
> of more than 80 non-punctuation characters?

It was merely an example showing a different approach to a similar RE.
You're free to adjust the numbers -- I pretty much pulled them out of
the air anyway.

> I don't think there's any really _good_ way to do what I'm trying to do in a 
> rawbody rule. I'm now thinking a plugin that pulls out specified HTML tags 
> and their contents and allows rules on them is the best way to approach 
> this, for example:
> 
>    tagbody  STYLE_GIBBERISH  style =~ /^[^:;]{200}/

That actually looks useful and quite elegant, but it should be easy to
convert into a simple rawbody rule, no? Without any substantial amount
of thinking about it:

  rawbody STYLE_GIBBERISH  /<style[^>]*>[^:;<]{200}/im

Of course, it isn't limited to text/html parts.
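
For anyone who wants to poke at the pattern outside SpamAssassin, here is
a rough Python sketch of the same RE. It only exercises the regex itself,
not how SpamAssassin feeds rawbody text to a rule. The helper name and
both sample strings are made up for illustration:

```python
import re

# Python translation of the rawbody RE above; the /im flags become re.I | re.M.
STYLE_GIBBERISH = re.compile(r"<style[^>]*>[^:;<]{200}", re.I | re.M)


def is_style_gibberish(raw_body: str) -> bool:
    """True if a <style> tag is followed by 200 chars containing no ':', ';' or '<'."""
    return STYLE_GIBBERISH.search(raw_body) is not None


# Hypothetical samples: plain words stuffed into a style block vs. ordinary CSS.
gibberish = "<style type='text/css'>" + "lorem ipsum dolor " * 20 + "</style>"
real_css = "<STYLE>" + "p { color: red; margin: 0; }\n" * 20 + "</STYLE>"

print(is_style_gibberish(gibberish))  # True
print(is_style_gibberish(real_css))   # False
```

Ordinary CSS hits a ':' or ';' long before 200 characters, so the second
sample does not match. The 200 is as arbitrary here as in the rule.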


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
