Hello Stuart,

On  Thu, 21 Dec 2000  at  17:38:35 GMT +0000 (which was 9:38 AM
where I live) witnesses say Stuart Tares typed:

> I thought that it would have been a relatively easy thing to set up
> but am getting nowhere.

It takes a while.  You made a great attempt.

> I was thinking that the logic for the regexp
> would be:

> 1) Take all the text of the message

Correct.

> 2) Set everything up to the -=-=-=- as %SUBPATT0

Ah, this is your mistake.  %SUBPATT="0" is reserved for the entire
match.  So in your case you're matching everything including the sig.

> 3) Set (-=-=-=-\n\[.*\]\n\[.*\]\n) as %SUBPATT1

In your regexp, the signature is actually a higher subpattern.  I
think it is subpattern 3: the outer ((.*\n)*) is 1, the inner (.*\n)
is 2, and the signature group is 3.  Try it and see which number
gives you the signature only.

> You could then set %QUOTES to the output of %SUBPATT0.  This is what
> I came up with:

> <one Line>
> %QUOTES="%SETPATTREGEXP=""(?is)((.*\n)*)(-=-=-=-\n\[.*\]\n\[.*\]\n)""%
> REGEXPBLINDMATCH=""%TEXT""%SUBPATT=""0"""
> </one line>

That's good.  However, as I explained above, your subpattern number
is one too low.  Make it %SUBPATT=""1"" and it should work.

> This, however, gives me two copies of the text both with the
> footer/signature in it !  I am sure that I am missing something
> extremely obvious but I just can't see it.

Subpattern 0 is the entire text that is matched.  The two copies are
probably because you have forgotten to remove one of the %quotes.  You
only need the one line that you posted.

What you want is stored in subpattern 1.  The subpatterns are the
parts of the regexp in round brackets.  Count the opening brackets
from the left to find the subpattern number (the option group (?is)
doesn't count).
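
To see the numbering in action, here is a small sketch in Python
rather than in The Bat! itself.  Python's re module is only a
stand-in for The Bat!'s regexp engine (that is my assumption; the two
should agree on a pattern this simple), and the sample message is
made up:

```python
import re

# Made-up sample message; the last three lines play the list footer.
text = "Hello\nBody line\n-=-=-=-\n[list info]\n[more info]\n"

# Stuart's original pattern, unchanged.
pattern = re.compile(r"(?is)((.*\n)*)(-=-=-=-\n\[.*\]\n\[.*\]\n)")
m = pattern.match(text)

whole = m.group(0)   # like %SUBPATT="0": the entire match
body = m.group(1)    # first opening bracket: the text before the footer
footer = m.group(3)  # third opening bracket: the footer itself

print(pattern.groups)  # how many capturing brackets the pattern has
print(repr(whole))
print(repr(body))
print(repr(footer))
```

Group 0 comes back as the whole message, footer included, which is
the behaviour you were seeing.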

BTW, you can simplify the regexp to:

<one line>
%Quotes='%SETPATTREGEXP="(?is)(.*?)(\n-=-=-=-\n\[.*\])"%REGEXPBLINDMATCH="%TEXT"%SUBPATT="1"'
</one line>

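The newlines are matched by the '.' in '.*' because of the 's' in the
(?is) mode you are using.  Knowing this, the end part of the sig can
also be simplified: because '*' is greedy (see the end for a
discussion of greedy), a single \[.*\] runs across the newline to the
last ']' and so covers both bracketed lines.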

This regexp will give you everything up to the block:

-=-=-=-
[blah blah blah]
[blah blah blah]

Take a look at %SUBPATT="2" and you will see the entire signature.
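
If you want to check the simplified regexp outside The Bat!, the
sketch below runs it in Python.  Again, Python's re module is my
stand-in for The Bat!'s engine, re.search is assumed to behave like
%REGEXPBLINDMATCH here, and the sample text is invented:

```python
import re

# Made-up sample message ending in the three-line footer block.
text = "Hello\nBody line\n-=-=-=-\n[list info]\n[more info]\n"

# The simplified pattern from above.
m = re.search(r"(?is)(.*?)(\n-=-=-=-\n\[.*\])", text)

body = m.group(1)    # like %SUBPATT="1": everything before the block
footer = m.group(2)  # like %SUBPATT="2": the whole signature block

print(repr(body))
print(repr(footer))
```

Note that the ungreedy (.*?) stops just before the newline that
starts the block, so the body has no trailing blank line.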

Explanation of Greedy:

When you use a variable-length repeat operator (like *), it matches
as much text as it can while still letting the rest of the regular
expression match.  For example, if you have:

> [some text] [ some other text]

Now use the regexp "\[(.*)\]" and look at subpattern 1.  You'll see:

> some text] [ some other text

If you don't want that behaviour, you have to make * ungreedy (the
same applies to all the repeat operators).  An ungreedy repeat matches
as little text as it can while still satisfying the rest of the
regular expression.  To make a repeat ungreedy, add a '?' after the
repeat operator.  Note that '?' has two meanings depending on context:
on its own it means "zero or one", but after a repeat operator it
means "ungreedy".  If we had used the regexp "\[(.*?)\]", you'd get:

> some text
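
The same comparison can be tried in Python, used here only as a
convenient stand-in for The Bat!'s regexp engine (greedy and ungreedy
repeats work the same way in both, as far as I know):

```python
import re

text = "[some text] [ some other text]"

# Greedy: .* runs on to the LAST closing bracket.
greedy = re.search(r"\[(.*)\]", text).group(1)

# Ungreedy: .*? stops at the FIRST closing bracket.
ungreedy = re.search(r"\[(.*?)\]", text).group(1)

print(greedy)    # some text] [ some other text
print(ungreedy)  # some text
```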

 

-- 
Thanks for writing,
 Januk Aggarwal

 Using The Bat! 1.48f
 under Windows 98 4.10 Build 2222  A 

