Re: Regex help-hilfe-ajuto needed

2002-08-18 Thread Mandara

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Sat, 17 Aug 2002, at 18:25:20 -0700 Januk wrote in
mid:[EMAIL PROTECTED]:

JA On Saturday, August 17, 2002 at 21:35 GMT +0200, a stampede was
JA started when Mandara hollered:

Well, that's something... ;-)

JA I know this should really go on TBTECH, but I suppose one or two of
JA these every couple of years on TBUDL isn't so bad.

.

Januk, thanks a lot for all the useful things you wrote. I'll continue
with it on TBTECH, as you and Luc suggested.


Mandara
- --
(__) If you need this key:
('') mailto:[EMAIL PROTECTED]?subject=0x257DFF36
 \/
-BEGIN PGP SIGNATURE-

iD8DBQE9X7lsvgcu6yV9/zYRAhHEAJ9HcfE2Coe2iv9+Rc7zw4K1eB7ObACgt9qM
/GFqvgoH+cnSuEHxVaVXrbA=
=P1cy
-END PGP SIGNATURE-



 Current version is 1.61 | Using TBUDL information: 
 http://www.silverstones.com/thebat/TBUDLInfo.html



Regex help-hilfe-ajuto needed

2002-08-17 Thread Mandara

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hello,

  I need just one simple thing: extract a defined *single line* from
  message header and put it in body of reply.

  But no matter how simple that is, I couldn't do it.

  This one ^From: (.*)?$ extracts too much and doesn't stop at the
  end of line (includes entire header from the from line downward).

  This one ^From: (.*?)\.\w+\s* extracts only up to the dot in the
  address's domain name (e.g. Name username@ispname).

  Is there some elegant formula which would take only one line you
  chose by the first word in the line?

  Be patient with me, please, I am very much a beginner, and already infected
  with regexp (last night downloaded Gerd's tutorial and forgot to go
  to sleep).

Mandara
- --
(__) If you need this key:
('') mailto:[EMAIL PROTECTED]?subject=0x257DFF36
 \/
-BEGIN PGP SIGNATURE-

iD8DBQE9XqUevgcu6yV9/zYRAgeFAKCP9L4Y9o6CGS3rn1FfDEKkaV9MdgCfbDUo
o6wOSJd2puP4RTdNqrq0+84=
=pkSF
-END PGP SIGNATURE-






Re: Regex help-hilfe-ajuto needed

2002-08-17 Thread Peter Palmreuther

Hello Mandara,

On Saturday, August 17, 2002 at 9:35:24 PM you [M] wrote (at least in
part):

M   Is there some elegant formula which would take only one line you
M   chose by the first word in the line?

You nearly got it:

^From:\s*(.*?)\n

You see the difference? The question mark is _inside_ the parentheses
and it searches for the 'newline' explicitly.
This should work quite fine and have everything after 'From:' followed
by any number of white spaces but before 'new line' in sub pattern 1.
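
For anyone who wants to try the idea outside The Bat!, here is a minimal
sketch in Python. Python's re module is not the engine The Bat! uses, so
this is only an illustration. The header text is made up, and re.DOTALL
is switched on as an assumption so that the dot may cross line breaks,
which is the situation where the lazy quantifier and the explicit \n
matter.

```python
import re

# Hypothetical header block, invented for this illustration.
headers = (
    "From: Jane Doe <jane@example.com>\n"
    "To: list@example.com\n"
    "Subject: Regex help\n"
)

# DOTALL lets "." match newlines, so only the lazy ".*?" followed by an
# explicit "\n" keeps the capture on a single line.
pattern = re.compile(r"^From:\s*(.*?)\n", re.DOTALL)

match = pattern.search(headers)
sender = match.group(1) if match else None
print(sender)
```

The capture ends at the first line break because .*? grows one character
at a time and \n is tried after every step.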
-- 
Regards
Peter Palmreuther
(The Bat! v1.61 on Windows 2000 5.0 Build 2195 Service Pack 1)

Borrow money from pessimists--they don't expect it back.






Re: Regex help-hilfe-ajuto needed

2002-08-17 Thread Januk Aggarwal

Hello Mandara,

On Saturday, August 17, 2002 at 21:35 GMT +0200, a stampede was
started when Mandara hollered:

I know this should really go on TBTECH, but I suppose one or two of
these every couple of years on TBUDL isn't so bad.

> This one ^From: (.*)?$ extracts too much and doesn't stop at the
> end of line (includes entire header from the from line downward).

As Peter mentioned, you need to put the question mark inside the
brackets.  The reason is that the way you've written it, the
expression says:

 Look for zero or one occurrence of any number of any character
 after a line starting with From: .

This means you have a redundancy in your repetition operators.  What
you wanted instead is:

 Look for the fewest number of anything on one line that starts
 with From: .

To do that you need to make the any number operator (*) ungreedy.
That is done by adding a question mark *immediately* after the repeat
operator.  So, .* matches as many characters as it can, while .*?
matches as few characters as it can.
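
The difference can be checked with a small Python sketch. This is an
illustration only: Python's re module stands in for the engine in
The Bat!, the sample text is invented, and re.DOTALL is an assumption
made so that the dot can run across line breaks.

```python
import re

# Hypothetical two-line header, invented for this illustration.
text = "From: Jane Doe\nTo: list@example.com\n"

# Greedy: ".*" grabs as much as it can and backs off only as far as needed.
greedy = re.search(r"^From: (.*)\n", text, re.DOTALL).group(1)

# "(.*)?" makes the whole group optional; the star inside is still greedy,
# so the extra question mark changes nothing here.
optional_group = re.search(r"^From: (.*)?\n", text, re.DOTALL).group(1)

# Lazy: ".*?" takes as little as it can, so it stops at the first "\n".
lazy = re.search(r"^From: (.*?)\n", text, re.DOTALL).group(1)

print(repr(greedy))
print(repr(optional_group))
print(repr(lazy))
```

Only the lazy form stays on the From: line; the other two run on into
the To: line.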

So a corrected version of your expression would be:
^From: (.*?)$

Of course, there are three stylistic things that I would change about
the above.  First, I usually add the ignore case option, just to be
more general. Second, I usually use \s* instead of a literal space to
find whitespace.  This makes sure that I trim *all* spaces if
there is more than one space and it removes other white space
characters like tabs.  Note, these suggestions make the expression
more general, so there can be some undesired behaviour.  In my
experience, the undesired cases for these two suggestions are so rare
that they are well worth the risk.
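
A rough Python sketch of these two suggestions, with invented header
lines and Python's re module standing in for the engine in The Bat!.
The last case is one example of the undesired behaviour: because \s
also matches a line break, an empty From: value lets the capture slide
onto the next line.

```python
import re

# Hypothetical header lines with odd casing and mixed whitespace.
lines = ["From: Jane Doe", "FROM:\t  Jane Doe", "from:Jane Doe"]

literal_space = re.compile(r"^From: (.*?)$")                # exact case, one space
general = re.compile(r"^From:\s*(.*?)$", re.IGNORECASE)     # any case, any whitespace

strict_hits = [m.group(1) for m in map(literal_space.search, lines) if m]
general_hits = [m.group(1) for m in map(general.search, lines) if m]

# Edge case: an empty From: value. "\s*" swallows the newline, so the
# capture ends up holding the following header line instead.
edge = general.search("From:\nTo: list@example.com").group(1)

print(strict_hits)
print(general_hits)
print(edge)
```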

The third suggestion is to explicitly set the multi-line mode.  This
prevents problems if the default settings ever get changed for
any reason.  It also makes it very clear what behaviour you want from
TB.  After all, the help file suggests that you actually are in the
*opposite* mode by default, so your expression (and Peter's) would
fail if From: wasn't the very first header.  So the final corrected
version that I would use would be: 
(?im-s)^From:\s*(.*?)$

Note I also unset the Dot All option.  I'll let you read the help
file (or the tutorial) to find out why.
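
The effect of the options can be sketched in Python, with the caveat
that Python's re module is a different engine from the one in The Bat!
and that the headers below are invented. The options are passed as
flags here rather than written inline; dot-all is simply left off,
which is Python's default.

```python
import re

# Hypothetical headers where From: is NOT the first line.
headers = (
    "Received: from mail.example.com\n"
    "From: Jane Doe <jane@example.com>\n"
    "To: list@example.com\n"
)

# Without multi-line mode "^" matches only at the very start of the text,
# so a From: line further down is never found.
no_multiline = re.search(r"^From:\s*(.*?)$", headers, re.IGNORECASE)

# Ignore case + multi-line, dot-all off: the equivalent of (?im-s).
sender = re.search(
    r"^From:\s*(.*?)$", headers, re.IGNORECASE | re.MULTILINE
).group(1)

# Dot-all on and multi-line off: "$" matches only at the end of the text
# and "." crosses line breaks, so the capture runs down the header.
from_first = "From: Jane Doe\nTo: list@example.com\n"
runaway = re.search(r"^From:\s*(.*?)$", from_first, re.DOTALL).group(1)

print(no_multiline)
print(sender)
print(repr(runaway))
```

The last case behaves like the symptom in the original question, on
the assumption that the expression was running with dot-all on and
multi-line off.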

> Be patient with me, please, I am very much a beginner, and already infected
> with regexp (last night downloaded Gerd's tutorial and forgot to go
> to sleep).

I highly recommend that you subscribe to the TBTECH list.  That list
was specially designed for detailed technical discussions.  It has
become more or less specialized towards regexps.

While Gerd's tutorial is very well written, it won't really sink in
until you try analyzing some of the many expressions written on these
lists.  You should try it, post your results on TBTECH and you'll get
some good feedback.

-- 
Thanks for writing,
 Januk Aggarwal

Ok, who is General Relativity, and what did he do with Sir Newton?






Re: Regex help-hilfe-ajuto needed

2002-08-17 Thread Mandara

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Sat, 17 Aug 2002, at 22:01:02 +0200 Peter wrote:

PP ^From:\s*(.*?)\n

PP You see the difference? The question mark is _inside_ the parentheses
PP and it searches for the 'newline' explicitly.

Yep, I got it. I had glued myself to $ as the only metachar for the end
of the line. ;-)

PP This should work quite fine and have everything after 'From:' followed
PP by any number of white spaces but before 'new line' in sub pattern 1.

Works like a charm. Thanks!

Mandara
- --
(__) If you need this key:
('') mailto:[EMAIL PROTECTED]?subject=0x257DFF36
 \/
-BEGIN PGP SIGNATURE-

iD8DBQE9Xvo9vgcu6yV9/zYRAudYAJ9PVtEWzdTskGnc95hHuYovfy9OKQCePJ24
tbwahriRb9bFsBKZv8R7dT8=
=uV7H
-END PGP SIGNATURE-


