With \D+ in the pattern I can see why it does not work: \D+ matches only 
non-digits, so it stops at the 3 in 3000 (item 6). I tried a couple of other 
things, but was unable to get it working. Hopefully one of the regex gurus 
will respond.

        Sorry I can't offer more than the \D+ observation.
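
To make the \D+ point concrete, here is a minimal sketch in Python (not Perl, though the pattern syntax used is the same in both). It first shows where \D+ gives up on item 6, then tries one possible alternative that matches each numbered line as a whole with `.*` instead of \D+. The alternative pattern and the names `text`, `stopped`, `pattern`, and `items` are assumptions made for illustration, not something proposed in this thread.

```python
import re

# Sample text copied from the original message; item 6 contains "3000".
text = """1. Top Story: Dynegy in Agreement to Get Enron Pipeline
2. M&A: Newmont-Normandy, Hewlett-Compaq, Pax TV, WorldCom
3. Investment Banking: Goldman, Sandler, Merrill Lynch
4. I.P.O.s/Offerings: Sirius Satellite Radio, Neuer Markt
5. Venture Capital: Lucent-Coller Capital, EM.TV
6. Private Equity: HSBC, Canada 3000, Edel Music, Kumho Tire
7. Legal: GE Capital Aviation, EchoStar-DirecTV
8. Correction: Daily Deal Echostar-DirecTV  Article


/------------------advertisement--------------\\
"""

# \D+ cannot consume a digit, so it stops right before "3000".
stopped = re.match(r"\D+", "Private Equity: HSBC, Canada 3000, Edel Music").group()

# Hypothetical alternative: one numbered line per repetition.
# "." does not match a newline by default, so ".*" takes the rest of the
# line (digits included) and "\n" ends it. MULTILINE lets "^" match at
# the start of every line.
pattern = re.compile(r"((?:^\d+\.\s.*\n)+)\s*/-+advertisement", re.MULTILINE)
items = pattern.search(text).group(1).splitlines()

print(repr(stopped))
print(len(items), "items, from", items[0][:2], "to", items[-1][:2])
```

In Perl the same idea would presumably be written as `m!((?:^\d+\.\s.*\n)+)\s*/-+advertisement!m`, with the `/m` modifier playing the role of `re.MULTILINE`.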

Wags ;)

-----Original Message-----
From: Peter Cline [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 04, 2002 12:49
To: [EMAIL PROTECTED]
Subject: Strange (from my perspective) regex behavior


I am trying to extract some text from a file using a regular 
expression.  It is not behaving as expected, and I am totally perplexed as to why.
Here is an excerpt of the text:

1. Top Story: Dynegy in Agreement to Get Enron Pipeline
2. M&A: Newmont-Normandy, Hewlett-Compaq, Pax TV, WorldCom
3. Investment Banking: Goldman, Sandler, Merrill Lynch
4. I.P.O.s/Offerings: Sirius Satellite Radio, Neuer Markt
5. Venture Capital: Lucent-Coller Capital, EM.TV
6. Private Equity: HSBC, Canada 3000, Edel Music, Kumho Tire
7. Legal: GE Capital Aviation, EchoStar-DirecTV
8. Correction: Daily Deal Echostar-DirecTV  Article


/------------------advertisement--------------\

I want to extract the numbered list.

Here is the regex I am using to do it:
m!((\d\.\s\D+)+)/[-]+advertisement!

For some reason this starts matching at item 7.  If I eliminate 
everything after the / in the regex, it matches from item 1 to the / in item 4.
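
Both observations can be reproduced with a short sketch. It is written in Python rather than Perl, on the assumption that this pattern behaves the same way in both engines (it uses only \d, \D, \s, grouping, and +). The variable names `text`, `full`, and `short` are illustrative only.

```python
import re

# Sample text from the excerpt above.
text = """1. Top Story: Dynegy in Agreement to Get Enron Pipeline
2. M&A: Newmont-Normandy, Hewlett-Compaq, Pax TV, WorldCom
3. Investment Banking: Goldman, Sandler, Merrill Lynch
4. I.P.O.s/Offerings: Sirius Satellite Radio, Neuer Markt
5. Venture Capital: Lucent-Coller Capital, EM.TV
6. Private Equity: HSBC, Canada 3000, Edel Music, Kumho Tire
7. Legal: GE Capital Aviation, EchoStar-DirecTV
8. Correction: Daily Deal Echostar-DirecTV  Article


/------------------advertisement--------------\\
"""

# Full pattern: any attempt starting at items 1-6 gets stuck at "3000",
# because \D+ cannot cross the digits and "3000" is not "digit, dot, space".
# The first start position that can reach "/---advertisement" is item 7.
full = re.search(r"((\d\.\s\D+)+)/[-]+advertisement", text)
print(full.group(1)[:8])

# Truncated pattern: the only requirement left is a "/", and backtracking
# from item 6 finds one inside "I.P.O.s/Offerings" in item 4.
short = re.search(r"((\d\.\s\D+)+)/", text)
print(short.group(0)[-11:])
```

The engine reports the leftmost position where the whole pattern can succeed, which is why the first pattern silently skips items 1 through 6 instead of failing.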

I am totally perplexed as to why this is happening.  If anyone has insight, 
I would be most appreciative.

Thanks
Peter Cline
Inet Developer
New York Times Digital 

