regular expression help

2010-11-29 Thread goldtech
Hi, say: import re m="cccvlvlvlvnnnflfllffccclfnnnooo" re.compile(r'ccc.*nnn') rtt=.sub("||",m) rtt '||ooo' The regex is eating up too much. What I want is every non-overlapping occurrence, I think, so rtt would be: '||flfllff||ooo' just like findall acts but in this case I

Re: regular expression help

2010-11-29 Thread Yingjie Lan
--- On Tue, 11/30/10, goldtech goldt...@worldpost.com wrote: From: goldtech goldt...@worldpost.com Subject: regular expression help To: python-list@python.org Date: Tuesday, November 30, 2010, 9:17 AM The regex is eating up too much. What I want is every non-overlapping occurrence I think

Re: regular expression help

2010-11-29 Thread Tim Harig
On 2010-11-30, goldtech goldt...@worldpost.com wrote: Hi, say: import re m="cccvlvlvlvnnnflfllffccclfnnnooo" re.compile(r'ccc.*nnn') rtt=.sub("||",m) rtt '||ooo' The regex is eating up too much. What I want is every non-overlapping occurrence, I think, so rtt would be:

Re: regular expression help

2010-11-29 Thread Tim Harig
Python 3.1.2 (r312:79147, Oct 9 2010, 00:16:06) [GCC 4.4.4] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import re >>> m="cccvlvlvlvnnnflfllffccclfnnnooo" >>> pattern = re.compile(r'ccc[^n]*nnn') >>> pattern.sub("||", m) '||flfllff||ooo' # or, assuming that the middle

Re: regular expression help

2010-11-29 Thread goldtech
.*? fixed it. Every occurrence of the pattern is now affected, which is what I want. Thank you very much. -- http://mail.python.org/mailman/listinfo/python-list
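The difference between the greedy pattern from the original post and the two fixes mentioned in this thread (the lazy `.*?` and the negated class `[^n]*`) can be seen in a minimal sketch. It uses the sample string from the thread; the variable names `greedy`, `lazy` and `negated` are illustrative only.

```python
import re

m = "cccvlvlvlvnnnflfllffccclfnnnooo"

# Greedy: .* runs to the LAST "nnn", so one match swallows most of the string.
greedy = re.sub(r"ccc.*nnn", "||", m)

# Lazy: .*? stops at the FIRST "nnn" after each "ccc".
lazy = re.sub(r"ccc.*?nnn", "||", m)

# Negated class: the middle part may not contain "n" at all.
negated = re.sub(r"ccc[^n]*nnn", "||", m)

print(greedy)   # ||ooo
print(lazy)     # ||flfllff||ooo
print(negated)  # ||flfllff||ooo
```

The two fixes agree on this input, but they are not equivalent in general: the negated class refuses any `n` between the delimiters, while the lazy form allows single `n` characters in the middle.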

Python's regular expression help

2010-04-29 Thread goldtech
Hi, Trying to start out with simple things but apparently there's some basics I need help with. This works OK: import re p = re.compile('(ab*)(sss)') m = p.match( 'absss' ) m.group(0) 'absss' m.group(1) 'ab' m.group(2) 'sss' ... But two questions: How can I operate a regex on a string

Re: Python's regular expression help

2010-04-29 Thread Dodo
On 29/04/2010 20:00, goldtech wrote: Hi, Trying to start out with simple things but apparently there's some basics I need help with. This works OK: import re p = re.compile('(ab*)(sss)') m = p.match( 'absss' ) m.group(0) 'absss' m.group(1) 'ab' m.group(2) 'sss' ... But two questions:

Re: Python's regular expression help

2010-04-29 Thread MRAB
goldtech wrote: Hi, Trying to start out with simple things but apparently there's some basics I need help with. This works OK: import re p = re.compile('(ab*)(sss)') m = p.match( 'absss' ) m.group(0) 'absss' m.group(1) 'ab' m.group(2) 'sss' ... But two questions: How can I operate a regex

Re: Python's regular expression help

2010-04-29 Thread Tim Chase
On 04/29/2010 01:00 PM, goldtech wrote: Trying to start out with simple things but apparently there's some basics I need help with. This works OK: import re p = re.compile('(ab*)(sss)') m = p.match( 'absss' ) f=r'abss' f 'abss' m = p.match( f ) m.group(0) Traceback (most recent call

Re: Python's regular expression help

2010-04-29 Thread goldtech
On Apr 29, 11:49 am, Tim Chase python.l...@tim.thechases.com wrote: On 04/29/2010 01:00 PM, goldtech wrote: Trying to start out with simple things but apparently there's some basics I need help with. This works OK: import re p = re.compile('(ab*)(sss)') m = p.match( 'absss' )

Re: Regular Expression Help

2009-04-13 Thread Graham Breed
Jean-Claude Neveu wrote: Hello, I was wondering if someone could tell me where I'm going wrong with my regular expression. I'm trying to write a regexp that identifies whether a string contains a correctly-formatted currency amount. I want to support dollars, UK pounds and Euros, but the

Regular Expression Help

2009-04-11 Thread Jean-Claude Neveu
Hello, I was wondering if someone could tell me where I'm going wrong with my regular expression. I'm trying to write a regexp that identifies whether a string contains a correctly-formatted currency amount. I want to support dollars, UK pounds and Euros, but the example below deliberately

Re: Regular Expression Help

2009-04-11 Thread rurpy
On Apr 11, 9:42 pm, Jean-Claude Neveu jcn-france1...@pobox.com wrote: My regexp that I'm matching against is: ^\$\£?\d{0,10}(\.\d{2})?$ Here's how I think it should work (but clearly I'm wrong, because it does not actually work): ^\$\£? Require zero or one instance of $ or £ at the

Re: Regular Expression Help

2009-04-11 Thread John Machin
On Apr 12, 2:19 pm, ru...@yahoo.com wrote: On Apr 11, 9:42 pm, Jean-Claude Neveu jcn-france1...@pobox.com wrote: My regexp that I'm matching against is: ^\$\£?\d{0,10}(\.\d{2})?$ Here's how I think it should work (but clearly I'm wrong, because it does not actually work): ^\$\£?      
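A short sketch of why the quoted pattern does not behave as its author expected: `\$\£?` means "a literal `$`, then an optional `£`", not "an optional `$` or `£`". A character class with `?` expresses the latter. The helper `accepts` and the sample amounts are assumptions made for illustration; the first pattern is the one from the thread, written without the redundant backslash before `£`.

```python
import re

# The thread's pattern: "$" is mandatory, "£" is an optional second character.
original = re.compile(r"^\$£?\d{0,10}(\.\d{2})?$")

# One possible correction: zero or one of "$" or "£".
either = re.compile(r"^[$£]?\d{0,10}(\.\d{2})?$")


def accepts(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.match(text) is not None


for amount in ["$10.50", "£10.50", "10.50", "$£10.50"]:
    print(amount, accepts(original, amount), accepts(either, amount))
```

Inside a character class `$` is an ordinary character, so it needs no escaping there.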

regular expression, help

2009-01-27 Thread Vincent Davis
I think there are two parts to this question and I am sure lots I am missing. I am hoping an example will help me. I have an HTML doc that I am trying to use regular expressions to get a value out of. Here is an example of the line: <td colspan='2'>Parcel ID: 39-034-15-009 </td> I want to get the number

Re: regular expression, help

2009-01-27 Thread Vincent Davis
is BeautifulSoup really better? Since I don't know either I would prefer to learn only one for now. Thanks Vincent Davis On Tue, Jan 27, 2009 at 10:39 AM, MRAB goo...@mrabarnett.plus.com wrote: Vincent Davis wrote: I think there are two parts to this question and I am sure lots I am

Re: regular expression, help

2009-01-27 Thread MRAB
Vincent Davis wrote: I think there are two parts to this question and I am sure lots I am missing. I am hoping an example will help me. I have an HTML doc that I am trying to use regular expressions to get a value out of. Here is an example of the line: <td colspan='2'>Parcel ID: 39-034-15-009 </td>

Regular expression help: unable to search ' # ' character in the file

2008-09-27 Thread dudeja . rajat
Hi, Can someone help me with a regular expression? I'm looking to search for the # character in my file. My file has contents: ### Hello World ### length = 10 breadth = 20 height = 30 ###

Re: Regular expression help: unable to search ' # ' character in the file

2008-09-27 Thread Fredrik Lundh
[EMAIL PROTECTED] wrote: import re fd = open(file, 'r') line = fd.readline pat1 = re.compile(\#*) while(line): mat1 = pat1.search(line) if mat1: print line line = fd.readline() I strongly doubt that this is
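Independently of the rest of the quoted script, one property of the pattern `#*` is worth illustrating: `*` means "zero or more", so the pattern also matches the empty string and `search` succeeds on every line. The sketch below uses an in-memory list of lines based on the sample file contents from the thread; the names `star`, `plus` and the two result lists are illustrative.

```python
import re

lines = ["### Hello World ###", "length = 10", ""]

star = re.compile(r"#*")  # zero or more "#": also matches the empty string
plus = re.compile(r"#+")  # one or more "#": needs at least one "#"

# search() returns a match object (always truthy) or None.
star_hits = [line for line in lines if star.search(line)]
plus_hits = [line for line in lines if plus.search(line)]

print(star_hits)  # every line, including the empty one
print(plus_hits)  # only the line that contains "#"
```

For a plain substring test, `"#" in line` does the same job without a regular expression.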

Re: Regular expression help: unable to search ' # ' character in the file

2008-09-27 Thread dudeja . rajat
On Sat, Sep 27, 2008 at 1:58 PM, Fredrik Lundh [EMAIL PROTECTED]wrote: [EMAIL PROTECTED] wrote: import re fd = open(file, 'r') line = fd.readline pat1 = re.compile(\#*) while(line): mat1 = pat1.search(line) if mat1: print

Re: Regular expression help

2008-07-18 Thread Russell Blau
[EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] I am new to Python, with a background in scientific computing. I'm trying to write a script that will take a file with lines like c afrac=.7 mmom=0 sev=-9.56646 erep=0 etot=-11.020107 emad=-3.597647 3pv=0 extract the values of afrac

Re: Regular expression help

2008-07-18 Thread Brad
[EMAIL PROTECTED] wrote: Hello, I am new to Python, with a background in scientific computing. I'm trying to write a script that will take a file with lines like c afrac=.7 mmom=0 sev=-9.56646 erep=0 etot=-11.020107 emad=-3.597647 3pv=0 extract the values of afrac and etot... Why not just

Re: Regular expression help

2008-07-18 Thread Gerard flanagan
[EMAIL PROTECTED] wrote: Hello, I am new to Python, with a background in scientific computing. I'm trying to write a script that will take a file with lines like c afrac=.7 mmom=0 sev=-9.56646 erep=0 etot=-11.020107 emad=-3.597647 3pv=0 extract the values of afrac and etot and plot them. I'm

Re: Regular expression help

2008-07-18 Thread Nick Dumas
I think you're over-complicating this. I'm assuming that you're going to do a line graph of some sort, and each new line of the file contains a new set of data. The problem you mentioned with your regex returning a match object rather than a string

Re: Regular expression help

2008-07-18 Thread nclbndk759
On Jul 18, 3:35 pm, Nick Dumas [EMAIL PROTECTED] wrote: I think you're over-complicating this. I'm assuming that you're going to do a line graph of some sort, and each new line of the file contains a new set of data. The problem you mentioned

Re: Regular expression help

2008-07-18 Thread Marc 'BlackJack' Rintsch
On Fri, 18 Jul 2008 10:04:29 -0400, Russell Blau wrote: values = {} for expression in line.split(" "): if "=" in expression: name, val = expression.split("=") values[name] = val […] And when you get to be a really hard-core Pythonista, you could write the whole routine
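The split-based approach quoted above can be sketched as a small runnable function. It uses the sample line from the thread; the function name `parse_line` and the conversion to `float` are assumptions added for illustration, not part of the original posts.

```python
def parse_line(line):
    """Collect name=value pairs from a whitespace-separated line."""
    values = {}
    for expression in line.split():
        if "=" in expression:
            # Split on the first "=" only, in case a value contains one.
            name, val = expression.split("=", 1)
            values[name] = val
    return values


line = "c afrac=.7 mmom=0 sev=-9.56646 erep=0 etot=-11.020107 emad=-3.597647 3pv=0"
values = parse_line(line)

# The values are still strings; convert the ones that are needed.
afrac = float(values["afrac"])
etot = float(values["etot"])
print(afrac, etot)
```

Tokens without an `=` sign, such as the leading `c`, are simply skipped.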

Regular Expression Help

2008-03-16 Thread santhosh kumar
Hi all, I have text like , STRINGTABLE BEGIN ID_NEXT_PANECambiar a la siguiente sección de laventana \nSiguiente sección ID_PREV_PANERegresar a la sección anterior de laventana\nSección anterior END STRINGTABLE BEGIN ID_VIEW_TOOLBAR Mostrar u ocultar la

Re: Regular Expression Help

2008-03-16 Thread Duncan Booth
santhosh kumar [EMAIL PROTECTED] wrote: I have text like , STRINGTABLE BEGIN ID_NEXT_PANECambiar a la siguiente sección de laventana \nSiguiente sección ID_PREV_PANERegresar a la sección anterior de laventana\nSección anterior END STRINGTABLE BEGIN

Regular Expression Help

2008-02-26 Thread Lythoner
Hi All, I have a python utility which helps to generate an excel file for language translation. For any new language, we will generate the excel file which will have the English text and column for interested translation language. The translator will provide the language string and again I will

Re: Regular Expression Help

2008-02-26 Thread John Machin
On Feb 27, 6:28 am, [EMAIL PROTECTED] wrote: Hi All, I have a python utility which helps to generate an excel file for language translation. For any new language, we will generate the excel file which will have the English text and column for interested translation language. The translator

Re: python regular expression help

2007-04-12 Thread 7stud
On Apr 11, 11:15 pm, [EMAIL PROTECTED] wrote: On Apr 11, 9:50 pm, Gabriel Genellina [EMAIL PROTECTED] lhs = re.compile(r'\s*(\b\w+\s*=)') for s in [ a = 4 b =3.4 5.4 c = 4.5, a = 4.5 b = 'h' 'd' c = 4.5 3.5]: tokens = lhs.split(s) results = [tokens[_] + tokens[_+1] for _ in

Re: python regular expression help

2007-04-12 Thread Qilong Ren
On Wed, 11 Apr 2007 23:14:01 -0300, Qilong Ren [EMAIL PROTECTED] wrote: Thanks for reply. That actually is not what I want. Strings I am dealing with may look like this: s = 'a = 4.5 b = 'h' 'd' c = 4.5 3.5' What I want is a = 4.5 b = 'h' 'd' c

python regular expression help

2007-04-11 Thread Qilong Ren
Hi, everyone, I am extracting some information from a given string using python RE. The string is ,for example, s = 'a = 4 b =3.4 5.4 c = 4.5' What I want is : a = 4 b = 3.4 5.4 c = 4.5 Right now I use : pattern = re.compile(r'\w+\s*=\s*.*?\s+') lists = pattern.findall(s) It

Re: python regular expression help

2007-04-11 Thread liupeng
pattern = re.compile(r'\w+\s*=\s*[0-9]*.[0-9]*\s*') lists = pattern.findall(s) print lists ['a=4 ', 'b=3.4 ', 'c=4.5'] On Wed, Apr 11, 2007 at 06:10:07PM -0700, Qilong Ren wrote: Hi, everyone, I am extracting some information from a given string using python RE. The string is ,for example,

Re: python regular expression help

2007-04-11 Thread Qilong Ren
Sent: Wednesday, April 11, 2007 6:41:30 PM Subject: Re: python regular expression help pattern = re.compile(r'\w+\s*=\s*[0-9]*.[0-9]*\s*') lists = pattern.findall(s) print lists ['a=4 ', 'b=3.4 ', 'c=4.5'] On Wed, Apr 11, 2007 at 06:10:07PM -0700, Qilong Ren wrote: Hi, everyone, I am

Re: python regular expression help

2007-04-11 Thread 7stud
On Apr 11, 7:41 pm, liupeng [EMAIL PROTECTED] wrote: pattern = re.compile(r'\w+\s*=\s*[0-9]*.[0-9]*\s*') lists = pattern.findall(s) print lists ['a=4 ', 'b=3.4 ', 'c=4.5'] On Wed, Apr 11, 2007 at 06:10:07PM -0700, Qilong Ren wrote: Hi, everyone, I am extracting some information from a

Re: python regular expression help

2007-04-11 Thread Gabriel Genellina
On Wed, 11 Apr 2007 23:14:01 -0300, Qilong Ren [EMAIL PROTECTED] wrote: Thanks for reply. That actually is not what I want. Strings I am dealing with may look like this: s = 'a = 4.5 b = 'h' 'd' c = 4.5 3.5' What I want is a = 4.5 b = 'h' 'd' c = 4.5 3.5 That's

Re: python regular expression help

2007-04-11 Thread Qilong Ren
) the corresponding values values = re.split(r'\w+\s*=',s)[1:] It does not look good but it works. What do you think? Thanks, Qilong - Original Message From: 7stud [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Wednesday, April 11, 2007 8:27:57 PM Subject: Re: python regular expression help
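A sketch of how the `re.split` call quoted above can be paired with a matching `findall` to recover both names and values. The beginning of the message is cut off in this listing, so the `findall` step and the names `keys`, `raw_values` and `pairs` are assumptions; the sample string is the one given elsewhere in the thread.

```python
import re

s = "a = 4.5 b = 'h' 'd' c = 4.5 3.5"

# Names: every word that is followed by optional whitespace and "=".
keys = re.findall(r"(\w+)\s*=", s)

# Values: whatever sits between the "name =" markers. The first piece is
# the (empty) text before the first marker, hence the [1:].
raw_values = re.split(r"\w+\s*=", s)[1:]

pairs = {key: value.strip() for key, value in zip(keys, raw_values)}
print(pairs)
```

This relies on values never containing something that looks like `word =`; a value such as `'x = 1'` inside quotes would be split incorrectly.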

Re: python regular expression help

2007-04-11 Thread 7stud
On Apr 11, 10:50 pm, Gabriel Genellina [EMAIL PROTECTED] wrote: On Wed, 11 Apr 2007 23:14:01 -0300, Qilong Ren [EMAIL PROTECTED] wrote: Thanks for reply. That actually is not what I want. Strings I am dealing with may look like this: s = 'a = 4.5 b = 'h' 'd' c = 4.5 3.5'

Re: python regular expression help

2007-04-11 Thread attn . steven . kuo
On Apr 11, 9:50 pm, Gabriel Genellina [EMAIL PROTECTED] wrote: On Wed, 11 Apr 2007 23:14:01 -0300, Qilong Ren [EMAIL PROTECTED] wrote: Thanks for reply. That actually is not what I want. Strings I am dealing with may look like this: s = 'a = 4.5 b = 'h' 'd' c = 4.5 3.5' What I

Re: python regular expression help

2007-04-11 Thread Paul McGuire
On Apr 11, 11:50 pm, Gabriel Genellina [EMAIL PROTECTED] wrote: On Wed, 11 Apr 2007 23:14:01 -0300, Qilong Ren [EMAIL PROTECTED] wrote: Thanks for reply. That actually is not what I want. Strings I am dealing with may look like this: s = 'a = 4.5 b = 'h' 'd' c = 4.5 3.5'

Re: Regular Expression help for parsing html tables

2006-10-29 Thread Odalrick
[EMAIL PROTECTED] wrote: Hello, I am having some difficulty creating a regular expression for the following string situation in html. I want to find a table that has specific text in it and then extract the html just for that immediate table. the string would look something like this:

Re: Regular Expression help for parsing html tables

2006-10-29 Thread Paddy
[EMAIL PROTECTED] wrote: Hello, I am having some difficulty creating a regular expression for the following string situation in html. I want to find a table that has specific text in it and then extract the html just for that immediate table. the string would look something like this:

Regular Expression help for parsing html tables

2006-10-28 Thread steve551979
Hello, I am having some difficulty creating a regular expression for the following string situation in html. I want to find a table that has specific text in it and then extract the html just for that immediate table. the string would look something like this: ...stuff here... table ...stuff

Re: Regular Expression help for parsing html tables

2006-10-28 Thread Stefan Behnel
Hi Steve, [EMAIL PROTECTED] wrote: I am having some difficulty creating a regular expression for the following string situation in html. I want to find a table that has specific text in it and then extract the html just for that immediate table. Any reason why you can't use a real HTML

Re: need some regular expression help

2006-10-08 Thread Diez B. Roggisch
hanumizzle wrote: On 7 Oct 2006 15:00:29 -0700, Diez B. Roggisch [EMAIL PROTECTED] wrote: Chris wrote: I need a pattern that matches a string that has the same number of '(' as ')': findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [ '((2x+2)sin(x))',

Re: need some regular expression help

2006-10-08 Thread Theerasak Photha
On 8 Oct 2006 01:49:50 -0700, Diez B. Roggisch [EMAIL PROTECTED] wrote: Even if it has - I'm not sure if it really does you good, for several reasons: - regexes - even enhanced ones - don't build trees. But that is what you ultimately want from an expression like sin(log(x)) - even

Re: need some regular expression help

2006-10-08 Thread bearophileHUGS
Tim Chase: It still doesn't solve the aforementioned problem of things like ')))(((' which is balanced, but psychotic. :) This may solve the problem: def balanced(txt): d = {'(':1, ')':-1} tot = 0 for c in txt: tot += d.get(c, 0) if tot < 0: return False

Re: need some regular expression help

2006-10-08 Thread Fredrik Lundh
[EMAIL PROTECTED] wrote: The dict solution looks better, but this may be faster: it's slightly faster, but both your alternatives are about 10x slower than a straightforward: def balanced(txt): return txt.count("(") == txt.count(")") /F

Re: need some regular expression help

2006-10-08 Thread Mirco Wahab
Thus spoke Diez B. Roggisch (on 2006-10-08 10:49): Certainly true, and it always gives me a hard time because I don't know to which extent a regular expression nowadays might do the job because of these extensions. It was so much easier back in the old times Right, in Perl, this would be a

Re: need some regular expression help

2006-10-08 Thread Diez B. Roggisch
Mirco Wahab wrote: Thus spoke Diez B. Roggisch (on 2006-10-08 10:49): Certainly true, and it always gives me a hard time because I don't know to which extent a regular expression nowadays might do the job because of these extensions. It was so much easier back in the old times Right,

Re: need some regular expression help

2006-10-08 Thread bearophileHUGS
Fredrik Lundh wrote: it's slightly faster, but both your alternatives are about 10x slower than a straightforward: def balanced(txt): return txt.count("(") == txt.count(")") I know, but if you read my post again you see that I have shown those solutions to mark )))((( as bad expressions.

Re: need some regular expression help

2006-10-08 Thread Roy Smith
Diez B. Roggisch [EMAIL PROTECTED] wrote: Certainly true, and it always gives me a hard time because I don't know to which extent a regular expression nowadays might do the job because of these extensions. It was so much easier back in the old times What old times? I've been working with

Re: need some regular expression help

2006-10-08 Thread Theerasak Photha
On 10/8/06, Roy Smith [EMAIL PROTECTED] wrote: Diez B. Roggisch [EMAIL PROTECTED] wrote: Certainly true, and it always gives me a hard time because I don't know to which extent a regular expression nowadays might do the job because of these extensions. It was so much easier back in the old

need some regular expression help

2006-10-07 Thread Chris
I need a pattern that matches a string that has the same number of '(' as ')': findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [ '((2x+2)sin(x))', '(log(2)/log(5))' ] Can anybody help me out? Thanks for any help!

Re: need some regular expression help

2006-10-07 Thread Diez B. Roggisch
Chris wrote: I need a pattern that matches a string that has the same number of '(' as ')': findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [ '((2x+2)sin(x))', '(log(2)/log(5))' ] Can anybody help me out? This is not possible with regular expressions - they can't

Re: need some regular expression help

2006-10-07 Thread John Machin
Chris wrote: I need a pattern that matches a string that has the same number of '(' as ')': findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [ '((2x+2)sin(x))', '(log(2)/log(5))' ] Can anybody help me out? No, there is no such pattern. You will have to code up a function.
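One way such a hand-written function might look is sketched below. It is not taken from any reply in the thread: the name `top_level_groups` and the choice to silently skip stray closing parentheses are assumptions. It tracks nesting depth and records each outermost parenthesized group, which reproduces the result the original poster asked for.

```python
def top_level_groups(text):
    """Return every outermost balanced (...) group in text, in order."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i  # an outermost group opens here
            depth += 1
        elif ch == ")":
            if depth == 0:
                continue  # stray ")" with no opener: ignored in this sketch
            depth -= 1
            if depth == 0:
                groups.append(text[start:i + 1])
    return groups


expr = "42^((2x+2)sin(x)) + (log(2)/log(5))"
print(top_level_groups(expr))
# ['((2x+2)sin(x))', '(log(2)/log(5))']
```

A group that is opened but never closed is dropped rather than reported; a stricter version could raise an error instead.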

Re: need some regular expression help

2006-10-07 Thread hanumizzle
On 7 Oct 2006 15:00:29 -0700, Diez B. Roggisch [EMAIL PROTECTED] wrote: Chris wrote: I need a pattern that matches a string that has the same number of '(' as ')': findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [ '((2x+2)sin(x))', '(log(2)/log(5))' ] Can anybody

Re: need some regular expression help

2006-10-07 Thread Roy Smith
In article [EMAIL PROTECTED], Chris [EMAIL PROTECTED] wrote: I need a pattern that matches a string that has the same number of '(' as ')': findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [ '((2x+2)sin(x))', '(log(2)/log(5))' ] Can anybody help me out? Thanks for any

Re: need some regular expression help

2006-10-07 Thread Tim Chase
Why does it need to be a regex? There is a very simple and well-known algorithm which does what you want. Start with i=0. Walk the string one character at a time, incrementing i each time you see a '(', and decrementing it each time you see a ')'. At the end of the string, the count
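The counting walk described above can be sketched as follows, next to the simpler count comparison that appears elsewhere in this thread. The function names `balanced` and `same_count` are illustrative. The early return on a negative count is what rejects strings like `')))((('`, which have equal counts but close before they open.

```python
def balanced(text):
    """True if every ')' closes an earlier '(' and nothing stays open."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # a ")" appeared before its "("
    return depth == 0


def same_count(text):
    """Weaker check: only compares the number of '(' and ')'."""
    return text.count("(") == text.count(")")


print(balanced("42^((2x+2)sin(x)) + (log(2)/log(5))"))  # True
print(balanced(")))((("), same_count(")))((("))          # False True
```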

Re: Regular Expression help

2006-04-28 Thread Kent Johnson
Edward Elliott wrote: [EMAIL PROTECTED] wrote: If you are parsing HTML, it may make more sense to use a package designed especially for that purpose, like Beautiful Soup. I don't know Beautiful Soup, but one advantage regexes have over some parsers is handling malformed html. Beautiful

Regular Expression help

2006-04-27 Thread RunLevelZero
I have some data and I need to put it in a list in a particular way. I have that figured out but there is stuff in the data that I don't want. Example: 10:00am - 11:00am:/b a

Re: Regular Expression help

2006-04-27 Thread Edward Elliott
RunLevelZero wrote: 10:00am - 11:00am:/b a href=/tvpdb?d=tvpid=167540528[snip]The Price Is Right/aem All I want is Price Is Right Here is the re. findshows = re.compile(r'(\d\d:\d\d\D\D\s-\s\d\d:\d\d\D\D:*.*/aem)') 1. A regex remembers everything it matches -- no need to wrap the

Re: Regular Expression help

2006-04-27 Thread RunLevelZero
Great I will test this out once I have the time... thanks for the quick response

Re: Regular Expression help

2006-04-27 Thread johnzenger
If you are parsing HTML, it may make more sense to use a package designed especially for that purpose, like Beautiful Soup.

Re: Regular Expression help

2006-04-27 Thread RunLevelZero
I considered that but what I need is simple and I don't want to use another library for something so simple but thank you. Plus I don't understand them all that well :)

Re: Regular Expression help

2006-04-27 Thread johnzenger
If what you need is simple, regular expressions are almost never the answer. And how simple can it be if you are posting here? :) BeautifulSoup isn't all that hard. Observe: from BeautifulSoup import BeautifulSoup html = '10:00am - 11:00am:/b a href=/tvpdb?d=tvpid=167540528[snip]The Price

Re: Regular Expression help

2006-04-27 Thread RunLevelZero
r'<a[^>]*>(.*?)</a>' With a slight modification that did exactly what I wanted, and yes the findall was the only way to get all that I needed as I buffered all the read. Thanks a bunch.
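A minimal sketch of how a pattern of this shape pulls link text out of a fragment, assuming the archive stripped the angle brackets and the intended pattern was `<a[^>]*>(.*?)</a>`. The HTML string below is a made-up stand-in for the listing data in the thread (its `href` value is hypothetical), and the poster's "slight modification" is not shown in the snippet, so it is not reproduced here.

```python
import re

# Hypothetical fragment shaped like the TV listing discussed in the thread.
html = '<b>10:00am - 11:00am:</b> <a href="/listing?id=1">The Price Is Right</a><em>'

# [^>]* skips the tag's attributes; (.*?) lazily captures the link text.
link_text = re.compile(r"<a[^>]*>(.*?)</a>")

titles = link_text.findall(html)
print(titles)  # ['The Price Is Right']
```

With one capturing group, `findall` returns just the captured text for each match, which is why no further slicing is needed.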

Re: Regular Expression help

2006-04-27 Thread RunLevelZero
Interesting... thank you.

Re: Regular Expression help

2006-04-27 Thread Edward Elliott
[EMAIL PROTECTED] wrote: If you are parsing HTML, it may make more sense to use a package designed especially for that purpose, like Beautiful Soup. I don't know Beautiful Soup, but one advantage regexes have over some parsers is handling malformed html. Omitted closing tags can wreak havoc.

Re: Regular Expression help

2006-04-27 Thread John Bokma
Edward Elliott [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: If you are parsing HTML, it may make more sense to use a package designed especially for that purpose, like Beautiful Soup. I don't know Beautiful Soup, but one advantage regexes have over some parsers is handling malformed