I think the % signs are not comments (the first '(' on the second line 
matches the last ')' on the same line). They are most likely string-format 
operators that substitute string.whitespace for each %s.
I guess the one-liner should simply be

attrfind = re.compile('[%s]*([a-zA-Z_][-.a-zA-Z_0-9]*)' % string.whitespace 
+ ('([%s]*=[%s]*' % (string.whitespace, string.whitespace)) + 
r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./:+*%?!\(\)_#=~]*))?')

or maybe just add parens around each line:

attrfind = re.compile(('[%s]*([a-zA-Z_][-.a-zA-Z_0-9]*)' % 
string.whitespace) + ('([%s]*=[%s]*' % (string.whitespace, 
string.whitespace)) + 
(r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./:+*%?!\(\)_#=~]*))?'))
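
To check the reading above, here is a minimal sketch that builds the same pattern in three pieces and tries it on the attribute part of the example tag. The names name, equals, value and tag_body are only mine, for illustration; they are not in the sgml library. Note that string.whitespace covers tabs and newlines as well as the space character, and that the r prefix only means "raw string", so the backslashes are passed to the regex engine unchanged.

```python
import re
import string

# "%" is the string-formatting operator here, not a comment marker:
# each %s is replaced by the contents of string.whitespace.
name = '[%s]*([a-zA-Z_][-.a-zA-Z_0-9]*)' % string.whitespace
equals = '([%s]*=[%s]*' % (string.whitespace, string.whitespace)

# The r prefix makes this a raw string, so \' \( and \) keep their
# backslashes. No % operator is applied to this piece, so the "%"
# inside the character class is just a literal percent sign.
value = r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./:+*%?!\(\)_#=~]*))?'

attrfind = re.compile(name + equals + value)

# Hypothetical input: the text that follows the tag name in <a href=...>
tag_body = ' href="http://someurl/page.html"'
m = attrfind.match(tag_body)

attr_name = m.group(1)    # the attribute name
attr_value = m.group(3)   # the value, still wrapped in its quotes
print(attr_name, attr_value[1:-1])
```

Group 3 still carries the surrounding quotes, so they have to be stripped afterwards if only the bare URL is wanted.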

NH


> -----Original Message-----
> From:    Bill Nalen/Towers Perrin [SMTP:[EMAIL PROTECTED]]
> Date:    Wednesday, 6 June 2001 15:22
> To:      Plucker Development List
> Subject: Re: Windows Palm conduit
>
>
> Can someone translate the following into a single line?
>
> It's from the sgml library and is supposed to allow me to find the
> attributes within a tag.  I have a version of the pcre, but I don't know
> what the second and third lines are supposed to do.
>
> attrfind = re.compile(
>     '[%s]*([a-zA-Z_][-.a-zA-Z_0-9]*)' % string.whitespace
>     + ('([%s]*=[%s]*' % (string.whitespace, string.whitespace))
>     + r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./:+*%?!\(\)_#=~]*))?')
>
> I thought it was
>
> '[ ]*([a-zA-Z_][-.a-zA-Z_0-9]*)([ ]*=[ ]*('[^']*'|\"[^\"]*\"|[-a-zA-Z0-9./:+*%?!\\_#=~]*))?'
>
> but that doesn't seem to work quite right.  I guess I don't know what the
> r does in the third line.
>
> I am assuming that for the tag
> <a href="http://someurl/page.html">
> I would get href, and http://someurl/page.html for the attributes from
> the re.match command.
>
> Any help would be appreciated.
> Bill
>
