Jim Popovitch wrote:

>On Fri, Feb 22, 2008 at 4:03 PM, Mark Sapiro <[EMAIL PROTECTED]> wrote:
>>  You could try to find the line
>>
>>  urlpat = re.compile(r'(\w+://[^>)\s]+)') # URLs in text
>>
>>  near the beginning of Mailman/Archiver/HyperArch.py and change it to
>>
>>  urlpat = re.compile(r'(\w+://[^>)\s]+?)\.?(\s|$)') # URLs in text
>
>Mark, that works well for the case I described.  I did find something
>else similar that doesn't work:
>
>     this is another url http://www.yahoo.com, and so is this
>http://www.google.com.
>
>Gets converted into:
>   this is another url <A
>HREF="http://www.yahoo.com,">http://www.yahoo.com,</A>
>            and so is this <A
>HREF="http://www.ibm.com">http://www.google.com</A>.


I assume that's a typo and 'ibm' should be 'google'.


>So, the problem seems to appear with commas too which makes me wonder
>if this can be resolved with this:
>
>   urlpat = re.compile(r'(\w+://[^>)\s]+?)(\.|,)?(\s|$)') # URLs in text
>
>but then I got to thinking about any other punctuation mark that
>might follow a URL... and my mind started spinning :-)


I think the suggestion above, (\.|,)? , would work for the comma, but
you could do it other ways, e.g.

   urlpat = re.compile(r'(\w+://[^>)\s]+?)[.,;]?(\s|$)') # URLs in text

to handle '.', ',' and ';', and you could extend the character class
with more characters, but you really need to be careful. Consider, for
example, <http://www.example.com/some/page#anchor.> which could be a
valid URL ending in '.'. Any pattern that strips trailing punctuation
would drop that final '.' from the link.
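
If you want to experiment before editing HyperArch.py, here is a minimal
sketch that runs the original pattern and the [.,;] variant over the
sample text from this thread. The names ORIGINAL, STRIPPED and urls() are
made up for this illustration and are not part of Mailman; the sketch only
looks at what group 1 captures and does not reproduce how the archiver
builds the link.

```python
import re

# Original pattern from HyperArch.py and the variant suggested above.
# ORIGINAL, STRIPPED and urls() are hypothetical names for this sketch.
ORIGINAL = re.compile(r'(\w+://[^>)\s]+)')
STRIPPED = re.compile(r'(\w+://[^>)\s]+?)[.,;]?(\s|$)')


def urls(pattern, text):
    """Return group 1 (the URL proper) for every match of pattern in text."""
    return [m.group(1) for m in pattern.finditer(text)]


text = ('this is another url http://www.yahoo.com, and so is this '
        'http://www.google.com.')

# The original pattern keeps the trailing ',' and '.' as part of the URL.
print(urls(ORIGINAL, text))

# The variant leaves the trailing punctuation out of group 1.
print(urls(STRIPPED, text))

# The caveat: a URL that really does end in '.' loses that character.
tricky = 'see http://www.example.com/some/page#anchor. for details'
print(urls(STRIPPED, tricky))

# A second caveat: the variant requires whitespace or end of string after
# the URL, so a URL closed by '>' or ')' is not matched at all.
bracketed = 'see <http://www.example.com/> for details'
print(urls(ORIGINAL, bracketed))
print(urls(STRIPPED, bracketed))
```

The last case suggests that, if you go this route, you may also want to
allow '>' and ')' after the URL, but I haven't tested that in the
archiver itself.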

-- 
Mark Sapiro <[EMAIL PROTECTED]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

------------------------------------------------------
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org

Security Policy: 
http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp
