I've recently switched to using this regex for pulling out links, and
haven't spotted any issues with extra characters surrounding the
links as yet.

/(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d?[.])(?:[^\s()<>]+|\([^\s()<>]+\))+(?:\([^\s()<>]+\)|[^`!()\[\]{};:\'".,<>?«»“”‘’\s]))/

It was posted by @gruber to his Twitter feed a couple of days after
his post that Chad linked to above.
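
For anyone who wants to try it outside a /.../ literal, here is a rough Python sketch of the same pattern. It is a transcription under assumptions rather than the code I actually run: the delimiters are dropped, the inline (?i) becomes re.IGNORECASE, and extract_links is just a made-up helper name for the example.

```python
import re

# Python transcription of the pattern quoted above (an assumption, not the
# poster's own code). The /.../ delimiters are dropped and the inline (?i)
# flag is replaced by re.IGNORECASE.
URL_RE = re.compile(
    # Scheme followed by 1-3 slashes or a first character, or a "www." prefix.
    r"""\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d?[.])"""
    # Body: runs of non-space, non-bracket characters, or one balanced (...) group.
    r"""(?:[^\s()<>]+|\([^\s()<>]+\))+"""
    # Last character: a balanced (...) group, or anything that is not
    # trailing punctuation.
    r"""(?:\([^\s()<>]+\)|[^`!()\[\]{};:'".,<>?«»“”‘’\s]))""",
    re.IGNORECASE,
)


def extract_links(text):
    """Return every substring of text that the pattern treats as a link."""
    # extract_links is a hypothetical helper name used only for this sketch.
    return [m.group(1) for m in URL_RE.finditer(text)]


if __name__ == "__main__":
    sample = (
        "See http://daringfireball.net/2009/11/liberal_regex_for_matching_urls, "
        "filed as (http://code.google.com/p/twitter-api/issues/detail?id=1298)."
    )
    for link in extract_links(sample):
        print(link)
```

With that sample text, the comma after the first link and the closing parenthesis and full stop after the second should be left out of the matches, which is the "extra characters" problem this pattern is meant to avoid.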



On Dec 19, 3:48 am, Chad Etzel <jazzyc...@gmail.com> wrote:
> This might be relevant to your interests:
> http://daringfireball.net/2009/11/liberal_regex_for_matching_urls
>
> Something definitely changed in the twitter web front-end code which
> is borking url matching as of a month or so ago...
>
> -Chad
>
> On Fri, Dec 18, 2009 at 2:44 AM, Harshad RJ <harshad...@gmail.com> wrote:
> > Although not an API issue, it might be good to track it as such, because
> > Twitter clients can then follow exactly the same policies that Twitter web
> > interface does.
> > If there is a standard regular expression that can be used for detecting a
> > URL, it could be published as a guideline in the API documentation for
> > consistency between all clients.
> > cheers,
> > Harshad
>
> > On Fri, Dec 18, 2009 at 2:45 AM, Raffi Krikorian <ra...@twitter.com> wrote:
>
> >> it's not an API issue -- the API doesn't do any auto-URLification.
> >>  however, i'll pass this thread off to the web client team.
>
> >> On Thu, Dec 17, 2009 at 1:13 PM, dbasch <dba...@gmail.com> wrote:
>
> >>> I agree. I searched the issues db and didn't find it. Not sure if it
> >>> belongs as an API issue but I submitted it anyway.
>
> >>> http://code.google.com/p/twitter-api/issues/detail?id=1298
>
> > --
> > Harshad RJ
> > http://hrj.wikidot.com
