The following bug has been logged on the website:

Bug reference:      6381
Logged by:          john melesky
Email address:      c...@phaedrusdeinus.org
PostgreSQL version: 9.1.1
Operating system:   x86_64-pc-linux-gnu
Description:        

This simple regexp returns the correct result (that is, (.*?) matches
'blahblah.com'):

=# select regexp_matches('http://blahblah.com/asdf',
'http://(.*?)(/|%2f|$)');
  regexp_matches  
------------------
 {blahblah.com,/}

This more complete version matches greedily, which is incorrect:

=# select regexp_matches('http://blahblah.com/asdf',
'http(s?)(:|%3a)(//|%2f%2f)(.*?)(/|%2f|$)');
         regexp_matches         
--------------------------------
 {"",:,//,blahblah.com/asdf,""}

(That is, (.*?) matches 'blahblah.com/asdf')

The problem appears to be the inclusion of '$' in the final parenthesized
group. With '$' removed from that group, the match is non-greedy as expected:

=# select regexp_matches('http://blahblah.com/asdf',
'http(s?)(:|%3a)(//|%2f%2f)(.*?)(/|%2f)');
      regexp_matches      
--------------------------
 {"",:,//,blahblah.com,/}
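
For comparison, here is a minimal sketch that runs the same three patterns
through Python's re module. Python is used here only as an assumed reference
point for Perl-style non-greedy semantics. It is a different regex engine from
PostgreSQL's, so this shows the result I expected, not what PostgreSQL does.
The names URL, PATTERNS and first_groups are made up for this illustration.

```python
import re

# The test string from the report.
URL = "http://blahblah.com/asdf"

# The three patterns from the report, keyed by illustrative names.
PATTERNS = {
    "simple": r"http://(.*?)(/|%2f|$)",
    "complete": r"http(s?)(:|%3a)(//|%2f%2f)(.*?)(/|%2f|$)",
    "complete_no_dollar": r"http(s?)(:|%3a)(//|%2f%2f)(.*?)(/|%2f)",
}


def first_groups(pattern, text=URL):
    """Return the captured groups of the first match, or None if no match."""
    m = re.search(pattern, text)
    return m.groups() if m else None


# Collect the captured groups for each pattern.
results = {name: first_groups(p) for name, p in PATTERNS.items()}

for name, groups in results.items():
    print(name, groups)
```

Under Python's engine, the host group captures 'blahblah.com' for all three
patterns, including the one with '$' in the final group.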




-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
