Non-greedy item can cause Regexp to get stuck
---------------------------------------------

                 Key: JRUBY-5340
                 URL: http://jira.codehaus.org/browse/JRUBY-5340
             Project: JRuby
          Issue Type: Bug
          Components: Core Classes/Modules
            Reporter: Hannes Wallnoefer
            Priority: Minor


This is a rather esoteric regexp bug best illustrated with an example.

regex = Regexp.new(/(([a-c])a*?\2){2}/)
=> /(([a-c])a*?\2){2}/

regex.match("aaabab")
actual result in JRuby 1.6.0.RC1:
=> #<MatchData "aaa" 1:"a" 2:"a">
expected result (and produced by Ruby 1.8.7):
=> #<MatchData "aaabab" 1:"bab" 2:"b">

regex.match("aaa")
actual result in JRuby 1.6.0.RC1:
=> #<MatchData "aaa" 1:"a" 2:"a">
expected result (and produced by Ruby 1.8.7):
=> nil

In JRuby, the outer and inner Regexp groups both capture the same single
character (apparently the third "a"). That is wrong: the outer group must
contain the inner group's text twice, once for ([a-c]) and once for the
backreference \2, so it can never be shorter than two characters. It looks
like the non-greedy a*? after the inner group causes the same character to
get matched twice.
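
The expected results above can be cross-checked without trusting any regexp
engine. The following is only a minimal sketch under stated assumptions: the
names each_iteration_end and reference_match are hypothetical helpers (not
part of JRuby or Joni), and the code hard-wires this one pattern,
/(([a-c])a*?\2){2}/, instead of being a general matcher. It tries candidates
in the same order a backtracking engine would (leftmost start, shortest a*?
first).

```ruby
# Hypothetical brute-force reference for /(([a-c])a*?\2){2}/ only.
# It deliberately avoids Regexp so it cannot share an engine bug.

# Try one iteration of ([a-c])a*?\2 starting at index pos.
# Yields the end index of every candidate, shortest first (lazy order).
def each_iteration_end(str, pos)
  c = str[pos]                       # candidate for group 2
  return unless c && "abc".include?(c)
  i = pos + 1
  loop do
    # \2 must repeat the captured character right after the run of "a"s.
    yield i + 1 if str[i] == c
    # a*? can only grow by consuming one more "a".
    break unless str[i] == "a"
    i += 1
  end
end

# Returns [whole_match, group_1, group_2], or nil when nothing matches.
def reference_match(str)
  (0...str.length).each do |start|
    each_iteration_end(str, start) do |mid|
      each_iteration_end(str, mid) do |stop|
        # Groups keep the values from the last (second) iteration.
        return [str[start...stop], str[mid...stop], str[mid]]
      end
    end
  end
  nil
end
```

To compare against an engine, run m = regex.match(s) and check that
m && [m[0], m[1], m[2]] equals reference_match(s). For "aaabab" and "aaa" the
reference agrees with the expected results listed above.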

FYI, I discovered this by running the Mozilla JavaScript test suite on Rhino 
with Joni as the Regexp engine. There will probably be more reports like this. 
Let me know if this is the right place for them or if I should use something 
else, such as the Joni GitHub issue tracker.
