Non-greedy item can cause Regexp to get stuck
---------------------------------------------
Key: JRUBY-5340
URL: http://jira.codehaus.org/browse/JRUBY-5340
Project: JRuby
Issue Type: Bug
Components: Core Classes/Modules
Reporter: Hannes Wallnoefer
Priority: Minor
This is a rather esoteric regexp bug best illustrated with an example.
  regex = Regexp.new(/(([a-c])a*?\2){2}/)
  => /(([a-c])a*?\2){2}/

  regex.match("aaabab")
Actual result in JRuby 1.6.0.RC1:
  => #<MatchData "aaa" 1:"a" 2:"a">
Expected result (as produced by Ruby 1.8.7):
  => #<MatchData "aaabab" 1:"bab" 2:"b">

  regex.match("aaa")
Actual result in JRuby 1.6.0.RC1:
  => #<MatchData "aaa" 1:"a" 2:"a">
Expected result (as produced by Ruby 1.8.7):
  => nil
In JRuby, the outer and inner Regexp groups both capture the same value
(apparently the third "a"). That is wrong: the outer group must contain the
inner group's text twice, once from the group itself and once from the
backreference \2, so it can never be shorter than two characters. It looks
like the non-greedy a*? after the inner group causes the same character to be
matched twice.
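
The constraint described above can be checked mechanically. The following is
only a rough sketch: captures_consistent? is a hypothetical helper written for
this report, not part of JRuby or Joni. It assumes the outer capture has the
shape "inner text, zero or more a's, inner text again", which is what the
pattern ([a-c])a*?\2 requires.

```ruby
# Hypothetical helper (not part of JRuby or Joni): checks whether the two
# captures of /(([a-c])a*?\2){2}/ are consistent with each other.
# outer is group 1, inner is group 2.
def captures_consistent?(outer, inner)
  return false if outer.nil? || inner.nil?
  # Group 1 holds group 2 plus the backreference, so it is at least twice as long.
  return false if outer.length < 2 * inner.length

  # Whatever sits between the two copies must have been consumed by a*?.
  middle = outer[inner.length...(outer.length - inner.length)]
  outer.start_with?(inner) && outer.end_with?(inner) && middle.delete("a").empty?
end

regex = /(([a-c])a*?\2){2}/
["aaabab", "aaa"].each do |input|
  m = regex.match(input)
  if m.nil?
    puts "#{input.inspect}: no match"
  else
    puts "#{input.inspect}: consistent=#{captures_consistent?(m[1], m[2])}"
  end
end
```

The captures JRuby reports (1:"a" 2:"a") fail this check, while the captures
Ruby 1.8.7 reports (1:"bab" 2:"b") pass it.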
FYI, I discovered this by running the Mozilla JavaScript test suite on Rhino
with Joni as the Regexp engine. There will probably be more reports like this.
Let me know whether this is the right place for them or whether I should use
something else, such as the Joni GitHub issue tracker.