$KCODE is ignored when a multibyte character is used inside a regular expression
--------------------------------------------------------------------------------
Key: JRUBY-1133
URL: http://jira.codehaus.org/browse/JRUBY-1133
Project: JRuby
Issue Type: Bug
Affects Versions: JRuby 1.0.0
Environment: JRuby trunk (r3847), Java 5, Mac OS X 10.4
Reporter: Shinya Kasatani
Assignee: Thomas E Enebo
Attachments: testUTF8KCodeRegex.rb, testUTF8KCodeRegex2.rb
I've attached two test cases describing this bug.
The first one (testUTF8KCodeRegex.rb) is a simple variant of
testUTF8Regex.rb: it drops the "u" flag from the regexp literals (changing
"/u" to "/") and sets $KCODE to "u" instead. It passes in MRI but fails in
JRuby. I thought that using getRuntime().getKCode() in some parts of the
regular-expression code would make this first test case pass, so I tried to
write a patch.
However, the second test case (testUTF8KCodeRegex2.rb) is more complex. Even
with the same pattern object, the result of the match differs if you change
$KCODE. This means that although org.jruby.RubyRegexp owns a jregex.Pattern
object, another jregex.Pattern instance has to be created if $KCODE was
changed after the RubyRegexp instance was created. This is harder than I
expected, so I'll look into writing the patches later.
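To make the second case concrete, here is a rough sketch of the idea of
keeping one compiled pattern per $KCODE value instead of one per regexp
object. It is only an illustration under several assumptions: the class name
KcodeAwareRegexp and the explicit kcode argument are hypothetical, it is plain
Ruby rather than JRuby internals, and because current Ruby versions ignore
$KCODE it emulates the two modes with string encodings instead of reading the
global.

```ruby
# Hypothetical sketch, not JRuby code: KcodeAwareRegexp and its kcode
# argument are illustrative names. The kcode is passed in explicitly and
# emulated with string encodings rather than read from $KCODE.
class KcodeAwareRegexp
  def initialize(source)
    @source = source.b.freeze   # keep the raw pattern bytes
    @compiled = {}              # kcode => compiled Regexp
  end

  # Compile lazily, once per kcode, instead of once per object.
  def pattern_for(kcode)
    @compiled[kcode] ||= Regexp.new(decode(@source, kcode))
  end

  def match(str, kcode)
    pattern_for(kcode).match(decode(str, kcode))
  end

  private

  # "u" reads the bytes as UTF-8 characters; anything else reads raw bytes.
  def decode(bytes, kcode)
    encoding = kcode == "u" ? Encoding::UTF_8 : Encoding::BINARY
    bytes.dup.force_encoding(encoding)
  end
end

one_char = KcodeAwareRegexp.new("^.$")
str = "\u3042"   # HIRAGANA LETTER A, three bytes in UTF-8

p one_char.match(str, "u").nil?   # false: "." consumes the whole character
p one_char.match(str, "n").nil?   # true: "." consumes only one byte
```

The same object gives different results depending on the kcode in effect at
match time, which is why a single pattern compiled at construction time
would not be enough. In RubyRegexp the cache key would presumably come from
getRuntime().getKCode() at match time, but that part is a guess.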
Please let me know if you have any good ideas or if I've got something wrong.
BTW thanks for the T-shirt today at Shibuya :)