$KCODE is ignored when multibyte character is used inside regular expression
----------------------------------------------------------------------------

                 Key: JRUBY-1133
                 URL: http://jira.codehaus.org/browse/JRUBY-1133
             Project: JRuby
          Issue Type: Bug
    Affects Versions: JRuby 1.0.0
         Environment: JRuby trunk (r3847), Java 5, Mac OS X 10.4
            Reporter: Shinya Kasatani
            Assignee: Thomas E Enebo
         Attachments: testUTF8KCodeRegex.rb, testUTF8KCodeRegex2.rb

I've attached two test cases that demonstrate this bug.
The first one (testUTF8KCodeRegex.rb) is a simple variant of testUTF8Regex.rb: 
it replaces the "/u" regexp flag with a plain "/" and sets $KCODE to "u". It 
passes in MRI but fails in JRuby. I thought that consulting 
getRuntime().getKCode() in the relevant parts of the regular expression code 
would make this first test case pass, so I tried to write a patch.

However, the second test case (testUTF8KCodeRegex2.rb) is more complex. Even 
with the same pattern object, the result of a match differs if you change 
$KCODE. This means that although org.jruby.RubyRegexp owns a jregex.Pattern 
object, a new jregex.Pattern instance has to be created whenever $KCODE has 
changed since the RubyRegexp instance was created. This is harder than I 
expected, so I'll work on the patches later.
Please let me know if you have any good ideas or if I've got something wrong.
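
To make the second problem concrete, here is a rough sketch of the kind of 
caching I have in mind. It is not JRuby code: the class KCodeAwareRegexp, the 
KCode enum and the method names are all made up for illustration, and 
java.util.regex.Pattern with the UNICODE_CHARACTER_CLASS flag stands in for 
jregex.Pattern and for whatever $KCODE really affects. The only point is that 
the compiled pattern is rebuilt when the mode differs from the one it was 
compiled under.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical stand-in for a regexp object that must honor a global
 * encoding mode which can change after the object has been created.
 */
public class KCodeAwareRegexp {
    /** Hypothetical subset of the modes $KCODE can select. */
    public enum KCode { NONE, UTF8 }

    private final String source;   // pattern source, kept for recompilation
    private Pattern cached;        // pattern compiled for cachedKCode
    private KCode cachedKCode;     // mode the cached pattern was built with
    private int compileCount = 0;  // only here to make recompiles observable

    public KCodeAwareRegexp(String source) {
        this.source = source;
    }

    /** Returns a pattern for the given mode, recompiling only on a change. */
    public synchronized Pattern patternFor(KCode current) {
        if (cached == null || cachedKCode != current) {
            // Assumption: a Unicode-aware flag plays the role of $KCODE = "u".
            int flags = (current == KCode.UTF8)
                    ? Pattern.UNICODE_CHARACTER_CLASS
                    : 0;
            cached = Pattern.compile(source, flags);
            cachedKCode = current;
            compileCount++;
        }
        return cached;
    }

    /** Matches the whole input using the pattern for the current mode. */
    public boolean matches(String input, KCode current) {
        return patternFor(current).matcher(input).matches();
    }

    public synchronized int getCompileCount() {
        return compileCount;
    }
}
```

With this sketch, the same object built from "\\w+" gives different answers 
for a non-ASCII word depending on the mode passed in, and it compiles once per 
mode change rather than once per match. Whether a single cached pattern is 
enough, or one pattern per mode should be kept, is an open design choice.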

BTW thanks for the T-shirt today at Shibuya :)


