Unicode regular expressions by UTF-8 don't work
-----------------------------------------------

                 Key: JRUBY-2982
                 URL: http://jira.codehaus.org/browse/JRUBY-2982
             Project: JRuby
          Issue Type: Bug
          Components: Core Classes/Modules
    Affects Versions: JRuby 1.1.4
         Environment: Linux, OSX 10.5.4
            Reporter: Yoko Harada

Regular expressions that use the Unicode property names described in Oniguruma's documentation don't work when the script file is saved in UTF-8 encoding. For example, the following script raises an exception: "invalid character property name {Katakana}: /\p{Katakana}/ (RegexpError)".

{code}
# -*- coding: UTF-8 -*-
$KCODE = "utf8"
p 'abcアイウαβγ'.scan(/[a-z]/)
p 'abcアイウαβγ'.scan(/\p{Katakana}/)
p 'abcアイウαβγ'.scan(/\p{^Greek}/)
p 'abcアイウαβγ'.scan(/[\u0370-\u30FF]/)
{code}

"/\p{Katakana}/u" raises the same exception.

In contrast, current Ruby 1.9 (ruby 1.9.0 (2008-08-26 revision 18849) [i386-darwin9.4.0]) outputs:

{code}
warning: variable $KCODE is no longer effective; ignored
["a", "b", "c"]
["ア", "イ", "ウ"]
["a", "b", "c", "ア", "イ", "ウ"]
["ア", "イ", "ウ", "α", "β", "γ"]
{code}

When I recompiled JRuby 1.1.4 with joni's USE_UNICODE_PROPERTIES option set to true, the Unicode property name expressions worked just as they do in Ruby 1.9. The last expression, the Unicode codepoint range, still didn't work even after recompiling.
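
To check whether a given runtime is affected without aborting on the first failure, a minimal sketch along the following lines may help. It assumes that an unsupported property name surfaces as a RegexpError when the pattern is compiled at run time, as in the exception quoted in this report. The helper name unicode_property_supported? is hypothetical and is not part of JRuby or joni.

```ruby
# Hypothetical helper (not part of JRuby or joni): reports whether the
# running regexp engine accepts the given Unicode property name.
def unicode_property_supported?(name)
  # Build the pattern at run time so an unsupported property raises
  # here instead of failing when the whole script is parsed.
  Regexp.new("\\p{#{name}}")
  true
rescue RegexpError
  false
end

# Probe the two property names used in the report.
%w[Katakana Greek].each do |name|
  puts "#{name}: #{unicode_property_supported?(name)}"
end
```

Building the pattern from a string rather than writing a regexp literal is a deliberate choice in this sketch: a literal such as /\p{Katakana}/ is compiled when the file is loaded, so the error could not be rescued from inside the same script.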