Hi,

I am using JRuby 1.0.3 and have run into an annoying problem with it. If I
run a script like this: "jruby simp_xml.rb"
          simp_xml.rb:
           "
XMLDECL_PATTERN = /<\?xml\s+(.*?)\?>/um

source='<?xml version="1.0" ?><methodCall><methodName>Hex</methodName><params><param><value><string>漱</string></value></param></params></methodCall>'

md=XMLDECL_PATTERN.match(source)

puts md
puts $'
"
I always get incorrect output like this:
"
<?xml version="1.0" ?>
<methodCall><methodName>Hex</methodName><params><param><value>
<string></string></value></param></params></methodCall>
"
The correct output should be like this:
"
<?xml version="1.0" ?>
<methodCall><methodName>Hex</methodName><params><param><value>
<string>漱</string></value></param></params></methodCall>
"
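To narrow this down, the same check can be done on raw bytes instead of on printed text, which rules out the console encoding as the culprit. The following is only a minimal sketch: it assumes the script file is saved as UTF-8, it uses a shortened document, and "post_match_bytes" is a helper name made up for illustration (it is not part of JRuby or of my original script).

```ruby
# Sketch only: compares the bytes after the XML declaration with the bytes
# we expect, so the result does not depend on how the console prints them.
XMLDECL_PATTERN = /<\?xml\s+(.*?)\?>/um

# Hypothetical helper (illustration only): returns the bytes of the text
# that follows the matched XML declaration, or nil if nothing matched.
def post_match_bytes(source)
  md = XMLDECL_PATTERN.match(source)
  return nil if md.nil?
  md.post_match.unpack('C*')
end

# Shortened version of the document from simp_xml.rb.
source   = '<?xml version="1.0" ?><string>漱</string>'
expected = '<string>漱</string>'.unpack('C*')
actual   = post_match_bytes(source)

puts(actual == expected ? 'post-match bytes intact' : 'post-match bytes differ')
```

If the interpreter keeps the multi-byte character, the two byte arrays should be equal; if they differ, the character is being lost inside the match itself and not while printing.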
Apparently the root cause is a bug in the "RubyRegexp.java" file under the
"org/jruby" directory in JRuby 1.0.3:
"
/** rb_reg_match
     *
     */
    public IRubyObject match(IRubyObject target) {
        if (target.isNil()) {
            return getRuntime().getFalse();
        }
        // FIXME:  I think all String expecting functions has this magic via RSTRING
        if (target instanceof RubySymbol || target instanceof RubyHash || target instanceof RubyArray) {
            return getRuntime().getFalse();
        }
        *// FIXME: make Unicode-aware
        RubyString ss = RubyString.stringValue(target);
        String string = ss.toString();*

        if (string.length() == 0 && "^$".equals(pattern.toString())) {
            string = "\n";
        }

        int result = search(string, ss, 0);

        return result < 0 ? getRuntime().getNil() :
            getRuntime().newFixnum(result);
    }
"
Those lines (the ones marked with asterisks above) stripped the non-ASCII
character in "<string>漱</string>". I noticed that JRuby 1.1 has fixed this
bug and refactored most of the "RubyRegexp.java" file. Unfortunately,
upgrading to 1.1 is not an option for me. Can anyone help me figure out a
workaround for 1.0.3? I need this badly...

Thanks anyway.
Song Ma
