Hi,
I am using JRuby 1.0.3 and have run into an annoying problem with it. If I
run the following script with "jruby simp_xml.rb":
simp_xml.rb:
"
XMLDECL_PATTERN = /<\?xml\s+(.*?)\?>/um
source='<?xml version="1.0" ?><methodCall><methodName>Hex</methodName><params><param><value><string>漱</string></value></param></params></methodCall>'
md=XMLDECL_PATTERN.match(source)
puts md
puts $'
"
I always get incorrect output like this:
"
<?xml version="1.0" ?>
<methodCall><methodName>Hex</methodName><params><param><value>
<string></string></value></param></params></methodCall>
"
The correct output should be like this:
"
<?xml version="1.0" ?>
<methodCall><methodName>Hex</methodName><params><param><value>
<string>漱</string></value></param></params></methodCall>
"
Apparently the root cause is a bug in the "RubyRegexp.java" file under the
"org/jruby" directory in JRuby 1.0.3:
"
/** rb_reg_match
*
*/
public IRubyObject match(IRubyObject target) {
if (target.isNil()) {
return getRuntime().getFalse();
}
// FIXME: I think all String expecting functions has this magic via RSTRING
if (target instanceof RubySymbol || target instanceof RubyHash ||
target instanceof RubyArray) {
return getRuntime().getFalse();
}
*// FIXME: make Unicode-aware
RubyString ss = RubyString.stringValue(target);
String string = ss.toString();*
if (string.length() == 0 && "^$".equals(pattern.toString())) {
string = "\n";
}
int result = search(string, ss, 0);
return result < 0 ? getRuntime().getNil() :
getRuntime().newFixnum(result);
}
"
The lines marked with asterisks strip the non-ASCII character from
"<string>漱</string>". I noticed that JRuby 1.1 has fixed this bug and
refactored most of the "RubyRegexp.java" file. Unfortunately, upgrading to
1.1 is not an option for me. Can anyone help me figure out a workaround for
1.0.3? I need this badly...
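One possible direction, as an untested sketch: split off the XML declaration
with plain substring search instead of Regexp#match. This assumes that
String#index with a string argument and String#[] do not go through the same
code path in 1.0.3, which I have not verified. split_xml_decl is a made-up
helper name, and it is looser than the pattern above because it does not
require whitespace after "<?xml".

```ruby
# Hypothetical helper (not part of JRuby or any library): returns
# [declaration, rest]. The declaration is nil when none is found.
# Assumption: substring search is not affected by the regexp problem.
def split_xml_decl(source)
  start = source.index('<?xml')
  return [nil, source] if start.nil?
  stop = source.index('?>', start)
  return [nil, source] if stop.nil?
  stop += 2 # move past the closing "?>"
  [source[start...stop], source[stop..-1]]
end

decl, rest = split_xml_decl('<?xml version="1.0" ?><value><string>漱</string></value>')
puts decl
puts rest
```

Here decl plays the role of md and rest the role of $' in the script above.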
Thanks anyway.
Song Ma