UTF-8 char in XML hangs in Joni
-------------------------------

                 Key: JRUBY-6204
                 URL: https://jira.codehaus.org/browse/JRUBY-6204
             Project: JRuby
          Issue Type: Bug
    Affects Versions: JRuby 1.6.5, JRuby 1.6.4
         Environment: Nokogiri 1.5.0
                      export JRUBY_OPTS="--1.9"
            Reporter: Anders Bengtsson
            Assignee: Thomas E Enebo
         Attachments: regexp_killer.rb

In 1.9 mode, when an XML string contains a non-ASCII UTF-8 character, Nokogiri
runs a regexp match that never returns from Joni. Thread dump of the hung process:

"main" prio=5 tid=7fcbdc801000 nid=0x10971c000 runnable [10971a000]
   java.lang.Thread.State: RUNNABLE
        at org.joni.Matcher.matchCheck(Matcher.java:293)
        at org.joni.Matcher.search(Matcher.java:461)
        at org.jruby.RubyRegexp.search(RubyRegexp.java:1489)
        at org.jruby.RubyRegexp.op_match(RubyRegexp.java:1406)
        at org.jruby.ast.Match3Node.interpret(Match3Node.java:101)
        at org.jruby.ast.OrNode.interpret(OrNode.java:98)
        at org.jruby.ast.IfNode.interpret(IfNode.java:111)
        at org.jruby.ast.LocalAsgnNode.interpret(LocalAsgnNode.java:123)
        at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
        at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
        at org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:75)
        at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:190)
        at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:179)
        at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:312)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:169)
        at regexp_killer.__file__(regexp_killer.rb:4)
        at regexp_killer.load(regexp_killer.rb)
        at org.jruby.Ruby.runScript(Ruby.java:693)
        at org.jruby.Ruby.runScript(Ruby.java:686)
        at org.jruby.Ruby.runNormally(Ruby.java:593)
        at org.jruby.Ruby.runFromMain(Ruby.java:442)
        at org.jruby.Main.doRunFromMain(Main.java:321)
        at org.jruby.Main.internalRun(Main.java:241)
        at org.jruby.Main.run(Main.java:207)
        at org.jruby.Main.run(Main.java:191)
        at org.jruby.Main.main(Main.java:171)

This script completes in 1.8 mode but hangs in 1.9 mode on JRuby 1.6.4, 1.6.5,
and HEAD:

#encoding: utf-8
require 'nokogiri'
xml = %q{<?xml version="1.0" encoding="UTF-8"?><hörna/>}
parsed_xml = Nokogiri.parse(xml)
puts "done!"
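
To tell a hang from a slow parse when running the script above, one option is
to run the call in a worker thread and wait for it with a time limit. The
sketch below is only an illustration: finishes_within? is a hypothetical helper,
not part of JRuby or Nokogiri, and it assumes the runtime can still schedule the
main thread while the worker is stuck. Whether the stuck worker can actually be
killed on an affected JRuby is not something this report establishes.

```ruby
# Hypothetical helper (not part of JRuby or Nokogiri): runs the given block in
# a worker thread and reports whether it finished within `seconds`.
def finishes_within?(seconds)
  worker = Thread.new { yield }
  # Thread#join(limit) returns nil if the limit expires before the thread ends.
  finished = !worker.join(seconds).nil?
  # Best-effort cleanup; a thread spinning inside native matching code may
  # not respond to this.
  worker.kill unless finished
  finished
end

# Intended use with the script above (requires Nokogiri, so left commented out):
#   puts finishes_within?(5) { Nokogiri.parse(xml) }
```

A result of false after a generous limit would suggest the match is looping
rather than merely slow, which is consistent with the RUNNABLE state in the
thread dump.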


