[jruby-dev] [jira] (JRUBY-6668) StringScanner#scan_until spins forever on UTF-8 data

Scott Gonyea (JIRA) Wed, 16 May 2012 19:48:09 -0700

Scott Gonyea created JRUBY-6668:
-----------------------------------

             Summary: StringScanner#scan_until spins forever on UTF-8 data
                 Key: JRUBY-6668
                 URL: https://jira.codehaus.org/browse/JRUBY-6668
             Project: JRuby
          Issue Type: Bug
    Affects Versions: JRuby 1.6.6
         Environment: Mac OS X Lion.


java -version
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04-415-11M3635)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01-415, mixed mode)

JRuby 1.6.5 / 1.6.6
            Reporter: Scott Gonyea
            Assignee: Thomas E Enebo


While running the tests for the Ruby library 'mustache'
(https://github.com/defunkt/mustache), one test in particular fails:

https://github.com/defunkt/mustache/blob/master/test/mustache_test.rb#L510-522

JRuby hangs when calling StringScanner#scan_until here:

https://github.com/defunkt/mustache/blob/master/lib/mustache/parser.rb#L231

You can reproduce the issue with the following:

require 'strscan'
regex = /(^[ \t]*)?\{\{/
text = "<h1>&#20013;&#25991; {{test}}</h1>\n\n{{> utf8_partial}}\n"
text.force_encoding 'BINARY'
scanner = StringScanner.new(text)
scanner.scan_until(regex) # Fans spin up, and this method never returns.

This seems to happen regardless of whether JRuby is in 1.8 or 1.9 mode.
I am running the test like so:

JRUBY_OPTS=--1.9 ruby -I"lib:test" test/mustache_test.rb -n test_utf8 -v

I've also run it with: JRUBY_OPTS="--1.9 LC_ALL=en_US.UTF-8"

It appears that this affects multibyte UTF-8 characters.  If I replace the
Chinese characters with "foo bar", then there is no problem.
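
The comparison described in this report (multibyte text versus plain
"foo bar") can be sketched as a small script that guards each scan with a
time limit, so a hang is reported instead of blocking the run. This is only
a sketch, not part of the original report: the helper name
scan_with_timeout and the 5-second limit are hypothetical, and it assumes
Timeout is able to interrupt the scan, which may not hold on an affected
JRuby if the loop never yields.

```ruby
require 'strscan'
require 'timeout'

# Pattern taken from the report (mustache's parser uses a similar one).
REGEX = /(^[ \t]*)?\{\{/

# Hypothetical helper: scan a binary-tagged copy of the text and return
# the matched prefix, or :timed_out if the call does not return in time.
def scan_with_timeout(text, seconds = 5)
  scanner = StringScanner.new(text.dup.force_encoding('BINARY'))
  Timeout.timeout(seconds) { scanner.scan_until(REGEX) }
rescue Timeout::Error
  :timed_out
end

# \u4E2D\u6587 are the two Chinese characters from the failing test.
utf8_text  = "<h1>\u4E2D\u6587 {{test}}</h1>\n\n{{> utf8_partial}}\n"
ascii_text = "<h1>foo bar {{test}}</h1>\n\n{{> utf8_partial}}\n"

puts "utf8:  #{scan_with_timeout(utf8_text).inspect}"
puts "ascii: #{scan_with_timeout(ascii_text).inspect}"
```

On an unaffected interpreter both calls should return the text up to and
including the first "{{"; a :timed_out result for the first call only would
match the behaviour described above.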

