Rails + Cyrillic chars = broken gsub/regexp
-------------------------------------------

                 Key: JRUBY-2725
                 URL: http://jira.codehaus.org/browse/JRUBY-2725
             Project: JRuby
          Issue Type: Bug
          Components: Miscellaneous
    Affects Versions: JRuby 1.2
         Environment: JRuby 1.2, rails (2.1.0), Win XP, source file encoding: cp-1251
            Reporter: Oleksandr Shyshko
         Attachments: jruby_gsub.rb

REPRODUCE BY

1. Start with the following code (attached):
=======================================
jruby_gsub.rb
=======================================
def assert_equal expected, actual
    raise "Was: #{actual}, expected: #{expected}" if expected != actual 
end

a = "м'який" # Ukrainian word "m'akiy" - "soft"
assert_equal(6,        a.length)
assert_equal("м_який", a.gsub(/\'/, "_"))
=======================================

Execute it; there should be no messages.

2. Now install Rails 2.1.0 (current)
# gem install rails

3. Generate some application
# rails someapp

4. Put "jruby_gsub.rb" into the "./someapp/tools" directory

5. Edit "jruby_gsub.rb" and add one line at the top:
=======================================
require '../config/environment.rb' # load Rails stuff

def assert_equal expected, actual
    raise "Was: #{actual}, expected: #{expected}" if expected != actual 
end

a = "м'який" # Ukrainian word "m'akiy" - "soft"
assert_equal(6,        a.length)
assert_equal("м_який", a.gsub(/\'/, "_"))
=======================================

Run "jruby_gsub.rb". You should get something like this:

=======================================
someapp/tools/jruby_gsub.rb:4:in `assert_equal': Was: ?'????, expected: ?_???? (RuntimeError)
        from someapp/tools/jruby_gsub.rb:9:in `someapp/tools/jruby_gsub.rb'
        from someapp/tools/jruby_gsub.rb:1:in `load'
        from -e:1
=======================================

CONCLUSION
It seems that loading Rails breaks the behavior of regexp/gsub when used on
Unicode strings.
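
A possible byte-level explanation, offered as an assumption rather than a
confirmed diagnosis: the source file is cp-1251, where each Cyrillic letter is
a single byte at 0xC0 or above. If loading Rails makes the regexp engine read
strings as UTF-8, those bytes look like the start of multibyte sequences, and
the apostrophe could be consumed as part of one. The sketch below uses the
String encoding API of a newer Ruby (1.9+) than the one in this report, so it
only illustrates the byte layout and does not reproduce the bug. All variable
names are illustrative.

```ruby
# Sketch only: shows how the cp-1251 bytes of the test word look to a
# UTF-8 reader. Assumes Ruby 1.9+ with UTF-8 as the source encoding.

word   = "м'який"                      # UTF-8 literal
cp1251 = word.encode("Windows-1251")   # bytes as a cp-1251 source file stores them

# One byte per character in cp-1251; the apostrophe is plain ASCII 0x27.
byte_dump = cp1251.bytes.map { |b| format("%02X", b) }.join(" ")
puts byte_dump                         # EC 27 FF EA E8 E9

# Relabelled as UTF-8, the same bytes are malformed: 0xEC announces a
# three-byte sequence, so a UTF-8 scanner may not see 0x27 on its own.
misread = cp1251.dup.force_encoding("UTF-8")
puts misread.valid_encoding?           # false

# With the correct encoding label, the substitution from the report works.
fixed = cp1251.gsub(/'/, "_").encode("UTF-8")
puts fixed                             # м_який
```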

POSSIBLE SECURITY BREAK
This also affects ActiveRecord, which uses gsub or regexps to escape quotes
and similar characters in user input.
ActiveRecord is unable to properly escape quotes when they are mixed with
Cyrillic characters, so this may be exploitable for SQL injection.
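
As a rough illustration of the risk, here is a minimal sketch of a defensive
check that could sit in front of quote escaping. The helper name
escape_single_quotes is hypothetical and this is not ActiveRecord's quoting
code; it assumes Ruby 1.9+ String encodings, which the environment in this
report may not provide.

```ruby
# Hypothetical helper for illustration; NOT ActiveRecord's implementation.
def escape_single_quotes(value)
  # Refuse input whose bytes do not match its declared encoding, because a
  # scanner that misreads multibyte sequences can skip over a quote.
  unless value.valid_encoding?
    raise ArgumentError, "malformed #{value.encoding} input"
  end
  value.gsub("'", "''")   # double each single quote
end

puts escape_single_quotes("м'який")

# cp-1251 bytes mislabelled as UTF-8 are rejected instead of being escaped.
malformed = "м'який".encode("Windows-1251").force_encoding("UTF-8")
begin
  escape_single_quotes(malformed)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

The point of the check is to fail loudly on mislabelled input, rather than
return a string in which a quote may have been left unescaped.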

