Hello rubyonrails-core,

I’ve been looking into possible changes to ActiveRecord / Arel to make it 
easier to write Rails applications that are free of SQL injection 
vulnerabilities, and in particular do so in a way that makes it easy for a 
code reviewer to verify that the app is safe from such bugs.

The concern:
-----------------

With the ActiveRecord API as is, it’s relatively easy to write applications 
with SQL injection bugs, and those bugs don’t particularly stand out in 
code reviews.  For example,
  Customer.where("name like "%#{params[:id]}%")
is vulnerable, while
  Customer.where("name = ?”, params[:id])
  Customer.where(name => params[:id])
  Customer.where(Customer.arel_table[:name].matches('%#{name_pattern}%')
etc are safe.  

While a careful security code review would likely spot this type of bug, 
there is still a significant risk that it’d be missed, in particular in an 
agile project with frequent releases that can’t afford to do an in-depth 
security review between each release.

Proposal:
-------------

I’m wondering if it would be possible to introduce a “strict arel” mode for 
ActiveRecord which if turned on allows only SQL-safe Arel-style queries 
(including queries that are internally built via Arel by ActiveRecord, such 
as hash queries and find_by_xyz queries). 

It looks like there’d be two main aspects to an implementation of this mode:

1) A change to Arel to allow the construction of SqlLiterals to be 
restricted to values that are program constants (i.e., fully determined at 
application boot time, before the first request is served).

2) A change to ActiveRecord to allow only execution of SQL statements that 
have been obtained by flattening an Arel AST.


#1 is necessary for two reasons:
- There are a number of code sites in Arel itself where SqlLiterals are 
constructed from String’s such that it’s impossible to be sure that the 
String’s value is a program constant (and not derived from user input).

E.g., 
gems/arel-2.2.1/lib/arel/select_manager.rb
  def group *columns
    columns.each do |column|
      # FIXME: backwards compat
      column = Nodes::SqlLiteral.new(column) if String === column
      column = Nodes::SqlLiteral.new(column.to_s) if Symbol === column

- code sites in ActiveRecord where a String is explicitly converted into a 
SqlLiteral, e.g. to support the Table.where(“foo = ‘#{bar}’”) form of 
queries:

gems/activerecord-3.1.1/lib/active_record/relation/query_methods.rb
  def collapse_wheres(arel, wheres)
[...]
    (wheres - equalities).each do |where|
      where = Arel.sql(where) if String === where


#2 is necessary to prevent direct execution of SQL statements that are not 
constructed via Arel, e.g by direct calls to find_by_sql.


Implementation Ideas:
----------------------------

For #1, add a registry to Arel of constant values that may be used as 
SqlLiterals.  If “strict Arel” mode is enabled, and the app is in 
production mode, the registry is “locked” at the end of application boot 
(after Rails.initialized? is true), and the SqlLiteral constructor would 
only allow creation of previously registered SQL literals.  

The registry could allow registration of the actual literal values 
themselves, or be in terms of Symbol-indexed hashes, or both.

Any code inside Arel and ActiveRecord that instantiates SqlLiterals would 
need to add appropriate calls to register its literals at load time. For 
example,  arel-2.2.1/lib/arel/expressions.rb would call,
  Arel.register_sql('sum_id', ‘max_id’, ….)

In addition, application code itself could register literals at load time, 
to white-list custom statements, as in

class CustomersController < ApplicationController
Arel.register_sql {:customer_by_id => "name = ?"}
def show
  @customer = Customer.where(:customer_by_id, params[:id])

Note that this effectively restricts the app to using only statements 
written in terms of bind variables, which is exactly what we want (since 
the statement has to be registered at load time, it can’t possibly include 
the value of a request parameter, e.g via string interpolation).

For an initial experiment, see below.

For #2, a possible approach might be to encapsulate strings obtained from 
flattening Arel in a specific type, e.g. “ArelSqlStatement”; probably in 
ActiveRecord::ConnectionAdapters::DatabaseStatements.to_sql .
Then (if strict arel mode is enabled), in 
ActiveRecord::ConnectionAdapters::XyzAdapter.select,  enforce that sql === 
ArelSqlStatement.

Question:
------------

* Is this a terrible idea, or is there some reason this wouldn’t work in 
practice?  Note that it would of course be optional, and developers who 
enable this would make a conscious decision to restrict themselves to 
Arel-style queries in their code, in return for higher confidence that 
their app is free of SQL injection bugs.

* Would you be open to accepting a patch for the above to Arel and 
ActiveRecord?


Experiments:
-----------------

As an experiment, I tried the following monkey patch (which obviously is 
missing the registration mechanism; and the whitelisting of 
Arel/ActiveRecord-internal SQL literals):

$ cat vendor/plugins/strict_arel/lib/strict_arel.rb
# StrictArel

require 'arel'
module Arel
WHITELISTED_RAW_SQL = ["\0", '?', '*'] by 
module Nodes
  class SqlLiteral
    alias_method :sqlLiteralInitialize_Do_Not_Call_This_Or_Else, :initialize

    def initialize(raw_sql)
      if WHITELISTED_RAW_SQL.include?(raw_sql)
        sqlLiteralInitialize_Do_Not_Call_This_Or_Else(raw_sql)
      else
        raise "non-whitelisted value for Arel::Nodes::SqlLiteral: 
#{raw_sql}"
      end
    end
  end

end

This appears to pretty much work as intended:

* Basic (primary key) queries work (which are internally constructed into 
arel):

ruby-1.9.2-p290 :002 > Customer.first
Customer Load (0.4ms)  SELECT `customers`.* FROM `customers` LIMIT 1
=> #<Customer id: 3, name: "BooBaz", credit: "baz", created_at: "2011-10-26 
22:01:36", updated_at: "2011-10-26 22:24:40">

ruby-1.9.2-p290 :021 >   Customer.find(3)
Customer Load (0.2ms)  SELECT `customers`.* FROM `customers` WHERE 
`customers`.`id` = 3 LIMIT 1
=> #<Customer id: 3, name: "BooBaz", credit: "baz", created_at: "2011-10-26 
22:01:36", updated_at: "2011-10-26 22:24:40">

* by_{column} queries work:

ruby-1.9.2-p290 :047 > Customer.find_by_name_and_credit('BooBaz', 'baz')
Customer Load (0.4ms)  SELECT `customers`.* FROM `customers` WHERE 
`customers`.`name` = 'BooBaz' AND `customers`.`credit` = 'baz' LIMIT 1
=> #<Customer id: 3, name: "BooBaz", credit: "baz", created_at: "2011-10-26 
22:01:36", updated_at: "2011-10-26 22:24:40">

* Custom Arel-based queries work:

ruby-1.9.2-p290 :005 > t = Customer.arel_table
=> #<Arel::Table:0x000000026da950 @name="customers", 
@engine=ActiveRecord::Base, @columns=nil, @aliases=[], @table_alias=nil, 
@primary_key=nil>

ruby-1.9.2-p290 :022 > Customer.where(t[:name].eq('BooBaz'))
Customer Load (0.3ms)  SELECT `customers`.* FROM `customers` WHERE 
`customers`.`name` = 'BooBaz'
=> [#<Customer id: 3, name: "BooBaz", credit: "baz", created_at: 
"2011-10-26 22:01:36", updated_at: "2011-10-26 22:24:40">]
ruby-1.9.2-p290 :024 > Customer.where(t[:name].matches('%Boo%'))
Customer Load (0.3ms)  SELECT `customers`.* FROM `customers` WHERE 
(`customers`.`name` LIKE '%Boo%')
=> [#<Customer id: 3, name: "BooBaz", credit: "baz", created_at: 
"2011-10-26 22:01:36", updated_at: "2011-10-26 22:24:40">]

* Hash queries work:
* Projections, order, etc work, but columns need to be given as arel column 
objects:

ruby-1.9.2-p290 :025 > Customer.select(t[:name]).where(:name => 
'BooBaz').order(t[:name]).limit(10)
Customer Load (0.3ms)  SELECT `customers`.`name` FROM `customers` WHERE 
`customers`.`name` = 'BooBaz' ORDER BY `customers`.`name` LIMIT 10
=> [#<Customer name: "BooBaz">]


** However, Queries using string fragments end up calling Arel.sql when the 
underlying arel is constructed, and are rejected as intended:

ruby-1.9.2-p290 :035 > Customer.where("name = 'evil'")
RuntimeError: non-whitelisted value for Arel::Nodes::SqlLiteral: name = 
'evil'
      from 
/usr/local/google/xtof/s/review/b-5393417-ghost-rails/sample-app/rubymine_sample_native_ruby/vendor/plugins/strict_arel/lib/strict_arel.rb:19:in
 
`initialize'
      from 
/home/xtof/.rvm/gems/ruby-1.9.2-p290/gems/arel-2.2.1/lib/arel.rb:39:in `new'


Best Regards,
Christoph

-- 
You received this message because you are subscribed to the Google Groups "Ruby 
on Rails: Core" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/rubyonrails-core/-/lyWteomJxKQJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/rubyonrails-core?hl=en.

Reply via email to