find_in_batches has several major limitations:
 1) order and limit are not supported
 2) joins can break it (i.e. if there end up being multiple rows with the 
same primary key, the next batch starts after that key and misses the rows 
that were truncated by the limit)
 3) it forces a table scan because it's ordering by primary key, making it 
inefficient
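
To illustrate point 2, here is a minimal pure-Ruby sketch. It assumes batching works as "order by id, limit N, then continue from id > last id seen". The Row struct, the sample data and pk_batches are made up for illustration and are not ActiveRecord code.

```ruby
# Simulated result of a join: several rows can share one primary key.
Row = Struct.new(:id, :tag)

JOINED_ROWS = [
  Row.new(1, "a"), Row.new(1, "b"),
  Row.new(2, "a"), Row.new(2, "b"), Row.new(2, "c"),
  Row.new(3, "a")
].freeze

# Hypothetical helper that mimics primary-key batching:
# ORDER BY id LIMIT batch_size, then WHERE id > last_id for the next batch.
def pk_batches(rows, batch_size)
  batches = []
  last_id = nil
  loop do
    candidates = rows.select { |r| last_id.nil? || r.id > last_id }
    batch = candidates.sort_by { |r| [r.id, r.tag] }.first(batch_size)
    break if batch.empty?
    batches << batch
    last_id = batch.last.id
    # A short batch means the result set is exhausted.
    break if batch.size < batch_size
  end
  batches
end

seen   = pk_batches(JOINED_ROWS, 3).flatten
missed = JOINED_ROWS - seen
missed.each { |r| puts "missed id=#{r.id} tag=#{r.tag}" }
```

With a batch size of 3, the first batch is cut off in the middle of id 2, and the second batch starts at id > 2, so the remaining id 2 rows are never yielded.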

All of these can be worked around by using either cursors or temporary 
tables. Would a patch to automatically use such features (when the need for 
them is detected, e.g. where it currently warns about order and limit) 
be accepted? What would be the suggested way to structure such a patch, given 
that it would use DB-specific features? Add some stubs to SchemaStatements 
or AbstractAdapter, and call them from find_in_batches? I'm guessing 
detecting the adapter inline is frowned upon.
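
As a rough sketch of the stub approach, something along these lines is what I have in mind. Every name here (SketchAbstractAdapter, supports_batch_cursors?, each_batch_with_cursor, sketch_find_in_batches) is hypothetical and not an existing Rails API, and plain arrays of hashes stand in for query results.

```ruby
# Hypothetical base adapter: by default, no cursor support is advertised.
class SketchAbstractAdapter
  def supports_batch_cursors?
    false
  end

  def each_batch_with_cursor(_rows, _batch_size)
    raise NotImplementedError, "#{self.class} does not support cursors"
  end
end

# Hypothetical DB-specific adapter that overrides the stubs.
class SketchCursorAdapter < SketchAbstractAdapter
  def supports_batch_cursors?
    true
  end

  # Stand-in for DECLARE CURSOR / FETCH: walks the rows in the order given,
  # so a custom order survives batching.
  def each_batch_with_cursor(rows, batch_size, &block)
    rows.each_slice(batch_size, &block)
  end
end

# The caller asks the adapter what it supports instead of checking its class.
def sketch_find_in_batches(adapter, rows, batch_size:, ordered: false, &block)
  if ordered && adapter.supports_batch_cursors?
    adapter.each_batch_with_cursor(rows, batch_size, &block)
  else
    warn "custom order ignored; batching by id" if ordered
    rows.sort_by { |r| r[:id] }.each_slice(batch_size, &block)
  end
end

rows = [{ id: 3, name: "a" }, { id: 1, name: "b" }, { id: 2, name: "c" }]

cursor_ids = []
sketch_find_in_batches(SketchCursorAdapter.new, rows,
                       batch_size: 2, ordered: true) do |batch|
  cursor_ids << batch.map { |r| r[:id] }
end

fallback_ids = []
sketch_find_in_batches(SketchAbstractAdapter.new, rows,
                       batch_size: 2, ordered: true) do |batch|
  fallback_ids << batch.map { |r| r[:id] }
end
```

The point is that find_in_batches only queries a capability method, so adapters without cursor support keep the current behavior and warning.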

For reference, we're using such an implementation right now in our project 
(Rails 2.3; we're in the process of 
upgrading): 
https://github.com/instructure/canvas-lms/blob/release/2013-11-16.13/config/initializers/active_record.rb#L499-598.

Cody Cutrer
