On Oct 6, 10:10 am, jreidthompson <[email protected]> wrote:
> Does sequel automatically use cursors for postgresql/can I make it use
> a cursor?
> I have a table with > 10M rows that I need to iterate over and update

Sequel::Dataset#each iterates over rows as the database adapter
provides them.  So the answer depends on whether the underlying
database driver keeps all rows in memory or not.  I've heard from
multiple users that the MySQL adapter does not keep all rows in
memory, but I haven't heard reports either way about the PostgreSQL
adapter.  Can you try the following code:

  DB[:table_with_10m_rows].each{|row|}

If memory usage for the process grows without bound, it's probably
keeping all rows in memory.  If memory usage is fairly constant, it's
probably not.

Note that with Sequel, iterating over an extremely large dataset and
updating it at the same time can be problematic due to the connection
pool.  The connection pool is based on threads, and if you are
iterating over a large dataset, you can't use the same connection to
update inside the Dataset#each block.  There are a couple ways to fix
this.  One is using a separate thread for each update:

  ds = DB[:table_with_10m_rows]
  ds.each{|row| Thread.new{ds.filter(:id=>row[:id]).update(...)}.join}

That's fairly simple but creates a lot of threads and may be slow.
Another is using produer/consumer threads (assuming a reasonable Queue
implementation):

  ds = DB[:table_with_10m_rows]
  queue = Queue.new(10)
  consumer = Thread.new do
    while row = queue.pop do
      ds.filter(:id=>row[:id]).update(...)
    end
  end
  ds.each{|row| queue.push(row)}
  queue.push nil
  consumer.join

Something I've thought about recently is abusing Sequel's sharding
support for this:

  DB = Sequel.postgres(..., :servers=>{:backup=>{}})
  ds = DB[:table_with_10m_rows]
  ds.server(:backup).each{|row| ds.filter(:id=>row[:id]).update(...)}

That should work already, but I haven't tried it.

Jeremy

I haven't
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sequel-talk" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/sequel-talk?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to