On Oct 6, 10:55 am, jreidthompson <[email protected]> wrote:
> DB[:table_with_10m_rows].each{|row| puts row }
> top shows memory usage growing, with no output
> after it hit ~400 MB I killed it
Looks like the PostgreSQL driver you are using (e.g. ruby-pg, ruby-postgres, or postgres-pr) loads all records into memory before yielding any. I'm not sure if Sequel can do anything about that. The interface for getting results from the underlying driver involves asking the driver for the value at a given row number and column number, which is random access rather than sequential access. I'm not sure whether the postgres drivers support a sequential access method that doesn't require loading all rows first. I don't think the ruby-postgres or postgres-pr drivers do, but it may be possible using ruby-pg (http://www.espace.com.eg/blog/2009/03/05/faster-io-for-ruby-with-postgres/). I'll see if I can look into that in the future.

Currently, in this case, the pagination extension and Dataset#each_page is probably the best way to handle iterating over a large table and updating rows.

Jeremy

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "sequel-talk" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/sequel-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
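To illustrate the each_page approach mentioned above: the idea is to fetch fixed-size pages (LIMIT/OFFSET queries under the hood) so only one page of rows is in memory at a time, instead of the whole result set. Here is a plain-Ruby sketch of that paging loop, with a hypothetical `fetch_page` method standing in for the actual database query (not the real Sequel internals, just the shape of the strategy):

```ruby
# Pretend table contents; in reality each page would come from a
# "SELECT ... LIMIT page_size OFFSET offset" query against the database.
ALL_ROWS = (1..25).map { |i| { id: i } }

# Hypothetical stand-in for one LIMIT/OFFSET query.
def fetch_page(page_size, page_number)
  offset = (page_number - 1) * page_size
  ALL_ROWS[offset, page_size] || []
end

# Paging loop: yields one page at a time, so memory use is bounded by
# page_size rather than by the total number of rows.
def each_page(page_size)
  page_number = 1
  loop do
    page = fetch_page(page_size, page_number)
    break if page.empty?
    yield page
    page_number += 1
  end
end

pages = 0
rows  = 0
each_page(10) do |page|
  pages += 1
  rows  += page.size
end
# 25 rows in pages of 10 -> 3 pages (10, 10, 5)
```

With the real pagination extension loaded, the usage would look something like `DB[:table_with_10m_rows].each_page(1000){|page| page.each{|row| puts row}}` (check the extension docs for your Sequel version for the exact API).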
