On Apr 29, 2013, at 9:49 AM, Pau Tallada <[email protected]> wrote:

> Hi,
>
> I have some very big datasets that we want to process in batches of
> reasonable size.
> In order to do that, we enable the 'stream_results' execution option
> (available in Postgresql) and we use the fetchmany method to retrieve the
> records in batches.
> With those records we must build a numpy array, but its constructor complains
> about the RowProxy wrapper.
>
> So I tried using the cursor directly to retrieve the selected rows, but the
> cursor skips the first one if 'stream_results' is enabled.
whoaaaaa OK, I just looked at what you're doing here, wondering if I was crazy or not. You're calling "rs.cursor.fetchall()" on the second run. You can't do that here: the "stream results" mechanism requires that rows be buffered, so by going to the raw cursor you're bypassing that mechanism. I copied your test without noticing that detail.

If numpy doesn't recognize the dict interface of a RowProxy, then iterate dicts by passing it "(dict(row) for row in result)".
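
For illustration, the batching plus dict(row) idea could look roughly like the sketch below. This is not code from the thread: iter_batches is a hypothetical helper, and an in-memory SQLite cursor from the standard library stands in for the SQLAlchemy result object only so that the example runs without a PostgreSQL server. In the real setup the argument would be the result object itself, never its ".cursor" attribute.

```python
import sqlite3


def iter_batches(result, batch_size):
    """Yield lists of plain dicts, one list per fetchmany() call.

    `result` is any object with a fetchmany(size) method whose rows
    can be passed to dict(). Hypothetical helper, for illustration only.
    """
    while True:
        rows = result.fetchmany(batch_size)
        if not rows:
            break
        # dict(row) copies each row out of its wrapper object.
        yield [dict(row) for row in rows]


# Stand-in data source (assumption): in-memory SQLite instead of the
# PostgreSQL connection with 'stream_results' described in the thread.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows support keys() and name lookup
conn.execute("CREATE TABLE points (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [(i, i * i) for i in range(5)],
)

cursor = conn.execute("SELECT x, y FROM points ORDER BY x")
batches = list(iter_batches(cursor, batch_size=2))

for batch in batches:
    print(batch)
```

Each batch is a list of plain dicts with no row wrapper left in it. Whether numpy accepts that shape directly depends on the array layout you want, so a further conversion (for example to tuples) may still be needed.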
