On Oct 13, 2011, at 6:38 AM, Eduardo wrote:

> Hi,
> I am trying to copy a table (or several tables) from one database to
> another.  I found a code snippet somewhere on the internet that I have
> slightly modified.  I read rows sequentially from the existing table
> and write them to the new one.  The code worked for smaller tables (up
> to several thousand rows).  However, when working with a table of more
> than a million rows, the session commit failed more and more frequently
> as the number of copied rows grew.  The rows that were supposed to be
> committed in the previous loop are still stored in the session, and
> when the session is filled with the rows of the next slice they are
> still there, even though they should already have been committed.
> Is there a problem with the code, or does it depend on the RDBMS
> (Postgres) configuration?
> Is there any other way to copy a table from one database to another
> (other than using pg_dump)?
there's nothing obviously wrong with the code, and it's not the case that the rows remain in the Session after commit; I don't see them being placed in any kind of persistent collection, and the Session does not maintain strong references to objects with no pending changes.  A check against len(gc.get_objects()) would confirm whether or not the total number of objects in memory is actually growing (a quick sketch of that check is at the end of this message).  You're also emitting SELECTs with LIMIT/OFFSET, which ensures that psycopg2 isn't buffering the entire result set, though LIMIT/OFFSET gets very slow as you move deeper into the table.  I typically use this recipe instead: http://www.sqlalchemy.org/trac/wiki/UsageRecipes/WindowedRangeQuery (a simplified sketch of that idea is also below).

That said, since an operation like this is strictly table->table, you should be using the Core for it, exchanging simple tuples from the result set of one statement into a single parameter list for the other, which inserts the full span of rows in one go (i.e. http://www.sqlalchemy.org/docs/core/tutorial.html#executing-multiple-statements).  The ORM code here is unnecessary and complex, and makes for a much slower operation.  A rough Core-based sketch follows after the quoted code below.

> Thanks
> ___________________________________________________________________________________________
>
>
> import time
> from sqlalchemy import MetaData, Table
> from sqlalchemy.ext.declarative import declarative_base
>
>
> def pull_data(from_db, to_db, tables, slicesize):
>     smeta, source, sengine = create_meta_session_engine(from_db)
>     smeta = MetaData(bind=sengine)
>     dmeta, destination, dengine = create_meta_session_engine(to_db)
>     for table_name in tables:
>         table = Table(table_name, smeta, autoload=True)
>         table.metadata.create_all(dengine)
>         NewRecord = quick_mapper(table)
>         columns = table.columns.keys()
>         rowquery = source.query(table)
>         count = rowquery.count()
>         loop = 0
>         while count - (loop * slicesize) > slicesize:
>             oneslice = source.query(table).slice(
>                 slicesize * loop, slicesize * (loop + 1)).all()
>             for record in oneslice:
>                 data = dict([(str(column), getattr(record, column))
>                              for column in columns])
>                 destination.add(NewRecord(**data))
>             try:
>                 destination.commit()
>                 loop += 1
>             except:
>                 destination.rollback()
>                 time.sleep(25)
>             print loop
>         lastpending = True
>         lastslice = source.query(table).slice(slicesize * loop, count + 1).all()
>         lastloop = 0
>         while lastpending:
>             for record in lastslice:
>                 data = dict([(str(column), getattr(record, column))
>                              for column in columns])
>                 destination.add(NewRecord(**data))
>             try:
>                 destination.commit()
>                 lastpending = False
>             except:
>                 destination.rollback()
>                 time.sleep(25)
>                 lastloop += 1
>             print lastloop
>     source.close()
>     destination.close()
>
>
> def quick_mapper(table):
>     Base = declarative_base()
>     class GenericMapper(Base):
>         __table__ = table
>     return GenericMapper
> ___________________________________________________________________________________________
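For reference, here is a minimal sketch of the gc check mentioned above; the slice-copying step in the middle is just a placeholder for one pass of your loop:

    import gc

    def tracked_object_count():
        # total number of objects currently tracked by the garbage collector
        return len(gc.get_objects())

    before = tracked_object_count()
    # ... run one slice of the copy loop here ...
    after = tracked_object_count()
    print "objects tracked: %d -> %d" % (before, after)

If that number keeps climbing from slice to slice, something is still holding references to the copied rows; if it stays roughly level, the Session is releasing them as expected.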
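And a simplified sketch of the windowed-range idea, i.e. walk the primary key in ranges instead of paging with OFFSET.  This is not the wiki recipe verbatim (that one uses window functions); it just shows the general shape, and assumes the table has a single integer primary key column named "id":

    from sqlalchemy.sql import select

    def windowed_rows(conn, table, window_size=1000):
        # fetch rows in chunks ordered by primary key, resuming each chunk
        # where the previous one left off instead of using OFFSET
        pk = table.c.id          # assumption: integer PK column named "id"
        last_id = None
        while True:
            q = select([table]).order_by(pk).limit(window_size)
            if last_id is not None:
                q = q.where(pk > last_id)
            rows = conn.execute(q).fetchall()
            if not rows:
                break
            for row in rows:
                yield row
            last_id = rows[-1][pk]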
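Here's roughly what the Core version of the table copy looks like: SELECT from the source connection, then hand each chunk of rows to insert() as a list of dictionaries, which goes to the DBAPI as a single executemany().  This is a sketch, not a drop-in replacement; the engine URLs, table name and chunk size are placeholders:

    from sqlalchemy import MetaData, Table, create_engine

    def copy_table(source_engine, dest_engine, table_name, chunk_size=10000):
        # reflect the table from the source and create it on the destination
        smeta = MetaData(bind=source_engine)
        table = Table(table_name, smeta, autoload=True)
        table.metadata.create_all(dest_engine)

        sconn = source_engine.connect()
        dconn = dest_engine.connect()
        result = sconn.execute(table.select())
        while True:
            rows = result.fetchmany(chunk_size)
            if not rows:
                break
            # a list of plain dicts invokes executemany() for the whole batch;
            # no ORM objects, no identity map, no unit of work
            dconn.execute(table.insert(), [dict(row) for row in rows])
        sconn.close()
        dconn.close()

    source_engine = create_engine("postgresql://user@host/source_db")
    dest_engine = create_engine("postgresql://user@host/dest_db")
    copy_table(source_engine, dest_engine, "some_table")

Since nothing here is mapped, there's no flush or identity-map bookkeeping per row, and each chunk is sent to the database in one round trip.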
