On 4/21/15 1:47 PM, John Doe wrote:
I am using pandas 0.16 and sqlalchemy 0.99. Yes, I know I should probably upgrade, but I don't have admin rights on my PC.
I understand the pandas.DataFrame.to_sql() method relies on sqlalchemy.

I have a pandas dataframe with ca 155,000 rows and 12 columns. If I export it to csv with dataframe.to_csv , the output is an 11MB file (which is produced instantly).

If, however, I export to a Microsoft SQL Server with the to_sql method, it takes between 5 and 6 minutes!

Reading the same table from SQL to Python with the pandas.read_sql_table takes 2 seconds.

No columns are text: only int, float, bool and dates. Also, there are no constraints on the table. I have seen cases where ODBC drivers set nvarchar(max) and this slows down the data transfer, but it cannot be the case here.

Any suggestions on how to speed up the export process?

I appreciate that SQLAlchemy will probably never be the fastest method to insert data, but 6 minutes for 11 MBs of data is too slow...

Unfortunately I don't work with Pandas so this is dependent on how Pandas is doing their queries here.

If you can at least set echo=True on the engine here, you'd see what queries are being emitted. Whether they are emitting a handful of queries, or thousands, makes a difference, as well as if these queries are returning vast numbers of rows due to cartesian products or similar makes a difference, and how they are fetching rows back makes a difference.

It would be helpful to everyone if either you or someone on the Pandas development team could walk through the steps detailed at http://docs.sqlalchemy.org/en/rel_1_0/faq/performance.html#how-can-i-profile-a-sqlalchemy-powered-application to isolate where the performance issue is originating.



Thanks!


My code looks like this:

|importpandas aspd fromsqlalchemy importcreate_engine,MetaData,Table,select ServerName="myserver"Database="mydatabase"TableName="mytable"engine =create_engine('mssql+pyodbc://'+ServerName+'/'+Database)conn =engine.connect()metadata =MetaData(conn)my_data_frame.to_sql(TableName,engine)|

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected] <mailto:[email protected]>. To post to this group, send email to [email protected] <mailto:[email protected]>.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to