I am using pandas 0.16 and SQLAlchemy 0.9.9. Yes, I know I should probably
upgrade, but I don't have admin rights on my PC.
I understand the pandas.DataFrame.to_sql() method relies on SQLAlchemy.
I have a pandas DataFrame with about 155,000 rows and 12 columns. If I export
it to CSV with DataFrame.to_csv(), the output is an 11 MB file, which is
produced almost instantly.
If, however, I export it to Microsoft SQL Server with the to_sql method, it
takes between 5 and 6 minutes!
Reading the same table back from SQL Server into Python with
pandas.read_sql_table takes 2 seconds.
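For reference, I am timing everything roughly like this (nothing fancy):

import time

start = time.time()
my_data_frame.to_sql('mytable', engine, if_exists='replace')  # takes 5-6 minutes
print('to_sql: %.1f s' % (time.time() - start))

start = time.time()
df = pd.read_sql_table('mytable', engine)  # takes ~2 seconds
print('read_sql_table: %.1f s' % (time.time() - start))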
None of the columns are text: only int, float, bool, and dates. There are
also no constraints on the table. I have seen cases where ODBC drivers
create nvarchar(max) columns and that slows down the transfer, but it cannot
be the case here.
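If it matters, I assume I could also pin the column types explicitly with the
dtype argument that to_sql accepts since pandas 0.15.2, so nvarchar(max)
really shouldn't come into play (the column names below are made up for
illustration):

from sqlalchemy.types import Integer, Float, Boolean, DateTime

# hypothetical names; my real table has 12 columns of these four kinds
type_map = {
    'my_int_col': Integer(),
    'my_float_col': Float(),
    'my_bool_col': Boolean(),
    'my_date_col': DateTime(),
}
my_data_frame.to_sql('mytable', engine, if_exists='replace', dtype=type_map)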
Any suggestions on how to speed up the export process?
I appreciate that SQLAlchemy will probably never be the fastest way to
insert data, but 6 minutes for 11 MB of data is too slow...
Thanks!
My code looks like this:
import pandas as pd
from sqlalchemy import create_engine, MetaData, Table, select

ServerName = "myserver"
Database = "mydatabase"
TableName = "mytable"

# connect to SQL Server through pyodbc
engine = create_engine('mssql+pyodbc://' + ServerName + '/' + Database)
conn = engine.connect()
metadata = MetaData(conn)

my_data_frame.to_sql(TableName, engine)
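The only to_sql option I have found that looks relevant is chunksize
(available since pandas 0.15, if I'm not mistaken), which as far as I
understand just splits the insert into batches; I'm not sure it will make a
big difference:

my_data_frame.to_sql(TableName, engine, if_exists='replace',
                     index=False, chunksize=1000)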