Re: [sqlalchemy] Re: pandas to MS SQL DataWarehouse (to_sql)

Mike Bayer Wed, 13 Sep 2017 12:22:50 -0700

On Wed, Sep 13, 2017 at 3:14 PM, dirk.biesinger
<dirk.biesin...@gmail.com> wrote:
> Got ya,
>
> so we could solve the issue on the sqlalchemy end with the alteration of the
> pyodbc.py file.
> I assume you'll include this in the next release?


um.

can you just confirm for me this makes the error?


connection = pyodbc.connect(....)
connection.autocommit = 0
connection.rollback()



> The issue with creating a table when the option "if_exists='append'" is set
> in the df.to_sql() call, is a pandas problem.
>
> Thank you for your help.
>
> Best,
> DB
>
> On Wednesday, September 13, 2017 at 11:45:15 AM UTC-7, Mike Bayer wrote:
>>
>> On Wed, Sep 13, 2017 at 2:41 PM, dirk.biesinger
>> <dirk.bi...@gmail.com> wrote:
>> > I don't get why the table is getting created in the first place. A table
>> > with this name exists, and the option "if_exists='append'" should append
>> > the
>> > dataframe to the existing table.
>> > There should not be a dropping of the table (which I have not seen) nor
>> > creation of the table.
>> >
>> > And in case of creating the table, I think it should be possible to
>> > define
>> > the length of the field, so varchar([variable_to_be_submitted]).
>> > In my case I expect this particular table to grow to several hundred
>> > million
>> > rows, so assigned storage space is a factor.
>> >
>> > the existing table was created like this:
>> >
>> > CREATE TABLE dbo.DSI
>> > (
>> > a datetime
>> > b varchar(10) null,
>> > c varchar(100) null,
>> > d varchar(10) null,
>> > e varchar(100) null,
>> > f decimal (8,6) null,
>> > g decimal (8,6) null
>> > )
>> >
>> > If I read and understand the error stack correct, the cause is the
>> > create
>> > table statement, which I would very strongly hope to cause an error, as
>> > the
>> > table exists.
>> > Should the create table statement not be omitted if the option
>> > "if_exists='append'" option is set?
>>
>> This is all on the Pandas side so you'd need to talk to them.
>>
>>
>> >
>> > On Wednesday, September 13, 2017 at 11:15:27 AM UTC-7, Mike Bayer wrote:
>> >>
>> >> OK so....don't use VARCHAR(max)?  What datatype would you like?  We
>> >> have most of them and you can make new ones too.
>> >>
>> >> On Wed, Sep 13, 2017 at 1:07 PM, dirk.biesinger
>> >> <dirk.bi...@gmail.com> wrote:
>> >> > Mike,
>> >> >
>> >> > here's the error stack (I had to mask some details):
>> >> > The columns (dataformats) in the create table statement are wrong.
>> >> > Also,
>> >> > this table does not have an index.
>> >> >
>> >> > 2017-09-13 15:07:50,200 INFO sqlalchemy.engine.base.Engine SELECT
>> >> > SERVERPROPERTY('ProductVersion')
>> >> > 2017-09-13 15:07:50,202 INFO sqlalchemy.engine.base.Engine ()
>> >> > 2017-09-13 15:07:50,246 INFO sqlalchemy.engine.base.Engine SELECT
>> >> > schema_name()
>> >> > 2017-09-13 15:07:50,247 INFO sqlalchemy.engine.base.Engine ()
>> >> > 2017-09-13 15:07:50,522 INFO sqlalchemy.engine.base.Engine SELECT
>> >> > CAST('test
>> >> > plain returns' AS VARCHAR(60)) AS anon_1
>> >> > 2017-09-13 15:07:50,522 INFO sqlalchemy.engine.base.Engine ()
>> >> > 2017-09-13 15:07:50,562 INFO sqlalchemy.engine.base.Engine SELECT
>> >> > CAST('test
>> >> > unicode returns' AS NVARCHAR(60)) AS anon_1
>> >> > 2017-09-13 15:07:50,563 INFO sqlalchemy.engine.base.Engine ()
>> >> > 2017-09-13 15:07:50,604 INFO sqlalchemy.engine.base.Engine SELECT
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[TABLE_SCHEMA],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[TABLE_NAME],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[COLUMN_NAME],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[IS_NULLABLE],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[DATA_TYPE],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[ORDINAL_POSITION],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[CHARACTER_MAXIMUM_LENGTH],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[NUMERIC_PRECISION],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[NUMERIC_SCALE],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[COLUMN_DEFAULT],
>> >> > [INFORMATION_SCHEMA].[COLUMNS].[COLLATION_NAME]
>> >> > FROM [INFORMATION_SCHEMA].[COLUMNS]
>> >> > WHERE [INFORMATION_SCHEMA].[COLUMNS].[TABLE_NAME] = CAST(? AS
>> >> > NVARCHAR(max))
>> >> > AND [INFORMATION_SCHEMA].[COLUMNS].[TABLE_SCHEMA] = CAST(? AS
>> >> > NVARCHAR(max))
>> >> > 2017-09-13 15:07:50,605 INFO sqlalchemy.engine.base.Engine
>> >> > ('dbo.MSODS_DSI',
>> >> > 'dbo')
>> >> > 2017-09-13 15:07:52,151 INFO sqlalchemy.engine.base.Engine
>> >> > CREATE TABLE [dbo.MSODS_DSI] (
>> >> > [a] DATETIME NULL,
>> >> > [b] VARCHAR(max) NULL,
>> >> > [c] VARCHAR(max) NULL,
>> >> > [d] VARCHAR(max) NULL,
>> >> > [e] VARCHAR(max) NULL,
>> >> > [f] FLOAT(53) NULL,
>> >> > [g] FLOAT(53) NULL
>> >> > )
>> >> >
>> >> >
>> >> > 2017-09-13 15:07:52,152 INFO sqlalchemy.engine.base.Engine ()
>> >> > 2017-09-13 15:07:54,374 INFO sqlalchemy.engine.base.Engine ROLLBACK
>> >> >
>> >> >
>> >> > ---------------------------------------------------------------------------
>> >> > ProgrammingError                          Traceback (most recent call
>> >> > last)
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py
>> >> > in _execute_context(self, dialect, constructor, statement,
>> >> > parameters,
>> >> > *args)
>> >> >    1181                         parameters,
>> >> > -> 1182                         context)
>> >> >    1183         except BaseException as e:
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/default.py
>> >> > in do_execute(self, cursor, statement, parameters, context)
>> >> >     469     def do_execute(self, cursor, statement, parameters,
>> >> > context=None):
>> >> > --> 470         cursor.execute(statement, parameters)
>> >> >     471
>> >> >
>> >> > ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Driver 13 for
>> >> > SQL
>> >> > Server][SQL Server]The statement failed. Column 'b' has a data type
>> >> > that
>> >> > cannot participate in a columnstore index.\r\nOperation cancelled by
>> >> > user.
>> >> > (35343) (SQLExecDirectW)")
>> >> >
>> >> > The above exception was the direct cause of the following exception:
>> >> >
>> >> > ProgrammingError                          Traceback (most recent call
>> >> > last)
>> >> > <ipython-input-16-290e9c1020c9> in <module>()
>> >> >      17 #cnxn = pyodbc.connect(connection_str)
>> >> >      18 #engn.connect()
>> >> > ---> 19 df.to_sql(tbl_server_out, engn, if_exists='append',
>> >> > index=False)
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py
>> >> > in to_sql(self, name, con, flavor, schema, if_exists, index,
>> >> > index_label,
>> >> > chunksize, dtype)
>> >> >    1343         sql.to_sql(self, name, con, flavor=flavor,
>> >> > schema=schema,
>> >> >    1344                    if_exists=if_exists, index=index,
>> >> > index_label=index_label,
>> >> > -> 1345                    chunksize=chunksize, dtype=dtype)
>> >> >    1346
>> >> >    1347     def to_pickle(self, path, compression='infer'):
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/pandas/io/sql.py
>> >> > in
>> >> > to_sql(frame, name, con, flavor, schema, if_exists, index,
>> >> > index_label,
>> >> > chunksize, dtype)
>> >> >     469     pandas_sql.to_sql(frame, name, if_exists=if_exists,
>> >> > index=index,
>> >> >     470                       index_label=index_label, schema=schema,
>> >> > --> 471                       chunksize=chunksize, dtype=dtype)
>> >> >     472
>> >> >     473
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/pandas/io/sql.py
>> >> > in
>> >> > to_sql(self, frame, name, if_exists, index, index_label, schema,
>> >> > chunksize,
>> >> > dtype)
>> >> >    1148                          if_exists=if_exists,
>> >> > index_label=index_label,
>> >> >    1149                          schema=schema, dtype=dtype)
>> >> > -> 1150         table.create()
>> >> >    1151         table.insert(chunksize)
>> >> >    1152         if (not name.isdigit() and not name.islower()):
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/pandas/io/sql.py
>> >> > in
>> >> > create(self)
>> >> >     596                     "'{0}' is not valid for
>> >> > if_exists".format(self.if_exists))
>> >> >     597         else:
>> >> > --> 598             self._execute_create()
>> >> >     599
>> >> >     600     def insert_statement(self):
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/pandas/io/sql.py
>> >> > in
>> >> > _execute_create(self)
>> >> >     581         # Inserting table into database, add to MetaData
>> >> > object
>> >> >     582         self.table = self.table.tometadata(self.pd_sql.meta)
>> >> > --> 583         self.table.create()
>> >> >     584
>> >> >     585     def create(self):
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/sql/schema.py
>> >> > in create(self, bind, checkfirst)
>> >> >     754         bind._run_visitor(ddl.SchemaGenerator,
>> >> >     755                           self,
>> >> > --> 756                           checkfirst=checkfirst)
>> >> >     757
>> >> >     758     def drop(self, bind=None, checkfirst=False):
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py
>> >> > in _run_visitor(self, visitorcallable, element, connection, **kwargs)
>> >> >    1927                      connection=None, **kwargs):
>> >> >    1928         with self._optional_conn_ctx_manager(connection) as
>> >> > conn:
>> >> > -> 1929             conn._run_visitor(visitorcallable, element,
>> >> > **kwargs)
>> >> >    1930
>> >> >    1931     class _trans_ctx(object):
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py
>> >> > in _run_visitor(self, visitorcallable, element, **kwargs)
>> >> >    1536     def _run_visitor(self, visitorcallable, element,
>> >> > **kwargs):
>> >> >    1537         visitorcallable(self.dialect, self,
>> >> > -> 1538                         **kwargs).traverse_single(element)
>> >> >    1539
>> >> >    1540
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py
>> >> > in traverse_single(self, obj, **kw)
>> >> >     119             meth = getattr(v, "visit_%s" %
>> >> > obj.__visit_name__,
>> >> > None)
>> >> >     120             if meth:
>> >> > --> 121                 return meth(obj, **kw)
>> >> >     122
>> >> >     123     def iterate(self, obj):
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py
>> >> >  in
>> >> > visit_table(self, table, create_ok, include_foreign_key_constraints,
>> >> > _is_metadata_operation)
>> >> >     765             CreateTable(
>> >> >     766                 table,
>> >> > --> 767
>> >> > include_foreign_key_constraints=include_foreign_key_constraints
>> >> >     768             ))
>> >> >     769
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py
>> >> > in execute(self, object, *multiparams, **params)
>> >> >     943             raise exc.ObjectNotExecutableError(object)
>> >> >     944         else:
>> >> > --> 945             return meth(self, multiparams, params)
>> >> >     946
>> >> >     947     def _execute_function(self, func, multiparams, params):
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py
>> >> >  in
>> >> > _execute_on_connection(self, connection, multiparams, params)
>> >> >      66
>> >> >      67     def _execute_on_connection(self, connection, multiparams,
>> >> > params):
>> >> > ---> 68         return connection._execute_ddl(self, multiparams,
>> >> > params)
>> >> >      69
>> >> >      70     def execute(self, bind=None, target=None):
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py
>> >> > in _execute_ddl(self, ddl, multiparams, params)
>> >> >    1000             compiled,
>> >> >    1001             None,
>> >> > -> 1002             compiled
>> >> >    1003         )
>> >> >    1004         if self._has_events or self.engine._has_events:
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py
>> >> > in _execute_context(self, dialect, constructor, statement,
>> >> > parameters,
>> >> > *args)
>> >> >    1187                 parameters,
>> >> >    1188                 cursor,
>> >> > -> 1189                 context)
>> >> >    1190
>> >> >    1191         if self._has_events or self.engine._has_events:
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py
>> >> > in _handle_dbapi_exception(self, e, statement, parameters, cursor,
>> >> > context)
>> >> >    1400                 util.raise_from_cause(
>> >> >    1401                     sqlalchemy_exception,
>> >> > -> 1402                     exc_info
>> >> >    1403                 )
>> >> >    1404             else:
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/util/compat.py
>> >> > in raise_from_cause(exception, exc_info)
>> >> >     201     exc_type, exc_value, exc_tb = exc_info
>> >> >     202     cause = exc_value if exc_value is not exception else None
>> >> > --> 203     reraise(type(exception), exception, tb=exc_tb,
>> >> > cause=cause)
>> >> >     204
>> >> >     205 if py3k:
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/util/compat.py
>> >> > in reraise(tp, value, tb, cause)
>> >> >     184             value.__cause__ = cause
>> >> >     185         if value.__traceback__ is not tb:
>> >> > --> 186             raise value.with_traceback(tb)
>> >> >     187         raise value
>> >> >     188
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py
>> >> > in _execute_context(self, dialect, constructor, statement,
>> >> > parameters,
>> >> > *args)
>> >> >    1180                         statement,
>> >> >    1181                         parameters,
>> >> > -> 1182                         context)
>> >> >    1183         except BaseException as e:
>> >> >    1184             self._handle_dbapi_exception(
>> >> >
>> >> >
>> >> >
>> >> > /home/saravji/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/default.py
>> >> > in do_execute(self, cursor, statement, parameters, context)
>> >> >     468
>> >> >     469     def do_execute(self, cursor, statement, parameters,
>> >> > context=None):
>> >> > --> 470         cursor.execute(statement, parameters)
>> >> >     471
>> >> >     472     def do_execute_no_params(self, cursor, statement,
>> >> > context=None):
>> >> >
>> >> > ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000]
>> >> > [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]The statement
>> >> > failed.
>> >> > Column 'b' has a data type that cannot participate in a columnstore
>> >> > index.\r\nOperation cancelled by user. (35343) (SQLExecDirectW)")
>> >> > [SQL:
>> >> > '\nCREATE TABLE [dbo.MSODS_DSI] (\n\t[a] DATETIME NULL, \n\t[b]
>> >> > VARCHAR(max)
>> >> > NULL, \n\t[c] VARCHAR(max) NULL, \n\t[d] VARCHAR(max) NULL, \n\t[e]
>> >> > VARCHAR(max) NULL, \n\t[f] FLOAT(53) NULL, \n\t[g] FLOAT(53)
>> >> > NULL\n)\n\n']
>> >> >
>> >> >
>> >> >
>> >> > On Monday, September 11, 2017 at 3:34:47 PM UTC-7, dirk.biesinger
>> >> > wrote:
>> >> >>
>> >> >> I am encountering errors when trying to use the pd.to_sql function
>> >> >> to
>> >> >> write a dataframe to MS SQL Data Warehouse.
>> >> >> The connection works when NOT using sqlalchemy engines.
>> >> >> I can read dataframes as well as row-by-row via select statements
>> >> >> when
>> >> >> I
>> >> >> use pyodbc connections
>> >> >> I can write data via insert statements (as well as delete data) when
>> >> >> using
>> >> >> pyodbc.
>> >> >> However, when I try to connect using a sqlalchemy engine I run into
>> >> >> a
>> >> >> string of error messages starting with:
>> >> >>
>> >> >> ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000]
>> >> >> [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Catalog view
>> >> >> 'dm_exec_sessions' is not supported in this version. (104385)
>> >> >> (SQLExecDirectW)")
>> >> >>
>> >> >>
>> >> >> I have searched online, and this exact error seems to have been
>> >> >> reported /
>> >> >> evaluated in May of this year as issue #3994:
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> https://bitbucket.org/zzzeek/sqlalchemy/issues/3994/azure-sql-datawarehouse-basic
>> >> >>
>> >> >>
>> >> >> I could not find a solution to this, and I'd really dislike to do a
>> >> >> line-wise or blob insert statement (I'm working with multiple
>> >> >> datasets
>> >> >> that
>> >> >> each has a few million rows, so execution time is a consideration,
>> >> >> although
>> >> >> the result sets I'm getting are more like in the 100k lines area
>> >> >> each.)
>> >> >>
>> >> >>
>> >> >> I get the same error messages even when I replace the pd.to_sql
>> >> >> command
>> >> >> with a simple engine.connect()
>> >> >>
>> >> >>
>> >> >> Enclosed my installed packages (packages.list)
>> >> >>
>> >> >> Enclosed the full traceback (traceback.txt)
>> >> >>
>> >> >>
>> >> >> This is the code I'm using:
>> >> >>
>> >> >> connection_string =
>> >> >>
>> >> >>
>> >> >> "mssql+pyodbc://<username>:<password>@<sqlhost>.database.windows.net:<port>/<database>?driver=ODBC+Driver+13+for+SQL+Server"
>> >> >> engn = sqlalchemy.engine.create_engine(connection_string, echo=True)
>> >> >> engn.connect()
>> >> >>
>> >> >>
>> >> >> I'm very well aware that MS SQL DataWarehouse behaves a bit
>> >> >> different,
>> >> >> so
>> >> >> I'm open for some experimenting to get this issue narrowed down.
>> >> >>
>> >> >> In case it matters: I'm running an ubuntu 16.04 VM on azure with
>> >> >> jupyter
>> >> >> notebook server and python 3.6.1.
>> >> >>
>> >> >> Best,
>> >> >>
>> >> >> DB
>> >> >
>> >> > --
>> >> > SQLAlchemy -
>> >> > The Python SQL Toolkit and Object Relational Mapper
>> >> >
>> >> > http://www.sqlalchemy.org/
>> >> >
>> >> > To post example code, please provide an MCVE: Minimal, Complete, and
>> >> > Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>> >> > description.
>> >> > ---
>> >> > You received this message because you are subscribed to the Google
>> >> > Groups
>> >> > "sqlalchemy" group.
>> >> > To unsubscribe from this group and stop receiving emails from it,
>> >> > send
>> >> > an
>> >> > email to sqlalchemy+...@googlegroups.com.
>> >> > To post to this group, send email to sqlal...@googlegroups.com.
>> >> > Visit this group at https://groups.google.com/group/sqlalchemy.
>> >> > For more options, visit https://groups.google.com/d/optout.
>> >
>> > --
>> > SQLAlchemy -
>> > The Python SQL Toolkit and Object Relational Mapper
>> >
>> > http://www.sqlalchemy.org/
>> >
>> > To post example code, please provide an MCVE: Minimal, Complete, and
>> > Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>> > description.
>> > ---
>> > You received this message because you are subscribed to the Google
>> > Groups
>> > "sqlalchemy" group.
>> > To unsubscribe from this group and stop receiving emails from it, send
>> > an
>> > email to sqlalchemy+...@googlegroups.com.
>> > To post to this group, send email to sqlal...@googlegroups.com.
>> > Visit this group at https://groups.google.com/group/sqlalchemy.
>> > For more options, visit https://groups.google.com/d/optout.
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Re: pandas to MS SQL DataWarehouse (to_sql)

Reply via email to