Re: [sqlalchemy] Cannot insert strings with a length greater than 2000 into columns with a datatype of Text, VARCHAR, or NVARCHAR using MS SQL Server 2017 and 2019

Nicolas Lykke Iversen Fri, 16 Oct 2020 00:54:24 -0700

Thank you, Mike.

pyODBC has the following to say about the issue:


*SQLAlchemy. pyODBC is generic and does not know about special handling of 
varchar(max), whereas SQLAlchemy appears to have code for specific database 
types. It needs to call setinputsizes as you described, when the length is 
more than maximum for non-max types (2K wide characters or 4K bytes).*

Let me be clear, I'm not exactly sure whether omitting _SC is the cause of 
the following error, which I have no idea about how to handle:
    ...
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1124, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1316, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1514, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", 
line 182, in raise_
    raise exception
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1256, in _execute_context
    self.dialect.do_executemany(
  File 
"/usr/local/lib/python3.8/site-packages/sqlalchemy/dialects/mssql/pyodbc.py", 
line 446, in do_executemany
    super(MSDialect_pyodbc, self).do_executemany(
  File 
"/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 
590, in do_executemany
    cursor.executemany(statement, parameters)
UnicodeEncodeError: 'utf-16-le' codec can't encode character '\udce5' in 
position 11: surrogates not allowed
It complains about UTF-16-LE and surrogates.

However, since _SC is needed for string operations on the SQL Server 
backend, don't you think that any real-world application would need it down 
the road?

Is it really necessary to use your very-subtle vendored version of the 
set_input_sizes() hook? Why use it compared to Simon King's simple version?

Using Simon King's version I experience a weird issue: it works perfectly, 
when using a single-threaded application, but when using multiprocessing it 
doesn't work.

In particular, if I execute:

engine = sqlalchemy.create_engine(
  
 
f'mssql+pyodbc://{user}:{pwd}@{host}:{port}/{db}?driver=ODBC+Driver+17+for+SQL+Server',
   pool_size=5,
   max_overflow=10,
   pool_pre_ping=True,
   isolation_level='READ_UNCOMMITTED',
   pool_recycle=900,
   echo=debug,
   connect_args={'connect_timeout': 10}
)

@sqlalchemy.event.listens_for(engine, 'before_cursor_execute')
def receive_before_cursor_execute(conn, cursor, statement, parameters, 
context, executemany):
   cursor.setinputsizes([(pyodbc.SQL_WVARCHAR, 0, 0), ])

in the main application and:

def db_init():
   engine = common.db.Session.get_bind()
   engine.dispose()

in all the children, then the hook gets called in the children, but somehow 
doesn't affect the INSERTs - the original error is produced for strings 
with a. length longer than 2000 characters.

Best regards
Nicolas



On Thursday, October 15, 2020 at 7:39:08 PM UTC+2 Mike Bayer wrote:

>
>
> On Thu, Oct 15, 2020, at 3:26 AM, Nicolas Lykke Iversen wrote:
>
> Hi Mike,
>
> I have created an issue for pyodbc: 
> https://github.com/mkleehammer/pyodbc/issues/835#issuecomment-708930870
>
> I've gotten really good feedback there from Microsoft, and a fix has been 
> proposed that works:
>
>
> *"You can try to use setinputsizes on your parameter to tell it to send as 
> varchar(max): cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),])"*
> I'm by no means an SQLAlchemy expert, but shouldn't the pyodbc dialect be 
> updated to support varchar(max)using the proposed method? If not, how can 
> I execute cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),]) using 
> SQLAlchemy, so that I can make use of  varchar(max)in my application?
>
>
> SQLAlchemy has some dialects that make use of setinputsizes() out of 
> necessity, but it's an area that is fraught with issues as it means 
> SQLAlchemy is second-guessing what the DBAPI is coming up with.  
>
> It's actually news to me that pyodbc supports setinputsizes() as 
> historically, the cx_Oracle DBAPI was the only DBAPI that ever did so and 
> this method is usually not supported by any other DBAPI.   We have a hook 
> that calls upon setinputsizes() but right now it's hardcoded to cx_Oracle's 
> argument style, so the hook would need alterations to support different 
> calling styles on different dialects.
>
> In https://github.com/mkleehammer/pyodbc/issues/835#issuecomment-709385941 
> it is suggested that there should be no need to use this "_SC" collation -  
> then in 
> https://github.com/mkleehammer/pyodbc/issues/835#issuecomment-709428774, 
> you stated "I previously experimented with non-_SC in my application, and 
> it caused errors.".   Can you be more specific of these errors?   At the 
> moment, this is suggesting a major architectural rework of the pyodbc 
> dialect to support a use case which has other workarounds.   The 
> architecture of SQLAlchemy's set_input_sizes() hook has changed and at best 
> this would be part of 1.4 which is not in beta release yet, a production 
> release is not for several months.   
>
> From that point, there's an event hook at do_setinputsizes(): 
> https://docs.sqlalchemy.org/en/13/core/events.html?highlight=do_setinputsizes#sqlalchemy.events.DialectEvents.do_setinputsizes
>    
> that would be usable so that you could set up rules like these on your own, 
> and we eventually would document the known workarounds for various unusual 
> issues.  
>
> This issue is definitely unusual, it's never been reported and was 
> difficult to find in google searches, so I don't believe we are using 
> pyodbc incorrectly and it would be nice if pyodbc could someday realize 
> that MS SQL Server is the only database anyone really uses their driver 
> with, and they could perhaps add a SQL Server ruleset directly.  If this 
> were a problem that occurred frequently, then we would begin looking into 
> turning on some of this behavior by default but we need to be very 
> conservative on that as this is an area where things break quite a lot.
>
> Below is the recipe that includes a directly vendored version of the 
> set_input_sizes() hook to suit your immediate use case.   that's what I can 
> get you for now and it will allow you to set the input sizes any way you'd 
> like.
>
> import pyodbc
>
> from sqlalchemy import Column
> from sqlalchemy import create_engine
> from sqlalchemy import event
> from sqlalchemy import Integer
> from sqlalchemy import String
> from sqlalchemy.ext.declarative import declarative_base
> from sqlalchemy.orm import Session
>
>
> Base = declarative_base()
>
> e = create_engine(
>     
> "mssql+pyodbc://scott:tiger^5HHH@mssql2017:1433/test?driver=ODBC+Driver+13+for+SQL+Server",
>     echo=True,
> )
>
>
> @event.listens_for(e, "before_cursor_execute")
> def before_cursor_execute(
>     conn, cursor, statement, parameters, context, executemany
> ):
>
>     if not hasattr(context.compiled, "bind_names"):
>         return
>
>     inputsizes = {}
>     for bindparam in context.compiled.bind_names:
>         # check for the specific datatype you care about here
>         if bindparam.type._type_affinity is String:
>             inputsizes[bindparam] = ((pyodbc.SQL_WLONGVARCHAR, 0, 0),)
>         else:
>             inputsizes[bindparam] = None
>
>     positional_inputsizes = []
>     for key in context.compiled.positiontup:
>         bindparam = context.compiled.binds[key]
>         dbtype = inputsizes.get(bindparam, None)
>         positional_inputsizes.append(dbtype)
>
>     cursor.setinputsizes(positional_inputsizes)
>
>
> class A(Base):
>     __tablename__ = "a"
>
>     id = Column(Integer, primary_key=True)
>     x = Column(Integer)
>     data = Column(String)
>     y = Column(Integer)
>
> Base.metadata.drop_all(e)
> Base.metadata.create_all(e)
>
> s = Session(e)
>
> s.add(A(data="some data", x=1, y=4))
> s.commit()
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Can you recommend a hotfix for using varchar(max)in current SQLAlchemy 
> applications that need to handle Unicode supplementary characters (_SC)? 
>
> I appreciate really appreciate your help.
>
> Best regards
> Nicolas 
> On Wednesday, October 14, 2020 at 3:36:11 PM UTC+2 Mike Bayer wrote:
>
>
>
> On Wed, Oct 14, 2020, at 5:35 AM, Nicolas Lykke Iversen wrote:
>
> Hi Mike,
>
> I've now tested inserting strings with more than 2000 characters using 
> Azure Data Studio (SQL Server GUI) and everything works.
>
> Furthermore, I've tested pyodbc (DBAPI) directly, and it only fails when 
> inserting such strings using parameterised SQL queries (it succeeds without 
> using parametrised queries).
>
>
> that would be expected because all the datatype-related issues occur when 
> bound parameters are passed.
>
>
>
> You can see my POC below, if you have any interest.
>
> I guess it should be submitted as a bug to pyodbc... Do you know if I can 
> disable parametrisation for certain SQL queries in SQLAlchemy?
>
>
> there is not and this is definitely an issue that has to be solved at the 
> pyodbc level, either a bug on their end or something in your configuration 
> that has to be changed.
>
>
>
>
>
>
> Best regards (and thanks for your help and support!!!)
> Nicolas
>
> *System info*:
> python v. 3.8.5
> pyodbc v. 4.0.30
> msodbcsql17 v. 17.6.1.1
>
> *POC*:
> import sys
> import pyodbc
>
> host = 'tcp:127.0.0.1,1433'
> db = 'pyodbc_test'
> user = 'sa'
> pwd = 'P@ssw0rd'
>
> print('started')
>
> cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL 
> Server};SERVER='+host+';DATABASE='+'master'+';UID='+user+';PWD='+ pwd, 
> autocommit=True)
> cursor = cnxn.cursor()
>
> try:
>     cursor.execute(f'CREATE DATABASE {db} COLLATE 
> Latin1_General_100_CI_AS_SC')
> except pyodbc.ProgrammingError as e:
>     pass # database exists
>
> cursor.execute(f'USE {db}')
>
> try:
>     cursor.execute("""
>         CREATE TABLE msg (
>             id int identity(1,1) not null,
>             content varchar(max) not null
>         );""")
> except pyodbc.ProgrammingError as exc:
>     pass # table exists
>
> content = 2000 * 'A'
>
> cursor.execute(f"""
>     INSERT INTO msg (content)
>     VALUES ('{content}')""")
> print(f'non-param: {len(content)=}: success')
>
> sql = f"""
>       INSERT INTO msg (content)
>       VALUES (?)"""
> cursor.execute(sql, (content))
> print(f'param: {len(content)=}: success')
>
> content = 2001 * 'A'
>
> cursor.execute(f"""
>     INSERT INTO msg (content)
>     VALUES ('{content}')""")
> print(f'non-param: {len(content)=}: success')
>
> # this fails!
> sql = f"""
>       INSERT INTO msg (content)
>       VALUES (?)"""
> cursor.execute(sql, (content))
> print(f'param: {len(content)=}: success')
>
>
> #cursor.execute('SELECT * FROM msg')
> #rows = cursor.fetchall()
> #for r in rows:
> #    print(r)
>
> print('finished')
>
>
>
> On Wednesday, October 14, 2020 at 12:43:25 AM UTC+2 Mike Bayer wrote:
>
>
>
> On Tue, Oct 13, 2020, at 4:57 PM, Nicolas Lykke Iversen wrote:
>
> Thank you, Mike - very much appreciated!
>
> Just to be clear, pyodbc is not a driver, it’s a ODBC-compliant DBAPI, 
> right? I separately downloaded a driver for SQL Server from Microsoft, 
> which pyodbc makes use of.
>
>
> right the pyodbc is the DBAPI in this case, which we usually refer to as a 
> "driver" but in the case of ODBC the "driver" is more specifically the 
> separate ODBC driver component.
>
>
> Do you suggest that changing pyodbc to another SQL Server DPAPI would 
> solve the problem?
>
>
> I suggest that if there is no issue with the query you're running outside 
> of the context of pyodbc that you submit an issue to pyodbc at 
> https://github.com/mkleehammer/pyodbc/issues .  However I was able to 
> find a discussion thread about your error message that seemed to be 
> independent of ODBC.
>
>
>
>
>
> If so, can you recommend another DBAPI for SQL Server? Or do you think 
> that the problem is caused by Microsoft’s driver?
>
>
> pyodbc is the only supported driver for SQL Server that exists now for 
> Python.    You also definitely want to use Microsoft's ODBC drivers so 
> you're already there.
>
>
>
> I’m pretty sure SQL Server works fine when accessed using .NET, otherwise 
> the Internet would be full of complaints regarding not being able to insert 
> +2000 characters in a varchar(max).
>
>
> you'd want to see if the same ODBC driver and options are in use in that 
> scenario.
>
>
>
>
>
> Best regards
> Nicolas
>
>
> On Tue, 13 Oct 2020 at 22.22, Mike Bayer <mik...@zzzcomputing.com> wrote:
>
>
>
>
> On Tue, Oct 13, 2020, at 10:50 AM, Nicolas Lykke Iversen wrote:
>
> Hi SQLAlchemy,
>
> *System information:*
>
>    - Mac OS X v. 10.15.7
>    - Python v. 3.8.5
>    - SQLAlchemy v. 1.3.19
>    - MS SQL Server 2017 and 2019 (both Enterprise and Docker images e.g. 
>    mcr.microsoft.com/mssql/server:2019-latest)
>    
> *Problem*:
> I have an issue with inserting strings with a length greater than 2000 
> into columns with a datatype of Text, VARCHAR, or NVARCHAR using MS SQL 
> Server 2017 and 2019.
>
> I've checked the MS SQL Server and it creates it column properly with e.g. 
> a datatype of varchar(max)for Text and VARCHAR, which should be able to 
> store strings with a size up to 2 GB according to Microsoft documentation.
>
> Furthermore, I've tried using other collations, but I need _SC (supplementary 
> character) support for my applications, so I cannot drop it, and adding _UTF8 
> (UTF-8) doesn't solve the problem either.
>
> Why am I not allowed to store strings with a size greater than 2000? 
>
>
> I'm not really sure, this has to do with a behavior of SQL Server and/or 
> your ODBC driver.   a google search finds fairly scant results but there is 
> a long discussion regarding this error here: 
> https://www.sqlservercentral.com/forums/topic/collation-error-when-adding-distributer
>
>
>
>
>
>
>
>
> And why is SQLAlchemy displaying that error message, when trying to insert 
> plain ASCII text ("AAAA...")?
>
>
> SQLAlchemy runs SQL statements using a method called "cursor.execute()", 
> which is a feature of the DBAPI (database driver) in use.    As your error 
> message indicates you're using the pyodbc driver, this method is documented 
> here:  
> https://github.com/mkleehammer/pyodbc/wiki/Cursor#executesql-parameters
>
> The DBAPI execute() method, and most of the other DBAPI methods, can throw 
> exceptions if something goes wrong.  SQLAlchemy has the policy that if it 
> encounters such an exception when it calls cursor.execute(), it wraps it in 
> a SQLAlchemy-specific version of that exception (named the same) and then 
> throws it.   
>
> In other words you're using a driver called pyodbc that's creating this 
> error. SQLAlchemy just propagates it for you but otherwise has nothing to 
> do with how it's produced.
>
>
>
>
> I would really appreciate some guidance on how to solve this problem. What 
> could be causing it?
>
> *Error*:
> sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', 
> "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Cannot 
> convert to text/ntext or collate to 'Latin1_General_100_CI_AS_SC' because 
> these legacy LOB types do not support UTF-8 or UTF-16 encodings. Use types 
> varchar(max), nvarchar(max) or a collation which does not have the _SC or 
> _UTF8 flags. (4189) (SQLParamData)")
> [SQL: INSERT INTO msg (content) OUTPUT inserted.id VALUES (?)]
> [parameters: 
> ('AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>  
> ... (1703 characters truncated) ... 
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA',)]
> (Background on this error at: http://sqlalche.me/e/13/f405)
>
>
> *Program (POC)*:
> import logging
> import sqlalchemy
> from sqlalchemy import Column, Text, Integer, NVARCHAR, VARCHAR
> from sqlalchemy.ext.declarative import declarative_base
> from sqlalchemy.orm import sessionmaker
>
> logging.root.setLevel(logging.DEBUG)
>
> Base = declarative_base()
>
>
> class Msg(Base):
>   __tablename__ = 'msg'
>
>   id = Column(Integer, primary_key=True, autoincrement=True)
>   content = Column(VARCHAR, nullable=False)
>
>
> user = 'sa'
> pwd = 'P@ssw0rd'
> host = 'localhost'
> port = 1433
> db = 'test'
>
> logging.info('started.')
>
> engine = sqlalchemy.create_engine(
>   
> f'mssql+pyodbc://{user}:{pwd}@{host}:{port}/{db}?driver=ODBC+Driver+17+for+SQL+Server',
>   pool_size=5,
>   max_overflow=10,
>   pool_pre_ping=True,
>   isolation_level='READ_UNCOMMITTED',
>   pool_recycle=900,
>   echo=False,
>   connect_args={'connect_timeout': 10})
>
> Base.metadata.create_all(engine)
>
> Session = sessionmaker(bind=engine, autoflush=False, autocommit=False)
> session = Session()
>
> for i in range(10000):
>   try:
>     msg = Msg(content='A' * i)
>     session.add(msg)
>     session.commit()
>   except Exception as exc:
>     logging.exception(f'fail: {i=}: {exc}')
>     break
>   else:
>     logging.info(f'success: {i=}')
>
> logging.info('done.')
>
>
> --
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
>
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/ca549391-4360-480e-8c58-06577f6d92dan%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/ca549391-4360-480e-8c58-06577f6d92dan%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>
> --
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to a topic in the 
> Google Groups "sqlalchemy" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/sqlalchemy/Kk6DkPNWlR4/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to 
> sqlalchemy+...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/b4a135ef-1c1e-476c-a3f0-60326dca76ea%40www.fastmail.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/b4a135ef-1c1e-476c-a3f0-60326dca76ea%40www.fastmail.com?utm_medium=email&utm_source=footer>
> .
>
>
> --
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+...@googlegroups.com.
>
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/CAE%2B5CJABrO_459MHtDePMXah8mb67TFvQB8rUwgVgk6%2By4v-ow%40mail.gmail.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/CAE%2B5CJABrO_459MHtDePMXah8mb67TFvQB8rUwgVgk6%2By4v-ow%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
>
>
> --
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+...@googlegroups.com.
>
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/60f82633-887a-43a3-ac52-c5541058e0bcn%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/60f82633-887a-43a3-ac52-c5541058e0bcn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>
> --
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+...@googlegroups.com.
>
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/f3e2a277-8529-4fd5-83be-26445616f6c3n%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/f3e2a277-8529-4fd5-83be-26445616f6c3n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/f963fe8a-a595-48c5-92a8-597046e199c6n%40googlegroups.com.

Re: [sqlalchemy] Cannot insert strings with a length greater than 2000 into columns with a datatype of Text, VARCHAR, or NVARCHAR using MS SQL Server 2017 and 2019

Reply via email to