Bugs item #2798331, was opened at 2009-05-29 12:16
Message generated for change (Comment added) made by plewy
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2798331&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: SQL/ODBC
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Pawel Lewicki (plewy)
Assigned to: Sjoerd Mullender (sjoerd)
Summary: ODBC - broken 'utf-8' data

Initial Comment:
The same problem as reported in #2014118
SQL queries executed using ODBC driver are not properly transferred. Inserted 
data has broken utf-8 characters.
Broken data is recognized by "LIKE" queries if using the same driver (ODBC vs 
native Python bindings).
Windows 32-bit Feb2009-SP2 release.
ODBC driver version is 1.28.04.01

 >>> import MonetSQLdb
 >>> testStr = 'us\xc5\x82ugi'
 >>> conPython= MonetSQLdb.connect(host = 'localhost', user = 'monetdb', 
 >>> password = 'monetdb', lang = 'sql', dbname='demo')
 >>> cuPython = con.cursor()
 >>> cuPython.execute("CREATE TABLE testtable (testfield varchar(30))")
0
 >>> cuPython.execute("INSERT INTO testtable VALUES ('%s')"%testStr)
1
 >>> cuPython.execute("SELECT * FROM testtable WHERE testfield LIKE 
 >>> '%s';"%testStr)
1
 >>> print cuPython.fetchall()
[('us\xc5\x82ugi',)]
 >>>
 >>>
 >>> import pyodbc
 >>> conODBC = pyodbc.connect('DSN=demo')
 >>> cuODBC = conODBC.cursor()
 >>> cuODBC.execute("SELECT * FROM testtable WHERE testfield LIKE 
 >>> '%s';"%testStr)
<pyodbc.Cursor object at 0x00D322F8>
 >>> print cuODBC.fetchall()
[]
 >>> cuODBC.execute("INSERT INTO testtable VALUES ('%s')"%testStr)
1
>>> cuODBC.execute("COMMIT;") #only Python driver is in autocommit mode
0
>>> cuODBC.execute("SELECT * FROM testtable WHERE testfield LIKE '%s';"%testStr)
<pyodbc.Cursor object at 0x00D31100>
>>> print cuODBC.fetchall()
[('us\xc4\xb9\xe2\x80\x9augi', )]

>>> cuPython.execute("SELECT * FROM testtable")
2
>>> print cuPython.fetchall()
[('us\xc5\x82ugi',), ('us\xc4\xb9\xe2\x80\x9augi',)]
>>> cuODBC.execute("SELECT * FROM testtable")
<pyodbc.Cursor object at 0x00D31100>
>>> print cuODBC.fetchall()
[('us\xc5\x82ugi', ), ('us\xc4\xb9\xe2\x80\x9augi', )]
 
 >>> cuPython.execute("DROP TABLE testtable")
0

----------------------------------------------------------------------

>Comment By: Pawel Lewicki (plewy)
Date: 2009-06-05 14:07

Message:
I updated client and server to May 2009 - no change.
odbc.log:

DllMain 1
DllMain 2
DllMain 2
DllMain 2
SQLAllocHandle Env 0
DllMain 2
new env 1172f28
SQLSetEnvAttr 1172f28 200 3
SQLAllocHandle Dbc 1172f28
new dbc 1174860
SQLGetInfoW 1174860 77
addDbcError 1174860 01004 String data, right-truncated 0
SQLSetConnectAttrW 1174860 115
addDbcError 1174860 HY092 Invalid attribute/option identifier 0
SQLDriverConnectW 1174860 "DSN=monet_demo;" 0
addDbcError 1174860 01004 String data, right-truncated 0
"DSN=monet_demo;" 0
SQLConnect: DSN=monet_demo UID=monetdb PWD=monetdb host=localhost
port=50000 database=demo
addDbcError 1174860 01004 String data, right-truncated 0
SQLGetDiagRecW Dbc 1174860 1 0
SQLGetDiagRecW: [MonetDB][ODBC Driver 1.0]String data, right-truncated
SQLGetDiagRecW Dbc 1174860 1 56
SQLGetDiagRecW: [MonetDB][ODBC Driver 1.0]String data, right-truncated
SQLGetDiagFieldW Dbc 1174860 1 8 256
SQLGetDiagFieldW: ISO 9075
SQLGetDiagFieldW Dbc 1174860 1 9 256
SQLGetDiagFieldW: ISO 9075
SQLGetDiagFieldW Dbc 1174860 1 10 256
SQLGetDiagFieldW: MonetDB ODBC/Mapi
SQLGetDiagFieldW Dbc 1174860 1 11 256
SQLGetDiagFieldW: monet_demo
SQLGetDiagRecW Dbc 1174860 2 0
SQLGetFunctions 1174860 999
SQLGetInfoW 1174860 23
SQLGetInfoW 1174860 24
SQLGetInfoW 1174860 1750
addDbcError 1174860 HY096 Invalid information type 0
SQLSetConnectAttrW 1174860 102
SQLSetConnectAttr set autocommit off
SQLGetInfoW 1174860 10002
addDbcError 1174860 01004 String data, right-truncated 0
SQLAllocHandle Stmt 1174860
new stmt 1d440a0
SQLGetStmtAttrW 1d440a0 10010
SQLGetStmtAttrW 1d440a0 10011
SQLGetStmtAttrW 1d440a0 10012
SQLGetStmtAttrW 1d440a0 10013
SQLGetTypeInfoW 1d440a0 93
ODBCInitResult: querytype 1
SQLFetch 1d440a0
SQLGetData 1d440a0 3 4
SQLFreeStmt 1d440a0 0
SQLAllocHandle Stmt 1174860
new stmt 1174ba8
SQLGetStmtAttrW 1174ba8 10010
SQLGetStmtAttrW 1174ba8 10011
SQLGetStmtAttrW 1174ba8 10012
SQLGetStmtAttrW 1174ba8 10013
SQLFreeStmt 1174ba8 0
SQLExecDirectW 1174ba8
SQLExecDirect: "SELECT * FROM testtable WHERE testfield LIKE
'usługi';"
ODBCInitResult: querytype 1
SQLNumResultCols 1174ba8
SQLRowCount 1174ba8
SQLDescribeColW 1174ba8 1
addStmtError 1174ba8 01004 String data, right-truncated 0
SQLDescribeColW 1174ba8 1
addStmtError 1174ba8 01004 String data, right-truncated 0
SQLDescribeColW 1174ba8 1
addStmtError 1174ba8 01004 String data, right-truncated 0
SQLFetch 1174ba8
SQLFreeStmt 1174ba8 0
SQLExecDirectW 1174ba8
SQLExecDirect: "INSERT INTO testtable VALUES ('usługi')"
ODBCInitResult: querytype 2
SQLNumResultCols 1174ba8
SQLRowCount 1174ba8
SQLFreeStmt 1174ba8 0
SQLExecDirectW 1174ba8
SQLExecDirect: "COMMIT;"
ODBCInitResult: querytype 4
SQLNumResultCols 1174ba8
SQLRowCount 1174ba8
SQLFreeStmt 1174ba8 0
SQLExecDirectW 1174ba8
SQLExecDirect: "SELECT * FROM testtable WHERE testfield LIKE
'usługi';"
ODBCInitResult: querytype 1
SQLNumResultCols 1174ba8
SQLRowCount 1174ba8
SQLDescribeColW 1174ba8 1
addStmtError 1174ba8 01004 String data, right-truncated 0
SQLDescribeColW 1174ba8 1
addStmtError 1174ba8 01004 String data, right-truncated 0
SQLDescribeColW 1174ba8 1
addStmtError 1174ba8 01004 String data, right-truncated 0
SQLFetch 1174ba8
SQLGetData 1174ba8 1 1
SQLFetch 1174ba8
SQLFreeStmt 1174ba8 0
SQLExecDirectW 1174ba8
SQLExecDirect: "SELECT * FROM testtable"
ODBCInitResult: querytype 1
SQLNumResultCols 1174ba8
SQLRowCount 1174ba8
SQLDescribeColW 1174ba8 1
addStmtError 1174ba8 01004 String data, right-truncated 0
SQLDescribeColW 1174ba8 1
addStmtError 1174ba8 01004 String data, right-truncated 0
SQLDescribeColW 1174ba8 1
addStmtError 1174ba8 01004 String data, right-truncated 0
SQLFetch 1174ba8
SQLGetData 1174ba8 1 1
SQLFetch 1174ba8
SQLGetData 1174ba8 1 1
SQLFetch 1174ba8
SQLFreeStmt 1174ba8 0
SQLFreeHandle Stmt 1174ba8
SQLEndTran Dbc 1174860 1
new stmt 1174ba8
SQLExecDirect: "rollback"
ODBCInitResult: querytype 4
SQLFreeStmt 1d440a0 1
SQLGetInfoW 1174860 1
SQLDisconnect 1174860
SQLFreeHandle Dbc 1174860
SQLFreeHandle Env 1172f28
DllMain 0


----------------------------------------------------------------------

Comment By: Sjoerd Mullender (sjoerd)
Date: 2009-06-03 18:01

Message:
I am not convinced that the problem is in the ODBC driver.  In other words,
I need more evidence.

Can you repeat the pyodbc interaction after setting the environment
variable ODBCDEBUG to the name of a file where the ODBC driver can dump
logging infromation (e.g.
set ODBCDEBUG=C:\odbc.log
).  Can you then attach the resulting file to this bug report?

What I did to try out the bug is to create a table like you did, except I
inserted just one value consisting of an a-ring (U+00C5, or '\xc4\xa5' in
UTF-8).  When I try to select with a like clause with the a-ring from a
command window, SQLExecDirect is called with a string containing
\xc3\x83\xc2\xa5 when I write out the UTF-8 sequence (similar to what you
did), and \xe2\x80\xa0 if I use a literal a-ring.  This shows that pyodbc
(or the Windows ODBC driver manager) translates the input string from
(presumably) cp1252 to UTF-8, even though the command window uses cp437.

My exact queries where:
cuODBC.execute('select * from testtable where testfield like
\'\xc3\xa5\'')
cuODBC.execute('select * from testtable where testfield like \'å\'')
and the log shows these queries (and there is absolutely no conversion in
the ODBC driver before this gets logged, i.e. these are the exact strings
the driver gets from  pyodbc):
SQLExecDirect: "select * from testtable where testfield like
'<C3><83><C2><A5>'"
SQLExecDirect: "select * from testtable where testfield like
'<E2><80><A0>'"
Note that in cp437 (which the console uses, a-ring has the value '\x86',
and that this value is interpreted as the dagger (U+2020) in cp1252.  The
UTF-8 encoding of U+2020 is the 3 byte string shown above.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2798331&group_id=56967

------------------------------------------------------------------------------
OpenSolaris 2009.06 is a cutting edge operating system for enterprises 
looking to deploy the next generation of Solaris that includes the latest 
innovations from Sun and the OpenSource community. Download a copy and 
enjoy capabilities such as Networking, Storage and Virtualization. 
Go to: http://p.sf.net/sfu/opensolaris-get
_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs

Reply via email to