Edit report at https://bugs.php.net/bug.php?id=25972&edit=1
ID: 25972
Comment by: j dot faithw at yahoo dot com
Reported by: phpbug at chipple dot net
Summary: ODBC truncates multi-byte text (w/ MSSQL)
Status: Analyzed
Type: Feature/Change Request
Package: ODBC related
Operating System: Win2K 5.00.2195 SP4
PHP Version: 4.3, 5
Block user comment: N
Private report: N
New Comment:
I have the same problem with php 5.3.8 using PostgreSQL with char columns. If
the database is created with e.g the EUC_CN character encoding(createdb -E
EUC_CN).
This encoding uses between 1 and 3 bytes per character. So a char(10) could
need up to 30 bytes.
The problem is in the odbc_bindcols function in ext/odbc/php_odbc.c
SQLColAttributes is called with SQL_COLUMN_DISPLAY_SIZE but this indicates
the maximum number of characters required not the number of bytes.
This means the buffer allocated for the value may not be big enough
result->values[i].value=(char)emalloc(displaysize+1);
Later on in e.g. odbc_fetch_into
Z_STRLEN_P(tmp) = result->values[i].vallen;
Z_STRVAL_P(tmp) = estrndup(result->values[i].value,Z_STRLEN_P(tmp));
This can result in a vallen bigger that displaysize. But the ODBC driver will
only fill in at most displaysize+1 bytes(including null terminator). This means
character data is missed and junk bytes are returned instead.
The same problem may exist in ext/pdo_odbc/odbc_stmt.c. Where
rc = SQLColAttribute(S->stmt, colno+1, SQL_DESC_DISPLAY_SIZE,
NULL, 0, NULL, &displaysize);
is called. But I have not tested this.
The following fixes odbc_bindcols for the char(x) datatype. I believe 4 bytes
is the maximum required for any charater encoding.
php_odbc.c:line 988
if (result->values[i].coltype == SQL_CHAR) {
//If using a multibyte character encoding
//number of bytes could be 4*SQL_COLUMN_DISPLAY_SIZE.
//Without this workaround various functions
//e.g. odbc_fetch_into will return data with a null after
//diplaysize bytes and extra junk data at the end as
//vallen can be bigger than displaysize. Tested using
//PostgreSQL with EUC_CN encoding.
displaysize*=4;
}
The fix may be needed for other data types as well as SQL_CHAR.
Previous Comments:
------------------------------------------------------------------------
[2003-11-05 02:45:46] phpbug at chipple dot net
In case this helps...
One thing I just noticed from my test above (in the results before the change)
is that strlen() on the tTitle field value gives 86 [bytes], the string's
correct length. I verified that the 6 last bytes are all null (ASCII 0), only
the first 80 bytes are correctly being returned.
------------------------------------------------------------------------
[2003-11-05 01:18:08] phpbug at chipple dot net
Thank you very much for the attention to my bug report.
I gave the fix a try in my environment but then all field values received are
empty (details below).
Perhaps the SQL_COLUMN_LENGTH attribute always contains the value 0?
// Test code
$oOdbcConn = odbc_connect(C_Gen_sDbDSN,C_Gen_sDbUser,C_Gen_sDbPassword);
$oOdbcRs = odbc_exec($oOdbcConn,$sSql);
$aOdbcRow = odbc_fetch_array($oOdbcRs);
for ($i = 1; $i <= odbc_num_fields($oOdbcRs); $i++)
echo odbc_field_name($oOdbcRs,$i).": ".
odbc_field_len($oOdbcRs,$i).": ".
strlen($aOdbcRow[odbc_field_name($oOdbcRs,$i)]).": ".
gettype($aOdbcRow[odbc_field_name($oOdbcRs,$i)])."<br>";
// Result with php4-STABLE-200311050430 before change
// (SQL_COLUMN_DISPLAY_SIZE)
aCourseID: 10: 1: string
tTitle: 80: 86: string
// Result with php4-STABLE-200311050430 after change
// (SQL_COLUMN_LENGTH)
aCourseID: 10: 0: NULL
tTitle: 80: 0: NULL
------------------------------------------------------------------------
[2003-11-04 18:33:10] [email protected]
I do not like the idea of introducing ODBCv3 based code/
options to a system predominately defined by ODBCv2
specifications. So I am against the inclusion of
Moriyoshi's initial suggested fix for this.
That being said, I did a bit more research on this today
and think the following change should allow the double
wide characters to work much better. I haven't tested
it out yet myself, but if someone else has the time, it
would be beneficial to all. Sorry about the bug system
mangling.
Essentially the SQL_COLUMNS_DISPLAY_SIZE lists the
number of characters needed to display everything. This
works fine but in the case of a double wide character
array it doesn't (as explained my Moriyoshi). The
SQL_COLUMN_LENGTH should return the number of bytes
necessary for retrival of the column.
WARNING: this change may fundamentally alter the
functionality of the longreadlen variable comparisons as
well. Use at your own risk right now.
Index: php_odbc.c
========================================================
===========
RCS file: /repository/php-src/ext/odbc/php_odbc.c,v
retrieving revision 1.176
diff -r1.176 php_odbc.c
671,672c671,672
< rc =
SQLColAttributes(result->stmt, (UWORD)(i+1),
SQL_COLUMN_DISPLAY_SIZE,
<
NULL, 0, NULL, &displaysize);
---
> rc =
SQLColAttributes(result->stmt, (UWORD)(i+1),
> SQL_COLUMN_LENGTH, NULL, 0,
NULL, &displaysize);
------------------------------------------------------------------------
[2003-11-04 09:31:56] [email protected]
Well, then how did you conclude NVARCHAR support will break some kinds of
compatibilities? I think I have already pointed out that we'd still be able to
handle it on ODBCv2.
(Sorry if this sounds offending. I don't mean so :)
Basically we don't have to check whether the column type is NVARCHAR or not,
but just allocate enough space for that type of characters. That way we also
got to take a slight loss of memory into account though.
------------------------------------------------------------------------
[2003-11-04 07:56:51] [email protected]
moriyoshi,
It's not as simple as you show it to be. First you must
realize that PHP's ODBC layer is written as an ODBC v2
compliant system, to just randomly add in support for
NVARCHAR (and friends) will break support for other
database systems.
The point of my post wasn't to say this isn't a bug
(hence why I marked it as verified), but rather to say
it's a known bug and the issue is the extension is in
need of updating.
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
https://bugs.php.net/bug.php?id=25972
--
Edit this bug report at https://bugs.php.net/bug.php?id=25972&edit=1