Edit report at https://bugs.php.net/bug.php?id=54379&edit=1

ID:                 54379
Updated by:         paj...@php.net
Reported by:        sms at inbox dot ru
Summary:            PDO_OCI: UTF-8 output gets truncated
Status:             Open
Type:               Bug
Package:            PDO related
PHP Version:        5.3.6
-Assigned To:
+Assigned To:       sixd
Block user comment: N
Private report:     N

New Comment:

See PR at https://github.com/php/php-src/pull/59

Previous Comments:
------------------------------------------------------------------------
[2012-09-07 21:35:35] tomasz at trejderowski dot pl

This bug is a nearly exact duplicate of bug #35003:

https://bugs.php.net/bug.php?id=35003

The only difference is that this one is still open (the second one is closed) and that this one contains a solution for fixing the bug.

Even though the fix requires changing only ONE line and recompiling the sources, no one on the PHP Dev Team has taken care of it. And this is a really huge bug, as it completely prevents using PHP + Oracle with UTF-8 encoded webpages / databases.

The other difference is that this bug has remained open for "only" 1.5 years, while the second one, closed but still not fixed, has remained untouched for SEVEN years (since the initial report).

------------------------------------------------------------------------
[2011-10-13 16:51:59] info-phpbug at ch2o dot info

I've hit the same bug with a different version of the Oracle client (10.2), and with the patch I've resolved the problem.

------------------------------------------------------------------------
[2011-08-06 08:17:20] mitans02 at gmail dot com

I have the same problem with a UTF8 database and a UTF8 client.
PDO_OCI should set the string buffer length correctly to handle UTF8.
See also the Oracle OCI documentation below:
http://download.oracle.com/docs/cd/B10500_01/server.920/a96529/ch6.htm#1004620

- Database settings

SQL> SELECT PARAMETER, VALUE
     FROM NLS_DATABASE_PARAMETERS
     WHERE PARAMETER IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET')

PARAMETER                      VALUE
------------------------------ ------------------------------
NLS_CHARACTERSET               AL32UTF8
NLS_NCHAR_CHARACTERSET         AL16UTF16

- Table for test

CREATE TABLE mytable (col1 NVARCHAR2(20));

- Test data

INSERT INTO mytable VALUES('12345678901234567890'); /* 20 single-byte chars */
INSERT INTO mytable VALUES('ããããããããããããããããã¡ã¤ã¦ã¨'); /* 20 double-byte chars, Japanese */

- Test program

<?php
print "NLS_LANG=".getenv('NLS_LANG')."\n\n";

$db = new PDO("oci:dbname=//instance1.cf9klgqzy0gu.ap-northeast-1.rds.amazonaws.com:3306/mydb;charset=AL32UTF8", "user", "pass");
$db->setAttribute(PDO::ATTR_ERRMODE,PDO::ERRMODE_EXCEPTION);

$stmt = $db->prepare("SELECT * FROM mytable");
$stmt->execute();
var_dump($stmt->fetchAll(PDO::FETCH_ASSOC));
?>

- Test program output

# php ocitest.php
NLS_LANG=Japanese_Japan.AL32UTF8

Warning: PDOStatement::fetchAll(): column 0 data was too large for buffer and was truncated to fit it in /root/ocitest.php on line 9
array(2) {
  [0]=>
  array(1) {
    ["COL1"]=>
    string(20) "12345678901234567890"
  }
  [1]=>
  array(1) {
    ["COL1"]=>
    string(40) "ããããããããããããã
  }
}

- Client environment

# uname -a
Linux localhost.localdomain 2.6.18-194.el5 #1 SMP Fri Apr 2 14:58:35 EDT 2010 i686 i686 i386 GNU/Linux

# rpm -qa | grep oracle
oracle-instantclient11.2-devel-11.2.0.2.0-1
oracle-instantclient11.2-sqlplus-11.2.0.2.0-1
oracle-instantclient11.2-basic-11.2.0.2.0-1

# php -v
PHP 5.3.6 (cli) (built: Aug 5 2011 09:15:02)
Copyright (c) 1997-2011 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2011 Zend Technologies

- Suggested fix

Patch to oci_statement.c

# diff -u oci_statement.c oci_statement.c.new
--- oci_statement.c	2011-08-06 01:07:53.000000000 -0700
+++ oci_statement.c.new	2011-08-06 01:08:03.000000000 -0700
@@ -529,7 +529,7 @@
 			(param, OCI_DTYPE_PARAM, &colname, &namelen, OCI_ATTR_NAME, S->err));
 
 	col->precision = scale;
-	col->maxlen = data_size;
+	col->maxlen = ( data_size + 1 ) * sizeof(utext);
 	col->namelen = namelen;
 	col->name = estrndup((char *)colname, namelen);
 

------------------------------------------------------------------------
[2011-03-25 04:39:19] sms at inbox dot ru

Description:
------------
Data is stored in an ANSI charset (CL8MSWIN1251) while the connection uses UTF-8. PDOStatement::fetchAll() generates a warning, and fields containing non-English characters get truncated. For example, PDO outputs only 53 UTF-8 Russian characters for a VARCHAR2(100) field.

MySQL's PDOStatement::fetchAll() works fine in the same situation.

Test script:
---------------
$pdo = new PDO('oci:dbname=[host];charset=UTF8', '[user]', '[pass]');
$cmd = $pdo->query('SELECT * FROM user');
var_dump($cmd->fetchAll());

Expected result:
----------------
Table field(s) not truncated, no warnings

Actual result:
--------------
Table field(s) get truncated, PHP warning: PDOStatement::fetchAll() [<a href='pdostatement.fetchall'>pdostatement.fetchall</a>]: column 0 data was too large for buffer and was truncated to fit it

------------------------------------------------------------------------
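
The truncation reported above is consistent with a fetch buffer whose size in bytes is taken from the column's declared length, while the data, once converted to UTF-8, needs more than one byte per character. The following is only a minimal standalone C sketch of that arithmetic. It is not code from PDO_OCI, and `utf8_seq_len` and `chars_that_fit` are hypothetical helper names introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Length in bytes of the UTF-8 sequence starting with lead byte b.
 * Continuation or otherwise unexpected bytes count as 1 so that a
 * scan over the string always advances. */
static size_t utf8_seq_len(unsigned char b)
{
    if (b >= 0xF0) return 4;
    if (b >= 0xE0) return 3;
    if (b >= 0xC0) return 2;
    return 1;
}

/* Number of complete UTF-8 characters of s that fit into a buffer
 * of buf_bytes bytes. Illustrative only: a real driver would copy
 * data, not just count it. */
static size_t chars_that_fit(const char *s, size_t buf_bytes)
{
    size_t total = strlen(s);
    size_t used = 0;
    size_t chars = 0;

    while (used < total) {
        size_t n = utf8_seq_len((unsigned char)s[used]);
        /* Stop when the next character would overflow the buffer
         * or run past the end of the input. */
        if (used + n > buf_bytes || used + n > total)
            break;
        used += n;
        chars++;
    }
    return chars;
}
```

For a string of 100 two-byte Cyrillic letters, `chars_that_fit(s, 100)` gives 50 and `chars_that_fit(s, 200)` gives 100, so a 100-byte buffer holds only about half of the text. That is in the same range as the 53 characters reported for the VARCHAR2(100) field, and it is why the suggested patch enlarges `col->maxlen` rather than using `data_size` directly.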