From:             jisgro at teliae dot fr
Operating system: Debian Edge
PHP version:      5.3.24
Package:          *General Issues
Bug Type:         Bug
Bug description:  mb_detect_encoding problem

Description:
------------
PHP 5.3.3

We open a file with ANSI encoding and convert it to UTF-8, after setting the
encoding with iconv_set_encoding("internal_encoding", "UTF-8").
mb_detect_encoding returns the same format before and after the conversion:

Format : ISO-8859-1

The function is in the test script. It returns:

Format : ISO-8859-1
mystere ééé ééé é éé à à à à à , <-> , ��� ��� � �� � � � � � ,
Format : ISO-8859-1

Test script:
---------------
function convertirFichierEnUTF8($sNomFichier){
    $sContenuFichier = file_get_contents($sNomFichier);
    if($sContenuFichier == ''){ // empty file or read error
        return;
    }

    $tabFormatsReconnus = array(
        'ASCII'
        ,'ISO-8859-1'
        ,'ISO-8859-2'
        ,'ISO-8859-15'
        ,'UTF-8'
        ,'UTF-16'
        ,'UTF-32'
        ,'Windows-1251'
        ,'Windows-1252'
    );

    // Detect the encoding of the original content (strict mode)
    $sFormat = mb_detect_encoding($sContenuFichier, $tabFormatsReconnus, true);
    //echo $sNomFichier."\n";
    echo "Format : ".$sFormat."\n";
    if($sFormat === false){
        CLog::trace('Erreur encodage du fichier '.$sNomFichier.' inconnu', 'Conversion fichier', 'Erreur détection encodage', 0, CLog::INIVEAU_ERREUR_CRITIQUE, CConfig::$sEmail_Trace_Erreur);
        return;
    }

    // These formats need no conversion
    if(in_array($sFormat, array('UTF-8', 'ASCII'))){
        return;
    }

    iconv_set_encoding("internal_encoding", "UTF-8");
    //iconv_set_encoding("output_encoding", "UTF-8");
    $sNouveauContenu = iconv($sFormat, 'UTF-8', $sContenuFichier);

    // If the conversion failed (iconv() returns false on error)
    if($sNouveauContenu === false || $sNouveauContenu === ''){
        CLog::trace('Erreur à la conversion en UTF8 du fichier '.$sNomFichier, 'Conversion fichier', 'Erreur conversion UTF8', 0, CLog::INIVEAU_ERREUR);
        $sNouveauContenu = iconv($sFormat, 'UTF-8//IGNORE', $sContenuFichier);
        CreeRepSiNonExiste(CConfig::$sRepertoire_log, 'erreursConversionFichiers');
        file_put_contents(CConfig::$sRepertoire_log.'erreursConversionFichiers/'.basename($sNomFichier), $sContenuFichier);
    }

    // Save the result of the conversion
    file_put_contents($sNomFichier, $sNouveauContenu);

    // Compare the original content with the converted content
    echo ($sContenuFichier === $sNouveauContenu ? 'aie aie aie c pareil' : 'mystere');
    ttt($sContenuFichier, '<->', $sNouveauContenu);

    // Detect the encoding again, this time on the converted content
    $sFormat = mb_detect_encoding($sNouveauContenu, $tabFormatsReconnus, true);
    //echo $sNomFichier."\n";
    echo "Format : ".$sFormat."\n";
}

Expected result:
----------------
After the conversion, mb_detect_encoding should return UTF-8.

Actual result:
--------------
Format : ISO-8859-1

-- 
Edit bug report at https://bugs.php.net/bug.php?id=64667&edit=1
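
A likely explanation for the result above, offered as an assumption rather than a confirmed diagnosis: every byte sequence is valid ISO-8859-1, so when 'ISO-8859-1' is listed before 'UTF-8' in the candidate list, a strict check accepts it first, even for content that is already valid UTF-8. The sketch below illustrates the ordering effect with mb_check_encoding(), which validates a string against one named encoding. The helper firstValidEncoding() and the sample byte strings are hypothetical and are not part of the reporter's script.

```php
<?php
// Hypothetical helper (not from the original report): returns the first
// candidate encoding for which $content is a valid byte sequence, or
// false if none of the candidates matches.
function firstValidEncoding($content, array $candidates)
{
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($content, $encoding)) {
            return $encoding;
        }
    }
    return false;
}

// Strict encodings first; permissive single-byte encodings last,
// because they accept any input.
$candidates = array('ASCII', 'UTF-8', 'ISO-8859-1');

$utf8   = "\xC3\xA9\xC3\xA9\xC3\xA9"; // "ééé" encoded as UTF-8
$latin1 = "\xE9\xE9\xE9";             // "ééé" encoded as ISO-8859-1

echo firstValidEncoding($utf8, $candidates), "\n";   // UTF-8
echo firstValidEncoding($latin1, $candidates), "\n"; // ISO-8859-1
echo firstValidEncoding('abc', $candidates), "\n";   // ASCII
```

With this ordering, content that is already UTF-8 is reported as UTF-8, and ISO-8859-1 is only chosen when the bytes are not valid UTF-8. Single-byte encodings such as ISO-8859-1, ISO-8859-15 and Windows-1252 cannot be told apart by validity alone, so only one of them is useful as a final fallback.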