Edit report at http://bugs.php.net/bug.php?id=51880&edit=1
ID: 51880
Comment by: bgamrat at wirehopper dot com
Reported by: tnpaulik at gmail dot com
Summary: Missfunction of mb_eregi() and mb_ereg()
Status: Assigned
Type: Bug
Package: *Regular Expressions
Operating System: Windows, Linux, doesn't matter
PHP Version: Irrelevant
Assigned To: moriyoshi
Block user comment: N
Private report: N
New Comment:
This may be related.
I had a date string (YYYY-MM-DD HH:MM:SS) validation that gave
inconsistent results. The code below runs the validation 100 times on the
same input and regex. Most of the time mb_ereg matches; occasionally it
doesn't.
Earlier problems with case sensitivity led me to add a case-insensitive
fallback (lowercasing both the regex and the input), and to work around
this issue I added a second fallback that uses preg_match.
<?php
mb_internal_encoding('UTF-8');
mb_detect_order('UTF-8');
mb_regex_encoding('UTF-8');
echo date('r').'<br />';
// Run the same validation 100 times on identical input.
for ($i=0;$i<100;$i++)
    filter('^\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2}$','2011-05-15 09:00:07');

function filter($sRegExp,$sInput)
{
    if (!isset($sInput))
        return false;
    $sInput=trim($sInput);
    /* mb_ereg functions don't use slashes */
    if ($sRegExp[0]=='/')
        $sRegExp=substr($sRegExp,1,-1);
    $aMatches=array();
    // With $aMatches passed, mb_ereg (before PHP 8) returns the byte length of the match, or false.
    $iResult=mb_ereg($sRegExp,$sInput,$aMatches);
    echo 'Testing '.$sInput.' against '.$sRegExp.PHP_EOL.var_export($aMatches,true).' result '.$iResult.'<br />';
    if (strlen($sInput)!=$iResult)
    {
        // First fallback: lowercase both the regex and the input, then retry.
        $sLowerCaseRegExp=mb_strtolower($sRegExp);
        $sLowerCaseInput=mb_strtolower($sInput);
        $iResult=mb_ereg($sLowerCaseRegExp,$sLowerCaseInput,$aMatches);
        echo 'Fallback Testing '.$sLowerCaseInput.' against '.$sLowerCaseRegExp.PHP_EOL.var_export($aMatches,true).' result '.$iResult.'<br />';
        if (strlen($sInput)!=$iResult)
        {
            // Second fallback: PCRE, case-insensitive.
            $bResult=preg_match('/'.$sRegExp.'/i',$sLowerCaseInput);
            echo 'preg_match/i '.$bResult.'<br />';
            return $bResult!=0;
        }
    }
    return true;
}
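The repeated-run check above can be reduced to a small harness that counts how often a matcher disagrees with the expected outcome. This is only a sketch: `countMismatches` and the two closures are hypothetical names that are not part of the script above, the closures assume PHP 5.3 or later, and the mb_ereg branch assumes mbstring is loaded.

```php
<?php
/**
 * Run $matcher($pattern, $input) $runs times and count the runs whose
 * boolean outcome differs from $expected.
 * countMismatches is a hypothetical helper, not part of the original script.
 */
function countMismatches($matcher, $pattern, $input, $expected, $runs = 100)
{
    $mismatches = 0;
    for ($i = 0; $i < $runs; $i++) {
        if ((bool) call_user_func($matcher, $pattern, $input) !== $expected) {
            $mismatches++;
        }
    }
    return $mismatches;
}

$pattern = '^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$';
$input   = '2011-05-15 09:00:07';

// Reference matcher: PCRE, which the script above uses as its last fallback.
$viaPreg = function ($p, $s) {
    return preg_match('/' . $p . '/', $s) === 1;
};
echo 'preg_match mismatches: ' . countMismatches($viaPreg, $pattern, $input, true) . PHP_EOL;

// Matcher under test; skipped when mbstring is not available.
if (function_exists('mb_ereg')) {
    mb_regex_encoding('UTF-8');
    $viaMbEreg = function ($p, $s) {
        // mb_ereg returns false on no match; the success value differs between PHP versions.
        return mb_ereg($p, $s) !== false;
    };
    echo 'mb_ereg mismatches: ' . countMismatches($viaMbEreg, $pattern, $input, true) . PHP_EOL;
}
```

A non-zero count for mb_ereg alongside a zero count for preg_match on the same input would point at the inconsistency described here.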
Linux domain.com 2.6.18-164.9.1.el5 #1 SMP Tue Dec 15 21:04:57 EST 2009
i686 i686 i386 GNU/Linux
Apache/2.2.3
PHP version 5.1.6
mbstring
Multibyte Support                      enabled
Multibyte string engine                libmbfl
Multibyte (japanese) regex support     enabled
Multibyte regex (oniguruma) version    3.7.1
mbstring extension makes use of "streamable kanji code filter and
converter", which is distributed under the GNU Lesser General Public
License version 2.1.
Directive                        Local Value   Master Value
mbstring.detect_order            no value      no value
mbstring.encoding_translation    Off           Off
mbstring.func_overload           0             0
mbstring.http_input              pass          pass
mbstring.http_output             pass          pass
mbstring.internal_encoding       no value      no value
mbstring.language                neutral       neutral
mbstring.strict_detection        Off           Off
mbstring.substitute_character    no value      no value
Previous Comments:
------------------------------------------------------------------------
[2010-05-21 16:13:23] tnpaulik at gmail dot com
Description:
------------
mb_eregi doesn't match case-insensitively for non-ASCII characters in
PHP 5.2 and 5.3.
Example:
mb_eregi('Ü','ü') returns false.
mb_ereg is case-insensitive for non-ASCII characters if I put them in
[].
Example:
mb_ereg("[Ü]","ü") returns true.
Test script:
---------------
if (!mb_eregi("Ü","ü"))
    echo "THAT shouldn't be so...\n";
if (mb_ereg("[Ü]","ü"))
    echo "THAT shouldn't be so...\n";
Expected result:
----------------
no output
Actual result:
--------------
THAT shouldn't be so...
THAT shouldn't be so...
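For comparison, the behaviour the report expects can be seen with PCRE in UTF-8 mode. This is a sketch rather than a fix for mb_eregi: it assumes the two characters in the report are Ü (U+00DC) and ü (U+00FC), writes them as UTF-8 byte escapes so the file encoding does not matter, and relies on PHP's PCRE being built with Unicode support.

```php
<?php
$upper = "\xC3\x9C"; // U+00DC, capital U with diaeresis, as UTF-8 bytes
$lower = "\xC3\xBC"; // U+00FC, small u with diaeresis, as UTF-8 bytes

// Case-insensitive match in UTF-8 mode: the report expects a match here.
$insensitive = preg_match('/' . $upper . '/iu', $lower);

// Case-sensitive character class in UTF-8 mode: the report expects no match here.
$sensitive = preg_match('/[' . $upper . ']/u', $lower);

var_dump($insensitive === 1, $sensitive === 0);
```

The `u` modifier matters in the second call: it makes PCRE treat the class as one two-byte character rather than as two separate bytes.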
------------------------------------------------------------------------