Has anyone noticed an issue with mb_ereg_replace when the pattern string contains a # character?
The following problem is seen with php-4.2.2 with mbstring/mbregex enabled. In (1) below, ereg_replace has no problem matching a pattern containing a #. In (2), mb_ereg_replace ignores #52 from the pattern and replaces only the @ with test. To fix the problem, we need to escape # as \#, as in (3). I didn't think # had any special significance in POSIX regex, and it worked OK in (1) with ereg_replace.

1)
$s = 'blah @#52 blah';
print("s: $s \n ");
$s = ereg_replace('@#52','test',$s);
print("s: $s \n ");

s: blah @#52 blah
s: blah test blah
----------------------
2)
$s = 'blah @#52 blah';
print("s mb: $s \n ");
$s = mb_ereg_replace('@#52','test',$s);
print("s mb: $s \n ");

s mb: blah @#52 blah
s mb: blah test#52 blah
----------------------
3)
$s = 'blah @#52 blah';
print("s mb\: $s \n");
$s = mb_ereg_replace('@\#52','test',$s);
print("s mb\: $s \n");

s mb\: blah @#52 blah
s mb\: blah test blah
----------------------

The problem comes up when trying to create the following function:

function html_special_decode($s)
{
    $s = mb_ereg_replace('&gt;', '>', $s);
    $s = mb_ereg_replace('&lt;', '<', $s);
    $s = mb_ereg_replace('&quot;', '"', $s);
    // the unescaped # in this pattern is where the problem shows up
    $s = mb_ereg_replace('&#039;', '\'', $s);
    $s = mb_ereg_replace('&amp;', '&', $s);
    return $s;
}

-Ezra

"Renato De Giovanni" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> > It's probable that it's a PHP...erm..."fact of life" right now. I ran
> > into similar problems with iso-8859-7 and -9, using both
> > htmlspecialchars and htmlentities with the (optional) 3rd parameter.
> > Things worked unpredictably. In the PHP build I have now (4.4ish, from
> > recent CVS), htmlspecialchars actually prints out a PHP error message
> > (E_WARNING, I believe) that:
> >
> > "ISO-8859-7 is not supported by htmlspecialchars(); assuming ISO-8859-1"
> >
> > So I wouldn't be surprised if you weren't running into this problem,
> > which wasn't officially recognized until after 4.2 was released.
> > Look at bugs.php.net for related bugs...it's the only good way to keep up on
> > the issue, which seems to be evolving...
> >
> > Cheers,
> > spud.
>
> Ok, so it's a known "missing feature".
>
> Meanwhile, it's possible to replace:
>
> $s = htmlspecialchars($s, ENT_COMPAT, 'UTF-8');
>
> with:
>
> mb_regex_encoding('UTF-8');
> $s = mb_ereg_replace('&', '&amp;', $s);
> $s = mb_ereg_replace('>', '&gt;', $s);
> $s = mb_ereg_replace('<', '&lt;', $s);
> $s = mb_ereg_replace('"', '&quot;', $s);
>
> ...which should decrease performance considerably, but I see no other
> workaround.
>
> Thanks,
> --
> Renato

--
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
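
For the html_special_decode function in the message above, one possible way to sidestep the # question altogether is plain string replacement instead of a regex, since every search string is a fixed literal. The sketch below is only an illustration and was not proposed in the thread. The function names html_special_decode_plain and html_special_encode_plain are hypothetical. It also assumes the input is UTF-8 (or another encoding in which the ASCII bytes for &, <, >, " and ' never occur inside a multibyte character), so that byte-wise str_replace is safe.

```php
<?php
// Hypothetical helpers, not from the original thread.
// Assumption: $s is UTF-8, so ASCII bytes never appear inside a multibyte
// character and byte-wise str_replace cannot split a character.

function html_special_decode_plain($s)
{
    // No regex is involved, so '#' in '&#039;' needs no escaping.
    $s = str_replace('&gt;', '>', $s);
    $s = str_replace('&lt;', '<', $s);
    $s = str_replace('&quot;', '"', $s);
    $s = str_replace('&#039;', "'", $s);
    // Decode '&amp;' last so that '&amp;lt;' becomes '&lt;' and not '<'.
    $s = str_replace('&amp;', '&', $s);
    return $s;
}

function html_special_encode_plain($s)
{
    // Encode '&' first so the entities added afterwards are not re-encoded.
    $s = str_replace('&', '&amp;', $s);
    $s = str_replace('>', '&gt;', $s);
    $s = str_replace('<', '&lt;', $s);
    $s = str_replace('"', '&quot;', $s);
    return $s;
}
```

With this, html_special_decode_plain('it&#039;s') gives it's without any escaping of #. Whether it is actually faster than the mb_ereg_replace version would need measuring on the build in question.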