Edit report at https://bugs.php.net/bug.php?id=61001&edit=1
ID: 61001
User updated by: mike at eastghost dot com
Reported by: mike at eastghost dot com
Summary: Corruption of "=0a" but not "=a0"
-Status: Open
+Status: Closed
Type: Bug
Package: PCRE related
Operating System: Ubuntu LAMP
PHP Version: 5.3.10
Block user comment: N
Private report: N
New Comment:
closed
Previous Comments:
------------------------------------------------------------------------
[2012-02-07 19:38:11] mike at eastghost dot com
I tried your suggested fix and agree you are correct. This is not a bug, just
humantax_error. BTW, I already changed the code to use preg_replace_callback()
(instead of passing an array of subject regexes and replacement strings to
preg_replace()) and also agree it is faster to use preg_replace_callback().
Thank you for looking.
------------------------------------------------------------------------
[2012-02-07 10:50:28] anon at anon dot anon
Not a bug and it has nothing to do with UTF8. The error message says why it's
not working: the eval'd code has a syntax error, because you forgot to wrap the
argument to debdcode_post in quotes. It should be:
$html_oursr[0] = 'debdcode_post(\'$1\')';
It works for `debdcode_post(a0)` because a0 is parsed as a constant (if you do
`error_reporting(-1);` you will see the notice about the use of the undefined
constant), but `debdcode_post(0a)` is always a syntax error.
But the better (faster) solution is to use preg_replace_callback.
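
The a0 versus 0a difference described above can be checked with PHP's
tokenizer. This is only a minimal sketch, assuming the tokenizer extension is
loaded; token_names() is a hypothetical helper written for this illustration
and is not part of the reporter's code. It lists the named tokens the lexer
produces for each variant of the eval'd string:

```php
<?php
// Hypothetical helper: return the named tokens PHP's lexer produces for a
// code fragment, skipping whitespace and the opening tag.
function token_names($code)
{
    $names = array();
    foreach (token_get_all('<?php ' . $code) as $tok) {
        // Single-character tokens such as "(" come back as plain strings.
        if (is_array($tok) && $tok[0] !== T_WHITESPACE && $tok[0] !== T_OPEN_TAG) {
            $names[] = token_name($tok[0]) . '(' . $tok[1] . ')';
        }
    }
    return $names;
}

// a0 lexes as a single identifier, which PHP then treats as a constant name.
$with_a0 = token_names('debdcode_post( a0 );');
// 0a lexes as an integer immediately followed by an identifier.
$with_0a = token_names('debdcode_post( 0a );');

print_r($with_a0);
print_r($with_0a);
```

In the second case a number is followed directly by an identifier, which
matches the "unexpected T_STRING" in the reported parse error.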
------------------------------------------------------------------------
[2012-02-07 10:10:23] mike at eastghost dot com
Description:
------------
Passing the following UTF-8 text through the third line of the test script
(i.e., the preg_replace() call) causes an error in preg_replace():
[post=0a /]
Whereas passing the following UTF-8 text the same way causes no error:
[post=a0 /]
The problem seems to occur only when the "=" is followed by an integer and then
by a letter. I briefly tried other combinations without causing an error.
Workaround is to replace the third line of the test script with this line (i.e.,
use preg_replace_callback() instead of preg_replace()):
$out = preg_replace_callback( '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uiu',
'debdcode_post', $i_html );
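
A self-contained sketch of the callback approach, for illustration only. The
debdcode_post() body and the sample input are assumptions (the real function
looks up the post id in a database). Note that preg_replace_callback() hands
the callback an array of matches, so this sketch uses a small closure to pass
only the captured id:

```php
<?php
// Hypothetical stand-in for the reporter's debdcode_post(); the real one
// looks up the post id in a database.
function debdcode_post($id)
{
    return '<a href="/post/' . rawurlencode($id) . '">post '
        . htmlspecialchars($id) . '</a>';
}

// Same pattern as in the report, minus the "e" modifier.
$pattern = '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uiu';
$i_html  = 'See [post=0a /] and [post=a0 /].';

$out = preg_replace_callback(
    $pattern,
    function (array $m) {
        // $m[0] is the whole tag, $m[1] is the captured post id.
        return debdcode_post($m[1]);
    },
    $i_html
);

echo $out, "\n";
```

Because the captured text arrives as an ordinary string and is never evaluated
as PHP code, ids such as 0a and a0 are handled the same way.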
Test script:
---------------
$html_ours[0] = '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uieu';
$html_oursr[0] = 'debdcode_post( $1 )'; // irrelevant, use any misc func that looks up post id in db
$out = preg_replace( $html_ours, $html_oursr, $i_html );
Expected result:
----------------
The general use is in a BBCODE-like parser for a forums app.
What should happen: in the source text (in UTF-8 format), the string
"[post=4ablahblah /]" should be picked out of any given arbitrary input by
preg_replace() and then translated to a hyperlink by debdcode_post().
What is happening instead is the error in preg_replace(), presumably from
malformed UTF-8 or possibly a bug inside preg_replace() when dealing with the
particular character sequence "=<integer><letter(s) and/or integer(s)>". Note
that it is the "=" followed by an integer and then followed by at least one
letter and/or more integers that triggers the error. I hope this helps; thank
you for looking.
Actual result:
--------------
Parse error: syntax error, unexpected T_STRING in
/apath/Class/common_functions.inc(1405) : regexp code on line 1
Fatal error: preg_replace() [<a href='function.preg-replace'>function.preg-replace</a>]:
Failed evaluating code: debdcode_post( 4f30abfddc79595474000020 ) in <file:line>
------------------------------------------------------------------------