ID: 48446
Updated by: [email protected]
Reported By: shawn at shawnbiddle dot com
Status: Open
Bug Type: Scripting Engine problem
Operating System: Linux
PHP Version: 5.2.9
New Comment:
Yeah, that's just how the tokenizer/scanner has always worked. It stops
at "<s" to prevent a long PHP opening tag (e.g. <script language="php">)
from being taken as inline HTML. The regular expressions in the
scanner can't "look ahead" to make sure that what follows is NOT a PHP
opening tag, and doing extra checking in the code after scanning
additional input would be more complicated, if it's even possible
(it's been a while since I looked).
The good news, however, is that a new scanner is used for PHP 5.3, and
as of a few weeks ago (5.3.0 RC2), it now works as you'd expect. All
contiguous HTML is kept as one token. :-)
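For anyone stuck on a pre-5.3 version, the split can probably be papered
over after the fact by merging neighbouring T_INLINE_HTML tokens. The
following is only a rough sketch of that idea, not anything from the
engine; the function name merge_inline_html is made up for illustration,
and it assumes the array token layout (ID, text, line) shown in the
report below.

```php
<?php
/**
 * Hypothetical helper: merge runs of adjacent T_INLINE_HTML tokens
 * returned by token_get_all() into a single token.
 * The merged token keeps the line number of the first piece.
 */
function merge_inline_html(array $tokens)
{
    $merged = array();
    foreach ($tokens as $token) {
        $last = count($merged) - 1;
        if (is_array($token) && $token[0] === T_INLINE_HTML
            && $last >= 0
            && is_array($merged[$last])
            && $merged[$last][0] === T_INLINE_HTML) {
            // Append this fragment's text to the previous inline HTML token.
            $merged[$last][1] .= $token[1];
            continue;
        }
        // Anything else (other tokens, single-character strings) is kept as-is.
        $merged[] = $token;
    }
    return $merged;
}
```

It would be called as merge_inline_html(token_get_all($source)). On a
scanner that already keeps inline HTML in one token there are no
adjacent T_INLINE_HTML tokens, so the input comes back unchanged.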
Previous Comments:
------------------------------------------------------------------------
[2009-06-01 16:48:12] shawn at shawnbiddle dot com
Description:
------------
If token_get_all() is run on a script that contains both PHP and HTML,
it splits T_INLINE_HTML tokens whenever it encounters an HTML tag
starting with "s". My example uses <span>, but it happens with any tag
starting with "s".
Reproduce code:
---------------
<?php
print_r(token_get_all('<?php echo "Hello World!"; ?><h6>Hello</h6><span class="test">Hello!</span><?php echo "Goodbye, World!"; ?>'));
?>
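A variant of the script above, offered only as a sketch: it passes each
numeric token ID through token_name() so the dump is readable without
looking the numbers up (IDs such as 311 or 367 are not stable across PHP
versions). The helper name dump_tokens is hypothetical.

```php
<?php
// Hypothetical helper: describe each token as "NAME: text" instead of a numeric ID.
function dump_tokens($source)
{
    $lines = array();
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens carry the ID in [0] and the matched text in [1].
            $lines[] = token_name($token[0]) . ': ' . var_export($token[1], true);
        } else {
            // Single-character tokens such as ";" are returned as plain strings.
            $lines[] = 'CHAR: ' . var_export($token, true);
        }
    }
    return $lines;
}

$source = '<?php echo "Hello World!"; ?><h6>Hello</h6><span class="test">Hello!</span>';
echo implode("\n", dump_tokens($source)), "\n";
```

With this form, a split shows up as several consecutive T_INLINE_HTML
lines rather than several arrays whose first element is 311.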
Expected result:
----------------
Array
(
[0] => Array
(
[0] => 367
[1] => <?php
[2] => 1
)
[1] => Array
(
[0] => 316
[1] => echo
[2] => 1
)
[2] => Array
(
[0] => 370
[1] =>
[2] => 1
)
[3] => Array
(
[0] => 315
[1] => "Hello World!"
[2] => 1
)
[4] => ;
[5] => Array
(
[0] => 370
[1] =>
[2] => 1
)
[6] => Array
(
[0] => 369
[1] => ?>
[2] => 1
)
[7] => Array
(
[0] => 311
[1] => <h6>Hello</h6><span class="test">Hello!</span>
[2] => 1
)
[8] => Array
(
[0] => 367
[1] => <?php
[2] => 1
)
[9] => Array
(
[0] => 316
[1] => echo
[2] => 1
)
[10] => Array
(
[0] => 370
[1] =>
[2] => 1
)
[11] => Array
(
[0] => 315
[1] => "Goodbye, World!"
[2] => 1
)
[12] => ;
[13] => Array
(
[0] => 370
[1] =>
[2] => 1
)
[14] => Array
(
[0] => 369
[1] => ?>
[2] => 1
)
)
Actual result:
--------------
Array
(
[0] => Array
(
[0] => 367
[1] => <?php
[2] => 1
)
[1] => Array
(
[0] => 316
[1] => echo
[2] => 1
)
[2] => Array
(
[0] => 370
[1] =>
[2] => 1
)
[3] => Array
(
[0] => 315
[1] => "Hello World!"
[2] => 1
)
[4] => ;
[5] => Array
(
[0] => 370
[1] =>
[2] => 1
)
[6] => Array
(
[0] => 369
[1] => ?>
[2] => 1
)
[7] => Array
(
[0] => 311
[1] => <h6>Hello</h6>
[2] => 1
)
[8] => Array
(
[0] => 311
[1] => <s
[2] => 1
)
[9] => Array
(
[0] => 311
[1] => pan class="test">Hello!</span>
[2] => 1
)
[10] => Array
(
[0] => 367
[1] => <?php
[2] => 1
)
[11] => Array
(
[0] => 316
[1] => echo
[2] => 1
)
[12] => Array
(
[0] => 370
[1] =>
[2] => 1
)
[13] => Array
(
[0] => 315
[1] => "Goodbye, World!"
[2] => 1
)
[14] => ;
[15] => Array
(
[0] => 370
[1] =>
[2] => 1
)
[16] => Array
(
[0] => 369
[1] => ?>
[2] => 1
)
)
------------------------------------------------------------------------