 ID:               33093
 Updated by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
-Status:           Open
+Status:           Feedback
 Bug Type:         Unknown/Other Function
 Operating System: Mac OS X 10.4.1
 PHP Version:      5.0.4
 New Comment:

Where's the missing data?

php -r 'var_dump(token_get_all("<?php echo \$var ?>"));'
array(6) {
  [0]=>
  array(2) {
    [0]=>
    int(366)
    [1]=>
    string(6) "<?php "
  }
  [1]=>
  array(2) {
    [0]=>
    int(316)
    [1]=>
    string(4) "echo"
  }
  [2]=>
  array(2) {
    [0]=>
    int(369)
    [1]=>
    string(1) " "
  }
  [3]=>
  array(2) {
    [0]=>
    int(309)
    [1]=>
    string(4) "$var"
  }
  [4]=>
  array(2) {
    [0]=>
    int(369)
    [1]=>
    string(1) " "
  }
  [5]=>
  array(2) {
    [0]=>
    int(368)
    [1]=>
    string(2) "?>"
  }
}

php -r 'var_dump(token_get_all("<?php \necho \$var\n?>"));'
array(7) {
  [0]=>
  array(2) {
    [0]=>
    int(366)
    [1]=>
    string(6) "<?php "
  }
  [1]=>
  array(2) {
    [0]=>
    int(369)
    [1]=>
    string(1) " "
  }
  [2]=>
  array(2) {
    [0]=>
    int(316)
    [1]=>
    string(4) "echo"
  }
  [3]=>
  array(2) {
    [0]=>
    int(369)
    [1]=>
    string(1) " "
  }
  [4]=>
  array(2) {
    [0]=>
    int(309)
    [1]=>
    string(4) "$var"
  }
  [5]=>
  array(2) {
    [0]=>
    int(369)
    [1]=>
    string(1) " "
  }
  [6]=>
  array(2) {
    [0]=>
    int(368)
    [1]=>
    string(2) "?>"
  }
}


Previous Comments:
------------------------------------------------------------------------

[2005-05-21 18:40:38] [EMAIL PROTECTED]

Description:
------------
It appears that token_get_all() does not report T_OPEN_TAG and
T_WHITESPACE properly, depending on the whitespace following the
opening tag.

For example, when parsing ...

<?php echo $var ?>

... you get T_OPEN_TAG, T_ECHO, T_WHITESPACE, T_VARIABLE, T_WHITESPACE,
and T_CLOSE_TAG. This is not entirely the expected result (I would
expect T_WHITESPACE between the open tag and the echo).

However, when parsing the functional equivalent ...

<?php echo $var ?>

... you get "<", "?", T_STRING ("php"), T_WHITESPACE, T_ECHO,
T_WHITESPACE, T_VARIABLE, T_WHITESPACE, and T_CLOSE_TAG. In addition,
the first whitespace value reported does not include all the newlines
(it drops one).

Although Macs use \r for their newlines natively, the test code uses
the Unix-standard \n, so I don't think it's Mac-related.

If this is in fact a bug, the current behavior makes it difficult to
write a reliable userland code auditor and report proper line numbers.

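For the line-number concern, here is a minimal sketch of how a userland tool might name each token and compute its line by counting the newlines in the token text itself. The function describe_tokens() is a hypothetical helper written for illustration, not something from the report or from PHP. It assumes the tokenizer returns every byte of the input in some token, which is exactly the property this report is questioning, and it ignores the line number that later PHP releases add as a third array element.

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the report):
// pair every token with a readable name and a computed line number.
function describe_tokens($source)
{
    $line = 1;
    $result = array();
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens carry a numeric ID and the matched text.
            $name = token_name($token[0]);
            $text = $token[1];
        } else {
            // Single-character tokens such as ";" come back as plain strings.
            $name = $token;
            $text = $token;
        }
        $result[] = array('line' => $line, 'name' => $name, 'text' => $text);
        // Advance the counter by the newlines this token consumed.
        $line += substr_count($text, "\n");
    }
    return $result;
}

// Print one row per token: starting line, token name, exported text.
foreach (describe_tokens("<?php\n\necho \$var\n?>") as $t) {
    printf("%3d  %-14s %s\n", $t['line'], $t['name'], var_export($t['text'], true));
}
```

Concatenating the 'text' fields and comparing the result with the original source is a quick way to check whether any whitespace was dropped by the tokenizer.
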
Am I missing some assumptions behind the behavior of the tokenizer
function?


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=33093&edit=1