I still think the patch isn't good. encaps_list, which is the main parser rule, can parse:
T_VARIABLE T_OBJECT_OPERATOR T_STRING
Your version of T_STRING would break this.
Again, I might be missing something, but my hunch is that it would break.
Andi
At 03:06 AM 11/11/2002 -0500, George Schlossnagle wrote:
The patch I submitted included BACKQUOTES in the token matching as well. I'm not convinced that is bad, but I will try to thoroughly test it tomorrow, and if it's broken, I'll just case it for " and heredocs.

George

On Monday, November 11, 2002, at 01:56 AM, Andi Gutmans wrote:

OH I missed that. I'll check it out this evening as I have to go now.

Andi

At 01:48 AM 11/11/2002 -0500, George Schlossnagle wrote:

Unless I misunderstand the way this works, it's not a problem that it returns a T_STRING, only possibly that it does so inside a BACKQUOTES. Function names and constants aren't available as barewords in DOUBLE_QUOTES or HEREDOCs, right?

On Monday, November 11, 2002, at 01:12 AM, Andi Gutmans wrote:

Hi,

A patch which improves on this would be welcome. However, this patch at first glance is bogus. You are returning T_STRING with possible spaces and other non A-Za-z_ chars. This token is also used as tokens such as constants and function names.

Andi

At 06:31 PM 11/10/2002 -0500, George Schlossnagle wrote:

that would be my debugging from my 'clean' cvs copy. :) You don't want that. Sorry.

Here's a better patch:

Index: zend_language_scanner.l
===================================================================
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.51
diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 -0000 1.51
+++ zend_language_scanner.l 10 Nov 2002 23:30:28 -0000
@@ -686,6 +686,7 @@
 HNUM "0x"[0-9a-fA-F]+
 LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^&+-/*=%!~<>?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^&+-/*=%!~$<>?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1269,7 +1270,7 @@
 }
 
 
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->type = IS_STRING;
     return T_STRING;
@@ -1569,15 +1570,6 @@
     zendlval->type = IS_STRING;
     return T_STRING;
 }
-}
-
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-    HANDLE_NEWLINES(yytext, yyleng);
-    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-    zendlval->value.str.len = yyleng;
-    zendlval->type = IS_STRING;
-    return T_ENCAPSED_AND_WHITESPACE;
 }
 
 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {

On Sunday, November 10, 2002, at 06:25 PM, Moriyoshi Koizumi wrote:

--snip
+ fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);

What's this fprintf()? This seems to be put just for debugging purpose.

Moriyoshi

     return T_STRING;
 }
 
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->type = IS_STRING;
+    fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);
     return T_STRING;
 }
@@ -1572,6 +1573,15 @@
 }
 }
 
+
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE } {
+    HANDLE_NEWLINES(yytext, yyleng);
+    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
+    zendlval->value.str.len = yyleng;
+    zendlval->type = IS_STRING;
+    return T_ENCAPSED_AND_WHITESPACE;
+}
+
 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
     HANDLE_NEWLINES(yytext, yyleng);
     zend_copy_value(zendlval, yytext, yyleng);

On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It's the list, I don't think they allow attachments....do you have web space you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:

On Sun, 10 Nov 2002, George Schlossnagle wrote:

For those who came to Dan & my or Derick's talk at the Int. PHP Conference, we both covered the bad inefficiency in the parser that results in strings with variables in them being tokenized on whitespace. This results in a huge number of unnecessary opcodes in strings. Attached (hopefully, as my new MUA seems to be fickle) is a first shot at a fix to the parser to keep this from happening, so that you don't need an optimizer to clear up this issue. I've tested this locally. It still introduces a single unnecessary opcode after variable in certain cases, but it works for me.

hmm, your MUA is getting senile :) no attachment...

Derick

- --
~Paul Nicholson
Design Specialist @ WebPower Design
"The web....the way you want it!"
[EMAIL PROTECTED]
www.webpowerdesign.net

"It said uses Windows 98 or better, so I loaded Linux!"
Registered Linux User #183202 using Register Linux System # 81891
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9zuZNDyXNIUN3+UQRAlYEAJ9PE5IKScOc+7/Kk1a71jJ87o7+EgCfV9z7
u+KZNZj2lZWzXmRiZmYrq4U=
=ChWV
-----END PGP SIGNATURE-----

--
PHP Development Mailing List <http://www.php.net/>
To unsubscribe, visit: http://www.php.net/unsub.php