Re: [PHP-DEV] ZEND_ADD_STRING patch
Hi,

There is a problem with the committed patch. It incorrectly tokenizes things like $foo = %-{$bar} (this breaks the PEAR installer, amongst other things). I've attached a fix for it.

Also, it looks like you didn't accept the part of the fix that allows for enhanced handling of heredocs. Is there a reason why? I'm sticking that in this patch again, in case you merged my last change by hand and missed that accidentally.

On Friday, November 15, 2002, at 06:48 PM, George Schlossnagle wrote:
> Much sexier indeed. There are some flaws with it:
>  o Tokenizes heredocs on whitespace
>  o Doesn't count lines correctly for debug (since strings now have newlines in them)
> Here's a revised patch to yours that fixes those (heredocs are tokenized on newlines - I think that is best case)
>
> Andi Gutmans wrote:
>> I propose something like the following: (not tested)
>> It's definitely a sexier patch :)
>> Andi

--
PHP Development Mailing List http://www.php.net/
To unsubscribe, visit: http://www.php.net/unsub.php
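A minimal Python sketch of why a pattern shaped like ENCAPSED_STRING can mis-tokenize `%-{$bar}`. It assumes the definition quoted elsewhere in this thread, with the second alternative read as `-[^>]` (the archive appears to have dropped the `>`, and possibly other characters from the bracket expression). `first_token` is a hypothetical helper for illustration, not part of the Zend scanner.

```python
import re

# ENCAPSED_STRING as archived in this thread; "-[^>]" is an assumed
# reconstruction of the archived "-[^]".
ENCAPSED_STRING = re.compile(
    r"(?:[a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^+/*=%!~?@]|-[^>])+"
)

def first_token(text):
    """Return the ENCAPSED_STRING match at the start of text ('' if none)."""
    m = ENCAPSED_STRING.match(text)
    return m.group(0) if m else ""

# The text that follows $foo in the example from the message above.
body = " = %-{$bar}"
print(repr(first_token(body)))

# "->" is left alone, which is what the second alternative is meant to do.
print(repr(first_token("->prop")))
```

In this sketch the `-[^>]` alternative consumes the `{` along with the `-`, so the opening brace of `{$bar}` ends up inside the string token instead of being seen as its own token.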
Re: [PHP-DEV] ZEND_ADD_STRING patch
At 11:35 AM 11/16/2002 -0500, George Schlossnagle wrote:
> Hi,
>
> There is a problem with the patch committed. It incorrectly tokenizes things like $foo = %-{$bar} (this breaks the PEAR installer, amongst other things). I've attached a fix for it.
>
> Also, it looks like you didn't accept the part of the fix that allows for enhanced handling of heredocs. Is there a reason why? I'm sticking that in this patch again, in case you merged my last change by hand and missed that accidentally.

Yeah, I merged by hand (because of whitespace problems, as you don't attach patches). Please send me the required patch against the current CVS.

Thanks,
Andi
Re: [PHP-DEV] ZEND_ADD_STRING patch
Here's the patch. Looks like everything but the heredoc part is in cvs now.
Re: [PHP-DEV] ZEND_ADD_STRING patch
George Schlossnagle wrote:

I'm a tool. I sent the wrong patch to the list. Thanks to Andrei for pointing it out. Here is the _right_ patch (finally).

diff -u -3 -r1.53 zend_language_scanner.l
--- zend_language_scanner.l 8 Nov 2002 13:40:54 - 1.53
+++ zend_language_scanner.l 15 Nov 2002 20:20:33 -
@@ -37,6 +37,7 @@
 %x ST_BACKQUOTE
 %x ST_HEREDOC
 %x ST_LOOKING_FOR_PROPERTY
+%x ST_EXPECTING_OBJECT
 %x ST_LOOKING_FOR_VARNAME
 %x ST_COMMENT
 %x ST_ONE_LINE_COMMENT
@@ -692,6 +693,7 @@
 HNUM 0x[0-9a-fA-F]+
 LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -823,13 +825,25 @@
     return T_EXTENDS;
 }
-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"->" {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"$"{LABEL}"->"{LABEL} {
+    yy_push_state(ST_EXPECTING_OBJECT TSRMLS_CC);
+    yyless(0);
+}
+
+
+<ST_IN_SCRIPTING,ST_EXPECTING_OBJECT>"->" {
     yy_push_state(ST_LOOKING_FOR_PROPERTY TSRMLS_CC);
     return T_OBJECT_OPERATOR;
 }
 <ST_LOOKING_FOR_PROPERTY>{LABEL} {
-    yy_pop_state(TSRMLS_C);
+    if(yy_top_state(TSRMLS_C) == ST_EXPECTING_OBJECT) {
+        yy_pop_state(TSRMLS_C);
+        yy_pop_state(TSRMLS_C);
+    }
+    else {
+        yy_pop_state(TSRMLS_C);
+    }
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->value.str.len = yyleng;
     zendlval->type = IS_STRING;
@@ -1265,7 +1279,7 @@
     return T_INLINE_HTML;
 }
-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
+<ST_EXPECTING_OBJECT,ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
     zend_copy_value(zendlval, (yytext+1), (yyleng-1));
     zendlval->type = IS_STRING;
     return T_VARIABLE;
@@ -1278,13 +1292,26 @@
     return T_STRING;
 }
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE>{LABEL_OR_WHITESPACE} {
+    HANDLE_NEWLINES(yytext, yyleng);
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->type = IS_STRING;
     return T_STRING;
 }
+<ST_HEREDOC>{LABEL} {
+    zend_copy_value(zendlval, yytext, yyleng);
+    zendlval->type = IS_STRING;
+    return T_STRING;
+}
+
+<ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+    HANDLE_NEWLINES(yytext, yyleng);
+    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
+    zendlval->value.str.len = yyleng;
+    zendlval->type = IS_STRING;
+    return T_ENCAPSED_AND_WHITESPACE;
+}
 <ST_IN_SCRIPTING>{WHITESPACE} {
     zendlval->value.str.val = yytext; /* no copying - intentional */
@@ -1581,14 +1608,6 @@
 }
 }
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-    HANDLE_NEWLINES(yytext, yyleng);
-    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-    zendlval->value.str.len = yyleng;
-    zendlval->type = IS_STRING;
-    return T_ENCAPSED_AND_WHITESPACE;
-}
 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
     HANDLE_NEWLINES(yytext, yyleng);
Re: [PHP-DEV] ZEND_ADD_STRING patch
Hey,

I think this patch makes the scanner much more complicated to understand. I have an idea for a patch which would make it much cleaner. In certain rare cases it might be a tad less optimized when it comes to the number of tokens, but it would save all of the yyless() calls and state pushes, which are also not the fastest.

I suggest changing LABEL_OR_WHITESPACE (the name you gave it isn't too good, so I'd change that too :) to something like:

LABEL_OR_WHITESPACE ([a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^+/*=%!~?] | - | -[^>])+

(I removed the - and > and added possibilities of mixing them in all ways except for ->.)

This is a very small change and much, much cleaner. The only case which wouldn't be optimized is if you have ->foo in your encapsed strings, which doesn't happen too often. The speed difference would be negligible, we'd get 99% of the gain, and we'd have a much cleaner scanner with less state pushing and without rescanning input (yyless()), which is also slower.

Try it out and let me know how the results are. Also, *please* send diffs as attachments too, so that when people apply them we won't get bad whitespace in our sources.

Thanks!
Andi

At 03:23 PM 11/15/2002 -0500, George Schlossnagle wrote:
> I'm a tool. I sent the wrong patch to the list. Thanks to Andrei for pointing it out. Here is the _right_ patch (finally).
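The argument about the number of tokens can be made concrete with a small Python sketch. It is only an approximation under stated assumptions: the rule tables `OLD_RULES` and `NEW_RULES` are simplified, hypothetical stand-ins for the scanner rules inside a double-quoted string, the whitespace rule is reduced to `[ \t\n\r]+`, and the merged pattern reads the archived `-[^]` as `-[^>]`. `tokenize` is an illustrative helper, not Zend code.

```python
import re

LABEL = r"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*"

# Before: labels and whitespace runs are separate tokens.
OLD_RULES = [
    ("T_VARIABLE", re.compile(r"\$" + LABEL)),
    ("T_STRING", re.compile(LABEL)),
    ("T_ENCAPSED_AND_WHITESPACE", re.compile(r"[ \t\n\r]+")),
]

# After: one rule swallows a whole run of ordinary string text.
NEW_RULES = [
    ("T_VARIABLE", re.compile(r"\$" + LABEL)),
    ("T_STRING", re.compile(
        r"(?:[a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^+/*=%!~?@]|-[^>])+")),
]

def tokenize(text, rules):
    """Longest match wins; earlier rules win ties, as in flex."""
    tokens, pos = [], 0
    while pos < len(text):
        best = None
        for name, rx in rules:
            m = rx.match(text, pos)
            if m and m.end() > pos and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            best = ("CHAR", pos + 1)  # unmatched character is its own token
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    return tokens

sample = "Hello there $name how are you"
print(len(tokenize(sample, OLD_RULES)), "tokens with the old rules")
print(len(tokenize(sample, NEW_RULES)), "tokens with the merged rule")
```

Fewer tokens inside the string means fewer pieces for the parser to concatenate, which is the saving both patches in this thread are after.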
Re: [PHP-DEV] ZEND_ADD_STRING patch
Andi Gutmans wrote:
> Try it out and let me know how the results are. Also *please* send diffs also as attachments so that when people apply them we won't get bad whitespace in our sources.

php-dev seems to eat my attachments.
Re: [PHP-DEV] ZEND_ADD_STRING patch
I propose something like the following (not tested).
It's definitely a sexier patch :)

Andi

RCS file: /repository/ZendEngine2/zend_language_scanner.l,v
retrieving revision 1.62
diff -u -u -r1.62 zend_language_scanner.l
--- zend_language_scanner.l 5 Nov 2002 22:01:35 - 1.62
+++ zend_language_scanner.l 15 Nov 2002 23:22:34 -
@@ -474,6 +474,7 @@
 EXPONENT_DNUM (({LNUM}|{DNUM})[eE][+-]?{LNUM})
 HNUM 0x[0-9a-fA-F]+
 LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
+ENCAPSED_STRING ([a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^+/*=%!~?@]|-[^>])+
 WHITESPACE [ \n\r\t]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
@@ -1076,6 +1077,12 @@
     return T_VARIABLE;
 }
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE>{ENCAPSED_STRING} {
+    zendlval->value.str.val = (char *)estrndup(yytext, yyleng);
+    zendlval->value.str.len = yyleng;
+    zendlval->type = IS_STRING;
+    return T_STRING;
+}
 <ST_IN_SCRIPTING>{LABEL} {
     zendlval->value.str.val = (char *)estrndup(yytext, yyleng);
@@ -1085,7 +1092,7 @@
 }
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_HEREDOC>{LABEL} {
     zendlval->value.str.val = (char *)estrndup(yytext, yyleng);
     zendlval->value.str.len = yyleng;
     zendlval->type = IS_STRING;
@@ -1374,7 +1381,7 @@
 }
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+<ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
     HANDLE_NEWLINES(yytext, yyleng);
     zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
     zendlval->value.str.len = yyleng;
Re: [PHP-DEV] ZEND_ADD_STRING patch
At 06:23 PM 11/15/2002 -0500, George Schlossnagle wrote:
> Andi Gutmans wrote:
>> Try it out and let me know how the results are. Also *please* send diffs also as attachments so that when people apply them we won't get bad whitespace in our sources.
>
> php-dev seems to eat my attachments

Maybe you should try pine or mutt. They usually work.

Andi
Re: [PHP-DEV] ZEND_ADD_STRING patch
Much sexier indeed. There are some flaws with it:
 o Tokenizes heredocs on whitespace
 o Doesn't count lines correctly for debug (since strings now have newlines in them)

Here's a revised patch to yours that fixes those (heredocs are tokenized on newlines - I think that is best case).

Andi Gutmans wrote:
> I propose something like the following: (not tested)
> It's definitely a sexier patch :)
> Andi

Index: Zend/zend_language_scanner.l
===================================================================
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.54
diff -u -3 -r1.54 zend_language_scanner.l
--- Zend/zend_language_scanner.l 13 Nov 2002 03:28:23 - 1.54
+++ Zend/zend_language_scanner.l 15 Nov 2002 23:47:29 -
@@ -95,7 +95,7 @@
 \
     while (p<boundary) { \
         if (*p == '\n') { \
-            CG(zend_lineno)++; \
+            CG(zend_lineno)++; \
         } else if ((*p == '\r') && (p+1 < boundary) && (*(p+1) != '\n')) { \
             CG(zend_lineno)++; \
         } \
@@ -707,6 +707,8 @@
 EXPONENT_DNUM (({LNUM}|{DNUM})[eE][+-]?{LNUM})
 HNUM 0x[0-9a-fA-F]+
 LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
+ENCAPSED_STRING ([a-zA-Z0-9_\x7f-\xff \t #'.:;,()|^+/*=%!~?@]|-[^>])+
+ENCAPSED_STRING_WITH_NEWLINE ([a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^+/*=%!~?@]|-[^>])+
 WHITESPACE [ \n\r\t]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
@@ -1287,6 +1289,13 @@
     return T_VARIABLE;
 }
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE>{ENCAPSED_STRING_WITH_NEWLINE} {
+    HANDLE_NEWLINES(yytext, yyleng);
+    zendlval->value.str.val = (char *)estrndup(yytext, yyleng);
+    zendlval->value.str.len = yyleng;
+    zendlval->type = IS_STRING;
+    return T_STRING;
+}
 <ST_IN_SCRIPTING>{LABEL} {
     zend_copy_value(zendlval, yytext, yyleng);
@@ -1295,7 +1304,7 @@
 }
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_HEREDOC>{ENCAPSED_STRING} {
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->type = IS_STRING;
     return T_STRING;
@@ -1598,7 +1607,7 @@
 }
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+<ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
     HANDLE_NEWLINES(yytext, yyleng);
     zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
     zendlval->value.str.len = yyleng;
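The line-counting flaw mentioned above arises because a string token can now span several lines, so the rule has to run HANDLE_NEWLINES over its text. A minimal Python sketch of the counting logic follows, assuming the loop condition in the macro reads `p < boundary` and the three tests in the `else if` are joined with `&&` (the archive seems to have dropped those operators). `count_newlines` is a hypothetical name for illustration.

```python
def count_newlines(text):
    """Count line breaks as the HANDLE_NEWLINES loop appears to:
    every "\n" counts, and a "\r" counts only when another character
    follows it and that character is not "\n"."""
    lines = 0
    for i, ch in enumerate(text):
        if ch == "\n":
            lines += 1
        elif ch == "\r" and i + 1 < len(text) and text[i + 1] != "\n":
            lines += 1
    return lines

# A token spanning three lines with mixed line endings.
token = "first line\r\nsecond line\nthird line"
print(count_newlines(token))
```

Under these assumptions a `\r\n` pair is counted once, and a `\r` at the very end of a token is not counted, since the character that follows it belongs to the next token.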
Re: [PHP-DEV] ZEND_ADD_STRING patch
I committed it.

Thanks,
Andi

At 06:48 PM 11/15/2002 -0500, George Schlossnagle wrote:
> Much sexier indeed. There are some flaws with it:
>  o Tokenizes heredocs on whitespace
>  o Doesn't count lines correctly for debug (since strings now have newlines in them)
> Here's a revised patch to yours that fixes those (heredocs are tokenized on newlines - I think that is best case)
>
> Andi Gutmans wrote:
>> I propose something like the following: (not tested)
>> It's definitely a sexier patch :)
>> Andi
Re: [PHP-DEV] ZEND_ADD_STRING patch
Hi Andi,

The last patch I submitted was broken as well. Following that, I had the bright idea to run the prospective changes through the unit-tester to ensure correct performance. Here's a patch which achieves that. It does not work for heredocs (i.e. they are tokenized as before, but behave correctly from a language perspective), but otherwise optimizes correctly.

Here you go:

diff -u -3 -r1.53 zend_language_scanner.l
--- Zend/zend_language_scanner.l 8 Nov 2002 13:40:54 - 1.53
+++ Zend/zend_language_scanner.l 12 Nov 2002 22:11:31 -
@@ -37,6 +37,7 @@
 %x ST_BACKQUOTE
 %x ST_HEREDOC
 %x ST_LOOKING_FOR_PROPERTY
+%x ST_EXPECTING_OBJECT
 %x ST_LOOKING_FOR_VARNAME
 %x ST_COMMENT
 %x ST_ONE_LINE_COMMENT
@@ -692,6 +693,7 @@
 HNUM 0x[0-9a-fA-F]+
 LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -823,13 +825,22 @@
     return T_EXTENDS;
 }
-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"->" {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"$"{LABEL}"->"{LABEL} {
+    yy_push_state(ST_EXPECTING_OBJECT TSRMLS_CC);
+    yyless(0);
+}
+
+
+<ST_IN_SCRIPTING,ST_EXPECTING_OBJECT>"->" {
     yy_push_state(ST_LOOKING_FOR_PROPERTY TSRMLS_CC);
     return T_OBJECT_OPERATOR;
 }
 <ST_LOOKING_FOR_PROPERTY>{LABEL} {
     yy_pop_state(TSRMLS_C);
+    if(yy_top_state(TSRMLS_C) == ST_EXPECTING_OBJECT) {
+        yy_pop_state(TSRMLS_C);
+    }
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->value.str.len = yyleng;
     zendlval->type = IS_STRING;
@@ -1265,7 +1276,7 @@
     return T_INLINE_HTML;
 }
-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
+<ST_EXPECTING_OBJECT,ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
     zend_copy_value(zendlval, (yytext+1), (yyleng-1));
     zendlval->type = IS_STRING;
     return T_VARIABLE;
@@ -1278,13 +1289,26 @@
     return T_STRING;
 }
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE>{LABEL_OR_WHITESPACE} {
+    HANDLE_NEWLINES(yytext, yyleng);
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->type = IS_STRING;
     return T_STRING;
 }
+<ST_HEREDOC>{LABEL} {
+    zend_copy_value(zendlval, yytext, yyleng);
+    zendlval->type = IS_STRING;
+    return T_STRING;
+}
+
+<ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+    HANDLE_NEWLINES(yytext, yyleng);
+    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
+    zendlval->value.str.len = yyleng;
+    zendlval->type = IS_STRING;
+    return T_ENCAPSED_AND_WHITESPACE;
+}
 <ST_IN_SCRIPTING>{WHITESPACE} {
     zendlval->value.str.val = yytext; /* no copying - intentional */
@@ -1581,14 +1605,6 @@
 }
 }
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-    HANDLE_NEWLINES(yytext, yyleng);
-    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-    zendlval->value.str.len = yyleng;
-    zendlval->type = IS_STRING;
-    return T_ENCAPSED_AND_WHITESPACE;
-}
 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
     HANDLE_NEWLINES(yytext, yyleng);
Re: [PHP-DEV] ZEND_ADD_STRING patch
The patch I submitted included BACKQUOTES in the token matching as well. I'm not convinced that is bad, but I will try to thoroughly test it tomorrow, and if it's broken, I'll just case it for " and heredocs.

George

On Monday, November 11, 2002, at 01:56 AM, Andi Gutmans wrote:
> OH I missed that. I'll check it out this evening as I have to go now.
> Andi
>
> At 01:48 AM 11/11/2002 -0500, George Schlossnagle wrote:
>> Unless I misunderstand the way this works, it's not a problem that it returns a T_STRING, only possibly that it does so inside a BACKQUOTES. Function names and constants aren't available as barewords in DOUBLE_QUOTES or HEREDOCs, right?
>>
>> On Monday, November 11, 2002, at 01:12 AM, Andi Gutmans wrote:
>>> Hi,
>>> A patch which improves on this would be welcome. However, this patch at first glance is bogus. You are returning T_STRING with possible spaces and other non A-Za-z_ chars. This token is also used for things such as constants and function names.
>>> Andi
>>>
>>> At 06:31 PM 11/10/2002 -0500, George Schlossnagle wrote:
>>>> that would be my debugging from my 'clean' cvs copy. :) You don't want that. Sorry.
>>>>
>>>> Here's a better patch:
>>>>
>>>> Index: zend_language_scanner.l
>>>> ===================================================================
>>>> RCS file: /repository/Zend/zend_language_scanner.l,v
>>>> retrieving revision 1.51
>>>> diff -u -3 -r1.51 zend_language_scanner.l
>>>> --- zend_language_scanner.l 2 Nov 2002 16:32:26 - 1.51
>>>> +++ zend_language_scanner.l 10 Nov 2002 23:30:28 -
>>>> @@ -686,6 +686,7 @@
>>>>  HNUM 0x[0-9a-fA-F]+
>>>>  LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
>>>>  WHITESPACE [ \n\r\t]+
>>>> +LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
>>>>  TABS_AND_SPACES [ \t]*
>>>>  TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
>>>>  ENCAPSED_TOKENS [\[\]{}$]
>>>> @@ -1269,7 +1270,7 @@
>>>>  }
>>>> -<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
>>>> +<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
>>>>      zend_copy_value(zendlval, yytext, yyleng);
>>>>      zendlval->type = IS_STRING;
>>>>      return T_STRING;
>>>> @@ -1569,15 +1570,6 @@
>>>>      zendlval->type = IS_STRING;
>>>>      return T_STRING;
>>>>  }
>>>> -}
>>>> -
>>>> -
>>>> -<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
>>>> -    HANDLE_NEWLINES(yytext, yyleng);
>>>> -    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
>>>> -    zendlval->value.str.len = yyleng;
>>>> -    zendlval->type = IS_STRING;
>>>> -    return T_ENCAPSED_AND_WHITESPACE;
>>>>  }
>>>>  <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
>>>>
>>>> On Sunday, November 10, 2002, at 06:25 PM, Moriyoshi Koizumi wrote:
>>>>> --snip
>>>>>> +fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);
>>>>>
>>>>> What's this fprintf()? This seems to be put just for debugging purpose.
>>>>>
>>>>> Moriyoshi
>>>>>
>>>>>>  return T_STRING;
>>>>>>  }
>>>>>> -<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
>>>>>> +<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
>>>>>>      zend_copy_value(zendlval, yytext, yyleng);
>>>>>>      zendlval->type = IS_STRING;
>>>>>> +fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);
>>>>>>      return T_STRING;
>>>>>>  }
>>>>>> @@ -1572,6 +1573,15 @@
>>>>>>  }
>>>>>>  }
>>>>>> +
>>>>>> +<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
>>>>>> +    HANDLE_NEWLINES(yytext, yyleng);
>>>>>> +    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
>>>>>> +    zendlval->value.str.len = yyleng;
>>>>>> +    zendlval->type = IS_STRING;
>>>>>> +    return T_ENCAPSED_AND_WHITESPACE;
>>>>>> +}
>>>>>> +
>>>>>>  <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
>>>>>>      HANDLE_NEWLINES(yytext, yyleng);
>>>>>>      zend_copy_value(zendlval, yytext, yyleng);
>>>>>>
>>>>>> On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:
>>>>>>> It's the list, I don't think they allow attachments... do you have web space you could upload to?
>>>>>>>
>>>>>>> On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:
>>>>>>>> On Sun, 10 Nov 2002, George Schlossnagle wrote:
>>>>>>>>> For those who came to Dan my or Derick's talk at the Int. PHP Conference, we both covered the bad inefficiency in the parser that results in strings with variables in them being tokenized on whitespace. This results in a huge number of unnecessary opcodes in strings. Attached (hopefully, as my new MUA seems to be fickle) is a first shot at a fix to the parser to keep this from happening, so that you don't need an optimizer to clear up this issue. I've tested this locally. It still introduces a single unnecessary opcode after variable in certain cases, but it works for me.
>>>>>>>>
>>>>>>>> hmm, your MUA is getting senile :) no attachment...
>>>>>>>>
>>>>>>>> Derick
>>>>>>>
>>>>>>> ~Paul Nicholson
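The objection that T_STRING would now carry spaces and other non-label characters can be illustrated with a short Python sketch. It uses LABEL and LABEL_OR_WHITESPACE exactly as they are archived in this message (the archive may have dropped some characters from the bracket expression), and `leading_match` is a hypothetical helper. One detail worth noting: inside a bracket expression, `+-/` is read as a range from `+` to `/`, which also covers `,`, `-` and `.`.

```python
import re

LABEL = re.compile(r"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*")
LABEL_OR_WHITESPACE = re.compile(
    r"[a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+"
)

def leading_match(rx, text):
    """Return what rx matches at the start of text ('' if nothing)."""
    m = rx.match(text)
    return m.group(0) if m else ""

sample = "foo bar(1), baz-qux"

# LABEL stops at the first non-label character.
print(repr(leading_match(LABEL, sample)))

# LABEL_OR_WHITESPACE runs across spaces, punctuation and the "-".
print(repr(leading_match(LABEL_OR_WHITESPACE, sample)))

# "-" is matched only because of the "+-/" range.
print(repr(leading_match(LABEL_OR_WHITESPACE, "-")))
```

So a token returned under this pattern is no longer guaranteed to look like an identifier, which is the property the grammar relies on wherever T_STRING stands for a constant, function or property name.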
Re: [PHP-DEV] ZEND_ADD_STRING patch
Hi,

I still think the patch isn't good. encaps_list, which is the main parser rule, can parse:

  T_VARIABLE T_OBJECT_OPERATOR T_STRING

Your version of T_STRING would break this. Again, I might be missing something, but my hunch is that it would break.

Andi

At 03:06 AM 11/11/2002 -0500, George Schlossnagle wrote:
> The patch I submitted included BACKQUOTES in the token matching as well. I'm not convinced that is bad, but I will try to thoroughly test it tomorrow, and if it's broken, I'll just case it for " and heredocs.
>
> George
Re: [PHP-DEV] ZEND_ADD_STRING patch
Hi,

You're right. This patch should address that concern:

diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 - 1.51
+++ zend_language_scanner.l 11 Nov 2002 22:17:09 -
@@ -32,6 +32,7 @@
 %}
 %x ST_IN_SCRIPTING
+%x ST_EXPECTING_OBJECT
 %x ST_DOUBLE_QUOTES
 %x ST_SINGLE_QUOTE
 %x ST_BACKQUOTE
@@ -686,6 +687,7 @@
 HNUM 0x[0-9a-fA-F]+
 LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -817,7 +819,13 @@
     return T_EXTENDS;
 }
-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"->" {
+<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"$"{LABEL}"->"{LABEL} {
+yy_push_state(ST_EXPECTING_OBJECT TSRMLS_CC);
+yyless(0);
+}
+
+<ST_EXPECTING_OBJECT,ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"->" {
+fprintf(stderr, "matched T_OBJECT_OPERATOR\n");
     yy_push_state(ST_LOOKING_FOR_PROPERTY TSRMLS_CC);
     return T_OBJECT_OPERATOR;
 }
@@ -830,7 +838,7 @@
     return T_STRING;
 }
-<ST_LOOKING_FOR_PROPERTY>{ANY_CHAR} {
+<ST_EXPECTING_OBJECT,ST_LOOKING_FOR_PROPERTY>{ANY_CHAR} {
     yyless(0);
     yy_pop_state(TSRMLS_C);
 }
@@ -1255,7 +1263,7 @@
     return T_INLINE_HTML;
 }
-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
+<ST_EXPECTING_OBJECT,ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
     zend_copy_value(zendlval, (yytext+1), (yyleng-1));
     zendlval->type = IS_STRING;
     return T_VARIABLE;
@@ -1269,7 +1277,7 @@
 }
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->type = IS_STRING;
     return T_STRING;
@@ -1569,15 +1577,6 @@
     zendlval->type = IS_STRING;
     return T_STRING;
 }
-}
-
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-    HANDLE_NEWLINES(yytext, yyleng);
-    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-    zendlval->value.str.len = yyleng;
-    zendlval->type = IS_STRING;
-    return T_ENCAPSED_AND_WHITESPACE;
 }
 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {

On Monday, November 11, 2002, at 02:44 PM, Andi Gutmans wrote:

Hi,

I still think the patch isn't good. encaps_list, which is the main parser rule, can parse:

T_VARIABLE T_OBJECT_OPERATOR T_STRING

Your version of T_STRING would break this. Again, I might be missing something, but my hunch is that it would break.

Andi

At 03:06 AM 11/11/2002 -0500, George Schlossnagle wrote:

The patch I submitted included BACKQUOTES in the token matching as well. I'm not convinced that is bad, but I will try to thoroughly test it tomorrow, and if it's broken, I'll just case it for "" and heredocs.

George
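The new `"$"{LABEL}"->"{LABEL}` rule in the patch above peeks at the input, pushes a state, and then hands everything back with `yyless(0)`. As a rough illustration of what that lookahead tests for, here is a hand-written C sketch. The function names are hypothetical and are not part of the Zend scanner; the label definition is copied from the LABEL pattern shown in the diff context.

```c
#include <stddef.h>

/* First byte of a label: [a-zA-Z_\x7f-\xff], as in the LABEL pattern. */
static int is_label_start(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           c == '_' || c >= 0x7f;
}

/* Any later byte of a label: [a-zA-Z0-9_\x7f-\xff]. */
static int is_label_char(unsigned char c)
{
    return is_label_start(c) || (c >= '0' && c <= '9');
}

/* Length of the label at the start of s, or 0 if there is none. */
static size_t label_length(const char *s)
{
    size_t n;

    if (!is_label_start((unsigned char)s[0])) {
        return 0;
    }
    n = 1;
    while (is_label_char((unsigned char)s[n])) {
        n++;
    }
    return n;
}

/*
 * Hypothetical helper: returns 1 when s begins with "$label->label",
 * the shape the lookahead rule matches. Nothing is consumed; the caller
 * keeps its own position, much as yyless(0) returns the text to flex.
 */
int looks_like_property_access(const char *s)
{
    size_t n;

    if (s[0] != '$') {
        return 0;
    }
    s++;
    n = label_length(s);
    if (n == 0) {
        return 0;
    }
    s += n;
    if (s[0] != '-' || s[1] != '>') {
        return 0;
    }
    return label_length(s + 2) > 0;
}
```

With this check, `looks_like_property_access("$obj->prop rest")` is true, while `"$obj -> prop"` and `"$obj-1"` are not, which is the distinction the scanner needs before it decides whether a `-` starts an object operator or is just string text.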
Re: [PHP-DEV] ZEND_ADD_STRING patch
On Sun, 10 Nov 2002, George Schlossnagle wrote:

For those who came to Dan my or Derick's talk at the Int. PHP Conference, we both covered the bad inefficiency in the parser that results in strings with variables in them being tokenized on whitespace. This results in a huge number of unnecessary opcodes in strings. Attached (hopefully, as my new MUA seems to be fickle) is a first shot at a fix to the parser to keep this from happening, so that you don't need an optimizer to clear up this issue. I've tested this locally. It still introduces a single unnecessary opcode after variable in certain cases, but it works for me.

hmm, your MUA is getting senile :) no attachment...

Derick

--
Derick Rethans http://derickrethans.nl/
JDI Media Solutions
--[ if you hold a unix shell to your ear, do you hear the c? ]-

--
PHP Development Mailing List http://www.php.net/
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] ZEND_ADD_STRING patch
On Sunday, November 10, 2002, at 05:06 PM, George Schlossnagle wrote:

For those who came to Dan my or Derick's talk at the Int. PHP Conference, we both covered the bad inefficiency in the parser that results in strings with variables in them being tokenized on whitespace. This results in a huge number of unnecessary opcodes in strings. Attached (hopefully, as my new MUA seems to be fickle) is a first shot at a fix to the parser to keep this from happening, so that you don't need an optimizer to clear up this issue. I've tested this locally. It still introduces a single unnecessary opcode after variable in certain cases, but it works for me.

George
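To give a feel for the inefficiency George describes, here is a small C sketch that counts how many pieces an interpolated string body breaks into under two strategies: splitting literal text on whitespace, or keeping each run of literal text between variables as a single piece. It is a simplified model with hypothetical function names, not the Zend scanner; the real scanner has more token kinds, and the assumption that each extra piece costs extra compile and run-time work follows the thread rather than anything measured here.

```c
#include <stddef.h>

/* First byte of a label: [a-zA-Z_\x7f-\xff], as in the scanner's LABEL. */
static int is_label_start(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           c == '_' || c >= 0x7f;
}

/* Any later byte of a label: [a-zA-Z0-9_\x7f-\xff]. */
static int is_label_char(unsigned char c)
{
    return is_label_start(c) || (c >= '0' && c <= '9');
}

static int is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* True when "$label" begins at s[i]. */
static int starts_variable(const char *s, size_t i)
{
    return s[i] == '$' && is_label_start((unsigned char)s[i + 1]);
}

/*
 * Hypothetical model: count the pieces a string body is cut into.
 * merge_literals == 0: whitespace runs and word runs are separate pieces.
 * merge_literals != 0: all text between two variables is one piece.
 */
size_t count_pieces(const char *s, int merge_literals)
{
    size_t i = 0;
    size_t pieces = 0;

    while (s[i] != '\0') {
        if (starts_variable(s, i)) {
            i += 2;
            while (is_label_char((unsigned char)s[i])) {
                i++;
            }
        } else if (merge_literals) {
            i++; /* always consume one byte, so a lone '$' cannot stall */
            while (s[i] != '\0' && !starts_variable(s, i)) {
                i++;
            }
        } else if (is_space(s[i])) {
            while (is_space(s[i])) {
                i++;
            }
        } else {
            i++;
            while (s[i] != '\0' && !is_space(s[i]) && !starts_variable(s, i)) {
                i++;
            }
        }
        pieces++;
    }
    return pieces;
}
```

For the body `Hello $name, welcome to $place today`, `count_pieces(body, 0)` gives 12 and `count_pieces(body, 1)` gives 5. The gap between those two numbers is the kind of saving the patch is after.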
Re: [PHP-DEV] ZEND_ADD_STRING patch
It's the list, I don't think they allow attachments. Do you have web space you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:

hmm, your MUA is getting senile :) no attachment...

Derick

--
~Paul Nicholson
Design Specialist @ WebPower Design
www.webpowerdesign.net
Re: [PHP-DEV] ZEND_ADD_STRING patch
I got the second attachment mail ok but I'll just inline it here:

--- Zend/zend_language_scanner.l 2002-11-10 16:53:27.0 -0500
+++ /Users/george/src/php4/Zend/zend_language_scanner.l 2002-11-10 16:39:11.0 -0500
@@ -686,7 +686,6 @@
 HNUM 0x[0-9a-fA-F]+
 LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
-LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1266,13 +1265,15 @@
 <ST_IN_SCRIPTING>{LABEL} {
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->type = IS_STRING;
+fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);
     return T_STRING;
 }
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->type = IS_STRING;
+fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);
     return T_STRING;
 }
@@ -1572,6 +1573,15 @@
 }
 }
+
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+    HANDLE_NEWLINES(yytext, yyleng);
+    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
+    zendlval->value.str.len = yyleng;
+    zendlval->type = IS_STRING;
+    return T_ENCAPSED_AND_WHITESPACE;
+}
+
 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
     HANDLE_NEWLINES(yytext, yyleng);
     zend_copy_value(zendlval, yytext, yyleng);
Re: [PHP-DEV] ZEND_ADD_STRING patch
--snip
+fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);

What's this fprintf()? It seems to have been put in just for debugging purposes.

Moriyoshi
Re: [PHP-DEV] ZEND_ADD_STRING patch
That would be my debugging from my 'clean' cvs copy. :) You don't want that. Sorry.

Here's a better patch:

Index: zend_language_scanner.l
===================================================================
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.51
diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 - 1.51
+++ zend_language_scanner.l 10 Nov 2002 23:30:28 -
@@ -686,6 +686,7 @@
 HNUM 0x[0-9a-fA-F]+
 LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1269,7 +1270,7 @@
 }
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
     zend_copy_value(zendlval, yytext, yyleng);
     zendlval->type = IS_STRING;
     return T_STRING;
@@ -1569,15 +1570,6 @@
     zendlval->type = IS_STRING;
     return T_STRING;
 }
-}
-
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-    HANDLE_NEWLINES(yytext, yyleng);
-    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-    zendlval->value.str.len = yyleng;
-    zendlval->type = IS_STRING;
-    return T_ENCAPSED_AND_WHITESPACE;
 }
 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
Re: [PHP-DEV] ZEND_ADD_STRING patch
Hi,

A patch which improves on this would be welcome. However, this patch at first glance is bogus. You are returning T_STRING with possible spaces and other non-A-Za-z_ chars. This token is also used for names such as constants and function names.

Andi
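To make Andi's objection concrete, here is a minimal C sketch of the check that a T_STRING used as a constant or function name is expected to pass: the whole text must match the scanner's LABEL pattern. The function name is hypothetical and the code is not taken from the Zend sources; it only mirrors the LABEL definition quoted in the patch context.

```c
#include <stddef.h>

/* First byte of a label: [a-zA-Z_\x7f-\xff], as in the LABEL pattern. */
static int is_label_start(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           c == '_' || c >= 0x7f;
}

/* Any later byte of a label: [a-zA-Z0-9_\x7f-\xff]. */
static int is_label_char(unsigned char c)
{
    return is_label_start(c) || (c >= '0' && c <= '9');
}

/*
 * Hypothetical helper: returns 1 when the whole of s is a valid label,
 * i.e. something that could be a constant or function name.
 */
int is_valid_label(const char *s)
{
    size_t i;

    if (!is_label_start((unsigned char)s[0])) {
        return 0;
    }
    for (i = 1; s[i] != '\0'; i++) {
        if (!is_label_char((unsigned char)s[i])) {
            return 0;
        }
    }
    return 1;
}
```

A name like `strlen` or `MY_CONST` passes, while `hello world` or `a, b` does not. Under the widened rule the scanner could hand the parser text of the second kind under the same T_STRING token type, which is the mismatch being pointed out.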
Re: [PHP-DEV] ZEND_ADD_STRING patch
Unless I misunderstand the way this works, it's not a problem that it returns a T_STRING, only possibly that it does so inside BACKQUOTES. Function names and constants aren't available as barewords in DOUBLE_QUOTES or HEREDOCs, right?
Re: [PHP-DEV] ZEND_ADD_STRING patch
Oh, I missed that. I'll check it out this evening as I have to go now.

Andi