Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-16 Thread George Schlossnagle
Hi,

There is a problem with the patch committed.  It incorrectly tokenizes 
things like

$foo = "%-{$bar}"

(this breaks the PEAR installer, amongst other things)

I've attached a fix for it.

Also, it looks like you didn't accept the part of the fix that allows 
for enhanced handling of heredocs.  Is there a reason why?  I'm 
sticking that in this patch again, in case you merged my last change by 
hand and missed that accidentally.




On Friday, November 15, 2002, at 06:48 PM, George Schlossnagle wrote:


Much sexier indeed.  There are some flaws with it:

o  Tokenizes heredocs on whitespace
o  Doesn't count lines correctly for debug (since strings now have 
newlines in them)

Here's a revised patch to yours that fixes those (heredocs are 
tokenized on newlines - I think that is best case)


--
PHP Development Mailing List http://www.php.net/
To unsubscribe, visit: http://www.php.net/unsub.php


Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-16 Thread Andi Gutmans
At 11:35 AM 11/16/2002 -0500, George Schlossnagle wrote:

Hi,

There is a problem with the patch committed.  It incorrectly tokenizes 
things like

$foo = "%-{$bar}"

(this breaks the PEAR installer, amongst other things)

I've attached a fix for it.

Also, it looks like you didn't accept the part of the fix that allows for 
enhanced handling of heredocs.  Is there a reason why?  I'm sticking that 
in this patch again, in case you merged my last change by hand and missed 
that accidentally.

Yeah, I merged by hand (because of whitespace problems, as you don't attach 
patches).
Please send me the required patch vs. the current CVS.

Thanks,

Andi





Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-16 Thread George Schlossnagle
Here's the patch.  Looks like everything but the heredoc part is in cvs 
now.








Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-15 Thread George Schlossnagle
George Schlossnagle wrote:


I'm a tool.  I sent the wrong patch to the list.  Thanks to Andrei for 
pointing it out.  Here is the _right_ patch (finally).


diff -u -3 -r1.53 zend_language_scanner.l
--- zend_language_scanner.l	8 Nov 2002 13:40:54 -0000	1.53
+++ zend_language_scanner.l	15 Nov 2002 20:20:33 -0000
@@ -37,6 +37,7 @@
 %x ST_BACKQUOTE
 %x ST_HEREDOC
 %x ST_LOOKING_FOR_PROPERTY
+%x ST_EXPECTING_OBJECT
 %x ST_LOOKING_FOR_VARNAME
 %x ST_COMMENT
 %x ST_ONE_LINE_COMMENT
@@ -692,6 +693,7 @@
 HNUM	0x[0-9a-fA-F]+
 LABEL	[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^&+-/*=%!~<>?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^&+-/*=%!~$<>?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -823,13 +825,25 @@
 	return T_EXTENDS;
 }

-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"->" {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"$"{LABEL}"->"{LABEL} {
+	yy_push_state(ST_EXPECTING_OBJECT TSRMLS_CC);
+	yyless(0);
+}
+
+
+<ST_IN_SCRIPTING,ST_EXPECTING_OBJECT>"->" {
 	yy_push_state(ST_LOOKING_FOR_PROPERTY TSRMLS_CC);
 	return T_OBJECT_OPERATOR;
 }

 <ST_LOOKING_FOR_PROPERTY>{LABEL} {
-	yy_pop_state(TSRMLS_C);
+	if(yy_top_state(TSRMLS_C) == ST_EXPECTING_OBJECT) {
+		yy_pop_state(TSRMLS_C);
+		yy_pop_state(TSRMLS_C);
+	}
+	else {
+		yy_pop_state(TSRMLS_C);
+	}
 	zend_copy_value(zendlval, yytext, yyleng);
 	zendlval->value.str.len = yyleng;
 	zendlval->type = IS_STRING;
@@ -1265,7 +1279,7 @@
 	return T_INLINE_HTML;
 }

-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
+<ST_EXPECTING_OBJECT,ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
 	zend_copy_value(zendlval, (yytext+1), (yyleng-1));
 	zendlval->type = IS_STRING;
 	return T_VARIABLE;
@@ -1278,13 +1292,26 @@
 	return T_STRING;
 }

-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE>{LABEL_OR_WHITESPACE} {
+	HANDLE_NEWLINES(yytext, yyleng);
 	zend_copy_value(zendlval, yytext, yyleng);
 	zendlval->type = IS_STRING;
 	return T_STRING;
 }

+<ST_HEREDOC>{LABEL} {
+	zend_copy_value(zendlval, yytext, yyleng);
+	zendlval->type = IS_STRING;
+	return T_STRING;
+}
+
+<ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+	HANDLE_NEWLINES(yytext, yyleng);
+	zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
+	zendlval->value.str.len = yyleng;
+	zendlval->type = IS_STRING;
+	return T_ENCAPSED_AND_WHITESPACE;
+}

 <ST_IN_SCRIPTING>{WHITESPACE} {
 	zendlval->value.str.val = yytext; /* no copying - intentional */
@@ -1581,14 +1608,6 @@
 	}
 }

-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-	HANDLE_NEWLINES(yytext, yyleng);
-	zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-	zendlval->value.str.len = yyleng;
-	zendlval->type = IS_STRING;
-	return T_ENCAPSED_AND_WHITESPACE;
-}

 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
 	HANDLE_NEWLINES(yytext, yyleng);
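
The state handling in the patch above can be followed with a small stand-alone model. This is only a sketch: it assumes the Zend yy_push_state/yy_pop_state/yy_top_state wrappers behave like the stock flex start-condition stack (push saves the current state and switches, pop switches back to the saved one, top peeks at the saved one). The struct and the function names are made up for illustration and are not part of the scanner.

```c
#include <stddef.h>

enum state {
    ST_IN_SCRIPTING,
    ST_DOUBLE_QUOTES,
    ST_EXPECTING_OBJECT,
    ST_LOOKING_FOR_PROPERTY
};

/* Hypothetical model of a start-condition stack. */
struct scanner {
    enum state current;
    enum state stack[8];
    size_t depth;
};

/* Save the current state and switch to `next`. */
static void push_state(struct scanner *sc, enum state next)
{
    sc->stack[sc->depth++] = sc->current;
    sc->current = next;
}

/* Switch back to the most recently saved state. */
static void pop_state(struct scanner *sc)
{
    sc->current = sc->stack[--sc->depth];
}

/* Peek at the most recently saved state without switching. */
static enum state top_state(const struct scanner *sc)
{
    return sc->stack[sc->depth - 1];
}

/* Walk the states visited while scanning `$obj->prop`, starting either in
 * plain script code or inside an interpolated string. Returns the state the
 * scanner ends up in afterwards. */
enum state scan_property_access(enum state start)
{
    struct scanner sc;
    sc.current = start;
    sc.depth = 0;

    if (start != ST_IN_SCRIPTING) {
        /* "$"{LABEL}"->"{LABEL} seen inside a string: remember where we
         * were, then rescan the same input (the yyless(0) step). */
        push_state(&sc, ST_EXPECTING_OBJECT);
    }
    /* "$"{LABEL} is returned as a variable; no state change. */

    /* "->" switches to property lookup. */
    push_state(&sc, ST_LOOKING_FOR_PROPERTY);

    /* {LABEL}: leave property lookup and, if we came through
     * ST_EXPECTING_OBJECT, leave that state as well. */
    if (top_state(&sc) == ST_EXPECTING_OBJECT) {
        pop_state(&sc);
        pop_state(&sc);
    } else {
        pop_state(&sc);
    }
    return sc.current;
}
```

In this model both starting points end up back where they began, which is the point of the double pop: checking the saved state before popping is what tells the two paths apart.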






Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-15 Thread Andi Gutmans
Hey,

I think this patch makes the scanner much more complicated to understand. I 
have an idea for a patch which would make it much cleaner. In a few rare 
cases it might be a tad less optimized in terms of the number of tokens, 
but it would save all of the yyless() calls and state pushes, which are 
also not the fastest.
I suggest changing LABEL_OR_WHITESPACE (the name you gave it isn't too 
good, so I'd change that too :)
Change it to something like:
LABEL_OR_WHITESPACE ([a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^&+/*=%!~<?] | 
- | -[^>])+
(I removed the - and > and added possibilities of mixing them in all ways 
except for ->)
This is a very small change and much, much cleaner.
The only case which wouldn't be optimized is if you have ->foo in your 
encapsed strings, which doesn't happen too often. The speed difference 
would be negligible, and we'd get 99% of the gain and a much cleaner 
scanner, without rescanning input (yyless(), which is also slower) and 
with less state pushing.
Try it out and let me know how the results are. Also, *please* send diffs 
as attachments as well, so that when people apply them we won't get bad 
whitespace in our sources.

Thanks!
Andi
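
To make the idea above concrete, here is a rough stand-alone sketch of how a rule of the form (plain-char | '-' followed by anything but '>')+ consumes input. It is not the scanner code; the function names are hypothetical and the set of "plain" characters is an assumption modelled on the character class discussed in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: is c one of the "plain" characters? The set is an
 * assumption based on the class quoted in the thread; note that '-' is
 * deliberately not in it. */
static int is_plain_char(unsigned char c)
{
    if (c >= 0x7f)
        return 1;
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9') || c == '_')
        return 1;
    return c != '\0' && strchr(" \t\n\r#'.:;,()|^&+/*=%!~<>?@", c) != NULL;
}

/* Length of the longest prefix of s matched by
 *   (plain-char | '-' followed by any character other than '>')+
 * Returns 0 when the rule does not match at the start of s. */
size_t encapsed_run_len(const char *s)
{
    size_t i = 0;
    for (;;) {
        unsigned char c = (unsigned char)s[i];
        if (is_plain_char(c)) {
            i += 1;                 /* first alternative: one plain char */
        } else if (c == '-' && s[i + 1] != '\0' && s[i + 1] != '>') {
            i += 2;                 /* second alternative: '-' plus next char */
        } else {
            break;                  /* "->", '$', '{' and friends end the run */
        }
    }
    return i;
}
```

With this sketch, "hello world " is consumed as a single run of 12 characters, while "x->y" stops after the "x" so that "->" can still be picked up by its own rule. Note that the second alternative swallows whatever follows the dash: for "%-{$bar}" the run is 3 characters long and includes the "{".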








Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-15 Thread George Schlossnagle
Andi Gutmans wrote:



Try it out and let me know how the results are. Also *please* send 
diffs also as attachments so that when people apply them we won't get 
bad whitespace in our sources. 


php-dev seems to eat my attachments








Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-15 Thread Andi Gutmans
I propose something like the following: (not tested)
It's definitely a sexier patch :)

Andi

RCS file: /repository/ZendEngine2/zend_language_scanner.l,v
retrieving revision 1.62
diff -u -u -r1.62 zend_language_scanner.l
--- zend_language_scanner.l 5 Nov 2002 22:01:35 -0000   1.62
+++ zend_language_scanner.l 15 Nov 2002 23:22:34 -0000
@@ -474,6 +474,7 @@
 EXPONENT_DNUM  (({LNUM}|{DNUM})[eE][+-]?{LNUM})
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
+ENCAPSED_STRING ([a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^&+/*=%!~<>?@]|-[^>])+
 WHITESPACE [ \n\r\t]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^&+-/*=%!~$<>?@]
@@ -1076,6 +1077,12 @@
    return T_VARIABLE;
 }

+<ST_DOUBLE_QUOTES,ST_BACKQUOTE>{ENCAPSED_STRING} {
+   zendlval->value.str.val = (char *)estrndup(yytext, yyleng);
+   zendlval->value.str.len = yyleng;
+   zendlval->type = IS_STRING;
+   return T_STRING;
+}

 <ST_IN_SCRIPTING>{LABEL} {
    zendlval->value.str.val = (char *)estrndup(yytext, yyleng);
@@ -1085,7 +1092,7 @@
 }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_HEREDOC>{LABEL} {
    zendlval->value.str.val = (char *)estrndup(yytext, yyleng);
    zendlval->value.str.len = yyleng;
    zendlval->type = IS_STRING;
@@ -1374,7 +1381,7 @@
 }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+<ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
    HANDLE_NEWLINES(yytext, yyleng);
    zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
    zendlval->value.str.len = yyleng;



Andi






Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-15 Thread Andi Gutmans
At 06:23 PM 11/15/2002 -0500, George Schlossnagle wrote:

Andi Gutmans wrote:



Try it out and let me know how the results are. Also *please* send diffs 
also as attachments so that when people apply them we won't get bad 
whitespace in our sources.


php-dev seems to eat my attachments


Maybe you should try pine or mutt. They usually work.

Andi






Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-15 Thread George Schlossnagle
Much sexier indeed.  There are some flaws with it:

o  Tokenizes heredocs on whitespace
o  Doesn't count lines correctly for debug (since strings now have 
newlines in them)

Here's a revised patch to yours that fixes those (heredocs are tokenized 
on newlines - I think that is best case)
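
The line-counting point can be illustrated in isolation. The following is a sketch of the kind of bookkeeping a string token needs once it may contain line breaks; it is written as a plain function rather than the scanner's macro, and the function name is hypothetical. It counts "\n", a lone "\r", and "\r\n" as one line break each.

```c
#include <stddef.h>

/* Count the line breaks in the first len bytes of s.
 * "\n", a lone "\r" and "\r\n" each count once. A '\r' in the very last
 * byte is not counted, because the byte after it cannot be inspected. */
size_t count_line_breaks(const char *s, size_t len)
{
    const char *p = s;
    const char *boundary = s + len;
    size_t lines = 0;

    while (p < boundary) {
        if (*p == '\n') {
            lines++;
        } else if (*p == '\r' && p + 1 < boundary && *(p + 1) != '\n') {
            lines++;
        }
        p++;
    }
    return lines;
}
```

A scanner action would add the result to its running line number every time it returns a token whose text may span several lines; skipping that step is what makes reported line numbers drift.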


Andi Gutmans wrote:

I propose something like the following: (not tested)
It's definitely a sexier patch :)

Andi






Index: Zend/zend_language_scanner.l
===================================================================
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.54
diff -u -3 -r1.54 zend_language_scanner.l
--- Zend/zend_language_scanner.l	13 Nov 2002 03:28:23 -0000	1.54
+++ Zend/zend_language_scanner.l	15 Nov 2002 23:47:29 -0000
@@ -95,7 +95,7 @@
 										\
 	while (p<boundary) {							\
 		if (*p == '\n') {						\
-			CG(zend_lineno)++;					\
+		CG(zend_lineno)++;						\
 		} else if ((*p == '\r') && (p+1 < boundary) && (*(p+1) != '\n')) {	\
 			CG(zend_lineno)++;					\
 		}								\
@@ -707,6 +707,8 @@
 EXPONENT_DNUM	(({LNUM}|{DNUM})[eE][+-]?{LNUM})
 HNUM	0x[0-9a-fA-F]+
 LABEL	[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
+ENCAPSED_STRING ([a-zA-Z0-9_\x7f-\xff \t #'.:;,()|^&+/*=%!~<>?@]|-[^>])+
+ENCAPSED_STRING_WITH_NEWLINE ([a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^&+/*=%!~<>?@]|-[^>])+
 WHITESPACE [ \n\r\t]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^&+-/*=%!~$<>?@]
@@ -1287,6 +1289,13 @@
 	return T_VARIABLE;
 }

+<ST_DOUBLE_QUOTES,ST_BACKQUOTE>{ENCAPSED_STRING_WITH_NEWLINE} {
+	HANDLE_NEWLINES(yytext, yyleng);
+	zendlval->value.str.val = (char *)estrndup(yytext, yyleng);
+	zendlval->value.str.len = yyleng;
+	zendlval->type = IS_STRING;
+	return T_STRING;
+}

 <ST_IN_SCRIPTING>{LABEL} {
 	zend_copy_value(zendlval, yytext, yyleng);
@@ -1295,7 +1304,7 @@
 }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_HEREDOC>{ENCAPSED_STRING} {
 	zend_copy_value(zendlval, yytext, yyleng);
 	zendlval->type = IS_STRING;
 	return T_STRING;
@@ -1598,7 +1607,7 @@
 }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+<ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
 	HANDLE_NEWLINES(yytext, yyleng);
 	zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
 	zendlval->value.str.len = yyleng;




Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-15 Thread Andi Gutmans
I committed it.
Thanks,

Andi

At 06:48 PM 11/15/2002 -0500, George Schlossnagle wrote:

Much sexier indeed.  There are some flaws with it:

o  Tokenizes heredocs on whitespace
o  Doesn't count lines correctly for debug (since strings now have 
newlines in them)

Here's a revised patch to yours that fixes those (heredocs are tokenized 
on newlines - I think that is best case)








Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-12 Thread George Schlossnagle
Hi Andi,

The last patch I submitted was broken as well.  Following that, I had 
the bright idea to run the prospective changes through the unit-tester 
to ensure correct performance.  Here's a patch which achieves that.  It 
does not work for heredocs (i.e. they are tokenized as before, but 
behave correctly from a language perspective), but otherwise optimizes 
correctly.

Here you go:

diff -u -3 -r1.53 zend_language_scanner.l
--- Zend/zend_language_scanner.l	8 Nov 2002 13:40:54 -0000	1.53
+++ Zend/zend_language_scanner.l	12 Nov 2002 22:11:31 -0000
@@ -37,6 +37,7 @@
 %x ST_BACKQUOTE
 %x ST_HEREDOC
 %x ST_LOOKING_FOR_PROPERTY
+%x ST_EXPECTING_OBJECT
 %x ST_LOOKING_FOR_VARNAME
 %x ST_COMMENT
 %x ST_ONE_LINE_COMMENT
@@ -692,6 +693,7 @@
 HNUM	0x[0-9a-fA-F]+
 LABEL	[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \t\n\r #'.:;,()|^&+-/*=%!~<>?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^&+-/*=%!~$<>?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -823,13 +825,22 @@
 	return T_EXTENDS;
 }

-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"->" {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"$"{LABEL}"->"{LABEL} {
+	yy_push_state(ST_EXPECTING_OBJECT TSRMLS_CC);
+	yyless(0);
+}
+
+
+<ST_IN_SCRIPTING,ST_EXPECTING_OBJECT>"->" {
 	yy_push_state(ST_LOOKING_FOR_PROPERTY TSRMLS_CC);
 	return T_OBJECT_OPERATOR;
 }

 <ST_LOOKING_FOR_PROPERTY>{LABEL} {
 	yy_pop_state(TSRMLS_C);
+	if(yy_top_state(TSRMLS_C) == ST_EXPECTING_OBJECT) {
+		yy_pop_state(TSRMLS_C);
+	}
 	zend_copy_value(zendlval, yytext, yyleng);
 	zendlval->value.str.len = yyleng;
 	zendlval->type = IS_STRING;
@@ -1265,7 +1276,7 @@
 	return T_INLINE_HTML;
 }

-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
+<ST_EXPECTING_OBJECT,ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
 	zend_copy_value(zendlval, (yytext+1), (yyleng-1));
 	zendlval->type = IS_STRING;
 	return T_VARIABLE;
@@ -1278,13 +1289,26 @@
 	return T_STRING;
 }

-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE>{LABEL_OR_WHITESPACE} {
+	HANDLE_NEWLINES(yytext, yyleng);
 	zend_copy_value(zendlval, yytext, yyleng);
 	zendlval->type = IS_STRING;
 	return T_STRING;
 }

+<ST_HEREDOC>{LABEL} {
+	zend_copy_value(zendlval, yytext, yyleng);
+	zendlval->type = IS_STRING;
+	return T_STRING;
+}
+
+<ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+	HANDLE_NEWLINES(yytext, yyleng);
+	zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
+	zendlval->value.str.len = yyleng;
+	zendlval->type = IS_STRING;
+	return T_ENCAPSED_AND_WHITESPACE;
+}

 <ST_IN_SCRIPTING>{WHITESPACE} {
 	zendlval->value.str.val = yytext; /* no copying - intentional */
@@ -1581,14 +1605,6 @@
 	}
 }

-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-	HANDLE_NEWLINES(yytext, yyleng);
-	zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-	zendlval->value.str.len = yyleng;
-	zendlval->type = IS_STRING;
-	return T_ENCAPSED_AND_WHITESPACE;
-}

 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
 	HANDLE_NEWLINES(yytext, yyleng);






Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-11 Thread George Schlossnagle
The patch I submitted included BACKQUOTES in the token matching as
well.  I'm not convinced that is bad, but I will try to thoroughly test
it tomorrow, and if it's broken, I'll just case it for "" and heredocs.

George


On Monday, November 11, 2002, at 01:56 AM, Andi Gutmans wrote:

OH I missed that. I'll check it out this evening as I have to go now.

Andi

At 01:48 AM 11/11/2002 -0500, George Schlossnagle wrote:

Unless I misunderstand the way this works, it's not a problem that it  
returns a T_STRING, only possibly that it does so inside a  BACKQUOTES.
Function names and constants aren't available as barewords in  
DOUBLE_QUOTES or HEREDOCs, right?


On Monday, November 11, 2002, at 01:12 AM, Andi Gutmans wrote:

Hi,

A patch which improves on this would be welcome.
However, this patch at first glance is bogus. You are returning
T_STRING with possible spaces and other non A-Za-z_ chars. This
token is also used for things such as constants and function names.

Andi

At 06:31 PM 11/10/2002 -0500, George Schlossnagle wrote:
that would be my debugging from my 'clean' cvs copy.  :)

You don't want that.  Sorry.  Here's a better patch:

Index: zend_language_scanner.l
===================================================================
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.51
diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 -0000   1.51
+++ zend_language_scanner.l 10 Nov 2002 23:30:28 -0000
@@ -686,6 +686,7 @@
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^&+-/*=%!~<>?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^&+-/*=%!~$<>?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1269,7 +1270,7 @@
 }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
    zend_copy_value(zendlval, yytext, yyleng);
    zendlval->type = IS_STRING;
    return T_STRING;
@@ -1569,15 +1570,6 @@
        zendlval->type = IS_STRING;
        return T_STRING;
    }
-}
-
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-   HANDLE_NEWLINES(yytext, yyleng);
-   zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-   zendlval->value.str.len = yyleng;
-   zendlval->type = IS_STRING;
-   return T_ENCAPSED_AND_WHITESPACE;
 }

 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {





On Sunday, November 10, 2002, at 06:25 PM, Moriyoshi Koizumi wrote:

--snip

+fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);


What's this fprintf()? It seems to have been put there just for debugging
purposes.

Moriyoshi



 return T_STRING;
  }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
        zend_copy_value(zendlval, yytext, yyleng);
        zendlval->type = IS_STRING;
+       fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);
        return T_STRING;
  }

@@ -1572,6 +1573,15 @@
        }
  }

+
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+       HANDLE_NEWLINES(yytext, yyleng);
+       zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
+       zendlval->value.str.len = yyleng;
+       zendlval->type = IS_STRING;
+       return T_ENCAPSED_AND_WHITESPACE;
+}
+
  <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
        HANDLE_NEWLINES(yytext, yyleng);
        zend_copy_value(zendlval, yytext, yyleng);


On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It's the list, I don't think they allow attachments...do you have web
space you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:
On Sun, 10 Nov 2002, George Schlossnagle wrote:

For those who came to Dan & my or Derick's talk at the Int. PHP
Conference, we both covered the bad inefficiency in the parser that
results in strings with variables in them being tokenized on
whitespace.  This results in a huge number of unnecessary opcodes in
strings.

Attached (hopefully, as my new MUA seems to be fickle) is a first shot
at a fix to the parser to keep this from happening, so that you don't
need an optimizer to clear up this issue.  I've tested this locally.
It still introduces a single unnecessary opcode after a variable in
certain cases, but it works for me.
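
A rough way to see the cost being described: count how many pieces a stretch of literal text falls into when it is cut at every boundary between word characters and everything else, versus keeping it as one run. This is only a simplified model, assuming text with no variables or braces in it and assuming each piece ends up as its own string-append step; the function names are hypothetical and this is not the scanner's actual logic.

```c
#include <stddef.h>
#include <ctype.h>

/* Simplified notion of a "word" character: letters, digits, underscore. */
static int is_word_char(unsigned char c)
{
    return isalnum(c) || c == '_';
}

/* Number of fragments when s is cut at every switch between word
 * characters and non-word characters (spaces, punctuation). */
size_t count_split_fragments(const char *s)
{
    size_t fragments = 0;
    int prev = -1;                  /* -1: nothing seen yet */

    for (; *s != '\0'; s++) {
        int cur = is_word_char((unsigned char)*s) ? 1 : 0;
        if (cur != prev)
            fragments++;            /* a new run starts here */
        prev = cur;
    }
    return fragments;
}

/* Number of fragments when the whole text is kept as a single run. */
size_t count_merged_fragments(const char *s)
{
    return *s != '\0' ? 1 : 0;
}
```

For "the quick brown fox" the split count is 7 (four words and three gaps) against a merged count of 1, which gives a feel for how quickly the per-fragment overhead adds up in longer strings.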

hmm, your MUA is getting senile :) no attachment...

Derick


- --
~Paul Nicholson
Design Specialist @ WebPower Design
The web...the way you want it!
[EMAIL PROTECTED]
www.webpowerdesign.net

It said "uses Windows 98 or better", so I loaded Linux!
Registered Linux User #183202 using Register Linux System # 81891
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9zuZNDyXNIUN3+UQRAlYEAJ9PE5IKScOc+7/Kk1a71jJ87o7+EgCfV9z7
u+KZNZj2lZWzXmRiZmYrq4U=
=ChWV
-END PGP SIGNATURE-




Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-11 Thread Andi Gutmans
Hi,

I still think the patch isn't good. encaps_list, which is the main parser 
rule, can parse:
 T_VARIABLE T_OBJECT_OPERATOR T_STRING
Your version of T_STRING would break this.
Again, I might be missing something, but my hunch is that it would break.
Andi

At 03:06 AM 11/11/2002 -0500, George Schlossnagle wrote:
The patch I submitted included BACKQUOTES in the token matching as
well.  I'm not convinced that is bad, but I will try to thoroughly test
it tomorrow, and if it's broken, I'll just case it for "" and heredocs.

George


On Monday, November 11, 2002, at 01:56 AM, Andi Gutmans wrote:


OH I missed that. I'll check it out this evening as I have to go now.

Andi

At 01:48 AM 11/11/2002 -0500, George Schlossnagle wrote:

Unless I misunderstand the way this works, it's not a problem that it
returns a T_STRING, only possibly that it does so inside a  BACKQUOTES.
Function names and constants aren't available as barewords in
DOUBLE_QUOTES or HEREDOCs, right?


On Monday, November 11, 2002, at 01:12 AM, Andi Gutmans wrote:


Hi,

A patch which improves on this would be welcome.
However, this patch at first glance is bogus. You are returning
T_STRING with possible spaces and other non A-Za-z_ chars. This
token is also used for things such as constants and function names.

Andi

At 06:31 PM 11/10/2002 -0500, George Schlossnagle wrote:

that would be my debugging from my 'clean' cvs copy.  :)

You don't want that.  Sorry.  Here's a better patch:

Index: zend_language_scanner.l
===
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.51
diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 -   1.51
+++ zend_language_scanner.l 10 Nov 2002 23:30:28 -
@@ -686,6 +686,7 @@
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1269,7 +1270,7 @@
 }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
    zend_copy_value(zendlval, yytext, yyleng);
    zendlval->type = IS_STRING;
    return T_STRING;
@@ -1569,15 +1570,6 @@
    zendlval->type = IS_STRING;
    return T_STRING;
    }
-}
-
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-   HANDLE_NEWLINES(yytext, yyleng);
-   zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-   zendlval->value.str.len = yyleng;
-   zendlval->type = IS_STRING;
-   return T_ENCAPSED_AND_WHITESPACE;
 }

 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {





On Sunday, November 10, 2002, at 06:25 PM, Moriyoshi Koizumi wrote:


--snip

+fprintf(stderr, %s:%d\n, __FILE__,__LINE__);


What's this fprintf()? This seems to be put just for debugging
purpose.

Moriyosh




 return T_STRING;
  }


-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
 zend_copy_value(zendlval, yytext, yyleng);
 zendlval-type = IS_STRING;
+fprintf(stderr, %s:%d\n, __FILE__,__LINE__);
 return T_STRING;
  }

@@ -1572,6 +1573,15 @@
 }
  }

+
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE } {
+   HANDLE_NEWLINES(yytext, yyleng);
+   zendlval-value.str.val = (char *) estrndup(yytext,
yyleng);
+   zendlval-value.str.len = yyleng;
+   zendlval-type = IS_STRING;
+   return T_ENCAPSED_AND_WHITESPACE;
+}
+
  ST_SINGLE_QUOTE([^'\\]|\\[^'\\])+ {
 HANDLE_NEWLINES(yytext, yyleng);
 zend_copy_value(zendlval, yytext, yyleng);


On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It's the list, I don't think they allow attachmentsdo you
have web
space
you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:

On Sun, 10 Nov 2002, George Schlossnagle wrote:

For those who came to Dan  my or Derick's talk at the Int. PHP
Conference, we both covered the bad inefficiency in the parser
that
results in strings with variables in them being tokenized on
whitespace.  This results in a huge number of unnecessary
opcodes in
strings.

Attached (hopefully, as my new MUA seems to be fickle) is a
first
shot
at a fix to the parser to  keep this from happening, so that
you
don't
need an optimizer to clear up this issue.  I've tested this
locally.
It still introduces a single unnecessary opcode after variable
in
certain cases, but it works for me.


hmm, your MUA is getting senile :) no attachment...

Derick


- --
~Paul Nicholson
Design Specialist @ WebPower Design
The webthe way you want it!
[EMAIL PROTECTED]
www.webpowerdesign.net

It said uses Windows 98 or better, so I loaded Linux!
Registered Linux User #183202 

Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-11 Thread George Schlossnagle
Hi,

You're right.  This patch should address that concern:

diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 -   1.51
+++ zend_language_scanner.l 11 Nov 2002 22:17:09 -
@@ -32,6 +32,7 @@
 %}

 %x ST_IN_SCRIPTING
+%x ST_EXPECTING_OBJECT
 %x ST_DOUBLE_QUOTES
 %x ST_SINGLE_QUOTE
 %x ST_BACKQUOTE
@@ -686,6 +687,7 @@
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -817,7 +819,13 @@
    return T_EXTENDS;
 }

-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"->" {
+<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"$"{LABEL}"->"{LABEL} {
+   yy_push_state(ST_EXPECTING_OBJECT TSRMLS_CC);
+   yyless(0);
+}
+
+<ST_EXPECTING_OBJECT,ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"->" {
+   fprintf(stderr, "matched T_OBJECT_OPERATOR\n");
    yy_push_state(ST_LOOKING_FOR_PROPERTY TSRMLS_CC);
    return T_OBJECT_OPERATOR;
 }
@@ -830,7 +838,7 @@
    return T_STRING;
 }

-<ST_LOOKING_FOR_PROPERTY>{ANY_CHAR} {
+<ST_EXPECTING_OBJECT,ST_LOOKING_FOR_PROPERTY>{ANY_CHAR} {
    yyless(0);
    yy_pop_state(TSRMLS_C);
 }
@@ -1255,7 +1263,7 @@
    return T_INLINE_HTML;
 }

-<ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
+<ST_EXPECTING_OBJECT,ST_IN_SCRIPTING,ST_DOUBLE_QUOTES,ST_HEREDOC,ST_BACKQUOTE>"$"{LABEL} {
    zend_copy_value(zendlval, (yytext+1), (yyleng-1));
    zendlval->type = IS_STRING;
    return T_VARIABLE;
@@ -1269,7 +1277,7 @@
 }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
    zend_copy_value(zendlval, yytext, yyleng);
    zendlval->type = IS_STRING;
    return T_STRING;
@@ -1569,15 +1577,6 @@
    zendlval->type = IS_STRING;
    return T_STRING;
    }
-}
-
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-   HANDLE_NEWLINES(yytext, yyleng);
-   zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-   zendlval->value.str.len = yyleng;
-   zendlval->type = IS_STRING;
-   return T_ENCAPSED_AND_WHITESPACE;
 }

 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
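
The lookahead rule added in the patch above (a variable, then the object operator, then a label) only recognizes the text and switches state; yyless(0) then hands every character back so the ordinary rules re-scan it. As a rough illustration of just the recognition step, here is a sketch in plain C. It is not scanner code, and the names label_len and match_property_access are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Length of a LABEL ([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*) at s, or 0. */
static size_t label_len(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n = 0;

    if (!(isalpha(p[0]) || p[0] == '_' || p[0] >= 0x7f))
        return 0;
    for (n = 1; isalnum(p[n]) || p[n] == '_' || p[n] >= 0x7f; n++)
        ;
    return n;
}

/* Returns the number of characters covered by "$label->label" at the
 * start of s, or 0 if s does not start with that shape.  Nothing is
 * consumed here; a caller would decide what to do with the text,
 * which is the role yyless(0) plays in the patch. */
size_t match_property_access(const char *s)
{
    size_t var, prop;

    assert(s != NULL);
    if (s[0] != '$')
        return 0;
    var = label_len(s + 1);
    if (var == 0)
        return 0;
    /* The || short-circuits, so s[2 + var] is only read when the
     * previous character was '-' and therefore not the terminator. */
    if (s[1 + var] != '-' || s[2 + var] != '>')
        return 0;
    prop = label_len(s + 3 + var);
    if (prop == 0)
        return 0;
    return 1 + var + 2 + prop;
}
```

Calling match_property_access("$obj->prop rest") reports a match covering "$obj->prop", while "$obj - prop" reports none, so a minus sign in ordinary text would not trigger the state switch in this model.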



On Monday, November 11, 2002, at 02:44 PM, Andi Gutmans wrote:

Hi,

I still think the patch isn't good. encaps_list which is the main  
parser rule can parse:
 T_VARIABLE T_OBJECT_OPERATOR T_STRING
Your version of T_STRING would break this.
Again I might be missing something but my hunch is that it would break.
Andi

At 03:06 AM 11/11/2002 -0500, George Schlossnagle wrote:
The patch I submitted included BACKQUOTES in the token matching as
well.  I'm not convinced that is bad, but I will try to thoroughly  
test
it tomorrow, and if it's broken, I'll just case it for  and heredocs.

George


On Monday, November 11, 2002, at 01:56 AM, Andi Gutmans wrote:

OH I missed that. I'll check it out this evening as I have to go now.

Andi

At 01:48 AM 11/11/2002 -0500, George Schlossnagle wrote:

Unless I misunderstand the way this works, it's not a problem that  
it
returns a T_STRING, only possibly that it does so inside a   
BACKQUOTES.
Function names and constants aren't available as barewords in
DOUBLE_QUOTES or HEREDOCs, right?


On Monday, November 11, 2002, at 01:12 AM, Andi Gutmans wrote:

Hi,

A patch which improves on this would be welcome.
However, this patch at first glance is bogus. You are returning
T_STRING with possible spaces and other non A-Za-z_ chars. This
token is also used as tokens such as constants and function names .

Andi

At 06:31 PM 11/10/2002 -0500, George Schlossnagle wrote:

that would be my debugging from my 'clean' cvs copy.  :)

You don't want that.  Sorry.  Here's a better patch:

Index: zend_language_scanner.l
== 
=
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.51
diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 -
1.51
+++ zend_language_scanner.l 10 Nov 2002 23:30:28 -
@@ -686,6 +686,7 @@
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r
#'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1269,7 +1270,7 @@
 }


-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
zend_copy_value(zendlval, yytext, yyleng);
zendlval-type = IS_STRING;
return T_STRING;
@@ -1569,15 +1570,6 @@
zendlval-type = IS_STRING;
return T_STRING;
}
-}
-
-
- 
ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE}
{
-   

Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-10 Thread Derick Rethans
On Sun, 10 Nov 2002, George Schlossnagle wrote:

 For those who came to Dan's and my talk or to Derick's talk at the Int. PHP 
 Conference: we both covered the inefficiency in the parser that 
 results in strings with variables in them being tokenized on 
 whitespace.  This results in a huge number of unnecessary opcodes for 
 such strings.
 
 Attached (hopefully, as my new MUA seems to be fickle) is a first shot 
 at a fix to the parser to keep this from happening, so that you don't 
 need an optimizer to clear up this issue.  I've tested this locally.  
 It still introduces a single unnecessary opcode after a variable in 
 certain cases, but it works for me.

hmm, your MUA is getting senile :) no attachment...

Derick

-- 

---
 Derick Rethans   http://derickrethans.nl/ 
 JDI Media Solutions
--[ if you hold a unix shell to your ear, do you hear the c? ]-


-- 
PHP Development Mailing List http://www.php.net/
To unsubscribe, visit: http://www.php.net/unsub.php




Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-10 Thread George Schlossnagle


On Sunday, November 10, 2002, at 05:06 PM, George Schlossnagle wrote:


For those who came to Dan's and my talk or to Derick's talk at the Int. PHP 
Conference: we both covered the inefficiency in the parser that 
results in strings with variables in them being tokenized on 
whitespace.  This results in a huge number of unnecessary opcodes for 
such strings.

Attached (hopefully, as my new MUA seems to be fickle) is a first shot 
at a fix to the parser to keep this from happening, so that you don't 
need an optimizer to clear up this issue.  I've tested this locally.  
It still introduces a single unnecessary opcode after a variable in 
certain cases, but it works for me.

George
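
To make the cost described above concrete, here is a toy model in C. It is not the Zend scanner, and the names count_tokens_split_on_whitespace and count_tokens_merged are invented for this sketch. It counts how many pieces a double-quoted string body falls into when literal text is split at every whitespace boundary, versus when each literal run between variables is kept whole. Assuming each piece costs roughly one opcode, the gap between the two counts is the overhead in question.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* LABEL start: [a-zA-Z_\x7f-\xff] */
static int is_label_start(unsigned char c)
{
    return isalpha(c) || c == '_' || c >= 0x7f;
}

/* LABEL continuation: [a-zA-Z0-9_\x7f-\xff] */
static int is_label_char(unsigned char c)
{
    return isalnum(c) || c == '_' || c >= 0x7f;
}

/* If s[i] starts a "$name" variable, return the index just past it;
 * otherwise return i unchanged. */
static size_t skip_variable(const char *s, size_t i)
{
    if (s[i] == '$' && is_label_start((unsigned char)s[i + 1])) {
        i += 2;
        while (is_label_char((unsigned char)s[i]))
            i++;
    }
    return i;
}

/* Model A: literal text is cut at every whitespace/non-whitespace
 * boundary; each variable is one token. */
size_t count_tokens_split_on_whitespace(const char *s)
{
    size_t i = 0, n = 0;

    assert(s != NULL);
    while (s[i]) {
        size_t j = skip_variable(s, i);
        if (j == i) {
            int ws = isspace((unsigned char)s[i]) != 0;
            j = i + 1; /* always consume at least one character */
            while (s[j] && (isspace((unsigned char)s[j]) != 0) == ws
                   && skip_variable(s, j) == j)
                j++;
        }
        n++;
        i = j;
    }
    return n;
}

/* Model B: a literal run continues until the next variable. */
size_t count_tokens_merged(const char *s)
{
    size_t i = 0, n = 0;

    assert(s != NULL);
    while (s[i]) {
        size_t j = skip_variable(s, i);
        if (j == i) {
            j = i + 1;
            while (s[j] && skip_variable(s, j) == j)
                j++;
        }
        n++;
        i = j;
    }
    return n;
}
```

For the body "Hello $name, welcome to $site today", model A yields 12 pieces and model B yields 5. The real scanner has more token kinds than this, so the numbers are only indicative.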




Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-10 Thread Paul Nicholson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It's the list; I don't think they allow attachments... do you have web space 
you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:
 On Sun, 10 Nov 2002, George Schlossnagle wrote:
  For those who came to Dan  my or Derick's talk at the Int. PHP
  Conference, we both covered the bad inefficiency in the parser that
  results in strings with variables in them being tokenized on
  whitespace.  This results in a huge number of unnecessary opcodes in
  strings.
 
  Attached (hopefully, as my new MUA seems to be fickle) is a first shot
  at a fix to the parser to  keep this from happening, so that you don't
  need an optimizer to clear up this issue.  I've tested this locally.
  It still introduces a single unnecessary opcode after variable in
  certain cases, but it works for me.

 hmm, your MUA is getting senile :) no attachment...

 Derick

- -- 
~Paul Nicholson
Design Specialist @ WebPower Design
The web...the way you want it!
[EMAIL PROTECTED]
www.webpowerdesign.net

It said "uses Windows 98 or better", so I loaded Linux!
Registered Linux User #183202 using Register Linux System # 81891
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9zuZNDyXNIUN3+UQRAlYEAJ9PE5IKScOc+7/Kk1a71jJ87o7+EgCfV9z7
u+KZNZj2lZWzXmRiZmYrq4U=
=ChWV
-END PGP SIGNATURE-





Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-10 Thread George Schlossnagle
I got the second attachment mail OK, but I'll just inline it here:


--- Zend/zend_language_scanner.l 2002-11-10 16:53:27.0 -0500
+++ /Users/george/src/php4/Zend/zend_language_scanner.l 2002-11-10 16:39:11.0 -0500
@@ -686,7 +686,6 @@
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
-LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1266,13 +1265,15 @@
 <ST_IN_SCRIPTING>{LABEL} {
    zend_copy_value(zendlval, yytext, yyleng);
    zendlval->type = IS_STRING;
+   fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);
    return T_STRING;
 }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
    zend_copy_value(zendlval, yytext, yyleng);
    zendlval->type = IS_STRING;
+   fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);
    return T_STRING;
 }

@@ -1572,6 +1573,15 @@
    }
 }

+
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
+   HANDLE_NEWLINES(yytext, yyleng);
+   zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
+   zendlval->value.str.len = yyleng;
+   zendlval->type = IS_STRING;
+   return T_ENCAPSED_AND_WHITESPACE;
+}
+
 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {
    HANDLE_NEWLINES(yytext, yyleng);
    zend_copy_value(zendlval, yytext, yyleng);


On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It's the list, I don't think they allow attachmentsdo you have web 
space
you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:
On Sun, 10 Nov 2002, George Schlossnagle wrote:

For those who came to Dan  my or Derick's talk at the Int. PHP
Conference, we both covered the bad inefficiency in the parser that
results in strings with variables in them being tokenized on
whitespace.  This results in a huge number of unnecessary opcodes in
strings.

Attached (hopefully, as my new MUA seems to be fickle) is a first 
shot
at a fix to the parser to  keep this from happening, so that you 
don't
need an optimizer to clear up this issue.  I've tested this locally.
It still introduces a single unnecessary opcode after variable in
certain cases, but it works for me.

hmm, your MUA is getting senile :) no attachment...

Derick


- --
~Paul Nicholson
Design Specialist @ WebPower Design
The webthe way you want it!
[EMAIL PROTECTED]
www.webpowerdesign.net

It said uses Windows 98 or better, so I loaded Linux!
Registered Linux User #183202 using Register Linux System # 81891
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9zuZNDyXNIUN3+UQRAlYEAJ9PE5IKScOc+7/Kk1a71jJ87o7+EgCfV9z7
u+KZNZj2lZWzXmRiZmYrq4U=
=ChWV
-END PGP SIGNATURE-







Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-10 Thread Moriyoshi Koizumi
--snip
 +fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);

What's this fprintf()? It seems to be there just for debugging purposes.

Moriyoshi



  return T_STRING;
   }
 
 
 -ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
 +ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
  zend_copy_value(zendlval, yytext, yyleng);
  zendlval-type = IS_STRING;
 +fprintf(stderr, %s:%d\n, __FILE__,__LINE__);
  return T_STRING;
   }
 
 @@ -1572,6 +1573,15 @@
  }
   }
 
 +
 +ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE} {
 +   HANDLE_NEWLINES(yytext, yyleng);
 +   zendlval-value.str.val = (char *) estrndup(yytext, yyleng);
 +   zendlval-value.str.len = yyleng;
 +   zendlval-type = IS_STRING;
 +   return T_ENCAPSED_AND_WHITESPACE;
 +}
 +
   ST_SINGLE_QUOTE([^'\\]|\\[^'\\])+ {
  HANDLE_NEWLINES(yytext, yyleng);
  zend_copy_value(zendlval, yytext, yyleng);
 
 
 On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:
 
  -BEGIN PGP SIGNED MESSAGE-
  Hash: SHA1
 
  It's the list, I don't think they allow attachmentsdo you have web 
  space
  you could upload to?
 
  On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:
  On Sun, 10 Nov 2002, George Schlossnagle wrote:
  For those who came to Dan  my or Derick's talk at the Int. PHP
  Conference, we both covered the bad inefficiency in the parser that
  results in strings with variables in them being tokenized on
  whitespace.  This results in a huge number of unnecessary opcodes in
  strings.
 
  Attached (hopefully, as my new MUA seems to be fickle) is a first 
  shot
  at a fix to the parser to  keep this from happening, so that you 
  don't
  need an optimizer to clear up this issue.  I've tested this locally.
  It still introduces a single unnecessary opcode after variable in
  certain cases, but it works for me.
 
  hmm, your MUA is getting senile :) no attachment...
 
  Derick
 
  - --
  ~Paul Nicholson
  Design Specialist @ WebPower Design
  The webthe way you want it!
  [EMAIL PROTECTED]
  www.webpowerdesign.net
 
  It said uses Windows 98 or better, so I loaded Linux!
  Registered Linux User #183202 using Register Linux System # 81891
  -BEGIN PGP SIGNATURE-
  Version: GnuPG v1.0.6 (GNU/Linux)
  Comment: For info see http://www.gnupg.org
 
  iD8DBQE9zuZNDyXNIUN3+UQRAlYEAJ9PE5IKScOc+7/Kk1a71jJ87o7+EgCfV9z7
  u+KZNZj2lZWzXmRiZmYrq4U=
  =ChWV
  -END PGP SIGNATURE-
 
 




Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-10 Thread George Schlossnagle
That would be my debugging output from my 'clean' CVS copy.  :)

You don't want that.  Sorry.  Here's a better patch:

Index: zend_language_scanner.l
===
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.51
diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 -   1.51
+++ zend_language_scanner.l 10 Nov 2002 23:30:28 -
@@ -686,6 +686,7 @@
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1269,7 +1270,7 @@
 }


-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL} {
+<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{LABEL_OR_WHITESPACE} {
    zend_copy_value(zendlval, yytext, yyleng);
    zendlval->type = IS_STRING;
    return T_STRING;
@@ -1569,15 +1570,6 @@
    zendlval->type = IS_STRING;
    return T_STRING;
    }
-}
-
-
-<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>{ESCAPED_AND_WHITESPACE} {
-   HANDLE_NEWLINES(yytext, yyleng);
-   zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-   zendlval->value.str.len = yyleng;
-   zendlval->type = IS_STRING;
-   return T_ENCAPSED_AND_WHITESPACE;
 }

 <ST_SINGLE_QUOTE>([^'\\]|\\[^'\\])+ {





On Sunday, November 10, 2002, at 06:25 PM, Moriyoshi Koizumi wrote:

--snip

+fprintf(stderr, %s:%d\n, __FILE__,__LINE__);


What's this fprintf()? This seems to be put just for debugging purpose.

Moriyosh




 return T_STRING;
  }


-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
 zend_copy_value(zendlval, yytext, yyleng);
 zendlval-type = IS_STRING;
+fprintf(stderr, %s:%d\n, __FILE__,__LINE__);
 return T_STRING;
  }

@@ -1572,6 +1573,15 @@
 }
  }

+
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE} {
+   HANDLE_NEWLINES(yytext, yyleng);
+   zendlval-value.str.val = (char *) estrndup(yytext, yyleng);
+   zendlval-value.str.len = yyleng;
+   zendlval-type = IS_STRING;
+   return T_ENCAPSED_AND_WHITESPACE;
+}
+
  ST_SINGLE_QUOTE([^'\\]|\\[^'\\])+ {
 HANDLE_NEWLINES(yytext, yyleng);
 zend_copy_value(zendlval, yytext, yyleng);


On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It's the list, I don't think they allow attachmentsdo you have 
web
space
you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:
On Sun, 10 Nov 2002, George Schlossnagle wrote:

For those who came to Dan  my or Derick's talk at the Int. PHP
Conference, we both covered the bad inefficiency in the parser that
results in strings with variables in them being tokenized on
whitespace.  This results in a huge number of unnecessary opcodes 
in
strings.

Attached (hopefully, as my new MUA seems to be fickle) is a first
shot
at a fix to the parser to  keep this from happening, so that you
don't
need an optimizer to clear up this issue.  I've tested this 
locally.
It still introduces a single unnecessary opcode after variable in
certain cases, but it works for me.

hmm, your MUA is getting senile :) no attachment...

Derick


- --
~Paul Nicholson
Design Specialist @ WebPower Design
The webthe way you want it!
[EMAIL PROTECTED]
www.webpowerdesign.net

It said uses Windows 98 or better, so I loaded Linux!
Registered Linux User #183202 using Register Linux System # 81891
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9zuZNDyXNIUN3+UQRAlYEAJ9PE5IKScOc+7/Kk1a71jJ87o7+EgCfV9z7
u+KZNZj2lZWzXmRiZmYrq4U=
=ChWV
-END PGP SIGNATURE-







Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-10 Thread Andi Gutmans
Hi,

A patch which improves on this would be welcome.
However, this patch at first glance is bogus. You are returning T_STRING 
with possible spaces and other non-A-Za-z_ chars. This token is also used 
for things such as constants and function names.

Andi
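
To spell out the concern: elsewhere in the scanner T_STRING is produced by the LABEL pattern, so code that receives one for a constant or function name can presumably assume label-shaped text. A minimal sketch in C of that shape check follows. The helper is_label is hypothetical and written only for this illustration; it is not a Zend function.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if s matches LABEL in full, i.e.
 * [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*, and 0 otherwise. */
int is_label(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL);
    if (!(isalpha(*p) || *p == '_' || *p >= 0x7f))
        return 0; /* empty string or bad first character */
    for (p++; *p; p++) {
        if (!(isalnum(*p) || *p == '_' || *p >= 0x7f))
            return 0; /* space, punctuation, and so on */
    }
    return 1;
}
```

With this check, "strlen" and "E_ALL" pass, while "foo bar" and "a.b" do not. Text like the latter is what the widened pattern could hand back under the T_STRING name.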

At 06:31 PM 11/10/2002 -0500, George Schlossnagle wrote:
that would be my debugging from my 'clean' cvs copy.  :)

You don't want that.  Sorry.  Here's a better patch:

Index: zend_language_scanner.l
===
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.51
diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 -   1.51
+++ zend_language_scanner.l 10 Nov 2002 23:30:28 -
@@ -686,6 +686,7 @@
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1269,7 +1270,7 @@
 }


-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
zend_copy_value(zendlval, yytext, yyleng);
zendlval-type = IS_STRING;
return T_STRING;
@@ -1569,15 +1570,6 @@
zendlval-type = IS_STRING;
return T_STRING;
}
-}
-
-
-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE} {
-   HANDLE_NEWLINES(yytext, yyleng);
-   zendlval-value.str.val = (char *) estrndup(yytext, yyleng);
-   zendlval-value.str.len = yyleng;
-   zendlval-type = IS_STRING;
-   return T_ENCAPSED_AND_WHITESPACE;
 }

 ST_SINGLE_QUOTE([^'\\]|\\[^'\\])+ {





On Sunday, November 10, 2002, at 06:25 PM, Moriyoshi Koizumi wrote:


--snip

+fprintf(stderr, %s:%d\n, __FILE__,__LINE__);


What's this fprintf()? This seems to be put just for debugging purpose.

Moriyosh




 return T_STRING;
  }


-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
 zend_copy_value(zendlval, yytext, yyleng);
 zendlval-type = IS_STRING;
+fprintf(stderr, %s:%d\n, __FILE__,__LINE__);
 return T_STRING;
  }

@@ -1572,6 +1573,15 @@
 }
  }

+
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE} {
+   HANDLE_NEWLINES(yytext, yyleng);
+   zendlval-value.str.val = (char *) estrndup(yytext, yyleng);
+   zendlval-value.str.len = yyleng;
+   zendlval-type = IS_STRING;
+   return T_ENCAPSED_AND_WHITESPACE;
+}
+
  ST_SINGLE_QUOTE([^'\\]|\\[^'\\])+ {
 HANDLE_NEWLINES(yytext, yyleng);
 zend_copy_value(zendlval, yytext, yyleng);


On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It's the list, I don't think they allow attachmentsdo you have web
space
you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:

On Sun, 10 Nov 2002, George Schlossnagle wrote:

For those who came to Dan  my or Derick's talk at the Int. PHP
Conference, we both covered the bad inefficiency in the parser that
results in strings with variables in them being tokenized on
whitespace.  This results in a huge number of unnecessary opcodes in
strings.

Attached (hopefully, as my new MUA seems to be fickle) is a first
shot
at a fix to the parser to  keep this from happening, so that you
don't
need an optimizer to clear up this issue.  I've tested this locally.
It still introduces a single unnecessary opcode after variable in
certain cases, but it works for me.


hmm, your MUA is getting senile :) no attachment...

Derick


- --
~Paul Nicholson
Design Specialist @ WebPower Design
The webthe way you want it!
[EMAIL PROTECTED]
www.webpowerdesign.net

It said uses Windows 98 or better, so I loaded Linux!
Registered Linux User #183202 using Register Linux System # 81891
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9zuZNDyXNIUN3+UQRAlYEAJ9PE5IKScOc+7/Kk1a71jJ87o7+EgCfV9z7
u+KZNZj2lZWzXmRiZmYrq4U=
=ChWV
-END PGP SIGNATURE-






Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-10 Thread George Schlossnagle
Unless I misunderstand the way this works, it's not a problem that it 
returns a T_STRING, only possibly that it does so inside BACKQUOTES.  
Function names and constants aren't available as barewords in 
DOUBLE_QUOTES or HEREDOCs, right?


On Monday, November 11, 2002, at 01:12 AM, Andi Gutmans wrote:

Hi,

A patch which improves on this would be welcome.
However, this patch at first glance is bogus. You are returning 
T_STRING with possible spaces and other non A-Za-z_ chars. This token 
is also used as tokens such as constants and function names .

Andi

At 06:31 PM 11/10/2002 -0500, George Schlossnagle wrote:
that would be my debugging from my 'clean' cvs copy.  :)

You don't want that.  Sorry.  Here's a better patch:

Index: zend_language_scanner.l
===
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.51
diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 -   1.51
+++ zend_language_scanner.l 10 Nov 2002 23:30:28 -
@@ -686,6 +686,7 @@
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r 
#'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1269,7 +1270,7 @@
 }


-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
zend_copy_value(zendlval, yytext, yyleng);
zendlval-type = IS_STRING;
return T_STRING;
@@ -1569,15 +1570,6 @@
zendlval-type = IS_STRING;
return T_STRING;
}
-}
-
-
-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE} {
-   HANDLE_NEWLINES(yytext, yyleng);
-   zendlval-value.str.val = (char *) estrndup(yytext, yyleng);
-   zendlval-value.str.len = yyleng;
-   zendlval-type = IS_STRING;
-   return T_ENCAPSED_AND_WHITESPACE;
 }

 ST_SINGLE_QUOTE([^'\\]|\\[^'\\])+ {





On Sunday, November 10, 2002, at 06:25 PM, Moriyoshi Koizumi wrote:

--snip

+fprintf(stderr, %s:%d\n, __FILE__,__LINE__);


What's this fprintf()? This seems to be put just for debugging 
purpose.

Moriyosh



 return T_STRING;
  }


-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
 zend_copy_value(zendlval, yytext, yyleng);
 zendlval-type = IS_STRING;
+fprintf(stderr, %s:%d\n, __FILE__,__LINE__);
 return T_STRING;
  }

@@ -1572,6 +1573,15 @@
 }
  }

+
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE} 
{
+   HANDLE_NEWLINES(yytext, yyleng);
+   zendlval-value.str.val = (char *) estrndup(yytext, yyleng);
+   zendlval-value.str.len = yyleng;
+   zendlval-type = IS_STRING;
+   return T_ENCAPSED_AND_WHITESPACE;
+}
+
  ST_SINGLE_QUOTE([^'\\]|\\[^'\\])+ {
 HANDLE_NEWLINES(yytext, yyleng);
 zend_copy_value(zendlval, yytext, yyleng);


On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It's the list, I don't think they allow attachmentsdo you have 
web
space
you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:
On Sun, 10 Nov 2002, George Schlossnagle wrote:

For those who came to Dan  my or Derick's talk at the Int. PHP
Conference, we both covered the bad inefficiency in the parser 
that
results in strings with variables in them being tokenized on
whitespace.  This results in a huge number of unnecessary 
opcodes in
strings.

Attached (hopefully, as my new MUA seems to be fickle) is a first
shot
at a fix to the parser to  keep this from happening, so that you
don't
need an optimizer to clear up this issue.  I've tested this 
locally.
It still introduces a single unnecessary opcode after variable in
certain cases, but it works for me.

hmm, your MUA is getting senile :) no attachment...

Derick


- --
~Paul Nicholson
Design Specialist @ WebPower Design
The webthe way you want it!
[EMAIL PROTECTED]
www.webpowerdesign.net

It said uses Windows 98 or better, so I loaded Linux!
Registered Linux User #183202 using Register Linux System # 81891
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9zuZNDyXNIUN3+UQRAlYEAJ9PE5IKScOc+7/Kk1a71jJ87o7+EgCfV9z7
u+KZNZj2lZWzXmRiZmYrq4U=
=ChWV
-END PGP SIGNATURE-




Re: [PHP-DEV] ZEND_ADD_STRING patch

2002-11-10 Thread Andi Gutmans
Oh, I missed that. I'll check it out this evening, as I have to go now.

Andi

At 01:48 AM 11/11/2002 -0500, George Schlossnagle wrote:

Unless I misunderstand the way this works, it's not a problem that it 
returns a T_STRING, only possibly that it does so inside a BACKQUOTES.
Function names and constants aren't available as barewords in 
DOUBLE_QUOTES or HEREDOCs, right?


On Monday, November 11, 2002, at 01:12 AM, Andi Gutmans wrote:

Hi,

A patch which improves on this would be welcome.
However, this patch at first glance is bogus. You are returning T_STRING 
with possible spaces and other non A-Za-z_ chars. This token is also used 
as tokens such as constants and function names .

Andi

At 06:31 PM 11/10/2002 -0500, George Schlossnagle wrote:
that would be my debugging from my 'clean' cvs copy.  :)

You don't want that.  Sorry.  Here's a better patch:

Index: zend_language_scanner.l
===
RCS file: /repository/Zend/zend_language_scanner.l,v
retrieving revision 1.51
diff -u -3 -r1.51 zend_language_scanner.l
--- zend_language_scanner.l 2 Nov 2002 16:32:26 -   1.51
+++ zend_language_scanner.l 10 Nov 2002 23:30:28 -
@@ -686,6 +686,7 @@
 HNUM   0x[0-9a-fA-F]+
 LABEL  [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
 WHITESPACE [ \n\r\t]+
+LABEL_OR_WHITESPACE [a-zA-Z0-9_\x7f-\xff \n\t\r #'.:;,()|^+-/*=%!~?@]+
 TABS_AND_SPACES [ \t]*
 TOKENS [;:,.\[\]()|^+-/*=%!~$?@]
 ENCAPSED_TOKENS [\[\]{}$]
@@ -1269,7 +1270,7 @@
 }


-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
zend_copy_value(zendlval, yytext, yyleng);
zendlval-type = IS_STRING;
return T_STRING;
@@ -1569,15 +1570,6 @@
zendlval-type = IS_STRING;
return T_STRING;
}
-}
-
-
-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE} {
-   HANDLE_NEWLINES(yytext, yyleng);
-   zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
-   zendlval->value.str.len = yyleng;
-   zendlval->type = IS_STRING;
-   return T_ENCAPSED_AND_WHITESPACE;
 }

 ST_SINGLE_QUOTE([^'\\]|\\[^'\\])+ {





On Sunday, November 10, 2002, at 06:25 PM, Moriyoshi Koizumi wrote:


--snip

+fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);


What's this fprintf()? It seems to have been put there just for debugging
purposes.

Moriyoshi




 return T_STRING;
  }


-ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL_OR_WHITESPACE} {
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{LABEL} {
 zend_copy_value(zendlval, yytext, yyleng);
 zendlval->type = IS_STRING;
+fprintf(stderr, "%s:%d\n", __FILE__,__LINE__);
 return T_STRING;
  }

@@ -1572,6 +1573,15 @@
 }
  }

+
+ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC{ESCAPED_AND_WHITESPACE} {
+   HANDLE_NEWLINES(yytext, yyleng);
+   zendlval->value.str.val = (char *) estrndup(yytext, yyleng);
+   zendlval->value.str.len = yyleng;
+   zendlval->type = IS_STRING;
+   return T_ENCAPSED_AND_WHITESPACE;
+}
+
  ST_SINGLE_QUOTE([^'\\]|\\[^'\\])+ {
 HANDLE_NEWLINES(yytext, yyleng);
 zend_copy_value(zendlval, yytext, yyleng);


On Sunday, November 10, 2002, at 06:05 PM, Paul Nicholson wrote:


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It's the list; I don't think they allow attachments... do you have web
space you could upload to?

On Sunday 10 November 2002 05:16 pm, Derick Rethans wrote:

On Sun, 10 Nov 2002, George Schlossnagle wrote:

For those who came to Dan's and my talk, or Derick's, at the Int. PHP
Conference: we both covered the bad inefficiency in the parser that
results in strings with variables in them being tokenized on
whitespace.  This results in a huge number of unnecessary opcodes in
strings.

Attached (hopefully, as my new MUA seems to be fickle) is a first shot
at a fix to the parser to keep this from happening, so that you don't
need an optimizer to clear up this issue.  I've tested this locally.
It still introduces a single unnecessary opcode after a variable in
certain cases, but it works for me.
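
To make the cost described above concrete, here is a rough C sketch. It
is not the Zend scanner: count_tokens() and its character classes are
simplifications invented for illustration. It counts the tokens produced
for the body of a double-quoted string when label runs and
whitespace/punctuation runs are separate tokens, versus when any run of
literal text is a single token.

```c
#include <stddef.h>

/* First character of a label: letter, underscore, or byte >= 0x7f. */
static int is_label_start(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           c == '_' || c >= 0x7f;
}

/* Later characters of a label: the above plus digits. */
static int is_label_char(unsigned char c)
{
    return is_label_start(c) || (c >= '0' && c <= '9');
}

/* True if s points at '$' followed by the start of a label. */
static int at_variable(const char *s)
{
    return s[0] == '$' && is_label_start((unsigned char)s[1]);
}

/* Hypothetical model: count tokens in a double-quoted string body.
 * merge_literals == 0: label runs and other runs are separate tokens.
 * merge_literals != 0: each run of literal text is one token. */
size_t count_tokens(const char *s, int merge_literals)
{
    size_t count = 0;
    size_t i = 0;

    while (s[i] != '\0') {
        if (at_variable(s + i)) {
            i++; /* skip '$', then the variable name */
            while (is_label_char((unsigned char)s[i])) i++;
        } else if (merge_literals) {
            i++; /* literal text up to the next variable or the end */
            while (s[i] != '\0' && !at_variable(s + i)) i++;
        } else if (is_label_char((unsigned char)s[i])) {
            while (is_label_char((unsigned char)s[i])) i++;
        } else {
            i++; /* whitespace/punctuation run */
            while (s[i] != '\0' && !is_label_char((unsigned char)s[i]) &&
                   !at_variable(s + i)) i++;
        }
        count++;
    }
    return count;
}
```

Under these assumptions, the body "Hello $name, how are you today"
yields 11 tokens with the split rules and 3 with merged literals.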


hmm, your MUA is getting senile :) no attachment...

Derick


- --
~Paul Nicholson
Design Specialist @ WebPower Design
The web...the way you want it!
[EMAIL PROTECTED]
www.webpowerdesign.net

It said uses Windows 98 or better, so I loaded Linux!
Registered Linux User #183202 using Register Linux System # 81891
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9zuZNDyXNIUN3+UQRAlYEAJ9PE5IKScOc+7/Kk1a71jJ87o7+EgCfV9z7
u+KZNZj2lZWzXmRiZmYrq4U=
=ChWV
-----END PGP SIGNATURE-----



--
PHP Development Mailing List http://www.php.net/
To unsubscribe, visit: http://www.php.net/unsub.php


