Re: [PHP-DEV] concatenation operator
On 06/30/2012 04:51 PM, Johannes Schlüter wrote:
> On Sat, 2012-06-30 at 03:53 -0700, Adi Mutu wrote:
>>> Only thing that helps is learning the code structure and digging through it.
>>
>> Any hint/documentation to learn that?
>
> Use the source. ;-)
>
> A bit more seriously: No, there's no good single place to look at. Different blogs etc. look at specific pieces in detail, but the best thing to do is to look at the code (the filenames in Zend/ give a good idea what they are for ...), take a question and some time, and start digging. For some things it's also good to look into xdebug, vld, runkit, ... and see where they hook in to do their magic.
>
> And well, the path from main() in sapi/cli/php_cli.c to execute() is not that long. What happens then is a bit more complicated (though then again, once you're in, quite easy for most parts, too).
>
> johannes

There is a wiki page linking to some useful resources:
https://wiki.php.net/internals/references

Chris

--
christopher.jo...@oracle.com
http://twitter.com/#!/ghrd

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] concatenation operator
By initialization I mean the latest possible point where I can set a breakpoint, right before my script starts executing its emallocs or efrees.

> but even then you will see many things you're probably not interested in

Such as?

> Only thing that helps is learning the code structure and digging through it.

Any hint/documentation to learn that?

Thanks.

From: Johannes Schlüter johan...@schlueters.de
To: Adi Mutu adi_mut...@yahoo.com
Cc: Felipe Pena felipe...@gmail.com; PHP Developers Mailing List internals@lists.php.net
Sent: Saturday, June 30, 2012 12:36 AM
Subject: Re: [PHP-DEV] concatenation operator

On Fri, 2012-06-29 at 11:47 -0700, Adi Mutu wrote:
> Sorry for the late reply, I was away for a while. I don't think I have DTrace because I'm on Fedora, but I'll research.

As said: Currently only on Solaris, MacOS and BSD. Oracle is porting DTrace to Oracle Linux. RedHat created SystemTap, which is similar, but I have never used it.

> If I wanted to set a breakpoint after PHP's initialization process, but right before the script's execution, so that afterwards I can set breakpoints on emalloc and efree that are hit only during my script's execution, where should I set it? Hope the question was clear enough.

Depends on your view of what initialization is. But execute() might be a place which helps ... but even then you will see many things you're probably not interested in. Only thing that helps is learning the code structure and digging through it.

> DTrace related: Why have you used 'execute:return' and not concat_function:return? What's with the execute function?

That was a bug, since I quickly edited an older script. In this case it doesn't change the result.

johannes
Re: [PHP-DEV] concatenation operator
On Sat, 2012-06-30 at 03:53 -0700, Adi Mutu wrote:
> By initialization I mean the latest possible point where I can set a breakpoint, right before my script starts executing its emallocs or efrees.

Does executing include compilation? Does it include creating a stack frame etc. for the main routine? ...

>> but even then you will see many things you're probably not interested in
>
> Such as?

Well, PHP is complex; it does quite a few things in order to run a seemingly small script.

>> Only thing that helps is learning the code structure and digging through it.
>
> Any hint/documentation to learn that?

Use the source. ;-)

A bit more seriously: No, there's no good single place to look at. Different blogs etc. look at specific pieces in detail, but the best thing to do is to look at the code (the filenames in Zend/ give a good idea what they are for ...), take a question and some time, and start digging. For some things it's also good to look into xdebug, vld, runkit, ... and see where they hook in to do their magic.

And well, the path from main() in sapi/cli/php_cli.c to execute() is not that long. What happens then is a bit more complicated (though then again, once you're in, quite easy for most parts, too).

johannes
Re: [PHP-DEV] concatenation operator
Hello,

Sorry for the late reply, I was away for a while. I don't think I have DTrace because I'm on Fedora, but I'll research.

If I wanted to set a breakpoint after PHP's initialization process, but right before the script's execution, so that afterwards I can set breakpoints on emalloc and efree that are hit only during my script's execution, where should I set it? Hope the question was clear enough.

DTrace related: Why have you used 'execute:return' and not concat_function:return? What's with the execute function?

Thanks,
A.

From: Johannes Schlüter johan...@schlueters.de
To: Adi Mutu adi_mut...@yahoo.com
Cc: Felipe Pena felipe...@gmail.com; PHP Developers Mailing List internals@lists.php.net
Sent: Thursday, June 7, 2012 11:18 PM
Subject: Re: [PHP-DEV] concatenation operator

On Thu, 2012-06-07 at 12:53 -0700, Adi Mutu wrote:
> Ok Johannes, thanks for the answer. I'll try to look deeper.
> I basically just wanted to know what happens when you concatenate two strings? Which emalloc/efree calls happen?

This depends. As always. As said, what has to be done is one allocation for the result value ... and then the zval magic, which depends on refcount, references, ...

> Also, can you tell me, if possible, how to put a breakpoint on the emalloc/efree calls which are executed only after all core functions are registered? Because like this it takes a million years and a million F8 presses...

Depends on your debugger. Most allow conditional breakpoints, or you can set one breakpoint and, while stopped at that place, add a few more ...
Re: [PHP-DEV] concatenation operator
On Fri, 2012-06-29 at 11:47 -0700, Adi Mutu wrote:
> Sorry for the late reply, I was away for a while. I don't think I have DTrace because I'm on Fedora, but I'll research.

As said: Currently only on Solaris, MacOS and BSD. Oracle is porting DTrace to Oracle Linux. RedHat created SystemTap, which is similar, but I have never used it.

> If I wanted to set a breakpoint after PHP's initialization process, but right before the script's execution, so that afterwards I can set breakpoints on emalloc and efree that are hit only during my script's execution, where should I set it? Hope the question was clear enough.

Depends on your view of what initialization is. But execute() might be a place which helps ... but even then you will see many things you're probably not interested in. Only thing that helps is learning the code structure and digging through it.

> DTrace related: Why have you used 'execute:return' and not concat_function:return? What's with the execute function?

That was a bug, since I quickly edited an older script. In this case it doesn't change the result.

johannes
Re: [PHP-DEV] concatenation operator
On 13/06/12 05:26, Morgan L. Owens wrote:
> After reading the performance improvements RFC about interned strings, and its passing mention of a special data structure (e.g. zend_string) instead of char*, I've been thinking a little bit about this and what such a structure could be.
>
> But rather than interned strings, I thought that _implicit_ concatenation would be a bigger win in the long term. Like interning, it relies on strings being immutable.
>
> This zend_string is a composite type. Leaves are _almost_ identical to existing string zvals - char* val, int len - but with an additional child_count field. For leaves, child_count is zero (not incidentally indicating that it _is_ a leaf). For internal nodes, val is a list of zend_strings (child_count of them). len still refers to the total string length (the sum of the len fields of its children).
>
> So a string that has been built up through concatenation is represented by a tree (actually a dag) of zend_strings. The edges in this dag are all properly reference-counted; discarding a string decrements the reference counts of its children.

How do you list them? As a singly linked list? That would prevent reuse of the component strings in different superstrings, except for matching ends...
Re: Re: [PHP-DEV] concatenation operator
On 2012-06-15 04:00, Ángel González wrote:
> On 13/06/12 05:26, Morgan L. Owens wrote:
>> After reading the performance improvements RFC about interned strings, and its passing mention of a special data structure (e.g. zend_string) instead of char*, I've been thinking a little bit about this and what such a structure could be.
>>
>> But rather than interned strings, I thought that _implicit_ concatenation would be a bigger win in the long term. Like interning, it relies on strings being immutable.
>>
>> This zend_string is a composite type. Leaves are _almost_ identical to existing string zvals - char* val, int len - but with an additional child_count field. For leaves, child_count is zero (not incidentally indicating that it _is_ a leaf). For internal nodes, val is a list of zend_strings (child_count of them). len still refers to the total string length (the sum of the len fields of its children).
>>
>> So a string that has been built up through concatenation is represented by a tree (actually a dag) of zend_strings. The edges in this dag are all properly reference-counted; discarding a string decrements the reference counts of its children.
>
> How do you list them? As a singly linked list? That would prevent reuse of the component strings in different superstrings, except for matching ends...

I was thinking just in terms of an array (the composite would be pointing either to an array of characters or an array of strings). Mainly just because that's how I pictured it (and I haven't thought of a reason not to, since the number of children is known when the concatenated string is created, and fixed due to immutability).

Component strings aren't copied as such, only referenced. In that sense the choice of array vs. list comes down to where that reference is kept - in the parent string or the elder sibling.

Sharing common suffixes would save a number of references, but when concatenating two existing strings, the list of component references in the _prefix_ would need to be copied for the sake of whatever else is using it at the time (otherwise they would end up with the concatenated string as well).

Speaking of concatenation, unless potentially scary stuff is done, concatenating three strings is done by concatenating two of them, then concatenating the result with the third, giving a binary tree; so why am I suggesting an array of arbitrary length? Think of an implementation of PHP's join()/implode() that exploits this structure.
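To make the join()/implode() remark concrete, here is a minimal sketch in plain C, not the Zend API. The node type, its fields and the functions leaf(), join() and write_out() are all hypothetical names chosen for illustration, and leaves borrow their character pointer to keep the example short. The point it tries to show: joining n parts can produce a single internal node with 2n-1 children (the parts interleaved with one shared separator), without copying any characters.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified node layout (not PHP's actual zend_string). */
typedef struct node {
    size_t len;              /* total length in bytes */
    size_t child_count;      /* 0 marks a leaf */
    unsigned refcount;
    const char *chars;       /* leaf only; borrowed pointer in this sketch */
    struct node **children;  /* internal node only */
} node;

/* Wrap an existing C string in a leaf node. */
node *leaf(const char *s)
{
    node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->len = strlen(s);
    n->refcount = 1;
    n->chars = s;
    return n;
}

/* Build ONE internal node with 2*count-1 children: the parts with the
 * same separator node referenced between each pair. No characters move. */
node *join(node *sep, node **parts, size_t count)
{
    node *n = calloc(1, sizeof *n);
    assert(n != NULL && count > 0);
    n->refcount = 1;
    n->child_count = 2 * count - 1;
    n->children = malloc(n->child_count * sizeof *n->children);
    assert(n->children != NULL);
    for (size_t i = 0; i < n->child_count; i++) {
        node *c = (i % 2 == 0) ? parts[i / 2] : sep;
        c->refcount++;            /* the new node holds a reference */
        n->children[i] = c;
        n->len += c->len;         /* len is the sum of the children */
    }
    return n;
}

/* Copy the characters of n into out (no trailing NUL); returns bytes written. */
size_t write_out(const node *n, char *out)
{
    if (n->child_count == 0) {
        memcpy(out, n->chars, n->len);
        return n->len;
    }
    size_t off = 0;
    for (size_t i = 0; i < n->child_count; i++)
        off += write_out(n->children[i], out + off);
    return off;
}
```

Joining leaf("a"), leaf("bb") and leaf("ccc") with a leaf(", ") separator gives a node of length 10 with five children; the separator node is referenced twice rather than copied. A binary-only structure would need two nested concatenation nodes per part instead.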
Re: Re: [PHP-DEV] concatenation operator
On 2012-06-08 08:18, Johannes Schlüter wrote:
> On Thu, 2012-06-07 at 12:53 -0700, Adi Mutu wrote:
>> Ok Johannes, thanks for the answer. I'll try to look deeper.
>> I basically just wanted to know what happens when you concatenate two strings? Which emalloc/efree calls happen?
>
> This depends. As always. As said, what has to be done is one allocation for the result value ... and then the zval magic, which depends on refcount, references, ...
>
> So, when having two constant strings there's a single malloc, in this case allocating 7 bytes (strlen("foo")+strlen("bar")+1), if you have a different type it has to be converted first ...

After reading the performance improvements RFC about interned strings, and its passing mention of a special data structure (e.g. zend_string) instead of char*, I've been thinking a little bit about this and what such a structure could be.

But rather than interned strings, I thought that _implicit_ concatenation would be a bigger win in the long term. Like interning, it relies on strings being immutable.

This zend_string is a composite type. Leaves are _almost_ identical to existing string zvals - char* val, int len - but with an additional child_count field. For leaves, child_count is zero (not incidentally indicating that it _is_ a leaf). For internal nodes, val is a list of zend_strings (child_count of them). len still refers to the total string length (the sum of the len fields of its children).

So a string that has been built up through concatenation is represented by a tree (actually a dag) of zend_strings. The edges in this dag are all properly reference-counted; discarding a string decrements the reference counts of its children.

Only when the character data is needed for something does it need to be allocated for and copied into one place (the internal node can then become a leaf).
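The proposal above is close to what is usually called a rope. A minimal sketch in plain C follows, assuming a layout taken from the description (len, child_count, and a val that is either characters or children). The type name rope, the refcount field and every function name are hypothetical; nothing here is the Zend API, and malloc/free stand in for emalloc/efree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout following the description in the message above. */
typedef struct rope {
    size_t len;           /* total length in bytes */
    size_t child_count;   /* 0 marks a leaf */
    unsigned refcount;
    union {
        char *chars;             /* leaf: len bytes plus a trailing NUL */
        struct rope **children;  /* internal node: child_count entries */
    } val;
} rope;

/* Create a leaf holding its own copy of s. */
rope *rope_leaf(const char *s)
{
    rope *r = malloc(sizeof *r);
    assert(r != NULL);
    r->len = strlen(s);
    r->child_count = 0;
    r->refcount = 1;
    r->val.chars = malloc(r->len + 1);
    assert(r->val.chars != NULL);
    memcpy(r->val.chars, s, r->len + 1);
    return r;
}

/* Implicit concatenation: reference both operands, copy no characters. */
rope *rope_concat(rope *a, rope *b)
{
    rope *r = malloc(sizeof *r);
    assert(r != NULL);
    r->len = a->len + b->len;
    r->child_count = 2;
    r->refcount = 1;
    r->val.children = malloc(2 * sizeof *r->val.children);
    assert(r->val.children != NULL);
    r->val.children[0] = a; a->refcount++;
    r->val.children[1] = b; b->refcount++;
    return r;
}

/* Drop one reference; at zero, release the children and free the node. */
void rope_release(rope *r)
{
    if (--r->refcount > 0) return;
    if (r->child_count == 0) {
        free(r->val.chars);
    } else {
        for (size_t i = 0; i < r->child_count; i++)
            rope_release(r->val.children[i]);
        free(r->val.children);
    }
    free(r);
}

/* Copy the characters of r into out (no NUL); returns bytes written. */
static size_t rope_copy(const rope *r, char *out)
{
    if (r->child_count == 0) {
        memcpy(out, r->val.chars, r->len);
        return r->len;
    }
    size_t off = 0;
    for (size_t i = 0; i < r->child_count; i++)
        off += rope_copy(r->val.children[i], out + off);
    return off;
}

/* Materialize the characters on demand; the internal node becomes a leaf. */
const char *rope_flatten(rope *r)
{
    if (r->child_count > 0) {
        char *buf = malloc(r->len + 1);
        assert(buf != NULL);
        size_t n = rope_copy(r, buf);
        buf[n] = '\0';
        for (size_t i = 0; i < r->child_count; i++)
            rope_release(r->val.children[i]);
        free(r->val.children);
        r->val.chars = buf;
        r->child_count = 0;
    }
    return r->val.chars;
}
```

Concatenating a node with itself (rope_concat(ab, ab)) shows why the structure is a dag rather than a tree: the same child is referenced twice. The single allocation of len + 1 bytes is deferred until rope_flatten() is called.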
Re: [PHP-DEV] concatenation operator
That's nice, but I haven't understood a thing... I know something about PHP core and PHP extensions, but nothing specific to the Zend Engine.

Can you point me to some resources on this topic?

Thanks,

From: Felipe Pena felipe...@gmail.com
To: Adi Mutu adi_mut...@yahoo.com
Cc: PHP Developers Mailing List internals@lists.php.net
Sent: Tuesday, June 5, 2012 11:17 PM
Subject: Re: [PHP-DEV] concatenation operator

Hi,

2012/6/5 Adi Mutu adi_mut...@yahoo.com:
> Hello,
>
> Can somebody point me to where the concatenation operator (the . operator) is implemented?
>
> Thanks,

See http://lxr.php.net/xref/PHP_TRUNK/Zend/zend_vm_def.h#133

--
Regards,
Felipe Pena
Re: [PHP-DEV] concatenation operator
On Thu, 2012-06-07 at 11:50 -0700, Adi Mutu wrote:
> That's nice, but I haven't understood a thing... I know something about PHP core and PHP extensions, but nothing specific to the Zend Engine.

The mentioned place is directly in the VM, which in general is harder to understand, but it points to concat_function at http://lxr.php.net/xref/PHP_TRUNK/Zend/zend_operators.c#1234

Knowing basic C should be enough to understand the code there. The actual algorithm can also easily be guessed: allocate a buffer which can hold both strings at once and copy them over. The code is a tiny bit more complex, as it tries to reuse an existing buffer rather than allocating something completely new.

The question is: What do you actually want to know?

> Can you point me to some resources on this topic?

Unfortunately not. The source is the best documentation we have for that.

johannes
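A minimal sketch of the algorithm as described above, in plain C rather than the Zend API. The names sketch_str, sketch_concat and sketch_concat_inplace are hypothetical, and malloc/realloc stand in for emalloc/erealloc. The real concat_function works on zvals and also deals with type conversion and reference counts, none of which is modeled here; the in-place variant is only an assumption about what "reuse an existing buffer" could look like.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string value: bytes plus an explicit length. */
typedef struct {
    char  *val;
    size_t len;
} sketch_str;

/* Concatenate a and b into a fresh buffer: one allocation of
 * a.len + b.len + 1 bytes (the +1 is the trailing NUL), then two copies. */
sketch_str sketch_concat(sketch_str a, sketch_str b)
{
    sketch_str r;
    r.len = a.len + b.len;
    r.val = malloc(r.len + 1);
    assert(r.val != NULL);
    memcpy(r.val, a.val, a.len);
    memcpy(r.val + a.len, b.val, b.len);
    r.val[r.len] = '\0';
    return r;
}

/* Variant that reuses the left operand's buffer: grow it and append b.
 * Assumes a->val was heap-allocated and b.val does not point into it. */
void sketch_concat_inplace(sketch_str *a, sketch_str b)
{
    char *p = realloc(a->val, a->len + b.len + 1);
    assert(p != NULL);
    memcpy(p + a->len, b.val, b.len);
    a->len += b.len;
    p[a->len] = '\0';
    a->val = p;
}
```

With a = {"foo", 3} and b = {"bar", 3}, sketch_concat() allocates 7 bytes and returns "foobar" with len 6. The in-place variant avoids a second buffer when the result is stored back into the left operand.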
Re: [PHP-DEV] concatenation operator
Ok Johannes, thanks for the answer. I'll try to look deeper.

I basically just wanted to know what happens when you concatenate two strings? Which emalloc/efree calls happen?

Also, can you tell me, if possible, how to put a breakpoint on the emalloc/efree calls which are executed only after all core functions are registered? Because like this it takes a million years and a million F8 presses...

Thanks.

From: Johannes Schlüter johan...@schlueters.de
To: Adi Mutu adi_mut...@yahoo.com
Cc: Felipe Pena felipe...@gmail.com; PHP Developers Mailing List internals@lists.php.net
Sent: Thursday, June 7, 2012 10:44 PM
Subject: Re: [PHP-DEV] concatenation operator

On Thu, 2012-06-07 at 11:50 -0700, Adi Mutu wrote:
> That's nice, but I haven't understood a thing... I know something about PHP core and PHP extensions, but nothing specific to the Zend Engine.

The mentioned place is directly in the VM, which in general is harder to understand, but it points to concat_function at http://lxr.php.net/xref/PHP_TRUNK/Zend/zend_operators.c#1234

Knowing basic C should be enough to understand the code there. The actual algorithm can also easily be guessed: allocate a buffer which can hold both strings at once and copy them over. The code is a tiny bit more complex, as it tries to reuse an existing buffer rather than allocating something completely new.

The question is: What do you actually want to know?

> Can you point me to some resources on this topic?

Unfortunately not. The source is the best documentation we have for that.

johannes
Re: [PHP-DEV] concatenation operator
On Thu, 2012-06-07 at 12:53 -0700, Adi Mutu wrote:
> Ok Johannes, thanks for the answer. I'll try to look deeper.
> I basically just wanted to know what happens when you concatenate two strings? Which emalloc/efree calls happen?

This depends. As always. As said, what has to be done is one allocation for the result value ... and then the zval magic, which depends on refcount, references, ...

> Also, can you tell me, if possible, how to put a breakpoint on the emalloc/efree calls which are executed only after all core functions are registered? Because like this it takes a million years and a million F8 presses...

Depends on your debugger. Most allow conditional breakpoints, or you can set one breakpoint and, while stopped at that place, add a few more ...

For such a question my preference is using DTrace (on Solaris, Mac or BSD), something like this session:

    $ cat test.d
    #!/sbin/dtrace
    /* set a per-thread flag on entering concat_function */
    pid$target::concat_function:entry {
        self->in_concat = 1;
    }
    /* clear the flag when execute() returns */
    pid$target::execute:return {
        self->in_concat = 0;
    }
    /* while the flag is set: print the requested size and the user stack */
    pid$target::_emalloc:entry
    / self->in_concat /
    {
        trace(arg0);
        ustack();
    }
    /* while the flag is set: print old pointer, new size and the user stack */
    pid$target::_erealloc:entry
    / self->in_concat /
    {
        trace(arg0);
        trace(arg1);
        ustack();
    }

    $ cat test1.php
    <?php
    $a = "foo";
    $b = "bar";
    $a.$b;

    $ dtrace -s test.d -c 'php test1.php'
    dtrace: script 'test.d' matched 4 probes
    dtrace: pid 16406 has exited
    CPU     ID                    FUNCTION:NAME
      3 100372                 _emalloc:entry                 7
                  php`_emalloc
                  php`concat_function+0x270
                  php`ZEND_CONCAT_SPEC_CV_CV_HANDLER+0xcd
                  php`execute+0x3d9
                  php`dtrace_execute+0xe7
                  php`zend_execute_scripts+0xf5
                  php`php_execute_script+0x2e8
                  php`do_cli+0x864
                  php`main+0x6e2
                  php`_start+0x83

    $ cat test2.php
    <?php
    $a = 23;
    $b = "bar";
    $a.$b;

    $ dtrace -s test.d -c 'php test2.php'
    dtrace: script 'test.d' matched 4 probes
    dtrace: pid 16425 has exited
    CPU     ID                    FUNCTION:NAME
      1 100373                _erealloc:entry                 0                79
                  php`_erealloc
                  php`xbuf_format_converter+0x11ee
                  php`vspprintf+0x34
                  php`zend_spprintf+0x2f
                  php`_convert_to_string+0x174
                  php`zend_make_printable_zval+0x5ec
                  php`concat_function+0x3c
                  php`ZEND_CONCAT_SPEC_CV_CV_HANDLER+0xcd
                  php`execute+0x3d9
                  php`dtrace_execute+0xe7
                  php`zend_execute_scripts+0xf5
                  php`php_execute_script+0x2e8
                  php`do_cli+0x864
                  php`main+0x6e2
                  php`_start+0x83

      1 100372                 _emalloc:entry                 6
                  php`_emalloc
                  php`concat_function+0x270
                  php`ZEND_CONCAT_SPEC_CV_CV_HANDLER+0xcd
                  php`execute+0x3d9
                  php`dtrace_execute+0xe7
                  php`zend_execute_scripts+0xf5
                  php`php_execute_script+0x2e8
                  php`do_cli+0x864
                  php`main+0x6e2
                  php`_start+0x83

So, when having two constant strings there's a single malloc, in this case allocating 7 bytes (strlen("foo")+strlen("bar")+1). If you have a different type, it has to be converted first ...

johannes
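The two result sizes in the traces above (7 bytes for "foo" . "bar", 6 bytes for 23 . "bar") can be checked with a small plain-C sketch. The function names concat_alloc_size and concat_long_str are hypothetical, snprintf only approximates the engine's integer-to-string conversion, and malloc stands in for emalloc; the 79-byte _erealloc seen in the second trace belongs to the engine's formatting buffer and is not modeled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Bytes needed for the concatenation result, including the trailing NUL. */
size_t concat_alloc_size(const char *a, const char *b)
{
    return strlen(a) + strlen(b) + 1;
}

/* Concatenate an integer and a string: convert the integer to its decimal
 * text first, then allocate the result once and copy both parts into it. */
char *concat_long_str(long n, const char *s)
{
    char num[32];
    int numlen = snprintf(num, sizeof num, "%ld", n);
    assert(numlen > 0 && (size_t)numlen < sizeof num);

    size_t slen = strlen(s);
    char *r = malloc((size_t)numlen + slen + 1);
    assert(r != NULL);
    memcpy(r, num, (size_t)numlen);
    memcpy(r + numlen, s, slen + 1);   /* copies the trailing NUL as well */
    return r;
}
```

concat_alloc_size("foo", "bar") gives 7 and concat_alloc_size("23", "bar") gives 6, matching the values printed by trace(arg0) for the two _emalloc calls.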
Re: [PHP-DEV] concatenation operator
Hi,

2012/6/5 Adi Mutu adi_mut...@yahoo.com:
> Hello,
>
> Can somebody point me to where the concatenation operator (the . operator) is implemented?
>
> Thanks,

See http://lxr.php.net/xref/PHP_TRUNK/Zend/zend_vm_def.h#133

--
Regards,
Felipe Pena