Re: [PHP-DEV] How deep is copy on write?
Using references does not speed up PHP. It does that already internally, if I'm not mistaken. The point of my post was that assigning values to tree arrays is in general faster than a full array copy.

Hannes

On 19 January 2011 08:36, Ben Schmidt mail_ben_schm...@yahoo.com.au wrote:
> Of course, if you go to the trouble to construct arrays using references, you can avoid some of that, because a copy-on-write will just copy the reference. It does mean you're passing references, though.

-- PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
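One rough way to check the claim that a nested write costs far less than copying the whole tree is to measure the memory a single write allocates inside a function that received the array by value. This is only a sketch: the array shape follows the thread's example, but the helper name write_deep() and the 100000-element branch are assumptions made for illustration, and the byte count will differ between PHP versions and builds.

```php
<?php
// Illustrative sketch only: write_deep() and the branch size are hypothetical.

$a = [];
$a['foo']['bar']['baz'] = 1;
$a['foo']['poink'] = range(1, 100000); // large branch that is never written to

function write_deep(array $b)
{
    $before = memory_get_usage();
    $b['foo']['bar']['baz'] = 2;          // separates $b, $b['foo'] and $b['foo']['bar']
    $grew = memory_get_usage() - $before; // bytes allocated by that one write
    return [$b, $grew];
}

list($c, $grew) = write_deep($a);

printf("caller still sees baz = %d\n", $a['foo']['bar']['baz']);
printf("returned copy has baz = %d\n", $c['foo']['bar']['baz']);
printf("bytes allocated by the nested write: %d\n", $grew);
```

If the write duplicated the large 'poink' branch, the reported growth would be in the megabyte range; a small figure suggests only the arrays along the written path were separated.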
Re: [PHP-DEV] How deep is copy on write?
What about objects?

    class Foo {
        public $foo;
    }

    function test($o) {
        $o->foo->foo->foo = 2;
    }

    $bar = new Foo;
    $bar->foo = new Foo;
    $bar->foo->foo = new Foo;
    test( $bar );

---

Also... is it better to pass an object as a parameter rather than many values?

    function withValues($anInteger, $aBool, $aString) {
        var_dump($anInteger, $aBool, $aString);
    }

    function withObject(ParamOject $o) {
        var_dump( $o->theInteger(), $o->theBool(), $o->theString() );
    }

Martin Scotta

On Wed, Jan 19, 2011 at 5:03 AM, Hannes Landeholm landeh...@gmail.com wrote:
> Using references does not speed up PHP. It does that already internally, if I'm not mistaken. The point of my post was that assigning values to tree arrays are in general faster than a full array copy.
Re: [PHP-DEV] How deep is copy on write?
On Wed, 19 Jan 2011 14:23:49 -, Martin Scotta martinsco...@gmail.com wrote:
> What about objects?

With objects less copying occurs because the object value (zval) data is actually just a pointer and an id that for most purposes works as a pointer.

However, it should be said that while a copy of an array forces more memory to be copied, the inner zvals are not actually copied. In this snippet:

    $a = array(1, 2, array(3));
    $b = $a;
    function separate(&$dummy) { }
    separate($a);

the copy that occurs when you force the separation of the zval that is shared by $a and $b ($b = $a doesn't copy the array in $a to $b, it merely copies the zval pointer of $a to $b and increments its reference count) is just a shallow copy of the hash table and an increment of the first-level zvals' refcounts. This means the zvals that have their pointers stored in the array $a's HashTable are not themselves copied.

Interestingly (or should I say, unfortunately), this happens even if the inner zvals are references. See http://php.net/manual/en/language.references.whatdo.php the part on arrays.

> class Foo { public $foo; }
> function test($o) { $o->foo->foo->foo = 2; }
> $bar = new Foo;
> $bar->foo = new Foo;
> $bar->foo->foo = new Foo;
> test( $bar );

This example shows no copying (in the sense of new zval allocation on passing or assignment) at all.

> Also... is it better to pass an object as a parameter rather than many values?
>
> function withValues($anInteger, $aBool, $aString) { var_dump($anInteger, $aBool, $aString); }
> function withObject(ParamOject $o) { var_dump( $o->theInteger(), $o->theBool(), $o->theString() ); }

It should be indifferent. In normal circumstances, there is no zval copying at all (only the pointers of arguments' symbols are copied). Only when you start throwing references into the mix will you start forcing copies.

-- 
Gustavo Lopes
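The two behaviours described in this message can be observed from userland with a short script. This is a hedged sketch rather than a view into the engine: it reuses the Foo class and test() function from the thread, while the variables $x, $arr and $copy are made up for illustration. It shows the visible effects only (the caller sees the object change, and a reference stored in an array survives a copy of that array), not the refcounts themselves.

```php
<?php
// Illustrative sketch; class Foo and test() mirror the thread's example.

class Foo
{
    public $foo;
}

function test($o)
{
    $o->foo->foo->foo = 2; // writes through the handle; no object is cloned
}

$bar = new Foo;
$bar->foo = new Foo;
$bar->foo->foo = new Foo;
test($bar);
var_dump($bar->foo->foo->foo); // the caller sees the change

// A reference stored inside an array survives a copy of that array.
$x = 1;
$arr = [&$x];  // slot 0 is bound to $x by reference
$copy = $arr;  // shares the hash table until one side is written
$copy[0] = 2;  // separates the table, but slot 0 is still bound to $x
var_dump($x, $arr[0]);
```

The second half is the case the manual page linked above warns about: copying an array does not dereference the references stored in it.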
Re: [PHP-DEV] How deep is copy on write?
So it sounds like the general answer is that if you pass a complex array to a function by value and mess with it, data is duplicated for every item you modify and its direct ancestors up to the root variable but not for the rest of the tree.

For objects, because of their pass by handle-type behavior you are (usually) modifying the same data directly so there's no duplication.

Does that sound correct?

Related: What is the overhead of a ZVal? I'm assuming it's a fixed number of bytes.

--Larry Garfield

On 1/19/11 11:27 AM, Gustavo Lopes wrote:
> With objects less copying occurs because the object value (zval) data is actually just a pointer and an id that for most purposes works as a pointer.
>
> However, it should be said that while a copy of an array forces more memory to be copied, the inner zvals are not actually copied.
Re: [PHP-DEV] How deep is copy on write?
On 19 January 2011 20:05, la...@garfieldtech.com la...@garfieldtech.com wrote:
> So it sounds like the general answer is that if you pass a complex array to a function by value and mess with it, data is duplicated for every item you modify and its direct ancestors up to the root variable but not for the rest of the tree.
>
> For objects, because of their pass by handle-type behavior you are (usually) modifying the same data directly so there's no duplication.
>
> Does that sound correct?
>
> Related: What is the overhead of a ZVal? I'm assuming it's a fixed number of bytes.

http://lmgtfy.com/?q=php+zvall=1

Regards
Peter

-- 
hype
WWW: plphp.dk / plind.dk
LinkedIn: plind
BeWelcome/Couchsurfing: Fake51
Twitter: kafe15
/hype
Re: [PHP-DEV] How deep is copy on write?
On 20/01/11 6:05 AM, la...@garfieldtech.com wrote:
> So it sounds like the general answer is that if you pass a complex array to a function by value and mess with it, data is duplicated for every item you modify and its direct ancestors up to the root variable but not for the rest of the tree.
>
> For objects, because of their pass by handle-type behavior you are (usually) modifying the same data directly so there's no duplication.
>
> Does that sound correct?

Yes.

> Related: What is the overhead of a ZVal? I'm assuming it's a fixed number of bytes.

It seems not, though a zval has a fixed size. What that size is will depend on the compiler and architecture of the system being used, or at least on the ABI.

From zend.h:

    typedef union _zvalue_value {
        long lval;                  /* long value */
        double dval;                /* double value */
        struct {
            char *val;
            int len;
        } str;
        HashTable *ht;              /* hash table value */
        zend_object_value obj;
    } zvalue_value;

    struct _zval_struct {
        /* Variable information */
        zvalue_value value;         /* value */
        zend_uint refcount__gc;
        zend_uchar type;            /* active type */
        zend_uchar is_ref__gc;
    };

The zvalue_value union will probably be 8 or 12 bytes, depending on the architecture. The whole struct will then probably be between 14 and 24 bytes, depending on the architecture and structure alignment and so on.

For my system:

    $ cd php-5.3.3
    $ ./configure
    $ cd Zend
    $ gcc -I. -I../TSRM -x c - <<END
    #include <zend.h>
    int main(void) {
        printf("%lu\n",sizeof(zval));
        return 0;
    }
    END
    $ file ./a.out
    ./a.out: Mach-O 64-bit executable
    $ ./a.out
    24
    $ gcc -I. -I../TSRM -arch i386 -x c - <<END
    #include <zend.h>
    int main(void) {
        printf("%lu\n",sizeof(zval));
        return 0;
    }
    END
    $ file ./a.out
    ./a.out: Mach-O executable i386
    $ ./a.out
    16

You can figure out what you think the overhead is from that. For a string, arguably the whole structure is overhead, since the string is stored elsewhere via pointer. Likewise for objects. For a double, the payload is 8 bytes, and stored in the zval, so there's less overhead. An integer, with a payload of 4 bytes, is somewhere in between.

Ben.
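The per-value overhead can also be estimated from PHP itself, without compiling against zend.h. The sketch below is an approximation under stated assumptions: the element count $n is arbitrary, the figure includes the array's own storage as well as the zvals, and the result depends on the PHP version (the structs quoted above are from php-5.3.3, and later major versions reworked the zval layout).

```php
<?php
// Illustrative sketch: $n and the measurement approach are assumptions.

$n = 100000;

$before = memory_get_usage();
$ints = [];
for ($i = 0; $i < $n; $i++) {
    $ints[] = $i;
}
$bytes = memory_get_usage() - $before;
$perElement = $bytes / $n;

printf("%d integers use about %.1f bytes each\n", $n, $perElement);
printf("PHP_INT_SIZE (the payload) is %d bytes\n", PHP_INT_SIZE);
```

Comparing the per-element figure with PHP_INT_SIZE gives a feel for how much of each stored integer is bookkeeping rather than payload on a given build.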
Re: [PHP-DEV] How deep is copy on write?
On Wednesday, January 19, 2011 4:45:14 pm Ben Schmidt wrote:
> > Related: What is the overhead of a ZVal? I'm assuming it's a fixed number of bytes.
>
> It seems not, though a zval has a fixed size. What that size is will depend on the compiler and architecture of the system being used, or at least on the ABI.

Ah, yes, of course. Oh C...

*snip*

> The zvalue_value union will probably be 8 or 12 bytes, depending on the architecture. The whole struct will then probably be between 14 and 24 bytes, depending on the architecture and structure alignment and so on.

*snip*

> You can figure out what you think the overhead is from that. For a string, arguably the whole structure is overhead, since the string is stored elsewhere via pointer. Likewise for objects. For a double, the payload is 8 bytes, and stored in the zval, so there's less overhead. An integer, with a payload of 4 bytes, is somewhere in between.

Hm. OK, so if I'm assuming a 64-bit architecture (most servers these days, I'd think) and just looking for a rough approximation, it sounds like 20 bytes per zval/variable is a not unreasonable estimation. At least close enough for determining the memory overhead of a general algorithm.

Thanks again!

--Larry Garfield
Re: [PHP-DEV] How deep is copy on write?
It does the whole of $b. It has to, because when you change 'baz', a reference in 'bar' needs to change to point to the newly copied 'baz', so 'bar' is written...and likewise 'foo' is written.

Ben.

On 19/01/11 5:45 PM, Larry Garfield wrote:
> Hi folks. I have a question about the PHP runtime that I hope is appropriate for this list. (If not, please thwap me gently; I bruise easily.)
>
> I know PHP does copy-on-write. However, how deeply does it copy when dealing with nested arrays?
>
> This is probably easiest to explain with an example...
>
> $a['foo']['bar']['baz'] = 1;
> $a['foo']['bar']['bob'] = 1;
> $a['foo']['bar']['narf'] = 1;
> $a['foo']['poink']['narf'] = 1;
>
> function test($b) {
>   // Assume each of the following lines in isolation...
>
>   // Does this copy just the one variable baz, or the full array?
>   $b['foo']['bar']['baz'] = 2;
>
>   // Does this copy $b, or just $b['foo']['poink']?
>   $b['foo']['poink']['stuff'] = 3;
>
>   return $b;
> }
>
> // I know this is wasteful; I'm trying to figure out just how wasteful.
> $a = test($a);
>
> test() in this case should take $b by reference, but I'm trying to determine how much of a difference it is. (In practice my use case has a vastly larger array, so any inefficiencies are multiplied.)
>
> --Larry Garfield
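For reference, the two calling styles contrasted in the question can be put side by side. This is a minimal sketch using the array from the original post; the function names by_value() and by_reference() and the variables $afterValue and $result are made up for illustration. It shows the visible semantics only, not how much memory either call touches.

```php
<?php
// Illustrative sketch: by_value() and by_reference() are hypothetical names.

function by_value(array $b)
{
    $b['foo']['bar']['baz'] = 2; // separates the local copy along the written path
    return $b;
}

function by_reference(array &$b)
{
    $b['foo']['bar']['baz'] = 2; // writes into the caller's array
}

$a = [];
$a['foo']['bar']['baz'] = 1;
$a['foo']['bar']['bob'] = 1;
$a['foo']['bar']['narf'] = 1;
$a['foo']['poink']['narf'] = 1;

$result = by_value($a);
$afterValue = $a['foo']['bar']['baz'];  // caller's array is untouched here
var_dump($afterValue, $result['foo']['bar']['baz']);

by_reference($a);
var_dump($a['foo']['bar']['baz']);      // caller's array changed in place
```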
Re: [PHP-DEV] How deep is copy on write?
That's what I was afraid of. So it does copy the entire array. Crap. :-)

Am I correct that each level in the array represents its own ZVal, with the additional memory overhead a ZVal has (however many bytes that is)?

That is, the array below would have $a, foo, bar, baz, bob, narf, poink, poink/narf = 8 ZVals? (That seems logical to me because each is its own variable that just happens to be an array, but I want to be sure.)

--Larry Garfield

On Wednesday, January 19, 2011 1:01:44 am Ben Schmidt wrote:
> It does the whole of $b. It has to, because when you change 'baz', a reference in 'bar' needs to change to point to the newly copied 'baz', so 'bar' is written...and likewise 'foo' is written.
>
> Ben.
Re: [PHP-DEV] How deep is copy on write?
Yep. PHP does clock up memory very quickly for big arrays, objects with lots of members and/or lots of small objects with large overheads. There are a LOT of zvals and zobjects and things around the place, and their overhead isn't all that small.

Of course, if you go to the trouble to construct arrays using references, you can avoid some of that, because a copy-on-write will just copy the reference. It does mean you're passing references, though.

    $bar['baz'] = 1;
    $poink['narf'] = 1;
    $a['foo']['bar'] =& $bar;
    $a['foo']['poink'] =& $poink;

Then if you test($a), $bar and $poink will be changed, since they are 'passed by reference'--no copying needs to be done. It's almost as if $b were passed by reference, but setting $b['blip'] wouldn't show up in $a, because $a itself would be copied in that case, including the references, which would continue to refer to $bar and $poink. So a much quicker copy, but obviously not the same level of isolation that you might expect or desire.

Unless you did some jiggerypokery like

    $b_bar=$b['bar']; $b['bar']=$b_bar;

which would break the reference and make a copy of just that part of the array. But this is a pretty nasty caller-callee co-operative kind of thing. Just a thought to throw into the mix, though.

Disclaimer: I'm somewhat out of my depth here. But I'm sure someone will jump on me if I'm wrong.

Ben.

On 19/01/11 6:09 PM, Larry Garfield wrote:
> That's what I was afraid of. So it does copy the entire array. Crap. :-)
>
> Am I correct that each level in the array represents its own ZVal, with the additional memory overhead a ZVal has (however many bytes that is)?
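The reference-based construction described in this message can be tried out as follows. This is a sketch under stated assumptions: it assumes the inner arrays are bound into $a with =&, the function test_detached() is a made-up name, and it uses unset() before reassigning because, as far as I can tell, a plain assignment to a slot that holds a reference writes through to the referenced variable rather than detaching it.

```php
<?php
// Illustrative sketch, assuming the thread's arrays are bound with =&.

$bar = ['baz' => 1];
$poink = ['narf' => 1];
$a = [];
$a['foo']['bar'] =& $bar;
$a['foo']['poink'] =& $poink;

function test(array $b)
{
    $b['foo']['bar']['baz'] = 2; // goes through the reference into $bar
    $b['blip'] = 1;              // only the local copy of the outer array changes
    return $b;
}

// Hypothetical helper: detach the slot before writing to it.
function test_detached(array $b)
{
    $copy = $b['foo']['bar'];    // plain value copy of the referenced array
    unset($b['foo']['bar']);     // drop the reference slot from the local copy
    $b['foo']['bar'] = $copy;    // store an ordinary value in its place
    $b['foo']['bar']['baz'] = 3; // no longer reaches $bar
    return $b;
}

$b1 = test($a);
var_dump($bar['baz'], isset($a['blip']));

$b2 = test_detached($a);
var_dump($bar['baz'], $b2['foo']['bar']['baz']);
```

The trade-off is the one noted above: the copy of the outer array is cheap, but the callee can silently change the caller's $bar and $poink unless it detaches them first.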