Edit report at http://bugs.php.net/bug.php?id=49374&edit=1

 ID:               49374
 Comment by:       grzegorz dot drozd at esky dot pl
 Reported by:      wmeler at wp-sa dot pl
 Summary:          [PATCH] serialization adds random references
 Status:           Open
 Type:             Bug
 Package:          Strings related
 Operating System: *
 PHP Version:      5.*, 6

 New Comment:

Hello.



It seems to me that I just encountered this error. By implementing a new
service which includes a series (more than 70) objects each of which
contains an array of objects I noticed the appearance of random objects
from another part of the data.

The data structure is as follows:



List

 item 1

 item 2

 item 3

   subitem 1

   subitem 2

   subitem 3

      element *

      element *

      element *



Objects marked with an asterisk appear in other places in this structure
with a semi random frequency. Serialized session has a size of about 2-3
MB.



All the elements are objects with very similar name:

List_item_subitem_element or List_item_subitem etc.



There are about 70 instances of item with 2-10 instances of subitem and
2-4 instances of element.


Previous Comments:
------------------------------------------------------------------------
[2009-12-29 19:31:22] [email protected]

Never heard such thing as putting the patch as file somewhere where it
can be downloaded from? Don't be so arrogant, we might even commit your
patches sometime.

------------------------------------------------------------------------
[2009-08-27 06:42:11] wmeler at wp-sa dot pl

I can't provide you simple script to reproduce the problem, as it
depends on everything what is on heap - environment variables,
compilation, memory manager etc.

Why didn't I send following patch? because your bug reporting form has
no such input ... You also state: "Please do not insert any huge text
here. Keep the description as short as possible. If we want to get a
strace, we will ask for it separately."

Feel free to change bug reporting form. I hope that patch will fit in
comment...



--- ext/standard/var.c.orig     2009-08-27 09:18:53.628753000 +0200

+++ ext/standard/var.c  2009-08-27 09:39:06.932120000 +0200

@@ -470,25 +470,23 @@

 static inline int php_add_var_hash(HashTable *var_hash, zval *var, void
*var_old TSRMLS_DC) /* {{{ */

 {

        ulong var_no;

-       char id[32], *p;

+       char id[32];

        register int len;

 

-       /* relies on "(long)" being a perfect hash function for data
pointers,

-        * however the actual identity of an object has had to be determined

+       /* pointers to variables stored in binary hash keys

+        * the actual identity of an object has had to be determined

         * by its object handle and the class entry since 5.0. */

        if ((Z_TYPE_P(var) == IS_OBJECT) && Z_OBJ_HT_P(var)->get_class_entry)
{

-               p = smart_str_print_long(id + sizeof(id) - 1,

-                               (((size_t)Z_OBJCE_P(var) << 5)

-                               | ((size_t)Z_OBJCE_P(var) >> (sizeof(long) * 8 
- 5)))

-                               + (long) Z_OBJ_HANDLE_P(var));

-               *(--p) = 'O';

-               len = id + sizeof(id) - 1 - p;

+               zend_class_entry *ce = Z_OBJCE_P(var); 

+               memcpy(id,&ce,sizeof(void *));

+               
memcpy(id+sizeof(void*),&(Z_OBJ_HANDLE_P(var)),sizeof(zend_object_handle));

+               len = sizeof(void*)+sizeof(zend_object_handle);

        } else {

-               p = smart_str_print_long(id + sizeof(id) - 1, (long) var);

-               len = id + sizeof(id) - 1 - p;

+               memcpy(id,&var,sizeof(void *));

+               len = sizeof(void *);

        }

 

-       if (var_old && zend_hash_find(var_hash, p, len, var_old) == SUCCESS)
{

+       if (var_old && zend_hash_find(var_hash, id, len, var_old) == SUCCESS)
{

                if (!Z_ISREF_P(var)) {

                        /* we still need to bump up the counter, since non-refs 
will

                         * be counted separately by unserializer */

@@ -500,7 +498,7 @@

 

        /* +1 because otherwise hash will think we are trying to store NULL
pointer */

        var_no = zend_hash_num_elements(var_hash) + 1;

-       zend_hash_add(var_hash, p, len, &var_no, sizeof(var_no), NULL);

+       zend_hash_add(var_hash, id, len, &var_no, sizeof(var_no), NULL);

        return SUCCESS;

 }

 /* }}} */

------------------------------------------------------------------------
[2009-08-26 15:03:57] [email protected]

How about an example script to reproduce this and the patch? Why do we
need to ask for the patch separately anyway? Just show the patch or
don't. Don't "ask to ask".

------------------------------------------------------------------------
[2009-08-26 14:01:30] wmeler at wp-sa dot pl

Description:
------------
Serialization relies on perfect hashing (without collisions) of
variables in ext/standard/var.c - php_add_var_hash. Collision result in
random reference to previously serialized variable. 

It is possible to happen because hash function used for objects is not
perfect one - for two objects of different classes it is possible to get
the same hash result.

I've just fixed the same problem in 4.x where collisions were more
frequent because of use of HANDLE_NUMERIC in zend_hash_add and
zend_hash_next_index_insert.

Problem is extremely hard to reproduce and debug because of pointer
value sensitivity, while easy to fix. Instead of single
smart_str_print_long(hash) we could use concatenation - two calls -
smart_str_print_long(Z_OBJCE_P(var);smart_str_print_long(Z_OBJ_HANDLE_P(var))
- or even faster version with binary memcpy of two pointers without 'O'
prefix. If you wan't to I can provide this simple patch for 5.3.0.



------------------------------------------------------------------------



-- 
Edit this bug report at http://bugs.php.net/bug.php?id=49374&edit=1

Reply via email to