ID:               37168
 Updated by:       [EMAIL PROTECTED]
 Reported By:      dolecek at sky dot cz
-Status:           Open
+Status:           Feedback
 Bug Type:         WDDX related
 Operating System: NetBSD
 PHP Version:      5.1.2
 New Comment:

Please try using this CVS snapshot:

  http://snaps.php.net/php5.1-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5.1-win32-latest.zip

I've tried your sample code and I see only a linear increase in
execution time. At no point did any block take seconds to
execute.


Previous Comments:
------------------------------------------------------------------------

[2006-04-22 18:42:15] dolecek at sky dot cz

It's a minimally intrusive patch, designed to pinpoint and demonstrate
the problem - feel free to integrate it in whatever way is best for the
PHP project.

------------------------------------------------------------------------

[2006-04-22 17:55:03] [EMAIL PROTECTED]

The patch is definitely wrong, as this is just a hack overriding
functions in this particular case only.
Fix the original functions instead (if there are any problems).


------------------------------------------------------------------------

[2006-04-22 17:49:32] dolecek at sky dot cz

Oh, and patch 2) should also include the #define SMART_STR_PREALLOC 4192,
so that the initial size is power-of-2 (the test results were obtained
with that part included).

------------------------------------------------------------------------

[2006-04-22 17:38:54] dolecek at sky dot cz

The OS was set to NetBSD for this bug report because the tests were run
on NetBSD.

------------------------------------------------------------------------

[2006-04-22 17:35:35] dolecek at sky dot cz

Description:
------------
We use WDDX for persistent storage on a project; the production system
runs on MS Windows. The structure is typically a small array of several
associative arrays. We recently noticed that bigger arrays take
significantly more time to serialize than small ones. Doubling the size
of the array increased the time about 5-10 times, and the factor got
even worse as the size grew further.

The problem boils down to the smart string API, which WDDX uses to
store the serialization results.

With default sizes, the initial smart string size is 76 bytes and the
buffer grows only by 128+1 bytes. Once the buffer grows beyond trivial
sizes, smart_str_appendl() et al. become dominated by the time spent
reallocating the buffer, i.e. realloc() and the associated
memory-to-memory copies and CPU cache thrashing.
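
To illustrate why a fixed growth increment hurts, here is a small
simulation sketch. It is not PHP source: the names growth_cost,
simulate_fixed_growth and simulate_doubling are made up for this
example, and it assumes the worst case in which every realloc() has to
move the existing data. It only counts reallocations and copied bytes;
it does not allocate anything.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost record: number of reallocations and the bytes a
 * worst-case realloc() (one that always moves the data) would copy. */
struct growth_cost {
    size_t reallocs;
    unsigned long long bytes_copied;
};

/* Simulate appending `total` bytes in `chunk`-sized pieces to a buffer
 * of `start` bytes that grows to (needed + step) whenever it is full.
 * This models the fixed-increment policy described above; the real
 * smart string macros may differ in detail. */
static struct growth_cost simulate_fixed_growth(size_t total, size_t chunk,
                                                size_t start, size_t step)
{
    struct growth_cost cost = {0, 0};
    size_t len = 0, cap = start;

    assert(chunk > 0);
    while (len < total) {
        size_t newlen = len + chunk;
        if (newlen >= cap) {
            cost.bytes_copied += len;  /* existing data is moved */
            cap = newlen + step;
            cost.reallocs++;
        }
        len = newlen;
    }
    return cost;
}

/* Same simulation, but the capacity is doubled until the data fits. */
static struct growth_cost simulate_doubling(size_t total, size_t chunk,
                                            size_t start)
{
    struct growth_cost cost = {0, 0};
    size_t len = 0, cap = start;

    assert(chunk > 0 && start > 0);
    while (len < total) {
        size_t newlen = len + chunk;
        if (newlen >= cap) {
            cost.bytes_copied += len;
            while (cap < newlen + 1)
                cap += cap;
            cost.reallocs++;
        }
        len = newlen;
    }
    return cost;
}
```

With a fixed step, the number of reallocations grows linearly with the
output size and the copied volume grows roughly quadratically; with
doubling, the number of reallocations grows logarithmically and the
copied volume stays linear in the output size.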

I've tried two strategies to alleviate the problem.

1. increase the buffer growth increment to 4192 bytes
2. enforce a power-of-2 buffer size, and always
   at least double the size of the buffer

Many malloc()/realloc() implementations optimize for power-of-2 sizes,
and the doubling also ensures that the total number of calls to
realloc() and the associated memory thrashing is minimized.
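
Strategy 2 can be sketched outside of PHP as a stand-alone growable
buffer. This is only an illustration under assumptions: struct dbuf,
dbuf_append() and DBUF_START_CAP are hypothetical names, not part of the
smart string API, and the initial capacity of 4096 is simply a
power-of-2 value picked for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; not the PHP smart_str type. */
struct dbuf {
    char  *data;  /* heap storage, NULL until the first append */
    size_t len;   /* bytes in use, excluding the trailing NUL */
    size_t cap;   /* bytes allocated; a power of two once set */
};

#define DBUF_START_CAP 4096  /* assumed initial capacity (power of two) */

/* Append n bytes from src, doubling the capacity until the data plus a
 * trailing NUL fits. Returns 0 on success, or -1 on size overflow or
 * realloc() failure, in which case the buffer is left unchanged. */
static int dbuf_append(struct dbuf *b, const char *src, size_t n)
{
    size_t need;

    assert(b != NULL && src != NULL);
    if (n >= SIZE_MAX - b->len)
        return -1;                      /* len + n + 1 would overflow */
    need = b->len + n + 1;

    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : DBUF_START_CAP;
        char *p;

        while (cap < need) {
            if (cap > SIZE_MAX / 2)
                return -1;              /* doubling would overflow */
            cap += cap;
        }
        p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }

    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}
```

A caller would start from a zeroed struct (struct dbuf b = {NULL, 0, 0};),
call dbuf_append() repeatedly, and free(b.data) when done. Because the
capacity at least doubles on every reallocation, the number of
realloc() calls grows only logarithmically with the final size.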

1) helps a lot, but the time increase is still not proportional to the
array size increase.

2) fixes the problem; the time increase is almost exactly proportional
to the size increase.

Note - the same problem has been observed with the standard serializer,
i.e. serialize(). From a brief look at the source, the cause seems to be
the same as for WDDX.

Patch for 1):

--- ext/wddx/wddx.c.orig        2006-04-22 19:27:19.000000000 +0200
+++ ext/wddx/wddx.c
@@ -22,6 +22,8 @@
 
 #if HAVE_WDDX
 
+#define SMART_STR_PREALLOC     4192
+
 #include "ext/xml/expat_compat.h"
 #include "php_wddx.h"
 #include "php_wddx_api.h"

Patch for 2):
--- ext/wddx/wddx.c.orig        2006-01-01 13:50:16.000000000 +0100
+++ ext/wddx/wddx.c
@@ -38,2 +44,21 @@

+#undef SMART_STR_DO_REALLOC
+#define SMART_STR_DO_REALLOC(d, what) \
+       (d)->c = SMART_STR_REALLOC((d)->c, (d)->a, (what))
+
+#undef smart_str_alloc4
+#define smart_str_alloc4(d, n, what, newlen) do { \
+       if (!(d)->c) { \
+               (d)->len = 0; \
+               (d)->a = SMART_STR_START_SIZE; \
+       } \
+\
+       newlen = (d)->len + (n); \
+       if (newlen >= (d)->a || !(d)->c) { \
+               while ((d)->a < newlen+1) \
+                       (d)->a += (d)->a; \
+               SMART_STR_DO_REALLOC(d, what); \
+       } \
+} while (0)
+
 #define WDDX_BUF_LEN                   256


Reproduce code:
---------------
<?php

$item = array();

for($i=0; $i < 20; $i++)
        $item['item'.$i] = 'content '.$i;

$sample = array(50, 100, 200, 400, 800, 1600, 3200, 6400);

foreach($sample as $cnt) {
        // build the sample array to be serialized
        $var = array();
        for($i=0; $i < $cnt; $i++) {
                $var[] = $item;
        }

        $st = microtime(true);
        wddx_serialize_value($var);
        echo "$cnt: ".(microtime(true)-$st)."\n";
}


Expected result:
----------------
The expected behaviour is a linear increase in time.

1) result:

50: 0.0034401416778564
100: 0.0071568489074707
200: 0.020088911056519
400: 0.10020208358765
800: 0.56227111816406
1600: 2.8473780155182
3200: 11.858402013779
6400: 48.511730909348

2) result:
50: 0.0030708312988281
100: 0.0057220458984375
200: 0.013122081756592
400: 0.03044605255127
800: 0.06494402885437
1600: 0.13331294059753
3200: 0.26938605308533
6400: 0.53914594650269



Actual result:
--------------
This is the initial (without patch) result:

50: 0.011401891708374
100: 0.052657127380371
200: 0.2442729473114
400: 2.2745268344879
800: 16.260624885559
1600: 86.947965145111

(3200 and 6400 were skipped because the run took too long)


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=37168&edit=1