From: dolecek at sky dot cz
Operating system: NetBSD
PHP version: 5.1.2
PHP Bug Type: Performance problem
Bug description: WDDX serializer inefficient with larger structures
Description:
------------
We use WDDX for persistent storage on a project; the production system
runs on MS Windows. The structure is typically a small array of several
associative arrays. We recently noticed that bigger arrays take
significantly more time to serialize than small ones. Doubling the size
of the array resulted in about a 5-10 times increase in time, with an even
bigger increase as the size grew.
The problem boils down to the smart string API used by WDDX. WDDX uses it to
store the serialization results.
With default sizes, the initial smart string size is 76 bytes and the
buffer grows only by 128+1 bytes. When the buffer size grows beyond
trivial sizes, smart_str_appendl() et al. start being
dominated by the time spent reallocating the buffer, i.e. realloc() and
the associated memory-to-memory copies and CPU cache thrashing.
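To illustrate why a fixed growth increment becomes so expensive, here is a minimal cost model. It is only a sketch, not PHP code: the function names are made up for illustration, and it pessimistically assumes that every realloc() has to copy the whole existing buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes copied while a buffer grows to at least `total` bytes in fixed
 * increments of `step`, assuming (pessimistically) that every realloc()
 * moves the existing contents. Hypothetical helper, not part of PHP. */
unsigned long long bytes_copied_fixed(size_t total, size_t step)
{
    unsigned long long copied = 0;
    size_t cap = step;

    assert(step > 0);
    while (cap < total) {
        copied += cap;   /* old contents copied into the new block */
        cap += step;     /* capacity grows by a constant amount */
    }
    return copied;
}

/* Same model, but the capacity doubles on every reallocation. */
unsigned long long bytes_copied_doubling(size_t total, size_t start)
{
    unsigned long long copied = 0;
    size_t cap = start;

    assert(start > 0);
    while (cap < total) {
        copied += cap;
        cap *= 2;        /* capacity grows geometrically */
    }
    return copied;
}
```

In this model, growing to 1024 bytes from 128 copies 3584 bytes with the fixed increment and 896 bytes with doubling. With the fixed increment the copied volume grows roughly with the square of the final size, while with doubling it stays below the final size.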
I've tried two strategies to alleviate the problem:
1. increase the buffer growth increment to 4192 bytes
2. enforce a power-of-2 buffer size, and always
at least double the size of the buffer
Many malloc()/realloc() implementations are optimized for power-of-2 sizes, and
doubling also minimizes the total number of calls to realloc() and the
associated memory thrashing.
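For readers unfamiliar with strategy 2, the following is a minimal standalone sketch of a string buffer that keeps a power-of-2 capacity and at least doubles on growth. It is not the smart_str implementation; the strbuf type, strbuf_append() and STRBUF_START_SIZE are hypothetical names chosen for the example, and size overflow is not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STRBUF_START_SIZE 64   /* hypothetical initial capacity, a power of 2 */

typedef struct {
    char  *data;  /* NUL-terminated contents, or NULL while empty */
    size_t len;   /* bytes used, not counting the terminator */
    size_t cap;   /* bytes allocated */
} strbuf;

/* Append n bytes from src. Returns 0 on success, -1 if realloc() fails
 * (the buffer is left unchanged in that case). */
int strbuf_append(strbuf *b, const char *src, size_t n)
{
    size_t need;

    assert(b != NULL && src != NULL);
    need = b->len + n + 1;              /* +1 for the terminating NUL */

    if (b->data == NULL || need > b->cap) {
        size_t cap = b->cap ? b->cap : STRBUF_START_SIZE;
        char *p;

        while (cap < need)
            cap *= 2;                   /* stay on powers of 2, at least double */
        p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}
```

Start from a zero-initialised strbuf (data NULL, len 0, cap 0) and free() the data pointer when done. Because the capacity doubles, appending N bytes in total triggers only on the order of log2(N) reallocations.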
1) helps a lot, but the time increase is still not proportional to the array
size increase.
2) fixes the problem; the time increase is almost exactly proportional to
the size increase.
Note: the same problem has been observed with the standard serializer, i.e.
serialize(). From a brief look at the source, the problem seems to be the same as
for WDDX.
Patch for 1):
--- ext/wddx/wddx.c.orig 2006-04-22 19:27:19.000000000 +0200
+++ ext/wddx/wddx.c
@@ -22,6 +22,8 @@
#if HAVE_WDDX
+#define SMART_STR_PREALLOC 4192
+
#include "ext/xml/expat_compat.h"
#include "php_wddx.h"
#include "php_wddx_api.h"
Patch for 2):
--- ext/wddx/wddx.c.orig 2006-01-01 13:50:16.000000000 +0100
+++ ext/wddx/wddx.c
@@ -38,2 +44,21 @@
+#undef SMART_STR_DO_REALLOC
+#define SMART_STR_DO_REALLOC(d, what) \
+ (d)->c = SMART_STR_REALLOC((d)->c, (d)->a, (what))
+
+#undef smart_str_alloc4
+#define smart_str_alloc4(d, n, what, newlen) do {          \
+    if (!(d)->c) {                                         \
+        (d)->len = 0;                                      \
+        (d)->a = SMART_STR_START_SIZE;                     \
+    }                                                      \
+                                                           \
+    newlen = (d)->len + (n);                               \
+    if (newlen >= (d)->a || !(d)->c) {                     \
+        while((d)->a < newlen+1)                           \
+            (d)->a += (d)->a;                              \
+        SMART_STR_DO_REALLOC(d, what);                     \
+    }                                                      \
+} while (0)
+
#define WDDX_BUF_LEN 256
Reproduce code:
---------------
<?php
// one inner item: an associative array with 20 entries
$item = array();
for ($i = 0; $i < 20; $i++) {
    $item['item'.$i] = 'content '.$i;
}
$sample = array(50, 100, 200, 400, 800, 1600, 3200, 6400);
foreach ($sample as $cnt) {
    // build the sample array to be serialized
    $var = array();
    for ($i = 0; $i < $cnt; $i++) {
        $var[] = $item;
    }
    // time a single serialization of the whole array
    $st = microtime(true);
    wddx_serialize_value($var);
    echo "$cnt: ".(microtime(true)-$st)."\n";
}
Expected result:
----------------
The time should increase linearly with the array size.
Result with patch 1):
50: 0.0034401416778564
100: 0.0071568489074707
200: 0.020088911056519
400: 0.10020208358765
800: 0.56227111816406
1600: 2.8473780155182
3200: 11.858402013779
6400: 48.511730909348
Result with patch 2):
50: 0.0030708312988281
100: 0.0057220458984375
200: 0.013122081756592
400: 0.03044605255127
800: 0.06494402885437
1600: 0.13331294059753
3200: 0.26938605308533
6400: 0.53914594650269
Actual result:
--------------
This is the initial (without patch) result:
50: 0.011401891708374
100: 0.052657127380371
200: 0.2442729473114
400: 2.2745268344879
800: 16.260624885559
1600: 86.947965145111
(3200 and 6400 skipped because the run would take too long)
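As a quick check on the numbers, the ratio between consecutive timings (each step doubles the array size) can be computed with a small helper. This is just an illustrative sketch: the array and function names are hypothetical, and the values are copied from the unpatched and patch 2) results in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Timings in seconds, copied from this report, for 50, 100, 200, ... items. */
const double timing_unpatched[] = {
    0.011401891708374, 0.052657127380371, 0.2442729473114,
    2.2745268344879, 16.260624885559, 86.947965145111
};
const double timing_patch2[] = {
    0.0030708312988281, 0.0057220458984375, 0.013122081756592,
    0.03044605255127, 0.06494402885437, 0.13331294059753,
    0.26938605308533, 0.53914594650269
};

/* Writes t[i + 1] / t[i] into out[0 .. n - 2] and returns the number of
 * ratios written. A ratio close to 2 means time scales linearly with size. */
size_t doubling_ratios(const double *t, size_t n, double *out)
{
    size_t i;

    assert(t != NULL && out != NULL);
    if (n < 2)
        return 0;
    for (i = 0; i + 1 < n; i++)
        out[i] = t[i + 1] / t[i];
    return n - 1;
}
```

For the unpatched timings the ratios come out between roughly 4.6 and 9.3 per doubling; with patch 2) they stay between roughly 1.9 and 2.3, i.e. close to linear.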
--
Edit bug report at http://bugs.php.net/?id=37168&edit=1