ID: 47643
Comment by: cisa at cisa85 dot de
Reported By: viper7 at viper-7 dot com
Status: Open
Bug Type: Performance problem
Operating System: *
PHP Version: 5.2.9
New Comment:
As I described in [1], I use this function to get the performance I
need:
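// Faster stand-in for array_diff(); assumes unique string or integer values,
// because array_flip() keeps only one key per value.
function array_diff_fast($data1, $data2) {
    // Flip both arrays so the values become keys and can be found by hash lookup.
    $data1 = array_flip($data1);
    $data2 = array_flip($data2);
    foreach ($data2 as $hash => $key) {
        unset($data1[$hash]); // no-op when the key does not exist
    }
    // Flip back to restore the original key => value layout.
    return array_flip($data1);
}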
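One caveat with the function above: array_flip() keeps only one key per value, so repeated values in the first array collapse to a single entry, which array_diff() does not do. Below is a minimal sketch of that difference, together with a variant that flips only the second array. The name array_diff_lookup is hypothetical (it is not part of PHP), and both helpers assume the values are strings or integers.

```php
<?php
// Function from this comment, reproduced so the sketch runs on its own.
function array_diff_fast($data1, $data2) {
    $data1 = array_flip($data1);
    $data2 = array_flip($data2);
    foreach ($data2 as $hash => $key) {
        unset($data1[$hash]);
    }
    return array_flip($data1);
}

// Hypothetical variant: flip only the second array and test each value of
// the first against it. Keys and duplicate values of $data1 are kept.
function array_diff_lookup($data1, $data2) {
    $lookup = array_flip($data2);
    $result = array();
    foreach ($data1 as $key => $value) {
        if (!isset($lookup[$value])) {
            $result[$key] = $value;
        }
    }
    return $result;
}

// With unique values, array_diff_fast() matches array_diff(), keys included.
$a = array('aa', 'bb', 'cc', 'dd');
$b = array('bb', 'dd', 'ee');
var_dump(array_diff($a, $b) === array_diff_fast($a, $b)); // bool(true)

// With a repeated value, the results differ.
$dup    = array('x', 'y', 'x');
$slow   = array_diff($dup, array('y'));        // array(0 => 'x', 2 => 'x')
$fast   = array_diff_fast($dup, array('y'));   // array(2 => 'x')
$lookup = array_diff_lookup($dup, array('y')); // array(0 => 'x', 2 => 'x')
```

For md5 hashes as in this report, duplicates are unlikely, so either helper should behave like array_diff() on that data.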
Thanks to Viper for his help.
[1]
http://nohostname.de/blog/2009/03/24/bug-gefunden-array_diff-in-php-526-unglaublich-langsam/
Previous Comments:
------------------------------------------------------------------------
[2009-03-13 11:49:36] viper7 at viper-7 dot com
Description:
------------
This bug was reported in ##php on freenode, and after some thorough
testing on multiple machines we determined it must be an engine bug.
array_diff on two large arrays of md5 hashes (600,000 elements each)
takes approximately 4 seconds on a fast server in PHP 5.2.4 and below
(confirmed with PHP 5.2.0), but over 4 hours (!) on PHP 5.2.6 and
later (confirmed with PHP 5.2.9 and PHP 5.3.0 beta2).
Reproduce code:
---------------
<?php
$i = 0; $j = 500000;
while ($i < 600000) {
    $i++; $j++;
    $data1[] = md5($i);
    $data2[] = md5($j);
}
$time = microtime(true);
echo "Starting array_diff\n";
$data_diff1 = array_diff($data1, $data2);
$time = microtime(true) - $time;
echo 'array_diff() took ' . number_format($time, 3) . ' seconds and returned '
    . count($data_diff1) . " entries\n";
?>
Expected result:
----------------
Starting array_diff
array_diff() took 3.778 seconds and returned 500000 entries
Actual result:
--------------
Starting array_diff
array_diff() took 14826.278 seconds and returned 500000 entries
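
To compare array_diff() with the flip-based workaround from the newer comment on a given build, a scaled-down version of the reproduce code may be more practical than the full 600,000 elements. This is only a sketch: the sizes 24,000 and 20,000 are assumptions chosen to keep the same 5/6 ratio of unique entries as the report, and no timings are claimed here.

```php
<?php
// Workaround from the newer comment, reproduced so the sketch is self-contained.
function array_diff_fast($data1, $data2) {
    $data1 = array_flip($data1);
    $data2 = array_flip($data2);
    foreach ($data2 as $hash => $key) {
        unset($data1[$hash]);
    }
    return array_flip($data1);
}

// Assumed sizes: 24,000 hashes per array, the second shifted by 20,000,
// so 4,000 hashes overlap and 20,000 should remain in the diff.
$n = 24000;
$offset = 20000;
$data1 = array();
$data2 = array();
for ($i = 1; $i <= $n; $i++) {
    $data1[] = md5((string) $i);
    $data2[] = md5((string) ($i + $offset));
}

// Time the built-in function.
$start = microtime(true);
$slow = array_diff($data1, $data2);
$slow_time = microtime(true) - $start;

// Time the flip-based workaround.
$start = microtime(true);
$fast = array_diff_fast($data1, $data2);
$fast_time = microtime(true) - $start;

echo 'array_diff() took ' . number_format($slow_time, 3) . ' seconds and returned '
    . count($slow) . " entries\n";
echo 'array_diff_fast() took ' . number_format($fast_time, 3) . ' seconds and returned '
    . count($fast) . " entries\n";
```

Both calls should return the same 20,000 entries; only the timings are expected to vary between PHP versions.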
------------------------------------------------------------------------