From:             [EMAIL PROTECTED]
Operating system: WinME
PHP version:      4.2.3
PHP Bug Type:     Performance problem
Bug description:  Populating a large array of arrays doesn't scale linearly

Steps to Reproduce:
0 - Standard PHP 4.2.3 installed using the installer.
1 - Create an array of strings using something like explode on a CSV line
2 - Add that array to another array
3 - Repeat steps 1 and 2 for 1000, 2000, 3000, 4000, 5000 and 6000 iterations

Example snippet (short):
$times = 6000;   /* change this to 1000-6000 */
$line =
"'x','s','n','f','n','a','c','b','y','e',?,'s','s','o','o','p','o','o','p','o','c','l','e'";

$storage = array ( );
for ( $a = 0; $a < $times; $a++ )
{
  $bits = explode ( ",", $line );
  $storage[] = $bits;
}


Expected Behaviour:
The time this takes is linearly proportional to the number of
iterations.


Actual Behaviour:
The results of running (on a Celeron 800) are as follows:
1000 - 2.174001 seconds
2000 - 22.422081 seconds
3000 - 74.593858 seconds
4000 - 148.223771 seconds
5000 - 254.329387 seconds
6000 - 371.621738 seconds

The time grows far faster than linearly: six times as many iterations
(1000 to 6000) take roughly 170 times as long.
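
To make the scaling easier to judge, here is a small sketch that turns the
timings listed above into a per-iteration cost and a slowdown factor relative
to the 1000-iteration run.  The function names perIterationMs and
slowdownFactor are made up for this illustration; the numbers are the ones
reported above.  With linear scaling the per-iteration cost would stay flat;
here it goes from about 2.2 ms at 1000 iterations to about 62 ms at 6000.

```php
<?php
// Timings copied from the results listed above (iterations => seconds).
$timings = array(
    1000 => 2.174001,
    2000 => 22.422081,
    3000 => 74.593858,
    4000 => 148.223771,
    5000 => 254.329387,
    6000 => 371.621738,
);

// Cost per iteration in milliseconds; linear scaling would keep this flat.
function perIterationMs($iterations, $seconds)
{
    return $seconds / $iterations * 1000;
}

// Slowdown relative to the 1000-iteration run; linear scaling would give
// $iterations / 1000.
function slowdownFactor($timings, $iterations)
{
    return $timings[$iterations] / $timings[1000];
}

foreach ($timings as $iterations => $seconds) {
    printf(
        "%5d iterations: %7.3f ms each, %6.1fx the 1000 run (linear: %dx)\n",
        $iterations,
        perIterationMs($iterations, $seconds),
        slowdownFactor($timings, $iterations),
        $iterations / 1000
    );
}
```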


The funny thing is that I was able to do similar operations independently
in linear time.  For example, a slight modification to the script:

$times = 6000;
$line =
"'x','s','n','f','n','a','c','b','y','e',?,'s','s','o','o','p','o','o','p','o','c','l','e'";

$storage = array ( );
$otherBits = explode ( ",", $line );
for ( $a = 0; $a < $times; $a++ )
{
  $bits = explode ( ",", $line );
  $storage[] = $otherBits;
}


...and it runs in less than half of a second, even at $times = 6000.  So I
know I can add arrays to an array.




A much longer repro script (that demonstrates that these operations work
fine independently in various combinations) follows:

------start long repro script----------
<pre>
<?php
  $trace = true;
  $start_time = 0;
  $times = 6000;

function getmicrotime ( )
{
  list ( $usec, $sec ) = explode ( " ", microtime () );
  return ( (float) $usec + (float) $sec );
}

function start ( $message )
{
  global $trace, $start_time;
  $start_time = getmicrotime ( );
  if ( $trace )
  {
    printf ( "%s... ", $message );
    flush ( );
  }
}


function stop ( )
{
  global $trace, $start_time;
  if ( $trace )
  {
    $current_time = getmicrotime ( );
    $runningTime = $current_time - $start_time;
    printf ( "%.6f seconds\n", $runningTime );
    flush ( );
  }
}

  set_time_limit ( 0 );
  error_reporting ( E_ALL );

  // allocate some memory so as to not bias the first result
  $storage = array ( );
  $line =
"'x','s','n','f','n','a','c','b','y','e',?,'s','s','o','o','p','o','o','p','o','c','l','e'";

  $otherBits = explode ( ",", $line );
  for ( $a = 0; $a < $times; $a++ )
  {
    $storage[] = $line;
  }


  start ( "A: Adding $times times the same string to an array" );
  $storage = array ( );
  for ( $a = 0; $a < $times; $a++ )
  {
    $storage[] = $line;
  }
  stop ( );


  start ( "B: Adding $times times the same array to an array" );
  $storage = array ( );
  for ( $a = 0; $a < $times; $a++ )
  {
    $storage[] = $otherBits;
  }
  stop ( );


  start ( "C: Creating $times different arrays" );
  for ( $a = 0; $a < $times; $a++ )
  {
    $bits = explode ( ",", $line );
  }
  stop ( );


  start ( "D: Tests A and C" );
  $storage = array ( );
  for ( $a = 0; $a < $times; $a++ )
  {
    $storage[] = $line;
    $bits = explode ( ",", $line );
  }
  stop ( );


  start ( "E: Tests A and B" );
  $storage = array ( );
  for ( $a = 0; $a < $times; $a++ )
  {
    $storage[] = $line;        // test A: append the same string
    $storage[] = $otherBits;   // test B: append the same array
  }
  stop ( );

  start ( "F: Tests B and C with different arrays" );
  $storage = array ( );
  for ( $a = 0; $a < $times; $a++ )
  {
    $bits = explode ( ",", $line );
    $storage[] = $otherBits;
  }
  stop ( );


  start ( "G: Tests B and C with the array just created" );
  $storage = array ( );
  for ( $a = 0; $a < $times; $a++ )
  {
    $bits = explode ( ",", $line );
    $storage[] = $bits;
  }
  stop ( );

?>
</pre>
------end long repro script------------


Test "G" in the long repro script is the one that takes "forever" compared
to the other operations.  Note that although the repro explodes the same
$line on every iteration, in real use I expect each line to be different
(read from a file), hence the need to explode inside the loop.
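
For context, the real-world pattern behind test "G" looks roughly like the
sketch below: each line comes from a file, is exploded, and the fresh array
is appended.  The helper name loadRows and the throwaway temp file are
assumptions made for illustration, and the sketch uses functions from
current PHP releases (sys_get_temp_dir, file_put_contents) that are not
available in 4.2.3.

```php
<?php
// Hypothetical helper: reads a comma-separated file and returns one array
// of fields per non-empty line.
function loadRows($path)
{
    $storage = array();
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $storage;
    }
    while (($line = fgets($handle)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            continue; // skip blank lines
        }
        // Same pattern as test "G": explode, then append the new array.
        $bits = explode(',', $line);
        $storage[] = $bits;
    }
    fclose($handle);
    return $storage;
}

// Usage with a throwaway file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'rows');
file_put_contents($path, "'x','s','n'\n'b','y','e'\n");
$rows = loadRows($path);
unlink($path);
printf("%d rows, %d fields in the first row\n", count($rows), count($rows[0]));
```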


I also checked the bug database.  Unlike
http://bugs.php.net/bug.php?id=13598 I am not concerned with how much
memory my script takes.  And I don't think that I am experiencing the same
problem as http://bugs.php.net/bug.php?id=6333 because I *am* able to
create an array of arrays of the same size (i.e. test "F" in the long
repro script).  I didn't find any other bugs that were similar to mine,
although I might have missed something (sorry) because I wasn't sure how
to express the problem.


Work-arounds appreciated (I tried pre-allocating with array_fill, to no
avail), but a fix would be even better.
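
One possible direction, going only by the test results above: storing plain
strings (test "A") and exploding without storing (test "C") both appear to
scale fine, so the raw line could be stored and split only when a row is
actually needed.  This is an untested sketch, not a confirmed work-around
for 4.2.3, and getRow is a hypothetical helper name.

```php
<?php
// Hypothetical helper: explode a stored line only when the row is needed.
function getRow($storage, $index)
{
    return explode(',', $storage[$index]);
}

$line = "'x','s','n','f'";
$storage = array();
for ($a = 0; $a < 3; $a++) {
    $storage[] = $line;       // store the raw string, as in test "A"
}

$bits = getRow($storage, 1);  // split on demand
printf("%d fields, first is %s\n", count($bits), $bits[0]);
```

The trade-off is that a row is re-exploded on every access, so this only
helps if each row is read a small number of times.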

Good luck and thanks in advance!
-- 
Edit bug report at http://bugs.php.net/?id=19499&edit=1