Part 3 - In reference to the code below, I have replaced the fopen(), fread()
and fclose() calls with file_get_contents().

This does not result in any noticeable speed improvement. 

What does, however, is taking out the array_merge() statement. To say this
makes it a lot quicker is an understatement - it's a hell of a difference.

Also, progress from start to finish is now linear. This would explain the
previously noted slowdown - the further it gets through (and the bigger the
overall array gets), the slower each merge becomes, even though it's
only merging in small quantities each time (on average, 5 to 20 records).
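
To put rough numbers on that slowdown, here is a small sketch that only counts element writes. The workload (1000 files, 10 matches each) is made up for illustration, and it assumes each array_merge() call builds a new array containing every existing element plus the new ones:

```php
<?php
// Hypothetical workload: 1000 files, 10 matches each (illustrative figures only).
$files = 1000;
$perFile = 10;

$copiedByMerge = 0;   // elements written when array_merge() rebuilds the array per file
$copiedByAppend = 0;  // elements written when only the new matches are appended
$size = 0;            // current size of the combined result array

for ($i = 0; $i < $files; $i++) {
    // array_merge() returns a new array holding the old elements plus the new ones
    $copiedByMerge += $size + $perFile;
    // appending touches only the new elements
    $copiedByAppend += $perFile;
    $size += $perFile;
}

echo "merge per file: $copiedByMerge element writes\n";
echo "append only:    $copiedByAppend element writes\n";
```

Under those assumptions the per-file merge writes 5,005,000 elements in total against 10,000 for plain appending, which fits the pattern of each file taking longer than the one before it.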

preg_match_all() is run once per file.

The output from preg_match_all is an array - afaik there is no other option 
here.

The question is what can I then do with the array output from preg_match_all() 
to store it, along with the combined data from the previous files scanned, 
that does not involve CPU intensive and increasingly slow (the more files 
processed) calls to array_merge() as it gets through the job?

FYI - each file is text, usually under 20kb, and has 5-20 regex 'matches' on 
average.
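
One possible approach, as a minimal sketch rather than a tested fix for the real script: append each match to the end of the result array instead of merging. The $contents array and the $pattern below are stand-ins invented for the example; in the real loop $content would come from each file.

```php
<?php
// Stand-ins for the real inputs: hypothetical file contents and pattern.
$contents = array(
    "alpha 123 beta 456",
    "no digits here",
    "gamma 789",
);
$pattern = '/\d+/';

$results = array();
foreach ($contents as $content) {
    // preg_match_all() returns the number of full matches (0 if none)
    if (preg_match_all($pattern, $content, $matches)) {
        // Append each match to the end of $results; existing entries are not copied.
        foreach ($matches[0] as $match) {
            $results[] = $match;
        }
    }
}

print_r($results);
```

Note that this keeps matches in file order, whereas array_merge($matches[0], $results) puts the newest matches first. Another option along the same lines would be to collect each $matches[0] into an array of chunks and merge them all once after the loop, for example with call_user_func_array('array_merge', $chunks), guarding against the case where no chunks were collected.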

-----------------------------------------
Ok. It is probably about time I posted the awfully slow section of code for
people to look at and state their opinions:

(Prior to this part the code has recursively scanned a directory and created 
an array with all the path / file names).

foreach ($files as $filename) {

// Determine MIME type: (This uses pear mime_type module)
$mt=MIME_Type::autoDetect("$filename");

// If suitable MIME type open, read and process: (ie: all text files)
if (substr($mt,0,4) == "text") {

// obtain filesize needed for fread():
// $fs = filesize("$filename"); COMMENTED OUT NOW

// open the file:
// $thefile = fopen("$filename","rb"); COMMENTED OUT NOW

// read the file:
// $content = fread($thefile,$fs); COMMENTED OUT NOW and replaced
// with file_get_contents():
$content = file_get_contents($filename);

// extract the information: ($pattern is a previously defined PCRE regex)
preg_match_all($pattern,$content,$matches);

// add it to our array:
$results = array_merge($matches[0],$results);

// unset the temporary array:
unset($matches);

// close the file:
// fclose($thefile); COMMENTED OUT NOW

// count valid file(s) scanned:
$pv = $pv + 1;

}; // end the MIME type statement

// count files(s) scanned:
$p1 = $p1 + 1;
$bar1->update($p1);

// count data scanned:
$fc = $fc + $fs;

}; // end loop:

--~--~---------~--~----~------------~-------~--~----~
NZ PHP Users Group: http://groups.google.com/group/nzphpug
-~----------~----~----~----~------~----~------~--~---
