Re: [PHP] fail on preg_match_all

2008-02-21 Thread Nathan Rixham

Hamilton Turner wrote:
Just a follow-up on this: the problem was 'Fatal error: Allowed memory 
size of 8388608 bytes exhausted' (that is, an 8 MB limit).

After some nice help, I found that this is actually a common problem 
with intense regexes in PHP. The quick fix is to use ini_set() to raise 
your memory_limit to something massive, like '400M'. This gives your 
script access to that much memory for its lifetime. If you have this 
problem, then you probably also have to do this:

set_time_limit(0);  //remove any max execution time

Hamilton


PS - for anyone confused, here was the script... I didn't think it was 
that confusing, sorry guys!

function parse_access($file_name)
{
   // read file data into variable
   $fh = fopen($file_name, 'r') or die("can't open file for reading");
   $theData = fread($fh, filesize($file_name));
   fclose($fh);

   // perform regex (one pattern, split across lines only for readability)
   $regex = '!(\d{0,3}\.\d{0,3}\.\d{0,3}\.\d{0,3}) - - '
          . '\[(\d{2})/(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)/'
          . '(\d{4}):(\d{2}):(\d{2}):(\d{2}) '
          . '-\d+] {1,4}GET ([._0-9\-%a-zA-Z/,?=]+) ([.0-9a-zA-Z%/\-,_?]+) (\d{3}) '
          . '(\d+) \[(.+?)] \[(.+?)] \[(.+?)] (\d+) (\d+) (\d+) (\d+) (\d+) (\d+)!';

   //echo $regex . '<br><br><hr><br>';
   $num = preg_match_all($regex, $theData, $match, PREG_SET_ORDER);
   //echo 'after regex - we are still alive!';

   // go on to do some boring stuff, like write this to an array, perform
   // stuff, graph stuff, blah blah
}

Can you also post a few lines from your access log so we've got 
something to test against?


The regex looks incorrect to me in a few places:
-\d+] {1,4}
for example.

How to debug your script:

1. Make a copy of the log file and trim it down to, say, 20 lines; run 
the script on it to verify it's doing what you want it to.

2. Check the size of the real log file, then multiply it by 2.5 and see 
if the total is greater than your PHP max memory setting. (Explanation: 
$theData will hold the full file, $match will hold another copy of most 
of the file, then a bit extra for PHP to use.)
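
The size check can be sketched roughly as below. The 2.5 multiplier is only the rule of thumb from this message, not a measured figure, and to_bytes() and estimate_fits() are hypothetical helper names made up for illustration.

```php
<?php
// Convert a php.ini shorthand size such as '8M' or '400M' to bytes.
// Hypothetical helper; '-1' (no limit) is passed through as -1.
function to_bytes($value)
{
    $value = trim($value);
    if ($value === '-1') {
        return -1;
    }
    $number = (int) $value;                  // '400M' casts to 400
    $unit = strtoupper(substr($value, -1));  // last character is the unit, if any
    switch ($unit) {
        case 'G': return $number * 1024 * 1024 * 1024;
        case 'M': return $number * 1024 * 1024;
        case 'K': return $number * 1024;
        default:  return $number;            // plain byte count
    }
}

// Rule of thumb: file contents + match copies + overhead ~ 2.5x the file size.
function estimate_fits($fileSize, $memoryLimit, $factor = 2.5)
{
    $limit = to_bytes($memoryLimit);
    return $limit === -1 || $fileSize * $factor <= $limit;
}
```

A call such as estimate_fits(filesize($file_name), ini_get('memory_limit')) would then give a rough yes/no before the real run.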


Regards

Nathan

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] fail on preg_match_all

2008-02-21 Thread Stut

Increasing the memory limit is the worst possible solution to this 
problem.


It's a giant file, and your regex is basically pulling out each line. 
For the love of $DEITY learn about fgets and process each line one by 
one rather than loading in the whole file. It'll be a lot faster and 
won't suck your memory dry while it runs.
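
A minimal sketch of the line-by-line approach, assuming each log entry sits on its own line. parse_access_by_line() is a hypothetical name, and the caller supplies a pattern that describes a single line; this is not the original poster's code.

```php
<?php
// Sketch: apply a per-line pattern with fgets() instead of reading the whole file.
function parse_access_by_line($file_name, $lineRegex)
{
    $fh = fopen($file_name, 'r');
    if ($fh === false) {
        die("can't open file for reading");
    }

    $rows = array();
    while (($line = fgets($fh)) !== false) {
        // Only the current line is held in memory while matching.
        if (preg_match($lineRegex, $line, $m)) {
            $rows[] = $m;   // or process $m here and discard it
        }
    }
    fclose($fh);

    return $rows;
}
```

Collecting every match in $rows still grows with the file, so for a really large log it is probably better to aggregate or write out each match inside the loop rather than keep them all.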


-Stut

--
http://stut.net/




Re: [PHP] fail on preg_match_all

2008-02-21 Thread Robin Vickery
On 21/02/2008, Nathan Rixham wrote:
> The regex looks incorrect to me in a few places:
> -\d+] {1,4}
> for example.

That's ok, albeit confusing:

* The ']' is a literal ']' not the closing bracket of a character class.
* The {1,4} applies to the space character.
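
A quick way to check both points is to run just that fragment against a few sample strings. The subjects below are made up for illustration; only the fragment itself comes from the posted regex.

```php
<?php
// Outside a character class, ']' has no special meaning and matches itself.
// ' {1,4}' then matches between one and four literal spaces.
$fragment = '!-\d+] {1,4}GET!';

$one  = preg_match($fragment, '-0600] GET');      // one space
$four = preg_match($fragment, '-0600]    GET');   // four spaces
$five = preg_match($fragment, '-0600]     GET');  // five spaces
$none = preg_match($fragment, '-0600]GET');       // no space at all
```

Here $one and $four come back as 1, while $five and $none come back as 0.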

-robin




Re: [PHP] fail on preg_match_all

2008-02-21 Thread Richard Lynch
On Wed, February 20, 2008 11:34 pm, Hamilton Turner wrote:
> Does anyone know why a server would simply fail on this line?
>
> $num = preg_match_all($regex, $theData, $match, PREG_SET_ORDER);
>
> if i know the file handle is valid (i grabbed it using 'or die'), and
> the regex is valid

Define fail...

There are any number of things that could be going wrong.

Check your logs, dump out the data, etc.
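
As a rough sketch of what "dump out the data" could mean here: confirm the subject really is a string, log its size, and check the return value instead of assuming success. run_regex() is a hypothetical helper name, not something from the original script.

```php
<?php
// Sketch of a wrapper that makes a failing preg_match_all() visible.
// Returns the match count on success, or a short diagnostic string.
function run_regex($regex, $theData)
{
    // The subject must be string data, not a file handle or anything else.
    if (!is_string($theData)) {
        return 'subject is ' . gettype($theData) . ', expected string';
    }
    error_log('subject length: ' . strlen($theData) . ' bytes');

    $num = preg_match_all($regex, $theData, $match, PREG_SET_ORDER);
    if ($num === false) {
        // preg_last_error() returns one of the PREG_*_ERROR constants.
        return 'PCRE error code ' . preg_last_error();
    }
    return $num;
}
```

A fatal error such as running out of memory stops the script before the return value can be checked, so the server's error log is still the place to look in that case.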

-- 
Some people have a gift link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/from/lynch
Yeah, I get a buck. So?




[PHP] fail on preg_match_all

2008-02-20 Thread Hamilton Turner

Does anyone know why a server would simply fail on this line?

$num = preg_match_all($regex, $theData, $match, PREG_SET_ORDER);

if i know the file handle is valid (i grabbed it using 'or die'), and 
the regex is valid


hamy




Re: [PHP] fail on preg_match_all

2008-02-20 Thread Chris

Hamilton Turner wrote:

Does anyone know why a server would simply fail on this line?

$num = preg_match_all($regex, $theData, $match, PREG_SET_ORDER);

if i know the file handle is valid (i grabbed it using 'or die'), and 
the regex is valid


What file handle? preg_match_all doesn't work on resources; it works on string data.

What do you mean by "fail"? Does it return no results? Segfault the server?

--
Postgresql & php tutorials
http://www.designmagick.com/




Re: [PHP] fail on preg_match_all

2008-02-20 Thread Jim Lucas

Hamilton Turner wrote:

Does anyone know why a server would simply fail on this line?

$num = preg_match_all($regex, $theData, $match, PREG_SET_ORDER);

if i know the file handle is valid (i grabbed it using 'or die'), and 
the regex is valid


hamy



Wow, what a lack of information!  If I was holding an Easter Egg basket 
right now, could you tell me how many eggs I had in it?


Why don't you provide some useful code examples!

Namely:
1. Show us some context
2. What is in $regex
3. Show us the code that generates the data in $theData




Re: [PHP] fail on preg_match_all

2008-02-20 Thread Paul Scott

On Wed, 2008-02-20 at 23:34 -0600, Hamilton Turner wrote:
> if i know the file handle is valid (i grabbed it using 'or die'), and
>                    ^

This is probably your problem: you are trying to match on a resource
handle, not a string or something.

Check out http://www.php.net/preg_match_all

> the regex is valid

Won't really matter if the data is in the wrong format!

--Paul

Re: [PHP] fail on preg_match_all

2008-02-20 Thread Hamilton Turner
Just a follow-up on this: the problem was 'Fatal error: Allowed memory 
size of 8388608 bytes exhausted' (that is, an 8 MB limit).

After some nice help, I found that this is actually a common problem 
with intense regexes in PHP. The quick fix is to use ini_set() to raise 
your memory_limit to something massive, like '400M'. This gives your 
script access to that much memory for its lifetime. If you have this 
problem, then you probably also have to do this:

set_time_limit(0);  //remove any max execution time
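
Put together, the quick fix might look like the minimal sketch below. The '400M' value is simply the figure mentioned in this message, not a recommendation, and the check on the return value is an addition for illustration.

```php
<?php
// Raise the limit for this script run only; php.ini itself is not changed.
$previous = ini_set('memory_limit', '400M');
if ($previous === false) {
    // ini_set() returns false when the setting could not be changed.
    error_log('memory_limit could not be changed');
}

set_time_limit(0);  // remove any max execution time
```

Both calls need to run before the expensive preg_match_all() call, for example at the top of the script.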

Hamilton


PS - for anyone confused, here was the script... I didn't think it was 
that confusing, sorry guys!

function parse_access($file_name)
{
   // read file data into variable
   $fh = fopen($file_name, 'r') or die("can't open file for reading");
   $theData = fread($fh, filesize($file_name));
   fclose($fh);

   // perform regex (one pattern, split across lines only for readability)
   $regex = '!(\d{0,3}\.\d{0,3}\.\d{0,3}\.\d{0,3}) - - '
          . '\[(\d{2})/(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)/'
          . '(\d{4}):(\d{2}):(\d{2}):(\d{2}) '
          . '-\d+] {1,4}GET ([._0-9\-%a-zA-Z/,?=]+) ([.0-9a-zA-Z%/\-,_?]+) (\d{3}) '
          . '(\d+) \[(.+?)] \[(.+?)] \[(.+?)] (\d+) (\d+) (\d+) (\d+) (\d+) (\d+)!';

   //echo $regex . '<br><br><hr><br>';
   $num = preg_match_all($regex, $theData, $match, PREG_SET_ORDER);
   //echo 'after regex - we are still alive!';

   // go on to do some boring stuff, like write this to an array, perform
   // stuff, graph stuff, blah blah
}