Re: regex headache

Jim Gibson Mon, 03 Feb 2014 13:45:32 -0800

On Feb 3, 2014, at 12:30 PM, Paul Fontenot wrote:

> Hi, I am attempting to write a regex but it is giving me a headache.
> 
> I have two log entries
> 
> 1. Feb  3 12:54:28 cdrtva01a1005 [12: 54:27,532] ERROR [org.apache.commons.logging.impl.Log4JLogger]
> 2. Feb  3 12:54:28 cdrtva01a1005 [12: 54:27,532] ERROR [STDERR]
> 
> I am using the following
> "^\w+\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}\s+\w+\s+\[\d{1,2}:\s+\d{1,2}:\d{1,2},\d{3}\]\s+\w+\s+\[[a-zA-Z0-9.]\]"
> 
> My problem is this greedy little '.' - I need it to just be a period. How do I
> match #1 and not match #2?



You appear to be making the job too difficult. The only difference between 
lines 1. and 2. is the last column. To differentiate those two, you can do this 
(assuming the string is in $_):

if (/\[STDERR\]/) {
  # process line 2
} else {
  # process line 1
}

Do you really need to match each field in the entire line? If so, I would try 
splitting the lines on whitespace and extracting the columns you need that way. 
Whether or not that works depends upon: 1) how much variation there can be in 
your log entries, and 2) what exactly you need to extract from each entry. 
Fixing that regex may not be the most productive approach in the long term.
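
The splitting approach could be sketched roughly like this. It assumes every entry has the same layout as the two samples in the question (with entry 1 on a single line); parse_entry and the hash keys are names made up for this illustration, not anything from the original post.

```perl
use strict;
use warnings;

# The two sample entries from the question (entry 1 joined onto one line).
my @lines = (
    'Feb  3 12:54:28 cdrtva01a1005 [12: 54:27,532] ERROR [org.apache.commons.logging.impl.Log4JLogger]',
    'Feb  3 12:54:28 cdrtva01a1005 [12: 54:27,532] ERROR [STDERR]',
);

# parse_entry is a hypothetical helper: it splits one entry on whitespace
# and returns a hash reference of the columns, or nothing for short lines.
sub parse_entry {
    my ($line) = @_;

    # split ' ' skips leading whitespace and splits on runs of whitespace,
    # so the double space after "Feb" does not produce an empty field.
    my @f = split ' ', $line;
    return unless @f >= 6;

    # The bracketed timestamp contains a space ("[12: 54:27,532]"), so it
    # occupies two fields. Counting from the end avoids depending on that.
    my $source = $f[-1];
    $source =~ s/^\[|\]$//g;    # strip the surrounding brackets

    return {
        month  => $f[0],
        day    => $f[1],
        time   => $f[2],
        host   => $f[3],
        level  => $f[-2],
        source => $source,
    };
}

for my $line (@lines) {
    my $e = parse_entry($line) or next;
    print "$e->{host} $e->{level} $e->{source}\n";
}
```

Whether counting fields from the end is safe depends on the last two columns never containing whitespace, which holds for the two samples but is an assumption about the rest of the log.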

As for your specific question, a period inside a character class (e.g., [.]) 
matches only a literal period. Outside a character class, an unescaped period 
matches any character except a newline (or any character at all under the /s 
modifier). To match a literal period there, escape it: /\./
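
If you do want to keep the full pattern, here is a small sketch of how the period behaves, using the two sample entries. It is only an illustration under the assumption that entry 1 sits on one line. The variable names are invented for the example, and the $dotted variant is one possible way to require a period inside the brackets, not necessarily the best one. Note that the class [a-zA-Z0-9.] in the posted pattern has no quantifier, so it matches exactly one character between the brackets.

```perl
use strict;
use warnings;

my $log4j  = 'Feb  3 12:54:28 cdrtva01a1005 [12: 54:27,532] ERROR [org.apache.commons.logging.impl.Log4JLogger]';
my $stderr = 'Feb  3 12:54:28 cdrtva01a1005 [12: 54:27,532] ERROR [STDERR]';

# Everything in the posted pattern up to the last bracketed column.
my $prefix = qr/^\w+\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}\s+\w+\s+\[\d{1,2}:\s+\d{1,2}:\d{1,2},\d{3}\]\s+\w+\s+/;

# The posted pattern: the class has no quantifier, so only a single
# character is allowed between the final brackets.
my $original = qr/${prefix}\[[a-zA-Z0-9.]\]/;

# Adding + lets the class repeat, but letters alone also satisfy it.
my $with_plus = qr/${prefix}\[[a-zA-Z0-9.]+\]/;

# Requiring at least one escaped period between runs of word characters.
my $dotted = qr/${prefix}\[\w+(?:\.\w+)+\]/;

for my $case ([original => $original], [with_plus => $with_plus], [dotted => $dotted]) {
    my ($name, $re) = @$case;
    printf "%-9s entry 1: %s  entry 2: %s\n",
        $name,
        ($log4j  =~ $re ? 'match' : 'no match'),
        ($stderr =~ $re ? 'match' : 'no match');
}

# The period itself: literal inside a class or when escaped,
# a wildcard when bare.
print "bare . matches 'axb'\n"        if 'axb' =~ /a.b/;
print "[.] does not match 'axb'\n"    if 'axb' !~ /a[.]b/;
print "escaped . matches 'a.b'\n"     if 'a.b' =~ /a\.b/;
```

With these two samples, the posted pattern matches neither entry, the version with + matches both (since STDERR consists only of letters), and the version that requires a period matches only entry 1.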


