Phil Pennock <[email protected]> (Mon 13 Jul 2009 23:44:15 CEST):
> On 2009-07-13 at 22:54 +0200, Karl Fischer wrote:
> > I followed this thread with interest and I'm still a little puzzled with the
> > specific exim syntax, but in terms of regex and just extracting the header 
> > names, this perl regex should be more efficient: s/:.*?\n(\s+.*?\n)*/:/g
> > 
> > This saves looping through map/extract by getting rid of the unwanted 1st.
> 
> Good point.
> 
> However, you're also not stripping out space between the header name and
> the following colon, which is valid.  This email could validly be
> constructed with:
> ----------------------------8< cut here >8------------------------------
> From: Phil ....
> To  : Karl ...
> Cc  : exim-users ....
> ----------------------------8< cut here >8------------------------------

Ah. Thanks for the hint.

> With a little further optimisation, we get:
> 
>   s/(?>\s*:.*?\n)(?>\s+.*?\n)*/:/g
> 
> although actually I'm not sure there would be any backtracking needed
> for your s///g and it's probably only the \s*: that benefits from the
> protection.  (I can't be bothered to benchmark it).
> 
> > In exim syntax I'd assume this to be (not tested yet):
> > 
> > MESSAGE_HEADERS = ${lc:${sg 
> > {$message_headers_raw}{\N:.*?\n(\s+.*?\n)*\N}{:}}}
> 
> ${lc:${sg{$message_headers_raw}{\N(?>\s*:.*?\n)(?>\s+.*?\n)*\N}{:}}}
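
For comparison, the two substitutions quoted above can be tried outside
Exim. What follows is a minimal Python sketch, not the Exim configuration
itself: the sample headers and the function names are made up for
illustration, and the atomic groups (?>...) are written as plain groups,
since they were only suggested as an optimisation.

```python
import re

# Hypothetical sample headers: one name followed by spaces before the
# colon, and one folded (continued) header line.
SAMPLE_HEADERS = (
    "From: Phil <phil@example.org>\n"
    "To  : Karl <karl@example.org>\n"
    "Subject: a folded\n"
    "  subject line\n"
)

# First suggestion: cut everything from the colon to the end of the
# logical header line, including continuation lines.
CUT_TAIL = re.compile(r":.*?\n(\s+.*?\n)*")

# Follow-up suggestion, minus the atomic groups: also swallow any
# whitespace between the header name and the colon.
CUT_TAIL_WS = re.compile(r"\s*:.*?\n(?:\s+.*?\n)*")


def names_keep_space(headers: str) -> str:
    """Lower-cased header names; whitespace before the colon survives."""
    return CUT_TAIL.sub(":", headers).lower()


def names_strip_space(headers: str) -> str:
    """Lower-cased header names; whitespace before the colon is removed."""
    return CUT_TAIL_WS.sub(":", headers).lower()


if __name__ == "__main__":
    print(repr(names_keep_space(SAMPLE_HEADERS)))
    print(repr(names_strip_space(SAMPLE_HEADERS)))
```

On this sample the first pattern leaves the spaces after "to" in place,
while the second yields a plain colon-separated list of names.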

I'm still sticking with my version: instead of cutting away the tail, I
select the head of each logical header line:

  ${lc:${sg {$message_headers_raw}{\N(?m)(^\S+(?=\s*):)?.*?\n\N}{\$1}}}

But I'm not sure about its efficiency or readability.
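
The head-selecting regex can also be tried outside Exim. The sketch below
is a minimal Python rendering of it; the sample headers and function names
are hypothetical, and Python's re module is only assumed to behave like
Exim's PCRE for this pattern. In this sketch the lookahead (?=\s*) consumes
nothing, so the colon must directly follow the name and a line such as
"To  : ..." is dropped entirely. The second pattern is an illustrative
variant (not from the thread, and not tested in Exim) that allows blanks
before the colon.

```python
import re

# Hypothetical sample headers: one name followed by spaces before the
# colon, and one folded (continued) header line.
SAMPLE_HEADERS = (
    "From: Phil <phil@example.org>\n"
    "To  : Karl <karl@example.org>\n"
    "Subject: a folded\n"
    "  subject line\n"
)

# The head-selecting pattern from the message, unchanged.
HEAD = re.compile(r"(?m)(^\S+(?=\s*):)?.*?\n")

# Illustrative variant (an assumption): capture the name and the colon
# separately so that blanks between them can be skipped.
HEAD_WS = re.compile(r"(?m)^(?:([^\s:]+)[ \t]*(:))?.*\n")


def head_names(headers: str) -> str:
    """Keep 'name:' from each line where the colon follows the name directly."""
    # An unmatched group expands to the empty string in Python 3.5+.
    return HEAD.sub(r"\1", headers).lower()


def head_names_ws(headers: str) -> str:
    """Same idea, but tolerate blanks between the name and the colon."""
    return HEAD_WS.sub(r"\1\2", headers).lower()


if __name__ == "__main__":
    print(repr(head_names(SAMPLE_HEADERS)))
    print(repr(head_names_ws(SAMPLE_HEADERS)))
```

Continuation lines match only the trailing part of either pattern, so they
are replaced by nothing in both cases.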

-- 
Heiko


-- 
## List details at http://lists.exim.org/mailman/listinfo/exim-users 
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
