Re: [exim] catching newlines with ${sg {}{}{}}

2008-06-20 Thread Marten Lehmann
Hello,

 Exim uses real PCRE; Philip Hazel is the original author of both.
 
 .*\nX-purgate-ID: (.*?)\n.*

 whereas $1 would contain the id. Unfortunately, the sg expansion item 
 does not seem to work with newlines.
 
 If you double-check the documentation on ${sg ...} then you'll see the
 reminder:

I read this, but I don't understand what the difference is between \n 
being expanded by Exim to a newline (without using \N or using \\n) and 
using \N so that PCRE transforms \n to a newline.

 Try:
   ${sg{$spam_report}{\N^.*\n\s*X-purgate-ID: (.*?)\n.*$\N}{\$1}}

Thanks, this works. But it only works because I know the exact format 
of $spam_report. How can I tell ${sg{}} to include \n among the 
characters matched by .*? I think in Perl this is done with the modifier /s.

Is there any way in Exim (besides embedding Perl) to extract a value like

$id = $1 if $spam_report =~ /(^|\n)X-purgate-ID: (.*?)(\n|$)/s ?

Kind regards
Marten

-- 
## List details at http://lists.exim.org/mailman/listinfo/exim-users 
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


Re: [exim] catching newlines with ${sg {}{}{}}

2008-06-20 Thread Phil Pennock
On 2008-06-20 at 14:48 +0200, Marten Lehmann wrote:
 I read this, but I don't understand what the difference is between \n 
 being expanded by Exim to a newline (without using \N or using \\n) and 
 using \N so that PCRE transforms \n to a newline.

In this case, not much.  Without \N, just be sure to also use \$ instead
of $ to anchor the end of the regexp, and so on.  \N removes the need for
one layer of backslashes, so it is generally less error-prone.
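
As a rough analogy only (Python's re module is not PCRE, and Python string literals are not Exim expansion), the same two layers exist in Python: an ordinary string literal is processed by the host language before the regex engine sees it, while a raw string hands the backslashes to the engine untouched, much as \N does. All names below are illustrative.

```python
import re

text = "X-purgate: Spam\nX-purgate-ID: 42\n"

# Ordinary string: Python itself turns \n into a newline before the regex
# engine sees it (comparable to Exim expanding \n outside \N).
expanded_by_host = "Spam\nX-purgate-ID"

# Raw string: the regex engine receives backslash + n and interprets it
# itself (comparable to writing the pattern between \N ... \N).
expanded_by_engine = r"Spam\nX-purgate-ID"

# Either way the pattern matches a real newline in the subject.
both_match = (re.search(expanded_by_host, text) is not None
              and re.search(expanded_by_engine, text) is not None)

# The difference shows up with escapes the host layer must not consume:
# without the raw string, every regex backslash has to be doubled.
doubled = "ID:\\s*(\\d+)"
raw = r"ID:\s*(\d+)"
same_pattern = doubled == raw
```

For \n the two routes end in the same match, which is why the answer here is "not much"; the raw form simply avoids having to double the backslashes.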

  Try:
${sg{$spam_report}{\N^.*\n\s*X-purgate-ID: (.*?)\n.*$\N}{\$1}}
 
 Thanks, this works. But it only works because I know the exact format 
 of $spam_report. How can I tell ${sg{}} to include \n among the 
 characters matched by .*? I think in Perl this is done with the modifier /s.

Embed the modifier at the start of the regexp with (?s) -- see manual
pages perlre(1) (for Perl's documentation) or pcrepattern(3) (comes with
later versions of pcre).  The latter describes this under INTERNAL
OPTION SETTING.
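
A small sketch of what the embedded (?s) option changes, written with Python's re module as a stand-in (an assumption: it is not PCRE, but it accepts the same inline (?s) syntax at the start of a pattern). The report text is made up for illustration.

```python
import re

report = "X-purgate: Spam\nX-purgate-ID: 42\nX-purgate-Ad: info\n"

# By default "." stops at a newline, so ".*" only covers the first line.
default_span = re.match(r".*", report).group(0)

# With (?s) at the start of the pattern, "." also matches newlines,
# so ".*" runs to the end of the string.
dotall_span = re.match(r"(?s).*", report).group(0)
```

Here default_span is only the first line, while dotall_span is the whole report.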

 Is there any way in Exim (besides embedding Perl) to extract a value like
 
 $id = $1 if $spam_report =~ /(^|\n)X-purgate-ID: (.*?)(\n|$)/s ?

Again, adding the \s* at the start so that X-purgate-ID: doesn't need to
be at the beginning of the line:

 
${sg{$spam_report}{\N(?s)^(?:.*\n|)\s*X-purgate-ID:\s+([^\n]+)(?:|\n.*)$\N}{\$1}}

The main reason it's longer is that sg is short for Perl's s///g, so
the pattern has to consume the lines which _don't_ match as well; you
can't get away with a conditional assignment as in your Perl example.
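
The effect of the one-regexp approach can be tried outside Exim. The following is a sketch using Python's re.sub as a stand-in for ${sg}, on the assumption that the two engines agree for this pattern; the report text is modelled on the sample in this thread, and Python writes the capture reference as \1 where Exim uses \$1.

```python
import re

# Hypothetical $spam_report contents, modelled on the sample in this thread.
report = (
    "X-purgate: Spam\n"
    "  X-purgate-ID: 150741::080616223818-6C9786C0-73CE72D8/2129941411-0/0-3\n"
    "  X-purgate-Ad: For more information about eXpurgate please visit\n"
    "http://www.expurgate.net/"
)

# Same pattern as in the ${sg} call above, without the \N protection.
PATTERN = r"(?s)^(?:.*\n|)\s*X-purgate-ID:\s+([^\n]+)(?:|\n.*)$"

# The pattern matches the whole string, so replacing the match with
# group 1 leaves only the ID.
extracted = re.sub(PATTERN, r"\1", report)

# A substitution is not a conditional: with no ID header present nothing
# matches, and the input comes back unchanged rather than empty.
no_id = "X-purgate: Clean\n"
unchanged = re.sub(PATTERN, r"\1", no_id)
```

The second call shows the cost of using a substitution for extraction: a report without the header is returned as-is, so the caller still has to tell the two cases apart.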

That's the one long regexp approach.  The closest you'll get to a
conditional is to use map-filter on lists generated by using newline as
a separator:

  ${map\
    {<\n ${filter {<\n $spam_report}{match{$item}{\N^X-purgate-ID:\N}}}}\
    {${sg{$item}{\N^[^:]+:\s*(.*)\N}{\$1}}}\
  }

The straight regexp is probably faster, since you enter the regexp
engine just once.
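
For comparison, the split, filter and map steps look like this as a Python sketch. It is only an illustration of the idea, not of Exim's list handling: purgate_ids is a hypothetical helper, the report text is made up from the sample in this thread, and the sketch strips whitespace from each line itself because the sample indents the header.

```python
import re

# Hypothetical $spam_report contents, modelled on the sample in this thread.
report = (
    "X-purgate: Spam\n"
    "  X-purgate-ID: 150741::080616223818-6C9786C0-73CE72D8/2129941411-0/0-3\n"
    "  X-purgate-Ad: For more information about eXpurgate please visit\n"
    "http://www.expurgate.net/"
)

def purgate_ids(text):
    """Return the value of every X-purgate-ID line in text."""
    # Split on newline (the list step) and drop surrounding whitespace.
    lines = [line.strip() for line in text.split("\n")]
    # Keep only the lines that start with the wanted header (the filter step).
    matching = [line for line in lines if re.search(r"^X-purgate-ID:", line)]
    # Strip "name:" and following whitespace from each kept line (the map step).
    return [re.sub(r"^[^:]+:\s*(.*)", r"\1", line) for line in matching]

ids = purgate_ids(report)
```

Unlike the single substitution, this yields an empty list when the header is missing, which is the closest equivalent to the conditional assignment in the Perl one-liner.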

Regards,
-Phil




[exim] catching newlines with ${sg {}{}{}}

2008-06-17 Thread Marten Lehmann
Hello,

I need to extract the value X-purgate-ID from $spam_report:

X-purgate: Spam
X-purgate-ID: 150741::080616223818-6C9786C0-73CE72D8/2129941411-0/0-3
X-purgate-Ad: For more information about eXpurgate please visit 
http://www.expurgate.net/

With real PCRE, the expression would look like this:

.*\nX-purgate-ID: (.*?)\n.*

whereas $1 would contain the id. Unfortunately, the sg expansion item 
does not seem to work with newlines.

It is easy to remove all lines but the first:

${sg{${sg{$spam_report}{X-purgate: }{}}}{\n.*}{}}

This returns: Spam
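
Why this two-step version works can be seen in a Python sketch (an assumption: Python's re.sub standing in for ${sg}, with a made-up report modelled on the sample above). Because "." does not match a newline by default, \n.* removes one following line per match, and the global replacement repeats that for every remaining line.

```python
import re

# Hypothetical $spam_report contents, modelled on the sample above.
report = (
    "X-purgate: Spam\n"
    "  X-purgate-ID: 150741::080616223818-6C9786C0-73CE72D8/2129941411-0/0-3\n"
    "  X-purgate-Ad: For more information about eXpurgate please visit\n"
    "http://www.expurgate.net/"
)

# Inner substitution: drop the header name from the first line.
without_name = re.sub(r"X-purgate: ", "", report)

# Outer substitution: each match is a newline plus the rest of that line,
# and the replacement is applied globally, so every later line is removed.
first_value = re.sub(r"\n.*", "", without_name)
```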

But all attempts to extract the id (that is, to remove everything before 
and after it) in one step failed. The only way that worked was this:

${sg{${sg{${sg{$spam_report}{X-purgate: .*\n}{}}}{X-purgate-Ad: .*}{}}}{.*X-purgate-ID: (.*)\n.*}{\$1}}

But it looks very ugly. Any ideas how this could be done more neatly?

Kind regards
Marten



Re: [exim] catching newlines with ${sg {}{}{}}

2008-06-17 Thread Jakob Hirsch
Marten Lehmann wrote:

 I need to extract the value X-purgate-ID from $spam_report:
 
 X-purgate: Spam
   X-purgate-ID: 150741::080616223818-6C9786C0-73CE72D8/2129941411-0/0-3
   X-purgate-Ad: For more information about eXpurgate please visit 
 http://www.expurgate.net/

Does this work?

${extract {X-purgate-ID:} {$spam_report}}




Re: [exim] catching newlines with ${sg {}{}{}}

2008-06-17 Thread Phil Pennock
On 2008-06-17 at 20:53 +0200, Marten Lehmann wrote:
 I need to extract the value X-purgate-ID from $spam_report:
 
 X-purgate: Spam
   X-purgate-ID: 150741::080616223818-6C9786C0-73CE72D8/2129941411-0/0-3
   X-purgate-Ad: For more information about eXpurgate please visit 
 http://www.expurgate.net/
 
 With real PCRE, the expression would look like this:

Exim uses real PCRE; Philip Hazel is the original author of both.

 .*\nX-purgate-ID: (.*?)\n.*
 
 whereas $1 would contain the id. Unfortunately, the sg expansion item 
 does not seem to work with newlines.

If you double-check the documentation on ${sg ...} then you'll see the
reminder:

----8< cut here >8----
   Because all three arguments are expanded before use,
if any $ or \ characters are required in the regular expression or in the
substitution string, they have to be escaped. For example:

${sg{abcdef}{^(...)(...)\$}{\$2\$1}}

yields defabc, and

${sg{1=A 4=D 3=C}{\N(\d+)=\N}{K\$1=}}

yields K1=A K4=D K3=C. Note the use of \N to protect the contents of
the regular expression from string expansion.
----8< cut here >8----
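
The two documentation examples can be checked against any Perl-style regex engine once the Exim-level escaping is removed. A sketch in Python follows (an assumption: Python's re module standing in for PCRE); note that Python writes capture references in the replacement as \1 and \2 where Exim's sg uses $1 and $2.

```python
import re

# ${sg{abcdef}{^(...)(...)\$}{\$2\$1}}
# After Exim's expansion the regex is ^(...)(...)$ and the groups are swapped.
swapped = re.sub(r"^(...)(...)$", r"\2\1", "abcdef")

# ${sg{1=A 4=D 3=C}{\N(\d+)=\N}{K\$1=}}
# \N only protects the pattern from Exim; the regex itself is (\d+)=
prefixed = re.sub(r"(\d+)=", r"K\1=", "1=A 4=D 3=C")
```

Both results agree with the values the documentation states: defabc and K1=A K4=D K3=C.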

Try:
  ${sg{$spam_report}{\N^.*\n\s*X-purgate-ID: (.*?)\n.*$\N}{\$1}}

Note the \s* to match the whitespace you showed above, the \N at each
end of the regex field, and the \$1, so that $1 is expanded by the
regex engine instead of being expanded by Exim as a variable before it
is used in the substitution pattern.

That is, it's perfectly fine to use $acl_m_foo as the substitution; Exim
expands that for you. So to use $1 for a regexp, you pass \$1.

Regards,
-Phil
