Re: [exim] From header with encoding not parsed?

2023-04-13 Thread Jeremy Harris via Exim-users

On 13/04/2023 23:24, Martin D Kealey via Exim-users wrote:
> On Thu, 13 Apr 2023 at 19:36, Slavko wrote in
> exim-users@exim.org:
>
>> On 12 April 2023 16:50:29 UTC, MRob via Exim-users
>> <exim-users@exim.org> wrote:
>>> Hi, I have a variable to extract the email address in from header set
>>> like this:
>>>
>>> ${lc:${address:$h_From:}}
>>
>> Header is valid, but after decoding it contains comma without
>> quotes, the comma is address separator and thus results in
>> list of two "addresses", first without valid address, thus empty...
>
> My take on this is that Exim is wrong there.
>
> Anywhere else, splitting addresses on commas happens before decoding, and
> this should be no different.

Uh, it's only a list if and when you use that string (the result of that
expansion) where a list is expected.  And the list separator is also defined
by the context.

I don't agree with "Exim is wrong there".

--
Cheers,
  Jeremy


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


Re: [exim] From header with encoding not parsed?

2023-04-13 Thread Martin D Kealey via Exim-users
On Thu, 13 Apr 2023 at 19:36, Slavko  wrote in
exim-users@exim.org:

> On 12 April 2023 16:50:29 UTC, MRob via Exim-users
> <exim-users@exim.org> wrote:
> > Hi, I have a variable to extract the email address in from header set
> > like this:
> >
> > ${lc:${address:$h_From:}}
>
> Header is valid, but after decoding it contains comma without
> quotes, the comma is address separator and thus results in
> list of two "addresses", first without valid address, thus empty...
>

My take on this is that Exim is wrong there.

Anywhere else, splitting addresses on commas happens before decoding, and
this should be no different.

One way to do that would be to treat encoded characters as if they were
quoted.
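
The idea of treating encoded characters as if they were quoted can be sketched outside Exim. The following is a minimal Python illustration, not Exim code: `decode_as_quoted` is a hypothetical helper that handles only Q-encoded words, `sales@example.com` is a placeholder because the real address is not shown in this thread, and Python's `email.utils.parseaddr` merely stands in for `${address:...}`.

```python
import re
from email.utils import parseaddr

# Matches a Q-encoded RFC 2047 encoded-word: =?charset?Q?text?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[Qq]\?([^?]*)\?=")
# The RFC 5322 "specials" that may not appear bare in a display name.
SPECIALS = set('()<>[]:;@\\,."')


def decode_word(match):
    """Decode one Q-encoded word ("_" is a space, "=XX" is a hex byte)."""
    charset, text = match.group(1), match.group(2).replace("_", " ")
    raw = re.sub(r"=([0-9A-Fa-f]{2})", lambda h: chr(int(h.group(1), 16)), text)
    return raw.encode("latin-1").decode(charset)


def decode_as_quoted(value):
    """Decode encoded-words, wrapping any result that contains a special
    character in double quotes so a later address parse is not confused."""

    def replace(match):
        word = decode_word(match)
        if SPECIALS.intersection(word):
            escaped = word.replace("\\", "\\\\").replace('"', '\\"')
            return '"' + escaped + '"'
        return word

    return ENCODED_WORD.sub(replace, value)


# Placeholder header: the address part is an assumption for illustration.
header = "=?utf-8?Q?My=20Bizness=2C=20Inc.?= <sales@example.com>"
safe = decode_as_quoted(header)
name, address = parseaddr(safe)
print(safe)
print(address)
```

With the decoded comma inside quotes, the parser sees a single mailbox rather than two comma-separated fragments.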

-Martin


Re: [exim] From header with encoding not parsed?

2023-04-13 Thread Jeremy Harris via Exim-users

On 13/04/2023 09:54, Victor Ustugov via Exim-users wrote:

> I'm not talking about what should be encoded, but about what can be
> received in a real email from a spammer, some kind of script or
> something like that.


A mail sender could send you *anything*.
--
Cheers,
  Jeremy




Re: [exim] From header with encoding not parsed?

2023-04-13 Thread Victor Ustugov via Exim-users
Slavko via Exim-users wrote on 12.04.2023 22:38:

>>>> Hi, I have a variable to extract the email address in from header set like
>>>> this:
>>>>
>>>> ${lc:${address:$h_From:}}
>>>
>>> Header is valid, but after decoding it contains comma without
>>> quotes, the comma is address separator and thus results in
>>> list of two "addresses", first without valid address, thus empty...
>>>
>>> Use raw header for address extracting -- $rh_From: that works
>>> for both, quoted and encoded content...
>>
>>
>> What about the colon without encoding?
>>
>> From: =?utf-8?Q?My=20Bizness:=20Inc.?=
>
> AFAIK the colon has to be encoded; quoting RFC 2047, section
> 5 (for From: and similar headers):
>
> characters that may be used in a "Q"-encoded 'encoded-word' is
> restricted to: <upper and lower case ASCII letters, decimal digits,
> "!", "*", "+", "-", "/", "=", and "_">

I'm not talking about what should be encoded, but about what can be
received in a real email from a spammer, some kind of script or
something like that.


-- 
Best wishes Victor Ustugov
mailto:vic...@corvax.kiev.ua
public GnuPG/PGP key: https://victor.corvax.kiev.ua/corvax.asc



Re: [exim] From header with encoding not parsed?

2023-04-13 Thread Victor Ustugov via Exim-users
Jasen Betts via Exim-users wrote on 13.04.2023 10:07:
> On 2023-04-12, Victor Ustugov via Exim-users  wrote:
>> Slavko via Exim-users wrote on 12.04.2023 20:42:
>>> On 12 April 2023 16:50:29 UTC, MRob via Exim-users wrote:
>>>> Hi, I have a variable to extract the email address in from header set like
>>>> this:
>>>>
>>>> ${lc:${address:$h_From:}}
>>>
>>> Header is valid, but after decoding it contains comma without
>>> quotes, the comma is address separator and thus results in
>>> list of two "addresses", first without valid address, thus empty...
>>>
>>> Use raw header for address extracting -- $rh_From: that works
>>> for both, quoted and encoded content...
>>
>>
>> What about the colon without encoding?
>>
>> From: =?utf-8?Q?My=20Bizness:=20Inc.?= 
> 
> Yes, the colon breaks it. It's not a valid From header.

I know. But email clients correctly display the From header shown above.
And it is quite possible to get such a header in an incoming email.

> RFC 5322 is a bit of a rabbit hole to dive into,
> but the short story is that none of these should be used in "bare" names:
>
>   specials    =   "(" / ")" /        ; Special characters that do
>                   "<" / ">" /        ;  not appear in atext
>                   "[" / "]" /
>                   ":" / ";" /
>                   "@" / "\" /
>                   "," / "." /
>                   DQUOTE
>
> except where specific permission is given.
> 
> 
> Easiest fix for the sender is to use quotes.
> 
> From: "=?utf-8?Q?My=20Bizness:=20Inc.?=" 

In order to insert double quotes, I need to separate the From header
into the address and the part of the header that comes before it. Why do
I need to add quotes if I have already determined which part of the
header is the address?


-- 
Best wishes Victor Ustugov
mailto:vic...@corvax.kiev.ua
public GnuPG/PGP key: https://victor.corvax.kiev.ua/corvax.asc



Re: [exim] From header with encoding not parsed?

2023-04-13 Thread Jasen Betts via Exim-users
On 2023-04-12, Victor Ustugov via Exim-users  wrote:
> Slavko via Exim-users wrote on 12.04.2023 20:42:
>> On 12 April 2023 16:50:29 UTC, MRob via Exim-users wrote:
>>> Hi, I have a variable to extract the email address in from header set like 
>>> this:
>>>
>>> ${lc:${address:$h_From:}}
>> 
>> Header is valid, but after decoding it contains comma without
>> quotes, the comma is address separator and thus results in
>> list of two "addresses", first without valid address, thus empty...
>> 
>> Use raw header for address extracting -- $rh_From: that works
>> for both, quoted and encoded content...
>
>
> What about the colon without encoding?
>
> From: =?utf-8?Q?My=20Bizness:=20Inc.?= 

Yes, the colon breaks it. It's not a valid From header.

RFC 5322 is a bit of a rabbit hole to dive into,
but the short story is that none of these should be used in "bare" names:

  specials    =   "(" / ")" /        ; Special characters that do
                  "<" / ">" /        ;  not appear in atext
                  "[" / "]" /
                  ":" / ";" /
                  "@" / "\" /
                  "," / "." /
                  DQUOTE

except where specific permission is given.


Easiest fix for the sender is to use quotes.

From: "=?utf-8?Q?My=20Bizness:=20Inc.?=" 
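
For a sender that builds its own From header, the rule above can be checked mechanically. This is a small Python sketch under stated assumptions: the helper names `needs_quoting` and `quote_display_name` are hypothetical, the specials set is copied from the RFC 5322 list quoted above, and the example applies the check to a plain ASCII display name rather than to an encoded-word.

```python
# The RFC 5322 "specials" listed above.
SPECIALS = frozenset('()<>[]:;@\\,."')


def needs_quoting(display_name: str) -> bool:
    """True if the display name contains any RFC 5322 special character."""
    return any(ch in SPECIALS for ch in display_name)


def quote_display_name(display_name: str) -> str:
    """Wrap the name in double quotes when required, escaping any
    backslash or double quote it already contains."""
    if not needs_quoting(display_name):
        return display_name
    escaped = display_name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


print(quote_display_name("My Bizness: Inc."))
print(quote_display_name("My Bizness"))
```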

-- 
 Jasen.
  Слава Україні



Re: [exim] From header with encoding not parsed?

2023-04-12 Thread MRob via Exim-users
@Jeremy I think location is no problem, since the address is successfully
extracted in most cases. The only problem is this one, because of the
encoded comma.

> Simply put that line in some file and try it yourself with -bem, eg:

Thank you Slavko, so I will not bother the list with that kind of question!

>> ${address:} expansion is following RFC 2822... so maybe its ok and the
>> importance is $h_ should never be used with ${address:} because that
>> address expansion will decode it anyway??
>
> Hard to say, headers can be broken (by mistake or on purpose)
> in many ways

>> Also question about $h_ decoding, I dont remember if quoting is required
>> if it is encoded like my example. Is the example a invalid header because
>> it needs quoting? Or is the problem that i'm using two unrelated steps for
>> full parsing? ($h_ then ${address:})

It looks like RFC 2822 requires quotes when there is a comma in the
display-name, but it doesn't talk about the case where encoding is used on
the display-name, so I still don't know if it's a valid header. My guess is
that one is required to decode and then parse as a normal non-encoded
RFC 2822 header, and thus this header is not valid?

If that's right, then using $h_ to do the decoding and then ${address:} to
parse and extract the address is OK, even though it's two separate
operations. Otherwise, Exim would need an operation that does both at once.

(Does ${address:} do decoding or only parsing?)



Re: [exim] From header with encoding not parsed?

2023-04-12 Thread Slavko via Exim-users
On 12 April 2023 19:15:19 UTC, MRob via Exim-users wrote:
>On 2023-04-12 17:42, Slavko via Exim-users wrote:

>> Use raw header for address extracting -- $rh_From: that works
>> for both, quoted and encoded content...
>
>If using rh_From: is there risk to get tricked with header like:
>
>From: "spammer_addr...@example.bad" 

Simply put that line in some file and try it yourself with -bem, eg:

exim -bem /file/with/that_header '${address:$rh_From:}'
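
When several candidate headers need checking, preparing the test file can be scripted. The Python sketch below only writes the file and builds the same command line as above; `build_bem_command` and the file name `test_message.eml` are hypothetical, the address in the example header is a placeholder, and actually running the command assumes an installed `exim` binary.

```python
import os
import tempfile


def build_bem_command(header_line, expansion, directory):
    """Write a one-header test message and return the argv list for
    `exim -bem <file> <expansion>` (hypothetical helper)."""
    path = os.path.join(directory, "test_message.eml")
    with open(path, "w", encoding="ascii") as handle:
        # A header, a blank line, then a trivial body.
        handle.write(header_line + "\n\nbody\n")
    return ["exim", "-bem", path, expansion]


with tempfile.TemporaryDirectory() as workdir:
    command = build_bem_command(
        "From: =?utf-8?Q?My=20Bizness=2C=20Inc.?= <sales@example.com>",
        "${address:$rh_From:}",
        workdir,
    )
    print(command[0], command[1], "<file>", command[3])
    # To run it for real (only where exim is installed):
    # import subprocess
    # print(subprocess.run(command, capture_output=True, text=True).stdout)
```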

>${address:} expansion is following RFC 2822... so maybe its ok and the 
>importance is $h_ should never be used with ${address:} because that address 
>expansion will decode it anyway??

Hard to say, headers can be broken (by mistake or on purpose)
in many ways. One usually does not need to look into From: headers
from foreign sources, but one will want e.g. to extract the domain from
them for the DKIM signature (with DMARC in mind) on one's own messages,
and thus ensure a valid From: header on the MSA with in-depth inspection.

I delegate in depth message inspection to rspamd, with
some exceptions -- mostly Subject: and attachments (eg. for
DMARC reports extraction/routing).

>Also question about $h_ decoding, I dont remember if quoting is required if it 
>is encoded like my example. Is the example a invalid header because it needs 
>quoting? Or is the problem that i'm using two unrelated steps for full 
>parsing? ($h_ then ${address:})

The RFC defines when quotes are required, and the "@" is one of those
cases. Exim properly checks that syntax with the control=verifyXY
ACL condition (sorry, I forgot the exact name).

AFAIK, the name part is either quoted (for ASCII only) or
encoded (for non-ASCII). But I often see encoded ASCII-only
chars (rspamd detects that), often in legitimate
messages...

BTW, I am always surprised how problematic non-ASCII
things are. My first bigger computer project was to teach a computer
to print the chars nowadays known as Latin2 & Cyrillic (in 1984 :-) ).
Nowadays it is no problem to print/show them, but...

regards


-- 
Slavko
https://www.slavino.sk/



Re: [exim] From header with encoding not parsed?

2023-04-12 Thread Jeremy Harris via Exim-users

On 12/04/2023 17:50, MRob via Exim-users wrote:

> Hi, I have a variable to extract the email address in from header set like this:
>
> ${lc:${address:$h_From:}}
>
> But it comes out blank(empty) given a "from" header like this one:
>
> From: =?utf-8?Q?My=20Bizness=2C=20Inc.?=
>
> I think that's a valid header? Did I do something wrong please? Thanks!

You didn't say where you are trying to do that expansion.
If it's before the data phase, the headers have not yet been received.

--
Cheers,
  Jeremy




Re: [exim] From header with encoding not parsed?

2023-04-12 Thread Slavko via Exim-users
On 12 April 2023 18:43:09 UTC, Victor Ustugov via Exim-users wrote:
>Slavko via Exim-users wrote on 12.04.2023 20:42:
>> On 12 April 2023 16:50:29 UTC, MRob via Exim-users wrote:
>>> Hi, I have a variable to extract the email address in from header set like 
>>> this:
>>>
>>> ${lc:${address:$h_From:}}
>> 
>> Header is valid, but after decoding it contains comma without
>> quotes, the comma is address separator and thus results in
>> list of two "addresses", first without valid address, thus empty...
>> 
>> Use raw header for address extracting -- $rh_From: that works
>> for both, quoted and encoded content...
>
>
>What about the colon without encoding?
>
>From: =?utf-8?Q?My=20Bizness:=20Inc.?= 

AFAIK the colon has to be encoded; quoting RFC 2047, section
5 (for From: and similar headers):

characters that may be used in a "Q"-encoded 'encoded-word' is
restricted to: <upper and lower case ASCII letters, decimal digits,
"!", "*", "+", "-", "/", "=", and "_">
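
RFC 2047 section 5 limits the characters in a "Q"-encoded word used in a phrase (such as a display name) to ASCII letters, decimal digits and `!*+-/=_`, so a check for offending characters is straightforward. A minimal Python sketch, with `disallowed_chars` being a hypothetical helper name:

```python
import string

# Characters RFC 2047 section 5 permits in a Q-encoded word inside a phrase.
ALLOWED_IN_PHRASE = frozenset(string.ascii_letters + string.digits + "!*+-/=_")


def disallowed_chars(encoded_text: str) -> list:
    """Return the sorted, de-duplicated characters of the encoded-text part
    (between "?Q?" and "?=") that fall outside the permitted set."""
    return sorted({ch for ch in encoded_text if ch not in ALLOWED_IN_PHRASE})


print(disallowed_chars("My=20Bizness:=20Inc."))      # ['.', ':']
print(disallowed_chars("My=20Bizness=3A=20Inc=2E"))  # []
```

By this reading the bare "." is outside the permitted set as well as the ":", so a strictly conforming sender would encode both.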

regards


-- 
Slavko
https://www.slavino.sk/



Re: [exim] From header with encoding not parsed?

2023-04-12 Thread MRob via Exim-users

On 2023-04-12 17:42, Slavko via Exim-users wrote:

> On 12 April 2023 16:50:29 UTC, MRob via Exim-users wrote:
>> Hi, I have a variable to extract the email address in from header set
>> like this:
>>
>> ${lc:${address:$h_From:}}
>
> Header is valid, but after decoding it contains comma without
> quotes, the comma is address separator and thus results in
> list of two "addresses", first without valid address, thus empty...
>
> Use raw header for address extracting -- $rh_From: that works
> for both, quoted and encoded content...

Thank you Slavko!

If using rh_From: is there risk to get tricked with header like:

From: "spammer_addr...@example.bad" 

${address:} expansion is following RFC 2822... so maybe its ok and the 
importance is $h_ should never be used with ${address:} because that 
address expansion will decode it anyway??

Also question about $h_ decoding, I dont remember if quoting is required 
if it is encoded like my example. Is the example a invalid header 
because it needs quoting? Or is the problem that i'm using two unrelated 
steps for full parsing? ($h_ then ${address:})




Re: [exim] From header with encoding not parsed?

2023-04-12 Thread Victor Ustugov via Exim-users
Slavko via Exim-users wrote on 12.04.2023 20:42:
> On 12 April 2023 16:50:29 UTC, MRob via Exim-users wrote:
>> Hi, I have a variable to extract the email address in from header set like 
>> this:
>>
>> ${lc:${address:$h_From:}}
> 
> Header is valid, but after decoding it contains comma without
> quotes, the comma is address separator and thus results in
> list of two "addresses", first without valid address, thus empty...
> 
> Use raw header for address extracting -- $rh_From: that works
> for both, quoted and encoded content...


What about the colon without encoding?

From: =?utf-8?Q?My=20Bizness:=20Inc.?= 


-- 
Best wishes Victor Ustugov
mailto:vic...@corvax.kiev.ua
public GnuPG/PGP key: https://victor.corvax.kiev.ua/corvax.asc



Re: [exim] From header with encoding not parsed?

2023-04-12 Thread Slavko via Exim-users
On 12 April 2023 16:50:29 UTC, MRob via Exim-users wrote:
>Hi, I have a variable to extract the email address in from header set like 
>this:
>
>${lc:${address:$h_From:}}

Header is valid, but after decoding it contains comma without
quotes, the comma is address separator and thus results in
list of two "addresses", first without valid address, thus empty...

Use raw header for address extracting -- $rh_From: that works
for both, quoted and encoded content...
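
The difference between parsing the decoded and the raw header can be reproduced outside Exim. What follows is a minimal Python sketch, not Exim's implementation: `decode_q_words` is a hypothetical helper that handles only Q-encoded words, `sales@example.com` is a placeholder because the real address is not shown in this thread, and `email.utils.parseaddr` stands in for `${address:...}`.

```python
import re
from email.utils import parseaddr

# Matches a Q-encoded RFC 2047 encoded-word: =?charset?Q?text?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[Qq]\?([^?]*)\?=")


def decode_q_words(value):
    """Decode Q-encoded words in a header value, roughly what reading the
    header through a decoding accessor would do (hypothetical helper)."""

    def decode(match):
        charset, text = match.group(1), match.group(2).replace("_", " ")
        raw = re.sub(r"=([0-9A-Fa-f]{2})",
                     lambda h: chr(int(h.group(1), 16)), text)
        return raw.encode("latin-1").decode(charset)

    return ENCODED_WORD.sub(decode, value)


# Placeholder header: the address part is an assumption for illustration.
raw_header = "=?utf-8?Q?My=20Bizness=2C=20Inc.?= <sales@example.com>"

# Decode first, then parse: the bare comma now splits the display name.
decoded = decode_q_words(raw_header)
address_from_decoded = parseaddr(decoded)[1]

# Parse the raw value: the encoded word is a single atom with no comma.
address_from_raw = parseaddr(raw_header)[1]

print(repr(decoded))
print(repr(address_from_decoded))  # not the mailbox
print(repr(address_from_raw))      # the mailbox
```

The raw form parses cleanly because the encoded word contains no special characters; the comma only appears once decoding has happened.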

regards


-- 
Slavko
https://www.slavino.sk/



[exim] From header with encoding not parsed?

2023-04-12 Thread MRob via Exim-users
Hi, I have a variable to extract the email address in from header set 
like this:


${lc:${address:$h_From:}}

But it comes out blank(empty) given a "from" header like this one:

From: =?utf-8?Q?My=20Bizness=2C=20Inc.?= 

I think that's a valid header? Did I do something wrong please? Thanks!
