On Tue, 10 Mar 2009, deiva shanmugam wrote:
>     When a header containing control characters such as form feed/page
> eject (^L), vertical tab (^K), etc. is canonicalized using the relaxed
> canonicalization algorithm, the vertical tab is converted to a space.

^L and ^K are illegal in headers according to RFC 5322.  See section 2.2,
which limits header fields to printable US-ASCII plus space and
horizontal tab.

> But the RFC specifies only stripping/reducing WSP, unfolding of headers,
> etc. It says nothing about special characters. So, in that case,
> shouldn't canonicalization leave the control characters alone?
>
> Header passed to the canonicalization:
>
> Content-Type:*^L*multipart/alternative;*^K*
> boundary=^M"----=_NextPart_000_0000_JLZRZLJO.OULDWYUC"

The ^L and ^K are not allowed in headers (see above).  The ^M is only 
allowed as part of a CRLF that folds the header, which means the CRLF 
would have to be followed by whitespace (section 2.2.3).  In your 
example, it's not.
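
To make the rule concrete, here is a rough sketch of a checker for the 
characters RFC 5322 section 2.2 permits in a header field.  The function 
name find_illegal_chars is made up for illustration, and it assumes the 
header is passed in without its terminating CRLF.  It is not taken from 
any MTA or from the dkim-milter code.

```python
def find_illegal_chars(raw_header: str) -> list[tuple[int, str]]:
    """Return (offset, char) for each character not permitted in a header.

    Allowed: printable US-ASCII (33..126), SP, HTAB, and CRLF only when
    it is immediately followed by SP or HTAB (i.e. a fold).
    Assumes raw_header does not include the final terminating CRLF.
    """
    problems = []
    i = 0
    n = len(raw_header)
    while i < n:
        ch = raw_header[i]
        if ch == "\r":
            # CR is acceptable only as part of a folding CRLF + WSP.
            is_fold = (
                raw_header[i:i + 2] == "\r\n"
                and i + 2 < n
                and raw_header[i + 2] in " \t"
            )
            if is_fold:
                i += 2  # skip the CRLF; the WSP is checked next pass
                continue
            problems.append((i, ch))
        elif ch in " \t" or 33 <= ord(ch) <= 126:
            pass  # legal header character
        else:
            problems.append((i, ch))  # e.g. ^L, ^K, bare LF, 8-bit data
        i += 1
    return problems


# A header similar to the one in the question, but with a valid fold.
sample = 'Content-Type:\x0cmultipart/alternative;\x0b\r\n boundary=\r"x"'
for offset, ch in find_illegal_chars(sample):
    print(offset, repr(ch))
```

On that sample it flags the ^L, the ^K and the bare ^M, but not the 
CRLF that is followed by a space.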

> Result of relaxed canonicalization using sendmail's dkim code:
>
> content-type:multipart/alternative;
> boundary=^M"----=_NextPart_000_0000_JLZRZLJO.OULDWYUC"
>
> So, what is the procedure that needs to be followed when control 
> characters are encountered?

It's unspecified, since the standard doesn't allow them.

I imagine the MTA is treating the ^L as whitespace separating the header 
field name from its value, so it just gets dropped; it's discarding the ^K 
since it's not allowed; and it's keeping the ^M because the next line 
isn't really a new header, so it must be a continuation of the previous 
one.
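
For comparison, here is a minimal sketch of the relaxed header 
canonicalization steps as RFC 4871 section 3.4.2 describes them, with 
WSP taken strictly as SP and HTAB.  The name relaxed_header is 
hypothetical and this is not the sendmail/dkim-milter implementation. 
It only shows that a literal reading of the steps leaves other control 
characters untouched, which is why their handling is 
implementation-dependent.

```python
import re


def relaxed_header(raw: str) -> str:
    """Apply the relaxed header canonicalization steps to one header field.

    Only SP and HTAB are treated as WSP; any other control character is
    passed through unchanged because the steps say nothing about it.
    """
    if raw.endswith("\r\n"):
        raw = raw[:-2]  # drop the terminating CRLF; re-added at the end
    name, _, value = raw.partition(":")
    # Unfold: remove each CRLF that is followed by WSP.
    value = re.sub(r"\r\n(?=[ \t])", "", value)
    # Collapse each run of WSP to a single SP.
    value = re.sub(r"[ \t]+", " ", value)
    # Lowercase the name and remove WSP around the colon and at the end
    # of the value.
    return name.strip(" \t").lower() + ":" + value.strip(" \t") + "\r\n"


folded = 'Content-Type: \t multipart/alternative;\r\n\tboundary="x"  '
print(repr(relaxed_header(folded)))

with_controls = "Content-Type:\x0cmultipart/alternative;\x0b"
print(repr(relaxed_header(with_controls)))
```

An implementation that tests for whitespace more broadly (for example 
with something like C's isspace(), which also matches ^L and ^K) would 
give different output for the second header, which may be what is 
happening here.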

------------------------------------------------------------------------------
_______________________________________________
dkim-milter-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dkim-milter-discuss
