2010/12/9 Kevin P. Fleming <kpflem...@digium.com>:
> ISO-8859 isn't specific enough; there are 16 subsections of ISO-8859,
> with different encodings. The character you are trying to represent has
> different encodings in many of them.

Yes :(


> In SMTP there is some sort of syntax that can be used to specify the
> character encoding of the display name portion of a header string... but
> I don't know if that's allowed in SIP or not. Based on the ABNF you've
> posted above it's clearly not allowed.

It's not allowed, sure.

The problem is the following:

Currently my parser applies the official ABNF grammar (RFC 3261) to
unknown header values:

 unknown-header   =  header-name HCOLON header-value CRLF

 header-value      =  *(TEXT-UTF8char / UTF8-CONT / LWS)
 TEXT-UTF8char   =  %x21-7E / UTF8-NONASCII
 UTF8-NONASCII   =  %xC0-DF 1UTF8-CONT
                       /  %xE0-EF 2UTF8-CONT
                       /  %xF0-F7 3UTF8-CONT
                       /  %xF8-Fb 4UTF8-CONT
                       /  %xFC-FD 5UTF8-CONT
 UTF8-CONT       =  %x80-BF
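
To make the effect of the strict rule concrete, here is a rough Python
sketch of it as a byte-level regular expression. This is not my actual
parser: the names HEADER_VALUE and is_strict_header_value are made up
for the example, and LWS is expanded by hand from its RFC 3261
definition, [*WSP CRLF] 1*WSP, into "WSP, or CRLF followed by WSP".

```python
import re

# UTF8-NONASCII: a lead byte followed by a fixed number of UTF8-CONT bytes.
UTF8_NONASCII = (
    rb"(?:[\xC0-\xDF][\x80-\xBF]"
    rb"|[\xE0-\xEF][\x80-\xBF]{2}"
    rb"|[\xF0-\xF7][\x80-\xBF]{3}"
    rb"|[\xF8-\xFB][\x80-\xBF]{4}"
    rb"|[\xFC-\xFD][\x80-\xBF]{5})"
)

# header-value = *(TEXT-UTF8char / UTF8-CONT / LWS)
# A run of LWS is written here as repeated "WSP" or "CRLF WSP" tokens.
HEADER_VALUE = re.compile(
    rb"(?:[\x21-\x7E]"          # visible ASCII part of TEXT-UTF8char
    rb"|" + UTF8_NONASCII +     # multi-byte part of TEXT-UTF8char
    rb"|[\x80-\xBF]"            # bare UTF8-CONT
    rb"|[ \t]"                  # WSP
    rb"|\r\n[ \t]"              # folding: CRLF followed by WSP
    rb")*"
)

def is_strict_header_value(value: bytes) -> bool:
    """True if value (without the final CRLF) matches the strict rule."""
    return HEADER_VALUE.fullmatch(value) is not None
```

With this, b"I\xc3\xb1aki" (UTF-8) and b"foo\r\n bar" (folded) are
accepted, while b"I\xf1aki" (the same name in ISO-8859-1) is rejected
because 0xF1 is read as a lead byte that needs three continuation bytes.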


I've relaxed it:

  unknown-header   =  header-name HCOLON header-value CRLF
  header-value   = ( any )*


However, this makes the parser wrong in some cases, for example when a
custom header value contains line folding. The official grammar (above)
handles folding correctly through LWS. So I need a "mix": something less
strict than the official BNF that still does not break well-formed
headers (even exotic ones).
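
One possible shape for such a "mix", as a minimal Python sketch and only
under my own assumptions: accept any byte in the value, but keep the one
structural rule that matters, namely that a CRLF followed by SP or HTAB
is folding and any other CRLF ends the header. The function names
split_header_value and unfold are invented for this example, and the
sketch treats a CRLF at the very end of the buffer as a terminator.

```python
import re

def split_header_value(buf: bytes, start: int = 0):
    """Return (raw_value, next_offset), or None if no CRLF terminator.

    The value may contain arbitrary bytes. It ends at the first CRLF
    that is not followed by SP or HTAB.
    """
    pos = start
    while True:
        eol = buf.find(b"\r\n", pos)
        if eol == -1:
            return None              # incomplete header, need more data
        nxt = eol + 2
        if nxt < len(buf) and buf[nxt] in b" \t":
            pos = nxt                # folded line: keep scanning
            continue
        return buf[start:eol], nxt

def unfold(value: bytes) -> bytes:
    """Replace each folding sequence with a single space."""
    return re.sub(rb"[ \t]*\r\n[ \t]+", b" ", value)
```

For b"I\xf1aki\r\n more\r\nNext: 1\r\n" this yields the raw value
b"I\xf1aki\r\n more" and continues at "Next", so a folded value with
non-UTF-8 bytes is kept whole. Any value that matches the official
grammar is split at the same place, but bare control characters are
also let through, which may or may not be acceptable.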


Thanks a lot.

-- 
Iñaki Baz Castillo
<i...@aliax.net>

_______________________________________________
Sip-implementors mailing list
Sip-implementors@lists.cs.columbia.edu
https://lists.cs.columbia.edu/cucslists/listinfo/sip-implementors
