On 9 Feb 2011, at 17:16, Brett Tate wrote:

>> From: [email protected] [sip-implementors-
>> [email protected]] On Behalf Of Saúl Ibarra Corretgé
>> [[email protected]]
>> 
>> I've been working on Unicode support in SIP and some doubts just
>> arose. Let's say we have the following SIP URIs:
>> 
>> a) sip:me@saúl.com
>> b) sip:saú[email protected]
>> 
>> For case a), should I encode the domain part using Punycode, as is
>> done for email and web domains?
>> _______________________________________________
>> 
>> Within SIP, all characters are encoded via UTF-8.  In regard to URIs,
>> they are sequences of characters drawn from the Unicode character set.
>> So when either of those URIs is sent on the wire, it would be
>> represented by the octets 73 (s) 61 (a) C3 BA (ú) 6C (l) (unless I've
>> made a mistake).
>> 
>> In some contexts in SIP, in some parts of a URI, a character can be
>> represented by an escape %xx.  That format can only represent
>> characters with values x00 to xFF, although all the characters in your
>> URIs are in that range.  Using escapes allows other sequences of octets
>> to represent these URIs.
> 
> The last paragraph of rfc3261 section 19.1.2 reflects that 
> escaping/unescaping host is not appropriate.
> 
> "Note that character escaping is not allowed in the host component of a SIP 
> or SIPS URI (the % character is not valid in its expansion). This is likely 
> to change in the future as requirements for Internationalized Domain Names 
> are finalized.  Current implementations MUST NOT attempt to improve 
> robustness by treating received escaped characters in the host component as 
> literally equivalent to their unescaped counterpart.  The behavior required 
> to meet the requirements of IDN may be significantly different."
> 
> RFC 5891 provides some details concerning Internationalized Domain Names.
> 
Agree. And to clarify a bit more:

It is important to note that, in my view, you should never store the view 
version, like "olle@blåbärsmjölk.se", in infrastructure databases, nor handle it 
in protocol messages. Software that shows URIs to users and accepts 
user input has to translate between the network version and the view 
version. Software like Asterisk may accept 

dial(sip/olle@blåbärsmjölk.se) 

in the dialplan text, but must not use it in any SIP message, nor store it in the 
CDR. The URI used will be sip:[email protected].

(Blåbärsmjölk = blueberry milk. A word that contains the three "special" 
characters we use in Sweden.)
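
To make the translation between the view version and the network version 
concrete, here is a minimal sketch in Python. The function names are 
hypothetical, and the URI helper only handles a plain "sip:user@host" form 
(no port, parameters or headers). It also assumes Python's built-in "idna" 
codec is good enough for illustration: that codec implements the older 
IDNA 2003 rules (RFC 3490), not RFC 5891, so a real implementation may want 
a proper IDNA 2008 library instead.

```python
def to_network_host(view_host: str) -> str:
    """Convert a host name as shown to users into its ASCII (xn--) form."""
    # Assumption: the built-in "idna" codec (IDNA 2003, RFC 3490) is used
    # here for illustration; it is not an RFC 5891 implementation.
    return view_host.encode("idna").decode("ascii")


def to_view_host(network_host: str) -> str:
    """Convert an ASCII (xn--) host name back into the form shown to users."""
    return network_host.encode("ascii").decode("idna")


def to_network_uri(view_uri: str) -> str:
    """Hypothetical helper: translate only the host of 'sip:user@host'.

    The user part is left untouched; ports, parameters and headers are
    not handled in this sketch.
    """
    scheme, rest = view_uri.split(":", 1)
    user, host = rest.rsplit("@", 1)
    return f"{scheme}:{user}@{to_network_host(host)}"


if __name__ == "__main__":
    # What the user typed or sees on screen:
    view = "sip:olle@blåbärsmjölk.se"
    # What would go into SIP messages, databases and CDRs:
    print(to_network_uri(view))
```

The point of keeping the two functions separate is that only the user 
interface layer ever calls them; everything behind it sees the ASCII form.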

/O
_______________________________________________
Sip-implementors mailing list
[email protected]
https://lists.cs.columbia.edu/cucslists/listinfo/sip-implementors