Yves Dorfsman writes:
>Hallvard B Furuseth wrote:
>> Yves Dorfsman writes:
>>> Is there any reason for not amending the LDIF RFC to accept utf-8 chars
>>> without base64 encoding ?
>>
>> It does allow that.
>
> Really ?

Whoops...  I seem to remember it was intended to be allowed.  Something
like "MUST accept UTF-8 input but possibly SHOULD output ASCII".  I do
remember there were some UTF-8-related changes in the syntax in the
last drafts for the RFC.

> There is one paragraph which is confusing in the RFC
> (http://www.ietf.org/rfc/rfc2849.txt):
>
>    4)  Any dn or rdn that contains characters other than those
>        defined as "SAFE-UTF8-CHAR", or begins with a character other
>        than those defined as "SAFE-INIT-UTF8-CHAR", above, MUST be
>        base-64 encoded.  Other values MAY be base-64 encoded.  Any
>        value that contains characters other than those defined as
>        "SAFE-CHAR", or begins with a character other than those
>        defined as "SAFE-INIT-CHAR", above, MUST be base-64 encoded.
>        Other values MAY be base-64 encoded.
>
> But then, SAFE-UTF8-CHAR is not defined anywhere, and then:

OTOH note #2 ends with

   2)  (...) Implementations SHOULD NOT fold lines in the middle of
       a multi-byte UTF-8 character.

which is a pretty pointless instruction if UTF-8 is not allowed.

> So then, shouldn't the RFC make that clear ? When I researched this, I
> remember seeing that one of the LDAP servers (either the Oracle or the IBM,
> can't remember) added an extension to LDIF, and accepted a "charset:" tag.
> Maybe we should add that to the RFC, or these days, just make it utf-8 ?

Sounds like a plan.  The RFC needs some other fixes too.  Of course,
that needs Someone(TM) to do the work of updating it and pushing it
through the IETF process.  (Discussion would occur on the
[email protected] list.)

-- 
Hallvard
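
PS: For reference, here is a rough Python sketch of the second half of
the quoted note 4 (the value rule), as I read the SAFE-CHAR and
SAFE-INIT-CHAR definitions in RFC 2849.  The function names
needs_base64 and ldif_line are made up for illustration; they are not
from any existing LDIF library.  It ignores line folding and the
recommendation about trailing spaces, and it assumes the value is
handed over as a Python string that gets encoded as UTF-8.

```python
import base64


def needs_base64(value: bytes) -> bool:
    """Return True if RFC 2849 note 4 requires base-64 for this value.

    Assumption: SAFE-CHAR is any byte <= 0x7F except NUL, LF and CR;
    SAFE-INIT-CHAR additionally excludes SPACE, ':' and '<'.
    """
    if not value:
        return False
    # The first byte has the stricter SAFE-INIT-CHAR rule.
    if value[0] in b" :<":
        return True
    # Every byte must be SAFE-CHAR; any non-ASCII (UTF-8) byte fails.
    return any(b in (0x00, 0x0A, 0x0D) or b > 0x7F for b in value)


def ldif_line(attr: str, value: str) -> str:
    """Format one attribute line, using 'attr:: <base64>' when required."""
    raw = value.encode("utf-8")
    if needs_base64(raw):
        return f"{attr}:: {base64.b64encode(raw).decode('ascii')}"
    return f"{attr}: {value}"


if __name__ == "__main__":
    print(ldif_line("cn", "Yves"))   # plain ASCII, stays readable
    print(ldif_line("cn", "Zoë"))    # non-ASCII, gets base-64 encoded
```

Under the RFC as written, any value with a non-ASCII character takes
the "::" form, which is exactly what the question was about.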
