Chris wrote:
>> The only good way I can see is to define that all DN, RDN, password and
>> all attributes in a list (I might have forgotten something) to be translated.
>> The attribute list could actually list the binary attributes as most are text.
>
>The only way you can do it is use the schema, and then build in knowledge
>about what each syntax requires. You'd have to make that extensible, because
>new syntaxes are defined in some environments and markets.

Or let it be done as an option when calling Net::LDAP->new.

>I meant if you had to encode attribute values (OCTET STRINGs) yourself,
>while the API magically treated LDAPString values as UTF-8, then you'd need:
>
>    $ldap->add ( 'cn=Chris Ridd',
>                 attrs => [
>                   objectClass => [qw(top person extensibleObject)],
>                   cn => String2utf8('Chris Ridd'),
>                   sn => String2utf8('Ridd'),
>                   myAttr => $foo,
>                   seeAlso => String2utf8('cn=Chris Ridd')
>                 ]
>    );
>
>You would have to wrap the seeAlso value in something (eg String2utf8) to
>translate the string into UTF-8 before sending to the directory, while you
>would *not* have to translate the entry's DN into UTF-8 as you're proposing
>that the API does that.

In my proposal all strings are translated (except those representing
binary values), so I would not wrap any data with "String2utf8" and
it would not be confusing at all.

>> As it is now it is very error prone and confusing. For every call
>> I make I have to remember call translate-to-utf-8(string) on every
>> parameter that is a string. For example:
>> $ldap->bind(String2utf8("cn=xåx,o=example"),
>> password=>String2utf8("myåpasswd"));
>
>Password's an interesting example because it is defined to be OCTET STRING
>and the character set used in it is defined to be a "local matter".

Yes, I know passwords are difficult. I have many problems with them in
my mixed Unix/MS Windows/Mac environment. But the only way to get it to
work is to use the same character set for all passwords in a database.
Since LDAPv3 specifies UTF-8 for strings, and passwords entered by
humans are normally strings, I would expect the normal case to be
UTF-8 encoded passwords.

>> A API should use local character set as default and internally convert
>> to protocol character set. It could be an option to enable exposing
>> protocol character set.
>
>That isn't the case with other APIs. The C libldap API just sends the raw
>bytes to and fro. The various Java APIs use Unicode strings, because that's
>what's native to Java anyway.

I know that many API implementors do not think about this. I am very
tired of having to translate between the system character set and the
protocol character sets myself. Protocol character sets and formats
should be hidden from the programmer.

In Java it works as it should: the system character set is UTF-16 and
the Java APIs do the translation to the protocol character set, so you
do not have to think about character set issues. The Java LDAP API
(JNDI) has a list of attributes that are known to be non-string (and
you can add to that list). Those attributes are not translated.

>There's certainly scope for doing all this in a schema-aware version of the
>Net::LDAP API, but I think the way Net::LDAP currently works without messing
>around with the data is also very useful.

A schema-aware version could be done, though schema support is poor in
some LDAP servers. It would also slow down opening an LDAP connection
if the schema must be fetched first. It is simpler to do as Java does:
keep a list of the known non-string attributes, with a possibility for
the user to add more.

Dan
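
To make the list-based approach concrete, here is a minimal sketch in Perl. It is only an illustration under some assumptions, not how Net::LDAP behaves: the helper names (to_utf8, add_binary_attr, encode_attrs) and the default list of binary attributes are hypothetical, and the conversion relies on the core Encode module rather than on anything in Net::LDAP itself.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical default list of attributes known to hold binary data.
# Names are compared case-insensitively; the list is an assumption.
my %binary_attr = map { lc($_) => 1 } qw(jpegPhoto userCertificate);

# Let the caller register further non-string attributes.
sub add_binary_attr {
    $binary_attr{ lc $_ } = 1 for @_;
    return;
}

# Convert one value, or every element of an array ref, to UTF-8 octets.
sub to_utf8 {
    my ($value) = @_;
    return ref $value eq 'ARRAY'
        ? [ map { encode('UTF-8', $_) } @$value ]
        : encode('UTF-8', $value);
}

# Take a flat (name => value, ...) list and return an array ref in which
# every attribute NOT on the binary list has been converted to UTF-8.
sub encode_attrs {
    my @in = @_;
    my @out;
    while (my ($name, $value) = splice(@in, 0, 2)) {
        push @out, $name,
            $binary_attr{ lc $name } ? $value : to_utf8($value);
    }
    return \@out;
}

# Example: myAttr is declared binary, everything else is translated.
add_binary_attr('myAttr');
my $attrs = encode_attrs(
    objectClass => [qw(top person extensibleObject)],
    cn          => "x\x{e5}x",      # "xåx" as a Perl character string
    myAttr      => "\xff\x00",      # raw bytes, passed through untouched
);

# With a connected Net::LDAP object the result could then be used as:
#   $ldap->add(to_utf8("cn=x\x{e5}x,o=example"), attrs => $attrs);
```

With something like this, the caller writes ordinary strings everywhere and only has to name the binary attributes once. A real implementation would also need to handle attribute options such as ";binary" and the reverse translation of search results, which this sketch leaves out.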