The charset conversion that is happening in LDAP is actually quite specialized.
The general functionality of converting from one charset to another already exists in
APR in the form of apr_xlate_xxx().  LDAP is only interested in converting the user ID
from a given charset to UTF-8.  Up until auth_ldap calls ap_get_basic_auth_pw(), the
user ID and password are encoded in the "Authorization" header entry.  Until the
user ID and password have been decoded, the conversion to UTF-8 cannot occur.
Therefore the conversion must take place from within auth_ldap, or any other
authentication module, after decoding the user information.  A module or filter
outside of the authentication module that does a blind charset conversion on the
header information would not work, because it would not be able to decode the user ID
and password, convert them and re-encode them in order to make the process transparent
to all authentication modules.  (Actually you could probably make it work for base64,
but what about digest?)
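
To make the ordering concrete, here is a minimal sketch of the basic-auth case
only: the credential has to be base64-decoded before the user ID can be
converted to UTF-8.  It is written in Python for brevity and is not the
auth_ldap code; the function name utf8_user_from_basic() and the sample
credential are made up for illustration, and the source charset is simply
passed in by the caller.

```python
import base64


def utf8_user_from_basic(header_value, source_charset):
    """Decode a Basic credential and return (user ID as UTF-8 bytes, raw password bytes).

    Hypothetical helper for illustration; source_charset is whatever charset
    the caller assumes the browser used.
    """
    scheme, _, payload = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    # Step 1: undo the base64 encoding of "user:password".
    raw = base64.b64decode(payload.strip())
    user_raw, sep, password_raw = raw.partition(b":")
    if not sep:
        raise ValueError("malformed credential")
    # Step 2: only now can the user ID be converted to UTF-8.
    user_utf8 = user_raw.decode(source_charset).encode("utf-8")
    return user_utf8, password_raw


# Made-up credential whose user ID contains a non-ASCII character (u-umlaut).
sample = "j\u00fcrgen:secret".encode("iso-8859-1")
header = "Basic " + base64.b64encode(sample).decode("ascii")
user, password = utf8_user_from_basic(header, "iso-8859-1")
```

The password is left untouched here because the thread only discusses
converting the user ID.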
   On the other hand, the one place that the conversion could be done is within the
call to ap_get_basic_auth_pw().  But ap_get_basic_auth_pw(), or whatever function
handles digest authentication, would have to be modified so that it had
access to the "Accept-Language" header values.  This would allow it to convert from
the browser's assumed charset to UTF-8 or any other charset.  The downside is
that the "Accept-Language" header value does not guarantee which charset
the browser used when it sent the request.  It simply indicates which language(s)
the browser prefers, so the charset can only be guessed from it.  auth_ldap would use
this functionality to at least attempt to do the right thing rather than always failing.
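
As a rough illustration of that guess, the sketch below looks up a charset
from an Accept-Language value using a language-to-charset table of the kind
discussed in this thread, trying the 5-character language ID before the
2-character one.  It is Python, not the patch itself; guess_charset() and the
table entries are assumptions made up for the example and are not the contents
of the real charset.conv file.  For simplicity it walks the languages in
header order and ignores q-values.

```python
# Illustrative entries only; not the contents of the real charset.conv file.
CHARSET_TABLE = {
    "de": "iso-8859-1",
    "ru": "iso-8859-5",
    "zh-cn": "gb2312",
    "zh-tw": "big5",
}


def guess_charset(accept_language, table=CHARSET_TABLE):
    """Return the charset for the first listed language found in the table, or None."""
    for item in accept_language.split(","):
        # Drop any ";q=..." parameter and normalize case.
        lang = item.split(";", 1)[0].strip().lower()
        if not lang:
            continue
        # Try the full 5-character ID first, then the 2-character prefix.
        for key in (lang[:5], lang[:2]):
            if key in table:
                return table[key]
    return None
```

A None result means no listed language is in the table, in which case the
caller would have to leave the user ID unconverted.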
   I do agree that we need some type of functionality that will convert requests made 
in a particular charset to a universal charset that Apache can rely on.  I'm just not 
sure this is it.  It seems to work for auth_ldap, but I'm not sure how to generalize
it.  This is where a much broader discussion needs to take place.



Brad Nicholes
Senior Software Engineer
Novell, Inc., the leading provider of Net business solutions
http://www.novell.com 

>>> [EMAIL PROTECTED] Thursday, December 12, 2002 4:09:57 AM >>>
>    This patch eliminates the hardcoded charset table.  Instead it
>    reads the charset table from a conf file.  The directive
>    AuthLDAPCharsetConfig allows the admin to specify the charset conf
>    file.  Is there also a need to specify additional conversions
>    directly in the httpd.conf file through a different directive?  It
>    seems that the charset conf file would be sufficient.  If there are
>    multiple charsets per language, these can be set by specifying the
>    5-character language ID rather than the 2-character ID, similar to
>    the example in the charset.conv file for Chinese.

As nd said, if someone needs additional conversion, he will scream for it. 
:-)

But something else is going around in my head. Why should this charset
conversion be limited to ldap? Well, I don't know where else we need the
conversion table. But the table itself should be generally available to
all modules. Maybe some other modules would like to do the same.
A core (?) directive like LanguageCharsetConfig might be much more useful
than AuthLDAPCharsetConfig. So the next step would be to move the
conversion function to core or apr or so, too. Each module that needs a
conversion can call this function instead of having its own code.

Maybe there is some overlap with mod_charset_lite, which also does
charset conversion.

Kess
