Re: [rt-users] Bad characters in names loaded from LDAP (AD)
On 11 Oct 2016, at 5:51, Jan Burian wrote:

> Hi Bill,
>
> thank you for your response. Sorry for not mentioning our database.
> We use PostgreSQL.
> After I wrote the first email I also checked the encoding in the database.
>
> The database had the following parameters:
>
>  Name | Encoding |   Collate   |    Ctype
> ------+----------+-------------+-------------
>  rt4  | UTF8     | en_US.UTF-8 | en_US.UTF-8

And so my beautiful theory is destroyed by your brutal facts. :)

> 1) I dumped the database with the UTF-8 encoding parameter.
> 2) Then I dropped the database.
> 3) I created a new database with the following parameters:
>
>  Name | Encoding |   Collate   |    Ctype
> ------+----------+-------------+-------------
>  rt4  | UTF8     | cs_CZ.UTF-8 | cs_CZ.UTF-8
>
> 4) And then I imported the database from the dump.
>
> But after that change, names loaded from LDAP still have bad
> characters :-/.

Indeed: the Collate and Ctype parameters are encoding-specific rulesets for how characters are related to each other, not variations on encoding.

> When a user writes a first email to a queue, he/she is also autocreated
> as unprivileged. If the From header contains his/her name, it is used
> as the RealName RT attribute. In this case the name is saved correctly.
>
> *Example from the log - autocreated from LDAP:*
> [6937] [Tue Sep 27 15:59:25 2016] [info]: RT::User::CanonicalizeUserInfoFromExternalAuth returning Disabled: , EmailAddress: no...@vsup.cz, Gecos: novak, Name: novak, Privileged: 1, RealName: MatouÅ¡ Novák, WorkPhone: (/opt/rt4/sbin/../lib/RT/User.pm:811)
> [6937] [Tue Sep 27 15:59:25 2016] [info]: Autocreated external user novak ( 61 ) (/opt/rt4/sbin/../lib/RT/Authen/ExternalAuth.pm:356)
> [6937] [Tue Sep 27 15:59:25 2016] [info]: RT::Authen::ExternalAuth::LDAP::GetAuth External Auth OK ( My_LDAP ): novak (/opt/rt4/sbin/../lib/RT/Authen/ExternalAuth/LDAP.pm:348)
> [6937] [Tue Sep 27 15:59:26 2016] [info]: RT::User::CanonicalizeUserInfoFromExternalAuth returning EmailAddress: no...@vsup.cz, Name: novak, *RealName: MatouÅ¡ Novák*, WorkPhone: (/opt/rt4/sbin/../lib/RT/User.pm:811)
>
> *Example from the log - autocreated from email:*
> [6026] [Mon Oct 10 06:26:02 2016] [info]: RT::User::CanonicalizeUserInfoFromExternalAuth returning Comments: Autocreated on ticket submission, Disabled: , EmailAddress: tereza.skvar...@seznam.cz, Name: tereza.skvar...@seznam.cz, Privileged: , *RealName: Tereza Škvárová* (/opt/rt4/sbin/../lib/RT/User.pm:811)
>
> Any other ideas?

Yes: At least one of your FCGI handlers (PID 6937) is using an 8-bit encoding and at least one (PID 6026) is using UTF-8.

Note that both of those cases are being logged by the RT::User::CanonicalizeUserInfoFromExternalAuth method, which uses LDAP to retrieve the attribute it uses for the "RealName" field in RT. The first was logged by process 6937, the second by process 6026.

The *reason* for that is a bit of a mystery. It's clear that the 2 processes were not started near the same time (unless that server is VERY busy spawning processes), so if you can determine what was different about how they were launched (likely involving a locale environment variable, most likely LANG or LC_ALL), you can probably make sure that the improper launch doesn't happen.
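To illustrate the kind of mismatch described above, here is a minimal Perl sketch showing how the UTF-8 bytes of a name turn into the garbled form seen in the log when they are read as an 8-bit encoding. The choice of cp1252 is an assumption made for illustration (the thread does not establish which 8-bit encoding was involved), and the variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 bytes for "Matouš": the letter š (U+0161) is the two bytes 0xC5 0xA1.
my $bytes = "Matou\xC5\xA1";

# Decoding the bytes as UTF-8 gives the intended six-character name.
my $as_utf8 = decode('UTF-8', $bytes);

# Decoding the same bytes as cp1252 (assumed 8-bit encoding) turns each
# byte into its own character, so š becomes the two characters Å and ¡.
my $as_cp1252 = decode('cp1252', $bytes);

binmode(STDOUT, ':encoding(UTF-8)');
printf "as UTF-8:  %s (%d characters)\n", $as_utf8,   length $as_utf8;
printf "as cp1252: %s (%d characters)\n", $as_cp1252, length $as_cp1252;
```

The second line of output contains the same "Å¡" pair that appears in the RealName logged by process 6937, which is consistent with that value having been treated as single-byte characters somewhere along the way.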
- RT 4.4 and RTIR training sessions, and a new workshop day! https://bestpractical.com/training * Boston - October 24-26 * Los Angeles - Q1 2017
Re: [rt-users] Bad characters in names loaded from LDAP (AD)
Hi all,

I finally resolved the issue with support from the RT engineers, so big thanks to them. I'm posting the fix here so that it can be found in the list archive if anyone needs it in the future.

Here is the answer from the RT engineers:

> We use Net::LDAP and there is an option called 'raw' that might
> properly convert the incoming content to utf8. That's the first thing
> to try since we pass parameters through to Net::LDAP and you can put
> it right in the config file.
>
> https://metacpan.org/pod/distribution/perl-ldap/lib/Net/LDAP.pod
>
> However, there is likely another bit of code we need to add to RT to
> be explicit about the incoming text and treat it as utf8 when told to
> do so. We can file it as a bug, or provide some commercial assistance
> if you are interested.

So I added

    raw => qr/(?i:^jpegPhoto|;binary)/

to the net_ldap_args parameter in RT_SiteConfig.pm.

Now it is all working fine and the names are imported correctly from LDAP (MS AD, LDAP protocol version 3).

I also suggested adding information about the raw option, with an example, to the RT docs.

Best regards
Jan Burian

On 11.10.2016 11:51, Jan Burian wrote:
> [...]
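For readers looking for where the raw option from the message above would go, here is a hedged sketch of an RT_SiteConfig.pm fragment. Only the raw regex and the service name My_LDAP come from the thread; the server value is a placeholder, the version argument is an assumption, and the other settings are elided. It is a configuration fragment that relies on RT's own Set() and is not runnable on its own.

```perl
# RT_SiteConfig.pm (fragment, illustrative only)
Set($ExternalSettings, {
    'My_LDAP' => {                      # service name as it appears in the log
        'type'   => 'ldap',
        'server' => 'ad.example.com',   # placeholder, not from the thread
        # ... remaining settings (base, filter, attr_map, ...) stay as they are ...

        # Arguments passed through to Net::LDAP->new.
        'net_ldap_args' => [
            version => 3,               # assumption: LDAPv3, as mentioned in the thread
            # Attributes matching this regex are treated as binary; values of
            # all other attributes are converted to Perl UTF-8 strings.
            raw     => qr/(?i:^jpegPhoto|;binary)/,
        ],
    },
});
```

The regex is the example given in the Net::LDAP documentation for the raw option. As far as that documentation describes it, attribute values are treated as byte strings when raw is not set, which would fit the symptoms reported here.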
Re: [rt-users] Bad characters in names loaded from LDAP (AD)
Hi Bill,

thank you for your response. Sorry for not mentioning our database.
We use PostgreSQL.
After I wrote the first email I also checked the encoding in the database.

The database had the following parameters:

 Name | Encoding |   Collate   |    Ctype
------+----------+-------------+-------------
 rt4  | UTF8     | en_US.UTF-8 | en_US.UTF-8

1) I dumped the database with the UTF-8 encoding parameter.
2) Then I dropped the database.
3) I created a new database with the following parameters:

 Name | Encoding |   Collate   |    Ctype
------+----------+-------------+-------------
 rt4  | UTF8     | cs_CZ.UTF-8 | cs_CZ.UTF-8

4) And then I imported the database from the dump.

But after that change, names loaded from LDAP still have bad characters :-/.

When a user writes a first email to a queue, he/she is also autocreated as unprivileged. If the From header contains his/her name, it is used as the RealName RT attribute. In this case the name is saved correctly.

*Example from the log - autocreated from LDAP:*
[6937] [Tue Sep 27 15:59:25 2016] [info]: RT::User::CanonicalizeUserInfoFromExternalAuth returning Disabled: , EmailAddress: no...@vsup.cz, Gecos: novak, Name: novak, Privileged: 1, RealName: MatouÅ¡ Novák, WorkPhone: (/opt/rt4/sbin/../lib/RT/User.pm:811)
[6937] [Tue Sep 27 15:59:25 2016] [info]: Autocreated external user novak ( 61 ) (/opt/rt4/sbin/../lib/RT/Authen/ExternalAuth.pm:356)
[6937] [Tue Sep 27 15:59:25 2016] [info]: RT::Authen::ExternalAuth::LDAP::GetAuth External Auth OK ( My_LDAP ): novak (/opt/rt4/sbin/../lib/RT/Authen/ExternalAuth/LDAP.pm:348)
[6937] [Tue Sep 27 15:59:26 2016] [info]: RT::User::CanonicalizeUserInfoFromExternalAuth returning EmailAddress: no...@vsup.cz, Name: novak, *RealName: MatouÅ¡ Novák*, WorkPhone: (/opt/rt4/sbin/../lib/RT/User.pm:811)

*Example from the log - autocreated from email:*
[6026] [Mon Oct 10 06:26:02 2016] [info]: RT::User::CanonicalizeUserInfoFromExternalAuth returning Comments: Autocreated on ticket submission, Disabled: , EmailAddress: tereza.skvar...@seznam.cz, Name: tereza.skvar...@seznam.cz, Privileged: , *RealName: Tereza Škvárová* (/opt/rt4/sbin/../lib/RT/User.pm:811)

Any other ideas?

Best regards
Jan Burian

On 11.10.2016 05:41, Bill Cole wrote:
> [...]
Re: [rt-users] Bad characters in names loaded from LDAP (AD)
On 10 Oct 2016, at 16:26, Jan Burian wrote:

> Hi all,
>
> we have RT 4.4.0 on CentOS 7 and Perl v5.22.1. And we are starting to
> use RT in production.
>
> We configured RT to authenticate users via LDAP
> (RT::Authen::ExternalAuth::LDAP). Our LDAP server is MS AD (Win 2008
> R2).

[...]

> Authentication is working fine. Users can log in, and if the user doesn't
> exist in RT the account is autocreated. All the configured attributes
> are transferred.

This is a strong sign that the LDAP part is working correctly. If the LDAP server (AD) and client (Perl's Net::LDAP module) are using mismatched encodings, it is likely to show up in authentication failures due to incompatible encodings of the same (logical) characters that 8-bit encodings assign to byte values 0x80-0xff.

Fortunately, it is somewhere between arcane and impossible to make Net::LDAP use anything other than UTF-8. There's *probably* some way to make it do T.61 for ancient-history compatibility, but that's mostly pointless.

[...]

> We had a similar problem with Moodle. When we configured Moodle against
> Active Directory and set cp1250 encoding, it did exactly the same
> thing. After we changed the encoding for the LDAP connector to utf-8,
> the names were corrected.

Which makes sense: LDAP v3 by default uses UTF-8 and you have a modern system with a mature LDAP client. I know of no way to configure a CentOS 7/Perl 5.22 system such that the LDAP interaction with an AD LDAP server talking UTF-8 would be the source of this sort of encoding conflict. I'm mildly surprised that anything talking LDAPv3 can be made to use cp1250 encoding, but I suppose Microsoft makes their own rules to go along with their own unique code pages.

[...]

> I also read that MS AD in LDAP protocol version 3 returns any string to
> the LDAP client in utf-8 encoding.
> I really don't know where the problem could be.

The most likely place is in your database. I'm guessing that you are using MySQL, which defaults to latin1 encoding. When you store a UTF-8 string into a latin1 table, it breaks any multi-byte characters into 2 or 3 characters, but the right bits are still there. This issue has come up a few times on this list over the past decade and I think Best Practical has documented how to safely convert an RT database with that sort of problem from latin1 to utf8. It is probably worth looking through their docs (possibly one of the UPGRADING* files?) and the RT Wiki for a solution. I expect it could be done with a binary dump of the database, altering of any latin1 tables to use utf8, and a re-import of the binary dump. I'm not enough of a MySQL expert to detail that process (I generally use Postgres where possible.)
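The remark that "the right bits are still there" can be sketched in Perl as follows. This is only an illustration on a single hypothetical string, not a database conversion procedure: it assumes the damage is exactly one pass of UTF-8 bytes being taken for latin1 characters, and the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stored value: the name "Matouš" after its UTF-8 bytes were
# mistaken for latin1 characters, so š shows up as the two characters Å ¡.
my $stored = "Matou\x{C5}\x{A1}";

# Turning those characters back into latin1 bytes recovers the original
# UTF-8 byte sequence, because latin1 maps each character to one byte.
my $bytes = encode('ISO-8859-1', $stored);

# Decoding the recovered bytes as UTF-8 restores the intended name.
my $repaired = decode('UTF-8', $bytes);

binmode(STDOUT, ':encoding(UTF-8)');
print "stored:   $stored\n";
print "repaired: $repaired\n";
```

The round trip works here because no information was lost, only misinterpreted. Whether the same holds for a whole database depends on how the data was written, which is why the documented conversion procedure is the safer route.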