Hi,

On Wednesday 06 April 2005 07:58, Erik Ableson wrote:
> On 5 Apr 2005, at 21:59, Peter Marschall wrote:
> > It is not said that Net::LDAP::LDIF is the part with the problems.
> > It might as well be MIIS.
>
> That's entirely possible, although the odd behaviour is that when I do
> a full import of the complete data file created from raw text files
> without any encoding of attribute values, it doesn't complain. When I
> take that file and do a ldifdiff against an older file, the generated
> LDIF file using the Net::LDAP::LDIF library coughs up an entry like the
> following:
>
> [...]
> src_givenname:: Sk9TyQ==
> src_givennameetatcivil:: Sk9TyQ==

These are the only attributes that have double colons.
Peeling off the Base64 encoding, they come out as:

  src_givenname: JOSÉ
  src_givennameetatcivil: JOSÉ

And this is exactly the problem: values in LDIF files are expected
to be in UTF-8. The data you provide is Latin-1 (aka ISO-8859-1).

Please try to convert the attribute values containing non-ASCII
characters in attributes that have directoryString syntax from the
local character set to UTF-8 and then encode the result with Base64.
For the example above I have done it for you:

  src_givenname:: Sk9Tw4k=
  src_givennameetatcivil:: Sk9Tw4k=

Please give it a try. I guess MIIS will accept them.

> Ditto - although what's curious is that it's not a global issue - there
> are many other entries that work just fine with the encoded attribute
> values. My issues are generally not around the DN though, since I
> normalise the data before creating the DNs.

It all depends on the data. See above.

> True, although the context appears to be limited to the DN and not
> attribute values. If the importing application is on the same codepage
> as the source data, then it should be OK to pass in any raw value
> within the codepage, DN excepted.

Please forget about codepages in the LDAP context.
LDAP uses UTF-8 for strings.

<rant>
This looks like another MS ploy to "extend" standards and then claim
to be standards-conformant:
- according to RFC 2252, underscores are not allowed in attribute names
- directoryStrings in LDIF files are required to be UTF-8 encoded.

If MIIS imports files with strings in the local codepage, then the
import file may be anything, but it is definitely not LDIF.

IMHO Net::LDAP::LDIF should stick to the standard.
</rant>

The following command line might help you in de-Base64-ing LDIF files
generated with Net::LDAP::LDIF:

  ( perl -p -0040 -e 's/\n //' | \
    perl -p -MMIME::Base64 -e 's/([\w-]+)::\s*(.*)$/"$1: ".decode_base64($2)/e' ) < INPUT.ldif > OUTPUT.miis

To convert from the local character set to UTF-8 you may use iconv
(part of GNU libc on Unix systems), recode
(http://recode.progiciels-bpi.ca/) or umap (part of the Unicode::Map8
perl module).

CU
Peter

-- 
Peter Marschall
eMail: [EMAIL PROTECTED]
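
For reference, the Latin-1 to UTF-8 re-encoding of a Base64 value described
above could be sketched in Perl roughly as follows. This is only an
illustration under stated assumptions, not part of Net::LDAP::LDIF: the
helper name latin1_b64_to_utf8_b64 is made up for this example, and it
assumes the decoded value really is Latin-1 (ISO-8859-1). Values that are
already UTF-8 must not be run through it a second time.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);                      # core module
use MIME::Base64 qw(decode_base64 encode_base64);  # core module

# Hypothetical helper: take a Base64 string whose decoded octets are
# Latin-1 and return the Base64 encoding of the same text in UTF-8.
# Assumption: the caller knows the input is Latin-1, not already UTF-8.
sub latin1_b64_to_utf8_b64 {
    my ($b64) = @_;
    my $octets = decode_base64($b64);            # raw Latin-1 octets
    my $text   = decode('ISO-8859-1', $octets);  # octets -> Perl characters
    my $utf8   = encode('UTF-8', $text);         # characters -> UTF-8 octets
    return encode_base64($utf8, '');             # '' = no trailing newline
}

# The value from the example in this thread:
print "src_givenname:: ", latin1_b64_to_utf8_b64('Sk9TyQ=='), "\n";
# prints: src_givenname:: Sk9Tw4k=
```

Pure ASCII values pass through unchanged, so only the attributes that
actually contain non-ASCII characters end up with a different encoding.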