RE: upgrade to 3.0alpha20: accented chars in filenames unreadable
Hi,

At this location:
http://lists.samba.org/pipermail/samba-technical/2002-October/040097.html,
Gerald (Jerry) Carter [EMAIL PROTECTED] asks if anyone can comment.

It is a manifestation of the same problem I was seeing with files in
profiles. To refresh your memory, here is the issue:

On Tue, 15 Oct 2002, Louis-David Mitterrand wrote:
>
> Upon upgrading from 2.2.5 to 3.0alpha20 on Debian unstable, filenames
> with accented characters (ie: éàî etc.) became unreadable. For example
> in W2K a filename previously called "résumé.xls" became "r" when looking
> at the samba share; and the filename is impossible to modify from windows:
> samba log says "file not found". From the shell the file looks like
> "r?sum?.xls" but the "?" are actually 0x83.
>
> Now, the file contents are intact and if modified from the unix command
> line to non-accented they become accessible again from windows.
>
> FWIW I used the following command to sanitize all filenames:
>
> % rename -v 's/\x8c/i/g;s/[\x83\x8a\x82]/e/g' **/*
>
> I know I'm using an "alpha" samba and "unstable" debian, but still I'd
> like to understand what happened, if possible.
>
> Is this a known issue?

It has to do with upgrading and the fact that Samba-head (and thus
Samba-3.0) now uses UTF-8 for pass-through file names. This will differ
from what was originally used, especially if nothing was set before.

Steve Langasek has just posted some scripts that can do some level of
conversion.

Regards
-
Richard Sharpe, [EMAIL PROTECTED], http://www.richardsharpe.com
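For anyone who wants to see the mismatch at the byte level, here is a minimal sketch that tests whether a filename's bytes decode as UTF-8. It assumes an iconv binary is installed; the helper name is_valid_utf8 is made up for illustration and is not part of Samba. The byte 0x83 (octal 203) is the one reported in the thread.

```shell
#!/bin/sh
# is_valid_utf8 NAME: succeed if NAME's bytes decode as UTF-8.
# Hypothetical helper for illustration; assumes iconv(1) is installed.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Name as left on disk by the old setup: each accented letter is a single
# high byte (0x83 here, as reported in the thread).
legacy_name=$(printf 'r\203sum\203.xls')
# The same name encoded as UTF-8: each "é" is the two bytes 0xC3 0xA9.
utf8_name=$(printf 'r\303\251sum\303\251.xls')

for name in "$legacy_name" "$utf8_name"; do
    if is_valid_utf8 "$name"; then
        echo "valid UTF-8"
    else
        echo "not valid UTF-8"
    fi
done
```

A name that fails this check was presumably written under the old character set, which would explain why a server expecting UTF-8 cannot match it.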
Re: [Samba] upgrade to 3.0alpha20: accented chars in filenames unreadable
On Wed, Oct 16, 2002 at 05:03:01PM +0200, Ignacio Coupeau wrote:
> >>the samba share; and the filename is impossible to modify from windows:
> >>samba log says "file not found". From the shell the file looks like
> >>"r?sum?.xls" but the "?" are actually 0x83.
>
> In a hurry I used
> unix charset = "CP850"
> http://www.unav.es/cti/ldap-smb/smb-ldap-3-howto.html#internationalization
>
> this solved our problems (redhat 7.2; samba-3.0a20) for example in the
> profile load on the spanish xp (ie Start menu-->menú Inicio).

Thanks for sharing this. It certainly is an excellent stopgap measure,
until proper filename conversion can be done.

The best way, if possible, would be to retain backward compatibility for
reading samba-2.2.x filenames (as with "unix charset") while having new
or modified files written in unicode (or whatever the default is in
samba-3.x).

BTW: keep up the great job on your smb-ldap howto, it is a precious
resource.

Cheers,

--
PANOPE: Au Prince votre fils l'un donne son suffrage,
Madame ; et de l'Etat l'autre oubliant les lois,
Au fils de l'étrangère ose donner sa voix.
(Phèdre, J-B Racine, acte 1, scène 4)
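The "proper filename conversion" mentioned above could look roughly like the following sketch. It assumes the old names are in CP850 (the value that worked for Ignacio; other sites may well differ) and that iconv is installed and knows that charset. The function names to_utf8_name and plan_rename are invented here. It only prints the mv it would run and changes nothing on disk.

```shell
#!/bin/sh
# to_utf8_name NAME [CHARSET]: print NAME re-encoded from CHARSET to UTF-8.
# CHARSET defaults to CP850, an assumption taken from this thread.
to_utf8_name() {
    printf '%s' "$1" | iconv -f "${2:-CP850}" -t UTF-8
}

# plan_rename NAME [CHARSET]: print the rename that would be needed, if any.
# Dry run only: nothing on disk is changed.
plan_rename() {
    new=$(to_utf8_name "$1" "$2") || return 1
    if [ "$new" != "$1" ]; then
        printf 'mv -- "%s" "%s"\n' "$1" "$new"
    fi
}
```

Calling plan_rename on each entry of a share would list the candidate renames for review. A name that is already UTF-8 would be mangled by a second conversion, so this should only be fed names known to be in the old charset.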
Re: upgrade to 3.0alpha20: accented chars in filenames unreadable
On Wed, Oct 16, 2002 at 09:30:20AM -0500, Steve Langasek wrote:
>
> The current Debian Samba package uses the following shell snippet to
> convert between 2.2-style character set settings and 3.0-style settings,
> if the user has opted to let Debian manage the smb.conf file directly.
> If the user has chosen to not allow automatic management of smb.conf, any
> "character set" and "client code page" values in smb.conf will need to be
> converted by hand to the new "unix charset" and "dos charset" values.
>
> If the user previously had these settings in smb.conf, and they were
> converted but accents are still broken, please let me know. (Preferably,
> a bug would be filed with the Debian BTS.)

But the problem occurs if smb.conf:

- is not managed by debconf,
- does not contain any "character" setting,

which is probably a very common situation among samba admins using
debian.

There should be a big warning during installation if these two conditions
are met, suggesting that "unix charset" should be used if filenames
contain accented chars.

--
HIPPOLYTE: N'osez-vous confier ce secret à ma foi ?
THESEE: Perfide, oses-tu bien te montrer devant moi ?
(Phèdre, J-B Racine, acte 4, scène 2)
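The second of the two conditions above is easy to test for. The sketch below is only an illustration of what such a warning could check, not what the Debian package does; the function name lacks_charset_settings and the warning text are invented, and it does not try to detect whether debconf manages the file.

```shell
#!/bin/sh
# lacks_charset_settings FILE: succeed if FILE sets none of the 2.2-style
# ("character set", "client code page") or 3.0-style ("unix charset",
# "dos charset") parameters named in this thread.
# Hypothetical helper; matching is case-insensitive.
lacks_charset_settings() {
    ! grep -Eiq '^[[:space:]]*(character set|client code page|unix charset|dos charset)[[:space:]]*=' "$1"
}

# Example: warn about one config file (first argument, or the usual path).
conf="${1:-/etc/samba/smb.conf}"
if [ -r "$conf" ] && lacks_charset_settings "$conf"; then
    echo "warning: $conf has no charset setting; consider \"unix charset\" if filenames contain accented characters" >&2
fi
```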
Re: [Samba] upgrade to 3.0alpha20: accented chars in filenames unreadable
Gerald (Jerry) Carter wrote:
> Can anyone comment on this?
>
>
> On Tue, 15 Oct 2002, Louis-David Mitterrand wrote:
>
>
>>Upon upgrading from 2.2.5 to 3.0alpha20 on Debian unstable, filenames
>>with accented characters (ie: éàî etc.) became unreadable. For example
>>in W2K a filename previously called "résumé.xls" became "r" when looking at
>>the samba share; and the filename is impossible to modify from windows:
>>samba log says "file not found". From the shell the file looks like
>>"r?sum?.xls" but the "?" are actually 0x83.

In a hurry I used

unix charset = "CP850"
http://www.unav.es/cti/ldap-smb/smb-ldap-3-howto.html#internationalization

This solved our problems (redhat 7.2; samba-3.0a20), for example in the
profile load on the spanish xp (ie Start menu-->menú Inicio).

Ignacio

--
Ignacio Coupeau, Ph.D.       e-mail: [EMAIL PROTECTED]
CTI, Director                fax: 948 425619
University of Navarra        voice: 948 425600
Pamplona, SPAIN              http://www.unav.es/cti/
Re: [Samba] upgrade to 3.0alpha20: accented chars in filenames unreadable
Hello,

On Wed, Oct 16, 2002 at 09:16:58AM -0500, Gerald (Jerry) Carter wrote:
> Can anyone comment on this?

> > Upon upgrading from 2.2.5 to 3.0alpha20 on Debian unstable, filenames
> > with accented characters (ie: éàî etc.) became unreadable. For example
> > in W2K a filename previously called "résumé.xls" became "r" when looking at
> > the samba share; and the filename is impossible to modify from windows:
> > samba log says "file not found". From the shell the file looks like
> > "r?sum?.xls" but the "?" are actually 0x83.

> > Now, the file contents are intact and if modified from the unix command
> > line to non-accented they become accessible again from windows.

> > FWIW I used the following command to sanitize all filenames:

> > % rename -v 's/\x8c/i/g;s/[\x83\x8a\x82]/e/g' **/*

> > I know I'm using an "alpha" samba and "unstable" debian, but still I'd
> > like to understand what happened, if possible.

> > Is this a known issue?

The current Debian Samba package uses the following shell snippet to
convert between 2.2-style character set settings and 3.0-style settings,
if the user has opted to let Debian manage the smb.conf file directly.
If the user has chosen to not allow automatic management of smb.conf, any
"character set" and "client code page" values in smb.conf will need to be
converted by hand to the new "unix charset" and "dos charset" values.

If the user previously had these settings in smb.conf, and they were
converted but accents are still broken, please let me know. (Preferably,
a bug would be filed with the Debian BTS.)

Regards,
Steve Langasek
postmodern programmer

# Update charset settings?
if ! grep -q "^[[:space:]]*unix charset[[:space:]]*=" /etc/samba/smb.conf
then
	db_get samba-common/character_set || true
	DISPLAYCHARSET="${RET}"
	if [ -n "$DISPLAYCHARSET" ]
	then
		TMPFILE=`mktemp -q /tmp/smb.conf.XXXXXX`
		sed -e "/^[[:space:]]*character set[[:space:]]*=/c \\
   display charset = $DISPLAYCHARSET\\
   unix charset = $DISPLAYCHARSET" < /etc/samba/smb.conf > ${TMPFILE}
		mv -f ${TMPFILE} /etc/samba/smb.conf
	fi
fi

if ! grep -q "^[[:space:]]*dos charset[[:space:]]*=" /etc/samba/smb.conf
then
	db_get samba-common/codepage || true
	DOSCHARSET="${RET}"
	if [ -n "$DOSCHARSET" ]
	then
		TMPFILE=`mktemp -q /tmp/smb.conf.XXXXXX`
		sed -e "/^[[:space:]]*client code page[[:space:]]*=/c \\
   dos charset = $DOSCHARSET" < /etc/samba/smb.conf > ${TMPFILE}
		mv -f ${TMPFILE} /etc/samba/smb.conf
	fi
fi
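To see what the sed "c" (change line) step in the snippet does without touching /etc/samba/smb.conf, the same substitution can be run against a scratch copy. This is a sketch only: the function name convert_character_set is invented, the charset is passed as an argument instead of coming from debconf, and the result goes to standard output.

```shell
#!/bin/sh
# convert_character_set FILE CHARSET: print FILE with any 2.2-style
# "character set" line replaced by the two 3.0-style lines.
# Hypothetical helper; FILE itself is left untouched. CHARSET is assumed
# to be a plain charset name with no backslashes or newlines.
convert_character_set() {
    sed -e "/^[[:space:]]*character set[[:space:]]*=/c\\
   display charset = $2\\
   unix charset = $2" < "$1"
}
```

For example, `convert_character_set ./smb.conf.copy ISO8859-15 | diff ./smb.conf.copy -` would show which line changes before anything is written back (the file name here is just a placeholder).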