Problem: Files with non-ASCII characters in their names are not accessible: 
they cannot be deleted or renamed, and rsync fails.

Using OpenSolaris 2010.02 snv 123; just updated everything to 124, no change.

Background: following the ideas on this thread:
http://opensolaris.org/jive/thread.jspa?threadID=110207

I created a backup volume/pool using these options for the purpose of testing:
# zpool create -O casesensitivity=mixed -O utf8only=on -O normalization=formD backusb /dev/dsk/c0t0d0

My motivation is that my main access to the files is from MacOSX 10.5. The 
source pool 'mpool' does NOT use utf8only or normalization; only the 
destination 'backusb' does. The files originally came through OSX and Windows. 

Symptom: rsync failed on a few files. Apparently it could copy the files just 
fine, but renaming the temp file to its proper name failed. I found several 
copies (from several attempts) in the destination folder. As can be seen, ls 
has its problems, and rm and mv don't work. Initially I could still see a 
proper directory entry with file size; that has since 'disappeared'.

(output cropped)
# ls -al /backusb/private/my\ docs/Bibliographix/styles/.K*
/backusb/private/my docs/Bibliographix/styles/.K?lner Zeitschrift f?r Soziologie.style.eyaave: No such file or directory

# rm /backusb/private/my\ docs/Bibliographix/styles/.K*
rm: /backusb/private/my docs/Bibliographix/styles/.K?lner Zeitschrift f?r Soziologie.style.eyaave: No such file or directory


There are many other files that copy/rsync just fine and that use accented, 
Japanese, Chinese or Thai characters. Only very few files cause a problem.

For testing, I copied a different file name containing the same umlauts from 
the terminal into a GUI text editor. I then copied just those umlauts into the 
name of the problematic file in my source 'mpool', and pasted the whole name 
back into the terminal for a rename/mv, which looked like this:

mv K?lner\ Zeitschrift\ f?r\ Soziologie.style   K?lner-Zeitschrift-f?r

The source name was produced by bash tab expansion; the destination was pasted. 
Then I ran rsync again - it worked!

To have a look at the encoding, I saved the text editor file as UTF-8 and 
looked at it with ghex2: both variants of the problematic umlauts were the 
same!!  

I'm stumped. I suspect that the normalization gets in the way: the files are 
obviously there, but can't be accessed. 
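
To illustrate that suspicion: the same visible umlaut can be stored as two 
different UTF-8 byte sequences, either precomposed (NFC) or decomposed into a 
base letter plus a combining mark (NFD, which is what normalization=formD 
refers to). A minimal sketch in plain sh; the helper name hexbytes and the 
sample word are only illustrations and are not taken from the affected files:

```shell
# Print the raw bytes of a string as hex, independent of terminal or editor.
hexbytes() {
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \t\n'
}

# "Kölner" with a precomposed o-umlaut (NFC): U+00F6 -> c3 b6
nfc=$(printf 'K\303\266lner')
# "Kölner" with o + combining diaeresis (NFD): U+006F U+0308 -> 6f cc 88
nfd=$(printf 'Ko\314\210lner')

hexbytes "$nfc"; echo    # 4bc3b66c6e6572
hexbytes "$nfd"; echo    # 4b6fcc886c6e6572
```

Both strings look identical in a terminal but differ in bytes. A GUI editor or 
the clipboard may convert one form into the other on the way, which would be 
one possible reason why a hex editor shows both variants as the same.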

My questions, on which I hope for some ideas:
1 - how can I see what encoding is actually used for a particular umlaut in a 
filename on disk? 
2 - how do I get rid of the 'stuck' files?
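
For question 1, one possible approach is to dump the bytes of each name 
straight from the directory listing, without going through copy and paste. 
This is only a sketch: dump_names is a made-up helper name, it relies on the 
shell glob still returning the names (as the .K* glob above did), and I have 
not verified it against files in the 'stuck' state:

```shell
# Print each entry of a directory followed by the raw bytes of its name in hex.
# Only glob expansion is used; no stat or open of the file itself.
dump_names() {
    dir=$1
    # Second pattern picks up dot files (but not names starting with "..").
    for path in "$dir"/* "$dir"/.[!.]*; do
        name=${path##*/}
        # Skip patterns that matched nothing and were left unexpanded.
        case $name in '*'|'.[!.]*') continue ;; esac
        printf '%s  ' "$name"
        printf '%s' "$name" | od -An -v -tx1 | tr -d ' \t\n'
        printf '\n'
    done
}
```

Usage would be something like: dump_names "/backusb/private/my docs/Bibliographix/styles". 
In the hex output, c3 b6 / c3 bc would indicate precomposed umlauts, while 
6f cc 88 / 75 cc 88 would indicate the decomposed form.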


======================
Some logs:
# zfs get all mpool
NAME   PROPERTY              VALUE                  SOURCE
mpool  type                  filesystem             -
mpool  creation              Sat Sep 19 18:56 2009  -
mpool  used                  794G                   -
mpool  available             120G                   -
mpool  referenced            2.40G                  -
mpool  compressratio         1.00x                  -
mpool  mounted               yes                    -
mpool  quota                 none                   default
mpool  reservation           none                   default
mpool  recordsize            128K                   default
mpool  mountpoint            /mpool                 default
mpool  sharenfs              off                    default
mpool  checksum              on                     default
mpool  compression           off                    default
mpool  atime                 on                     default
mpool  devices               on                     default
mpool  exec                  on                     default
mpool  setuid                on                     default
mpool  readonly              off                    default
mpool  zoned                 off                    default
mpool  snapdir               hidden                 default
mpool  aclmode               groupmask              default
mpool  aclinherit            restricted             default
mpool  canmount              on                     default
mpool  shareiscsi            off                    default
mpool  xattr                 on                     default
mpool  copies                1                      default
mpool  version               3                      -
mpool  utf8only              off                    -
mpool  normalization         none                   -
mpool  casesensitivity       sensitive              -
mpool  vscan                 off                    default
mpool  nbmand                off                    default
mpool  sharesmb              off                    default
mpool  refquota              none                   default
mpool  refreservation        none                   default
mpool  primarycache          all                    default
mpool  secondarycache        all                    default
mpool  usedbysnapshots       0                      -
mpool  usedbydataset         2.40G                  -
mpool  usedbychildren        791G                   -
mpool  usedbyrefreservation  0                      -
mpool  logbias               latency                default


root@opensolaris:~# zfs get all backusb
NAME     PROPERTY              VALUE                  SOURCE
backusb  type                  filesystem             -
backusb  creation              Tue Oct  6 12:38 2009  -
backusb  used                  788G                   -
backusb  available             125G                   -
backusb  referenced            26K                    -
backusb  compressratio         1.00x                  -
backusb  mounted               yes                    -
backusb  quota                 none                   default
backusb  reservation           none                   default
backusb  recordsize            128K                   default
backusb  mountpoint            /backusb               default
backusb  sharenfs              off                    default
backusb  checksum              on                     default
backusb  compression           off                    default
backusb  atime                 on                     default
backusb  devices               on                     default
backusb  exec                  on                     default
backusb  setuid                on                     default
backusb  readonly              off                    default
backusb  zoned                 off                    default
backusb  snapdir               hidden                 default
backusb  aclmode               groupmask              default
backusb  aclinherit            restricted             default
backusb  canmount              on                     default
backusb  shareiscsi            off                    default
backusb  xattr                 on                     default
backusb  copies                1                      default
backusb  version               4                      -
backusb  utf8only              on                     -
backusb  normalization         formD                  -
backusb  casesensitivity       mixed                  -
backusb  vscan                 off                    default
backusb  nbmand                off                    default
backusb  sharesmb              off                    default
backusb  refquota              none                   default
backusb  refreservation        none                   default
backusb  primarycache          all                    default
backusb  secondarycache        all                    default
backusb  usedbysnapshots       0                      -
backusb  usedbydataset         26K                    -
backusb  usedbychildren        788G                   -
backusb  usedbyrefreservation  0                      -
backusb  logbias               latency                default
-- 
This message posted from opensolaris.org
