Re: [zfs-discuss] utf8only-property

2008-02-28 Thread Richard L. Hamilton
 So, I set utf8only=on and try to create a file with a filename that is
 a byte array that can't be decoded to text using UTF-8. What's supposed
 to happen? Should fopen(), or whatever syscall 'touch' uses, fail?
 Should the syscall somehow escape utf8-incompatible bytes, or maybe
 replace them with ?s or somesuch? Or should it automatically convert the
 filename from the active locale's fs-encoding (LC_CTYPE?) to UTF-8?

First, utf8only can AFAIK only be set when a filesystem is created.

Second, use the source, Luke:
http://src.opensolaris.org/source/search?q=&defs=&refs=z_utf8&path=%2Fonnv%2Fonnv-gate%2Fusr%2Fsrc%2Futs%2Fcommon%2Ffs%2Fzfs%2Fzfs_vnops.c&hist=&project=%2Fonnv

Looks to me like lookups, file creation, directory creation, symlink
creation, and hard link creation will all fail with error EILSEQ
(Illegal byte sequence) if utf8only is enabled and they are presented
with a name that is not valid UTF-8.  Thus, on a filesystem where it is
enabled (since creation), no such names can be created or would ever be
there to be found anyway.
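
A minimal userspace sketch of what that looks like from an application,
in Python. The helper names is_valid_utf8 and try_create are made up for
illustration, and Python's strict UTF-8 decoder only approximates the
kernel's own validation, which may differ in edge cases. On a filesystem
without utf8only the second name is simply created; on one with utf8only
enabled I would expect try_create to return EILSEQ for it.

```python
import errno
import os
import tempfile


def is_valid_utf8(name: bytes) -> bool:
    """Return True if name decodes as strict UTF-8 (userspace approximation)."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def try_create(directory: str, name: bytes) -> int:
    """Try to create an empty file; return 0 on success, else the errno."""
    # Work with bytes paths so Python passes the name to the kernel unchanged.
    path = os.path.join(os.fsencode(directory), name)
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # First name is "cafe" with e-acute in UTF-8, second is the Latin-1 form.
        for name in (b"caf\xc3\xa9", b"caf\xe9"):
            rc = try_create(d, name)
            status = "created" if rc == 0 else errno.errorcode.get(rc, str(rc))
            print(name, "valid UTF-8:", is_valid_utf8(name), "->", status)
```

The application sees an ordinary failed open with errno set; nothing is
escaped or rewritten on its behalf.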

So in that case, the system is refusing byte strings that are not valid
UTF-8, and there's no need to escape anything.
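
Since such names are refused rather than converted, any conversion from
a legacy encoding would have to happen in the application before the
call. A minimal sketch in Python, assuming the application knows (or
guesses from its locale) what encoding the name is in; to_utf8_name is
a hypothetical helper, not part of any library.

```python
import locale
from typing import Optional


def to_utf8_name(raw: bytes, source_encoding: Optional[str] = None) -> bytes:
    """Re-encode a filename from source_encoding to UTF-8 bytes.

    If no encoding is given, fall back to the locale's preferred encoding.
    That is only a guess: the bytes carry no record of how they were encoded.
    """
    if source_encoding is None:
        source_encoding = locale.getpreferredencoding(False)
    # Raises UnicodeDecodeError if raw is not valid in source_encoding.
    return raw.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # "cafe" with e-acute as ISO-8859-1, converted to its UTF-8 byte form.
    print(to_utf8_name(b"caf\xe9", "iso-8859-1"))
```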

Further, your last sentence suggests that you might hold the
incorrect idea that the kernel knows or cares what locale an application is
running in: it does not.  Nor indeed does the kernel know about environment
variables at all, except as the third argument passed to execve(2).  It
doesn't interpret them, or even validate that they are of the usual
name=value form; they're handled pretty much the same as the command line
args.  The only illusion of magic is that the more widely used variants of
exec, which don't take the environment explicitly, internally call
execve(2) with the external variable environ as the last arg, thus passing
the environment automatically.
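
To illustrate that the environment is just data handed over at exec
time, here is a small Python sketch. The variable names DEMO_VAR and
PARENT_ONLY_DEMO and the helper child_view are invented for the example,
and it assumes a POSIX system with /bin/sh. Python's subprocess module
wraps the exec call, so this shows the effect rather than the raw
execve(2) interface.

```python
import os
import subprocess


def child_view(env):
    """Run /bin/sh with the given environment and report two variables.

    env=None means "inherit the parent's environment", which is the
    convenience most callers rely on without thinking about it.
    """
    script = 'printf "%s|%s" "${DEMO_VAR-unset}" "${PARENT_ONLY_DEMO-unset}"'
    result = subprocess.run(
        ["/bin/sh", "-c", script],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    os.environ["PARENT_ONLY_DEMO"] = "x"
    os.environ.pop("DEMO_VAR", None)
    # Explicit environment: the child sees only what was passed.
    print(child_view({"DEMO_VAR": "1"}))
    # Inherited environment: the parent's variables are copied along.
    print(child_view(None))
```

The child gets exactly the environment it was handed and nothing else;
LC_CTYPE and friends travel the same way and mean something only to the
userland code that chooses to read them.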

There have been Unix-like OSs that make the environment available to
additional system calls.  The example I'm thinking of is variant links
(symlinks with embedded environment variable references) in the now
defunct Apollo Domain/OS, give or take what counts as a true system call
there.  But AFAIK, that's not the case in those that are part of the
historical Unix source lineage.  (I have no idea off the top of my head
whether Linux, or oddballs like OSF/1, might make environment variables
implicitly available to syscalls other than execve(2).)
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] utf8only-property

2008-02-27 Thread Marcus Sundman
So, I set utf8only=on and try to create a file with a filename that is
a byte array that can't be decoded to text using UTF-8. What's supposed
to happen? Should fopen(), or whatever syscall 'touch' uses, fail?
Should the syscall somehow escape utf8-incompatible bytes, or maybe
replace them with ?s or somesuch? Or should it automatically convert the
filename from the active locale's fs-encoding (LC_CTYPE?) to UTF-8?

- Marcus