The short answer is that the user-space file system has to "take care of" this itself.

There's a fundamental limitation on file name length in the Mac OS X kernel's BSD layer. Look at getdirentries(2). The dirent structure has a 255 *byte* limit on the file name length. This limit is codified as __DARWIN_MAXNAMLEN and MAXNAMLEN in <sys/dirent.h>, and also as NAME_MAX in <sys/syslimits.h>. What this means is that the kernel-level readdir() implementation of a file system (any file system--not just MacFUSE) _must_ return no more than 255 bytes for a file name.

Now, as some might point out, HFS+ supports file names that are up to 255 *Unicode characters* long. (HFS+ uses an array of 255 uint16_t's for this purpose. Together with a uint16_t length field, the on-disk space used by an HFS+ file name is up to 255*2 + 2 = 512 bytes. The in-flight UTF-8 length can be longer still.)

So how do HFS+ file names that are longer than 255 bytes work? Well, when HFS+ encounters such file names, it mangles them before feeding them to getdirentries(). The mangling scheme limits the name to 255 bytes by replacing the trailing part of the name with a hexadecimal representation of the file's Catalog Node ID. However, the scheme preserves any file extension, so it's not always the truly trailing part that it replaces. Of course, such a mangling scheme also needs to ensure that it returns valid UTF-8 (that is, it cuts off the original name at a character boundary).

To see this scheme in action, create a really long file name and look at it in the Terminal (that is, through the BSD layer). For an original name like "中中中中...中中中.txt" you'll see something like "中中中...中中中#9C0282.txt", where the number of "中"s is smaller in the mangled name.

To make the whole thing work, the lookup code in HFS+ supports looking up mangled names: if a file name looks like a mangled name and the original lookup fails, HFS+ retries the lookup by node ID. This isn't too hard in HFS+ because looking up a Catalog entry by node ID is a normal operation. The handlers for rename() etc. also need to take this into account. All in all, there's a whole bunch of extra stuff that HFS+ does for this.

The limitation aside, you can actually get the complete, non-truncated name of such a file from a user-space program. (The Finder can show you the full name, for example.) For that, you'll have to go through the getattrlist(2) interface and retrieve the ATTR_CMN_NAME attribute.

OK, so what about MacFUSE? Well, as I've said before, MacFUSE doesn't know, nor wants to know, any encoding at the kernel level. All file/folder names fed to MacFUSE must be no longer than 255 bytes. If you return a longer name in readdir(), MacFUSE will return EIO, which fails that readdir call. MacFUSE also fails a lookup if the requested name is longer than 255 bytes. (This latter behavior isn't strictly necessary, actually. I could consider removing this check in the future, which would make calls other than readdir() work like they do with HFS+.)

So, the gist is that the user-space file system needs to do its own mangling and return names that are limited to 255 bytes. Even if MacFUSE could somehow automagically do the mangling, it would still be insufficient. That's because when given a mangled name by a program, MacFUSE would still need to ask the user-space file system to look it up. It's not quite the same situation with HFS+ because, unlike MacFUSE, the kernel implementation of HFS+ knows "all" about the file system volume in question and its contents.

As for fuse_fill_dir_t... it's not encountering any problem. It's not the one that's bailing out on a > 255 byte file name.

As a file system writer (user-space or otherwise), there will be times when you just need to "know" some caveats and limitations.

Amit

On Jan 21, 8:26 pm, Erik Larsson <[email protected]> wrote:
> Hi,
>
> I've been running into an annoying problem with really long file names
> returned by NTFS-3G.
> It seems that as soon as the length of any file
> name string fed to fuse_fill_dir_t during a readdir operation exceeds
> 255 bytes, the entire directory becomes unreadable (no entries show up
> at all when trying to list it). The call to fuse_fill_dir_t still
> returns 0, so from the implementor's point of view, no problem can be
> detected.
> You should be able to reproduce this easily by modifying the hello_path
> variable in hellofs to get a sizeable file name.
>
> Maybe this is an example of undefined behaviour, and maybe 255 bytes is
> some kind of limit for what the MacFUSE/FUSE will handle, but it does
> mean that a MacFUSE file system is not able to produce as long file
> names as for instance the HFS+ driver and the built-in NTFS driver can
> produce.
>
> An example: Consider a file name with 255 Chinese characters (say 255
> repetitions of 中, zhong). This fits into both the NTFS and HFS+
> structures which use 255 UTF-16 units internally for file name storage.
> While the HFS+ driver allows the use of such a file name, NTFS-3G can't
> do it since the file name expands to 765 bytes when encoded to UTF-8,
> which makes the readdir operation fail as stated above.
>
> Is there a solution to this, or is it a known limitation?
>
> In the hypothetical worst case, a UTF-8 encoded file name returned by
> NTFS-3G may be as large as 2295 bytes after NFD normalization (255
> Korean characters decomposed into 3 jamos, each taking up 3 bytes in
> UTF-8 form), so I've been running into this problem not only with test
> cases but with Korean file names with as little as 25-30 characters.
>
> In any case, I don't think the fuse_fill_dir_t behaves correctly... if
> it encounters any problem when fed a filename that is too large, there
> should be a way of detecting failure.
>
> Regards,
>
> - Erik
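
For illustration only, here is a rough sketch of the kind of mangling a user-space file system could do before handing a name to readdir(). It is loosely modeled on the HFS+ behavior described in this message (cut the base name at a UTF-8 character boundary, append "#" plus a hexadecimal ID, keep the extension), but it is not the HFS+ code and makes no claim to match it byte for byte. The names mangle_name, utf8_floor, SKETCH_NAME_MAX and SKETCH_EXT_MAX are made up for this example, the 16-byte extension cap is an arbitrary assumption, and "id" stands for whatever stable 32-bit identifier your file system has for the file.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_NAME_MAX 255  /* same value as NAME_MAX / MAXNAMLEN */
#define SKETCH_EXT_MAX  16   /* hypothetical cap on a preserved extension */

/* Move `len` back until it no longer points into the middle of a
 * UTF-8 sequence (continuation bytes look like 10xxxxxx).
 * Requires len <= strlen(s). */
static size_t utf8_floor(const char *s, size_t len)
{
    while (len > 0 && ((unsigned char)s[len] & 0xC0) == 0x80)
        len--;
    return len;
}

/* Copy `name` into `out`, mangling it if it is longer than 255 bytes.
 * Assumes `name` is valid UTF-8. Returns the length of the result. */
static size_t mangle_name(const char *name, uint32_t id,
                          char *out, size_t outsz)
{
    size_t len = strlen(name);
    assert(outsz > SKETCH_NAME_MAX);

    if (len <= SKETCH_NAME_MAX) {        /* short names pass through */
        memcpy(out, name, len + 1);
        return len;
    }

    /* Preserve a short extension, if there is one. */
    const char *dot = strrchr(name, '.');
    const char *ext = "";
    if (dot != NULL && dot != name && strlen(dot) <= SKETCH_EXT_MAX)
        ext = dot;
    size_t extlen = strlen(ext);

    /* "#" followed by the ID in hex, e.g. "#9C0282". */
    char tag[16];
    size_t taglen = (size_t)snprintf(tag, sizeof tag, "#%" PRIX32, id);

    /* Bytes of the original base name that still fit, cut on a
     * character boundary so the result stays valid UTF-8. */
    size_t keep = utf8_floor(name, SKETCH_NAME_MAX - taglen - extlen);

    memcpy(out, name, keep);
    memcpy(out + keep, tag, taglen);
    memcpy(out + keep + taglen, ext, extlen + 1);  /* includes the NUL */
    return keep + taglen + extlen;
}
```

With 255 repetitions of "中" plus ".txt" (769 bytes) and an ID of 0x9C0282, this produces 81 "中"s followed by "#9C0282.txt", 254 bytes in all. This is only the readdir() half: the file system's lookup, rename, etc. handlers would also have to recognize such a name and map it back to the real file, which this sketch does not attempt.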
