On Sun, 13 Nov 2011 23:08:30 +0000 David Holland <dholland-t...@netbsd.org> wrote:
> I was recently talking to some people who'd been working with some
> (physicists, I think) doing data-intensive simulation of some kind,
> and that reminded me: for various reasons, many people who are doing
> serious data collection or simulation tend to encode vast amounts of
> metadata in the names of their data files. Arguably this is a bad way
> of doing things, but there are reasons for it and not so many clear
> alternatives... anyway, 256 character filenames often aren't enough in
> that context.

It's only my opinion, but they really should be using multiple files or a
database for the metadata, with a "link" to the actual data file where
necessary. But I also tend to think the same of software relying on
extended attributes, resource forks and the like (with the possible
exception of a specialized facility for extended permissions :)

> (This sort of usage also often involves things like 50,000 files in
> one directory, so the columnizing behavior of ls is far from the top
> of the list of relevant issues.)

This reminds me: does anyone know the current state of UFS_DIRHASH? I
remember reading about some issues with it and ended up disabling it in my
kernels, yet huge directories can occur in a number of scenarios (probably
a more pressing issue than extending file names, actually)...

> > The 255 limit was just because that's how many bytes a one byte length
> > field permitted, not because anyone thought names that long made sense.
> > But if you're going to increase it, why stop at 511? That number
> > means nothing - the next logical limit would be 65535 wouldn't it?
>
> Well... yes but there are other considerations. As you noted, going
> past one physical sector is problematic; going past one filesystem
> block very problematic. Plus, as long as MMU pages remain 4K,
> allocating contiguous kernel virtual space for path buffers (since if
> NAME_MAX were raised to 64K, PATH_MAX would have to be at least that
> large) could start to be a problem.

I agree, especially given all the software that allocates path/file name
buffers on the stack. Even on the heap, 64KB per buffer could be a general
waste of memory, quite apart from the memory-management performance
issues.

-- 
Matt
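
P.S. To make the "database for the metadata" suggestion above a bit more
concrete, here is a minimal sketch in Python using the standard sqlite3
module. The table layout, the column names (experiment, temperature, seed)
and the file names are all made up for illustration; a real setup would
pick whatever fields the simulation actually needs. The idea is only that
the data file keeps a short name while the metadata lives in a queryable
table that points at it.

```python
import sqlite3

# Hypothetical schema: one row per data file, with the metadata held in
# columns instead of being packed into the file name.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE runs (
           path        TEXT PRIMARY KEY,  -- short name of the actual data file
           experiment  TEXT NOT NULL,
           temperature REAL NOT NULL,
           seed        INTEGER NOT NULL
       )"""
)


def register(path, experiment, temperature, seed):
    """Record the metadata for one data file."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO runs VALUES (?, ?, ?, ?)",
            (path, experiment, temperature, seed),
        )


def find(experiment, max_temperature):
    """Return the paths of the data files matching the given metadata."""
    rows = conn.execute(
        "SELECT path FROM runs"
        " WHERE experiment = ? AND temperature <= ?"
        " ORDER BY path",
        (experiment, max_temperature),
    )
    return [row[0] for row in rows]


register("run-000001.dat", "lattice", 1.5, 42)
register("run-000002.dat", "lattice", 3.0, 43)
print(find("lattice", 2.0))
```

Looking files up then becomes a query rather than a glob over very long
names, and the names themselves stay well under any NAME_MAX.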