-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 According to Paul Eggert on 11/6/2005 3:59 AM: > >>My approach was to add a dependency on the c-ctype module to use c_isalpha > > That seems a bit overkill here, since you can assume ASCII and DOS. > Why not use this instead? > > #define c_isalpha(c) (((unsigned int) (c) | ('a' - 'A')) - 'a' <= 'z' - 'a')
Agreed - all platforms with drive letter prefixes are based on ASCII encodings, so I can resubmit without a dependency on c-ctype. Should I add a dependency on the verify module to ensure that drive letter prefixes are only used on ASCII platforms? Should I still move FILE_SYSTEM_PREFIX_LEN out of config.h and into dirname.h? > >>Second, cygwin currently normalizes all drive letters as absolute paths, >>which is different from djgpp > > > Sigh. This stuff will probably break a lot of code, not just dirname. > Do we really want to bother with it? Perhaps we should just tell > users not to mess with names like "c:foo". I assume they're pretty > rare. Will that simplify the code here and elsewhere? That was the intent of tests/test-dirname.c - a good listing of all sorts of problematic filenames, and their division into base_name and dir_name on the three styles of platforms (unix with no drive letters, cygwin with drive letters always starting an absolute path, and DOS with drive letters potentially starting a relative path). I made the code as simple as I could while making it correct for all three cases. > > >>+ { >>+ if (prefix_len) >>+ base = p; >>+ else if (! DOUBLE_SLASH_IS_DISTINCT_ROOT || 2 < p - base) >>+ base = p - 1; >>+ } > > > This code doesn't seem to match the new comments for basename. Agreed that the comment does not quite match the code. But it is doing exactly what I intended - this block is only invoked when the filename consists of just a prefix and a sequence of slashes. If there was a prefix, it results in the empty string; otherwise, it results in the final one or two slashes (turning c:/, c://, c:///, etc into ''; turning ///, ////, etc into the final /, leaving / alone, and leaving // alone only if DOUBLE_SLASH_IS_DISTINCT_ROOT). > Furthermore, I don't see why the basename of "A:/" should be "/". or > the basename of "//" should be "//". The point of basename and > dirname is that if you concatenate them, with a slash between them, > then you get the same file. But I don't see how this rule generalizes > to DOS under the proposed code. Your formulation is not quite correct. POSIX poses the rule as: if "$string" is a valid filename, then 'cd "$(dirname "$string")"; ls "$(basename "$string")"' lists the same file (although POSIX fails to mention that this will fail if the filename contains a trailing newline). Note that POSIX requires basename("/") to return "/", and basename("//") to return either "/" or "//". ALL other names subjected to basename() will never contain a slash. Likewise, POSIX requires dirname("/") to return "/" and dirname("//") to return "//", and all other names subjected to dirname() will never contain a trailing slash. On platforms where // is special, your formulation of dirname("//")/basename("//") results in "/////", which is certainly not the same as "//". But "cd //; ls //" does indeed list the same file as the original "ls //". Now, on applying that to systems with drive letters. If basename("c:/") returns "/", we have violated the POSIX rule of thumb: "cd c:/; ls /" is much different than "ls c:/". Yet without my patch, this is what base_name was doing. Part of the issue here is that POSIX lets basename("c:/") modify its argument, but that our definition of base_name is that we can only return a pointer to somewhere inside the passed argument without modification. My patch makes base_name("c:/") return "". The other alternative is to treat c:/ like a root, and just like base_name("/") returns "/", have base_name("c:/") return "c:/". But by your earlier rule of appending a slash between the dir_name and base_name, my patch would produce "c://" on DOS, which is still a valid spelling of c:/, whereas having base_name return c:/ would create the invalid name "c://c:/". For that matter, would it help to redefine our base_name to ALWAYS return the empty string if the argument is a root directory? Then the rule of recreating a valid filename becomes simpler - concatenate dir_name (less a trailing slash), a slash, and the base_name. Then for "//" on platforms where it is special, you get "//" less one slash, /, and "", resulting in "//". Yes, this is different than POSIX basename(), but we already know that base_name is not an exact match to POSIX. My only worry about changing semantics like that is that it may break existing uses of base_name. Another thing to think about. Currently, strip_trailing_slashes("///") currently calls base_name("///"), gets a single "/", skips past that slash, then returns false (leaving the user with "///" still). But it might be more intuitive if it changed the input string to "/\0/" and returned true. > > Many of the rest of the changes look pretty complicated, and I didn't > take the time to review them, since I want to know what dirname and > basename are supposed to do before worrying about the details. I guess the first thing to review, then, would be my testsuite (which did pass in all three configurations - unix, cygwin, and DOS). If you can agree that every one of those proposed filename divisions into a directory and basename makes sense, then your review becomes a matter of validating the codepaths to ensure that the testsuite is always passed, and that the testsuite is comprehensive enough. I am also working on a followup patch to coreutils that fixes basename(1) (POSIX requires that "basename // /" return "//" on platforms where // is special), as well as improving the documentation and testsuite of basename and dirname. - -- Life is short - so eat dessert first! Eric Blake [EMAIL PROTECTED] -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (Cygwin) Comment: Public key at home.comcast.net/~ericblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDb1m884KuGfSFAYARApbUAKCnBiXIpLOF4eLc4aknfRSbjJabDwCgz+zu Fe3X89Qb2GvODfEWZIaPAmI= =348K -----END PGP SIGNATURE----- _______________________________________________ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib