Andreas Schwab <[EMAIL PROTECTED]> writes:

> $ echo '+:::::
> +:::::' | uniq -u
> uniq: string comparison failed: No such file or directory
> uniq: Set LC_ALL='C' to work around the problem.
> uniq: The strings compared were `+:::::' and `+:::::'.

Ouch! It sounds like your strcoll is broken. GNU 'ls' has a workaround for broken strcoll (basically: re-sort the directory) but this workaround isn't appropriate for programs like 'uniq' and 'sort', since they can't easily undo the work they've already done.

I guess we should fix coreutils so that it detects broken strcoll at configure-time, and refuses to use strcoll at all if it is broken. Likewise for other GNU programs. We need a test program for this.

I cannot reproduce the problem on my GNU/Linux host, under any of the locales it has installed. What locale were you using when you ran into the problem?

> The whole error checking in memcoll and xmemcoll is completely bogus. The
> C standard says in 7.5#3:

It's not bogus, since it is relying on POSIX. See <http://www.opengroup.org/onlinepubs/007904975/functions/strcoll.html>, which says:

   Since no return value is reserved to indicate an error, an
   application wishing to check for error situations should set errno
   to 0, then call strcoll(), then check errno....

   The strcoll() function may fail if:

   [EINVAL] The s1 or s2 arguments contain characters outside the
            domain of the collating sequence.

The underlying problem here is: what should 'sort', 'uniq', 'ls', etc. do when strcoll returns bogus results? They can't just take the bogus results and continue, since that can lead to real problems. For example, if you use strcoll as the underlying comparison function to qsort, and if strcoll fails, then the comparison is no longer a total order and qsort is allowed to dump core (and indeed does dump core, on some platforms).

This problem also comes up with alphasort, which is a companion to scandir in glibc and is being proposed as a POSIX extension. The problem is that scandir+alphasort can dump core when the directory has file names that don't compare (e.g., due to encoding errors). This is unacceptable.
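
For concreteness, here is a minimal sketch of the errno-based check that POSIX describes, plus one sanity test (a string must compare equal to itself) of the kind a configure-time test program might run. The names checked_strcoll and strcoll_looks_sane are hypothetical, made up for this example; this is not the memcoll/xmemcoll code from coreutils, and I have not confirmed that it catches the failure reported above.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper, not the coreutils implementation.
   Compare A and B with strcoll.  Store in *ERR the errno value left
   by strcoll (0 if it reported no error) and return its result.  */
static int
checked_strcoll (const char *a, const char *b, int *err)
{
  errno = 0;                    /* POSIX: clear errno before the call */
  int result = strcoll (a, b);
  *err = errno;                 /* nonzero means the comparison failed */
  return result;
}

/* Hypothetical sanity check: return 1 if S collates equal to itself
   without strcoll reporting an error, 0 otherwise.  Passing this does
   not prove strcoll is a total order; it only rules out one kind of
   breakage.  */
static int
strcoll_looks_sane (const char *s)
{
  int err;
  int result = checked_strcoll (s, s, &err);
  return err == 0 && result == 0;
}
```

A real test program would presumably call setlocale (LC_ALL, "") first and try a range of strings, since the breakage appears to depend on the locale in use; in the default "C" locale these checks should pass trivially.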
_______________________________________________
Bug-coreutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-coreutils
