Re: [bug-gnulib] New GNULIB glob module?
Derek Price [EMAIL PROTECTED] writes: It may be. It looks like the change was intentional (http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/generic/glob.c?rev=1.52content-type=text/x-cvsweb-markupcvsroot=glibc), but I still disagree. I agree with you. Historically, the * pattern in the shell has always matched dangling links. Otherwise rm * wouldn't work as desired. I ran your test program on NetBSD 1.6.1 and it does return the broken symlink. OK, let's argue that BSD is right and Solaris is wrong, and glibc should be fixed. If necessary we can take this to the POSIX committee. In the mean time we can submit the three patches. I'll take a look at doing that. ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib
Re: [bug-gnulib] New GNULIB glob module?
Paul Eggert wrote: Derek Price [EMAIL PROTECTED] writes: 1. Corrects an incorrect check for a successful return from getlogin_r to assume only 0 means success, per the POSIX2 spec: http://www.opengroup.org/onlinepubs/009695399/functions/getlogin_r.html. 2. Moves the check for GLOB_MARK directory status (and the append of `/') into glob_in_dir, where it is more efficient than performing a second pass and sometimes calling stat a second time on each file or directory. All calls to stat are avoided when dirent-d_type is available. No call to realloc of the directory name is ever necessary since room for the slash can be allocated in the first pass. These changes sound reasonable, though we should submit them as separate patches. The attached glibc-understand-getlogin_r.diff may come off as a little fuzzy, but it should work for (1). I've also reattached the patch for 2 3 minus the above chunk, as glibc-glob-list-links2.diff. Is (2) independent of (3)? (Please see below for why this is important.) (2) could be separated with a little more work, but lets finish the discussion below before I provide a second patch. 3. Ignores broken links only when GLOB_ONLYDIR is set. With glibc versions 2.3.3 through 2.3.5, the following in an empty directory would return nothing: ln -s doesnt-exist linkname glob (*, ...) This fix syncs with the comments in the file, syncs with the POSIX2 spec, restores the pre-glibc-2.3.3 behavior, and simply makes more sense - why should `ls *' fail to list broken links? This change sounds controversial to me. glibc 2.3.5 behaves similarly It may be. It looks like the change was intentional (http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/generic/glob.c?rev=1.52content-type=text/x-cvsweb-markupcvsroot=glibc), but I still disagree. to Solaris 8 and to Solaris 10 -- I just checked, with the following program and with the working directory containing only a dangling symlink: #include glob.h #include stdio.h int main (void) { glob_t g; int r = glob (*, 0, NULL, g); int i; if (r != 0) { fprintf (stderr, glob failed (%s)\n, r == GLOB_NOMATCH ? GLOB_NOMATCH : r == GLOB_NOSPACE ? GLOB_NOSPACE : other glob failure); return 1; } for (i = 0; i g.gl_pathc; i++) puts (g.gl_pathv[i]); return 0; } Solaris 8 and 10 both report glob failed (GLOB_NOMATCH). Let's separate (3) into a separate patch and think about it more carefully before submitting it. This would involve creating at least one new function for (2) which would just need to be removed again for (3), and the POSIX2 glob spec states that glob returns GLOB_NOMATCH only when, the pattern does not match any existing pathname, and GLOB_NOCHECK was not set in flags. (http://www.opengroup.org/onlinepubs/009695399/functions/glob.html) Nowhere does the POSIX2 glob spec specify that a broken symlink should not be considered an existing pathname. After all, the link exists, only its target does not. Interpreted in this way, glob cannot be used by a program which, for instance, wished to verify that all links matching a pattern had valid targets since the broken links would not be returned by glob. If glob is only going to consider a link as if it is its target, then what about the case where a matching link points to a file in the same directory that also matches the pattern? Should glob only return one or the other? Perhaps a GNU extension similar to GLOB_ONLYDIR is in order (GLOB_VERIFY_SYMLINKS?), but I do not think glob should be making these value judgements when the user did not request it. It certainly does not appear to be implied by the POSIX spec as I read it. Have you investigated with FreeBSD glob does? No, but I ran your test program on NetBSD 1.6.1 and it does return the broken symlink. I copied and pasted most of my above commentary on why this should be considered a bug into https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=126460, where I found the discussion that apparently caused the original bug fix that broke this. Cheers, Derek --- ../glob-glibc2gnulib/lib/glob.c 2005-05-31 18:12:01.0 -0400 +++ lib/glob.c 2005-05-27 13:52:28.0 -0400 @@ -538,7 +539,7 @@ glob (const char *pattern, int flags, buflen = 20; name = __alloca (buflen); - success = getlogin_r (name, buflen) = 0; + success = getlogin_r (name, buflen) == 0; if (success) { struct passwd *p; --- ../glob-glibc2gnulib/lib/glob.c 2005-05-31 18:12:01.0 -0400 +++ lib/glob.c 2005-05-27 13:52:28.0 -0400 @@ -175,6 +175,7 @@ # define __glob_pattern_p glob_pattern_p #endif /* _LIBC */ +#include stdbool.h #include fnmatch.h
Re: [bug-gnulib] New GNULIB glob module?
Paul Eggert wrote: Derek Price [EMAIL PROTECTED] writes: submission part. Perhaps it would be smoother if someone already known to the glibc team introduced me and this patch? Yes, probably. I'm willing to have a go at it. I suggest submitting two patches. (1) the part that makes it work with gnulib. glob-glibc2gnulib11.diff attached: 2005-05-31 Derek Price [EMAIL PROTECTED] Paul Eggert [EMAIL PROTECTED] * glob.c: Update copyright. Assume freestanding C89 compiler. Simplify cruft that may be replaced with GNULIB modules. Make no attempt to find 64-bit versions of file access functions directly when !_LIBC. Move definitions of GLOB_* macros to glob_.h. (DIRENT_MUST_BE, DIRENT_MIGHT_BE_SYMLINK, DIRENT_MIGHT_BE_DIR): New macros to abstract dirent-d_type access. (GETPW_R_SIZE_MAX, LOGIN_NAME_MAX): New macros to abstract sysconf access. * glob.h: Protect/define contents as necessary to coexist with GNULIB. (2) the bug fix. It's OK for (2) to assume that (1) has already been applied. glibc-glob-list-links.diff attached. Despite its name, it actually does three things: 1. Corrects an incorrect check for a successful return from getlogin_r to assume only 0 means success, per the POSIX2 spec: http://www.opengroup.org/onlinepubs/009695399/functions/getlogin_r.html. 2. Moves the check for GLOB_MARK directory status (and the append of `/') into glob_in_dir, where it is more efficient than performing a second pass and sometimes calling stat a second time on each file or directory. All calls to stat are avoided when dirent-d_type is available. No call to realloc of the directory name is ever necessary since room for the slash can be allocated in the first pass. 3. Ignores broken links only when GLOB_ONLYDIR is set. With glibc versions 2.3.3 through 2.3.5, the following in an empty directory would return nothing: ln -s doesnt-exist linkname glob (*, ...) This fix syncs with the comments in the file, syncs with the POSIX2 spec, restores the pre-glibc-2.3.3 behavior, and simply makes more sense - why should `ls *' fail to list broken links? 2005-05-31 Derek R. Price [EMAIL PROTECTED] * glob.c: #include stdbool.h (glob): Only 0 return from getlogin_r means success, according to POSIX 2. Move GLOB_MARK rescan into... (glob_in_dir): ...here - it improves efficiency. Don't fail to return broken links when GLOB_ONLYDIR is not set. (link_exists_p): Rename to... (is_dir_p): ...this and update functionality accordingly. Regards, Derek --- ../glibc-2.3.5/sysdeps/generic/glob.c 2004-10-27 14:21:02.0 -0400 +++ lib/glob.c 2005-05-31 18:12:01.0 -0400 @@ -1,4 +1,4 @@ -/* Copyright (C) 1991-2002, 2003, 2004 Free Software Foundation, Inc. +/* Copyright (C) 1991-2002, 2003, 2004, 2005 Free Software Foundation, Inc. This file is part of the GNU C Library. The GNU C Library is free software; you can redistribute it and/or @@ -16,23 +16,16 @@ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. */ -/* AIX requires this to be the first thing in the file. */ -#if defined _AIX !defined __GNUC__ - #pragma alloca -#endif - #ifdef HAVE_CONFIG_H # include config.h #endif -/* Enable GNU extensions in glob.h. */ -#ifndef _GNU_SOURCE -# define _GNU_SOURCE 1 -#endif +#include glob.h #include errno.h #include sys/types.h #include sys/stat.h +#include stddef.h /* Outcomment the following line for production quality code. */ /* #define NDEBUG 1 */ @@ -40,30 +33,7 @@ #include stdio.h /* Needed on stupid SunOS for assert. */ - -/* Comment out all this code if we are using the GNU C Library, and are not - actually compiling the library itself. This code is part of the GNU C - Library, but also included in many other GNU distributions. Compiling - and linking in this code is a waste when using the GNU C library - (especially if it is a shared library). Rather than having every GNU - program understand `configure --with-gnu-libc' and omit the object files, - it is simpler to just do this in the source for each such file. */ - -#define GLOB_INTERFACE_VERSION 1 -#if !defined _LIBC defined __GNU_LIBRARY__ __GNU_LIBRARY__ 1 -# include gnu-versions.h -# if _GNU_GLOB_INTERFACE_VERSION == GLOB_INTERFACE_VERSION -# define ELIDE_CODE -# endif -#endif - -#ifndef ELIDE_CODE #if !defined _LIBC || !defined GLOB_ONLY_P - -#if defined STDC_HEADERS || defined __GNU_LIBRARY__ -# include stddef.h -#endif - #if defined HAVE_UNISTD_H || defined _LIBC # include unistd.h # ifndef POSIX @@ -73,22 +43,13 @@ # endif #endif -#if !defined _AMIGA !defined VMS !defined WINDOWS32 -# include pwd.h -#endif +#include pwd.h -#if !defined __GNU_LIBRARY__
Re: [bug-gnulib] New GNULIB glob module?
Derek Price [EMAIL PROTECTED] writes: 1. Corrects an incorrect check for a successful return from getlogin_r to assume only 0 means success, per the POSIX2 spec: http://www.opengroup.org/onlinepubs/009695399/functions/getlogin_r.html. 2. Moves the check for GLOB_MARK directory status (and the append of `/') into glob_in_dir, where it is more efficient than performing a second pass and sometimes calling stat a second time on each file or directory. All calls to stat are avoided when dirent-d_type is available. No call to realloc of the directory name is ever necessary since room for the slash can be allocated in the first pass. These changes sound reasonable, though we should submit them as separate patches. Is (2) independent of (3)? (Please see below for why this is important.) 3. Ignores broken links only when GLOB_ONLYDIR is set. With glibc versions 2.3.3 through 2.3.5, the following in an empty directory would return nothing: ln -s doesnt-exist linkname glob (*, ...) This fix syncs with the comments in the file, syncs with the POSIX2 spec, restores the pre-glibc-2.3.3 behavior, and simply makes more sense - why should `ls *' fail to list broken links? This change sounds controversial to me. glibc 2.3.5 behaves similarly to Solaris 8 and to Solaris 10 -- I just checked, with the following program and with the working directory containing only a dangling symlink: #include glob.h #include stdio.h int main (void) { glob_t g; int r = glob (*, 0, NULL, g); int i; if (r != 0) { fprintf (stderr, glob failed (%s)\n, r == GLOB_NOMATCH ? GLOB_NOMATCH : r == GLOB_NOSPACE ? GLOB_NOSPACE : other glob failure); return 1; } for (i = 0; i g.gl_pathc; i++) puts (g.gl_pathv[i]); return 0; } Solaris 8 and 10 both report glob failed (GLOB_NOMATCH). Let's separate (3) into a separate patch and think about it more carefully before submitting it. Have you investigated with FreeBSD glob does? It seems to use gl_lstat, which our glob doesn't. That's odd. What's the point of having a gl_lstat if it's never used? ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib
Re: [bug-gnulib] New GNULIB glob module?
Paul Eggert wrote: Derek Price [EMAIL PROTECTED] writes: Fair enough, but why undo the change to glob.m4? Shouldn't I just change the target of the AC_DEFINE from MISSING_SYS_CDEFS_H to _SYS_CDEFS_H? Yes, you're right. Sorry, I'd forgotten the trick that I had suggested. (This suggests that it deserves a nice fat comment. :-) Done. I added this comment to both glob_.h glob.m4, with different comment leaders, of course: /* Note the reversal of the common HAVE_SYS_CDEFS_H idiom below. In this way, #ifndef _SYS_CDEFS_H may be used to include sys/cdefs.h both when it has been checked for via the GNULIB configure test and found and when it has not been checked for, which we can presume means that the glob.h GNULIB shares with GLIBC is being included as a system header and not as part of GNULIB, in which case sys/cdefs.h may be assumed. */ I'm going to commit what I have as the new glob module soon if noone says otherwise. I and the CVS team made a few more minor changes to make it work on Solaris, BSD, and Windows, but it is now compiling and running on some 7 diverse platforms. After that I will write up a ChangeLog entry for the glob-glibc2gnulib diff and submit our changes back to the glibc team, unless someone here who is used to working with them would like to take a go at the actual submission part. Perhaps it would be smoother if someone already known to the glibc team introduced me and this patch? Is the definitive version of shared files usually in glibc, gnulib, or wherever someone last remembered to merge when this happens? Cheers, Derek ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib
Re: [bug-gnulib] New GNULIB glob module?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Conrad T. Pino wrote: breaks the Windows build since Microsoft does NOT provide sys/cdefs.h implementation. Yes. Since we don't run configure on Windows, _SYS_CDEFS_H needed to be defined to 1 in the windows-NT/config.h.in.in. I've done so and installed it. Regards, Derek -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.0 (Cygwin) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCmPCxLD1OTBfyMaQRAqfrAKDcNY5MOH8PnOYdID9dZgqNxj8b6wCgy3E6 ELXlzCQfccQvTI7rfH3GpY0= =EgB+ -END PGP SIGNATURE- ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib
Re: [bug-gnulib] New GNULIB glob module?
Paul Eggert wrote: Derek Price [EMAIL PROTECTED] writes: I chose the _SYS_CDEFS_H route since it seemed simplest to me, though I chose to name the macro `MISSING_SYS_CDEFS_H'. Sorry, that's not right, since it fails in the following scenario: #define MISSING_SYS_CDEFS_H 27 #include glob.h in an ordinary program that is not using gnulib, but is using glibc. MISSING_SYS_CDEFS_H is part of the user's namespace, and should not affect how glob.h works. Let's just use _SYS_CDEFS_H. We can then remove all mention of MISSING_SYS_CDEFS_H from glob_.h and undo the most recent change to glob.m4. This should be OK: if we run into some non-glibc that has this problem it will probably use some other include file anyway. Fair enough, but why undo the change to glob.m4? Shouldn't I just change the target of the AC_DEFINE from MISSING_SYS_CDEFS_H to _SYS_CDEFS_H? Regards, Derek ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib
Re: [bug-gnulib] New GNULIB glob module?
Larry Jones wrote: Derek Price writes: Larry, can you tell us if defining _POSIX_PTHREAD_SEMANTICS would work to get the POSIX version of getpwnam_r on Solaris? It looks like it. I've committed Paul's patch to the CVS CVS tree, as well as removing the associated glob.c changes, and I will install it in GNULIB if it passes CVS nightly testing on Solaris. Regards, Derek ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib
Re: [bug-gnulib] New GNULIB glob module?
Derek Price [EMAIL PROTECTED] writes: But let's step back a second. Why are we worried about building gnulib glob.c under glibc? It will never happen, right? So perhaps we needn't worry about this problem at all. Won't it? I thought the idea was that when you and I settled on something that looked good I would attempt to send it back to the libc folks so that the files wouldn't need to be maintained separately. Yes, that's the idea, but that's a different scenario. In that scenario we are building glob.c as part of glibc, so _LIBC is 1. I am talking about the scenario where glob.c is part of a user program (i.e., glob.c is imported from gnulib), and where the C library is glibc, but we are compiling glob.c anyway. Will that ever happen? Answering my own question: yes it will, in general, because perhaps the gnulib version fixes a bug that's in the glibc version, and the user wants the gnulib version. So we should worry about this. And (as far as I can see) the simplest fix is to do this: #if defined _LIBC || HAVE_SYS_CDEFS_H # include sys/cdefs.h #endif with the appropriate change to glob.m4 to define HAVE_SYS_CDEFS_H. ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib
Re: [bug-gnulib] New GNULIB glob module?
One other remark about glob.c. You should remove this comment, as it's obsolete now: /* Comment out all this code if we are using the GNU C Library, and are not actually compiling the library itself. This code is part of the GNU C Library, but also included in many other GNU distributions. Compiling and linking in this code is a waste when using the GNU C library (especially if it is a shared library). Rather than having every GNU program understand `configure --with-gnu-libc' and omit the object files, it is simpler to just do this in the source for each such file. */ Now for comments on m4/glob.m4 and modules/glob. This finishes up this round of reviewing. gl_GLOB_SUBSTITUTE and gl_GLOB both invoke gl_PREREQ_GLOB. This is overkill, surely. I'd say only the former needs to invoke it. + AC_CHECK_HEADERS([glob.h gnu-versions.h], [], [GLOB_H=glob.h]) + + if test -z $GLOB_H; then +AC_EGREP_CPP([gl_gnu_glob_version = 1$], +[#include gnu-versions.h +gl_gnu_glob_version = _GNU_GLOB_INTERFACE_VERSION], + [], [GLOB_H=glob.h]) + fi This seems a bit brittle. Why not simply try to compile this program? #include glob.h char a[_GNU_GLOB_INTERFACE_VERSION == 1 ? 1 : -1]; + AC_CHECK_HEADERS_ONCE([dirent.h sys/ndir.h sys/dir.h ndir.h])dnl Substitute AC_REQUIRE([AC_HEADER_DIRENT]) for this line. + AC_CHECK_FUNCS_ONCE([dirent64 getlogin_r getpwnam_r stat64])dnl The other three functions are fine, but: dirent64 is a function? I thought it was a structure tag. Surely you mean to use AC_CHECK_TYPE([struct dirent64]), and for the code to to refer to HAVE_STRUCT_DIRENT64 rather than HAVE_DIRENT64? Come to think of it, how does the code currently work here? In the normal case (where the maintainer has not used configure's --disable-largefile option, and where the system supports a 64-bit file interface) we want glob to use the 64-bit file interface. Is that what is happening here? I can't quite follow it. +++ modules/glob 13 May 2005 16:23:46 - @@ -0,0 +1,43 @@ +Description: +Search for files and directories with paths matching a pattern. + +Depends-on: +alloca +extensions +fnmatch +mempcpy +realloc I don't see why this module depends on the realloc module. It doesn't ever realloc anything to size zero. +License: +??? LGPL ??? Yes, it's LGPL. ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib
Re: [bug-gnulib] New GNULIB glob module?
Derek Price [EMAIL PROTECTED] writes: We've been removing those sort of constructs from CVS as part of the move away from KR support. IIRC, I was told that typecasts to and from (void *) were necessary on some really old systems, but that I could be confident that we wouldn't encounter any C89 systems that required it. That's correct. Perhaps this is simply legacy code related to the __ptr_t stuff and could be removed as well? Yes, quite possibly. How about if we simply assume C89-or-later for now. If someone runs into a problem with a bogus masquerading compiler we can fix it later. why does that preclude a #include stddef.h in glob.h? Because in glibc #include glob.h isn't supposed to define irrelevant identifiers like wchar_t (which stddef.h does define). It's OK for #include glob.h to define these identifiers in gnulib, though. Partly because we can't avoid it, and partly because we're not as persnickety anyway. I'll review the rest of your message later (sorry, too many meetings today!). ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib
Re: [bug-gnulib] New GNULIB glob module?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Karl Berry wrote: Subject: [bug-gnulib] New GNULIB glob module? Would it be possible to simply use the libc code as-is? I guess I mean, with whatever changes are needed sent back to libc. So much stuff in gnulib is 95% the same as libc. It doesn't seem good. Well, like I said, a bunch of the cruft I cut out could be removed because it was much simpler to depend on other GNULIB modules. The alternative would have been to add a bunch of tests to detect the things that were being switched on, and I expect would have been much more complicated since those problems have already been solved in the GNULIB modules the file now references and I would have had to test the new tests... To have libc use the same code, it would be best if they imported the GNULIB modules my new glob depends on and then reimported my glob files. We might have to find a slightly better compromise on the header file since I'm not sure that all the assumptions I made there that are valid with GNULIB would be valid at run time for other programs expecting only glibc. (Glancing back at the header file now, it looks like I did a few things that should be undone/fixed anyhow. I reviewed my changes to glob.c more thoroughly, but I disabled a few things in glob.h like the 64 bit file offsets, intending to revisit the issue later. I don't know all the issues involved there... what's __REDIRECT_NTH and why does it break in my compiler? I was hoping someone else might be able to deal with that more quickly than I, but most of the rest of my changes should be valid and useful as/in the port to GNULIB...) Regards, Derek -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.0 (Cygwin) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCgoiULD1OTBfyMaQRApY+AJ9VsOAvoa3endmyHCxT1ai/+Nch4ACfWAye yXShaWGtcwczl045kRTY7G4= =Wf8s -END PGP SIGNATURE- ___ bug-gnulib mailing list bug-gnulib@gnu.org http://lists.gnu.org/mailman/listinfo/bug-gnulib