Bug#551158: Libc6: exim4 Segfault in libc-2.7.so
Hi, On Tue, Oct 27, 2009 at 06:21:24AM +0100, Fabio Rosciano wrote: thanks for helping out. Here it is, I can't see anything funny: [...] Yes, the logs are pretty much the same as when I did the upgrade, except I did not have libc6-dev-i386 installed and I went to 2.10.1-1 from 2.9-27. Gabor -- - MTA SZTAKI Computer and Automation Research Institute Hungarian Academy of Sciences - -- To UNSUBSCRIBE, email to debian-glibc-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#551158: Libc6: exim4 Segfault in libc-2.7.so
On Mon, Oct 26, 2009 at 09:02:22AM +0100, Fabio Rosciano wrote:

> On Mon, 2009-10-26 at 08:10 +0100, Aurelien Jarno wrote:
> > Do you have the list of packages that have been upgraded?
>
> I wish I did, but as soon as debconf asked "would you like to upgrade libc6 now?" and I answered yes, the system became completely unusable.

Do you have /var/log/dpkg.log? Even if it does not contain the action that failed first, the lines before the failure may show if packages were being installed in an unexpected order.

Gabor
Re: Bug#551831: cupt: Incorrectly upgrades libc6, breaking the system
On Wed, Oct 21, 2009 at 12:01:51PM +0300, Eugene V. Lyubimkin wrote:

> However, this is another side of already archived http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=543365 (ironically, reported by you too). On i386 we have the issue: libc6-i686 strictly Pre-Depends on libc6 (= ...), and whichever of these two packages cupt tries to upgrade first, the pre-dependency will be broken. Let me try to add libc maintainers to the loop to know the correct upgrade path.

Hmm. "remove old libc6-i686, upgrade libc6, install new libc6-i686" seems to be a sequence where the Pre-Depends never breaks. Since libc6-i686 is not needed for the system to function properly, removing it temporarily is not a problem. Now the question is: can it be generalized to other packages?

Gabor
Bug#545179: libc6: postinst must run telinit u
On Mon, Sep 07, 2009 at 10:46:34AM +0200, Bastian Blank wrote:

> I really can't explain you why the behaviour is still the same. The mentioned bug shows a different problem.

I suspect that the referenced bug report was made with / being ext2, while nowadays ext3 is the default. If I'm right, then the automatic journal replay prevents fsck from complaining.

Hmm, I've found the logs of the first boot after the last libc upgrade:

dpkg.log fragment:

  2009-08-31 21:36:58 status installed libc6 2.9-26

messages.log indicates the machine was shut down properly:

  Aug 31 21:41:09 twister shutdown[20750]: shutting down for system halt
  Aug 31 21:41:16 twister kernel: Kernel logging (proc) stopped.
  Aug 31 21:41:16 twister kernel: Kernel log daemon terminating.
  Aug 31 21:41:26 twister exiting on signal 15

kernel.log of the next boot:

  Sep 1 07:11:31 twister kernel: [4.286170] EXT3-fs: INFO: recovery required on readonly filesystem.
  Sep 1 07:11:31 twister kernel: [4.286272] EXT3-fs: write access will be enabled during recovery.
  Sep 1 07:11:31 twister kernel: [4.480816] kjournald starting. Commit interval 5 seconds
  Sep 1 07:11:31 twister kernel: [4.480934] EXT3-fs: md0: orphan cleanup on readonly fs
  Sep 1 07:11:31 twister kernel: [4.481039] ext3_orphan_cleanup: deleting unreferenced inode 658
  Sep 1 07:11:31 twister kernel: [4.481076] ext3_orphan_cleanup: deleting unreferenced inode 529
  Sep 1 07:11:31 twister kernel: [4.490777] ext3_orphan_cleanup: deleting unreferenced inode 527
  Sep 1 07:11:31 twister kernel: [4.625170] EXT3-fs: md0: 3 orphan inodes deleted
  Sep 1 07:11:31 twister kernel: [4.625272] EXT3-fs: recovery complete.
  Sep 1 07:11:31 twister kernel: [4.628012] EXT3-fs: mounted filesystem with ordered data mode.
  Sep 1 07:11:31 twister kernel: [4.628130] VFS: Mounted root (ext3 filesystem) readonly on device 9:0.

So it's now the kernel that complains about the unclean shutdown instead of fsck, but the issue seems to be very much the same.

Gabor
Re: chmod 670
On Tue, Nov 13, 2007 at 01:25:35PM -0300, Patricio Rojo wrote:

> - If you try 'ls', then its contents are shown

Yes, because you have read permission.

> - If you try 'cd' to it, you get permission denied.

Yes, because you do not have search (x) permission.

> - If you try 'ls -l', you get many interrogation signs (?) instead of the properties of the file.

Yes, because you do not have search (x) permission, so ls cannot get the requested information, but it still has to display _something_.

> - If the user is changed to someone other than you, but the group remains the same, then you get full access.

Yes, because group permission bits are used only if you are _not_ the owner of the file.

> Anyways, getting many '?' is very awkward.

No, specifying rw- rights for a directory is what is awkward. You get what you've asked for.

Gabor
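The rule behind the answers above (exactly one permission class is consulted: owner, else group, else other) can be sketched in a few lines of C. This is only an illustration, not how the kernel implements it: the helper names effective_bits(), can_search() and can_list() are made up for this example, and supplementary groups and the root override are deliberately ignored.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper: returns the rwx triple (0..7) that applies to a
 * process with the given uid/gid on a file with the given mode and owner.
 * Only one class is consulted: owner first, then group, then other.
 * Supplementary groups and the superuser override are ignored here. */
unsigned effective_bits(mode_t mode, uid_t file_uid, gid_t file_gid,
                        uid_t uid, gid_t gid)
{
    if (uid == file_uid)
        return (mode >> 6) & 7;   /* owner bits, even if group grants more */
    if (gid == file_gid)
        return (mode >> 3) & 7;   /* group bits, used only for non-owners */
    return mode & 7;              /* other bits */
}

/* Entering a directory or stat()ing its entries needs the search (x) bit. */
bool can_search(unsigned bits)
{
    return (bits & 1) != 0;
}

/* Reading the list of names (plain ls) needs only the read (r) bit. */
bool can_list(unsigned bits)
{
    return (bits & 4) != 0;
}
```

With mode 0670 the owner gets rw- (can list names, cannot search), while a different user in the same group gets rwx, which matches the behaviour described in the original report.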
Bug#324900: nscd: umount /var fails (unclean shutdowns)
On Wed, Apr 25, 2007 at 09:35:04PM +0200, Pierre HABOUZIT wrote:

> I have the same problem but as it concerns a file that will be deleted anyway, it's not critical, and there is nothing that we can do (except code rc not to use bash I guess).

It may be useful to have some way (like an environment variable) that would forbid using nscd even if it is running.

Gabor
Bug#413744: glibc - uses more than one cpu without asking
On Tue, Mar 06, 2007 at 11:25:14PM +0100, Bastian Blank wrote:

> One of the s390 buildds, lxdebian, have two cpus online but is only allowed to use one full. This is followed by a make call without -j.

IMHO such policies should be enforced by binding the whole buildd to a single CPU by using tools like taskset or schedtool.

Gabor
Bug#412831: LD_ASSUME_KERNEL=2.2.5 denies to load libc6 version 2.5-0exp3
On Wed, Feb 28, 2007 at 02:02:52PM +0100, Achim Gaedke wrote:

> I could not find out whether it is intended to fail, but now I am convinced, it should not.

Yes it should.

> LD_ASSUME_KERNEL=2.4 also fails, LD_ASSUME_KERNEL=2.6 works...

LD_ASSUME_KERNEL=2.2 or 2.4 requests the use of LinuxThreads instead of NPTL, but LinuxThreads is no longer supported by glibc.

Gabor
Bug#330105: libc6-dev: __FD_SETSIZE equals to 1024 is too small
On Tue, Feb 06, 2007 at 12:49:24AM +0100, Pierre Habouzit wrote:

> TTBOMK __FD_SETSIZE is only used for fd_set's (so select, FD_* macros, ...), and redefining it won't work (I tried already in another life) without recompiling the libc at least -- if not the kernel too, I'm less sure about that one -- and that would be obviously binary incompatible with the rest of the linuxes around the globe :)

Just libc and everything that uses select - basically you have to rebuild the whole archive.

AFAIK sys_select() in the kernel can handle an arbitrarily large number of file descriptors, so one option is not to use glibc's select() wrapper but do your own. But a much better approach (both maintainability and performance wise) is to use epoll().

Gabor
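For illustration, a minimal sketch of waiting on a descriptor with the Linux epoll interface instead of select(). Since no fd_set is involved, the descriptor value is not bounded by FD_SETSIZE. The function name wait_readable() is made up for this example, and a real program would keep one epoll instance alive rather than creating one per call.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd becomes readable or timeout_ms
 * elapses. Returns 1 if readable, 0 on timeout, -1 on error.
 * Creating the epoll instance per call keeps the sketch short; a real
 * event loop would create it once and reuse it. */
int wait_readable(int fd, int timeout_ms)
{
    int ep = epoll_create1(0);
    if (ep < 0)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }

    struct epoll_event ready;
    int n = epoll_wait(ep, &ready, 1, timeout_ms);
    close(ep);

    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}
```

The descriptor passed in can be any value the process limit allows, including ones above 1023 that cannot be stored in a default-sized fd_set.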
Bug#395177: libc6: default library search path is inconsistent with gcc
On Wed, Oct 25, 2006 at 02:37:04PM +0200, Vincent Lefevre wrote:

> 5. Run it. You should get:
> GMP . Library: 4.2.1 - Header: 4.2.17

This means the soname of gmp 4.2.1 and 4.2.17 is the same (or you'd have got an "error while loading shared libraries ..." message). But that also means that adding /usr/local/lib to /etc/ld.so.conf would _NOT_ solve your problem since the loader would still find libgmp.so.X in /usr/lib first.

If you want to have two different shared libraries sharing the same soname on the same system, then there is no other solution than setting LD_LIBRARY_PATH to point explicitly to the version you want to use.

Gabor
Bug#395177: libc6: default library search path is inconsistent with gcc
On Fri, Oct 27, 2006 at 11:25:51AM +0200, Vincent Lefevre wrote:

> Not necessarily. The soname isn't defined in the header file, is it? (At compile time, it seems that the library was also 4.2.1, because I get the same problem when using -static, i.e., by not using shared libraries.)

Build-time linking has nothing to do with glibc. Besides, ld _does_ search /usr/local/lib before /usr/lib by default.

> > But that also means that adding /usr/local/lib to /etc/ld.so.conf would _NOT_ solve your problem since the loader would still find libgmp.so.X in /usr/lib first.
>
> So, couldn't the dynamic loader take into account /usr/local/lib by default (before /usr/lib), just like cpp takes into account /usr/local/include by default (before /usr/include)?

That would be a security nightmare as /usr/local is often writable/owned by users other than root (for example, looking at my etch chroot, it is writable by group 'staff' by default). Btw., this is also one of the many reasons why compiling software as root is generally considered insecure and discouraged.

> If this is really the only possibility, it should probably be set in /etc/profile in the default configuration (at install time) and other shell init files.

That would be a security problem as well. Only the local system administrator can decide whether things installed in /usr/local should override system software or not.

Gabor
Re: sysctl syscall removed in 2.6.18+ kernels
On Tue, Aug 15, 2006 at 06:11:18PM +0200, Aurelien Jarno wrote:

> It already started to annoy some people having their log filled with such messages.

IMHO if the message is not rate-limited you should bug the kernel developers (preferably upstream on linux-kernel). Printing the message 1-2 times per boot is OK but after that it is meaningless.

Gabor
Bug#381905: linux-kernel-headers: NGROUPS_MAX doesn't match the kernel and causes test failures
On Mon, Aug 07, 2006 at 11:43:07AM -0600, Vladislav Yasevich wrote:

> It is important to specify proper limits for this. This is a request to set NGROUPS_MAX to 65535, so it matches the kernel, in linux-kernel-headers package.

As already said, there is no way a fixed constant (NGROUPS_MAX) can match a value that can change during run-time. LTP should not use NGROUPS_MAX but use sysconf(_SC_NGROUPS_MAX) instead, which does The Right Thing(tm).

Gabor
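A minimal sketch of the run-time query suggested above. sysconf() and _SC_NGROUPS_MAX are standard POSIX; the wrapper name max_supplementary_groups() is only an illustration.

```c
#include <unistd.h>

/* Hypothetical wrapper: ask the running system for the maximum number
 * of supplementary groups instead of trusting the compile-time
 * NGROUPS_MAX constant. Returns -1 if the limit is indeterminate or
 * the call fails. */
long max_supplementary_groups(void)
{
    return sysconf(_SC_NGROUPS_MAX);
}
```

A test suite that sizes its gid_t arrays from this value keeps working when the kernel limit differs from the one the headers were built with.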
Bug#381905: linux-kernel-headers: NGROUPS_MAX doesn't match the kernel and causes test failures
On Tue, Aug 08, 2006 at 12:58:42PM -0400, Vlad Yasevich wrote: Another possibility is to backport some of the changes from glibc 2.4 that uses the /proc interface for these sysconf calls to get the values from the currently running kernel. This would fix things once and for all, but this might be too invasive for Sarge to accept. No need to go to glibc 2.4. sysconf in glibc 2.3.6 in etch already uses /proc according to strace. Gabor -- - MTA SZTAKI Computer and Automation Research Institute Hungarian Academy of Sciences, Laboratory of Parallel and Distributed Systems Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary Phone/Fax : +36 1 329-78-64 (secretary) W3: http://www.lpds.sztaki.hu - -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Bug#378303: linux-kernel-headers: bitmap.h: 'BITS_PER_LONG' undeclared
On Sat, Jul 15, 2006 at 08:42:56AM +0200, Julien Danjou wrote:

> In file included from /usr/include/linux/cpumask.h:86,
>                  from /usr/include/asm-i486/processor.h:23,
>                  from /usr/include/asm/processor.h:8,
>                  from /usr/include/asm-i486/atomic.h:6,
>                  from /usr/include/asm/atomic.h:8,
>                  from swapon02.c:87:

Anything that includes atomic.h from user space is fundamentally broken. The definitions in atomic.h are _NOT_ atomic on many architectures when used in user space (if they compile at all). Fix ltp.

Gabor
Bug#376756: libc6: popen() should return NULL
On Tue, Jul 04, 2006 at 05:08:19PM -0300, Jakson Aquino wrote:

> $ gcc testpopen.c
> $ ./a.out
> Before: F = (nil)
> After: F = 0x501010
> sh: nothing: command not found

This means that popen() _did_ succeed (it has invoked the shell). The fact that the shell could not interpret the command and therefore exited with an error is not popen()'s business anymore.

Gabor
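To tell "popen() failed" apart from "the command failed", the exit status returned by pclose() has to be inspected. A minimal sketch follows; the function name run_and_get_exit_status() is made up for this example, and it relies on the POSIX shell convention that a command which cannot be found yields exit status 127.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for popen()/pclose() in strict modes */
#endif
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper: run cmd through the shell and return its exit
 * status. Returns -1 if popen() itself failed (could not create the
 * pipe or the child) or if the command did not exit normally. */
int run_and_get_exit_status(const char *cmd)
{
    FILE *f = popen(cmd, "r");
    if (f == NULL)
        return -1;              /* the only case popen() reports */

    char buf[256];
    while (fgets(buf, sizeof buf, f) != NULL)
        ;                       /* drain the command's output */

    int status = pclose(f);     /* wait status of the shell */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling this with a nonexistent command name returns 127 from a POSIX shell, while the FILE pointer from popen() was perfectly valid, which is the situation shown in the report.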
Re: mktime libc6 bug (#196177) still in sarge
On Tue, Jun 13, 2006 at 10:48:05AM +0200, Kecskemethy Zoltan wrote:

> Recently I upgraded my woody webserver to sarge. Now I have 2.3.2.ds1-22sarge3 version of libc6 package installed on my system. Our websites use the mktime php function and it seems it gives us back wrong data because of a glibc bug. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=196177 When we use date before year 1970. It seems to me our stable package still has this bug in it. :(

To me it seems the bug is in the application, not in glibc. time_t is defined as seconds elapsed since 00:00:00 on January 1, 1970, Coordinated Universal Time (UTC), so it by definition cannot represent date values before 1970. SUSv3 says about mktime:

  If the time since the Epoch cannot be represented, the function shall return the value (time_t)-1.

Gabor
Re: IP addresses sorted in reverse order
On Thu, Jun 01, 2006 at 11:54:59AM +0200, Matus UHLAR - fantomas wrote:

> Nothing relies. It's just if you will receive addresses in some order, you should not reorder them unless you know what order they should be delivered in (e.g. ordering via RFC3484)

RFC3484 has _nothing_ to do with the NSS. If you are using the glibc interfaces, then you should only consult the relevant standards (POSIX, SUS, whatever). And those standards do not contain ordering constraints.

If you want to rely on the address ordering in the DNS reply, then you should not use the generic NSS interface (like gethostbyname() or getaddrinfo()) but you should use the resolver directly instead (see resolv(3)).

> I'm afraid this is not applicable and also you probably did not understand me. If there are clients on network A and network B and servers on network A and network B, DNS server may sort replies to clients so client A would get address of server A first, server B next. Client B would get addresses of server B first, server A next.

I'm perfectly aware of that and it seems you are the one who does not understand me. If you are already able to generate different DNS replies for A and B, then the SRV records should look like:

When queried by client A:

  _ssh._tcp.server.dom.ain. SRV 0 0 22 server-A.dom.ain.
  _ssh._tcp.server.dom.ain. SRV 1 0 22 server-B.dom.ain.

When queried by client B:

  _ssh._tcp.server.dom.ain. SRV 1 0 22 server-A.dom.ain.
  _ssh._tcp.server.dom.ain. SRV 0 0 22 server-B.dom.ain.

(Note the difference in the Priority field.) This does _exactly_ what you want and is standard-compliant. You just have to modify your ssh client to query the SRV records for _ssh._tcp when it wants to connect to server.dom.ain (and likewise for any other services).

> > Another option would be to play with routing instead of the DNS to direct your clients to the nearest server.
>
> Pardon? If one of servers is in our company's network and other is in different network, company, town, how do you imagine this?

Certainly you can only do this if you control the routing decisions of the clients. But since you filed this bugreport I assume you have full control of the clients, otherwise the bugreport makes no sense. And yes, messing with the routing is always tricky, but on intranets it is sometimes more efficient than DNS games.

Gabor
Re: IP addresses sorted in reverse order
On Wed, May 31, 2006 at 11:28:44PM +0200, Matus UHLAR - fantomas wrote:

> Searching mailinglists, bug databases did not give me correct answer. Does glibc sort/reorder IP addresses gotten from DNS? Is this fixed in any newer versions of glibc?

AFAIK there are no requirements about the order of addresses returned by any NSS calls. In particular, the returned order does not need to match the order in the underlying database (DNS in this case). Anything that relies on the returned addresses having a specific order is just plainly bogus.

If you want to access resources in a controlled order, then you should choose a method that was designed for this purpose, like SRV records. There is not much software that supports SRV records out of the box so it is quite likely that you have to modify the clients.

Another option would be to play with routing instead of the DNS to direct your clients to the nearest server.

Gabor
Bug#367962: Please don't ship a /lib64 symlink in the package on amd64
On Fri, May 19, 2006 at 10:58:33AM +0200, Goswin von Brederlow wrote:

> Local admins are already allowed to convert directories into links, e.g. to move parts of the directory tree to another disk.

According to Steve Langasek in Message-ID: [EMAIL PROTECTED] that's not allowed and you should use bind mounts instead.

Gabor
Bug#365233: libc6: memory leak in getprotobyname
On Fri, Apr 28, 2006 at 06:51:54PM +0200, Slaven Rezic wrote:

> #include <netdb.h>
> main() {
>     struct protoent *pent;
>     while(1) {
>         pent = getprotobyname("tcp");
>     }
> }

valgrind shows that the leaking function is fopen(), called from getprotobyname_r(). It seems that getprotobyname_r() does not check if it already has /etc/protocols open and always opens a new file descriptor for it.

Gabor
Bug#365048: libc6 does not respect STATUS and ACTION options in nsswitch.conf
On Thu, Apr 27, 2006 at 10:40:32AM -0400, Jesse W. Hathaway wrote:

> [...]
>     struct passwd *pw = getpwnam(user);
>     if (pw == NULL) return 0;
>     if (getgrouplist(user, pw->pw_gid, NULL, &ng) < 0) {
>         groups = (gid_t *) malloc(ng * sizeof (gid_t));
>         getgrouplist(user, pw->pw_gid, groups, &ng);
>     }
> [...]
>
> doing an strace on the above program when searching for a user in /etc/passwd shows ldap being searched, with or without [SUCCESS=return] in nsswitch.conf.

The above is not a good example. Do LDAP lookups happen with a single getpwnam() call _only_? If yes, then it is a bug, otherwise it's not.

getgrouplist() and initgroups() will _always_ enumerate all NSS group data sources regardless of action statements. It may be unfortunate sometimes due to the generated load, but this is how their semantics are defined. The only solution is not to use LDAP for the group database at all.

> Changing nsswitch to [UNAVAIL=return] disables ldap lookups for all requests even if the user is not in /etc/passwd.

Note that the UNAVAIL status refers only to the generic availability of the service, it has nothing to do with the user being defined or not. That said, "files [UNAVAIL=return] ldap" should not disable ldap (quite the contrary, it should have basically no effect unless you delete /etc/passwd etc.), so this may need further investigation.

Gabor
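For reference, a self-contained sketch of the kind of program the fragment above comes from. The function name user_groups() and the initial buffer size of 16 are assumptions made for this example, not taken from the report. Running such a program under strace shows which NSS group backends getgrouplist() consults.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* getgrouplist() is a BSD/glibc extension */
#endif
#include <grp.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns a malloc()ed array with all group IDs of
 * the named user and stores the number of entries in *count.
 * Returns NULL if the user is unknown or memory runs out.
 * The caller frees the result. */
gid_t *user_groups(const char *user, int *count)
{
    struct passwd *pw = getpwnam(user);   /* single-entry lookup */
    if (pw == NULL)
        return NULL;

    int ng = 16;                          /* first guess, grown on demand */
    gid_t *groups = malloc(ng * sizeof(gid_t));
    if (groups == NULL)
        return NULL;

    /* getgrouplist() walks the group data sources; on overflow it
     * returns -1 and stores the required size in ng. */
    if (getgrouplist(user, pw->pw_gid, groups, &ng) < 0) {
        gid_t *bigger = realloc(groups, ng * sizeof(gid_t));
        if (bigger == NULL) {
            free(groups);
            return NULL;
        }
        groups = bigger;
        if (getgrouplist(user, pw->pw_gid, groups, &ng) < 0) {
            free(groups);
            return NULL;
        }
    }

    *count = ng;
    return groups;
}
```

The getpwnam() call is a lookup and honours the action statements in nsswitch.conf; the getgrouplist() call is the part that enumerates every configured group source.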
Bug#365048: libc6 does not respect STATUS and ACTION options in nsswitch.conf
On Fri, Apr 28, 2006 at 08:03:39AM -0400, Jesse W. Hathaway wrote:

> Why is it defined that getgrouplist() and initgroups() _always_ enumerate all NSS groups?

Just think about the simple case when a user defined in /etc/passwd is also a member of a group that is only defined in LDAP. getgrouplist() and initgroups() MUST support this.

Or another viewpoint: when enumerating entries neither the SUCCESS nor the NOTFOUND conditions occur until all backends are exhausted, so [SUCCESS=return] or [NOTFOUND=return] has no effect on enumeration.

Btw. both the nsswitch.conf man page and the glibc documentation say:

  "The second item in the specification gives the user much finer control on the lookup process."

So they only mention the _lookup_ process (i.e. getXXbyYY()), they do not say that action statements would have any effect on enumeration.

> This can cause problems for system daemons. For instance apache2 does an initgroups every time it spawns a thread, which results in my ldap servers being pounded when I have high load on my webservers. Nscd is a possible solution to the problem, but the version in stable does not cache initgroup requests, and the version in unstable invalidates them prematurely. Having the ability to not search other databases for local name service lookups seems like a valuable function.

That is a well-known scenario and the usual advice is "do not use LDAP as the group NSS backend".

Gabor
Bug#365048: libc6 does not respect STATUS and ACTION options in nsswitch.conf
On Fri, Apr 28, 2006 at 10:51:38AM -0400, Jesse W. Hathaway wrote:

> I do understand why this feature is needed. However, the additional feature of having the ability to disable this function is also needed. It is quite common to not have any of the users, used for system daemons, to be included in groups found in network directories. It seems needless to query network directories for system daemons such as apache.

Yes, in some cases such a feature would be useful, but that feature currently does not exist.

> Enumeration is a lookup process, so I still think the man page is unclear, as to what effect the action statement will have in the group database option.

The documentation might be improved, but the documentation of SUCCESS talks about "the wanted entry" and the documentation of NOTFOUND talks about "the needed value", both terms having no meaning for enumeration. Well, you can interpret those terms as "all possible entries"; either way you get that SUCCESS and NOTFOUND action rules have no effect on enumeration.

> Given that one of the main features of LDAP and NIS are consistent groups across all machines, I think it would be beneficial to support querying network directories selectively.

I think the reason this was not solved much earlier is that it is not a problem for NIS/NIS+. They need far fewer resources than LDAP. Enumerating over a couple thousand users using NIS+ was not a problem when I last did it; doing the same with LDAP produces quite a significant load.

Gabor
Bug#363442: libc6-xen should not conflict with any other libc6-$flavor
On Wed, Apr 19, 2006 at 01:02:59PM -0500, Adam Heath wrote:

> I'm ready to upload xen 3.0.2, with a dependency on libc6-xen.

IMHO just go ahead with the upload :-) The removal of the other optimized flavors due to the conflict with libc6-xen should only cause some performance regression when you boot a non-xen kernel, it should not have any effect on usability.

Gabor
Bug#362966: Acknowledgement (nscd aborts)
On Sun, Apr 16, 2006 at 12:31:41PM -0700, Richard A Nelson wrote:

> Since things are still going after the change, this probably shouldn't be that high a priority issue... it shouldn't abort - a syslog note would be much nicer :)

No message should be generated. Pointing multiple user names to the same UID is perfectly legal.

Gabor
Re: r1381 - in glibc-package/trunk/debian: . local/manpages local/usr_sbin
On Mon, Apr 10, 2006 at 11:28:18AM +0200, Aurelien Jarno wrote:

> However, that does not fix #345907. Does anybody have an idea how to fix that? I think it is a good idea to let the user update its timezone manually, given the way we handle timezone in the stable release (though it should now be easy with the tzdata package).

How about storing the md5sum of /etc/localtime and updating it only if the checksum has not been changed (and warning the user otherwise)? X.org does something similar for auto-updating xorg.conf if the user did not change it manually.

Gabor
Bug#346342: libc6: REALLY annoying: destroys workaround all the time
On Mon, Apr 10, 2006 at 07:42:41AM +0200, Aurelien Jarno wrote:

> Then on the bug itself, I will try to investigate that. The solution is not trivial, if you look at the tzconfig script, you may notice that the script uses a readlink on /etc/localtime. Replacing it by a plain file may have consequences that have to be investigated.

tzconfig from libc6-2.3.6-5 already checks if /etc/localtime is a symlink or not and seems to handle both cases fine.

Gabor
Bug#227386: libc6-dev: ENOTSUP==EOPNOTSUPP, which violates SUSv3
On Fri, Feb 24, 2006 at 08:46:23AM +, Brian M. Carlson wrote:

Neither side is willing to lose and give in all the way. I tried a compromise. Apparently, that didn't work, so let me try another one: glibc could no longer claim compliance with SUSv3/POSIX 1003.1-2001 or SUSv2. Then there will be no bug, and we can all be happy.

Well, it would certainly make sense to document the known deviations from the various standards. Keeping the list up-to-date may be problematic however, so I think this still should be coordinated with upstream.

* libfoo is compiled against glibc 2.3.6.
* bar is compiled against libfoo and glibc 2.3.6.
* A new version of bar comes out, and is compiled against glibc 2.3.7 (which no longer has the bug in question, let's say).
* Now, several things could happen:
  + bar passes the error code it gets from some libc function to libfoo, and libfoo tries to handle it, libfoo errors out and bar no longer works.

IMHO this is acceptable as this should not be common and can be handled by using versioned dependencies on libfoo.

  + bar passes the error code it gets from some libc function to libfoo, which then tries to log the error by using strerror. This will cause silent breakage.

IMHO the version of strerror() need not be incremented in this case, so both bar and libfoo will use the same strerror() version.

This solution will avoid the vast majority of problems, but it isn't perfect. I am getting the impression that the others here want a perfect solution, and other than changing SONAMEs (which no one wants to do), it can't be done, AFAICT. If someone can find a solution which works in all cases, please let me know.

I don't think a perfect solution is needed. You only need to keep the benefits/expected breakage ratio acceptable, and since fixing the errno values provides only a really small benefit, the expected breakage should be quite low.

I found the change in question, where Mr. Drepper claims that Linus rejected his patch to create an ENOTSUP, and so he defined ENOTSUP to EOPNOTSUPP. But I cannot find the patch that Linus rejected.

Nor did I. It is well known that Ulrich Drepper and Linus Torvalds do not get along well. In this case I think Ulrich was wrong, but he did not want to acknowledge it. Sure, it is convenient if the kernel returns the same error code as glibc wants, but nothing stops glibc from remapping the error code if this is not the case. After all, the standards define the behaviour of the _library_ functions, not the kernel-glibc interface.

I also find his claim in the glibc bug I opened that it is part of the ABI unconvincing, especially now that I know it was he who made the change, and therefore part of the ABI. Also, I have no idea why the two errors were made the same, when item number 2 in the PROJECTS file is:

    [ 2] Test compliance with standards. If you have access to recent standards (IEEE, ISO, ANSI, X/Open, ...) and/or test suites you could do some checks as the goal is to be compliant with all standards if they do not contradict each other.

This has apparently been in that file since it was checked in, 9 years and 8 months ago.

Yep, Ulrich Drepper seems to be a hard man to deal with. But it is true that ENOTSUP == EOPNOTSUPP is part of the _current_ ABI, and changing that would be very unwise if it is not accepted upstream.

No, I can be sure they have separate values now, because glibc defines _POSIX_VERSION to 200112L. That indicates *complete* and *total* support of the base portions of POSIX 1003.1-2001. Such portions include the two error codes in question.

No, that at most indicates _intent_ to support the standard. The standard says that compliant systems should set that symbol to that value; it does not (and can not) say that non-compliant systems should not set it. Complete and total support would mean being officially certified, but that's not the case. And that's exactly the reason why real-world applications use solutions like autoconf instead of depending on feature macros. It's not just Linux, commercial vendors also have their fair share of standard non-compliance bugs from time to time.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
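To illustrate the last point, here is a minimal sketch (not from the thread; both function names are made up for illustration) that separates what the headers claim from what the system actually does. The first helper reports the _POSIX_VERSION value, the second checks the property under discussion directly, which is the approach an autoconf-style test takes.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: the POSIX revision the headers claim to support,
 * or -1 if the macro is not defined at all. */
long claimed_posix_version(void)
{
#ifdef _POSIX_VERSION
    return _POSIX_VERSION;
#else
    return -1L;
#endif
}

/* Hypothetical helper: 1 if ENOTSUP and EOPNOTSUPP really have different
 * values on the system this was compiled on, 0 if they are aliases. */
int enotsup_is_distinct(void)
{
    return ENOTSUP != EOPNOTSUPP;
}
```

A program that cares about the distinction would consult the second helper (or an equivalent configure-time test) rather than infer the answer from the first.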
Bug#227386: libc6-dev: ENOTSUP==EOPNOTSUPP, which violates SUSv3
On Thu, Feb 23, 2006 at 04:30:55AM +, Brian M. Carlson wrote:

By introducing a new define, you are breaking standard compliance.

Well, there is no better way. You want to preserve binary compatibility at the expense of all else. I want to preserve standards compliance at the expense of all else. I am trying to offer a compromise.

You cannot preserve standards compliance by breaking standards compliance, so that's out. What remains is preserving binary compatibility, and that can be achieved by doing nothing.

Actually, no it won't. It will continue to return the wrong value (EOPNOTSUPP) that existing code returns. At some point, one might want to fix that with new @GLIBC_2.3.7 symbols, but I'm not going to implement that right now. Also, see the paragraph above.

Oh, so you _do_ know how to fix it properly:

- Make ENOTSUP and EOPNOTSUPP have different values in the header
- Ensure that the implementations with the current symbol versions continue to return the old value to preserve binary compatibility
- Create a new version for every affected function that does the desired error-code remapping

So do it and propose a patch to upstream (or hire someone to do it for you). Handwaving and posting completely broken patches will not help. (Oh, and be prepared for Ulrich Drepper rejecting this exact change, as he already did in 1999).

If and when that happens, my code will be broken, and I will be happy to fix it. Expecting that I act as if something will happen, when I cannot be certain it will, is silly.

By the same argument, expecting ENOTSUP and EOPNOTSUPP to have different values in the future, when you cannot be certain they will (i.e. you haven't written a patch that got accepted upstream and forced every Linux user to upgrade), is silly. So you should fix your code _now_, and remove the extra handling of ENOTSUP/EOPNOTSUPP if/when they have separate values.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
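Fixing the code now can be as small as comparing against both names. A minimal sketch, assuming a hypothetical helper name; it gives the same answer whether or not the two constants share a value, so nothing needs to change if they are ever separated.

```c
#include <errno.h>

/* Hypothetical helper: true for either spelling of "not supported".
 * Where ENOTSUP == EOPNOTSUPP the second comparison is merely redundant. */
int is_not_supported(int err)
{
    return err == ENOTSUP || err == EOPNOTSUPP;
}
```

Unlike a switch statement, a chain of comparisons does not care about duplicate values, so no preprocessor guard is needed here.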
Bug#227386: libc6-dev: ENOTSUP==EOPNOTSUPP, which violates SUSv3
On Tue, Feb 21, 2006 at 06:23:21AM +, Brian M. Carlson wrote:

(void *)strerror(error_code);

Not thread safe.

Then use strerror_r().

Also, this code does not check that it is a *POSIX* error code. If you check the Linux sources, you can see that there are many error codes (mostly for NFS) that are not standard, and therefore are invalid for my program.

Let's check. EBADHANDLE is 521 (fwiw, it is not #defined unless __KERNEL__ is defined, which should not be the case for user-space programs):

    /* We want the SUSv3 version of strerror_r(), not the GNU one */
    #define _XOPEN_SOURCE 600
    #include <errno.h>
    #include <string.h>
    #include <stdio.h>

    int main(void)
    {
        errno = 0;
        strerror_r(EINVAL, NULL, 0);
        printf("%d %s\n", errno, strerror(errno));
        errno = 0;
        strerror_r(521, NULL, 0);
        printf("%d %s\n", errno, strerror(errno));
        return 0;
    }

This gives:

    34 Numerical result out of range
    22 Invalid argument

I have a patch forthcoming which mitigates the damage and allows people that really want standards compliance to indicate so.

SUSv3 clearly defines how an application should indicate that it desires standard compliance. By introducing a different scheme you are actually _breaking_ standard compliance here.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences

-- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Bug#227386: libc6-dev: ENOTSUP==EOPNOTSUPP, which violates SUSv3
On Sun, Feb 19, 2006 at 08:25:34AM +, Brian M. Carlson wrote:

Anyway, my problem is that the fact that these two errors are the same is causing my code to break very badly. I have a library that contains its own error codes that will be negative if casted to an int. Additionally, I want to support the use of the standard errno.h error codes. To make my life easier, I am using a script to generate a list of valid error codes: the POSIX ones, as well as mine. The code generated by the script uses a switch statement to check whether a code is valid. But because two case statements cannot have the same value, I get compiler errors. I have logic to check for EAGAIN and EWOULDBLOCK, and only use one if both are the same; I'd prefer not to have to do this for other pairs.

That seems overly complex. You should most certainly know the range of your own error codes, so something like the below looks much simpler (no script needed, no dependence on the value of standard error codes):

    if (error_code >= MY_ERROR_MIN && error_code <= MY_ERROR_MAX) {
        /* handle your own error codes here */
    } else {
        errno = 0;
        (void *)strerror(error_code);
        if (!errno) {
            /* error_code was known to libc */
        } else {
            /* unknown, probably invalid error_code */
        }
    }

After a quick check, at least Darwin 6.8 and FreeBSD 5.1 also use the same value for ENOTSUP and EOPNOTSUPP, so if you want to be portable, you cannot assume that these values are distinct.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
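If the generated switch statement from the original report is kept, the aliasing can be handled the same way the reporter already handles EAGAIN and EWOULDBLOCK. The following is only a sketch with a hypothetical function name and a deliberately short list of codes; it relies on the errno macros being usable in #if directives.

```c
#include <errno.h>

/* Hypothetical validity check covering a handful of POSIX error codes.
 * Names that may be aliases are guarded so that every case label in the
 * switch has a unique value. */
int is_listed_errno(int code)
{
    switch (code) {
    case EAGAIN:
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
    case ENOTSUP:
#if EOPNOTSUPP != ENOTSUP
    case EOPNOTSUPP:
#endif
    case EINVAL:
    case ENOENT:
        return 1;
    default:
        return 0;
    }
}
```

A generator script could emit one such guard for every pair that is known to coincide on some platform.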
Bug#353031: posix_fadvise defines missing
On Thu, Feb 16, 2006 at 12:24:27AM -0500, Greg Stark wrote:

If this is intentional (which seems unlikely, why should I have to define these things just to get a standard libc function?) then it's at the very least a documentation bug. The man page clearly indicates that only #include <fcntl.h> is required.

Well, the man page says:

    CONFORMING TO
    SUSv3 (Advanced Realtime Option), POSIX 1003.1-2003.
    [...]

And SUSv3 says:

    A POSIX-conforming application should ensure that the feature test macro _POSIX_C_SOURCE is defined before inclusion of any header.

SUSv3 specifies that the value of _POSIX_C_SOURCE should be 200112L, while the glibc documentation specifies that unless you explicitly define a feature macro, the default value of _POSIX_C_SOURCE will be 2 (and thus posix_fadvise() will not be visible). So it is documented behavior.

And no, posix_fadvise is not just a standard libc function; the only sane default for "standard" is what is described in ISO C99, and that does not contain posix_fadvise().

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
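In practice this means defining the feature test macro before the first #include. A minimal sketch, assuming a hypothetical wrapper name; the #ifndef guard is only there so the fragment also compiles when the build system already sets the macro.

```c
/* Request the POSIX.1-2001 interfaces before any header is included. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <fcntl.h>

/* Hypothetical helper: hint that fd will be read sequentially.
 * Returns 0 on success or an error number; posix_fadvise() reports
 * failure through its return value, not through errno. */
int advise_sequential(int fd)
{
    return posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}
```

Passing -D_POSIX_C_SOURCE=200112L on the compiler command line has the same effect and keeps the source files free of the define.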
Re: Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf
On Fri, Dec 23, 2005 at 01:21:54PM -0800, Edward Buck wrote:

The correct query order for mx1.hotmail.com (containing 2 dots) should be:

1. mx1.hotmail.com. - AAAA
2. mx1.hotmail.com. - A
3. mx1.hotmail.com.domain1.com. - AAAA
4. mx1.hotmail.com.domain1.com. - A
5. mx1.hotmail.com.domain2.com. - AAAA
6. mx1.hotmail.com.domain2.com. - A

If step 1 or 2 returns a host address, step 3 and later are skipped. The Debian (or glibc) query order is:

1. mx1.hotmail.com. - AAAA
2. mx1.hotmail.com.domain1.com. - AAAA
3. mx1.hotmail.com.domain2.com. - AAAA
4. mx1.hotmail.com. - A
5. mx1.hotmail.com.domain1.com. - A
6. mx1.hotmail.com.domain2.com. - A

With Debian's query order, mx1.hotmail.com exists as an A record yet the system doesn't check until it has already done 3 queries, 2 of which do not qualify as an 'initial absolute query'.

Ok, let's clarify some things here. resolv.conf(5) describes the behaviour of a _single_ resolver query. If you look at resolv/nss_dns/dns-host.c in the glibc source, you'll see that gethostbyname(3) is implemented as _two_ distinct resolver invocations. Since it is nowhere specified how many resolver invocations gethostbyname(3) should cause, the glibc behaviour is correct and will result in the second list of DNS queries you specified. If you want to avoid the extra query, you should use getaddrinfo(3) or the GNU-specific gethostbyname2(3) and explicitly specify the address family you are interested in.

The bug is not just limited to those who use the search line. If your resolv.conf contains 'domain ...', e.g.

    domain example.com
    nameserver x.x.x.x
    nameserver y.y.y.y

Then a query of mx1.hotmail.com will ALWAYS yield:

1. mx1.hotmail.com. - AAAA
2. mx1.hotmail.com.example.com. - AAAA (extraneous)
3. mx1.hotmail.com. - A

This is the same as before, as by default the search list is initialized to contain the local domain if there are no explicit search lines.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
Laboratory of Parallel and Distributed Systems
Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary
Phone/Fax : +36 1 329-78-64 (secretary)
W3: http://www.lpds.sztaki.hu
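The getaddrinfo(3) suggestion above could look roughly like the sketch below. The helper name and its return convention are assumptions made for illustration; the relevant part is setting ai_family so that only one address family is looked up.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: look up only IPv4 addresses for `name` and write
 * the first one to `out` in dotted-quad form.
 * Returns 0 on success or a non-zero getaddrinfo() error code
 * (EAI_SYSTEM if the address could not be formatted). */
int lookup_ipv4(const char *name, char *out, size_t outlen)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    const struct sockaddr_in *sin;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* one address family only */
    hints.ai_socktype = SOCK_STREAM; /* avoid one result per socket type */

    rc = getaddrinfo(name, NULL, &hints, &res);
    if (rc != 0)
        return rc;

    sin = (const struct sockaddr_in *)res->ai_addr;
    if (inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)outlen) == NULL)
        rc = EAI_SYSTEM;

    freeaddrinfo(res);
    return rc;
}
```

A caller would pass a buffer of at least INET_ADDRSTRLEN bytes and use gai_strerror() to report a non-zero result.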
Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf
On Wed, Dec 21, 2005 at 06:08:04PM +0100, Marco d'Itri wrote:

I find this reasoning very peculiar. If an algorithm is inefficient and this causes problems then it is obviously buggy.

An algorithm is buggy if it does not match the specification. I see no description about the lookup order wrt. multiple protocols, so the behaviour is conformant to the documentation. Also note that resolv.conf(5) explicitly says that using search lines may be slow and will generate a lot of network traffic if the servers for the listed domains are not local, and that queries will time out if no server is available for one of the domains.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
Laboratory of Parallel and Distributed Systems
Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary
Phone/Fax : +36 1 329-78-64 (secretary)
W3: http://www.lpds.sztaki.hu
Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf
On Wed, Dec 21, 2005 at 10:42:03AM -0800, Edward Buck wrote:

On the first point, I (and thus my company) use search lines in combination with LAN-only DNS subdomains for internal address management. It allows us to use internal IP addresses for hosts without fiddling with /etc/hosts. All our host subdomains are managed in DNS. A LOT of scripts, i.e. for backup, rsync, load balancing, use short hostnames that get their address information from internal DNS zones, a process that depends on the search functionality in /etc/resolv.conf.

My personal opinion is that this is wrong, and now you are trying to paper over an initial design flaw. If you had had a policy to always use full host names everywhere, you would not have this problem now. In my experience relying on lookup service configuration is never good.

To give you an idea of impact, I was recently greeted with an e-mail from a DNS service provider that I use saying that I was getting close to my query quota. It surprised me that I got this e-mail because I was never close to hitting the quota before. It turns out that 90% of the queries were coming from 1 server where I unwittingly added the domain to the search path!

Well, resolv.conf(5) says about search lines that they will generate a lot of network traffic if the servers for the listed domains are not local. You should not add a search line for a domain not served by a local name server. In most cases this can be solved by installing a local caching-only name server.

On the subject of work-arounds, I'm not having much luck finding one without recompiling glibc, which is not a good option IMO. If anyone has any ideas on this, please let me know.

Did you try "apt-get install bind9" and putting "nameserver 127.0.0.1" in /etc/resolv.conf? You can also try lwresd with libnss-lwres if you need something smaller, or djbdns if you like its author :-) This may reduce your DNS traffic even more than changing the lookup order in glibc would. Of course you have to pay with some memory and a little CPU usage.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
Laboratory of Parallel and Distributed Systems
Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary
Phone/Fax : +36 1 329-78-64 (secretary)
W3: http://www.lpds.sztaki.hu
Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf
On Fri, Dec 23, 2005 at 01:31:16AM +0100, Marco d'Itri wrote:

Yet another very peculiar definition from you.

Well, that's the first thing taught in programming theory here. If the algorithm matches the specification, then it is correct. If the specification does not cover something, then the algorithm is free to choose whatever behaviour it prefers. Of course, a correct algorithm is not necessarily the best. Bubblesort is correct since the traditional sorting specification does not put constraints on the number of comparisons, but there is a reason people prefer to use qsort when there are more than a handful of elements :-)

Which is true, but does not mean that the current behaviour is correct and should not be changed.

Then convince upstream to change the current behaviour. Or write a patch, and convince the Debian glibc team that Debian should diverge from upstream behaviour.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
Laboratory of Parallel and Distributed Systems
Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary
Phone/Fax : +36 1 329-78-64 (secretary)
W3: http://www.lpds.sztaki.hu
Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf
On Tue, Dec 20, 2005 at 12:41:05AM +, Stephen Gran wrote:

I guess the answer to this problem for you is to just disable ipv6 (unless you need it) - blacklisting the kernel module(s) ought to do it, although there may be some other parts I am unaware of.

I doubt disabling IPv6 in the kernel would make any difference since querying for AAAA records does not require an IPv6 socket. You will only find out if IPv6 is disabled if you do have an AAAA record and you try to use the address.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
Laboratory of Parallel and Distributed Systems
Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary
Phone/Fax : +36 1 329-78-64 (secretary)
W3: http://www.lpds.sztaki.hu
Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf
On Wed, Dec 14, 2005 at 11:41:38AM -0800, Edward Buck wrote:

If it's a frequently used feature, it wasn't available until sarge. Woody did not behave this way (I checked).

Huh?

    $ cat /etc/debian_version
    3.0
    $ cat /etc/resolv.conf
    search hpcc.sztaki.hu lpds.sztaki.hu sztaki.hu
    nameserver 127.0.0.1
    $ ping rs2.lvs
    PING rs2.lvs.sztaki.hu (193.6.200.132): 56 data bytes
    ...

It is definitely available in Woody. I'm using it regularly.

Also, this new feature completely breaks software that doesn't expect this feature, since postfix, telnet and others are doing WAY more DNS queries than they should. Depending on how many search domains are listed and how many caching nameservers are listed, in my case (2 search domains and 2 nameservers) I count at least 4 unnecessary queries. That's very bad.

Well, why do you have any search domains then? It is for human convenience only, and a mail server usually does not have regular user accounts so no need for such convenience features.

Sure, I can do that with telnet interactively. How do I tell postfix to do that without a patch? I guess I could try setting the ndots option to postfix's environment but that seems like a bad hack. The current behavior makes using the search lines impossible for busy servers, especially mail servers that do DNS queries for every piece of mail. Just imagine the excess DNS load on a server processing a million e-mail messages a day. That's what I'm seeing.

For such setups I suggest running some local DNS-caching solution (nscd or a local caching-only name server).

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
Laboratory of Parallel and Distributed Systems
Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary
Phone/Fax : +36 1 329-78-64 (secretary)
W3: http://www.lpds.sztaki.hu
Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf
On Mon, Dec 12, 2005 at 09:13:13PM -0800, Edward Buck wrote: In a nutshell, when using 'search' lines in /etc/resolv.conf, the resolver always appends listed search domains to a hostname lookup even when the host being searched is fully-qualified (contains one or more dots). No, a host name containing a dot is _not_ a FQDN. A host name _ending_ with a dot is a FQDN. Using host.subdomain while search is set to some.domain to access host.subdomain.some.domain is a common and frequently used feature. This results in a LOT of needless DNS traffic. On a busy mail server, it makes using the 'search' lines extremely expensive (because DNS traffic increases exponentially). Here's an strace of 'telnet mx1.hotmail.com 25'. Oddly, it seems to do the right thing initially but the fully-qualified lookup must always fail, resulting in subsequent lookups using the search list. Then use a _real_ FQDN and try 'telnet mx1.hotmail.com. 25' (note the terminating dot). Gabor -- - MTA SZTAKI Computer and Automation Research Institute Hungarian Academy of Sciences - -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
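The difference between a name that merely contains dots and one that ends with a dot can be made concrete with a small model. The sketch below is a simplification written for this illustration, not glibc's code: it only lists, in order, the names a single resolver query would try under the ndots rule from resolv.conf(5), whereas the real resolver stops at the first name that resolves. The function name and the fixed buffer size are assumptions.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAME 256

/* Simplified model of the candidate names for one resolver query.
 * Writes up to `max` candidates into `out` and returns how many. */
int build_candidates(const char *name, int ndots,
                     const char *const *search, int nsearch,
                     char out[][MAX_NAME], int max)
{
    size_t len = strlen(name);
    int dots = 0, n = 0, i;
    const char *p;

    /* A trailing dot marks a real FQDN: the search list is not used. */
    if (len > 0 && name[len - 1] == '.') {
        if (max > 0)
            snprintf(out[n++], MAX_NAME, "%s", name);
        return n;
    }

    for (p = name; *p; p++)
        if (*p == '.')
            dots++;

    /* Enough dots: the name is tried as an absolute name first. */
    if (dots >= ndots && n < max)
        snprintf(out[n++], MAX_NAME, "%s.", name);

    /* Then every search domain is appended in turn. */
    for (i = 0; i < nsearch && n < max; i++)
        snprintf(out[n++], MAX_NAME, "%s.%s.", name, search[i]);

    /* Too few dots: the absolute name is tried last. */
    if (dots < ndots && n < max)
        snprintf(out[n++], MAX_NAME, "%s.", name);

    return n;
}
```

With ndots set to 1 and two search domains, "mx1.hotmail.com" yields three candidates while "mx1.hotmail.com." yields exactly one, which is why the terminating dot avoids the extra lookups.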
Bug#331405: Accidential activation of nscd is too simple
On Mon, Oct 03, 2005 at 12:33:45PM +0200, Martin Samuelsson wrote:

Obviously something has automatically dragged nscd into my system as one of it's dependencies. (It's marked A in aptitude) And having a software cacheing dns lookups from disconnected moments doesn't really make a laptop very useable when being connected.

What is that something? Investigating the output of "apt-cache rdepends nscd", libnss-pgsql1 and libnss-mdns Suggests: nscd, and libnss-ldap Recommends: it, but nothing Depends: on it. So you should've been given a choice by whatever package installation frontend you've used.

One could say that I should have better knowledge of exactly what software that is on my system, and how it is configured. However I've always found the debian way to be having software installed with reasonable defaults. Which I don't think this behaviour is, considering it simple to get installed without realizing it.

Well, the description of nscd says:

    You should install this package only if you use slow Services like LDAP, NIS or NIS+.

If you are not using one of these services, why did you choose to install nscd? (I don't dare to assume that you haven't even read the package description before letting an unknown and unrequested package be installed on your system...) Also, the default negative-ttl for the hosts map is just 20 seconds which I think _is_ a quite reasonable default.

My suggestion would be that nscd was configured by default to not start or to never cache any data until explicitly told so by a simple, but active act from the system administrator.

Why? You said "I've always found the debian way to be having software installed with reasonable defaults". The only reasonable default for a program called "Name Service Caching Daemon" is to cache name service calls when installed. Otherwise why did you install it at all?

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
Bug#330929: how can adduser reliably find out wheter nscd is running?
On Mon, Oct 03, 2005 at 06:07:26PM +0200, Marc Haber wrote: That's how we're going to do it in the future. I would, however, like to have the fact documented that it is not an error to run nscd -i if no daemon is running. What do you mean by not an error? nscd -i of course will return a non-0 exit code since it cannot invalidate the cache if the daemon is not running, but that is OK for adduser. I think you simply should not care if nscd -i succeeds or fails. Gabor -- - MTA SZTAKI Computer and Automation Research Institute Hungarian Academy of Sciences - -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Bug#330929: how can adduser reliably find out wheter nscd is running?
On Tue, Oct 04, 2005 at 02:18:18PM +0200, Marc Haber wrote:

I am concerned about nscd suddenly giving an actual error message instead of silently returning non-zero, which might confuse the users.

Well, my personal preference would be not to worry about this until it becomes a real issue (that future error message may be descriptive enough, after all). Or you can just add 2>/dev/null to the nscd -i invocation.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
Laboratory of Parallel and Distributed Systems
Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary
Phone/Fax : +36 1 329-78-64 (secretary)
W3: http://www.lpds.sztaki.hu
Bug#330929: how can adduser reliably find out wheter nscd is running?
On Tue, Oct 04, 2005 at 02:41:26PM +0200, Marc Haber wrote:

Adduser has _a lot_ of installations and is mainly used by maintainer scripts. Thus, _a lot_ of people are bound to see error messages generated by adduser and are bound to be confused by them.

Or you can just add 2>/dev/null to the nscd -i invocation.

That will also kill real error messages.

Are there any useful real error messages now? For example, if nscd is not running, trying to invalidate a non-existing map will not produce an error message, and if nscd is running, the error message is misleading.

I still prefer not fixing what is not broken. glibc upgrades are always complex and require lots of testing. So I think even if nscd behavior is changed upstream in the future, there will be enough time before the next Debian release to discover any problems and update adduser if necessary.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
Laboratory of Parallel and Distributed Systems
Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary
Phone/Fax : +36 1 329-78-64 (secretary)
W3: http://www.lpds.sztaki.hu
Re: libc 2.3.5 and heap corruption checking?
On Wed, Sep 07, 2005 at 09:29:04AM -0600, Ben Pearre wrote:

    *** glibc detected *** malloc(): memory corruption: 0x083ba0e8 ***
    zsh: abort (core dumped) matlab -nosplash -nojvm
    [...]
    (143)% export MALLOC_CHECK_=1
    (0)% matlab
    [...]
    [1;2]*[3 4]
    Segmentation violation detected at Wed Sep 7 00:55:16 2005
    [...]

Well, setting MALLOC_CHECK_ prevents glibc from calling abort() when it detects heap corruption but it will _not_ fix the bug in Matlab. You can do some more investigation though:

- Run Matlab on a machine with the old glibc, but set MALLOC_CHECK_ to 2. If that also crashes, then the bug is clearly in Matlab and you should go complain to your Matlab vendor.
- Run Matlab under valgrind, that will show you more information about heap allocation (it can detect more problems than glibc's internal checker and can also give much better location information if you have debugging symbols).

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
Bug#324900: nscd: umount /var fails (unclean shutdowns)
On Thu, Sep 01, 2005 at 02:04:25PM +0900, GOTO Masanori wrote: - bash unconditionally does some NSS calls during startup (getpwuid etc.); this in turn - loads all NSS modules that serve passwd maps - if a module uses libraries from /usr, now you have a live memory mapping under /usr so you cannot unmount it during shutdown Why is this unconditionally happenned? What setting does this cause this problem? Because bash calls getpwuid() to initialize the value of $SHELL before executing the script. Gabor -- - MTA SZTAKI Computer and Automation Research Institute Hungarian Academy of Sciences, Laboratory of Parallel and Distributed Systems Address : H-1132 Budapest Victor Hugo u. 18-22. Hungary Phone/Fax : +36 1 329-78-64 (secretary) W3: http://www.lpds.sztaki.hu - -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
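For readers unfamiliar with the call, the lookup in question looks roughly like the sketch below. This is not bash's code; the helper name and its return convention are made up for illustration. The point is that a single getpwuid() call is enough to make libc consult whatever NSS sources are configured for the passwd database.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: copy the login shell of `uid` into `buf`.
 * Returns 0 on success, -1 if the passwd lookup finds no entry.
 * The getpwuid() call goes through NSS, so it may load NSS modules
 * and, if the daemon is running, talk to nscd. */
int shell_for_uid(uid_t uid, char *buf, size_t len)
{
    struct passwd *pw = getpwuid(uid);

    if (pw == NULL || pw->pw_shell == NULL)
        return -1;
    snprintf(buf, len, "%s", pw->pw_shell);
    return 0;
}
```

Calling it with getuid() as the argument mirrors what a shell would do to find a default for $SHELL.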
Bug#324900: nscd: umount /var fails (unclean shutdowns)
On Wed, Aug 31, 2005 at 01:36:24PM +0900, GOTO Masanori wrote:

    root 1119 1 0 12:47 ? 00:00:00 /bin/sh /etc/init.d/rc 6
    [...]
    rc 1119 root mem REG 8,9 217016 228931 /var/db/nscd/passwd

It's very weird behavior. Please disable nscd when your boot up time, and then run /etc/init.d/nscd. You can see which processes have such nscd file descriptor (fd), and you can clear process inheritance with pstree easily. If nothing has nscd fd, we can clear rc behaves oddly.

Well, if /bin/sh is bash, then it is not weird at all, it is the same bash vs. NSS problem that came up several times in the past (last time quite recently on debian-devel). Previously it only happened with NSS modules that link to libraries under /usr, now it also affects nscd.

What I _suspect_ is happening here:

- by calling /etc/init.d/rc, bash is executed
- bash unconditionally does some NSS calls during startup (getpwuid etc.); this in turn
  - loads all NSS modules that serve passwd maps
  - if a module uses libraries from /usr, now you have a live memory mapping under /usr so you cannot unmount it during shutdown
- bash (libc) connects to nscd
- nscd sends a file descriptor of /var/db/nscd/passwd to bash, and bash does an mmap(2) on the received fd
- now you have a live memory mapping under /var thus you cannot unmount it during shutdown
- /etc/init.d/rc eventually kills nscd but that does not help, since the bash process executing /etc/init.d/rc still has the cache file mapped (deleting the cache file also doesn't help since unlink(2) only operates on the directory and does not invalidate the memory mapping)

Options you have to avoid the problem:

1. Stop using bash as /bin/sh
2. Stop using nscd
3. Convert /etc/init.d/rc to be an ELF executable instead of a shell script
4. Let /var (and /usr if you are running more exotic NSS modules) be the same filesystem as /
5. Redesign the init system so unmounting of local file systems is done _after_ /etc/init.d/rc has finished (and the program that does the unmounting must not be a shell script)

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
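The remark that unlink(2) does not invalidate a memory mapping can be checked with a few lines of C. This is a stand-alone demonstration written for this purpose, not code from nscd or bash; the temporary path under /tmp and the function name are assumptions.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600
#endif

#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if data stays readable through a mapping after the file has
 * been unlinked and its descriptor closed, 0 if not, -1 on setup errors. */
int mapping_survives_unlink(void)
{
    char path[] = "/tmp/unlink-demo-XXXXXX";
    const char msg[] = "cached data";
    char *map;
    int fd, ok;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, msg, sizeof msg) != (ssize_t)sizeof msg) {
        close(fd);
        unlink(path);
        return -1;
    }

    map = mmap(NULL, sizeof msg, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);       /* the mapping does not need the descriptor */
    unlink(path);    /* removes only the directory entry */
    if (map == MAP_FAILED)
        return -1;

    ok = memcmp(map, msg, sizeof msg) == 0;
    munmap(map, sizeof msg);
    return ok;
}
```

The file's contents remain reachable through the mapping until munmap() is called or the process exits, which matches the behaviour described for the shell running /etc/init.d/rc.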
Bug#304604: locales: Upgrading fails
On Sat, Apr 16, 2005 at 06:18:02PM +0900, GOTO Masanori wrote: Could you investigate perl with glibc 2.3.2.ds1-20 + MALLOC_CHECK_=0 to 3? If perl has double free bug, such bug should be also appeared on sarge/sid environment. With glibc 2.3.2.ds1-21 MALLOC_CHECK_=3 dpkg-reconfigure dash also dies: [...] malloc: using debugging hooks free(): invalid pointer 0x88d9d20! Aborted Gabor -- Gabor Gombas CERN IT-GM-DM [EMAIL PROTECTED] Office: 31/S-016 -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Bug#304604: locales: Upgrading fails
On Fri, Apr 15, 2005 at 09:37:09AM +0900, GOTO Masanori wrote:

Before we reassign to perl, we need to clear which setting triggers your problem. At least under my environment, this problem is not appeared. Could you confirm your environment settings (ex: LANG or malloc related value), and perl versions?

    LANG=en_US.UTF-8
    LANGUAGE=en_US:en_GB:en
    LC_CTYPE=hu_HU.UTF-8

    ii  perl  5.8.4-8  Larry Wall's Practical Extraction and Report

FYI: Last few lines from "strace dpkg-reconfigure dash":

    28191 fsync(4) = 0
    28191 stat64("/var/cache/debconf/config.dat", {st_mode=S_IFREG|0644, st_size=74096, ...}) = 0
    28191 rename("/var/cache/debconf/config.dat", "/var/cache/debconf/config.dat-old") = 0
    28191 rename("/var/cache/debconf/config.dat-new", "/var/cache/debconf/config.dat") = 0
    28191 close(5) = 0
    28191 close(4) = 0
    28191 close(6) = 0
    28191 close(8) = 0
    28191 close(3) = 0
    28191 munmap(0xb7f57000, 4096) = 0
    28191 munmap(0xb7f56000, 4096) = 0
    28191 close(7) = 0
    28191 open("/dev/tty", O_RDWR|O_NONBLOCK|O_NOCTTY) = 3
    28191 writev(3, [{"*** glibc detected *** ", 23}, {"double free or corruption (!prev"..., 33}, {": 0x", 4}, {"08785b58", 8}, {" ***\n", 5}], 5) = 73
    28191 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
    28191 tgkill(28191, 28191, SIGABRT) = 0
    28191 --- SIGABRT (Aborted) @ 0 (0) ---
    28191 +++ killed by SIGABRT +++

Compared to the last lines from "MALLOC_CHECK_=0 strace dpkg-reconfigure dash":

    28157 fsync(4) = 0
    28157 stat64("/var/cache/debconf/config.dat", {st_mode=S_IFREG|0644, st_size=74096, ...}) = 0
    28157 rename("/var/cache/debconf/config.dat", "/var/cache/debconf/config.dat-old") = 0
    28157 rename("/var/cache/debconf/config.dat-new", "/var/cache/debconf/config.dat") = 0
    28157 close(5) = 0
    28157 close(4) = 0
    28157 close(6) = 0
    28157 close(8) = 0
    28157 close(3) = 0
    28157 munmap(0xb7f57000, 4096) = 0
    28157 munmap(0xb7f56000, 4096) = 0
    28157 close(7) = 0
    28157 exit_group(0) = ?

So perl seems to be dying in exit().

Gabor

--
Gabor Gombas
CERN IT-GM-DM
[EMAIL PROTECTED]
Office: 31/S-016
Bug#304604: locales: Upgrading fails
Package: locales
Version: 2.3.4-3
Severity: important
Tags: experimental

Hi,

Trying to upgrade to 2.3.4-3 gives the following error:

    Setting up locales (2.3.4-3) ...
    Generating locales...
      en_US.ISO-8859-1... done
      en_US.ISO-8859-15... done
      en_US.UTF-8... done
      hu_HU.ISO-8859-2... done
      hu_HU.UTF-8... done
    Generation complete.
    *** glibc detected *** double free or corruption (!prev): 0x08781408 ***
    dpkg: error processing locales (--configure):
     subprocess post-installation script killed by signal (Aborted)
    Errors were encountered while processing:
     locales
    E: Sub-process /usr/bin/dpkg returned an error code (1)

-- System Information:
Debian Release: 3.1
APT prefers unstable
APT policy: (990, 'unstable'), (101, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.6.11
Locale: LANG=en_US.UTF-8, LC_CTYPE=hu_HU.UTF-8 (charmap=UTF-8)

Versions of packages locales depends on:
ii  debconf                1.4.48   Debian configuration management sy
ii  libc6 [glibc-2.3.4-3]  2.3.4-3  GNU C Library: Shared libraries an

-- debconf information:
* locales/default_environment_locale: en_US.UTF-8
* locales/locales_to_be_generated: en_US ISO-8859-1, en_US.ISO-8859-15 ISO-8859-15, en_US.UTF-8 UTF-8, hu_HU ISO-8859-2, hu_HU.UTF-8 UTF-8
Bug#304604: locales: Upgrading fails
On Fri, Apr 15, 2005 at 01:44:38AM +0900, GOTO Masanori wrote:

Could you track this installation process in detail and clear which binary did this error cause?

The error comes from perl (debconf). Actually all debconf-using packages seem to be broken with libc-2.3.4; dpkg-reconfigure also dies with the same error. You may reassign this bug to perl, but if perl is not fixed before sarge (unlikely, since AFAIK perl is frozen, and the bug does not affect sarge), a more generic workaround might be needed for the sarge-etch transition (yeah, I know it is in the distant future ;-)).

Ok, more stracing shows that the abort() happens when perl exits, so the bug would be harmless if malloc() and friends would just not call abort() by default.

Gabor

--
Gabor Gombas
CERN IT-GM-DM
[EMAIL PROTECTED]
Office: 31/S-016