Bug#551158: Libc6: exim4 Segfault in libc-2.7.so

2009-10-28 Thread Gabor Gombas
Hi,

On Tue, Oct 27, 2009 at 06:21:24AM +0100, Fabio Rosciano wrote:

 thanks for helping out.
 Here it is, I can't see anything funny:

[...]

Yes, the logs are pretty much the same as when I did the upgrade, except
I did not have libc6-dev-i386 installed and I went to 2.10.1-1 from
2.9-27.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -



-- 
To UNSUBSCRIBE, email to debian-glibc-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#551158: Libc6: exim4 Segfault in libc-2.7.so

2009-10-26 Thread Gabor Gombas
On Mon, Oct 26, 2009 at 09:02:22AM +0100, Fabio Rosciano wrote:
 On Mon, 2009-10-26 at 08:10 +0100, Aurelien Jarno wrote:
 
  Do you have the list of packages that have been upgraded?
 
 I wish I did, but as soon as debconf asked "would you like to upgrade
 libc6 now?" and I answered "yes", the system became completely unusable.

Do you have /var/log/dpkg.log? Even if it does not contain the action
that failed first, the lines before the failure may show if packages
were being installed in an unexpected order.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -






Re: Bug#551831: cupt: Incorrectly upgrades libc6, breaking the system

2009-10-21 Thread Gabor Gombas
On Wed, Oct 21, 2009 at 12:01:51PM +0300, Eugene V. Lyubimkin wrote:

 However, this is another side of already archived
 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=543365 (ironically, reported
 by you too). On i386 we have the issue: libc6-i686 strictly Pre-Depends on
 libc6 (= ...), and whichever of these two packages cupt tries to upgrade first,
 the pre-dependency will be broken.
 
 Let me try to add libc maintainers to the loop to know the correct upgrade 
 path.

Hmm. "Remove old libc6-i686, upgrade libc6, install new libc6-i686"
seems to be a sequence where the Pre-Depends never breaks. Since
libc6-i686 is not needed for the system to function properly, removing
it temporarily is not a problem. Now the question is: can it be
generalized to other packages?

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#545179: libc6: postinst must run telinit u

2009-09-08 Thread Gabor Gombas
On Mon, Sep 07, 2009 at 10:46:34AM +0200, Bastian Blank wrote:

  I really can't explain to you why the behaviour is still the same.
 
 The mentioned bug shows a different problem.

I suspect that the referenced bug report was made with / being ext2,
while nowadays ext3 is the default. If I'm right, then the automatic
journal replay prevents fsck from complaining.

Hmm, I've found the logs of the first boot after the last libc upgrade:

dpkg.log fragment:

2009-08-31 21:36:58 status installed libc6 2.9-26

messages.log indicates the machine was shut down properly:

Aug 31 21:41:09 twister shutdown[20750]: shutting down for system halt
Aug 31 21:41:16 twister kernel: Kernel logging (proc) stopped.
Aug 31 21:41:16 twister kernel: Kernel log daemon terminating.
Aug 31 21:41:26 twister exiting on signal 15

kernel.log of the next boot:

Sep  1 07:11:31 twister kernel: [4.286170] EXT3-fs: INFO: recovery required on readonly filesystem.
Sep  1 07:11:31 twister kernel: [4.286272] EXT3-fs: write access will be enabled during recovery.
Sep  1 07:11:31 twister kernel: [4.480816] kjournald starting.  Commit interval 5 seconds
Sep  1 07:11:31 twister kernel: [4.480934] EXT3-fs: md0: orphan cleanup on readonly fs
Sep  1 07:11:31 twister kernel: [4.481039] ext3_orphan_cleanup: deleting unreferenced inode 658
Sep  1 07:11:31 twister kernel: [4.481076] ext3_orphan_cleanup: deleting unreferenced inode 529
Sep  1 07:11:31 twister kernel: [4.490777] ext3_orphan_cleanup: deleting unreferenced inode 527
Sep  1 07:11:31 twister kernel: [4.625170] EXT3-fs: md0: 3 orphan inodes deleted
Sep  1 07:11:31 twister kernel: [4.625272] EXT3-fs: recovery complete.
Sep  1 07:11:31 twister kernel: [4.628012] EXT3-fs: mounted filesystem with ordered data mode.
Sep  1 07:11:31 twister kernel: [4.628130] VFS: Mounted root (ext3 filesystem) readonly on device 9:0.

So it's now the kernel that complains about the unclean shutdown
instead of fsck, but the issue seems to be very much the same.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -






Re: chmod 670

2007-11-14 Thread Gabor Gombas
On Tue, Nov 13, 2007 at 01:25:35PM -0300, Patricio Rojo wrote:

  - If you try 'ls', then its contents are shown

Yes, because you have read permission.

  - If you try 'cd' to it, you get permission denied.

Yes, because you do not have search (x) permission.

  - If you try 'ls -l', you get many question marks (?) instead
 of the properties of the file.

Yes, because you do not have search (x) permission, so ls can not
get the requested information, but it still has to display _something_.

  - If the user is changed to someone other than you, but the group
 remains the same, then you get full access.

Yes, because group permission bits are used only if you are _not_ the
owner of the file.

 Anyways, getting many '?' is very awkward.

No, specifying rw- rights for a directory is what is awkward. You get
what you've asked for.
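
The rules above can be sketched in a few lines of C. This is a simplified model, not the kernel's actual check: it assumes a single effective UID/GID and ignores root, supplementary groups and ACLs. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Return the rwx bits (0-7) that apply to a process with the given
 * effective uid/gid. Exactly one class is consulted: the owner class
 * wins even when the group class would grant more. */
unsigned applicable_bits(mode_t mode, uid_t file_uid, gid_t file_gid,
                         uid_t uid, gid_t gid)
{
    if (uid == file_uid)
        return (mode >> 6) & 7;   /* owner bits only */
    if (gid == file_gid)
        return (mode >> 3) & 7;   /* group bits, only for non-owners */
    return mode & 7;              /* everybody else */
}

/* On a directory, r allows listing names and x allows searching
 * (cd, and the stat() calls that 'ls -l' needs). */
bool can_list(unsigned bits)   { assert(bits <= 7); return (bits & 4) != 0; }
bool can_search(unsigned bits) { assert(bits <= 7); return (bits & 1) != 0; }
```

With mode 670 the owner gets rw- (can list, cannot search), while a different user in the same group gets rwx, which matches the four observations quoted above.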

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#324900: nscd: umount /var fails (unclean shutdowns)

2007-04-26 Thread Gabor Gombas
On Wed, Apr 25, 2007 at 09:35:04PM +0200, Pierre HABOUZIT wrote:

   I have the same problem but as it concerns a file that will be
 deleted anyway, it's not critical, and there is nothing that we can do
 (except code rc not to use bash I guess).

It may be useful to have some way (like an environment variable) that
would forbid using nscd even if it is running.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -



Bug#413744: glibc - uses more than one cpu without asking

2007-03-08 Thread Gabor Gombas
On Tue, Mar 06, 2007 at 11:25:14PM +0100, Bastian Blank wrote:

 One of the s390 buildds, lxdebian, has two CPUs online but is only
 allowed to fully use one. This is followed by a make call without -j.

IMHO such policies should be enforced by binding the whole buildd to a
single CPU by using tools like taskset or schedtool.
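
For reference, what taskset does can be approximated from C with the Linux-specific sched_setaffinity() call. The sketch below is only an illustration, with made-up function names; it pins the calling process to the first CPU it is currently allowed to run on. The affinity mask is inherited by child processes, so a wrapper that did this before starting the build would confine everything it spawns.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling process to the first CPU in its current affinity
 * mask. Returns the CPU number, or -1 on error. */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed, one;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            if (sched_setaffinity(0, sizeof(one), &one) != 0)
                return -1;
            return cpu;
        }
    }
    return -1;
}

/* Number of CPUs the calling process may run on, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}
```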

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#412831: LD_ASSUME_KERNEL=2.2.5 denies to load libc6 version 2.5-0exp3

2007-02-28 Thread Gabor Gombas
On Wed, Feb 28, 2007 at 02:02:52PM +0100, Achim Gaedke wrote:

 I could not find out whether it is intended to fail, but now I am convinced, 
 it should not.

Yes it should.

 LD_ASSUME_KERNEL=2.4 also fails, LD_ASSUME_KERNEL=2.6 works...

LD_ASSUME_KERNEL=2.2 or 2.4 requests the use of LinuxThreads instead of
NPTL, but LinuxThreads is no longer supported by glibc.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#330105: libc6-dev: __FD_SETSIZE equals to 1024 is too small

2007-02-07 Thread Gabor Gombas
On Tue, Feb 06, 2007 at 12:49:24AM +0100, Pierre Habouzit wrote:

   TTBOMK __FD_SETSIZE is only used for fd_set's (so select, FD_* macros,
 ...), and redefining it won't work (I tried already in another life)
 without recompiling the libc at least -- if not the kernel too, I'm less
 sure about that one -- and that would be obviously binary incompatible
 with the rest of the linuxes around the globe :)

Just libc and everything that uses select - basically you have to
rebuild the whole archive.

AFAIK sys_select() in the kernel can handle an arbitrarily large number of
file descriptors, so one option is not to use glibc's select() wrapper
but do your own. But a much better approach (both maintainability and
performance wise) is to use epoll().
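
A minimal sketch of the epoll approach follows. The helper name is hypothetical, and creating an epoll instance per call is done only to keep the example short; a real server would create one instance and reuse it. Unlike an fd_set, nothing here depends on the descriptor number being below FD_SETSIZE.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Wait up to timeout_ms milliseconds for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0);

    int ep = epoll_create1(0);
    if (ep < 0)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) != 0) {
        close(ep);
        return -1;
    }

    struct epoll_event ready;
    int n = epoll_wait(ep, &ready, 1, timeout_ms);
    close(ep);
    return n < 0 ? -1 : n;   /* n is 0 (timeout) or 1 (one event) */
}
```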

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#395177: libc6: default library search path is inconsistent with gcc

2006-10-27 Thread Gabor Gombas
On Wed, Oct 25, 2006 at 02:37:04PM +0200, Vincent Lefevre wrote:

   5. Run it. You should get:
 
 GMP .  Library: 4.2.1  -  Header: 4.2.17

This means the soname of gmp 4.2.1 and 4.2.17 is the same (or you'd have
got an "error while loading shared libraries ..." message). But that
also means that adding /usr/local/lib to /etc/ld.so.conf would _NOT_
solve your problem since the loader would still find libgmp.so.X in
/usr/lib first.

If you want to have two different shared libraries sharing the same
soname on the same system, then there is no other solution than setting
LD_LIBRARY_PATH to point explicitly to the version you want to use.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#395177: libc6: default library search path is inconsistent with gcc

2006-10-27 Thread Gabor Gombas
On Fri, Oct 27, 2006 at 11:25:51AM +0200, Vincent Lefevre wrote:

 Not necessarily. The soname isn't defined in the header file, is it?
 (At compile time, it seems that the library was also 4.2.1, because
 I get the same problem when using -static, i.e., by not using shared
 libraries.)

Build-time linking has nothing to do with glibc. Besides, ld _does_
search /usr/local/lib before /usr/lib by default.

  But that also means that adding /usr/local/lib to /etc/ld.so.conf
  would _NOT_ solve your problem since the loader would still find
  libgmp.so.X in /usr/lib first.
 
 So, couldn't the dynamic loader take into account /usr/local/lib
 by default (before /usr/lib), just like cpp takes into account
 /usr/local/include by default (before /usr/include)?

That would be a security nightmare as /usr/local is often writable/owned
by users other than root (for example, looking at my etch chroot, it is
writable by group 'staff' by default). Btw., this is also one of the
many reasons why compiling software as root is generally considered
insecure and discouraged.

 If this is really the only possibility, it should probably be set
 in /etc/profile in the default configuration (at install time) and
 other shell init files.

That would be a security problem as well. Only the local system
administrator can decide whether things installed in /usr/local should
override system software or not.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Re: sysctl syscall removed in 2.6.18+ kernels

2006-08-16 Thread Gabor Gombas
On Tue, Aug 15, 2006 at 06:11:18PM +0200, Aurelien Jarno wrote:

 It already started to annoy some people having their log filled with
 such messages.

IMHO if the message is not rate-limited you should bug the kernel
developers (preferably upstream on linux-kernel). Printing the message
1-2 times per boot is OK but after that it is meaningless.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#381905: linux-kernel-headers: NGROUPS_MAX doesn't match the kernel and causes test failures

2006-08-08 Thread Gabor Gombas
On Mon, Aug 07, 2006 at 11:43:07AM -0600, Vladislav Yasevich wrote:

 It is important to specify proper limits for this.  This is a request to set
 NGROUPS_MAX to 65535, so it matches the kernel, in linux-kernel-headers
 package.

As already said, there is no way a fixed constant (NGROUPS_MAX) can
match a value that can change during run-time. LTP should not use
NGROUPS_MAX but use sysconf(_SC_NGROUPS_MAX) instead, which does The
Right Thing(tm).
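
A short sketch of what that looks like in a test program. The function names are illustrative only; the point is that the buffer for getgroups() is sized from the value reported at run time, not from the compile-time constant.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Limit on supplementary groups as reported by the running system. */
long runtime_ngroups_max(void)
{
    return sysconf(_SC_NGROUPS_MAX);
}

/* Allocate a buffer large enough for getgroups(). One extra slot is
 * reserved because the effective GID may be included in the result.
 * Stores the capacity in *capacity; returns NULL on failure. */
gid_t *alloc_group_buffer(long *capacity)
{
    assert(capacity != NULL);

    long max = runtime_ngroups_max();
    if (max < 0)
        return NULL;
    *capacity = max + 1;
    return malloc((size_t)*capacity * sizeof(gid_t));
}
```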

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#381905: linux-kernel-headers: NGROUPS_MAX doesn't match the kernel and causes test failures

2006-08-08 Thread Gabor Gombas
On Tue, Aug 08, 2006 at 12:58:42PM -0400, Vlad Yasevich wrote:

 Another possibility is to backport some of the changes from glibc 2.4 
 that uses the /proc interface for these sysconf calls to get the values 
 from the currently running kernel.  This would fix things once and for 
 all, but this might be too invasive for Sarge to accept.

No need to go to glibc 2.4. sysconf in glibc 2.3.6 in etch already uses
/proc according to strace.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -





Bug#378303: linux-kernel-headers: bitmap.h: 'BITS_PER_LONG' undeclared

2006-07-17 Thread Gabor Gombas
On Sat, Jul 15, 2006 at 08:42:56AM +0200, Julien Danjou wrote:

  In file included from /usr/include/linux/cpumask.h:86,
   from /usr/include/asm-i486/processor.h:23,
   from /usr/include/asm/processor.h:8,
   from /usr/include/asm-i486/atomic.h:6,
   from /usr/include/asm/atomic.h:8,
   from swapon02.c:87:

Anything that includes atomic.h from user space is fundamentally broken.
The definitions in atomic.h are _NOT_ atomic on many architectures when
used in user space (if they compile at all). Fix ltp.
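
As an illustration of a user-space alternative: with a current compiler, C11 <stdatomic.h> provides atomic operations without touching kernel headers. This is a substitute technique that did not exist when this message was written, and it is not necessarily how ltp was or should be fixed; the counter below is purely hypothetical.

```c
#include <stdatomic.h>

/* A counter that is safe to update from several threads. Objects with
 * static storage duration start out as zero. */
static atomic_long counter;

/* Atomically add one and return the new value. */
long counter_increment(void)
{
    return atomic_fetch_add(&counter, 1) + 1;
}

/* Atomically read the current value. */
long counter_read(void)
{
    return atomic_load(&counter);
}
```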

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -



Bug#376756: libc6: popen() should return NULL

2006-07-05 Thread Gabor Gombas
On Tue, Jul 04, 2006 at 05:08:19PM -0300, Jakson Aquino wrote:

 $ gcc testpopen.c
 $ ./a.out
 Before: F = (nil)
 After:  F = 0x501010
 sh: nothing: command not found

This means that popen() _did_ succeed (it has invoked the shell). The
fact that the shell could not interpret the command and therefore exited
with an error is not popen()'s business anymore.
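
If the application wants to know whether the command itself worked, it has to look at the status returned by pclose(). A minimal sketch, with a made-up helper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>

/* Run cmd through popen() and return the exit status of the shell.
 * Returns -1 if popen() or pclose() failed, or if the shell did not
 * exit normally. */
int shell_exit_status(const char *cmd)
{
    assert(cmd != NULL);

    FILE *f = popen(cmd, "r");
    if (f == NULL)
        return -1;              /* only here has popen() itself failed */

    char buf[256];
    while (fgets(buf, sizeof(buf), f) != NULL)
        ;                       /* drain the command's output */

    int status = pclose(f);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status); /* a POSIX shell uses 127 for "not found" */
}
```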

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Re: mktime libc6 bug (#196177) still in sarge

2006-06-13 Thread Gabor Gombas
On Tue, Jun 13, 2006 at 10:48:05AM +0200, Kecskemethy Zoltan wrote:

  Recently I upgraded my woody webserver to sarge. Now I have 
 2.3.2.ds1-22sarge3 version of libc6 package installed on my system.
 
 Our websites use the PHP mktime function, and it seems to give us back wrong
 data because of a glibc bug
 (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=196177)
 when we use a date before the year 1970. It seems
 to me our stable package still has this bug in it. :(

To me it seems the bug is in the application, not in glibc. time_t is
defined as seconds elapsed since 00:00:00 on January 1, 1970,
Coordinated Universal Time (UTC), so it by definition cannot represent
date values before 1970. SUSv3 says about mktime: "If the time since the
Epoch cannot be represented, the function shall return the value
(time_t)-1."
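
Whatever the library does with early dates, an application should at least check for that error value. A minimal sketch with a hypothetical helper is below. What mktime() returns for a date before 1970 depends on the C library in use, so no result is claimed for that case; note also that the value -1 is ambiguous where time_t is signed, since it can denote the second before the Epoch as well.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Convert local midnight of the given calendar date to time_t.
 * Returns 0 and stores the result in *out on success, or -1 if
 * mktime() reports that the time cannot be represented. */
int date_to_time(int year, int month, int day, time_t *out)
{
    assert(out != NULL);

    struct tm tm;
    memset(&tm, 0, sizeof(tm));
    tm.tm_year = year - 1900;   /* tm_year counts from 1900 */
    tm.tm_mon = month - 1;      /* tm_mon is 0-based */
    tm.tm_mday = day;
    tm.tm_isdst = -1;           /* let the library decide about DST */

    time_t t = mktime(&tm);
    if (t == (time_t)-1)
        return -1;
    *out = t;
    return 0;
}
```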

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Re: IP addresses sorted in reverse order

2006-06-02 Thread Gabor Gombas
On Thu, Jun 01, 2006 at 11:54:59AM +0200, Matus UHLAR - fantomas wrote:

 Nothing relies. It's just if you will receive addresses in some order, you
 should not reorder them unless you know what order they should be delivered
 in (e.g. ordering via RFC3484)

RFC3484 has _nothing_ to do with the NSS. If you are using the glibc
interfaces, then you should only consult the relevant standards (POSIX,
SUS, whatever). And those standards do not contain ordering constraints.

If you want to rely on the address ordering in the DNS reply, then you
should not use the generic NSS interface (like gethostbyname() or
getaddrinfo()) but you should use the resolver directly instead (see
resolv(3)).

 I'm afraid this is not applicable and also you probably did not understand
 me.
 
 If there are clients on network A and network B and servers on network A and
 network B, DNS server may sort replies to clients so client A would get
 address of server A first, server B next. Client B would get addresses of
 server B first, server A next.

I'm perfectly aware of that and it seems you are the one who does not
understand me. If you are already able to generate different DNS replies
for A and B, then the SRV records should look like:

When queried by client A:

_ssh._tcp.server.dom.ain.   SRV 0 0 22  server-A.dom.ain.
_ssh._tcp.server.dom.ain.   SRV 1 0 22  server-B.dom.ain.

When queried by client B:

_ssh._tcp.server.dom.ain.   SRV 1 0 22  server-A.dom.ain.
_ssh._tcp.server.dom.ain.   SRV 0 0 22  server-B.dom.ain.

(Note the difference in the Priority field). This does _exactly_ what
you want and is standard-compliant. You just have to modify your ssh
client to query the SRV records for _ssh._tcp when it wants to connect
to server.dom.ain (and likewise for any other services).
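
On the client side, honouring the Priority field comes down to trying the targets in ascending priority order. A sketch of just that ordering step is below; the struct and function names are made up, and the weighted random selection among records of equal priority described in RFC 2782 is deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>

/* The fields of one SRV record that matter to a client. */
struct srv_record {
    unsigned short priority;
    unsigned short weight;
    unsigned short port;
    const char *target;
};

/* qsort() comparison: lower priority value sorts first. */
static int by_priority(const void *a, const void *b)
{
    const struct srv_record *x = a;
    const struct srv_record *y = b;

    return (x->priority > y->priority) - (x->priority < y->priority);
}

/* Order the records so that the target to try first is at index 0. */
void srv_sort(struct srv_record *recs, size_t n)
{
    assert(recs != NULL);
    qsort(recs, n, sizeof(recs[0]), by_priority);
}
```

Fed the two replies shown above, client A ends up with server-A.dom.ain. first and client B with server-B.dom.ain. first, regardless of the order in which the records arrived.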

  Another option would be to play with routing instead of the DNS to
  direct your clients to the nearest server.
 
 Pardon? If one of servers is in our company's network and other is in
 different network, company, town, how do you imagine this?

Certainly you can only do this if you control the routing decisions of
the clients. But since you filed this bugreport I assume you have full
control of the clients, otherwise the bugreport makes no sense. And yes,
messing with the routing is always tricky, but on intranets it is
sometimes more efficient than DNS games.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Re: IP addresses sorted in reverse order

2006-06-01 Thread Gabor Gombas
On Wed, May 31, 2006 at 11:28:44PM +0200, Matus UHLAR - fantomas wrote:

 Searching mailinglists, bug databases did not give me correct answer.
 Does glibc sort/reorder IP addresses gotten from DNS?
 Is this fixed in any newer versions of glibc?

AFAIK there are no requirements about the order of addresses returned by
any NSS calls. In particular, the returned order does not need to match
the order in the underlying database (DNS in this case). Anything that
relies on the returned addresses having a specific order is just plainly
bogus.

If you want to access resources in a controlled order, then you should
choose a method that was designed for this purpose, like SRV records.
Not much software supports SRV records out of the box, so
it is quite likely that you have to modify the clients.

Another option would be to play with routing instead of the DNS to
direct your clients to the nearest server.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#367962: Please don't ship a /lib64 symlink in the package on amd64

2006-05-19 Thread Gabor Gombas
On Fri, May 19, 2006 at 10:58:33AM +0200, Goswin von Brederlow wrote:

 Local admins are already allowed to convert directories into links,
 e.g. to move parts of the directory tree to another disk.

According to Steve Langasek in
Message-ID: [EMAIL PROTECTED]
that's not allowed and you should use bind mounts instead.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#365233: libc6: memory leak in getprotobyname

2006-05-02 Thread Gabor Gombas
On Fri, Apr 28, 2006 at 06:51:54PM +0200, Slaven Rezic wrote:

 #include <netdb.h>
 int main(void) {
 struct protoent *pent;
 while(1) {
   pent = getprotobyname("tcp");
 }
 }

valgrind shows that the leaking function is fopen(), called from
getprotobyname_r(). It seems that getprotobyname_r() does not check if
it already has /etc/protocols open and always opens a new file
descriptor for it.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#365048: libc6 does not respect STATUS and ACTION options in nsswitch.conf

2006-04-28 Thread Gabor Gombas
On Thu, Apr 27, 2006 at 10:40:32AM -0400, Jesse W. Hathaway wrote:

[...]
struct passwd *pw = getpwnam(user);
if (pw == NULL)
   return 0;
 
if (getgrouplist(user, pw->pw_gid, NULL, &ng) < 0) {
   groups = (gid_t *) malloc(ng * sizeof (gid_t));
   getgrouplist(user, pw->pw_gid, groups, &ng);
}
[...]

 doing an strace on the above program when searching for a user in
 /etc/passwd shows ldap being searched, with or without [SUCCESS=return]
 in nsswitch.conf.

The above is not a good example.  Do LDAP lookups happen with a single
getpwnam() call _only_? If yes, then it is a bug, otherwise it's not.

getgrouplist() and initgroups() will _always_ enumerate all NSS group
data sources regardless of action statements. It may be unfortunate
sometimes due to the generated load, but this is how their semantics are
defined. The only solution is not to use LDAP for the group database at
all.
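
For anyone who wants to reproduce this under strace, here is a self-contained variant of the quoted fragment. It is only a sketch: the helper name is made up, and it starts with a small buffer and grows it once instead of passing NULL on the first call. The getpwnam() call is a lookup and may be affected by action statements; the getgrouplist() call is the enumeration and will consult every group source.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>

/* Fetch all groups of a user. On success returns the number of groups
 * and stores a malloc()ed array in *groups_out (the caller frees it).
 * Returns -1 if the user is unknown or on allocation failure. */
int user_groups(const char *user, gid_t **groups_out)
{
    assert(user != NULL && groups_out != NULL);

    struct passwd *pw = getpwnam(user);      /* lookup */
    if (pw == NULL)
        return -1;
    gid_t primary = pw->pw_gid;

    int ng = 16;
    gid_t *groups = malloc((size_t)ng * sizeof(*groups));
    if (groups == NULL)
        return -1;

    /* Enumeration: on -1 the buffer was too small and ng has been
     * updated to the number of groups actually needed. */
    if (getgrouplist(user, primary, groups, &ng) < 0) {
        gid_t *bigger = realloc(groups, (size_t)ng * sizeof(*groups));
        if (bigger == NULL) {
            free(groups);
            return -1;
        }
        groups = bigger;
        if (getgrouplist(user, primary, groups, &ng) < 0) {
            free(groups);
            return -1;
        }
    }
    *groups_out = groups;
    return ng;
}
```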

 Changing nsswitch to [UNAVAIL=return] disables ldap
 lookups for all requests even if the user is not in /etc/passwd.

Note that the UNAVAIL status refers only to the generic availability of
the service, it has nothing to do with the user being defined or not.

That said, "files [UNAVAIL=return] ldap" should not disable ldap (quite
the contrary, it should have basically no effect unless you delete
/etc/passwd etc.), so this may need further investigation.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#365048: libc6 does not respect STATUS and ACTION options in nsswitch.conf

2006-04-28 Thread Gabor Gombas
On Fri, Apr 28, 2006 at 08:03:39AM -0400, Jesse W. Hathaway wrote:

 Why is it defined that getgrouplist() and initgroups() _always_
 enumerate all NSS groups?

Just think about the simple case when a user defined in /etc/passwd is
also a member of a group that is only defined in LDAP. getgrouplist()
and initgroups() MUST support this.

Or another viewpoint: when enumerating entries, neither the SUCCESS
nor the NOTFOUND conditions occur until all backends are exhausted, so
[SUCCESS=return] or [NOTFOUND=return] has no effect on enumeration.

Btw. both the nsswitch.conf man page and the glibc documentation say:

"The second item in the specification gives the user much finer
control on the lookup process."

So they only mention the _lookup_ process (i.e. getXXbyYY()), they do
not say that action statements would have any effect on enumeration.

 This can cause problems for system daemons. For
 instance apache2 does an initgroups every time it spawns a thread, which
 results in my ldap servers being pounded when I have high load on my
 webservers. Nscd is a possible solution to the problem, but the version
 in stable does not cache initgroup requests, and the version in unstable
 invalidates them prematurely. Having the ability to not search other
 databases for local name service lookups seems like a valuable function.

That is a well-known scenario and the usual advice is "do not use LDAP
as the group NSS backend".

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#365048: libc6 does not respect STATUS and ACTION options in nsswitch.conf

2006-04-28 Thread Gabor Gombas
On Fri, Apr 28, 2006 at 10:51:38AM -0400, Jesse W. Hathaway wrote:

 I do understand why this feature is needed. However, the additional 
 feature of having the ability to disable this function is also needed.
 It is quite common to not have any of the users, used for system
 daemons, to be included in groups found in network directories. It seems
 needless to query network directories for system daemons such as apache.

Yes, in some cases such a feature would be useful, but that feature
currently does not exist.

 Enumeration is a lookup process, so I still think the man page is
 unclear, as to what effect the action statement will have in the group
 database option.

The documentation might be improved, but the documentation of SUCCESS
talks about "the wanted entry" and the documentation of NOTFOUND talks
about "the needed value", both terms having no meaning for enumeration.
Well, you can interpret those terms as "all possible entries"; either
way you get that SUCCESS and NOTFOUND action rules have no effect on
enumeration.

 Given that one of the main features of LDAP and NIS are consistent
 groups across all machines, I think it would be beneficial to support
 querying network directories selectively.

I think the reason this was not solved much earlier is that it is not a
problem for NIS/NIS+. They need far fewer resources than LDAP.
Enumerating over a couple thousand users using NIS+ was not a problem
when I last did it; doing the same with LDAP produces quite a
significant load.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -





Bug#363442: libc6-xen should not conflict with any other libc6-$flavor

2006-04-21 Thread Gabor Gombas
On Wed, Apr 19, 2006 at 01:02:59PM -0500, Adam Heath wrote:

 I'm ready to upload xen 3.0.2, with a dependency on libc6-xen.

IMHO just go ahead with the upload :-) The removal of the other
optimized flavors due to the conflict with libc6-xen should only cause
some performance regression when you boot a non-xen kernel, it should
not have any effect on usability.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#362966: Acknowledgement (nscd aborts)

2006-04-16 Thread Gabor Gombas
On Sun, Apr 16, 2006 at 12:31:41PM -0700, Richard A Nelson wrote:

 Since things are still going after the change, this probably shouldn't
 be that high a priority issue... it shouldn't abort - a syslog note
 would be much nicer :)

No message should be generated. Pointing multiple user names to the
same UID is perfectly legal.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Re: r1381 - in glibc-package/trunk/debian: . local/manpages local/usr_sbin

2006-04-10 Thread Gabor Gombas
On Mon, Apr 10, 2006 at 11:28:18AM +0200, Aurelien Jarno wrote:

 However, that does not fix #345907. Does anybody have an idea how to fix 
 that? I think it is a good idea to let the user update its timezone 
 manually, given the way we handle timezone in the stable release (though 
 it should now be easy with the tzdata package).

How about storing the md5sum of /etc/localtime and updating it only if
the checksum has not been changed (and warning the user otherwise)?
X.org does something similar for auto-updating xorg.conf if the user did
not change it manually.
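
A rough sketch of the idea in C. The maintainer scripts would more likely be shell calling md5sum; since libc has no MD5 function, this example substitutes a 64-bit FNV-1a hash, and all names in it are made up. The logic is the same: record a checksum when the package writes the file, and overwrite later only if the file still matches.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Compute a 64-bit FNV-1a hash of a file (a stand-in for md5sum).
 * Returns 0 on success and stores the hash in *out, -1 on error. */
int file_checksum(const char *path, uint64_t *out)
{
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV offset basis */
    int c;
    while ((c = fgetc(f)) != EOF) {
        h ^= (uint64_t)(unsigned char)c;
        h *= 0x100000001b3ULL;               /* FNV prime */
    }
    int failed = ferror(f);
    fclose(f);
    if (failed)
        return -1;
    *out = h;
    return 0;
}

/* Returns 1 if the file still matches the recorded checksum (safe to
 * replace), 0 if it was changed since (warn the user), -1 on error. */
int unchanged_since(const char *path, uint64_t recorded)
{
    uint64_t now;

    if (file_checksum(path, &now) != 0)
        return -1;
    return now == recorded;
}
```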

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#346342: libc6: REALLY annoying: destroys workaround all the time

2006-04-10 Thread Gabor Gombas
On Mon, Apr 10, 2006 at 07:42:41AM +0200, Aurelien Jarno wrote:

 Then on the bug itself, I will try to investigate that. The solution is
 not trivial, if you look at the tzconfig script, you may notice that the
 script uses a readlink on /etc/localtime. Replacing it by a plain file
 may have consequences that have to be investigated.

tzconfig from libc6-2.3.6-5 already checks if /etc/localtime is a
symlink or not and seems to handle both cases fine.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#227386: libc6-dev: ENOTSUP==EOPNOTSUPP, which violates SUSv3

2006-02-27 Thread Gabor Gombas
On Fri, Feb 24, 2006 at 08:46:23AM +, Brian M. Carlson wrote:

 Neither side is willing to lose and give in all the way.  I tried a
 compromise.  Apparently, that didn't work, so let me try another one:
 glibc could no longer claim compliance with SUSv3/POSIX 1003.1-2001 or
 SUSv2.  Then there will be no bug, and we can all be happy.

Well, it would certainly make sense to document the known deviations
from the various standards. Keeping the list up-to-date may be
problematic however, so I think this still should be coordinated with
upstream.

 * libfoo is compiled against glibc 2.3.6.
 * bar is compiled against libfoo and glibc 2.3.6.
 * A new version of bar comes out, and is compiled against glibc 2.3.7
 (which no longer has the bug in question, let's say).
 * Now, several things could happen:
   + bar passes the error code it gets from some libc function to libfoo,
 and libfoo tries to handle it, libfoo errors out and bar no longer
 works.

IMHO this is acceptable as this should not be common and can be handled
by using versioned dependencies on libfoo.

   + bar passes the error code it gets from some libc function to libfoo,
 which then tries to log the error by using strerror.  This will cause
 silent breakage.

IMHO the version of strerror() need not be incremented in this case, so
both bar and libfoo will use the same strerror() version.

 This solution will avoid the vast majority of problems, but it isn't
 perfect.  I am getting the impression that the others here want a
 perfect solution, and other than changing SONAMEs (which no one wants
 to do), it can't be done, AFAICT.  If someone can find a solution which
 works in all cases, please let me know.

I don't think a perfect solution is needed. You only need to keep the
benefits/expected breakage ratio acceptable, and since fixing the errno
values provides only a really small benefit, the expected breakage
must be quite low as well.

 I found the change in question, where Mr. Drepper claims that Linus
 rejected his patch to create an ENOTSUP, and so he defined ENOTSUP to
 EOPNOTSUPP.  But I cannot find the patch that Linus rejected.

Nor did I. It is well known that Ulrich Drepper and Linus Torvalds do
not get along well. In this case I think Ulrich was wrong, but he did
not want to acknowledge it. Sure, it is convenient if the kernel returns
the same error code as glibc wants, but nothing stops glibc from
remapping the error code if this is not the case. After all, the
standards define the behaviour of the _library_ functions, not the
kernel-glibc interface.

 I also
 find his claim in the glibc bug I opened that it is part of the ABI
 unconvincing, especially now that I know it was he who made the change,
 and therefore part of the ABI.
 
 Also, I have no idea why the two errors were made the same, when item
 number 2 in the PROJECTS file is:
 
 [ 2] Test compliance with standards.  If you have access to recent
  standards (IEEE, ISO, ANSI, X/Open, ...) and/or test suites you
  could do some checks as the goal is to be compliant with all
  standards if they do not contradict each other.
 
 This has apparently been in that file since it was checked in, 9 years
 and 8 months ago.

Yep, Ulrich Drepper seems to be a man hard to deal with. But it is true
that ENOTSUP == EOPNOTSUPP is part of the _current_ ABI, and changing
that would be very unwise if it is not accepted upstream.

 No, I can be sure they have separate values now, because glibc defines
 _POSIX_VERSION to 200112L.  That indicates *complete* and *total*
 support of the base portions of POSIX 1003.1-2001.  Such portions
 include the two error code in question.

No, that at most indicates _intent_ to support the standard. The
standard says that compliant systems should set that symbol to that
value; it does not (and can not) say that non-compliant systems should
not set it. Complete and total support would mean being officially
certified, but that's not the case.

And that's exactly the reason why real-world applications use solutions
like autoconf instead of depending on feature macros. It's not just
Linux, commercial vendors also have their fair share of standard
non-compliance bugs from time to time.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -



Bug#227386: libc6-dev: ENOTSUP==EOPNOTSUPP, which violates SUSv3

2006-02-23 Thread Gabor Gombas
On Thu, Feb 23, 2006 at 04:30:55AM +, Brian M. Carlson wrote:

  By introducing a new define, you are breaking standard compliance.
 
 Well, there is no better way.  You want to preserve binary compatibility
 at the expense of all else.  I want to preserve standards compliance at
 the expense of all else.  I am trying to offer a compromise.

You cannot preserve standards compliance by breaking standards
compliance, so that's out. What remains is preserving binary compatibility,
and that can be achieved by doing nothing.

 Actually, no it won't.  It will continue to return the wrong value
 (EOPNOTSUPP) that existing code returns.  At some point, one might want
 to fix that with new @GLIBC_2.3.7 symbols, but I'm not going to
 implement that right now.  Also, see the paragraph above.

Oh, so you _do_ know how to fix it properly:

- Make ENOTSUP and EOPNOTSUPP have different values in the header
- Ensure that the implementations with the current symbol versions
  continue to return the old value to preserve binary compatibility
- Create a new version for every affected function that does the desired
  error-code remapping

So do it and propose a patch to upstream (or hire someone to do it for
you). Handwaving and posting completely broken patches will not help.
(Oh, and be prepared for Ulrich Drepper rejecting this exact change, as
he already did in 1999).

 If and when that happens, my code will be broken, and I will be happy to
 fix it.  Expecting that I act as if something will happen, when I cannot
 be certain it will, is silly.

By the same argument, expecting that ENOTSUP and EOPNOTSUPP will have
different values in the future, when you cannot be certain they will
(i.e. until you have written a patch that got accepted upstream, and you
have forced every Linux user to upgrade), is silly. So you should fix
your code _now_, and remove the extra handling of ENOTSUP/EOPNOTSUPP
if/when they get separate values.
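
A minimal sketch of such "extra handling", assuming the goal is a switch
statement that compiles whether or not the two constants share a value.
The helper name is_not_supported() is hypothetical and not taken from the
code discussed in this bug:

```c
#include <errno.h>

/* Hypothetical helper: returns 1 if err means "not supported", else 0.
 * The second case label is compiled only when the two constants differ,
 * so the switch builds both where they alias (as on Linux with glibc)
 * and where they are distinct. */
static int is_not_supported(int err)
{
    switch (err) {
    case ENOTSUP:
#if defined(EOPNOTSUPP) && EOPNOTSUPP != ENOTSUP
    case EOPNOTSUPP:
#endif
        return 1;
    default:
        return 0;
    }
}
```

If the values are ever separated, the preprocessor guard simply becomes
true and no source change is needed.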

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -



Bug#227386: libc6-dev: ENOTSUP==EOPNOTSUPP, which violates SUSv3

2006-02-21 Thread Gabor Gombas
On Tue, Feb 21, 2006 at 06:23:21AM +, Brian M. Carlson wrote:

  (void *)strerror(error_code);
 
 Not thread safe.

Then use strerror_r().

 Also, this code does not check that it is a *POSIX*
 error code. If you check the Linux sources, you can see that there are
 many error codes (mostly for NFS) that are not standard, and therefore
 are invalid for my program.

Let's check. EBADHANDLE is 521 (fwiw, it is not #defined unless
__KERNEL__ is defined, which should not be the case for user-space
programs):

/* We want the SUSv3 version of strerror_r(), not the GNU one */
#define _XOPEN_SOURCE 600
#include <errno.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    /* Valid error code, but the buffer is too small */
    errno = 0;
    strerror_r(EINVAL, NULL, 0);
    printf("%d %s\n", errno, strerror(errno));

    /* 521 (EBADHANDLE) is not a user-space error code */
    errno = 0;
    strerror_r(521, NULL, 0);
    printf("%d %s\n", errno, strerror(errno));
    return 0;
}

This gives:

34 Numerical result out of range
22 Invalid argument

 I have a patch
 forthcoming which mitigates the damage and allows people that really
 want standards compliance to indicate so.

SUSv3 clearly defines how an application should indicate that it
desires standard compliance. By introducing a different schema you are
actually _breaking_ standard compliance here.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#227386: libc6-dev: ENOTSUP==EOPNOTSUPP, which violates SUSv3

2006-02-20 Thread Gabor Gombas
On Sun, Feb 19, 2006 at 08:25:34AM +, Brian M. Carlson wrote:

 Anyway, my problem is that the fact that these two errors are
 the same is causing my code to break very badly.  I have a
 library that contains its own error codes that will be negative
 if casted to an int.  Additionally, I want to support the use
 of the standard errno.h error codes.  To make my life easier,
 I am using a script to generate a list of valid error codes:
 the POSIX ones, as well as mine.  The code generated by the
 script uses a switch statement to check whether a code is
 valid.  But because two case statements cannot have the same
 value, I get compiler errors.  I have logic to check for
 EAGAIN and EWOULDBLOCK, and only use one if both are the same;
 I'd prefer not to have to do this for other pairs.

That seems overly complex. You should most certainly know the range of
your own error codes, so something like the below looks much simpler (no
script needed, no dependence on the value of standard error codes):

if (error_code >= MY_ERROR_MIN && error_code <= MY_ERROR_MAX)
{
    /* handle your own error codes here */
}
else
{
    errno = 0;
    (void *)strerror(error_code);
    if (!errno)
    {
        /* error_code was known to libc */
    }
    else
    {
        /* unknown, probably invalid error_code */
    }
}

After a quick check, at least Darwin 6.8 and FreeBSD 5.1 also use the
same value for ENOTSUP and EOPNOTSUPP, so if you want to be portable,
you cannot assume that these values are distinct.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#353031: posix_fadvise defines missing

2006-02-16 Thread Gabor Gombas
On Thu, Feb 16, 2006 at 12:24:27AM -0500, Greg Stark wrote:

 If this is intentional (which seems unlikely, why should I have to define
 these things just to get a standard libc function?) then it's at the very
 least a documentation bug. The man page clearly indicates that only #include
 <fcntl.h> is required.

Well, the man page says:

CONFORMING TO
SUSv3 (Advanced Realtime Option), POSIX 1003.1-2003. [...]

And SUSv3 says:

A POSIX-conforming application should ensure that the feature test macro
_POSIX_C_SOURCE is defined before inclusion of any header.

SUSv3 specifies that the value of _POSIX_C_SOURCE should be 200112L,
while the glibc documentation specifies that unless you explicitly
define a feature macro, the default value of _POSIX_C_SOURCE will be 2
(and thus posix_fadvise() will not be visible).

So it is documented behavior. And no, posix_fadvise() is not just a
"standard libc function"; the only sane default for "standard" is what is
described in ISO C99, and that does not contain posix_fadvise().
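
As an illustration, a minimal sketch of requesting the SUSv3 interfaces
before including any header. The helper name advise_sequential() is
hypothetical; only posix_fadvise() and POSIX_FADV_SEQUENTIAL come from
the standard:

```c
/* Request the SUSv3 interfaces before any header is included. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <fcntl.h>

/* Hypothetical helper: hint that fd will be read sequentially.
 * posix_fadvise() returns 0 on success or an error number directly;
 * it does not report failure through errno. */
static int advise_sequential(int fd)
{
    return posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}
```

Passing an invalid descriptor such as -1 should yield a non-zero error
number rather than a crash.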

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Re: Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf

2005-12-23 Thread Gabor Gombas
On Fri, Dec 23, 2005 at 01:21:54PM -0800, Edward Buck wrote:

 The correct query order for mx1.hotmail.com (containing 2 dots) should be:
 
 1. mx1.hotmail.com. - AAAA
 2. mx1.hotmail.com. - A
 3. mx1.hotmail.com.domain1.com. - AAAA
 4. mx1.hotmail.com.domain1.com. - A
 5. mx1.hotmail.com.domain2.com. - AAAA
 6. mx1.hotmail.com.domain2.com. - A
 
 If step 1 or 2 returns a host address, step 3 and later are skipped.
 
 The Debian (or glibc) query order is:
 
 1. mx1.hotmail.com. - AAAA
 2. mx1.hotmail.com.domain1.com. - AAAA
 3. mx1.hotmail.com.domain2.com. - AAAA
 4. mx1.hotmail.com. - A
 5. mx1.hotmail.com.domain1.com. - A
 6. mx1.hotmail.com.domain2.com. - A
 
 With Debian's query order, mx1.hotmail.com exists as an A record yet the 
 system doesn't check until it has already done 3 queries, 2 of which do 
 not qualify as an 'initial absolute query'.

Ok, let's clarify some things here. resolv.conf(5) describes the
behaviour of a _single_ resolver query. If you look at
resolv/nss_dns/dns-host.c in the glibc source, you'll see that
gethostbyname(3) is implemented as _two_ distinct resolver invocations.
Since it is nowhere specified how many resolver invocations
gethostbyname(3) should cause, the glibc behaviour is correct and will
result in the second list of DNS queries you specified.

If you want to avoid the extra query, you should use getaddrinfo(3) or
the GNU-specific gethostbyname2(3) and explicitly specify the address
family you are interested in.
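
A minimal sketch of the getaddrinfo(3) variant, assuming only IPv4
addresses are wanted. The helper name lookup_ipv4() is hypothetical:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <string.h>

/* Hypothetical helper: store the first IPv4 address of host in out.
 * Returns 0 on success, otherwise a getaddrinfo() error code. */
static int lookup_ipv4(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints, *res = NULL;
    const struct sockaddr_in *sin;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* ask for IPv4 addresses only */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one result per socket type */

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0)
        return rc;

    sin = (const struct sockaddr_in *)res->ai_addr;
    if (inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)outlen) == NULL)
        rc = EAI_SYSTEM;
    freeaddrinfo(res);
    return rc;
}
```

With ai_family set to AF_INET, the resolver has no reason to look for
IPv6 addresses, so the search list is walked once instead of twice.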
 
 The bug is not just limited to those who use the search line.  If your 
 resolv.conf contains 'domain ...', e.g.
 
 domain example.com
 nameserver x.x.x.x
 nameserver y.y.y.y
 
 Then a query of mx1.hotmail.com will ALWAYS yield:
 
 1. mx1.hotmail.com. - AAAA
 2. mx1.hotmail.com.example.com. - AAAA (extraneous)
 3. mx1.hotmail.com. - A

This is the same as before, as by default the search list is initialized
to contain the local domain if there are no explicit search lines.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -





Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf

2005-12-22 Thread Gabor Gombas
On Wed, Dec 21, 2005 at 06:08:04PM +0100, Marco d'Itri wrote:

 I find this reasoning very peculiar. If an algorithm is inefficient and
 this causes problems then it is obviously buggy.

An algorithm is buggy if it does not match the specification. I see no
description about the lookup order wrt. multiple protocols, so the
behaviour is conformant to the documentation.

Also note that resolv.conf(5) explicitly says that using search lines
"may be slow and will generate a lot of network traffic if the servers
for the listed domains are not local", and that "queries will time out
if no server is available for one of the domains".

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -



Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf

2005-12-22 Thread Gabor Gombas
On Wed, Dec 21, 2005 at 10:42:03AM -0800, Edward Buck wrote:

 On the first point, I (and thus my company) use search lines in
 combination with LAN-only DNS subdomains for internal address
 management.  It allows us to use internal IP addresses for hosts without
 fiddling with /etc/hosts.  All our host subdomains are managed in DNS.
 A LOT of scripts, i.e. for backup, rsync, load balancing, use short
 hostnames that get their address information from internal DNS zones, a
 process that depends on the search functionality in /etc/resolv.conf.

My personal opinion is that this is wrong, and now you are trying to
paper over an initial design flaw. Had you had a policy of always using
full host names everywhere, you would not have this problem now. In my
experience relying on lookup service configuration is never good.

 To give you an idea of impact, I was recently greeted with an e-mail
 from a DNS service provider that I use saying that I was getting close
 to my query quota.  It surprised me that I got this e-mail because I was
 never close to hitting the quota before.  It turns out that 90% of the
 queries were coming from 1 server where I unwittingly added the domain
 to the search path!

Well, resolv.conf(5) says about search lines that they "will generate a
lot of network traffic if the servers for the listed domains are not
local". You should not add a search line for a domain not served by
a local name server. In most cases this can be solved by installing a
local caching-only name server.

 On the subject of work-arounds, I'm not having much luck finding one
 without recompiling glibc, which is not a good option IMO.  If anyone
 has any ideas on this, please let me know.

Did you try "apt-get install bind9" and putting "nameserver 127.0.0.1"
in /etc/resolv.conf? You can also try lwresd and libnss-lwres if you need
something smaller, or djbdns if you like its author :-)

This may reduce your DNS traffic even more than changing the lookup
order in glibc would. Of course you have to pay with some memory and a
little CPU usage.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -





Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf

2005-12-22 Thread Gabor Gombas
On Fri, Dec 23, 2005 at 01:31:16AM +0100, Marco d'Itri wrote:

 Yet another very peculiar definition from you.

Well, that's the first thing taught in programming theory here. If the
algorithm matches the specification, then it is correct. If the
specification does not cover something, then the algorithm is free to
choose whatever behaviour it prefers.

Of course, a correct algorithm is not necessarily the best. Bubblesort
is correct since the traditional sorting specification does not put
constraints on the number of comparisons, but there is a reason people
prefer to use qsort when there are more than a handful of elements :-)

 Which is true, but does not mean that the current behaviour is correct
 and should not be changed.

Then convince upstream to change the current behaviour. Or write a
patch, and convince the Debian glibc team that Debian should diverge
from upstream behaviour.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -



Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf

2005-12-20 Thread Gabor Gombas
On Tue, Dec 20, 2005 at 12:41:05AM +, Stephen Gran wrote:

 I guess the answer to this problem for you is to just disable ipv6
 (unless you need it) - blacklisting the kernel module(s) ought to do it,
 although there may be some other parts I am unaware of.

I doubt disabling IPv6 in the kernel would make any difference since
querying for AAAA records does not require an IPv6 socket. You will only
find out if IPv6 is disabled if you do have an AAAA record and you try
to use the address.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -



Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf

2005-12-15 Thread Gabor Gombas
On Wed, Dec 14, 2005 at 11:41:38AM -0800, Edward Buck wrote:

 If it's a frequently used feature, it wasn't available until sarge.
 Woody did not behave this way (I checked).

Huh?

$ cat /etc/debian_version
3.0
$ cat /etc/resolv.conf
search hpcc.sztaki.hu lpds.sztaki.hu sztaki.hu
nameserver 127.0.0.1
$ ping rs2.lvs
PING rs2.lvs.sztaki.hu (193.6.200.132): 56 data bytes
...

It is definitely available in Woody. I'm using it regularly.

 Also, this new feature
 completely breaks software that doesn't expect this feature, since
 postfix, telnet and others are doing WAY more DNS queries than they
 should.  Depending on how many search domains are listed and how many
 caching nameservers are listed, in my case (2 search domains and 2
 nameservers) I count at least 4 unnecessary queries.  That's very bad.

Well, why do you have any search domains then? It is for human
convenience only, and a mail server usually does not have regular user
accounts so no need for such convenience features.

 Sure, I can do that with telnet interactively.  How do I tell postfix to
 do that without a patch?  I guess I could try setting the ndots option
 to postfix's environment but that seems like a bad hack.  The current
 behavior makes using the search lines impossible for busy servers,
 especially mail servers that do DNS queries for every piece of mail.
 Just imagine the excess DNS load on a server processing a million e-mail
 messages a day.  That's what I'm seeing.

For such setups I suggest running some local DNS-caching solution (nscd
or a local caching-only name server).

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -





Bug#343140: libc6: resolver always checks search list in /etc/resolv.conf

2005-12-14 Thread Gabor Gombas
On Mon, Dec 12, 2005 at 09:13:13PM -0800, Edward Buck wrote:

 In a nutshell, when using 'search' lines in /etc/resolv.conf, the
 resolver always appends listed search domains to a hostname lookup even
 when the host being searched is fully-qualified (contains one or more dots).

No, a host name containing a dot is _not_ a FQDN. A host name _ending_
with a dot is a FQDN.

Using host.subdomain while search is set to some.domain to access
host.subdomain.some.domain is a common and frequently used feature.

 This results in a LOT of needless DNS traffic.  On a busy mail server,
 it makes using the 'search' lines extremely expensive (because DNS traffic
 increases exponentially).

 Here's an strace of 'telnet mx1.hotmail.com 25'.  Oddly, it seems to do
 the right thing initially but the fully-qualified lookup must always
 fail, resulting in subsequent lookups using the search list.

Then use a _real_ FQDN and try 'telnet mx1.hotmail.com. 25' (note the
terminating dot).
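
The distinction can be sketched in a few lines, assuming the rule
documented in resolv.conf(5): a name ending in a dot is absolute, and a
name with at least "ndots" dots (default 1) is tried as-is before the
search list is applied. All three function names are hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* A name is absolute (a real FQDN) only if it ends with a dot. */
static int is_absolute_name(const char *name)
{
    size_t len = strlen(name);
    return len > 0 && name[len - 1] == '.';
}

/* Count the dots in a name, for comparison with the ndots option. */
static int count_dots(const char *name)
{
    int dots = 0;
    for (; *name != '\0'; name++)
        if (*name == '.')
            dots++;
    return dots;
}

/* Returns 1 if the name is expected to be queried as-is before any
 * search domain is appended, 0 if the search list comes first. */
static int tried_as_is_first(const char *name, int ndots)
{
    return is_absolute_name(name) || count_dots(name) >= ndots;
}
```

Note that being tried first does not mean the search list is skipped:
only a name ending in a dot avoids it when the initial lookup fails.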

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#331405: Accidential activation of nscd is too simple

2005-10-05 Thread Gabor Gombas
On Mon, Oct 03, 2005 at 12:33:45PM +0200, Martin Samuelsson wrote:

 Obviously something has automatically dragged nscd into my system as one
 of it's dependencies. (It's marked A in aptitude) And having a software
 cacheing dns lookups from disconnected moments doesn't really make a
 laptop very useable when being connected.

What is that "something"? According to the output of "apt-cache rdepends
nscd", libnss-pgsql1 and libnss-mdns Suggest nscd, and libnss-ldap
Recommends it, but nothing Depends on it. So you should have been given a
choice by whatever package installation frontend you used.

 One could say that I should have better knowledge of exactly what
 software that is on my system, and how it is configured. However I've
 always found the debian way to be having software installed with
 reasonable defaults. Which I don't think this behaviour is, considering
 it simple to get installed without realizing it.

Well, the description of nscd says: "You should install this package
only if you use slow Services like LDAP, NIS or NIS+." If you are not
using one of these services, why did you choose to install nscd? (I
don't dare to assume that you haven't even read the package description
before letting an unknown and unrequested package be installed on your
system...)

Also, the default negative-ttl for the hosts map is just 20 seconds
which I think _is_ a quite reasonable default.

 My suggestion would be that nscd was configured by default to not start
 or to never cache any data until explicitly told so by a simple, but
 active act from the system administrator.

Why? You said "I've always found the debian way to be having software
installed with reasonable defaults". The only reasonable default for a
program called "Name Service Caching Daemon" is to cache name service
calls when installed. Otherwise why did you install it at all?

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#330929: how can adduser reliably find out wheter nscd is running?

2005-10-04 Thread Gabor Gombas
On Mon, Oct 03, 2005 at 06:07:26PM +0200, Marc Haber wrote:

 That's how we're going to do it in the future. I would, however, like
 to have the fact documented that it is not an error to run nscd -i if
 no daemon is running.

What do you mean by "not an error"? "nscd -i" of course will return a
non-0 exit code since it cannot invalidate the cache if the daemon is
not running, but that is OK for adduser. I think you simply should not
care if "nscd -i" succeeds or fails.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#330929: how can adduser reliably find out wheter nscd is running?

2005-10-04 Thread Gabor Gombas
On Tue, Oct 04, 2005 at 02:18:18PM +0200, Marc Haber wrote:

 I am concerned about nscd suddenly giving an actual error message
 instead of silently returning non-zero, which might confuse the users.

Well, my personal preference would be not to worry about this until it
becomes a real issue (that future error message may be descriptive
enough, after all). Or you can just add "2>/dev/null" to the "nscd -i"
invocation.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -





Bug#330929: how can adduser reliably find out wheter nscd is running?

2005-10-04 Thread Gabor Gombas
On Tue, Oct 04, 2005 at 02:41:26PM +0200, Marc Haber wrote:

 Adduser has _a lot_ of installations and is mainly used by maintainer
 scripts. Thus, _a lot_ of people are bound to see error messages
 generated by adduser and are bound to be confused by them.
 
  Or you can just add "2>/dev/null" to the "nscd -i"
  invocation.
 
 That will also kill real error messages.

Are there any useful real error messages now? For example, if nscd is
not running, trying to invalidate a non-existing map will not produce an
error message, and if nscd is running, the error message is misleading.

I still prefer not fixing what is not broken. glibc upgrades are always
complex and require lots of testing. So I think even if nscd behavior is
changed upstream in the future, there will be enough time before the
next Debian release to discover any problems and update adduser if
necessary.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -





Re: libc 2.3.5 and heap corruption checking?

2005-09-08 Thread Gabor Gombas
On Wed, Sep 07, 2005 at 09:29:04AM -0600, Ben Pearre wrote:

 *** glibc detected *** malloc(): memory corruption: 0x083ba0e8 ***
 zsh: abort (core dumped)  matlab -nosplash -nojvm

[...]

 (143)% export MALLOC_CHECK_=1
 (0)% matlab
[...]
  [1;2]*[3 4]
 
 
Segmentation violation detected at Wed Sep  7 00:55:16 2005
 
[...]

Well, setting MALLOC_CHECK_ prevents glibc from calling abort() when it
detects heap corruption but it will _not_ fix the bug in Matlab.

You can do some more investigation though:

- Run Matlab on a machine with the old glibc, but set MALLOC_CHECK_ to
  2. If that also crashes, then the bug is clearly in Matlab and you
  should go complain to your Matlab vendor.

- Run Matlab under valgrind, that will show you more information about
  heap allocation (it can detect more problems than glibc's internal
  checker and can also give much better location information if you have
  debugging symbols).

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#324900: nscd: umount /var fails (unclean shutdowns)

2005-09-01 Thread Gabor Gombas
On Thu, Sep 01, 2005 at 02:04:25PM +0900, GOTO Masanori wrote:

  - bash unconditionally does some NSS calls during startup (getpwuid
etc.); this in turn
  - loads all NSS modules that serve passwd maps - if a module uses
libraries from /usr, now you have a live memory mapping under /usr so
you cannot unmount it during shutdown
 
 Why is this unconditionally happenned?  What setting does this cause
 this problem?

Because bash calls getpwuid() to initialize the value of $SHELL before
executing the script.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -





Bug#324900: nscd: umount /var fails (unclean shutdowns)

2005-08-31 Thread Gabor Gombas
On Wed, Aug 31, 2005 at 01:36:24PM +0900, GOTO Masanori wrote:

  root  1119 1  0 12:47 ?00:00:00 /bin/sh /etc/init.d/rc 6
[...]
  rc1119 root  mem   REG8,9  217016   228931 
  /var/db/nscd/passwd
 
 It's very weird behavior.  Please disable nscd when your boot up time,
 and then run /etc/init.d/nscd.  You can see which processes have such
 nscd file descriptor (fd), and you can clear process inheritance with
 pstree easily.  If nothing has nscd fd, we can clear rc behaves oddly.

Well, if /bin/sh is bash, then it is not weird at all, it is the same
bash vs. NSS problem that came up several times in the past (last time
quite recently on debian-devel). Previously it only happened with NSS
modules that link to libraries under /usr, now it also affects nscd.

What I _suspect_ is happening here:

- by calling /etc/init.d/rc, bash is executed
- bash unconditionally does some NSS calls during startup (getpwuid
  etc.); this in turn
- loads all NSS modules that serve passwd maps - if a module uses
  libraries from /usr, now you have a live memory mapping under /usr so
  you cannot unmount it during shutdown
- bash (libc) connects to nscd
- nscd sends a file descriptor of /var/db/nscd/passwd to bash, and bash
  does an mmap(2) on the received fd - now you have a live memory
  mapping under /var thus you cannot unmount it during shutdown
- /etc/init.d/rc eventually kills nscd but that does not help, since the
  bash process executing /etc/init.d/rc still has the cache file mapped
  (deleting the cache file also doesn't help since unlink(2) only
  operates on the directory and does not invalidate the memory mapping)
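
The last point can be demonstrated with a small sketch. The function name
mapping_survives_unlink() and the temporary file path are hypothetical,
and this is not nscd's actual code; it only shows that a memory mapping
stays valid after close(2) and unlink(2):

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600
#endif
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a shared mapping is still readable after the file was
 * closed and unlinked, 0 if not, -1 if the setup failed. */
static int mapping_survives_unlink(void)
{
    char path[] = "/tmp/mapdemoXXXXXX";   /* hypothetical location */
    static const char msg[] = "cached";
    char *map;
    int fd, ok;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, msg, sizeof msg) != (ssize_t)sizeof msg) {
        close(fd);
        unlink(path);
        return -1;
    }

    map = mmap(NULL, sizeof msg, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping keeps its own reference */
    if (map == MAP_FAILED) {
        unlink(path);
        return -1;
    }

    if (unlink(path) != 0) {    /* removes the directory entry only */
        munmap(map, sizeof msg);
        return -1;
    }

    ok = memcmp(map, msg, sizeof msg) == 0;
    munmap(map, sizeof msg);    /* only now is the file really released */
    return ok;
}
```

Until munmap() (or process exit), the file system holding the file stays
busy, which matches the failed unmount of /var described above.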

Options you have to avoid the problem:

1. Stop using bash as /bin/sh
2. Stop using nscd
3. Convert /etc/init.d/rc to be an ELF executable instead of a shell
   script
4. Let /var (and /usr if you are running more exotic NSS modules) be the
   same filesystem as /
5. Redesign the init system so unmounting of local file systems is done
   _after_ /etc/init.d/rc has finished (and the program that does the
   unmounting must not be a shell script)

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -





Bug#304604: locales: Upgrading fails

2005-04-17 Thread Gabor Gombas
On Sat, Apr 16, 2005 at 06:18:02PM +0900, GOTO Masanori wrote:

 Could you investigate perl with glibc 2.3.2.ds1-20 + MALLOC_CHECK_=0
 to 3?  If perl has double free bug, such bug should be also appeared
 on sarge/sid environment.

With glibc 2.3.2.ds1-21, "MALLOC_CHECK_=3 dpkg-reconfigure dash" also
dies:

[...]
malloc: using debugging hooks
free(): invalid pointer 0x88d9d20!
Aborted

Gabor

-- 
Gabor Gombas   CERN IT-GM-DM
[EMAIL PROTECTED]   Office: 31/S-016





Bug#304604: locales: Upgrading fails

2005-04-15 Thread Gabor Gombas
On Fri, Apr 15, 2005 at 09:37:09AM +0900, GOTO Masanori wrote:

 Before we reassign to perl, we need to clear which setting triggers
 your problem.  At least under my environment, this problem is not
 appeared.  Could you confirm your environment settings (ex: LANG or
 malloc related value), and perl versions?

LANG=en_US.UTF-8
LANGUAGE=en_US:en_GB:en
LC_CTYPE=hu_HU.UTF-8

ii  perl   5.8.4-8Larry Wall's Practical Extraction and Report

FYI:

Last few lines from strace dpkg-reconfigure dash:

28191 fsync(4)  = 0
28191 stat64("/var/cache/debconf/config.dat", {st_mode=S_IFREG|0644, st_size=74096, ...}) = 0
28191 rename("/var/cache/debconf/config.dat", "/var/cache/debconf/config.dat-old") = 0
28191 rename("/var/cache/debconf/config.dat-new", "/var/cache/debconf/config.dat") = 0
28191 close(5)  = 0
28191 close(4)  = 0
28191 close(6)  = 0
28191 close(8)  = 0
28191 close(3)  = 0
28191 munmap(0xb7f57000, 4096)  = 0
28191 munmap(0xb7f56000, 4096)  = 0
28191 close(7)  = 0
28191 open("/dev/tty", O_RDWR|O_NONBLOCK|O_NOCTTY) = 3
28191 writev(3, [{"*** glibc detected *** ", 23}, {"double free or corruption (!prev"..., 33}, {": 0x", 4}, {"08785b58", 8}, {" ***\n", 5}], 5) = 73
28191 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
28191 tgkill(28191, 28191, SIGABRT) = 0
28191 --- SIGABRT (Aborted) @ 0 (0) ---
28191 +++ killed by SIGABRT +++


Compare that with the last lines from "MALLOC_CHECK_=0 strace dpkg-reconfigure dash":

28157 fsync(4)  = 0
28157 stat64("/var/cache/debconf/config.dat", {st_mode=S_IFREG|0644, st_size=74096, ...}) = 0
28157 rename("/var/cache/debconf/config.dat", "/var/cache/debconf/config.dat-old") = 0
28157 rename("/var/cache/debconf/config.dat-new", "/var/cache/debconf/config.dat") = 0
28157 close(5)  = 0
28157 close(4)  = 0
28157 close(6)  = 0
28157 close(8)  = 0
28157 close(3)  = 0
28157 munmap(0xb7f57000, 4096)  = 0
28157 munmap(0xb7f56000, 4096)  = 0
28157 close(7)  = 0
28157 exit_group(0) = ?

So perl seems to be dying in exit().
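
The comparison behind that conclusion can be mechanised. Below is a small Python sketch, assuming each trace is a list of lines with a leading PID as in the output above; the function name first_divergence is invented for this example. It only works when addresses and file descriptors happen to match between runs, as they do here.

```python
import re
from itertools import zip_longest

# Each traced line starts with the PID, which differs between runs.
_PID_PREFIX = re.compile(r"^\d+\s+")


def first_divergence(trace_a, trace_b):
    """Return (index, line_a, line_b) for the first differing line, or None.

    PIDs are stripped before comparing. If one trace is a strict prefix of
    the other, the missing side is reported as None.
    """
    stripped_a = [_PID_PREFIX.sub("", line.rstrip()) for line in trace_a]
    stripped_b = [_PID_PREFIX.sub("", line.rstrip()) for line in trace_b]
    for index, (a, b) in enumerate(zip_longest(stripped_a, stripped_b)):
        if a != b:
            return index, a, b
    return None
```

Fed the two traces above, it would point at the line following close(7): the open of /dev/tty in the aborted run versus exit_group(0) in the clean one.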

Gabor

-- 
Gabor Gombas   CERN IT-GM-DM
[EMAIL PROTECTED]   Office: 31/S-016





Bug#304604: locales: Upgrading fails

2005-04-14 Thread Gabor Gombas
Package: locales
Version: 2.3.4-3
Severity: important
Tags: experimental


Hi,

Trying to upgrade to 2.3.4-3 gives the following error:

Setting up locales (2.3.4-3) ...
Generating locales...
  en_US.ISO-8859-1... done
  en_US.ISO-8859-15... done
  en_US.UTF-8... done
  hu_HU.ISO-8859-2... done
  hu_HU.UTF-8... done
Generation complete.
*** glibc detected *** double free or corruption (!prev): 0x08781408 ***
dpkg: error processing locales (--configure):
 subprocess post-installation script killed by signal (Aborted)
Errors were encountered while processing:
 locales
E: Sub-process /usr/bin/dpkg returned an error code (1)
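
The "killed by signal (Aborted)" line means the post-installation script died from SIGABRT instead of exiting with a status. As a rough illustration of how a parent process can tell the two cases apart, here is a Python sketch; it is not dpkg's code, and the names describe_exit and run_and_describe are invented for the example.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string.

    On POSIX, subprocess uses a negative return code -N to report that the
    child was terminated by signal N.
    """
    if returncode < 0:
        sig = signal.Signals(-returncode)
        return f"killed by signal {sig.name}"
    return f"exited with status {returncode}"


def run_and_describe(cmd):
    """Run cmd and describe how it ended."""
    return describe_exit(subprocess.run(cmd).returncode)
```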

-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (990, 'unstable'), (101, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.6.11
Locale: LANG=en_US.UTF-8, LC_CTYPE=hu_HU.UTF-8 (charmap=UTF-8)

Versions of packages locales depends on:
ii  debconf               1.4.48   Debian configuration management sy
ii  libc6 [glibc-2.3.4-3] 2.3.4-3  GNU C Library: Shared libraries an

-- debconf information:
* locales/default_environment_locale: en_US.UTF-8
* locales/locales_to_be_generated: en_US ISO-8859-1, en_US.ISO-8859-15 
ISO-8859-15, en_US.UTF-8 UTF-8, hu_HU ISO-8859-2, hu_HU.UTF-8 UTF-8





Bug#304604: locales: Upgrading fails

2005-04-14 Thread Gabor Gombas
On Fri, Apr 15, 2005 at 01:44:38AM +0900, GOTO Masanori wrote:

 Could you trace this installation process in detail and determine
 which binary caused this error?

The error comes from perl (debconf). Actually all debconf-using packages
seem to be broken with libc-2.3.4; dpkg-reconfigure also dies with the
same error.

You may reassign this bug to perl, but if perl is not fixed before sarge
(a fix is unlikely, since AFAIK perl is frozen and the bug does not
affect sarge), a more generic workaround might be needed for the
sarge-to-etch transition (yeah, I know it is in the distant future ;-).

OK, more stracing shows that the abort() happens when perl exits, so the
bug would be harmless if malloc() and friends did not call abort() by
default.

Gabor

-- 
Gabor Gombas   CERN IT-GM-DM
[EMAIL PROTECTED]   Office: 31/S-016

