After some investigation, I've narrowed the problem down somewhat. But first, some system information:
CentOS 5.8, kernel 2.6.18-308.el5, 64-bit Intel system. We're using dnsmasq, together with nscd, to cache DNS queries. We have an internal BIND server, which xCAT manages.

Once the qmaster starts up, I see this in the messages file:

10/31/2012 09:48:11| main|srvname|W|local configuration srvname not defined - using global configuration

I'm not entirely sure what that means, but some searches pointed to resolver/hostname issues. This led me to check name resolution, so I set the hosts entry in nsswitch.conf to "files" only and added everything needed to /etc/hosts. I also turned off nscd and dnsmasq. After that, the qmaster no longer segfaults and everything seems to work as it should.

Has anyone seen anything like this before? Even if the DNS contained bogus or faulty information (although I can't see that it does), the qmaster shouldn't segfault, should it? As soon as I change back from pure "files" to "files dns", it takes 2-3 minutes and the qmaster segfaults again.

It might be worth noting that this host is normally an SGE 6.2u5 qmaster, and with the original resolver configuration that works without problems.

Any ideas what could lead to this rather strange behaviour, or how to dig up more information to narrow it down further?

Wbr
Andreas
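
A minimal sketch of one way to narrow this down further, assuming Python 3 is available on the host: compare forward and reverse lookups for each cluster host name under both resolver settings ("files" versus "files dns"). The helper name check_host is hypothetical and not part of SGE; the lookups use the standard socket module, which goes through the system resolver and therefore follows the hosts order in nsswitch.conf.

```python
import socket


def check_host(name, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name, consistent) for one host name.

    `forward` and `reverse` default to the system resolver calls, so the
    answers follow the hosts order configured in nsswitch.conf. They can
    be swapped for stubs when testing without a network.
    """
    try:
        ip = forward(name)
    except OSError:
        # Forward lookup failed (socket.gaierror is an OSError subclass).
        return (None, None, False)

    try:
        rname, aliases, _addrs = reverse(ip)
    except OSError:
        # Reverse lookup failed (socket.herror is an OSError subclass).
        return (ip, None, False)

    # Treat the pair as consistent when the short host part of the name we
    # asked for matches the short host part of any name the reverse lookup
    # returned (primary name or alias), ignoring case.
    wanted = name.lower().split(".")[0]
    returned = {n.lower().split(".")[0] for n in [rname] + list(aliases)}
    return (ip, rname, wanted in returned)
```

Calling check_host(socket.gethostname()), and the same for each execution host, once with "files" and once with "files dns", would show whether any name resolves differently or inconsistently between the two configurations.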
