OSCAR 6.0.5 has solved the problem of the oscartst user not existing on the client nodes after a build. Thank you!
The network problem still exists, but I suspect that's a hardware issue and not the fault of OSCAR at all. (And I have a working workaround.)

Cheers,
Nick.

--
Nick Triantafillou
Computer Systems Officer
Faculty of Informatics
University of Wollongong
Contact - Ph 02 4221 5669
Email: n...@uow.edu.au

On 7/04/2010 3:28 PM, geoffroy.val...@free.fr wrote:
> Hello,
>
> Please update your RPM to install the binary packages for oscar-6.0.5 and let
> me know if you still have the problem.
>
> Thanks,
>
> ----- "Nicolas Triantafillou" <n...@uow.edu.au> wrote:
>
>> Thanks for that Joe, I'll check into it after our Easter long weekend.
>>
>> Does anyone have any idea why the oscartst user doesn't exist on any
>> of my clients after deploying an image? I did a quick useradd oscartst
>> on every client machine and it resolved all of my problems.
>>
>> Cheers,
>>
>> Nick.
>> ________________________________________
>> From: Greenseid, Joseph M (IS) [joseph.greens...@ngc.com]
>> Sent: Thursday, April 01, 2010 11:55 PM
>> To: oscar-users@lists.sourceforge.net;
>> oscar-users@lists.sourceforge.net
>> Subject: Re: [Oscar-users] /home mount failed, ssh server->node /
>> node->server failed
>>
>> I remember that a long time ago I experienced something similar to
>> your problem. If I remember correctly, our problem was eventually
>> traced to spanning tree on our switches; when we disabled spanning
>> tree, we no longer needed the sleep statement before NFS attempted to
>> do its mounts, because there was no pause anymore when the switches
>> checked for loops in the network.
>>
>> If you have switches that implement spanning tree, maybe you could try
>> turning it off and seeing if that was what was causing the network
>> issues?
>>
>> --Joe
>>
>> ________________________________
>> From: Nicolas Triantafillou [mailto:n...@uow.edu.au]
>> Sent: Wed 3/31/2010 11:43 PM
>> To: oscar-users@lists.sourceforge.net
>> Subject: Re: [Oscar-users] /home mount failed, ssh server->node /
>> node->server failed
>>
>> Thank you Ibad, this certainly put me in the right direction, as our
>> servers all have dual integrated NICs (Dell PowerEdge 1750s).
>>
>> Unfortunately the BIOS in these servers doesn't have the capability to
>> disable just one of the integrated NICs; it's both or none. I found an
>> alternate solution on another website:
>> http://crazytoon.com/2007/05/11/centos-and-redhat-problem-nfs-mount-at-boot-up-fails-with-error-system-error-no-route-to-host/
>>
>> For email archive history in case that site goes down, this was the
>> solution I used:
>>
>> vi /etc/init.d/netfs
>> insert: action $"Sleeping for 30 secs: " sleep 30
>> right after: [ ! -f /var/lock/subsys/portmap ] && service portmap start
>> and right before: action $"Mounting NFS filesystems: " mount -a -t nfs,nfs4
>>
>> That solves one of our problems... now to find out why there's no
>> oscartst user on any of my client machines :)
>>
>> Cheers,
>> Nick.
>>
>> On 31/03/2010 9:13 PM, I.Kureshi U0850037 wrote:
>>> In our cluster we have found that if you have multiple NICs on the
>>> compute nodes, they often fail to reconnect to the head node when
>>> they reboot. This usually happens because the system mixes up which
>>> eth is which, and after initializing eth0 it fails at mounting the
>>> NFS file system. We have bypassed this by editing the ifcfg-eth0 and
>>> ifcfg-eth1 files and hardcoding the MAC addresses. This still
>>> sometimes doesn't work. The best way is to disable the NIC you are
>>> not using.
>>>
>>> Hope this helps
>>>
>>> Ibad
>>> ________________________________________
>>> From: Nicolas Triantafillou [n...@uow.edu.au]
>>> Sent: Wednesday, March 31, 2010 6:11 AM
>>> To: oscar-users@lists.sourceforge.net
>>> Subject: [Oscar-users] /home mount failed, ssh server->node /
>>> node->server failed
>>>
>>> Hello,
>>>
>>> I recently installed OSCAR 6.0.5svn03312010 on CentOS 5.4. (The 'latest
>>> release' version wasn't working at all so I went to the development
>>> version).
>>>
>>> I ran the 'test_cluster' script at the end of the installation wizard
>>> and the following is happening:
>>>
>>> ---
>>>
>>> [r...@h-node01 testing]# ./test_cluster
>>> Performing root tests...
>>> /home mounts 7 nodes
>>> failed [FAILED]
>>>
>>> Preparing user tests...
>>> Performing user tests...
>>> SSH ping test [PASSED]
>>> SSH server->node [FAILED]
>>> SSH node->server
>>> Permission denied, please try again.
>>> Permission denied, please try again.
>>> Permission denied (publickey,gssapi-with-mic,password).
>>> SSH node->server [FAILED]
>>>
>>> ---
>>>
>>> 1. The /home mounts are failing on boot due to the error 'no route to
>>> host', even though /etc/rc3.d/S25netfs is clearly being run after
>>> /etc/rc3.d/S10network, which succeeds. I moved it to S99netfs and it
>>> still fails to mount /home on boot. Immediately after booting I can
>>> manually ssh to the client and mount /home and it works perfectly.
>>>
>>> 2. The SSH problem is due to the oscartst user not existing on any of
>>> the client nodes. The test_cluster script seems to be trying to execute
>>> useradd only on the head node if /home/oscartst doesn't exist; however,
>>> it does exist on the head node, as does the user, just not on the clients.
>>>
>>> Does anyone have any idea how to resolve either of these issues?
>>>
>>> Also, I found this in the test_cluster script (while trying to work out
>>> why $test_user_homedir/oscartestfile disappears even when the unlink
>>> command is commented out):
>>>
>>> # Cleanup before copying base files
>>> `rm -rf $test_user_homedir/*`;
>>>
>>> This looks very dangerous, especially if $test_user_homedir is somehow
>>> unset. :)
>>>
>>> Cheers,
>>> Nick.
>>>
>>> --
>>> Nick Triantafillou
>>> Computer Systems Officer
>>> Faculty of Informatics
>>> University of Wollongong
>>>
>>
>> ------------------------------------------------------------------------------
>> Download Intel® Parallel Studio Eval
>> Try the new software tools for yourself. Speed compiling, find bugs
>> proactively, and fine-tune applications for parallel performance.
>> See why Intel Parallel Studio got high marks during beta.
>> http://p.sf.net/sfu/intel-sw-dev
>> _______________________________________________
>> Oscar-users mailing list
>> Oscar-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/oscar-users
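
The fixed 30-second sleep quoted in this thread works, but it always costs the full delay, and it may be too short on a slow network. A minimal sketch of one alternative, assuming the only goal is to hold off the NFS mounts until the server answers, is to retry a probe command until it succeeds or a timeout passes. The function name wait_until, the 60-second limit, and the use of ping against h-node01 as the probe are illustrative assumptions; none of them come from OSCAR or from the stock netfs script.

```shell
#!/bin/sh
# Hypothetical helper (not part of OSCAR or the stock netfs init script):
# run a probe command repeatedly until it succeeds or a timeout expires.
# Usage: wait_until TIMEOUT_SECS INTERVAL_SECS COMMAND [ARGS...]
# Returns 0 as soon as the probe succeeds, 1 if the timeout is reached.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    # Re-run the probe, discarding its output, until it exits with status 0.
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Example use (assumes the NFS server is reachable under the name h-node01):
#   wait_until 60 2 ping -c 1 -W 1 h-node01 && mount -a -t nfs,nfs4
```

In /etc/init.d/netfs a call like the commented example could stand in for the sleep line, so the mount starts as soon as the server responds. Whether a successful ping is enough to guarantee that the mount will then succeed on a network running spanning tree has not been tested here.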