Ivan,
Thank you for your solution. Will try that.

Regards,
Nikita

From: "Hung-Sheng Tsao (Lao Tsao 老曹) Ph. D." [mailto:laot...@gmail.com]
Sent: Tuesday, February 08, 2011 8:05 PM
To: n...@kemsu.ru
Subject: Re: [Oscar-users] Problem with GUI on CentOS 5.5

more information…

_____

OSCAR-USERS@LISTS.SOURCEFORGE.NET, 28 July 2010, 4:57 AM -0400
[Oscar-users] Oscar 6.0.5 (unstable) installation notes
by Ivan V. Sergeyev
_____

Just thought I would share my experiences here for those having installation problems, since I've now been using OSCAR 6.0.5 successfully on a production cluster for several months.

My first install was marred by broken dependencies in the torque packages - I was forced to use SGE, which ended up being suboptimal (you can see my earlier posts if interested). Geoffroy quickly fixed these dependencies in trunk, but the RPMs have still not been updated. As a result, I had all the same issues during my second round of imaging.

The fix is of course to compile your own opkgs. To do so, first download the OSCAR trunk using 'svn co http://svn.oscar.openclustergroup.org/oscar/trunk oscar' - instructions at http://svn.oscar.openclustergroup.org/trac/oscar/wiki/SVNinstructions. Get opkgc using 'yum install opkgc', then go to the oscar/packages/torque directory and (as root) run 'opkgc --dist=rhel' to compile the opkgs. These will be located in /usr/src/redhat/RPMS/x86_64/ for a 64-bit CentOS/RHEL distribution.

For those who don't want to bother with this process, I provide the updated x86_64 opkgs here: http://biophys.chem.columbia.edu/oscar/. Hopefully these will make it into the unstable repo soon enough (I'm guessing for 6.0.6).
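The build steps above can be sketched as a small script. This is only an illustration, not a tested procedure: the `run` helper, the `DRY_RUN` switch and the `WORKDIR` variable are names made up for this sketch, while the commands and paths are the ones quoted in the message. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the opkg rebuild steps quoted above.
# Dry run by default: set DRY_RUN=0 to really execute (as root).
set -e

DRY_RUN="${DRY_RUN:-1}"
OSCAR_SVN="http://svn.oscar.openclustergroup.org/oscar/trunk"
WORKDIR="${WORKDIR:-oscar}"            # checkout directory used in the post
RPM_OUT="/usr/src/redhat/RPMS/x86_64"  # where the post says the RPMs end up

# Hypothetical helper: print the command, run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

run svn co "$OSCAR_SVN" "$WORKDIR"     # fetch OSCAR trunk
run yum install opkgc                  # install the opkg compiler
run cd "$WORKDIR/packages/torque"      # torque package sources
run opkgc --dist=rhel                  # build the opkgs
run ls "$RPM_OUT"                      # list the resulting RPMs
```

Running it unchanged prints the five commands prefixed with `+`, which is a cheap way to review them before executing anything as root.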
Place these opkgs in /tftpboot/oscar/rhel-5-x86_64/ and run 'packman --prepare-repo /tftpboot/oscar/rhel-5-x86_64' (or the i386 equivalent) to make the repo available to OSCAR. Also create a local repo of your distribution's install disk per the instructions in the official documentation. With the repos in place, and as long as you remember to turn off all firewalls, SELinux, etc., installation is a breeze. I should mention this is CentOS 5.5, fully updated.

The next hiccup came after imaging - the nodes imaged just fine but would not boot, giving kernel panics and errors like "setuproot: unable to mount /dev/root". This is hardware-related. I am using Dell PowerEdge T610s (8-core) as my nodes, which require the mptbase, mptsas, and ata_piix kernel modules to boot properly. Solution: before imaging (but after image creation), run

    mkinitrd --preload mptbase --preload mptsas --preload ata_piix --without-dmraid --omit-lvm-modules /var/lib/systemimager/images/<image name>/boot/initrd-<kernel version>.el5.img <kernel version>

This of course assumes you don't need LVM and dmraid on your system; if you do, leave out the --without-dmraid and --omit-lvm-modules options. You can find your kernel version with 'uname -r'. With the new initrd in place, run or re-run OSCAR's Step 6 (if you've run it already, it seems to be important to repeat it for some reason) and then image your nodes. Using this approach I have all the experimental packages working except for sge and linux-ha (I don't need these).

There are also a couple of bugs in the testing scripts for 6.0.5 (this should probably go to the devel mailing list, but since I am not yet a member I'll write it here for now):

In /var/lib/perl5/vendor_perl/5.8.8/OSCAR/OCA/RM_Detect/TORQUE.pm, line 38 should be 'test => "/var/lib/oscar/testing/$pkg/$test"' instead of /var/lib/oscar/$pkg/testing/$test.

In /var/lib/oscar/testing/ganglia/test_user, both lines 97 and 128 should read "if ($hosts == $numhosts) {", not ($hosts eq $numhosts), so that the two counts are compared as numbers rather than as strings.
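For the initrd rebuild described earlier in this message, here is a minimal sketch of how the mkinitrd command line could be assembled. Several things are assumptions and not from the post: the image name `oscarimage` is a placeholder, `DRY_RUN` is a made-up switch, and the kernel version is taken from `uname -r` on the head node, which assumes the image uses the same kernel and that the `uname -r` output already carries the `.el5` suffix. The mkinitrd options themselves are copied from the message as written.

```shell
#!/bin/sh
# Sketch: assemble the mkinitrd call from the post.
# Dry run by default: set DRY_RUN=0 to execute (as root, on the head node).
DRY_RUN="${DRY_RUN:-1}"
IMAGE_NAME="${IMAGE_NAME:-oscarimage}"  # placeholder, use your image name
KVER="${KVER:-$(uname -r)}"             # assumes image kernel = running kernel
IMAGE_ROOT="/var/lib/systemimager/images/$IMAGE_NAME"
INITRD="$IMAGE_ROOT/boot/initrd-$KVER.img"

# One --preload option per module the Dell T610 nodes needed in the post.
PRELOAD_ARGS=""
for mod in mptbase mptsas ata_piix; do
    PRELOAD_ARGS="$PRELOAD_ARGS --preload $mod"
done

# Drop the last two options here if the nodes need dmraid or LVM.
CMD="mkinitrd$PRELOAD_ARGS --without-dmraid --omit-lvm-modules $INITRD $KVER"

echo "+ $CMD"
if [ "$DRY_RUN" = "0" ]; then
    $CMD   # unquoted on purpose so the string splits into arguments
fi
```

Setting `IMAGE_NAME` and `KVER` in the environment overrides the defaults, for example `IMAGE_NAME=myimage KVER=<kernel version> sh rebuild-initrd.sh` (the script name is arbitrary). Remember that the post says to re-run Step 6 afterwards.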
Finally, I am using multiple NICs on each node and multiple gigabit switches for communication with OpenMPI (which knows how to fully utilize such a setup). As a word of advice, the latency has been disappointing for MD simulation (GROMACS) performance - a modern 8-core machine generates so much data so quickly that it swamps the gigabit network interfaces. There's a nice paper on partially alleviating these issues: Kutzner C. et al. (2007), J. Computational Chemistry 28(12): 2075-2084. For those who can afford it, however, I strongly recommend InfiniBand or Myrinet.

Hope someone finds this useful...

Best,
Ivan
____________________
Ivan V. Sergeyev
McDermott Group
Columbia University
3000 Broadway, MC 3132
New York, NY, 10027

On 2/8/2011 1:33 AM, Nikita Andreev wrote:

I’m trying to deploy a cluster from CentOS 5.5 x86_64 with OSCAR 6.0.5. It’s a fresh install. I had a problem with the torque-modulefile dependency, with the following error:

    opkg-torque-server-2.1.13-1.noarch from unstable_rhel-5-x86_64 has depsolving problems
    --> Missing Dependency: torque-modulefile is needed by package opkg-torque-server-2.1.13-1.noarch (unstable_rhel-5-x86_64)

I’ve resolved it by manually downloading opkg-torque-server and rebuilding it with a torque-oscar-modulefile dependency, which I believe is correct.

At the moment I have two issues I can’t resolve by myself:

1. When I go into configuring switcher from the GUI I see: “No pkg_config were supplied by any OSCAR packages – nothing to configure”. There should be something to configure, since I chose openmpi, mpich and lam to install.

2. I can’t invoke the 4th step. I get the following error (log attached):

    Tk::Error: Can't set -options to `ARRAY(0xd34c710)' for Tk::Optionmenu=HASH(0xd360660): No -label at /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/Tk/Widget.pm line 256

If CentOS 5.5 isn’t supported, then please tell me which version is. Is it 5.4?
Another question: why are all the necessary packages like torque, maui, etc. under the “Experimental” package set? Is OSCAR 6.0.5 an unstable version, and should I roll back to version 5?

Regards,
Nikita

------------------------------------------------------------------------------
The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE:
Pinpoint memory and threading errors before they happen.
Find and fix more than 250 security defects in the development cycle.
Locate bottlenecks in serial and parallel code that limit performance.
http://p.sf.net/sfu/intel-dev2devfeb
_______________________________________________
Oscar-users mailing list
Oscar-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oscar-users