Hey, it would be a pleasure if you read the attached document and gave me your feedback.
Regards. :-)

> Hi All,
>
> I wish to test a real world problem as proof of concept using globus and
> mpi.
> I could not find any input how to integrate Globus, PBS and MPI.
> I am working on HP UX , I have installed the distribution provided by HP.
>
> I will be grateful if some one can give some pointers how to proceed
> with.It will be great if some one please provide any link for relevant
> documents.
>
> Thanks & Regards
> JP Singh

http://ece.uprm.edu/~s047267
http://del.icio.us/josanabr
http://blog-grid.blogspot.com

Preparing Torque/Maui for Work with GT4
John A. Sanabria - [EMAIL PROTECTED]
Last Updated: Wed Sep 5 23:35:49 2007
This document is based on my previous tutorial, Installing and Configuring Torque/MAUI. The previous tutorial works fine when you deal with a plain cluster; however, additional configuration steps are necessary to integrate Torque/MAUI with GT4. Readers of the previous article will therefore find similarities between the two. Likewise, new readers do not need any knowledge of the previous article to install a Torque/MAUI cluster that works without Globus Toolkit (GT) integration. Finally, this tutorial is a work in progress, so any feedback is welcome.

Requirements

First of all, to deploy a minimal cluster you need only one machine; however, to expose the problems that arise between independent machines, two computational nodes are suggested. My testbed consists of two Linux machines with FC7 installed. Furthermore, the machines have the following network services installed:
Besides, you must get the Torque and Maui source code.

Users

On every machine belonging to the cluster, it is necessary to create one user with the same id. (Someone could provide a short NIS+ tutorial.) For this tutorial the user created is josanabr.

Setting Up the Services

Now, I provide some short steps and hacks to configure RSH and NFS properly.

RSH

Next, I describe the steps to configure the RSH service. (For simplicity, I recommend executing these configuration steps on EVERY cluster machine, e.g. with cssh.)
Now, you can try to log in as
[EMAIL PROTECTED] ~]$ rsh pdclab-04
connect to address 136.145.116.81 port 543: Connection refused
Trying krb4 rlogin...
connect to address 136.145.116.81 port 543: Connection refused
trying normal rlogin (/usr/bin/rlogin)
Password:
Last login: Tue Sep 4 15:46:11 from pdclab-00
[EMAIL PROTECTED] ~]$
Is that cool? For Torque purposes, it is not. It does not like
[EMAIL PROTECTED] ~]$ which -a rsh
/usr/kerberos/bin/rsh
/usr/bin/rsh
The first (you can try another, more elegant solution). In order to do that, as root, execute the following commands:
[EMAIL PROTECTED] etc]# cd /usr/kerberos/bin/
[EMAIL PROTECTED] bin]# mv rsh rsh.krb
[EMAIL PROTECTED] bin]# mv rlogin rlogin.krb
[EMAIL PROTECTED] bin]# ln -sf /usr/bin/rsh .
[EMAIL PROTECTED] bin]# ln -sf /usr/bin/rlogin .
Now, try again:
[EMAIL PROTECTED] ~]$ rsh pdclab-04
Password:
Last login: Tue Sep 4 15:46:29 from pdclab-00
[EMAIL PROTECTED] ~]$
Hmmm! Looks better :-). Still, I need to provide my password. To avoid typing the password, e.g. when you try to log in from
[EMAIL PROTECTED] ~]$ vi .rhosts
[EMAIL PROTECTED] ~]$ cat .rhosts
pdclab-00.ece.uprm.edu
pdclab-00
[EMAIL PROTECTED] ~]$ chmod og-r .rhosts
[EMAIL PROTECTED] ~]$ ls -l .rhosts
-rw------- 1 josanabr josanabr 33 Sep 4 16:16 .rhosts
[EMAIL PROTECTED] ~]$
Now, try again:
[EMAIL PROTECTED] ~]$ rsh pdclab-04
Last login: Tue Sep 4 16:18:17 from pdclab-00
[EMAIL PROTECTED] ~]$
Hmmm!!! Well done, guy. Now, allow the connection from

NFS

Now, for proper integration of Torque and GT4, we need to share a filesystem from the master node with the compute nodes.
Remember, our master node is
[EMAIL PROTECTED] etc]# vi /etc/exports
[EMAIL PROTECTED] etc]# cat /etc/exports
/home pdclab-04.ece.uprm.edu(rw,sync)
[EMAIL PROTECTED] etc]#
Then, restart the NFS related services:
/etc/init.d/portmap restart
/etc/init.d/nfs restart
/etc/init.d/nfslock restart
Now, you can
[EMAIL PROTECTED] ~]# mount -t nfs pdclab-00:/home /home
[EMAIL PROTECTED] ~]# ls -l /home/
total 8
drwx------ 5 globus globus 4096 Jul 23 17:56 globus
drwx------ 4 josanabr josanabr 4096 Sep 5 14:19 josanabr

Building the Cluster

To achieve the "PBS" and GT4 integration, the first thing to describe is how to set up the Torque and MAUI components. We select Torque as the resource manager for distributed environments; it can be considered an open source PBS clone. MAUI, on the other hand, is a robust scheduler that supports advanced mechanisms and policies for scheduling large sets of distributed computational resources. No more words, hands on.

Setting up Torque

Torque is an OpenPBS descendant. It is a distributed resource manager, providing control over batch jobs and distributed compute nodes. Although it has support for handling scheduling policies, this is not its major concern. Let's get hands on.

Download the software

The software can be downloaded from here. Note: this tutorial employs version 2.1.9.

Unpacking, Configuring, Compiling and Installing the Server

Go to a proper directory where you wish to uncompress the file:
[EMAIL PROTECTED] ~]# cd /usr/local/src/
[EMAIL PROTECTED] src]# tar xfz ~/torque-2.1.9.tar.gz
[EMAIL PROTECTED] src]# cd torque-2.1.9/
Now, to configure Torque, it is explicitly requested to build the
./configure --enable-server --enable-monitor --enable-clients
make
make install
If there are no errors, you need to execute the following commands under the Torque source code directory:
./torque.setup globus
make packages
The first command finishes configuring the server and indicates that

Setting Up a Compute Node

After the tasks done so far, you can copy the scripts:
from the server
[EMAIL PROTECTED] ~]# scp pdclab-00:/usr/local/src/torque-2.1.9/torque-package-clients-linux-i686.sh .
[EMAIL PROTECTED]'s password:
torque-package-clients-linux-i686.sh 100% 400KB 400.0KB/s 00:00
[EMAIL PROTECTED] ~]# scp pdclab-00:/usr/local/src/torque-2.1.9/torque-package-mom-linux-i686.sh .
[EMAIL PROTECTED]'s password:
torque-package-mom-linux-i686.sh 100% 448KB 447.5KB/s 00:00
[EMAIL PROTECTED] ~]#
Now, install the packages:
[EMAIL PROTECTED] ~]# ./torque-package-clients-linux-i686.sh --install
Installing TORQUE archive...
Done.
[EMAIL PROTECTED] ~]# ./torque-package-mom-linux-i686.sh --install
Installing TORQUE archive...
Done.
[EMAIL PROTECTED] ~]#
Verify that the
[EMAIL PROTECTED] ~]# cat /var/spool/torque/server_name
pdclab-00.ece.uprm.edu
[EMAIL PROTECTED] ~]#
It's OK. To finalize the client configuration, edit the file
arch x86
opsys fc6
$logevent 255
$usecp *:/home /mnt/home
The last line indicates how to map the directory

Now you can execute the program that receives the jobs from the master node:
[EMAIL PROTECTED] ~]# pbs_mom

Setting Up Maui

Maui is an advanced policy engine used to improve the manageability and efficiency of machines ranging from clusters of a few processors to multi-teraflop supercomputers.
The next steps must be executed on the master node (

Download the software

In order to get the software go here.

Previous hacks
Due to integration issues with Torque, Maui expects to find
[EMAIL PROTECTED] ~]# cd /usr/local/lib
[EMAIL PROTECTED] lib]# ln -sf libtorque.so libpbs.so
[EMAIL PROTECTED] lib]# ln -sf libtorque.a libpbs.a

Unpacking, Configuring, Compiling and Installing

Execute the following commands:
[EMAIL PROTECTED] lib]# cd /usr/local/src/
[EMAIL PROTECTED] src]# tar xfz ~/maui-3.2.6p13.tar.gz
[EMAIL PROTECTED] src]# cd maui-3.2.6p13/
[EMAIL PROTECTED] maui-3.2.6p13]# export MAUIDIR=/var/spool/maui
[EMAIL PROTECTED] maui-3.2.6p13]# ./configure --with-spooldir=${MAUIDIR}
[EMAIL PROTECTED] maui-3.2.6p13]# make
[EMAIL PROTECTED] maui-3.2.6p13]# make install
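If the Maui build fails at link time, one thing worth checking is whether the libpbs symbolic links from the "Previous hacks" step are really in place. The following is only a minimal sketch: the function name check_pbs_links is hypothetical, and it assumes the links live in /usr/local/lib as in the commands above.

```shell
#!/bin/bash
# Hypothetical helper (not part of Torque or Maui): verify that the
# libpbs.so and libpbs.a links created in "Previous hacks" exist and
# point at real files. The directory defaults to /usr/local/lib,
# which is the location used in this tutorial.
check_pbs_links() {
    local libdir="${1:-/usr/local/lib}"
    local missing=0
    local name
    for name in libpbs.so libpbs.a; do
        # -e follows symlinks, so a dangling link also counts as missing
        if [ ! -e "$libdir/$name" ]; then
            echo "missing: $libdir/$name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, as root on the master node:
#   check_pbs_links && echo "PBS links look fine"
```

The function returns 0 only when both links resolve, so it can be chained in front of ./configure.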
Note: If you get an error message related to

Final Configuration Steps

OK, it is almost done, so execute the following:
[EMAIL PROTECTED] ~]# qmgr
Qmgr: set server resources_default.nodect = 1
Qmgr: set server resources_default.walltime = 00:05:00
Qmgr: quit
Finally,
[EMAIL PROTECTED] ~]# qterm -t quick ; pbs_server
[EMAIL PROTECTED] ~]# /usr/local/maui/sbin/maui
[EMAIL PROTECTED] ~]# pbsnodes -a
pdclab-04.ece.uprm.edu
state = free
np = 1
ntype = cluster
status = arch=x86,opsys=fc6,uname=Linux pdclab-04.ece.uprm.edu 2.6.18-1.2798.fc6xen #1 SMP Mon Oct 16 15:11:19 EDT 2006 i686,sessions=? 0,nsessions=? 0,nusers=0,idletime=185,totmem=816556kb,availmem=729324kb,physmem=262324kb,ncpus=1,loadave=0.02,netload=26127667,state=free,jobs=? 0,rectime=1189026813
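With more than one compute node, reading the whole listing gets tedious. As a rough sketch, the number of free nodes can be counted by filtering the "state = free" lines of the pbsnodes -a output. The function name count_free_nodes is hypothetical, and the pattern assumes the output layout shown above.

```shell
#!/bin/bash
# Hypothetical helper: count the nodes reported as free.
# It reads "pbsnodes -a" output on stdin and counts the lines of the
# form "state = free". The long "status = ..." line contains
# "state=free" without spaces, so it is not counted twice.
count_free_nodes() {
    grep -Ec '^[[:space:]]*state = free[[:space:]]*$'
}

# Usage on the master node:
#   pbsnodes -a | count_free_nodes
```

Because the function only reads standard input, it can also be tried on a saved copy of the listing.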
Good kid.

Testing Torque/MAUI Installation

In order to test our cluster deployment, log in as
#!/bin/bash
/bin/hostname
save it as,
[EMAIL PROTECTED] ~]$ qsub mysub
0.pdclab-00.ece.uprm.edu
[EMAIL PROTECTED] ~]$ ls -rtl
total 8
-rw-r--r-- 1 josanabr josanabr 29 Sep 5 17:17 mysub
-rw------- 1 josanabr josanabr 23 Sep 5 17:17 mysub.o0
-rw------- 1 josanabr josanabr 0 Sep 5 17:17 mysub.e0
[EMAIL PROTECTED] ~]$ cat mysub.o0
pdclab-04.ece.uprm.edu
[EMAIL PROTECTED] ~]$
You already have a cluster, congrats. :-D Now, our journey begins ;-)

A Short Journey over GT Quick Start Guide

Here, instead of providing a deep description of the steps to carry out the Globus configuration, compilation and installation process, we just provide a checklist to follow in order to achieve the integration between Torque/MAUI and GT4. Prior knowledge of GT installation is therefore recommended.

Configuring, Compiling and Installing GT4
Ok, remember
[EMAIL PROTECTED] ~]$ cd gt4.0.5-all-source-installer/
[EMAIL PROTECTED] gt4.0.5-all-source-installer]$ ./configure --prefix=/opt/gt --enable-wsgram-pbs
[EMAIL PROTECTED] gt4.0.5-all-source-installer]$ make
...
...
echo "Your build completed successfully. Please run make install."
Your build completed successfully. Please run make install.
[EMAIL PROTECTED] gt4.0.5-all-source-installer]$ make install

Setting up Security at your Cluster
For that step, I have a simpleCA set up in the
Besides,
[EMAIL PROTECTED] ~]$ myproxy-init -s init
Your identity: /O=Grid/OU=GlobusTest/OU=simpleCA-init.ece.uprm.edu/OU=ece.uprm.edu/CN=John Sanabria
Enter GRID pass phrase for this identity:
Creating proxy ................................... Done
Proxy Verify OK
Your proxy is valid until: Wed Sep 12 21:06:10 2007
Enter MyProxy pass phrase:
Verifying - Enter MyProxy pass phrase:
A proxy valid for 168 hours (7.0 days) for user josanabr now exists on init.
[EMAIL PROTECTED] ~]$ myproxy-logon -s init
Enter MyProxy pass phrase:
A credential has been received for user josanabr in /tmp/x509up_u501.
[EMAIL PROTECTED] ~]$
For more information, read section 4.3 of the GT Quick Start Guide.

Preparing GridFTP Service

You can follow the steps described in section 5.4.

Preparing the Globus Container
Before following the instructions given in section 5.5, you need to provide some information to configure the RFT service at
GRAM, the time of truth

Read section 5.7. Test it as follows:
[EMAIL PROTECTED] ~]$ globusrun-ws -submit -s -c /bin/date
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:99bef29a-5c26-11dc-b839-00163e3dc54e
Termination time: 09/07/2007 03:09 GMT
Current job state: Active
Current job state: CleanUp-Hold
Wed Sep 5 23:09:38 AST 2007
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
Well, but you are still not using your cluster. Try this:
[EMAIL PROTECTED] ~]$ globusrun-ws -Ft PBS -submit -S -f a.rsl
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:b504593c-5c26-11dc-8737-00163e3dc54e
Termination time: 09/07/2007 03:10 GMT
Current job state: StageIn
Current job state: Pending
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
Works? Hmmm!!! I guess not:
[EMAIL PROTECTED] ~]$ cat stderr
pdclab-04.ece.uprm.edu: Connection refused
/var/spool/torque/mom_priv/jobs/22.pdclab-0.SC: line 55: [: too many arguments
But everything looks correct? More amazing is the way to resolve the problem. Edit the file
[EMAIL PROTECTED] ~]$ globusrun-ws -Ft PBS -submit -S -f a.rsl
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:2483539e-5c27-11dc-9bc2-00163e3dc54e
Termination time: 09/07/2007 03:13 GMT
Current job state: StageIn
Current job state: Pending
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
[EMAIL PROTECTED] ~]$ cat stderr
[EMAIL PROTECTED] ~]$ cat stdout
Hello World!
Already done!

Final Comments

At this time, integrating a cluster with GT can be a hard task. Besides, there are many factors that can affect the normal integration process. Moreover, support from the mailing list is sometimes either unavailable or not very effective. This is not your fault, guys; anyway, it is disappointing. This document fulfills my needs; for a newbie reader, more details are perhaps necessary. The main motivation for writing this document is to provide a better roadmap for integrating Torque/MAUI with Globus. I am sure this document can be improved, so I need your feedback. I'll appreciate any correction (grammar, technical, whatever!). Regards.

Resources

Certainly, I did not write all that from scratch; I used the following web resources:
For more information use Google :-D, or write me at
Installing and Configuring Torque/Maui
John A. Sanabria - [EMAIL PROTECTED]
Last Updated: Tue Sep 4 13:31:21 2007
Today, many research areas are interested in processing huge repositories of data, e.g. DNA sequences, weather forecasting, data mining, and so forth. R & D groups are equipped with a vast set of hardware and software resources to support the processing and analysis of these data. To process the required information effectively with these resources, it is necessary to select proper tools to manage and schedule the resources according to previously established policies. Here, we are interested in showing how to install and configure Torque and Maui for a COW (Cluster Of Workstations). This document begins by explaining the difference between Torque and Maui. Next, it provides a set of steps to configure Torque and, after that, to configure Maui. Finally, we provide a couple of samples to verify that everything is properly set up.

Torque and Maui, What are these?

On the one hand, Torque is a manager of distributed resources; on the other hand, Maui is a scheduler for distributed resources. Torque is the open version of another distributed resource manager, PBS. It provides control over batch jobs and distributed compute nodes. Although it has support for dealing with scheduling policies, this is not its major concern. Maui, in turn, is in charge of job scheduling for clusters and supercomputers. We choose Maui as the cluster scheduler due to its easy integration with Torque.

Hands On

Download the software

Both projects can be accessed at www.clusterresources.com. Then, go to the download link and get the software.

Installing and Configuring Torque

Compiling and First Configuration Steps in the Master Node

Assume that you have the file torque-2.0.0p8.tar.gz. Execute the following commands:
# tar xvzf torque-2.0.0p8.tar.gz
# cd torque-2.0.0p8
Before proceeding with the configuration process, it is necessary to edit the file

Now, execute this:
# ./configure --enable-server --enable-monitor --enable-clients --with-scp
# make
# make install
After that, as
# ./torque.setup globus
# make packages
The last command creates the scripts necessary to configure the compute nodes.

Configuring the Compute Nodes

In Torque jargon, the compute nodes are the resources in charge of executing the tasks submitted to the cluster.
By default the Torque configuration files are located at

Next, configure the nodes so that the master node can connect to any compute node without a password. To achieve this using the ssh protocol, do the following:
# ssh-keygen -t rsa
# cat .ssh/id_rsa.pub | ssh [EMAIL PROTECTED] "cat - >> .ssh/authorized_keys"
The first command,

Now, copy the following two scripts to every compute node to be configured:
# scp /usr/local/src/torque-2.0.0p8/torque-package-clients-linux-i686.sh .
# scp /usr/local/src/torque-2.0.0p8/torque-package-mom-linux-i686.sh .
Then, execute the following commands:
# ./torque-package-clients-linux-i686.sh --install
# ./torque-package-mom-linux-i686.sh --install
Now, verify the content of the file

After that, edit the file
arch x86
opsys fc6
$logevent 255

Finally, execute the command

Go back to the master
Log in as root on the master node and execute the command

Installing and Configuring Maui

Unlike Torque, Maui does not need any client configuration; actually, the configuration effort is nearly zero thanks to the easy integration with Torque. The Maui software must be installed on the master node but, before compiling it, it is necessary to create these symbolic links:
# ln -sf /usr/local/lib/libtorque.so /usr/local/lib/libpbs.so
# ln -sf /usr/local/lib/libtorque.a /usr/local/lib/libpbs.a
This is necessary because Maui by default expects to link its binaries against some PBS libraries. Now, execute the following commands:
# export MAUIDIR=/var/spool/maui
# tar xvfz maui-3.2.6p13.tar.gz
# cd maui-3.2.6p13
# ./configure --with-spooldir=${MAUIDIR}
# make
# make install
Now, we execute the following commands in the Torque manager:
# qmgr
Qmgr: set server resources_default.nodect = 1
Qmgr: set server resources_default.walltime = 00:05:00
Qmgr: quit
When everything is done:
If you can't see all the compute nodes, wait a moment until all nodes get registered with the master node.

Simple Test

OK, cross your fingers (I am doing the same) and execute the following command:
qsub mysub
where the
#!/bin/bash
/bin/hostname
As a result of executing this, you should find a couple of files

Integrating Globus and Torque/Maui

The Globus Quickstart Guide provides useful steps to integrate the cluster master node with any existing Globus infrastructure. I encourage you to read the information provided here. If you find any problems submitting tasks to the cluster, execute the following commands after the Globus installation procedure:
cd <GT_source_directory>
make gt4-gram-pbs
make install
That information can be found here.
Try to execute the next command
<job>
<executable>my_echo</executable>
<directory>${GLOBUS_USER_HOME}</directory>
<argument>Hello</argument>
<argument>World!</argument>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
<fileStageIn>
<transfer>
<sourceUrl>gsiftp://pdclab-05.ece.uprm.edu:2811/bin/echo</sourceUrl>
<destinationUrl>file:///${GLOBUS_USER_HOME}/my_echo</destinationUrl>
</transfer>
</fileStageIn>
<fileCleanUp>
<deletion>
<file>file:///${GLOBUS_USER_HOME}/my_echo</file>
</deletion>
</fileCleanUp>
</job>
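Assuming the job description above is saved in a file, submission can be wrapped in a small shell function. This is only a sketch: the flags are the ones shown in the companion tutorial earlier in this message (globusrun-ws -Ft PBS -submit -S -f a.rsl), the function name submit_pbs_job is hypothetical, and the GLOBUSRUN variable is a hypothetical override that allows a dry run without a Globus installation.

```shell
#!/bin/bash
# GLOBUSRUN is a hypothetical override; by default the real client is used.
GLOBUSRUN="${GLOBUSRUN:-globusrun-ws}"

# Hypothetical wrapper: submit a job description file to the PBS
# factory with the same flags used in the companion tutorial.
submit_pbs_job() {
    local rsl="$1"
    # Fail early if the job description cannot be read
    if [ ! -r "$rsl" ]; then
        echo "job description not readable: $rsl" >&2
        return 1
    fi
    "$GLOBUSRUN" -Ft PBS -submit -S -f "$rsl"
}

# Usage (a.rsl is the file name used in the companion tutorial):
#   submit_pbs_job a.rsl && cat "$HOME/stdout"
# Dry run that only prints the arguments:
#   GLOBUSRUN=echo submit_pbs_job a.rsl
```

Checking that the file is readable first avoids delegating credentials for a submission that cannot succeed.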
Conclusion

Torque and Maui provide a low-cost software framework for cluster management. As described above, setting up a cluster with these tools is an easy task. I'm using XEN virtual machines with Fedora Core 6 as the base operating system for my cluster platform, but other computational environments can be used to deploy these tools thanks to the source code availability of both projects.

Resources

Certainly, I did not write all that from scratch; I used the following web resources:
For more information use Google :-D, or write me at
