Re: [gt-user] Need help: GT4 with MPI and Condor-G
First, try submitting simple jobs and MPI jobs directly through Condor. Once those work, integrate the Condor pool with the GRAM service.

On Wed, Jan 20, 2010 at 1:09 AM, Paula Martinez pm...@uncu.edu.ar wrote:
> Hi All:
>
> We have four machines. The first is configured with Condor-G (Condor 7.4.1). It talks to the second, which runs Globus GT4 (4.2.1). The Globus machine also has Condor configured on it and is the master of the Condor slaves.
>
> We need to execute MPI jobs that are sent from the Condor-G machine to the Globus machine. The Globus machine should then contact the Condor master (on the same machine), which must send the job to the Condor slaves for execution.
>
> We read that this is not possible, that jobs can't be sent from Condor through Globus to be finally executed in a Condor pool, and so far we can't make it work!
>
> Here are some links that we read:
> https://lists.cs.wisc.edu/archive/condor-users/2007-February/msg00012.shtml
> https://lists.cs.wisc.edu/archive/condor-users/2008-May/msg00036.shtml
>
> Ing. Paula Martínez
> ITU - Redes y Telecomunicaciones

--
Best Regards,
Mehdi Sheikhalishahi
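For reference, the first step suggested above (running an MPI job directly in the Condor pool, with no Globus involved) could be tried with a parallel-universe submit description along the lines of the sketch below. The names mp2script and paralelo666 are taken from the submit file quoted elsewhere in this thread; the output, error, and log file names are only placeholders. It assumes the pool already has a dedicated scheduler configured, which the parallel universe requires.

```
# Sketch only: direct submission to the local Condor pool, no Globus.
universe                = parallel
executable              = mp2script
arguments               = paralelo666 512
machine_count           = 2
should_transfer_files   = yes
when_to_transfer_output = on_exit
transfer_input_files    = paralelo666
# $(NODE) expands to the node number within the parallel job.
output                  = outfile.$(NODE)
error                   = errfile.$(NODE)
log                     = logfile
queue
```

If this runs on two slots as expected, the pool itself is fine and any remaining problem lies in the Condor-G or GRAM layer.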
Re: [gt-user] Need help: GT4 with MPI and Condor-G
What is the error? The latest version of GT5 includes some bug fixes. It is worth looking through them, since your problem may be one of those bugs. Also consider switching to GT5.

On Thu, Jan 21, 2010 at 3:36 PM, Paula Martinez pm...@uncu.edu.ar wrote:
> Hi, thanks for your answer.
>
> This architecture works fine with jobs submitted in the vanilla universe. The problem is with the MPI parallel universe: we don't know how to tell Condor-G to submit the job in the parallel universe instead of the vanilla universe.
>
> Simple jobs and MPI jobs executed in the Condor pool work fine (without Globus). MPI jobs submitted from Condor-G are executed in Globus, not in the remote Condor pool, and we don't know how to change that!
>
> We tried to use the following example:
>
>     universe = grid
>     grid_resource = condor condor globus01
>     globusrsl = (jobType=mpi) (count=2)
>     remote_universe = parallel
>     globus_xml = <count>2</count><jobType>mpi</jobType>
>     machine_count = 2
>     executable = mp2script
>     Arguments = paralelo666 512
>     should_transfer_files = yes
>     transfer_input_files = paralelo666, mp2script
>     transfer_executable = false
>     when_to_transfer_output = on_exit
>     output = outfile_grid.$(Process)
>     error = errfile_grid.$(Process)
>     log = logfile_grid
>     queue
>
> Thanks Mehdi!
>
> Ing. Paula Martinez
> ITU - Redes y Telecomunicaciones

--
Best Regards,
Mehdi Sheikhalishahi
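One detail worth checking in the submit file above: in Condor-G, `grid_resource = condor ...` selects the Condor-C grid type, which talks to a remote schedd directly and does not go through GRAM. Jobs meant for GT4 WS-GRAM use the `gt4` grid type, followed by the service address and the name of the local scheduler adapter. A hedged sketch of what that might look like is below. The host name globus01 comes from this thread; port 8443 and the service path are the usual GT4 container defaults and are assumptions here. Whether the Condor adapter on the Globus machine turns `jobType` mpi into a parallel-universe job depends on the adapter version installed, and nothing in this thread confirms it.

```
# Sketch only: Condor-G submission through GT4 WS-GRAM to the Condor adapter.
universe                = grid
# Assumed address: default GT4 container port and factory service path.
grid_resource           = gt4 https://globus01:8443/wsrf/services/ManagedJobFactoryService Condor
# Extra elements added to the WS-GRAM job description.
globus_xml              = <count>2</count><jobType>mpi</jobType>
executable              = mp2script
arguments               = paralelo666 512
should_transfer_files   = yes
transfer_input_files    = paralelo666
when_to_transfer_output = on_exit
output                  = outfile_grid.$(Process)
error                   = errfile_grid.$(Process)
log                     = logfile_grid
queue
```

A valid grid proxy is needed before submitting, and the trailing `Condor` only works if the Condor job manager adapter is installed on the Globus machine.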
Re: [gt-user] Need help: GT4 with MPI and Condor-G
It sounds like you need to configure the second machine with a Condor batch adapter. The Open Science Grid compute element instructions show how to do this. (If you want a lighter-weight implementation, you can use the TIGRE or SURAgrid stacks; ask for details if you need them.)

https://twiki.grid.iu.edu/bin/view/ReleaseDocumentation/ComputeElementInstall
(See the "Install job managers" section.)

Hope this helps.

Alan Sill, Ph.D
Senior Scientist, High Performance Computing Center
Adjunct Professor of Physics
TTU
: Alan Sill, Texas Tech University  Office: Admin 233, MS 4-1167 :
: e-mail: alan.s...@ttu.edu  ph. 806-742-4350  fax 806-742-4358 :