OK, thanks for both explanations.
Actually, after installing the LESC packages, we first have to
run: gpt-postinstall.
Then we can run:
cd $GLOBUS_LOCATION/setup/globus/
setup-globus-job-manager-sge --mpi-pe=XXXXXX
where XXXXXX is one of the available PEs returned by "qconf -spl". If
we then open sge.pm, we see the variable set as $mpi_pe='XXXXXX';.
After that, everything works fine.
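
For reference, the steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: pe_is_available and configure_sge_mpi_pe are hypothetical helper names that exist in neither Globus nor SGE, and calling the setup script as ./setup-globus-job-manager-sge assumes it sits in the directory we cd into.

```shell
#!/bin/sh
# Sketch only: both function names below are hypothetical helpers.

# Succeeds if the PE name in $1 appears as a whole line on stdin.
# stdin is expected to be the output of "qconf -spl" (one PE per line).
pe_is_available() {
    grep -Fxq -- "$1"
}

# Runs the steps from this message for the PE name given in $1.
configure_sge_mpi_pe() {
    pe="$1"
    if ! qconf -spl | pe_is_available "$pe"; then
        echo "PE '$pe' is not listed by qconf -spl" >&2
        return 1
    fi
    gpt-postinstall || return 1
    # Subshell so the caller's working directory is left unchanged.
    # Assumption: the setup script lives in this directory.
    ( cd "$GLOBUS_LOCATION/setup/globus/" &&
      ./setup-globus-job-manager-sge --mpi-pe="$pe" )
}
```

Usage would be something like "configure_sge_mpi_pe mpi", with "mpi" replaced by a PE name that qconf -spl actually lists on your cluster.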
Thanks,
Francois.
On 7/27/07, Wilfred Li <[EMAIL PROTECTED]> wrote:
Hi,
Stuart is right, the original error message returned by SGE indicates
that the appropriate parallel environment wasn't set up for MPI.
#to see what PE are available:
#qconf -spl
Check your script to see which PE (the -pe parameter) you are requesting; you
can easily adapt one of the existing ones.
#to see the details of the "mpi" PE:
#qconf -sp mpi
#to modify a PE
#qconf -mp mpi
Please see the man pages of qconf for other details.
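
To make the check above concrete, here is a rough shell sketch that pulls the requested PE out of a job script and compares it with a list of PE names. It assumes the script requests the PE through an embedded "#$ -pe <name> <slots>" directive (SGE's default directive prefix); requested_pe and check_requested_pe are hypothetical helper names, not part of SGE.

```shell
#!/bin/sh
# Sketch only: both function names below are hypothetical helpers.

# Print the PE name from the first "#$ -pe <name> <slots>" line on stdin.
requested_pe() {
    awk '$1 == "#$" && $2 == "-pe" { print $3; exit }'
}

# $1: path to the job script
# $2: PE names, one per line (for example the output of "qconf -spl")
check_requested_pe() {
    pe=$(requested_pe < "$1")
    if [ -z "$pe" ]; then
        echo "no -pe request found in $1"
        return 0
    fi
    if printf '%s\n' "$2" | grep -Fxq -- "$pe"; then
        echo "PE '$pe' is available"
    else
        echo "PE '$pe' is not configured"
        return 1
    fi
}
```

On a machine with SGE installed, this could be called as: check_requested_pe job.sh "$(qconf -spl)"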
Regards,
Wilfred
-----Original Message-----
From: [EMAIL PROTECTED]
[mailto: [EMAIL PROTECTED] On Behalf Of Stuart Martin
Sent: Friday, July 27, 2007 8:15 AM
To: charles bacon; Francois Hornoy
Cc: globus user; R. Jeff Porter
Subject: Re: [gt-user] GT & SGE
Charles: I think you're confusing "multi-jobs (MMJS) vs.
an individual job (MEJS)" with, for an MEJS job,
"jobtype=multiple vs. jobtype=single".
Calling all SGE folks:
The default jobtype for an MEJS job is "multiple",
meaning that when you submit a job with count 4, 4
nodes/cpus will be allocated and 4 copies of the
executable will be started (one for each node/cpu). In
PBS we have some code to ssh/rsh to each node to start
the job. In SGE, I am not sure how it is done. There seems
to be some confusion in the SGE script about what to
do with jobtype multiple. If the SGE script depends
on an SGE parallel environment (PE) being set up in
order to process jobtype multiple, then that is a
current dependency and the PE needs to be set up. Simple as
that. I don't know whether a PE comes by default
with certain versions of SGE or how it is set up. Are
there others who can shed some light on how this is
done in SGE? Is a PE typically set up?
Is it easy to do?
An alternative to being dependent on the PE environment
would be to process jobtype=multiple jobs without it.
For example, maybe something similar to the PBS ssh/rsh
code can be written to start the application processes
on each allocated node?
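
As a very rough illustration of that alternative, the sketch below starts one copy of a command on each host in a list, without relying on an SGE PE. It is not the PBS adapter's code: start_on_each_host and the DRY_RUN switch are hypothetical, and how the list of allocated hosts would be obtained from SGE is left open.

```shell
#!/bin/sh
# Sketch only: start_on_each_host and DRY_RUN are hypothetical names.

# Start one copy of a command per host.
# Hosts are read from stdin, one per line; the command is given as arguments.
# Set DRY_RUN=1 to print the ssh commands instead of running them.
start_on_each_host() {
    while IFS= read -r host; do
        # Skip empty lines in the host list.
        [ -n "$host" ] || continue
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ssh $host $*"
        else
            # -n keeps ssh from swallowing the rest of the host list on stdin.
            ssh -n "$host" "$@" &
        fi
    done
    # Wait for all background ssh sessions to finish.
    wait
}
```

For example, "printf 'node1\nnode2\n' | DRY_RUN=1 start_on_each_host /bin/hostname" prints the two ssh commands that would be run for a job with count 2.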
Q: Is a PE required for processing GRAM
jobtype=MPI jobs? Sounds like it would be, so this
would indicate that a PE should always be
set up for a GRAM installation. If so, then this
dependency just needs to be made more explicit and documented.
-Stu
On Jul 27, 2007, at 7:41 AM, Charles Bacon wrote:
> The globus-job-manager creates it, I believe. What I was suggesting,
> though, was to just add a line to the sge.pm that did something like:
>
> if ( $jobtype eq "multiple" && $count == 1 ) {
>     $jobtype = "single";
> }
>
> I am confused about why the jobtype is coming in as multiple in the
> description, though. As far as I know, this should be coming in as
> single when you submit something like -c /bin/hostname. Maybe Martin
> or Stu can comment on that.
>
>
> Charles
>
> On Jul 27, 2007, at 12:29 PM, Francois Hornoy wrote:
>
>>
>> Hum ok, thank you.
>>
>> It seems that the default jobtype is "multiple", as we can see in
>> the file:
>> include/gcc64dbg/globus_gram_protocol.h, line 328:
>> #define GLOBUS_GRAM_PROTOCOL_DEFAULT_JOBTYPE "multiple"
>>
>> I've tried to "grep" in the sources of the Globus and LESC packages, and
>> did not find that GLOBUS_GRAM_PROTOCOL_DEFAULT_JOBTYPE. So maybe they
>> did not set anything, and by default it's set to "multiple". I don't
>> know.
>>
>> So, who generates that perl $description? "grep" did not help me
>> much. I understand that the sge.pm reads this file, but who
>> generates it?
>>
>> Thanks for helping,
>> Francois.
>>
>>
>> On 7/27/07, Charles Bacon < [EMAIL PROTECTED]> wrote:
>> On Jul 27, 2007, at 11:49 AM, Francois Hornoy wrote:
>>
>>> On 7/27/07, Charles Bacon < [EMAIL PROTECTED]> wrote:
>>> As the SGE module isn't ours, I don't have any reason why it would
>>> be setting the jobtype to multiple here. If I were you, I would
>>> just go into the sge.pm file and make it so it didn't set my jobtype
>>> to multiple unless I asked it to. :-)
>>>
>>> Hehe ok. So, you mean that, in my SGE case, all the perl
>>> description (thus, "jobtype" in particular) is set in the LESC
>>> packages and not in yours?
>>
>> That's what I'm thinking. I send /bin/hostname jobs to fork and pbs
>> adapters, and don't hit a jobtype of multiple. I know that SGE in
>> particular has a jobarray type that some SGE adapters call multiple,
>> and others don't. This is one of the reasons there is more than one
>> SGE adapter, because people have made different decisions from each
>> other.
>>
>>> Or the problem could be in "your" code?
>>
>> It's definitely possible, but I find it unlikely as it stands.
>>
>>
>> Charles
>>
>>
>