Hi Herve

Sorry you are experiencing these problems. Part of the problem is that I
have no access to a BJS machine. I suspect the issue you are encountering is
that our interface to BJS may not be correct - the person who wrote it, I
believe, may have used the wrong environment variables. At least, that is
what some of the Bproc folks have said.
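
For what it's worth, the interface basically has to read the allocation that
bjs publishes in the environment. A minimal sketch of that step (illustration
only - this is not our actual code, and the variable name itself may be where
the bug lies) would be:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse a bjs-style allocation such as NODES="0,1" into node ids.
 * Illustrative only: bjs may also use range syntax, and the real
 * Open MPI launcher reads the environment through its own framework. */
int main(void)
{
    char *nodes = getenv("NODES");
    char *copy, *tok;

    if (NULL == nodes) {
        fprintf(stderr, "no NODES in the environment - no allocation\n");
        return 1;
    }
    copy = strdup(nodes);
    for (tok = strtok(copy, ","); NULL != tok; tok = strtok(NULL, ",")) {
        printf("allocated node %d\n", atoi(tok));
    }
    free(copy);
    return 0;
}

With your reported NODES=0, a parser like this would see exactly one node,
which is consistent with the "Out of resource" errors you report below.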

Let me look into this a little more - no point in you continuing to thrash
on this. I'll challenge the Bproc folks to give me access to a BJS machine.

Again, I'm sorry for the trouble.
Ralph


On 11/7/06 7:27 AM, "hpe...@infonie.fr" <hpe...@infonie.fr> wrote:

> Hi Ralph, sorry for the delay in replying, but I have had some difficulty
> accessing the internet since yesterday.
> 
> I have tried all your suggestions, but I continue to experience problems.
> On the one hand I have a problem with bjs, which I may take to a bproc
> forum; on the other hand, I still have the spawn problem.
> 
> Let's focus first on the spawn problem.
> 
> Even with a "bjssub -i bash" or a "bjssub -n 1 -i bash" command, I still get
> the following output:
> mpirun -np 1 main_exe machine10
> main_exe: Begining of main_exe
> main_exe: Call MPI_Init
> main_exe: MPI_Info_set soft result=0
> main_exe: MPI_Info_set node result=0
> main_exe: Call MPI_Comm_spawn_multiple()
> --------------------------------------------------------------------------
> Some of the requested hosts are not included in the current allocation for the
> application:
>   ./spawned_exe
> The requested hosts were:
>   machine10
> 
> Verify that you have mapped the allocated resources properly using the
> --host specification.
> --------------------------------------------------------------------------
> [setics10:07250] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
> base/rmaps_base_node.c at line 210
> [setics10:07250] [0,0,0] ORTE_ERROR_LOG: Out of resource in file rmaps_rr.c at
> line 331
> 
> This problem occurs whether or not the slave node is on the same machine as
> the master.
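> 
> In case it is useful, the spawn section of main_exe is essentially the
> sketch below (simplified from my real code; the second info key carries the
> machine name - in the sketch I use the standard "host" key for it):
> 
> #include <mpi.h>
> 
> /* Simplified sketch of main_exe's spawn logic. */
> int main(int argc, char **argv)
> {
>     MPI_Comm intercomm;
>     MPI_Info info;
>     MPI_Info infos[1];
>     char *cmds[1] = { "./spawned_exe" };
>     int maxprocs[1] = { 1 };
>     int errcodes[1];
> 
>     MPI_Init(&argc, &argv);
>     MPI_Info_create(&info);
>     MPI_Info_set(info, "soft", "1");      /* "MPI_Info_set soft" in the log */
>     MPI_Info_set(info, "host", argv[1]);  /* e.g. machine10 */
>     infos[0] = info;
>     MPI_Comm_spawn_multiple(1, cmds, MPI_ARGVS_NULL, maxprocs, infos,
>                             0, MPI_COMM_SELF, &intercomm, errcodes);
>     MPI_Info_free(&info);
>     MPI_Finalize();
>     return 0;
> }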
> 
> On the bjs side of the problem: I ran bjssub under gdb and could see that
> execution never reached the code that sets the NODES environment variable,
> so I was left with the default value NODES=0.
> 
> The question is: is the spawn problem a consequence of the bjs problem, or
> are they two independent problems?
> 
> It would be good to find some other people with a Debian platform running
> bproc, bjs and Open MPI, so that we could check whether I made a mistake
> during the installation or whether there is really an incompatibility
> problem in Open MPI.
> 
> Thank you so much for all your support, even if it has not been successful
> yet.
> 
> Regards.
> 
> Herve
> 
> Date: Fri, 03 Nov 2006 14:10:20 -0700
> From: Ralph H Castain <r...@lanl.gov>
> Subject: Re: [OMPI users] MPI_Comm_spawn multiple bproc support
> To: "Open MPI Users <us...@open-mpi.org>" <us...@open-mpi.org>
> Message-ID: <c170fe4c.59b3%...@lanl.gov>
> Content-Type: text/plain; charset="ISO-8859-1"
> 
> Okay, I picked up some further info that may help you.
> 
>>> The "bjsub -i /bin/env" only sets up the NODES for the session of
>>> /bin/env. Probably what he wants is "bjssub -i /bin/bash" and start
>>> bpsh/mpirun from the new shell.
> 
> I would recommend doing as they suggest. Also, they noted that you did not
> specify the number of nodes you wanted on the bjssub command line. As a
> result, the system gave you only one node (hence NODES=0 instead of
> NODES=0,1).
> 
> If you do a "man bjssub", or a "bjssub --help", you should (hopefully) find
> out how to specify the desired number of nodes.
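> 
> For example, combining the flags already used in this thread, something
> like:
> 
>   bjssub -n 2 -i /bin/bash
>   mpirun -np 1 main_exe machine10
> 
> (with mpirun run from the shell that bjssub starts) should hand mpirun an
> allocation of two nodes - 0 and 1 - for the spawned process to land on.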
> 
> Hope that helps.
> Ralph
> 
> 
> On 11/2/06 6:46 AM, "Ralph Castain" <r...@lanl.gov> wrote:
> 
>> I truly appreciate your patience. Let me talk to some of our Bproc folks and
>> see if they can tell me what is going on. I agree - I would have expected
>> the NODES to be 0,1. The fact that you are getting just 0 explains the
>> behavior you are seeing with Open MPI.
>> 
>> I also know (though I don't know the command syntax) that you can get a
>> long-term allocation from bjs (i.e., one that continues until you log out).
>> Let me dig a little and see how that is done.
>> 
>> Again, I appreciate your patience.
>> Ralph
>> 
>> 
>> On 11/2/06 6:32 AM, "hpe...@infonie.fr" <hpe...@infonie.fr> wrote:
>> 
>>> Hi again Ralph,
>>> 
>>>> I gather you have access to bjs? Could you use bjs to get a node
>>>> allocation,
>>>> and then send me a printout of the environment?
>>> 
>>> I have slightly changed my cluster configuration to something like this:
>>> the master is running on a machine called machine10
>>> node 0 is running on a machine called machine10 (the same as the master)
>>> node 1 is running on a machine called machine14
>>> 
>>> Nodes 0 and 1 are up.
>>> 
>>> My bjs configuration allocates nodes 0 and 1 to the default pool
>>> <--------------->
>>> pool default
>>>       policy simple
>>>       nodes 0-1
>>> <----------------->
>>> 
>>> By default, when I run "env" in a terminal, the NODES variable is not
>>> present. If I run env under a job submission command like "bjssub -i env",
>>> then I can see the following new environment variables:
>>> NODES=0
>>> JOBID=27 (for instance)
>>> BPROC_RANK=0000000
>>> BPROC_PROGNAME=/usr/bin/env
>>> 
>>> When the command is over, NODES is unset again.
>>> 
>>> What is strange is that I would have expected NODES=0,1. I do not know
>>> whether other bjs users see the same behaviour.
>>> 
>>> Hopefully, this is the kind of information you were expecting.
>>> 
>>> Regards.
>>> 
>>> Herve
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
> 
> 
> 
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Sat, 4 Nov 2006 14:04:54 +0100 (CET)
> From: <pgar...@eside.deusto.es>
> Subject: [OMPI users] Technical inquiry
> To: us...@open-mpi.org
> Message-ID: <4444924729pgar...@eside.deusto.es>
> Content-Type: text/plain; charset="iso-8859-1"
> 
> 
> Hi, everybody. Good afternoon.
> 
> I've just configured and installed openmpi-1.1.2 on a Kubuntu GNU/Linux
> system, and I'm now trying to compile the hello.c example, without success.
> 
>> root@kubuntu:/home/livestrong/mpi/test# uname -a
>> Linux kubuntu 2.6.15-23-386 #1 PREEMPT Tue May 23 13:49:40 UTC 2006
>> i686 GNU/Linux
> 
> Hello.c
> -------
> #include "/usr/lib/mpich-mpd/include/mpi.h"
> #include <stdio.h>
> int main (int argc, char** argv)
> {
>         MPI_Init(&argc, &argv);
>         printf("Hello world.\n");
>         MPI_Finalize();
>         return(0);
> }
> 
> The error I'm getting is this:
> 
> root@kubuntu:/home/livestrong/mpi/prueba# mpirun -np 2 hello
> 0 - MPI_INIT : MPIRUN chose the wrong device ch_p4; program needs
> device ch_p4mpd
> /usr/lib/mpich/bin/mpirun.ch_p4: line 243: 16625 Segmentation
> fault  "/home/livestrong/mpi/prueba/hello" -p4pg
> "/home/livestrong/mpi/prueba/PI16545" -p4wd "/home/livestrong/mpi/prueba"
> 
> Does anybody know what the problem might be?
> 
> Regards and thank you very much in advance.
> 
> Pablo.
> 
> PS: I am sending the ompi_info output and the config.log to you.
> 
> 


