On Jul 9, 2010, at 3:23 PM, Jerome Soumagne wrote:

> Hi Ken,
> 
> Thank you very much for your reply; I will think about it and do some more 
> tests. I was only thinking about using MPI threads, but yes, as you say, if two 
> threads are scheduled on the same core, that would not be pretty at all. I can 
> probably run some more tests of that functionality, but I don't expect 
> great results.
> 
> I'm not sure I correctly understand what you say about spawn. I found a 
> presentation on the web by Richard Graham saying that the spawn functionality 
> is implemented, and that presentation also says you get full MPI-2 support on 
> the Cray XT. When I said that I had problems with the MPI_Comm_accept/connect 
> functions, I meant that I actually get errors when I try to do a "simple" 
> MPI_Open_port. Do you know where in the code I can find out whether this 
> function is implemented or not? If it is implemented, knowing where it is 
> defined would help me track down the origin of my problem and possibly extend 
> support for this functionality (if that is feasible). I would like to be able 
> to link two different jobs together using these functions, i.e. create a 
> communicator between the jobs.

It is implemented in ompi/mca/dpm/orte. I believe it isn't supported for the 
reasons Ken described.
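
In case it helps to pin down where MPI_Open_port fails for you, the pattern
below is the standard MPI-2 accept/connect sequence I would test with. It is
only a sketch: the "server"/"client" roles and the way the port string gets
from one job to the other (command line here; a file or MPI_Publish_name in
practice) are placeholders, not anything specific to the Cray port.

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    MPI_Comm intercomm;
    char port_name[MPI_MAX_PORT_NAME];

    MPI_Init(&argc, &argv);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        /* this is the call that reportedly errors out on the XT;
         * only the root's port is actually used by MPI_Comm_accept */
        MPI_Open_port(MPI_INFO_NULL, port_name);
        printf("port: %s\n", port_name);
        MPI_Comm_accept(port_name, MPI_INFO_NULL, 0,
                        MPI_COMM_WORLD, &intercomm);
    } else if (argc > 2 && strcmp(argv[1], "client") == 0) {
        /* the second job connects using the port string from the first */
        strncpy(port_name, argv[2], MPI_MAX_PORT_NAME - 1);
        port_name[MPI_MAX_PORT_NAME - 1] = '\0';
        MPI_Comm_connect(port_name, MPI_INFO_NULL, 0,
                         MPI_COMM_WORLD, &intercomm);
    } else {
        MPI_Finalize();
        return 1;
    }

    /* intercomm now spans both jobs */
    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}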

> 
> Thanks,
> 
> Jerome
> 
> On 07/09/2010 07:16 PM, Matney Sr, Kenneth D. wrote:
>> Hello Jerome,
>> 
>> The first one is simple: Portals is not thread-safe on the Cray XT. As I 
>> recall, only the master thread can post an event, although any thread can 
>> receive the event. Then again, I might have it backwards; it has been a 
>> couple of years since I played with this.
>> 
>> The second one depends on how you use your Cray XT. In our case, the machine 
>> is run process-per-core, i.e., not as a collection of SMPs. For performance 
>> reasons, you definitely do not want MPI threads. Also, since it is run 
>> process-per-core, there is nothing to be gained from progress threads; 
>> Portals events will generate a kernel-level interrupt. Whether you can run 
>> the XT as a cluster of SMPs is another question entirely. We really have not 
>> tried this in the context of OMPI, but in conjunction with Portals it might 
>> open a "can of worms". For example, any thread can be run on any core, but 
>> the Portals ID for a thread will be the NID/PID pair for that core. If two 
>> threads get scheduled to the same core, it would not be pretty.
>> 
>> I could see lots of reasons why spawn might fail. First, it is run on a 
>> compute node, and there is no way for a compute node to launch a process on 
>> another compute node. Also, there will be no rank/size initialization 
>> forthcoming from ALPS. So even if it got past this, it would be running on 
>> the same node as its parent.
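>> 
>> To be concrete, even a minimal spawn of the shape below would hit those 
>> problems on a compute node (just a sketch; "./child" and the process count 
>> are placeholders):
>> 
>> #include <mpi.h>
>> 
>> int main(int argc, char **argv)
>> {
>>     MPI_Comm children;
>> 
>>     MPI_Init(&argc, &argv);
>>     /* On the XT, the compute node has no way to launch "./child" on another
>>      * node, and ALPS would not provide the children with rank/size
>>      * initialization. */
>>     MPI_Comm_spawn("./child", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
>>                    0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);
>>     MPI_Finalize();
>>     return 0;
>> }
>> 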
>> -- Ken Matney, Sr.
>>    Oak Ridge National Laboratory
>> 
>> 
>> On Jul 9, 2010, at 7:53 AM, Jerome Soumagne wrote:
>> 
>> Hi,
>> 
>> As I said in the previous e-mail, we've recently installed OpenMPI on a Cray 
>> XT5 machine, and we therefore use the Portals and ALPS libraries. Thanks for 
>> providing the configuration script from Jaguar; it was very helpful and only 
>> had to be slightly adapted to use the latest CNL version installed on this 
>> machine.
>> 
>> I have some questions, though, regarding the use of the Portals btl and mtl 
>> components. I noticed that when I compiled OpenMPI with MPI thread support 
>> enabled and ran a job, the Portals components did not want to initialize 
>> because of these funny lines:
>> 
>> ./mtl_portals_component.c
>>     /* we don't run with no stinkin' threads */
>>     if (enable_progress_threads || enable_mpi_threads) return NULL;
>> 
>> I'd like to know why MPI threads are disabled here, since threads are 
>> supported on the XT5. Does the btl/mtl require thread-safety to be 
>> implemented, or something along those lines, or is it because of the Portals 
>> library itself?
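>> 
>> For reference, the thread-enabled initialization I have in mind is nothing 
>> more exotic than this (minimal sketch; I assume it is this kind of request, 
>> in a build with MPI thread support, that sets enable_mpi_threads and makes 
>> the check above return NULL):
>> 
>> #include <mpi.h>
>> #include <stdio.h>
>> 
>> int main(int argc, char **argv)
>> {
>>     int provided;
>> 
>>     /* ask for full thread support and see what the library actually grants */
>>     MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>>     printf("provided thread level: %d\n", provided);
>>     MPI_Finalize();
>>     return 0;
>> }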
>> 
>> I would also like to use the MPI_Comm_accept/connect functions. It seems 
>> that this is not possible with the Portals mtl, even though spawn appears to 
>> be supported. Did I do something wrong, or is it really not supported? If it 
>> is not, would it be possible to extend this module to support these 
>> functions? We could help with that.
>> 
>> I'd also like to know: are there any plans to create a module that uses the 
>> DMAPP interface for the Gemini interconnect?
>> 
>> Thanks.
>> 
>> Jerome
>> 
>> 
>> --
>> Jérôme Soumagne
>> Scientific Computing Research Group
>> CSCS, Swiss National Supercomputing Centre
>> Galleria 2, Via Cantonale  | Tel: +41 (0)91 610 8258
>> CH-6928 Manno, Switzerland | Fax: +41 (0)91 610 8282
>> 
> 

