Kenneth, please note that the slides of the conference program have been uploaded to http://www.eurompi2010.org/program/
Best regards,
Rainer

On Wednesday 22 September 2010 17:53:12 Kenneth Lloyd wrote:
> Jeff,
>
> Is that EuroMPI2010 ob1 paper publicly available? I get involved in
> various NUMA partitioning/architecting studies and it seems there is
> not a lot of discussion in this area.
>
> Ken Lloyd
>
> ==================
> Kenneth A. Lloyd
> Watt Systems Technologies Inc.
>
> -----Original Message-----
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
> Behalf Of Jeff Squyres
> Sent: Wednesday, September 22, 2010 6:00 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] How to add a schedule algorithm to the pml
>
> Sorry for the delay in replying -- I was in Europe for the past two
> weeks; travel always makes me waaaay behind on my INBOX...
>
> On Sep 14, 2010, at 9:56 PM, 张晶 wrote:
> > I tried to add a schedule algorithm to the pml component (ob1, etc.).
> > Unfortunately, I could only find a paper named "Open MPI: A Flexible
> > High Performance MPI" and some annotations in the source files. From
> > them, I know that ob1 has implemented round-robin and weighted
> > distribution algorithms. But after tracing MPI_Send(), I can't figure
> > out where these are implemented, let alone how to add a new schedule
> > algorithm. I have two questions:
> > 1. Where is the schedule algorithm located?
>
> It's complicated -- I'd say that the PML is probably among the most
> complicated sections of Open MPI because it is the main "engine" that
> enforces the MPI point-to-point semantics. The algorithm is fairly well
> distributed throughout the PML source code. :-\
>
> > 2. There are five components -- cm, crcpw, csum, ob1, v -- in the pml
> > framework. What is the function of each of these components?
>
> cm: this component drives the MTL point-to-point components. It is
> mainly a thin wrapper for network transports that provide their own
> MPI-like matching semantics.
> Hence, most of the MPI semantics are effectively done in the lower
> layer (i.e., in the MTL components and their dependent libraries). You
> probably won't be able to do much here, because such transports (MX,
> Portals, etc.) do most of their semantics in the network layer -- not
> in Open MPI. If you have a matching network layer, this is the PML
> that you probably use (MX, Portals, PSM).
>
> crcpw: this is a fork of the ob1 PML; it adds some failover semantics.
>
> csum: this is also a fork of the ob1 PML; it adds checksumming
> semantics (so you can tell if the underlying transport had an error).
>
> v: this PML uses logging and replay to effect some level of fault
> tolerance. It's a distant fork of the ob1 PML, but has quite a few
> significant differences.
>
> ob1: this is the "main" PML that most users use (TCP, shared memory,
> OpenFabrics, etc.). It gangs together one or more BTLs to send/receive
> messages across individual network transports. Hence, it supports true
> multi-device/multi-rail algorithms. The BML (BTL multiplexing layer)
> is a thin management layer that marshals all the BTLs in the process
> together -- it's mainly array handling, etc. The ob1 PML is the one
> that decides multi-rail/device splitting, etc. The INRIA folks just
> published a paper last week at Euro MPI about adjusting the ob1
> scheduling algorithm to also take NUMA/NUNA/NUIOA effects into
> account, not just raw bandwidth calculations.
>
> Hope this helps!

--
----------------------------------------------------------------
Dr.-Ing. Rainer Keller        http://www.hlrs.de/people/keller
HLRS                          Tel: ++49 (0)711-685 6 5858
Nobelstrasse 19               Fax: ++49 (0)711-685 6 5832
70550 Stuttgart               email: kel...@hlrs.de
Germany                       AIM/Skype: rusraink