It's working, and we are discussing how to go about making it generally available. For the moment, we are happy to have some alpha users - I think you might find it interesting.
The basic idea is that a bunch of extended OpenDX processes run out in distributed-processor space - one the master, responsible for the overall visualization program, and the rest the slaves, each responsible for a subset of the data. "Stand-ins" on the master are very simple OpenDX Field objects that share important characteristics with the represented data - notably the bounding box and the data min/max. The stand-ins carry attributes that link them to the distributed data. These are, first, an identifier used to cache the partitioned data on the slaves (e.g. each slave caches its portion of the data associated with a single stand-in under the same identifier, so broadcasting the identifier allows each slave to access its corresponding data); and second, a data structure containing a list of lists: for each slave, a list of the partitions to be found on that slave, with each partition's min/max.

Three modules implement parallel macros: CallDistributedMacroReturnsStandIn (CDMRS), DistributedMacroArguments (DMA), and DistributedMacroReturnStandIn. Run on the master, CDMRS is passed a macro name and a list of objects, which may be either stand-ins or simple objects. This causes the named macro to be run on each slave. The macro begins with DMA, which has an output for each input to CDMRS (except the macro name): if an input to CDMRS is a simple object, the corresponding output of DMA is the object itself; however, if the input to CDMRS was a stand-in, the corresponding output of DMA on each slave is the local data associated with that stand-in.

Since no Grow/Shrink among the slaves is implemented, each partition must carry whatever overlap is necessary. For some special purposes (notably Streamline) the cell-dependent overlap tags contain 0 for non-overlap cells, or the MPI process ID of the processor that actually owns each overlap cell. Data can be partitioned in several ways.
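To make the stand-in idea concrete, here is a minimal sketch (illustrative Python, not actual DXMPI code - the class and field names are my own assumptions) of the metadata a stand-in carries: the cache identifier that gets broadcast to the slaves, the shared bounding box, and the per-slave list of partitions with their min/max:

```python
# Hypothetical sketch of stand-in metadata; names are assumptions,
# not taken from the DXMPI source.
from dataclasses import dataclass, field

@dataclass
class PartitionInfo:
    partition_id: int
    data_min: float
    data_max: float

@dataclass
class StandIn:
    cache_id: str   # broadcast to slaves so each can find its cached portion
    bbox: tuple     # bounding box shared with the represented data
    # the "list of lists": slave rank -> list of partitions held there
    partitions: dict = field(default_factory=dict)

    def data_min_max(self):
        """Global min/max aggregated over every partition on every slave."""
        infos = [p for plist in self.partitions.values() for p in plist]
        return (min(p.data_min for p in infos),
                max(p.data_max for p in infos))

# On each slave, the partitioned data is cached under the same identifier,
# so broadcasting cache_id is enough for every slave to locate its pieces.
slave_cache = {}  # rank-local: cache_id -> cached partitions

def lookup_local(cache_id):
    return slave_cache.get(cache_id, [])
```

For example, a stand-in whose data is split across two slaves can report the global min/max without touching the distributed data itself:

```python
s = StandIn(cache_id="field_0", bbox=((0, 0, 0), (1, 1, 1)))
s.partitions[0] = [PartitionInfo(0, 0.0, 0.5)]
s.partitions[1] = [PartitionInfo(1, 0.3, 1.0)]
s.data_min_max()   # (0.0, 1.0)
```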
I've created a modified Partition that partitions with overlap, and there's a Distribute module that, given a Group object, distributes the members of the group and creates a stand-in for the distributed result. I also have a distributed regular-grid importer which imports a regular grid stored in a generally accessible file system (GPFS, for example) in parallel. I hope that'll serve as a good template for other apps that need custom parallel import. I also plan to extend it to use MPI-IO.

Some algorithms aren't going to lend themselves to parallel macros. For these, you implement a module that appears to run on the master node, but which can go parallel internally. You have direct access to MPI, but I've also implemented a communications layer providing (for example) OpenDX object passing, point-to-point remote procedure calls, and broadcast remote procedure calls.

I've built a Streamline module that works in this fashion. The master sends each starting point to the slave node where the streamline starts. The slaves receive the starting points and queue them. On each slave, a number of parallel threads retrieve starting points from the queue and track each streamline to a partition boundary. When a boundary is encountered, the endpoint is transmitted to the slave node where the streamline continues (using info from the overlap region) or, if the boundary is external to the overall data volume, a completion message is passed back to the master. When the master has seen enough completions, it creates a stand-in for the distributed streamline geometry and finishes.

There's also a parallel renderer based on binary swapping but supporting transparency, and (of course) you can gather distributed geometry onto a single node to use hardware rendering via the standard OpenDX Image module. I'll be looking at a Chromium renderer as soon as it becomes sufficiently stable.

Stability is an issue in DXMPI.
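The streamline handoff described above can be sketched as a single-process simulation (again an illustrative assumption, not DXMPI source - real DXMPI uses MPI messages and threads, and a real integrator rather than a fixed step). Each "slave" owns an interval of a 1-D domain; a streamline is tracked to its partition boundary and either handed to the owning slave's queue or counted as a completion for the master:

```python
# Hypothetical single-process simulation of the streamline handoff
# protocol; ranks, domain, and step size are made up for illustration.
from collections import deque

RANKS = {0: (0.0, 2.0), 1: (2.0, 4.0)}   # rank -> owned interval

def owner(x):
    """Which rank owns point x, or None if outside the data volume."""
    for rank, (lo, hi) in RANKS.items():
        if lo <= x < hi:
            return rank
    return None

def trace(seeds, step=0.5):
    """Track each seed downstream, handing off at partition boundaries."""
    queues = {rank: deque() for rank in RANKS}
    for s in seeds:                       # master sends seeds to owners
        queues[owner(s)].append(s)
    completions = 0
    points = []
    while completions < len(seeds):
        for rank, q in queues.items():
            while q:
                x = q.popleft()
                lo, hi = RANKS[rank]
                while lo <= x < hi:       # track within this partition
                    points.append(x)
                    x += step             # stands in for real integration
                nxt = owner(x)
                if nxt is None:
                    completions += 1      # external boundary: tell master
                else:
                    queues[nxt].append(x) # hand off to the owning slave
    return completions, points

done, pts = trace([0.0])
```

A seed at 0.0 is tracked through rank 0, handed to rank 1 at the boundary x = 2.0, and completed when it leaves the overall domain at x = 4.0, at which point the master would build the stand-in for the gathered geometry.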
Errors are not handled gracefully, for example, but in general it works pretty well. I build and run it on Linux and AIX.

Greg

Richard Gillilan <[EMAIL PROTECTED]> on 07/13/2001 10:03:43 AM
Please respond to [email protected]
Sent by: [EMAIL PROTECTED]
To: [email protected]
Subject: Re: [opendx-dev] Ideas for X independency

Hi Greg,

What is the state of the MPI version of OpenDX? Soon I'll be working on parallel algorithms on a Linux cluster and would like to understand what OpenDX can currently do. It's been quite a long time since I ran DX on a parallel machine; does the Partition module still work with MPI? Are there other ways to parallelize stuff?

Thanks,
Richard
