Thanks Burlen, Berk, Ken - I'm digesting all the emails and rewriting a bunch of filters and readers to see if I can do everything with the UPDATE_PIECE keys and related exchanges of information.
I'll report back once I make some progress or need help.

cheers

JB

> -----Original Message-----
> From: burlen [mailto:[email protected]]
> Sent: 16 December 2009 22:24
> To: Biddiscombe, John A.
> Cc: Berk Geveci; [email protected]
> Subject: Re: [Paraview] Parallel Data Redistribution
>
> oops, typo: the ooc reader is a vtkObject.
>
> burlen wrote:
> > Hey John,
> >
> >> Also: for dynamic load balancing, I'd like to instruct several
> >> readers to read the same piece - since the algorithm controls (for
> >> example) the particles, the algorithm can internally communicate
> >> information about what to do among its processes, but it can't talk
> >> upstream to the readers and fudge them.
> >>
> >> I am wondering if there is any way of supporting this kind of thing
> >> using the current information keys, and my instinct says no.
> >
> > I guess you can kind of do this with the current "request update"
> > machinery, but thanks to the flexibility of the pipeline information
> > keys and values you can also roll your own very easily.
> >
> > I recently implemented dynamic load balancing in a new stream line
> > tracer. To get the work load balanced, it's crucial that each process
> > have on-demand access to the entire dataset. I accomplished this with
> > information keys and by using a "meta-reader" in place of the
> > traditional ParaView reader. The meta-reader does two things: it
> > populates the new keys, and it gives PV a dummy dataset of one cell
> > per process, such that the bounds, shape, and array names are the
> > same as those of the real dataset, which is not read during the
> > meta-reader's execution. When the stream tracer executes downstream
> > of the meta-reader, it picks the keys out of the pipeline
> > information. The important key/value is an out-of-core (ooc) reader,
> > which is a vtkObject so that it can be passed through the
> > information. Once the stream tracer has it, it can make repeated IO
> > requests as particles move through the dataset, as needed.
> > My interface accepts a point and returns a chunk of data. The ooc
> > reader internally handles caching and memory management. In this way
> > you can keep all processes busy all the time when tracing stream
> > lines. The approach worked out well and was very simple to
> > implement, with no modification to the executive. The filter also
> > has control of the caching and can free all the memory at the end of
> > its execution, which significantly reduces the memory footprint
> > compared to the traditional PV reader. And I need not worry if PV or
> > some upstream filter uses MPI communications in between my IO
> > requests. There is a little more to our scheduling algorithm which I
> > won't discuss now, but so far, for making Poincare maps, we have
> > scaled well up to 2e7 stream lines per frame on 96 processes, and we
> > minimize the memory footprint, which is important to us.
> >
> > Berk and Ken have already basically given you all the options you
> > need, but I add this because it shows how flexible and powerful the
> > pipeline information really is.
> >
> > Burlen
> >
> > Biddiscombe, John A. wrote:
> >> Berk,
> >>
> >> We had a discussion back in 2008, which resides here:
> >> http://www.cmake.org/pipermail/paraview/2008-May/008170.html
> >>
> >> Continuing from this, my question of the other day touches on the
> >> same problem.
> >>
> >> I'd like to manipulate the piece number read by each reader. As
> >> mentioned before, UPDATE_PIECE is not passed into RequestInformation
> >> at first (since nobody knows how many pieces there are yet!), so I
> >> can't (directly) generate information in the reader which is 'piece
> >> dependent'. And I can't be sure that someone doing streaming won't
> >> interfere with piece numbers when using the code differently.
> >>
> >> For the particle tracer (for example), I'd like to tell the upstream
> >> pipeline to read no pieces when certain processes are empty of
> >> particles (currently they update and generate {= read} data when
> >> they don't need to). I may be able to suppress the forward upstream
> >> somehow, but I don't know of an easy way for the algorithm to say
> >> "stop" to the executive, to prevent it updating when the timestep
> >> changes but the algorithm has determined that no processing is
> >> required (the ForwardUpstream of Requests continues unabated). I'd
> >> like to set the UpdatePiece to -1 to tell the executive to stop
> >> operating.
> >>
> >> Also: for dynamic load balancing, I'd like to instruct several
> >> readers to read the same piece - since the algorithm controls (for
> >> example) the particles, the algorithm can internally communicate
> >> information about what to do among its processes, but it can't talk
> >> upstream to the readers and fudge them.
> >>
> >> I am wondering if there is any way of supporting this kind of thing
> >> using the current information keys, and my instinct says no. It
> >> seems like the update piece and numpieces were really intended for
> >> streaming, and we need two kinds of 'pieces': one for streaming,
> >> another for splitting in _parallel_, because they aren't quite the
> >> same. (Please note that I haven't actually tried changing piece
> >> requests in the algorithms yet, so I'm only guessing that it won't
> >> work properly.)
> >>
> >> <cough>
> >> UPDATE_STREAM_PIECE
> >> UPDATE_PARALLEL_PIECE
> >> </cough>
> >>
> >> Comments?
> >>
> >> JB
> >>
> >>> I would have the reader (most parallel readers do this) generate
> >>> empty data on all processes of id >= N. Then your filter can
> >>> redistribute from those N processes to all M processes. I am pretty
> >>> sure RedistributePolyData can do this for polydata as long as you
> >>> set the weight to 1 on all processes. Ditto for D3.
> >>>
> >>> -berk
> >>>
> >>> On Fri, Dec 11, 2009 at 4:13 PM, Biddiscombe, John A.
> >>> <[email protected]> wrote:
> >>>
> >>>> Berk
> >>>>
> >>>>> It sounds like M is equal to the number of processors (pipelines)
> >>>>> and M >> N. Is that correct?
> >>>>
> >>>> Yes, that's the idea. N blocks, broken (in place) into M new
> >>>> blocks, then fanned out to the M processes downstream where they
> >>>> can be processed separately. If it were on a single node, then
> >>>> each block could be a separate 'connection' to a downstream
> >>>> filter, but distributed, an explicit send is needed.
> >>>>
> >>>> JB
> >>>>
> >>>>> -berk
> >>>>>
> >>>>> On Fri, Dec 11, 2009 at 10:40 AM, Biddiscombe, John A.
> >>>>> <[email protected]> wrote:
> >>>>>
> >>>>>> Berk
> >>>>>>
> >>>>>> The data will be UnstructuredGrid for now. Multiblock, but
> >>>>>> actually, I don't really care what each block is, only that I
> >>>>>> accept one block on each of N processes, split it into more
> >>>>>> pieces, and the next filter accepts one (or more, if the numbers
> >>>>>> don't match up nicely) blocks and processes them. The
> >>>>>> redistribution shouldn't care about the data types, only how
> >>>>>> many blocks go in and out.
> >>>>>>
> >>>>>> Looking at RedistributePolyData makes me realize my initial idea
> >>>>>> is no good. In my mind I had a pipeline where multiblock
> >>>>>> datasets are passed down the pipeline and simply the number of
> >>>>>> pieces is manipulated to achieve what I wanted - but I see now
> >>>>>> that if I have M pieces downstream mapped upstream to N pieces,
> >>>>>> what will happen is that the readers will be effectively
> >>>>>> duplicated and M/N readers will read the same pieces.
> >>>>>> I don't want this to happen, as IO will be a big problem if
> >>>>>> readers read the same blocks M/N times.
> >>>>>>
> >>>>>> I was hoping there was a way of simply instructing the pipeline
> >>>>>> to manage the pieces, but I see now that this won't work, as
> >>>>>> there needs to be a specific Send from each of the N processes
> >>>>>> to their M/N receivers (because the data is physically in
> >>>>>> another process, so the pipeline can't see it). This is very
> >>>>>> annoying, as there must be a class which already does this
> >>>>>> (block redistribution, rather than polygon-level
> >>>>>> redistribution), and I would like it to be more 'pipeline
> >>>>>> integrated' so that the user doesn't have to explicitly send
> >>>>>> each time an algorithm needs it.
> >>>>>>
> >>>>>> I'll go through RedistributePolyData in depth and see what I can
> >>>>>> pull out of it - please feel free to steer me towards another
> >>>>>> possibility :)
> >>>>>>
> >>>>>> JB
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Berk Geveci [mailto:[email protected]]
> >>>>>>> Sent: 11 December 2009 16:09
> >>>>>>> To: Biddiscombe, John A.
> >>>>>>> Cc: [email protected]
> >>>>>>> Subject: Re: [Paraview] Parallel Data Redistribution
> >>>>>>>
> >>>>>>> What is the data type? vtkRedistributePolyData and its
> >>>>>>> subclasses do this for polydata. It can do load balancing
> >>>>>>> (where you can specify a weight for each processor) as well.
> >>>>>>>
> >>>>>>> -berk
> >>>>>>>
> >>>>>>> On Fri, Dec 11, 2009 at 9:59 AM, Biddiscombe, John A.
> >>>>>>> <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> I have a filter pipeline which reads N blocks from disk; this
> >>>>>>>> works fine on N processors.
> >>>>>>>
> >>>>>>>> I now wish to subdivide those N blocks (using a custom filter)
> >>>>>>>> to produce new data which will consist of M blocks - where
> >>>>>>>> M >> N.
> >>>>>>>>
> >>>>>>>> I wish to run the algorithm on M processors and have the piece
> >>>>>>>> information transformed between the two filters (reader ->
> >>>>>>>> splitter), so that blocks are distributed correctly. The
> >>>>>>>> reader will read N blocks (leaving M-N processes unoccupied),
> >>>>>>>> but the filter which splits them up needs to output a
> >>>>>>>> different number of pieces and have the full M processes
> >>>>>>>> receiving data.
> >>>>>>>>
> >>>>>>>> I have a reasonably good idea of how to implement this, but
> >>>>>>>> I'm wondering if any filters already do something similar. I
> >>>>>>>> will of course take apart the D3 filter for ideas, but I don't
> >>>>>>>> need to do a parallel spatial decomposition since my blocks
> >>>>>>>> are already discrete - I just want to redistribute the blocks
> >>>>>>>> around and, more importantly, change the number of them
> >>>>>>>> between filters.
> >>>>>>>
> >>>>>>>> If anyone can suggest examples which do this already, please
> >>>>>>>> do.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>>
> >>>>>>>> JB
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> John Biddiscombe, email: biddisco @ cscs.ch
> >>>>>>>> http://www.cscs.ch/
> >>>>>>>> CSCS, Swiss National Supercomputing Centre | Tel: +41 (91) 610.82.07
> >>>>>>>> Via Cantonale, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> Powered by www.kitware.com
> >>>>>>>>
> >>>>>>>> Visit other Kitware open-source projects at
> >>>>>>>> http://www.kitware.com/opensource/opensource.html
> >>>>>>>>
> >>>>>>>> Please keep messages on-topic and check the ParaView Wiki at:
> >>>>>>>> http://paraview.org/Wiki/ParaView
> >>>>>>>>
> >>>>>>>> Follow this link to subscribe/unsubscribe:
> >>>>>>>> http://www.paraview.org/mailman/listinfo/paraview
