Not quite, Josh - I fixed it in our branch. Will send you a revised patch that does the job off-list for your review.
Thanks
Ralph

On 5/9/08 9:35 AM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:

> Ok I think I understand the problem a bit better now. I attached a
> patch that should fix this, but I want you to check it out before I
> commit just to make sure.
>
> If you specify '-mca filter xml' on the command line then only the
> 'xml' component should be opened by mca_base_open. The problem was
> that the selection logic used -1 as the lowest acceptable priority,
> which conflicted with the set priority of the 'xml' component. This
> patch sets this value to INT32_MIN which should be well below any
> negative priority that a component would set for itself.
>
> Let me know if this works for you and I'll commit it.
>
> Cheers,
> Josh
>
> On May 9, 2008, at 11:14 AM, Ralph Castain wrote:
>
>> Sure - take a look at the hg repository Jeff and I are working on:
>>
>> http://www.open-mpi.org/hg/hgwebdir.cgi/rhc/channel
>>
>> The opal/mca/filter framework illustrates the problem. I have one
>> component in there right now, with a default module defined in the
>> base. That component must only be selected if the user calls it.
>> With the current select logic, I can't do this - if the priority
>> is >= 0, then it is always automatically selected; if the priority
>> is < 0, it is never selectable, even if specified.
>>
>> Thanks
>> Ralph
>>
>> On 5/9/08 8:52 AM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:
>>
>>> Ralph,
>>>
>>> Can you give me an example of a component that I can look at? It
>>> will allow me to test the fix before committing, and to better
>>> understand the problem.
>>>
>>> -- Josh
>>>
>>> On May 9, 2008, at 10:41 AM, Ralph Castain wrote:
>>>
>>>> I just hit a problem with this logic - should be a minor change.
>>>>
>>>> We have several frameworks where we have components that are only
>>>> allowed to be selected if the user specifically requests them by
>>>> stating -mca foo bar.
>>>> Because it is possible for there to be no other components that
>>>> want to be selected, and because it is permissible for no
>>>> components to be selected for that framework, we set bar's
>>>> priority to be -1.
>>>>
>>>> The new select logic will not allow a negative priority to be
>>>> selected, even if the user specifically requested that component.
>>>>
>>>> If we set the priority to be 0, then the system will allow the
>>>> component to be automatically selected. This is not allowed as it
>>>> can lead to bad behavior.
>>>>
>>>> So what we need the select system to do is say "if someone
>>>> specified a specific component, don't worry about the returned
>>>> priority - just use it"
>>>>
>>>> Josh: could you please modify this?
>>>>
>>>> Thanks!
>>>> Ralph
>>>>
>>>> On 5/8/08 7:04 PM, "Pak Lui" <pak....@sun.com> wrote:
>>>>
>>>>> Thanks very much Josh! Will try it out soon.
>>>>>
>>>>> Josh Hursey wrote:
>>>>>> Sorry about that. I didn't test that type of option. It should
>>>>>> be working in r18418. Let me know if you see any more issues.
>>>>>>
>>>>>> -- Josh
>>>>>>
>>>>>> On May 8, 2008, at 6:04 PM, Pak Lui wrote:
>>>>>>
>>>>>>> I think I have a problem but I am not sure. I used to be able
>>>>>>> to use the circumflex (^) to switch between the gridengine
>>>>>>> launcher and the ssh launchers by doing something like this,
>>>>>>> e.g. -mca plm ^gridengine, to exclude some of the components
>>>>>>> plm (and also in ras). It doesn't seem like the 'negate' is in
>>>>>>> mca_base_component anymore. I guess I just have to spell out
>>>>>>> which component I want explicitly...
>>>>>>>
>>>>>>> Josh Hursey wrote:
>>>>>>>> This has been committed in r18381
>>>>>>>>
>>>>>>>> Please let me know if you have any problems with this commit.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Josh
>>>>>>>>
>>>>>>>> On May 5, 2008, at 10:41 AM, Josh Hursey wrote:
>>>>>>>>
>>>>>>>>> Awesome.
>>>>>>>>>
>>>>>>>>> The branch is updated to the latest trunk head. I encourage
>>>>>>>>> folks to check out this repository and make sure that it
>>>>>>>>> builds on their system. A normal build of the branch should
>>>>>>>>> be enough to find out if there are any cut-n-paste problems
>>>>>>>>> (though I tried to be careful, mistakes do happen).
>>>>>>>>>
>>>>>>>>> I haven't heard any problems so this is looking like it will
>>>>>>>>> come in tomorrow after the teleconf. I'll ask again there to
>>>>>>>>> see if there are any voices of concern.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Josh
>>>>>>>>>
>>>>>>>>> On May 5, 2008, at 9:58 AM, Jeff Squyres wrote:
>>>>>>>>>
>>>>>>>>>> This all sounds good to me!
>>>>>>>>>>
>>>>>>>>>> On Apr 29, 2008, at 6:35 PM, Josh Hursey wrote:
>>>>>>>>>>
>>>>>>>>>>> What: Add mca_base_select() and adjust frameworks &
>>>>>>>>>>> components to use it.
>>>>>>>>>>> Why: Consolidation of code for general goodness.
>>>>>>>>>>> Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play
>>>>>>>>>>> When: Code ready now. Documentation ready soon.
>>>>>>>>>>> Timeout: May 6, 2008 (After teleconf) [1 week]
>>>>>>>>>>>
>>>>>>>>>>> Discussion:
>>>>>>>>>>> -----------
>>>>>>>>>>> For a number of years a few developers have been talking
>>>>>>>>>>> about creating a MCA base component selection function.
>>>>>>>>>>> For various reasons this was never implemented. Recently I
>>>>>>>>>>> decided to give it a try.
>>>>>>>>>>>
>>>>>>>>>>> A base select function will allow Open MPI to provide
>>>>>>>>>>> completely consistent selection behavior for many of its
>>>>>>>>>>> frameworks (18 of 31 to be exact at the moment). The
>>>>>>>>>>> primary goal of this work is to improve code
>>>>>>>>>>> maintainability through code reuse.
>>>>>>>>>>> Other benefits also result, such as a slightly smaller
>>>>>>>>>>> memory footprint.
>>>>>>>>>>>
>>>>>>>>>>> The mca_base_select() function represents the most commonly
>>>>>>>>>>> used logic for component selection: Select the one
>>>>>>>>>>> component with the highest priority and close all of the
>>>>>>>>>>> components that were not selected. This function can be
>>>>>>>>>>> found at the path below in the branch:
>>>>>>>>>>>   opal/mca/base/mca_base_components_select.c
>>>>>>>>>>>
>>>>>>>>>>> To support this I had to formalize a query() function in
>>>>>>>>>>> the mca_base_component_t of the form:
>>>>>>>>>>>   int mca_base_query_component_fn(mca_base_module_t **module,
>>>>>>>>>>>                                   int *priority);
>>>>>>>>>>>
>>>>>>>>>>> This function is specified after the open and close
>>>>>>>>>>> component functions in this structure so as to allow
>>>>>>>>>>> compatibility with frameworks that do not use the base
>>>>>>>>>>> selection logic. Frameworks that do *not* use this function
>>>>>>>>>>> are *not* affected by this commit. However, every component
>>>>>>>>>>> in the frameworks that use the mca_base_select function
>>>>>>>>>>> must adjust its component query function to fit the form
>>>>>>>>>>> specified above.
>>>>>>>>>>>
>>>>>>>>>>> 18 frameworks in Open MPI have been changed. I have updated
>>>>>>>>>>> all of the components in the 18 frameworks available in the
>>>>>>>>>>> trunk on my branch.
>>>>>>>>>>> The affected frameworks are:
>>>>>>>>>>> - OPAL carto
>>>>>>>>>>> - OPAL crs
>>>>>>>>>>> - OPAL maffinity
>>>>>>>>>>> - OPAL memchecker
>>>>>>>>>>> - OPAL paffinity
>>>>>>>>>>> - ORTE errmgr
>>>>>>>>>>> - ORTE ess
>>>>>>>>>>> - ORTE filem
>>>>>>>>>>> - ORTE grpcomm
>>>>>>>>>>> - ORTE odls
>>>>>>>>>>> - ORTE plm
>>>>>>>>>>> - ORTE ras
>>>>>>>>>>> - ORTE rmaps
>>>>>>>>>>> - ORTE routed
>>>>>>>>>>> - ORTE snapc
>>>>>>>>>>> - OMPI crcp
>>>>>>>>>>> - OMPI dpm
>>>>>>>>>>> - OMPI pubsub
>>>>>>>>>>>
>>>>>>>>>>> There was a question about the memory footprint change as a
>>>>>>>>>>> result of this commit. I used 'pmap' to determine the
>>>>>>>>>>> process memory footprint of a hello world MPI program.
>>>>>>>>>>> Static and shared build numbers are below, along with
>>>>>>>>>>> variations on launching locally and to a single node
>>>>>>>>>>> allocated by SLURM. All of this was on Indiana University's
>>>>>>>>>>> Odin machine. We compare against the trunk (r18276), which
>>>>>>>>>>> represents the last SVN sync point of the branch.
>>>>>>>>>>>
>>>>>>>>>>> Process(shared)| Trunk    | Branch  | Diff (Improvement)
>>>>>>>>>>> ---------------+----------+---------+-------------------
>>>>>>>>>>> mpirun (orted) |   39976K |  36828K | 3148K
>>>>>>>>>>> hello (0)      |  229288K | 229268K |   20K
>>>>>>>>>>> hello (1)      |  229288K | 229268K |   20K
>>>>>>>>>>> ---------------+----------+---------+-------------------
>>>>>>>>>>> mpirun         |   40032K |  37924K | 2108K
>>>>>>>>>>> orted          |   34720K |  34660K |   60K
>>>>>>>>>>> hello (0)      |  228404K | 228384K |   20K
>>>>>>>>>>> hello (1)      |  228404K | 228384K |   20K
>>>>>>>>>>>
>>>>>>>>>>> Process(static)| Trunk    | Branch  | Diff (Improvement)
>>>>>>>>>>> ---------------+----------+---------+-------------------
>>>>>>>>>>> mpirun (orted) |   21384K |  21372K |   12K
>>>>>>>>>>> hello (0)      |  194000K | 193980K |   20K
>>>>>>>>>>> hello (1)      |  194000K | 193980K |   20K
>>>>>>>>>>> ---------------+----------+---------+-------------------
>>>>>>>>>>> mpirun         |   21384K |  21372K |   12K
>>>>>>>>>>> orted          |   21208K |  21196K |   12K
>>>>>>>>>>> hello (0)      |  193116K | 193096K |   20K
>>>>>>>>>>> hello (1)      |  193116K | 193096K |   20K
>>>>>>>>>>>
>>>>>>>>>>> As you can see there are some small memory footprint
>>>>>>>>>>> improvements on my branch that result from this work. The
>>>>>>>>>>> size of the Open MPI project shrinks a bit as well. This
>>>>>>>>>>> commit cuts between 2,000 and 3,500 lines of code
>>>>>>>>>>> (depending on how you count), so about a 1% code shrink.
>>>>>>>>>>>
>>>>>>>>>>> The branch is stable in all of the testing I have done, but
>>>>>>>>>>> there are some platforms on which I cannot test. So please
>>>>>>>>>>> give this branch a try and let me know if you find any
>>>>>>>>>>> problems.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Josh
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> devel mailing list
>>>>>>>>>>> de...@open-mpi.org
>>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Jeff Squyres
>>>>>>>>>> Cisco Systems
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> - Pak Lui
>>>>>>> pak....@sun.com
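
The selection rule Ralph asks for in this thread ("if someone specified a specific component, don't worry about the returned priority - just use it") can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual mca_base_select() code or the patch that was exchanged off-list: the type component_t, the function select_component(), and the `requested` parameter are hypothetical names made up for the example, and the real code works with mca_base_component_t and each component's query function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a component; NOT the real mca_base_component_t. */
typedef struct {
    const char *name;   /* component name, e.g. "xml" */
    int priority;       /* priority the component's query function reports */
} component_t;

/*
 * Return the index of the selected component, or -1 if none is selected.
 *
 * requested: the name the user asked for (as in "-mca foo bar"), or NULL
 *            if the user did not name a component.
 *
 * Assumed rule, taken from the discussion in the thread:
 *   - explicit request: use that component whatever its priority is;
 *   - no request: pick the highest priority among components whose
 *     priority is >= 0, so a component at -1 is never auto-selected.
 */
static int select_component(const component_t *comps, size_t n,
                            const char *requested)
{
    int best = -1;

    for (size_t i = 0; i < n; ++i) {
        if (requested != NULL) {
            /* Explicit request: the priority is ignored entirely. */
            if (strcmp(comps[i].name, requested) == 0) {
                return (int)i;
            }
            continue;
        }
        if (comps[i].priority < 0) {
            continue;   /* opted out of automatic selection */
        }
        if (best < 0 || comps[i].priority > comps[best].priority) {
            best = (int)i;
        }
    }
    return best;
}
```

With components {"xml", -1} and {"b", 10}, calling select_component() with requested == NULL picks "b", while requested == "xml" picks "xml" despite its negative priority. If "xml" is the only component and nothing is requested, nothing is selected, which matches the case where a framework is allowed to run with no component.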