Sure - take a look at the hg repository Jeff and I are working on: http://www.open-mpi.org/hg/hgwebdir.cgi/rhc/channel
The opal/mca/filter framework illustrates the problem. I have one component in there right now, with a default module defined in the base. That component must be selected only if the user explicitly requests it. With the current select logic, I can't do this - if the priority is >= 0, the component is always selected automatically; if the priority is < 0, it is never selectable, even when the user specifies it.

Thanks
Ralph


On 5/9/08 8:52 AM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:

> Ralph,
>
> Can you give me an example of a component that I can look at? It will
> allow me to test the fix before committing, and to better understand
> the problem.
>
> -- Josh
>
> On May 9, 2008, at 10:41 AM, Ralph Castain wrote:
>
>> I just hit a problem with this logic - should be a minor change.
>>
>> We have several frameworks where we have components that are only
>> allowed to be selected if the user specifically requests them by
>> stating -mca foo bar. Because it is possible for there to be no other
>> components that want to be selected, and because it is permissible
>> for no components to be selected for that framework, we set bar's
>> priority to be -1.
>>
>> The new select logic will not allow a negative priority to be
>> selected, even if the user specifically requested that component.
>>
>> If we set the priority to be 0, then the system will allow the
>> component to be automatically selected. This is not allowed as it can
>> lead to bad behavior.
>>
>> So what we need the select system to do is say "if someone specified a
>> specific component, don't worry about the returned priority - just
>> use it"
>>
>> Josh: could you please modify this?
>>
>> Thanks!
>> Ralph
>>
>> On 5/8/08 7:04 PM, "Pak Lui" <pak....@sun.com> wrote:
>>
>>> Thanks very much Josh! Will try it out soon.
>>>
>>> Josh Hursey wrote:
>>>> Sorry about that. I didn't test that type of option. It should be
>>>> working in r18418. Let me know if you see any more issues.
>>>>
>>>> -- Josh
>>>>
>>>> On May 8, 2008, at 6:04 PM, Pak Lui wrote:
>>>>
>>>>> I think I have a problem but I am not sure. I used to be able to
>>>>> use the circumflex (^) to switch between the gridengine launcher
>>>>> and the ssh launchers by doing something like this, e.g. -mca plm
>>>>> ^gridengine, to exclude some of the components plm (and also in
>>>>> ras). It doesn't seem like the 'negate' is in mca_base_component
>>>>> anymore. I guess I just have to spell out which component I want
>>>>> explicitly...
>>>>>
>>>>> Josh Hursey wrote:
>>>>>> This has been committed in r18381
>>>>>>
>>>>>> Please let me know if you have any problems with this commit.
>>>>>>
>>>>>> Cheers,
>>>>>> Josh
>>>>>>
>>>>>> On May 5, 2008, at 10:41 AM, Josh Hursey wrote:
>>>>>>
>>>>>>> Awesome.
>>>>>>>
>>>>>>> The branch is updated to the latest trunk head. I encourage
>>>>>>> folks to check out this repository and make sure that it builds
>>>>>>> on their system. A normal build of the branch should be enough
>>>>>>> to find out if there are any cut-n-paste problems (though I
>>>>>>> tried to be careful, mistakes do happen).
>>>>>>>
>>>>>>> I haven't heard any problems so this is looking like it will
>>>>>>> come in tomorrow after the teleconf. I'll ask again there to see
>>>>>>> if there are any voices of concern.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Josh
>>>>>>>
>>>>>>> On May 5, 2008, at 9:58 AM, Jeff Squyres wrote:
>>>>>>>
>>>>>>>> This all sounds good to me!
>>>>>>>>
>>>>>>>> On Apr 29, 2008, at 6:35 PM, Josh Hursey wrote:
>>>>>>>>
>>>>>>>>> What: Add mca_base_select() and adjust frameworks &
>>>>>>>>> components to use it.
>>>>>>>>> Why: Consolidation of code for general goodness.
>>>>>>>>> Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play
>>>>>>>>> When: Code ready now. Documentation ready soon.
>>>>>>>>> Timeout: May 6, 2008 (After teleconf) [1 week]
>>>>>>>>>
>>>>>>>>> Discussion:
>>>>>>>>> -----------
>>>>>>>>> For a number of years a few developers have been talking about
>>>>>>>>> creating an MCA base component selection function. For various
>>>>>>>>> reasons this was never implemented. Recently I decided to give
>>>>>>>>> it a try.
>>>>>>>>>
>>>>>>>>> A base select function will allow Open MPI to provide
>>>>>>>>> completely consistent selection behavior for many of its
>>>>>>>>> frameworks (18 of 31 to be exact at the moment). The primary
>>>>>>>>> goal of this work is to improve code maintainability through
>>>>>>>>> code reuse. Other benefits also result, such as a slightly
>>>>>>>>> smaller memory footprint.
>>>>>>>>>
>>>>>>>>> The mca_base_select() function represents the most commonly
>>>>>>>>> used logic for component selection: Select the one component
>>>>>>>>> with the highest priority and close all of the components that
>>>>>>>>> were not selected. This function can be found at the path
>>>>>>>>> below in the branch:
>>>>>>>>> opal/mca/base/mca_base_components_select.c
>>>>>>>>>
>>>>>>>>> To support this I had to formalize a query() function in the
>>>>>>>>> mca_base_component_t of the form:
>>>>>>>>> int mca_base_query_component_fn(mca_base_module_t **module,
>>>>>>>>> int *priority);
>>>>>>>>>
>>>>>>>>> This function is specified after the open and close component
>>>>>>>>> functions in this structure so as to allow compatibility with
>>>>>>>>> frameworks that do not use the base selection logic.
>>>>>>>>> Frameworks that do *not* use this function are *not* affected
>>>>>>>>> by this commit. However, every component in the frameworks
>>>>>>>>> that use the mca_base_select function must adjust their
>>>>>>>>> component query function to fit that specified above.
>>>>>>>>>
>>>>>>>>> 18 frameworks in Open MPI have been changed.
>>>>>>>>> I have updated all of the components in the 18 frameworks
>>>>>>>>> available in the trunk on my branch. The affected frameworks
>>>>>>>>> are:
>>>>>>>>> - OPAL Carto
>>>>>>>>> - OPAL crs
>>>>>>>>> - OPAL maffinity
>>>>>>>>> - OPAL memchecker
>>>>>>>>> - OPAL paffinity
>>>>>>>>> - ORTE errmgr
>>>>>>>>> - ORTE ess
>>>>>>>>> - ORTE Filem
>>>>>>>>> - ORTE grpcomm
>>>>>>>>> - ORTE odls
>>>>>>>>> - ORTE pml
>>>>>>>>> - ORTE ras
>>>>>>>>> - ORTE rmaps
>>>>>>>>> - ORTE routed
>>>>>>>>> - ORTE snapc
>>>>>>>>> - OMPI crcp
>>>>>>>>> - OMPI dpm
>>>>>>>>> - OMPI pubsub
>>>>>>>>>
>>>>>>>>> There was a question of the memory footprint change as a
>>>>>>>>> result of this commit. I used 'pmap' to determine the process
>>>>>>>>> memory footprint of a hello world MPI program. Static and
>>>>>>>>> Shared build numbers are below along with variations on
>>>>>>>>> launching locally and to a single node allocated by SLURM. All
>>>>>>>>> of this was on Indiana University's Odin machine. We compare
>>>>>>>>> against the trunk (r18276) representing the last SVN sync
>>>>>>>>> point of the branch.
>>>>>>>>>
>>>>>>>>> Process(shared)| Trunk    | Branch  | Diff (Improvement)
>>>>>>>>> ---------------+----------+---------+-------
>>>>>>>>> mpirun (orted) |   39976K |  36828K | 3148K
>>>>>>>>> hello (0)      |  229288K | 229268K |   20K
>>>>>>>>> hello (1)      |  229288K | 229268K |   20K
>>>>>>>>> ---------------+----------+---------+-------
>>>>>>>>> mpirun         |   40032K |  37924K | 2108K
>>>>>>>>> orted          |   34720K |  34660K |   60K
>>>>>>>>> hello (0)      |  228404K | 228384K |   20K
>>>>>>>>> hello (1)      |  228404K | 228384K |   20K
>>>>>>>>>
>>>>>>>>> Process(static)| Trunk    | Branch  | Diff (Improvement)
>>>>>>>>> ---------------+----------+---------+-------
>>>>>>>>> mpirun (orted) |   21384K |  21372K |   12K
>>>>>>>>> hello (0)      |  194000K | 193980K |   20K
>>>>>>>>> hello (1)      |  194000K | 193980K |   20K
>>>>>>>>> ---------------+----------+---------+-------
>>>>>>>>> mpirun         |   21384K |  21372K |   12K
>>>>>>>>> orted          |   21208K |  21196K |   12K
>>>>>>>>> hello (0)      |  193116K | 193096K |   20K
>>>>>>>>> hello (1)      |  193116K | 193096K |   20K
>>>>>>>>>
>>>>>>>>> As you can see there are some small memory footprint
>>>>>>>>> improvements on my branch that result from this work. The size
>>>>>>>>> of the Open MPI project shrinks a bit as well. This commit
>>>>>>>>> cuts between 2,000 and 3,500 lines of code (depending on how
>>>>>>>>> you count), so about a 1% code shrink.
>>>>>>>>>
>>>>>>>>> The branch is stable in all of the testing I have done, but
>>>>>>>>> there are some platforms on which I cannot test. So please
>>>>>>>>> give this branch a try and let me know if you find any
>>>>>>>>> problems.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Josh
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> devel mailing list
>>>>>>>>> de...@open-mpi.org
>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jeff Squyres
>>>>>>>> Cisco Systems
>>>>>
>>>>> --
>>>>>
>>>>> - Pak Lui
>>>>> pak....@sun.com
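
For reference, the selection behavior requested at the top of this thread ("if someone specified a specific component, don't worry about the returned priority - just use it") can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual mca_base_select() code: sketch_component_t and sketch_select() are hypothetical names, and a component is reduced to a name plus the priority its query function would report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a component: its name and the priority that
 * its query function would return. Not the real mca_base_component_t. */
typedef struct {
    const char *name;
    int priority;
} sketch_component_t;

/*
 * Pick one component, or return NULL if none is selected.
 *
 * requested: the component name the user gave explicitly (as in
 *            "-mca foo bar"), or NULL if the user gave none.
 */
const sketch_component_t *
sketch_select(const sketch_component_t *components, size_t count,
              const char *requested)
{
    const sketch_component_t *best = NULL;
    size_t i;

    assert(components != NULL || count == 0);

    if (requested != NULL) {
        /* Explicit request: use the named component regardless of the
         * priority it reports, even if that priority is negative. */
        for (i = 0; i < count; ++i) {
            if (strcmp(components[i].name, requested) == 0) {
                return &components[i];
            }
        }
        return NULL; /* the requested component is not available */
    }

    /* Automatic selection: the highest non-negative priority wins, and
     * components with a negative priority are never picked. */
    for (i = 0; i < count; ++i) {
        if (components[i].priority < 0) {
            continue;
        }
        if (best == NULL || components[i].priority > best->priority) {
            best = &components[i];
        }
    }
    return best;
}
```

With components { bar: -1, baz: 10 }, calling sketch_select() with no request returns baz, while requesting "bar" returns bar despite its priority of -1. With bar as the only component and no request, nothing is selected, which matches the case where a framework is allowed to end up with no component.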