Hi,

I tried r18423 with the rank_file component and got a segfault. (I increase the priority of the component if rmaps_rank_file_path exists.)

/home/USERS/lenny/OMPI_ORTE_SMD/bin/mpirun -np 4 -hostfile hostfile_ompi -mca rmaps_rank_file_path rankfile -mca paffinity_base_verbose 5 ./mpi_p_SMD -t bw -output 1 -order 1
[witch1:25456] mca:base:select: Querying component [linux]
[witch1:25456] mca:base:select: Query of component [linux] set priority to 10
[witch1:25456] mca:base:select: Selected component [linux]
[witch1:25456] *** Process received signal ***
[witch1:25456] Signal: Segmentation fault (11)
[witch1:25456] Signal code: Invalid permissions (2)
[witch1:25456] Failing at address: 0x2b2875530030
[witch1:25456] [ 0] /lib64/libpthread.so.0 [0x2b28759dfc10]
[witch1:25456] [ 1] /home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-pal.so.0 [0x2b28753e2bb6]
[witch1:25456] [ 2] /home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-pal.so.0 [0x2b28753e23b6]
[witch1:25456] [ 3] /home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-pal.so.0 [0x2b28753e22fd]
[witch1:25456] [ 4] /home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-rte.so.0(orte_util_encode_pidmap+0x2f4) [0x2b287527f412]
[witch1:25456] [ 5] /home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-rte.so.0(orte_odls_base_default_get_add_procs_data+0x989) [0x2b28752934f5]
[witch1:25456] [ 6] /home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x1a3) [0x2b287529e60b]
[witch1:25456] [ 7] /home/USERS/lenny/OMPI_ORTE_SMD/lib/openmpi/mca_plm_rsh.so [0x2b287612f788]
[witch1:25456] [ 8] /home/USERS/lenny/OMPI_ORTE_SMD/bin/mpirun [0x4032bf]
[witch1:25456] [ 9] /home/USERS/lenny/OMPI_ORTE_SMD/bin/mpirun [0x402b53]
[witch1:25456] [10] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b2875b06154]
[witch1:25456] [11] /home/USERS/lenny/OMPI_ORTE_SMD/bin/mpirun [0x402aa9]
[witch1:25456] *** End of error message ***
Segmentation fault

On Tue, May 6, 2008 at 9:09 PM, Josh Hursey <jjhur...@open-mpi.org> wrote:
> This has been committed in r18381
>
> Please let me know if you have any problems with this commit.
>
> Cheers,
> Josh
>
> On May 5, 2008, at 10:41 AM, Josh Hursey wrote:
>
> > Awesome.
> >
> > The branch is updated to the latest trunk head. I encourage folks to
> > check out this repository and make sure that it builds on their
> > system. A normal build of the branch should be enough to find out if
> > there are any cut-n-paste problems (though I tried to be careful,
> > mistakes do happen).
> >
> > I haven't heard of any problems so this is looking like it will come in
> > tomorrow after the teleconf. I'll ask again there to see if there are
> > any voices of concern.
> >
> > Cheers,
> > Josh
> >
> > On May 5, 2008, at 9:58 AM, Jeff Squyres wrote:
> >
> >> This all sounds good to me!
> >>
> >> On Apr 29, 2008, at 6:35 PM, Josh Hursey wrote:
> >>
> >>> What: Add mca_base_select() and adjust frameworks & components to
> >>> use it.
> >>> Why: Consolidation of code for general goodness.
> >>> Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play
> >>> When: Code ready now. Documentation ready soon.
> >>> Timeout: May 6, 2008 (After teleconf) [1 week]
> >>>
> >>> Discussion:
> >>> -----------
> >>> For a number of years a few developers have been talking about
> >>> creating a MCA base component selection function. For various
> >>> reasons this was never implemented. Recently I decided to give it
> >>> a try.
> >>>
> >>> A base select function will allow Open MPI to provide completely
> >>> consistent selection behavior for many of its frameworks (18 of 31
> >>> to be exact at the moment). The primary goal of this work is to
> >>> improve code maintainability through code reuse. Other benefits
> >>> also result, such as a slightly smaller memory footprint.
> >>>
> >>> The mca_base_select() function represents the most commonly used
> >>> logic for component selection: Select the one component with the
> >>> highest priority and close all of the components that were not
> >>> selected.
> >>> This function can be found at the path below in the branch:
> >>> opal/mca/base/mca_base_components_select.c
> >>>
> >>> To support this I had to formalize a query() function in the
> >>> mca_base_component_t of the form:
> >>> int mca_base_query_component_fn(mca_base_module_t **module, int *priority);
> >>>
> >>> This function is specified after the open and close component
> >>> functions in this structure so as to allow compatibility with
> >>> frameworks that do not use the base selection logic. Frameworks
> >>> that do *not* use this function are *not* affected by this commit.
> >>> However, every component in the frameworks that use the
> >>> mca_base_select function must adjust their component query function
> >>> to match the form specified above.
> >>>
> >>> 18 frameworks in Open MPI have been changed. I have updated all of
> >>> the components in the 18 frameworks available in the trunk on my
> >>> branch. The affected frameworks are:
> >>> - OPAL carto
> >>> - OPAL crs
> >>> - OPAL maffinity
> >>> - OPAL memchecker
> >>> - OPAL paffinity
> >>> - ORTE errmgr
> >>> - ORTE ess
> >>> - ORTE filem
> >>> - ORTE grpcomm
> >>> - ORTE odls
> >>> - ORTE plm
> >>> - ORTE ras
> >>> - ORTE rmaps
> >>> - ORTE routed
> >>> - ORTE snapc
> >>> - OMPI crcp
> >>> - OMPI dpm
> >>> - OMPI pubsub
> >>>
> >>> There was a question of the memory footprint change as a result of
> >>> this commit. I used 'pmap' to determine the process memory footprint
> >>> of a hello world MPI program. Static and Shared build numbers are
> >>> below along with variations on launching locally and to a single
> >>> node allocated by SLURM. All of this was on Indiana University's
> >>> Odin machine. We compare against the trunk (r18276) representing the
> >>> last SVN sync point of the branch.
> >>>
> >>> Process(shared)| Trunk    | Branch  | Diff (Improvement)
> >>> ---------------+----------+---------+-------
> >>> mpirun (orted) |   39976K |  36828K | 3148K
> >>> hello (0)      |  229288K | 229268K |   20K
> >>> hello (1)      |  229288K | 229268K |   20K
> >>> ---------------+----------+---------+-------
> >>> mpirun         |   40032K |  37924K | 2108K
> >>> orted          |   34720K |  34660K |   60K
> >>> hello (0)      |  228404K | 228384K |   20K
> >>> hello (1)      |  228404K | 228384K |   20K
> >>>
> >>> Process(static)| Trunk    | Branch  | Diff (Improvement)
> >>> ---------------+----------+---------+-------
> >>> mpirun (orted) |   21384K |  21372K |   12K
> >>> hello (0)      |  194000K | 193980K |   20K
> >>> hello (1)      |  194000K | 193980K |   20K
> >>> ---------------+----------+---------+-------
> >>> mpirun         |   21384K |  21372K |   12K
> >>> orted          |   21208K |  21196K |   12K
> >>> hello (0)      |  193116K | 193096K |   20K
> >>> hello (1)      |  193116K | 193096K |   20K
> >>>
> >>> As you can see there are some small memory footprint improvements on
> >>> my branch that result from this work. The size of the Open MPI
> >>> project shrinks a bit as well. This commit cuts between 2,000 and
> >>> 3,500 lines of code (depending on how you count), so about a 1%
> >>> code shrink.
> >>>
> >>> The branch is stable in all of the testing I have done, but there
> >>> are some platforms on which I cannot test. So please give this
> >>> branch a try and let me know if you find any problems.
> >>>
> >>> Cheers,
> >>> Josh
> >>>
> >>> _______________________________________________
> >>> devel mailing list
> >>> de...@open-mpi.org
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>
> >> --
> >> Jeff Squyres
> >> Cisco Systems
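
For anyone following the thread, here is a minimal sketch of the selection logic the RFC describes: query every component, keep the one with the highest priority, and close the rest. It is not the code in opal/mca/base/mca_base_components_select.c. Every name below (sketch_module_t, sketch_component_t, sketch_select, the demo_* components and their priorities) is hypothetical and only mirrors the shape of the query function quoted in the RFC. Closing a component is modeled as setting a flag, which is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MCA types; NOT the Open MPI definitions. */
typedef struct sketch_module {
    const char *name;
} sketch_module_t;

/* Same shape as the query function quoted in the RFC. */
typedef int (*sketch_query_fn)(sketch_module_t **module, int *priority);

typedef struct sketch_component {
    const char *name;
    sketch_query_fn query; /* NULL means the component cannot be selected */
    int closed;            /* set to 1 once the component has been closed */
} sketch_component_t;

/*
 * Query every component, keep the one reporting the highest priority,
 * and close all the others. Returns 0 if a component was selected and
 * -1 if none was selectable.
 */
int sketch_select(sketch_component_t *components, size_t count,
                  sketch_component_t **best_component,
                  sketch_module_t **best_module)
{
    int best_priority = 0;
    size_t i;

    assert(components != NULL || count == 0);
    assert(best_component != NULL && best_module != NULL);

    *best_component = NULL;
    *best_module = NULL;

    for (i = 0; i < count; ++i) {
        sketch_module_t *module = NULL;
        int priority = 0;

        if (components[i].query == NULL) {
            continue; /* no query function: skip */
        }
        if (components[i].query(&module, &priority) != 0 || module == NULL) {
            continue; /* component declined to be selected */
        }
        if (*best_component == NULL || priority > best_priority) {
            best_priority = priority;
            *best_component = &components[i];
            *best_module = module;
        }
    }

    /* Close everything that was not selected. */
    for (i = 0; i < count; ++i) {
        if (&components[i] != *best_component) {
            components[i].closed = 1;
        }
    }

    return (*best_component != NULL) ? 0 : -1;
}

/* Illustrative components with made-up priorities. */
static sketch_module_t demo_low_module = { "low" };
static sketch_module_t demo_high_module = { "high" };

static int demo_low_query(sketch_module_t **module, int *priority)
{
    *module = &demo_low_module;
    *priority = 10;
    return 0;
}

static int demo_high_query(sketch_module_t **module, int *priority)
{
    *module = &demo_high_module;
    *priority = 50;
    return 0;
}

static int demo_declining_query(sketch_module_t **module, int *priority)
{
    *module = NULL;
    *priority = 0;
    return -1; /* this component does not want to run */
}

sketch_component_t demo_components[] = {
    { "low", demo_low_query, 0 },
    { "declines", demo_declining_query, 0 },
    { "high", demo_high_query, 0 },
};
```

Calling sketch_select(demo_components, 3, &best, &module) picks the "high" entry and marks the other two as closed. A component that raises its own priority under some condition, as I do for rank_file when rmaps_rank_file_path is set, would simply report a larger value from its query function in this model.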