I believe Devendar Bureddy nailed the root cause. I am providing his excellent analysis below:
From Devendar:

Out of curiosity I looked at this issue. Here are my 2 cents.

I think the issue is that the BTL components are opened and closed twice (ompi_init, yoda), which leads to incorrect usage of var groups. The following sequence of events creates invalid memory:

1) All openib component parameters are registered in ompi_mpi_init.

   main > start_pes > shmem_init -> oshmem_shmem_init -> ompi_mpi_init -> mca_base_framework_open -> mca_pml_base_open ..... mca_bml_base_open... -> btl_openib_component_register()

   * For every string variable, it allocates a memory block (var->mbv_storage = PTR).

   At this time a new var group id:114 (of parent group id: 112) is created for all openib component variables.

2) This var group is deregistered in ompi_mpi_init. That marks all variables as invalid, but the group and vars still exist.

   main > start_pes > shmem_init -> oshmem_shmem_init -> mca_pml_base_select -> mca_base_components_close -> ... -> mca_bml_base_close -> mca_base_framework_close -> mca_base_var_group_deregister(groupid: 114)

   * The memory of all string variables is deallocated (set var->mbv_storage = NULL;).

3) Because of step 2), the btl_openib.so shared lib is dlclosed.

4) Now we reopen openib in yoda and register the openib variables again.

   main > start_pes > shmem_init > oshmem_shmem_init -> _shmem_init -> mca_base_framework_open -> mca_spml_base_open > mca_spml_yoda_component_open -> ..... mca_bml_base_open... -> btl_openib_component_register -> register_variables()

   * In register_variables(), var_find() finds this variable (from the same old group: 114) and resets the variables.
   * For string variables, it allocates the buffers again (var->mbv_storage = PTR).
   * Note that group:114 does not belong to the yoda component.

5) In the yoda component close, it never finds the above group (114) because the group does not belong to this component. So it does not call mca_base_var_group_deregister() again on the var group, and the string var memory is not deallocated.
   main > start_pes > shmem_init > oshmem_shmem_init -> _shmem_init -> mca_spml_base_select ->..> mca_spml_yoda_component_close -> mca_bml_base_close -> mca_base_var_group_find().

6) Because of step 5), btl_openib.so is dlclosed. This step invalidates all of the openib string var memory (var->mbv_storage = PTR) allocated in step 4).

7) ompi_mpi_finalize() loops through all vars, finalizes them, and deallocates the string var memory (var->mbv_storage = PTR).

   ompi_mpi_finalize >...> mca_base_var_finalize

   * var->mbv_storage = PTR is invalid at this stage, which causes the SEGFAULT.

This also explains why Dinar's patch, kostul_fix.patch (http://bgate.mellanox.com/redmine/attachments/1643/kostul_fix.patch), resolves the issue. His patch prevents you from finding the invalid, already opened params.

So, I see that in a lot of these registration functions the signature has an entry for the project name, but for now NULL is always passed. I see a note by Nathan in ../opal/mca/base/mca_base_var.c +1311

{
    /* XXX -- component_update -- We will stash the project name in the component */
    return mca_base_var_register (NULL, component->mca_type_name,

It seems that knowing the project name, oshmem, would allow us to distinguish between the different BMLs.

Nathan, please advise.

Josh

-----Original Message-----
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Nathan Hjelm
Sent: Monday, December 16, 2013 12:44 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] bug in mca framework?

On Mon, Dec 16, 2013 at 05:21:05PM +0000, Joshua Ladd wrote:
> After speaking with Igor Ivanov about this this morning, he summarized his
> findings as follows:
>
> 1. Valgrind comes up clean.

That's good to hear but unfortunate, since this really seems like a stomping-on-memory problem.

> 2. The issue is not reproduced with a static build.

This is a red herring. The variable itself contains garbage. The mbv_storage pointer looked like it was on the stack, the name was not valid, etc.
Not sure how we got an mca_base_var_t into that state, since the only time we touch anything in them is in mca_base_var_finalize. That function cleans up all of the state, so two calls to it should be harmless.

> 3. A bisection study reveals that problems first appear after commit:
> https://svn.open-mpi.org/trac/ompi/changeset/28800/trunk/opal/mca/base/mca_base_var.c

Possibly also a coincidence. That commit only 1) moves the group stuff into its own file, and 2) adds the mca_base_pvar interface. It's possible I messed something up in the rest of the code, but unlikely. I will take another look though.

-Nathan

>
> Josh
>
> -----Original Message-----
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff Squyres (jsquyres)
> Sent: Monday, December 16, 2013 12:15 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] bug in mca framework?
>
> It might be worthwhile to run this through valgrind and see if something is
> being freed incorrectly...?
>
> On Dec 16, 2013, at 12:11 PM, Nathan Hjelm <hje...@lanl.gov> wrote:
>
> > I took a look at the stacktraces last week and could not identify
> > where the bug is. I will dig deeper this week and see if I can come up with
> > the correct fix.
> >
> > -Nathan
> >
> > On Mon, Dec 09, 2013 at 03:17:36PM +0200, Mike Dubman wrote:
> >> Nathan,
> >> Could you please comment on Igor's observations?
> >> Thanks
> >>
> >> On Wed, Dec 4, 2013 at 4:44 PM, Igor Ivanov <igor.iva...@itseez.com> wrote:
> >>
> >> On 04.12.2013 17:56, Jeff Squyres (jsquyres) wrote:
> >>
> >> On Dec 4, 2013, at 2:52 AM, Igor Ivanov <igor.iva...@itseez.com> wrote:
> >>
> >> It is the first mca variable with type as string from btl/openib as
> >> 'device_param_files'. Actually you can disable it and get failure on
> >> the second.
> >>
> >> Description of case we see:
> >> 1. openib mca variables are registered during startup as stage at
> >> select component phase;
> >> 2. but a winner is cm component and openib mca variables are
> >> deregistered as part of mca group;
> >> 3. mca variables are not removed from global mca array but they
> >> marked as invalid and memory for string is freed;
> >> 4. shmem needs openib for yoda and does bml initialization;
> >> 5. openib mca variables are registered again using light mode as
> >> searching itself in global array and refreshing their
> >> fields again;
> >>
> >> Can you explain what you mean by step 5? I.e., what does "using light
> >> mode" mean? Is the openib component register function invoked again?
> >>
> >> It is correct, it is called twice. "light mode" means that
> >> mca_base_var_register() does not allocate mca variable object again, it
> >> seeks this variable in global array and finding it updates fields in
> >> mca_base_var_t structure (at least mbv_storage).
> >>
> >> 6. for unknown reason bml finalization does not clean these vars as
> >> it is done in step 2;
> >> 7. mca_btl_openib.so is unloaded;
> >> 8. opal_finalize() destroys mca variables from global array,
> >> observes openib's variable, try destroy using non accessed
> >> address;
> >>
> >> So a code that is under discussion fixes step 6.
> >>
> >> Nathan: it sounds like an MCA var (and entire group) is registered,
> >> unregistered, and then registered again. Does the MCA var system get
> >> confused here when it tries to unregister the group a 2nd time?
> >>
> >> Probably issue relates incorrect recognition if variable valid/invalid
> >> during second call of mca_base_var_deregister().
> >> _______________________________________________
> >> devel mailing list
> >> de...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/