There is no relation at all between ompi_proc_t and ompi_process_info_t. The 
ompi_proc_t is defined in the MPI layer and is used in that layer in various 
places very much like orte_proc_t is used in the ORTE layer.

If you look in ompi/mca/rte/orte/rte_orte_module.c, you'll see how we handle 
the revised function calls. Basically, we use the process name to retrieve the 
modex data via the opal_db, and then load a pointer to the hostname into the 
ompi_proc_t proc_hostname field. Thus, the definition of ompi_proc_t remains in 
the MPI layer.
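
A rough sketch of that flow, in case it helps. Everything named demo_* here is 
a made-up stand-in for illustration (a toy lookup table in place of the 
opal_db, a stripped-down struct in place of ompi_proc_t); it is not the actual 
code in the RTE component, just the shape of the idea:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: NOT the real Open MPI definitions. */
typedef struct {
    unsigned jobid;
    unsigned vpid;
} demo_name_t;

typedef struct {
    demo_name_t name;
    const char *proc_hostname;   /* filled in by the RTE wrapper */
} demo_proc_t;

/* Toy key-value table standing in for the opal_db (assumption). */
typedef struct {
    demo_name_t name;
    const char *key;
    const char *value;
} demo_db_entry_t;

static const demo_db_entry_t demo_db[] = {
    { {1, 0}, "hostname", "node001" },
    { {1, 1}, "hostname", "node002" },
};

/* Look up a value by process name and key. Returns 0 on success and
 * -1 if nothing is stored; *value is left untouched on failure. */
static int demo_db_fetch_pointer(const demo_name_t *name,
                                 const char *key,
                                 const char **value)
{
    size_t i;
    for (i = 0; i < sizeof(demo_db) / sizeof(demo_db[0]); i++) {
        if (demo_db[i].name.jobid == name->jobid &&
            demo_db[i].name.vpid == name->vpid &&
            0 == strcmp(demo_db[i].key, key)) {
            *value = demo_db[i].value;
            return 0;
        }
    }
    return -1;
}

/* Wrapper inside the (hypothetical) RTE component: it receives the
 * whole proc, uses the proc's name for the lookup, and stores a
 * pointer to the hostname in the proc itself. */
static int demo_rte_load_hostname(demo_proc_t *proc)
{
    assert(NULL != proc);
    if (NULL != proc->proc_hostname) {
        return 0;   /* already loaded, nothing to do */
    }
    return demo_db_fetch_pointer(&proc->name, "hostname",
                                 &proc->proc_hostname);
}
```

The point of the sketch is that the struct stays defined on the caller's side; 
the wrapper only needs to know about the name and the proc_hostname field.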

So there was no need to change the ompi/mca/rte/rte.h file, nor to #define 
anything in the component .h file. We only had to modify the wrapper code 
inside the RTE component itself.

HTH
Ralph


On Dec 18, 2013, at 1:50 PM, Thomas Naughton <naught...@ornl.gov> wrote:

> Hi Ralph,
> 
> Question about the MPI-RTE interface change in r29931.  The change was not
> reflected in the "ompi/mca/rte/rte.h" file.
> 
> I'm curious how the newly added "struct ompi_proc_t" relates to the "struct 
> ompi_process_info_t" that is described in the "rte.h" file?
> 
> I understand the general motivation for the API change but it is less clear
> to me how the information previously defined in the header changes (or does
> not change)?
> 
> Thanks,
> --tjn
> 
> _________________________________________________________________________
>  Thomas Naughton                                      naught...@ornl.gov
>  Research Associate                                   (865) 576-4184
> 
> 
> On Mon, 16 Dec 2013, svn-commit-mai...@open-mpi.org wrote:
> 
>> Author: rhc (Ralph Castain)
>> Date: 2013-12-16 22:26:00 EST (Mon, 16 Dec 2013)
>> New Revision: 29931
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/29931
>> 
>> Log:
>> Revert r29917 and replace it with a fix that resolves the thread deadlock 
>> while retaining the desired debug info. In an earlier commit, we had changed 
>> the modex accordingly:
>> 
>> * automatically retrieve the hostname (and all RTE info) for all procs 
>> during MPI_Init if nprocs < cutoff
>> 
>> * if nprocs > cutoff, retrieve the hostname (and all RTE info) for a proc 
>> upon the first call to modex_recv for that proc. This would provide the 
>> hostname for debugging purposes as we only report errors on messages, and so 
>> we must have called modex_recv to get the endpoint info
>> 
>> * BTLs are not to call modex_recv until they need the endpoint info for 
>> first message - i.e., not during add_procs so we don't call it for every 
>> process in the job, but only those with whom we communicate
>> 
>> My understanding is that only some BTLs have been modified to meet that 
>> third requirement, but those include the Cray ones where jobs are big enough 
>> that launch times were becoming an issue. Other BTLs would hopefully be 
>> modified as time went on and interest in using them at scale arose. 
>> Meantime, those BTLs would call modex_recv on every proc, and we would 
>> therefore be no worse than the prior behavior.
>> 
>> This commit revises the MPI-RTE interface to pass the ompi_proc_t instead of 
>> the ompi_process_name_t for the proc so that the hostname can be easily 
>> inserted. I have advised the ORNL folks of the change.
>> 
>> cmr=v1.7.4:reviewer=jsquyres:subject=Fix thread deadlock
>> 
>> Text files modified:
>>  trunk/ompi/mca/rte/orte/rte_orte.h        |     7 ++++---
>>  trunk/ompi/mca/rte/orte/rte_orte_module.c |    27 ++++++++++++++++++---------
>>  trunk/ompi/proc/proc.c                    |    26 ++++++++++++++++++++++----
>>  trunk/ompi/runtime/ompi_module_exchange.c |    10 +++++-----
>>  4 files changed, 49 insertions(+), 21 deletions(-)
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
