Re: [OMPI devel] Q: MPI-RTE / ompi_proc_t vs. ompi_process_info_t ?

2013-12-18 Thread Thomas Naughton

Hi Ralph,

OK, thanks for the clarification and the code pointers.
I'll update "rte.h" to reflect the changes.


Thanks,
--tjn

 _
  Thomas Naughton  naught...@ornl.gov
  Research Associate   (865) 576-4184


On Wed, 18 Dec 2013, Ralph Castain wrote:


There is no relation at all between ompi_proc_t and ompi_process_info_t. The 
ompi_proc_t is defined in the MPI layer and is used in that layer in various 
places very much like orte_proc_t is used in the ORTE layer.

If you look in ompi/mca/rte/orte/rte_orte_module.c, you'll see how we handle the
revised function calls. Basically, we use the process name to retrieve the
modex data via the opal_db, and then load a pointer to the hostname into the
proc_hostname field of the ompi_proc_t. Thus, the definition of ompi_proc_t
remains in the MPI layer.

So there was no need to change the ompi/mca/rte/rte.h file, nor to #define
anything in the component .h file. You just have to modify the wrapper code
inside the RTE component itself.

HTH
Ralph


On Dec 18, 2013, at 1:50 PM, Thomas Naughton  wrote:


Hi Ralph,

Question about the MPI-RTE interface change in r29931.  The change was not
reflected in the "ompi/mca/rte/rte.h" file.

I'm curious how the newly added "struct ompi_proc_t" relates to the "struct 
ompi_process_info_t" that is described in the "rte.h" file?

I understand the general motivation for the API change but it is less clear
to me how the information previously defined in the header changes (or does
not change)?

Thanks,
--tjn

_
 Thomas Naughton  naught...@ornl.gov
 Research Associate   (865) 576-4184


On Mon, 16 Dec 2013, svn-commit-mai...@open-mpi.org wrote:


Author: rhc (Ralph Castain)
Date: 2013-12-16 22:26:00 EST (Mon, 16 Dec 2013)
New Revision: 29931
URL: https://svn.open-mpi.org/trac/ompi/changeset/29931

Log:
Revert r29917 and replace it with a fix that resolves the thread deadlock while
retaining the desired debug info. In an earlier commit, we had changed the
modex as follows:

* automatically retrieve the hostname (and all RTE info) for all procs during 
MPI_Init if nprocs < cutoff

* if nprocs > cutoff, retrieve the hostname (and all RTE info) for a proc upon 
the first call to modex_recv for that proc. This would provide the hostname for 
debugging purposes as we only report errors on messages, and so we must have 
called modex_recv to get the endpoint info

* BTLs are not to call modex_recv until they need the endpoint info for the
first message, i.e., not during add_procs, so that we call it only for the
processes with which we communicate rather than for every process in the job

My understanding is that only some BTLs have been modified to meet that third
requirement, but those include the Cray ones, where jobs are big enough that
launch times were becoming an issue. Other BTLs would hopefully be modified as
time went on and interest in using them at scale arose. In the meantime, the
unmodified BTLs would call modex_recv on every proc, so we would be no worse
off than with the prior behavior.
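
The retrieval policy described in the log above can be sketched roughly as follows. This is only an illustration under stated assumptions: every name here (sketch_proc_t, sketch_init, sketch_modex_recv, sketch_fetch_count, the placeholder hostname) is hypothetical and is not taken from the Open MPI source, and the RTE lookup is simulated. The log does not say what happens when nprocs equals the cutoff; the sketch assumes that case is treated as lazy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-process record; not an Open MPI type. */
typedef struct {
    int         rank;
    const char *hostname;   /* NULL until the RTE info has been retrieved */
} sketch_proc_t;

/* Counts simulated RTE lookups so the policy can be observed. */
int sketch_fetch_count = 0;

/* Simulated RTE lookup; a real runtime would query its data store here. */
void sketch_fetch_rte_info(sketch_proc_t *proc)
{
    assert(NULL != proc);
    if (NULL == proc->hostname) {
        proc->hostname = "node-placeholder";   /* placeholder value */
        ++sketch_fetch_count;
    }
}

/* Init-time behavior: retrieve eagerly for all procs only below the cutoff.
 * Assumption: nprocs == cutoff falls through to the lazy path. */
void sketch_init(sketch_proc_t *procs, size_t nprocs, size_t cutoff)
{
    for (size_t i = 0; i < nprocs; ++i) {
        procs[i].rank = (int)i;
        procs[i].hostname = NULL;
    }
    if (nprocs < cutoff) {
        for (size_t i = 0; i < nprocs; ++i) {
            sketch_fetch_rte_info(&procs[i]);
        }
    }
}

/* modex_recv-like call: the first call for a proc retrieves its RTE info
 * if init skipped it; later calls reuse the stored pointer. */
const char *sketch_modex_recv(sketch_proc_t *proc)
{
    sketch_fetch_rte_info(proc);
    return proc->hostname;
}
```

With a cutoff of 8, a 4-process job would be populated entirely during sketch_init, while a 16-process job would pay for a lookup only on the first sketch_modex_recv call for each peer it actually talks to.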

This commit revises the MPI-RTE interface to pass the ompi_proc_t instead of 
the ompi_process_name_t for the proc so that the hostname can be easily 
inserted. I have advised the ORNL folks of the change.

cmr=v1.7.4:reviewer=jsquyres:subject=Fix thread deadlock

Text files modified:
 trunk/ompi/mca/rte/orte/rte_orte.h        |  7 ---
 trunk/ompi/mca/rte/orte/rte_orte_module.c | 27 ++-
 trunk/ompi/proc/proc.c                    | 26 ++
 trunk/ompi/runtime/ompi_module_exchange.c | 10 +-
 4 files changed, 49 insertions(+), 21 deletions(-)


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel





Re: [OMPI devel] Q: MPI-RTE / ompi_proc_t vs. ompi_process_info_t ?

2013-12-18 Thread Ralph Castain
There is no relation at all between ompi_proc_t and ompi_process_info_t. The 
ompi_proc_t is defined in the MPI layer and is used in that layer in various 
places very much like orte_proc_t is used in the ORTE layer.

If you look in ompi/mca/rte/orte/rte_orte_module.c, you'll see how we handle the
revised function calls. Basically, we use the process name to retrieve the
modex data via the opal_db, and then load a pointer to the hostname into the
proc_hostname field of the ompi_proc_t. Thus, the definition of ompi_proc_t
remains in the MPI layer.

So there was no need to change the ompi/mca/rte/rte.h file, nor to #define
anything in the component .h file. You just have to modify the wrapper code
inside the RTE component itself.
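
As a rough illustration of the wrapper pattern described above, the sketch below passes the whole proc structure to the wrapper, which looks up the hostname by process name and stores only a pointer to it. All identifiers (demo_name_t, demo_proc_t, demo_db, demo_db_fetch_hostname, demo_rte_recv) and the table contents are hypothetical stand-ins; this is not the actual opal_db or ompi_proc_t code, just a minimal model of the idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical process name and proc record; not Open MPI definitions. */
typedef struct {
    unsigned jobid;
    unsigned vpid;
} demo_name_t;

typedef struct {
    demo_name_t name;
    const char *hostname;   /* filled in by the wrapper below */
} demo_proc_t;

/* Hypothetical stand-in for the runtime's key-value store. */
typedef struct {
    demo_name_t name;
    const char *hostname;
} demo_db_entry_t;

static const demo_db_entry_t demo_db[] = {
    { {1, 0}, "nodeA" },
    { {1, 1}, "nodeB" },
};

/* Look up a hostname by process name; returns NULL if it is not stored. */
const char *demo_db_fetch_hostname(const demo_name_t *name)
{
    for (size_t i = 0; i < sizeof(demo_db) / sizeof(demo_db[0]); ++i) {
        if (demo_db[i].name.jobid == name->jobid &&
            demo_db[i].name.vpid == name->vpid) {
            return demo_db[i].hostname;
        }
    }
    return NULL;
}

/* Wrapper inside the (hypothetical) RTE component. Because it receives the
 * proc rather than only the name, it can record the hostname as a side
 * effect. Returns 0 on success, -1 if the name is unknown. */
int demo_rte_recv(demo_proc_t *proc)
{
    assert(NULL != proc);
    const char *host = demo_db_fetch_hostname(&proc->name);
    if (NULL == host) {
        return -1;
    }
    proc->hostname = host;   /* store the pointer only; no copy is made */
    return 0;
}
```

The point of the design is that the caller's structure definition stays where it is; only the wrapper body needs to know how the lookup is done.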

HTH
Ralph


On Dec 18, 2013, at 1:50 PM, Thomas Naughton  wrote:

> Hi Ralph,
> 
> Question about the MPI-RTE interface change in r29931.  The change was not
> reflected in the "ompi/mca/rte/rte.h" file.
> 
> I'm curious how the newly added "struct ompi_proc_t" relates to the "struct 
> ompi_process_info_t" that is described in the "rte.h" file?
> 
> I understand the general motivation for the API change but it is less clear
> to me how the information previously defined in the header changes (or does
> not change)?
> 
> Thanks,
> --tjn
> 
> _
>  Thomas Naughton  naught...@ornl.gov
>  Research Associate   (865) 576-4184
> 
> 
> On Mon, 16 Dec 2013, svn-commit-mai...@open-mpi.org wrote:
> 
>> Author: rhc (Ralph Castain)
>> Date: 2013-12-16 22:26:00 EST (Mon, 16 Dec 2013)
>> New Revision: 29931
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/29931
>> 
>> Log:
>> Revert r29917 and replace it with a fix that resolves the thread deadlock
>> while retaining the desired debug info. In an earlier commit, we had changed
>> the modex as follows:
>> 
>> * automatically retrieve the hostname (and all RTE info) for all procs 
>> during MPI_Init if nprocs < cutoff
>> 
>> * if nprocs > cutoff, retrieve the hostname (and all RTE info) for a proc 
>> upon the first call to modex_recv for that proc. This would provide the 
>> hostname for debugging purposes as we only report errors on messages, and so 
>> we must have called modex_recv to get the endpoint info
>> 
>> * BTLs are not to call modex_recv until they need the endpoint info for the
>> first message, i.e., not during add_procs, so that we call it only for the
>> processes with which we communicate rather than for every process in the job
>> 
>> My understanding is that only some BTLs have been modified to meet that
>> third requirement, but those include the Cray ones, where jobs are big enough
>> that launch times were becoming an issue. Other BTLs would hopefully be
>> modified as time went on and interest in using them at scale arose.
>> In the meantime, the unmodified BTLs would call modex_recv on every proc, so
>> we would be no worse off than with the prior behavior.
>> 
>> This commit revises the MPI-RTE interface to pass the ompi_proc_t instead of 
>> the ompi_process_name_t for the proc so that the hostname can be easily 
>> inserted. I have advised the ORNL folks of the change.
>> 
>> cmr=v1.7.4:reviewer=jsquyres:subject=Fix thread deadlock
>> 
>> Text files modified:
>>  trunk/ompi/mca/rte/orte/rte_orte.h        |  7 ---
>>  trunk/ompi/mca/rte/orte/rte_orte_module.c | 27 ++-
>>  trunk/ompi/proc/proc.c                    | 26 ++
>>  trunk/ompi/runtime/ompi_module_exchange.c | 10 +-
>>  4 files changed, 49 insertions(+), 21 deletions(-)
> 



[OMPI devel] Q: MPI-RTE / ompi_proc_t vs. ompi_process_info_t ?

2013-12-18 Thread Thomas Naughton

Hi Ralph,

Question about the MPI-RTE interface change in r29931.  The change was not
reflected in the "ompi/mca/rte/rte.h" file.

I'm curious how the newly added "struct ompi_proc_t" relates to the 
"struct ompi_process_info_t" that is described in the "rte.h" file?


I understand the general motivation for the API change but it is less clear
to me how the information previously defined in the header changes (or does
not change)?

Thanks,
--tjn

 _
  Thomas Naughton  naught...@ornl.gov
  Research Associate   (865) 576-4184


On Mon, 16 Dec 2013, svn-commit-mai...@open-mpi.org wrote:


Author: rhc (Ralph Castain)
Date: 2013-12-16 22:26:00 EST (Mon, 16 Dec 2013)
New Revision: 29931
URL: https://svn.open-mpi.org/trac/ompi/changeset/29931

Log:
Revert r29917 and replace it with a fix that resolves the thread deadlock while
retaining the desired debug info. In an earlier commit, we had changed the
modex as follows:

* automatically retrieve the hostname (and all RTE info) for all procs during 
MPI_Init if nprocs < cutoff

* if nprocs > cutoff, retrieve the hostname (and all RTE info) for a proc upon 
the first call to modex_recv for that proc. This would provide the hostname for 
debugging purposes as we only report errors on messages, and so we must have 
called modex_recv to get the endpoint info

* BTLs are not to call modex_recv until they need the endpoint info for the
first message, i.e., not during add_procs, so that we call it only for the
processes with which we communicate rather than for every process in the job

My understanding is that only some BTLs have been modified to meet that third
requirement, but those include the Cray ones, where jobs are big enough that
launch times were becoming an issue. Other BTLs would hopefully be modified as
time went on and interest in using them at scale arose. In the meantime, the
unmodified BTLs would call modex_recv on every proc, so we would be no worse
off than with the prior behavior.

This commit revises the MPI-RTE interface to pass the ompi_proc_t instead of 
the ompi_process_name_t for the proc so that the hostname can be easily 
inserted. I have advised the ORNL folks of the change.

cmr=v1.7.4:reviewer=jsquyres:subject=Fix thread deadlock

Text files modified:
  trunk/ompi/mca/rte/orte/rte_orte.h        |  7 ---
  trunk/ompi/mca/rte/orte/rte_orte_module.c | 27 ++-
  trunk/ompi/proc/proc.c                    | 26 ++
  trunk/ompi/runtime/ompi_module_exchange.c | 10 +-
  4 files changed, 49 insertions(+), 21 deletions(-)