Re: [OMPI devel] Q: MPI-RTE / ompi_proc_t vs. ompi_process_info_t ?
Hi Ralph,

OK, thanks for clarification and code pointers. I'll update "rte.h" to reflect the updates.

Thanks,
--tjn

_
Thomas Naughton      naught...@ornl.gov
Research Associate   (865) 576-4184

On Wed, 18 Dec 2013, Ralph Castain wrote:

There is no relation at all between ompi_proc_t and ompi_process_info_t. The ompi_proc_t is defined in the MPI layer and is used in that layer in various places, very much like orte_proc_t is used in the ORTE layer.

If you look in ompi/mca/rte/orte/rte_orte.c, you'll see how we handle the revised function calls. Basically, we use the process name to retrieve the modex data via the opal_db, and then load a pointer to the hostname into the ompi_proc_t proc_hostname field. Thus, the definition of ompi_proc_t remains in the MPI layer. So there was no need to change the ompi/mca/rte/rte.h file, nor to #define anything in the component .h file - just have to modify the wrapper code inside the RTE component itself.

HTH
Ralph

On Dec 18, 2013, at 1:50 PM, Thomas Naughton wrote:

Hi Ralph,

Question about the MPI-RTE interface change in r29931. The change was not reflected in the "ompi/mca/rte/rte.h" file.

I'm curious how the newly added "struct ompi_proc_t" relates to the "struct ompi_process_info_t" that is described in the "rte.h" file?

I understand the general motivation for the API change but it is less clear to me how the information previously defined in the header changes (or does not change)?

Thanks,
--tjn

_
Thomas Naughton      naught...@ornl.gov
Research Associate   (865) 576-4184

On Mon, 16 Dec 2013, svn-commit-mai...@open-mpi.org wrote:

Author: rhc (Ralph Castain)
Date: 2013-12-16 22:26:00 EST (Mon, 16 Dec 2013)
New Revision: 29931
URL: https://svn.open-mpi.org/trac/ompi/changeset/29931

Log:
Revert r29917 and replace it with a fix that resolves the thread deadlock while retaining the desired debug info.

In an earlier commit, we had changed the modex accordingly:

* automatically retrieve the hostname (and all RTE info) for all procs during MPI_Init if nprocs < cutoff

* if nprocs > cutoff, retrieve the hostname (and all RTE info) for a proc upon the first call to modex_recv for that proc. This would provide the hostname for debugging purposes as we only report errors on messages, and so we must have called modex_recv to get the endpoint info

* BTLs are not to call modex_recv until they need the endpoint info for first message - i.e., not during add_procs so we don't call it for every process in the job, but only those with whom we communicate

My understanding is that only some BTLs have been modified to meet that third requirement, but those include the Cray ones where jobs are big enough that launch times were becoming an issue. Other BTLs would hopefully be modified as time went on and interest in using them at scale arose. Meantime, those BTLs would call modex_recv on every proc, and we would therefore be no worse than the prior behavior.

This commit revises the MPI-RTE interface to pass the ompi_proc_t instead of the ompi_process_name_t for the proc so that the hostname can be easily inserted. I have advised the ORNL folks of the change.

cmr=v1.7.4:reviewer=jsquyres:subject=Fix thread deadlock

Text files modified:
  trunk/ompi/mca/rte/orte/rte_orte.h        |  7 ---
  trunk/ompi/mca/rte/orte/rte_orte_module.c | 27 ++-
  trunk/ompi/proc/proc.c                    | 26 ++
  trunk/ompi/runtime/ompi_module_exchange.c | 10 +-
  4 files changed, 49 insertions(+), 21 deletions(-)

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] Recommended tool to measure packet counters
Ah, got it! Thanks
-- Sid

On 18 December 2013 07:44, Jeff Squyres (jsquyres) wrote:
> On Dec 14, 2013, at 8:02 AM, Siddhartha Jana wrote:
>
> > Is there a preferred method/tool among developers of MPI-library for checking the count of the packets transmitted by the network card during two-sided communication?
> >
> > Is the use of
> >   iptables -I INPUT -i eth0
> >   iptables -I OUTPUT -o eth0
> > recommended ?
>
> If you're using an ethernet, non-OS-bypass transport (e.g., TCP), you might also want to look at ethtool.
>
> Note that these counts will include control messages sent by Open MPI, too -- not just raw MPI traffic. They also will not include any traffic sent across shared memory (or other transports).
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] RFC: remove opal progress recursion depth counter
Oops, yeah. Meant 1.7.5. If people agree with this change I could possibly slip it in before Friday, but that is unlikely.

On Wed, Dec 18, 2013 at 03:32:36PM -0800, Ralph Castain wrote:
> 1.7.4 is leaving the station on Fri, Nathan, so next Tues => will have to go into 1.7.5
>
> On Dec 18, 2013, at 3:23 PM, Nathan Hjelm wrote:
>
> > What: Remove the opal_progress_recursion_depth_counter from opal_progress.
> >
> > Why: This counter adds two atomic adds to the critical path when OPAL_HAVE_THREADS is set (which is the case for most builds). I grepped through ompi, orte, and opal to find where this value was being used and did not find anything either inside or outside opal_progress.
> >
> > When: I want this change to go into 1.7.4 (if possible) so setting a quick timeout for next Tuesday.
> >
> > Let me know if there is a good reason to keep this counter and it will be spared.
> >
> > -Nathan Hjelm
> > HPC-5, LANL
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] RFC: remove opal progress recursion depth counter
1.7.4 is leaving the station on Fri, Nathan, so next Tues => will have to go into 1.7.5

On Dec 18, 2013, at 3:23 PM, Nathan Hjelm wrote:

> What: Remove the opal_progress_recursion_depth_counter from opal_progress.
>
> Why: This counter adds two atomic adds to the critical path when OPAL_HAVE_THREADS is set (which is the case for most builds). I grepped through ompi, orte, and opal to find where this value was being used and did not find anything either inside or outside opal_progress.
>
> When: I want this change to go into 1.7.4 (if possible) so setting a quick timeout for next Tuesday.
>
> Let me know if there is a good reason to keep this counter and it will be spared.
>
> -Nathan Hjelm
> HPC-5, LANL
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
[OMPI devel] RFC: remove opal progress recursion depth counter
What: Remove the opal_progress_recursion_depth_counter from opal_progress.

Why: This counter adds two atomic adds to the critical path when OPAL_HAVE_THREADS is set (which is the case for most builds). I grepped through ompi, orte, and opal to find where this value was being used and did not find anything either inside or outside opal_progress.

When: I want this change to go into 1.7.4 (if possible) so setting a quick timeout for next Tuesday.

Let me know if there is a good reason to keep this counter and it will be spared.

-Nathan Hjelm
HPC-5, LANL
Re: [OMPI devel] Q: MPI-RTE / ompi_proc_t vs. ompi_process_info_t ?
There is no relation at all between ompi_proc_t and ompi_process_info_t. The ompi_proc_t is defined in the MPI layer and is used in that layer in various places, very much like orte_proc_t is used in the ORTE layer.

If you look in ompi/mca/rte/orte/rte_orte.c, you'll see how we handle the revised function calls. Basically, we use the process name to retrieve the modex data via the opal_db, and then load a pointer to the hostname into the ompi_proc_t proc_hostname field. Thus, the definition of ompi_proc_t remains in the MPI layer. So there was no need to change the ompi/mca/rte/rte.h file, nor to #define anything in the component .h file - just have to modify the wrapper code inside the RTE component itself.

HTH
Ralph

On Dec 18, 2013, at 1:50 PM, Thomas Naughton wrote:

> Hi Ralph,
>
> Question about the MPI-RTE interface change in r29931. The change was not reflected in the "ompi/mca/rte/rte.h" file.
>
> I'm curious how the newly added "struct ompi_proc_t" relates to the "struct ompi_process_info_t" that is described in the "rte.h" file?
>
> I understand the general motivation for the API change but it is less clear to me how the information previously defined in the header changes (or does not change)?
>
> Thanks,
> --tjn
>
> _
> Thomas Naughton      naught...@ornl.gov
> Research Associate   (865) 576-4184
>
> On Mon, 16 Dec 2013, svn-commit-mai...@open-mpi.org wrote:
>
>> Author: rhc (Ralph Castain)
>> Date: 2013-12-16 22:26:00 EST (Mon, 16 Dec 2013)
>> New Revision: 29931
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/29931
>>
>> Log:
>> Revert r29917 and replace it with a fix that resolves the thread deadlock while retaining the desired debug info.
>>
>> In an earlier commit, we had changed the modex accordingly:
>>
>> * automatically retrieve the hostname (and all RTE info) for all procs during MPI_Init if nprocs < cutoff
>>
>> * if nprocs > cutoff, retrieve the hostname (and all RTE info) for a proc upon the first call to modex_recv for that proc. This would provide the hostname for debugging purposes as we only report errors on messages, and so we must have called modex_recv to get the endpoint info
>>
>> * BTLs are not to call modex_recv until they need the endpoint info for first message - i.e., not during add_procs so we don't call it for every process in the job, but only those with whom we communicate
>>
>> My understanding is that only some BTLs have been modified to meet that third requirement, but those include the Cray ones where jobs are big enough that launch times were becoming an issue. Other BTLs would hopefully be modified as time went on and interest in using them at scale arose. Meantime, those BTLs would call modex_recv on every proc, and we would therefore be no worse than the prior behavior.
>>
>> This commit revises the MPI-RTE interface to pass the ompi_proc_t instead of the ompi_process_name_t for the proc so that the hostname can be easily inserted. I have advised the ORNL folks of the change.
>>
>> cmr=v1.7.4:reviewer=jsquyres:subject=Fix thread deadlock
>>
>> Text files modified:
>>   trunk/ompi/mca/rte/orte/rte_orte.h        |  7 ---
>>   trunk/ompi/mca/rte/orte/rte_orte_module.c | 27 ++-
>>   trunk/ompi/proc/proc.c                    | 26 ++
>>   trunk/ompi/runtime/ompi_module_exchange.c | 10 +-
>>   4 files changed, 49 insertions(+), 21 deletions(-)
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
[OMPI devel] Q: MPI-RTE / ompi_proc_t vs. ompi_process_info_t ?
Hi Ralph,

Question about the MPI-RTE interface change in r29931. The change was not reflected in the "ompi/mca/rte/rte.h" file.

I'm curious how the newly added "struct ompi_proc_t" relates to the "struct ompi_process_info_t" that is described in the "rte.h" file?

I understand the general motivation for the API change but it is less clear to me how the information previously defined in the header changes (or does not change)?

Thanks,
--tjn

_
Thomas Naughton      naught...@ornl.gov
Research Associate   (865) 576-4184

On Mon, 16 Dec 2013, svn-commit-mai...@open-mpi.org wrote:

Author: rhc (Ralph Castain)
Date: 2013-12-16 22:26:00 EST (Mon, 16 Dec 2013)
New Revision: 29931
URL: https://svn.open-mpi.org/trac/ompi/changeset/29931

Log:
Revert r29917 and replace it with a fix that resolves the thread deadlock while retaining the desired debug info.

In an earlier commit, we had changed the modex accordingly:

* automatically retrieve the hostname (and all RTE info) for all procs during MPI_Init if nprocs < cutoff

* if nprocs > cutoff, retrieve the hostname (and all RTE info) for a proc upon the first call to modex_recv for that proc. This would provide the hostname for debugging purposes as we only report errors on messages, and so we must have called modex_recv to get the endpoint info

* BTLs are not to call modex_recv until they need the endpoint info for first message - i.e., not during add_procs so we don't call it for every process in the job, but only those with whom we communicate

My understanding is that only some BTLs have been modified to meet that third requirement, but those include the Cray ones where jobs are big enough that launch times were becoming an issue. Other BTLs would hopefully be modified as time went on and interest in using them at scale arose. Meantime, those BTLs would call modex_recv on every proc, and we would therefore be no worse than the prior behavior.

This commit revises the MPI-RTE interface to pass the ompi_proc_t instead of the ompi_process_name_t for the proc so that the hostname can be easily inserted. I have advised the ORNL folks of the change.

cmr=v1.7.4:reviewer=jsquyres:subject=Fix thread deadlock

Text files modified:
  trunk/ompi/mca/rte/orte/rte_orte.h        |  7 ---
  trunk/ompi/mca/rte/orte/rte_orte_module.c | 27 ++-
  trunk/ompi/proc/proc.c                    | 26 ++
  trunk/ompi/runtime/ompi_module_exchange.c | 10 +-
  4 files changed, 49 insertions(+), 21 deletions(-)
[OMPI devel] Problem with memory in mpi program
My program uses MPI together with OpenMP, and even this small sample program takes a lot of memory. I don't know how much RAM an MPI program normally consumes, and I want to know whether MPI uses a lot of memory when combined with OpenMP, or whether I am doing something wrong. To measure the RAM used by my program I read the file /proc/id_proc/stat, where id_proc is the id of my process.

This is my example program (the angle-bracket header names and the vector's template argument were eaten by the archiver and are restored here as the likely originals; the last argument of MPI_Init_thread is corrected from &argc to a proper "provided" variable):

#include <mpi.h>
#include <omp.h>
#include <vector>

int main(int argc, char** argv)
{
    int my_rank;   /* rank of process */
    int p;         /* number of processes */
    int provided;  /* thread level actually provided */

    MPI_Init_thread(&argc, &argv, MPI::THREAD_MULTIPLE, &provided);
    /* find out process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    /* find out number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    char cad[4];
    MPI_Status status;

    omp_set_num_threads(2);
    #pragma omp parallel
    {
        int h = omp_get_thread_num();
        if (h == 0) {
            MPI_Send(&cad, 1, MPI::CHAR, my_rank, 11, MPI_COMM_WORLD);
        } else {
            std::vector<int> all(2, 0);
            MPI_Recv(&cad, 2, MPI::CHAR, MPI::ANY_SOURCE, MPI::ANY_TAG,
                     MPI_COMM_WORLD, &status);
        }
    }

    /* shut down MPI */
    MPI_Finalize();
    return 0;
}

Compile:

  mpic++ -fopenmp -fno-threadsafe-statics -o sample_program sample_program.c

Run:

  mpirun sample_program

and the memory consumed: 190MB. Please, I need help; it is very important for me to get low memory consumption.
Re: [OMPI devel] Bus error with openmpi-1.7.4rc1 on Solaris
Found the problem. Was accessing a boolean variable using intval. That is a bug that has gone unnoticed on all platforms, but thankfully Solaris caught it. Please try the attached patch.

-Nathan

On Wed, Dec 18, 2013 at 12:27:29PM +0100, Siegmar Gross wrote:
> Hi,
>
> today I installed openmpi-1.7.4rc1 on Solaris 10 Sparc with Sun
> C 5.12. Unfortunately my problems with bus errors, which I reported
> December 4th for openmpi-1.7.4a1r29784 at us...@open-mpi.org, are
> not solved yet. Has somebody time to look into that matter or is
> Solaris support abandoned, so that I have to stay with openmpi-1.6.x
> in the future? Thank you very much for any help in advance.
>
> Kind regards
> Siegmar
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

diff --git a/opal/mca/base/mca_base_var.c b/opal/mca/base/mca_base_var.c
index 7b55eb8..c043c06 100644
--- a/opal/mca/base/mca_base_var.c
+++ b/opal/mca/base/mca_base_var.c
@@ -1682,7 +1682,11 @@ static int var_value_string (mca_base_var_t *var, char **value_string)
         ret = (0 > ret) ? OPAL_ERR_OUT_OF_RESOURCE : OPAL_SUCCESS;
     } else {
-        ret = var->mbv_enumerator->string_from_value(var->mbv_enumerator, value->intval, &tmp);
+        if (MCA_BASE_VAR_TYPE_BOOL == var->mbv_type) {
+            ret = var->mbv_enumerator->string_from_value(var->mbv_enumerator, value->boolval, &tmp);
+        } else {
+            ret = var->mbv_enumerator->string_from_value(var->mbv_enumerator, value->intval, &tmp);
+        }

         *value_string = strdup (tmp);
         if (NULL == value_string) {
Re: [OMPI devel] Bus error with openmpi-1.7.4rc1 on Solaris
Siegmar --

Thanks for keeping us honest! I just filed three tickets with the issues you reported:

https://svn.open-mpi.org/trac/ompi/ticket/3988
https://svn.open-mpi.org/trac/ompi/ticket/3989
https://svn.open-mpi.org/trac/ompi/ticket/3990

On Dec 18, 2013, at 6:27 AM, Siegmar Gross wrote:

> Hi,
>
> today I installed openmpi-1.7.4rc1 on Solaris 10 Sparc with Sun
> C 5.12. Unfortunately my problems with bus errors, which I reported
> December 4th for openmpi-1.7.4a1r29784 at us...@open-mpi.org, are
> not solved yet. Has somebody time to look into that matter or is
> Solaris support abandoned, so that I have to stay with openmpi-1.6.x
> in the future? Thank you very much for any help in advance.
>
> Kind regards
> Siegmar
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] [PATCH v2 2/2] Trying to get the C/R code to compile again. (send_*_nb)
In the case of the send, there really isn't any problem with just replacing things - the non-blocking change won't impact anything, so no need to retain the old code. People were only concerned about the recv's as those places will require further repair, and they wanted to ensure we know where those places are located. You also need to change those comparisons, however, as the return code isn't the number of bytes sent any more - it is just ORTE_SUCCESS or else an error code, so you should be testing for ORTE_SUCCESS == On Dec 18, 2013, at 6:42 AM, Adrian Reber wrote: > From: Adrian Reber > > This patch changes all send/send_buffer occurrences in the C/R code > to send_nb/send_buffer_nb. > The old code is still there but disabled using ifdefs (ENABLE_FT_FIXED). > The new code compiles but does not work. > > Changes from V1: > * #ifdef out the code (so it is preserved for later re-design) > * marked the broken C/R code with ENABLE_FT_FIXED > > Signed-off-by: Adrian Reber > --- > ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c| 18 +++ > orte/mca/errmgr/base/errmgr_base_tool.c | 4 ++ > orte/mca/rml/ftrm/rml_ftrm.h| 19 > orte/mca/rml/ftrm/rml_ftrm_component.c | 2 - > orte/mca/rml/ftrm/rml_ftrm_module.c | 63 + > orte/mca/snapc/full/snapc_full_app.c| 20 > orte/mca/snapc/full/snapc_full_global.c | 12 + > orte/mca/snapc/full/snapc_full_local.c | 4 ++ > orte/mca/sstore/central/sstore_central_app.c| 8 > orte/mca/sstore/central/sstore_central_global.c | 4 ++ > orte/mca/sstore/central/sstore_central_local.c | 12 + > orte/mca/sstore/stage/sstore_stage_app.c| 8 > orte/mca/sstore/stage/sstore_stage_global.c | 4 ++ > orte/mca/sstore/stage/sstore_stage_local.c | 16 +++ > orte/tools/orte-checkpoint/orte-checkpoint.c| 4 ++ > orte/tools/orte-migrate/orte-migrate.c | 4 ++ > 16 files changed, 130 insertions(+), 72 deletions(-) > > diff --git a/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c > b/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c > index cba7586..4f7bd7f 100644 > --- a/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c 
> +++ b/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c > @@ -5102,7 +5102,11 @@ static int wait_quiesce_drained(void) > PACK_BUFFER(buffer, response, 1, OPAL_SIZE, ""); > > /* JJH - Performance Optimization? - Why not post all isends, > then wait? */ > +#ifdef ENABLE_FT_FIXED > +/* This is the old, now broken code */ > if ( 0 > ( ret = ompi_rte_send_buffer(&(cur_peer_ref->proc_name), > buffer, OMPI_CRCP_COORD_BOOKMARK_TAG, 0)) ) { > +#endif /* ENABLE_FT_FIXED */ > +if ( 0 > ( ret = > ompi_rte_send_buffer_nb(&(cur_peer_ref->proc_name), buffer, > OMPI_CRCP_COORD_BOOKMARK_TAG, orte_rml_send_callback, NULL)) ) { > exit_status = ret; > goto cleanup; > } > @@ -5303,7 +5307,11 @@ static int send_bookmarks(int peer_idx) > PACK_BUFFER(buffer, (peer_ref->total_msgs_recvd), 1, OPAL_UINT32, > "crcp:bkmrk: send_bookmarks: Unable to pack > total_msgs_recvd"); > > +#ifdef ENABLE_FT_FIXED > +/* This is the old, now broken code */ > if ( 0 > ( ret = ompi_rte_send_buffer(&peer_name, buffer, > OMPI_CRCP_COORD_BOOKMARK_TAG, 0)) ) { > +#endif /* ENABLE_FT_FIXED */ > +if ( 0 > ( ret = ompi_rte_send_buffer_nb(&peer_name, buffer, > OMPI_CRCP_COORD_BOOKMARK_TAG, orte_rml_send_callback, NULL)) ) { > opal_output(mca_crcp_bkmrk_component.super.output_handle, > "crcp:bkmrk: send_bookmarks: Failed to send bookmark to > peer %s: Return %d\n", > OMPI_NAME_PRINT(&peer_name), > @@ -5599,8 +5607,13 @@ static int > do_send_msg_detail(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref, > /* > * Do the send... 
> */ > +#ifdef ENABLE_FT_FIXED > +/* This is the old, now broken code */ > if ( 0 > ( ret = ompi_rte_send_buffer(&peer_ref->proc_name, buffer, > OMPI_CRCP_COORD_BOOKMARK_TAG, 0)) ) > { > +#endif /* ENABLE_FT_FIXED */ > +if ( 0 > ( ret = ompi_rte_send_buffer_nb(&peer_ref->proc_name, buffer, > + OMPI_CRCP_COORD_BOOKMARK_TAG, > orte_rml_send_callback, NULL)) ) { > opal_output(mca_crcp_bkmrk_component.super.output_handle, > "crcp:bkmrk: do_send_msg_detail: Unable to send message > details to peer %s: Return %d\n", > OMPI_NAME_PRINT(&peer_ref->proc_name), > @@ -6217,8 +6230,13 @@ static int > do_recv_msg_detail_resp(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref, > "crcp:bkmrk: recv_msg_details: Unable to ask peer for more > messages"); > PACK_BUFFER(buffer, total_found, 1, OPAL_UINT32, > "crcp:bkmrk: recv_msg_details: Unable to ask peer for more > messages"); > +#ifdef ENABLE_FT_FIXED > +/* This
Re: [OMPI devel] [PATCH v2 1/2] Trying to get the C/R code to compile again. (recv_*_nb)
Hi Adrian No point in keeping the old code for those places where you update the syntax of a non-blocking recv (i.e., you remove the no-longer-reqd extra param). I would only keep it where you have to replace a blocking recv with a non-blocking one as that is where the behavior will change. Other than that, it looks fine to me. On Dec 18, 2013, at 6:42 AM, Adrian Reber wrote: > From: Adrian Reber > > This patch changes all recv/recv_buffer occurrences in the C/R code > to recv_nb/recv_buffer_nb. > The old code is still there but disabled using ifdefs (ENABLE_FT_FIXED). > The new code compiles but does not work. > > Changes from V1: > * #ifdef out the code (so it is preserved for later re-design) > * marked the broken C/R code with ENABLE_FT_FIXED > > Signed-off-by: Adrian Reber > --- > ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c| 19 ++ > orte/mca/errmgr/base/errmgr_base_tool.c | 6 +- > orte/mca/rml/ftrm/rml_ftrm.h| 27 ++--- > orte/mca/rml/ftrm/rml_ftrm_component.c | 2 - > orte/mca/rml/ftrm/rml_ftrm_module.c | 78 +++-- > orte/mca/snapc/full/snapc_full_app.c| 12 > orte/mca/snapc/full/snapc_full_global.c | 25 > orte/mca/snapc/full/snapc_full_local.c | 24 > orte/mca/sstore/central/sstore_central_app.c| 6 ++ > orte/mca/sstore/central/sstore_central_global.c | 11 ++-- > orte/mca/sstore/central/sstore_central_local.c | 11 ++-- > orte/mca/sstore/stage/sstore_stage_app.c| 5 ++ > orte/mca/sstore/stage/sstore_stage_global.c | 11 ++-- > orte/mca/sstore/stage/sstore_stage_local.c | 11 ++-- > orte/tools/orte-checkpoint/orte-checkpoint.c| 9 ++- > orte/tools/orte-migrate/orte-migrate.c | 9 ++- > 16 files changed, 124 insertions(+), 142 deletions(-) > > diff --git a/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c > b/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c > index 5d4005f..cba7586 100644 > --- a/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c > +++ b/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c > @@ -4739,6 +4739,8 @@ static int ft_event_post_drain_acks(void) > drain_msg_ack = 
(ompi_crcp_bkmrk_pml_drain_message_ack_ref_t*)item; > > /* Post the receive */ > +#ifdef ENABLE_FT_FIXED > +/* This is the old, now broken code */ > if( OMPI_SUCCESS != (ret = ompi_rte_recv_buffer_nb( > &drain_msg_ack->peer, > > OMPI_CRCP_COORD_BOOKMARK_TAG, > 0, > @@ -4750,6 +4752,9 @@ static int ft_event_post_drain_acks(void) > OMPI_NAME_PRINT(&(drain_msg_ack->peer))); > return ret; > } > +#endif /* ENABLE_FT_FIXED */ > +ompi_rte_recv_buffer_nb(&drain_msg_ack->peer, > OMPI_CRCP_COORD_BOOKMARK_TAG, > +0, drain_message_ack_cbfunc, NULL); > } > > return OMPI_SUCCESS; > @@ -5330,6 +5335,8 @@ static int recv_bookmarks(int peer_idx) > peer_name.jobid = OMPI_PROC_MY_NAME->jobid; > peer_name.vpid = peer_idx; > > +#ifdef ENABLE_FT_FIXED > +/* This is the old, now broken code */ > if ( 0 > (ret = ompi_rte_recv_buffer_nb(&peer_name, > OMPI_CRCP_COORD_BOOKMARK_TAG, > 0, > @@ -5342,6 +5349,9 @@ static int recv_bookmarks(int peer_idx) > exit_status = ret; > goto cleanup; > } > +#endif /* ENABLE_FT_FIXED */ > +ompi_rte_recv_buffer_nb(&peer_name, OMPI_CRCP_COORD_BOOKMARK_TAG, > + 0, recv_bookmarks_cbfunc, NULL); > > ++total_recv_bookmarks; > > @@ -5616,6 +5626,8 @@ static int > do_send_msg_detail(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref, > /* > * Recv the ACK msg > */ > +#ifdef ENABLE_FT_FIXED > +/* This is the old, now broken code */ > if ( 0 > (ret = ompi_rte_recv_buffer(&peer_ref->proc_name, buffer, > OMPI_CRCP_COORD_BOOKMARK_TAG, 0) ) ) > { > opal_output(mca_crcp_bkmrk_component.super.output_handle, > @@ -5626,6 +5638,9 @@ static int > do_send_msg_detail(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref, > exit_status = ret; > goto cleanup; > } > +#endif /* ENABLE_FT_FIXED */ > +ompi_rte_recv_buffer_nb(&peer_ref->proc_name, > OMPI_CRCP_COORD_BOOKMARK_TAG, 0, > +orte_rml_recv_callback, NULL); > > UNPACK_BUFFER(buffer, recv_response, 1, OPAL_UINT32, > "crcp:bkmrk: send_msg_details: Failed to unpack the ACK > from peer buffer."); > @@ -5790,6 +5805,8 @@ static int > 
do_recv_msg_detail(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref, > /* > * Recv the msg > */ > +#ifdef ENABLE_FT_FIXED > +/* This is the old, now broken code */ > if ( 0 > (ret = ompi_rte_recv_buffer(&peer_ref->proc_name, buffer, > OMPI_CRCP_COORD_BOOKMARK_TAG, 0) ) ) { > opal
[OMPI devel] [PATCH v2 0/2] Trying to get the C/R code to compile again
From: Adrian Reber

This is the second try to replace the usage of blocking send and recv in the C/R code with the non-blocking versions. The new code compiles (in contrast to the old code) but does not work yet. This is the first step to get the C/R code working again. Right now it only compiles.

Changes from V1:
* #ifdef out the broken code (so it is preserved for later re-design)
* marked the broken C/R code with ENABLE_FT_FIXED

Adrian Reber (2):
  Trying to get the C/R code to compile again. (recv_*_nb)
  Trying to get the C/R code to compile again. (send_*_nb)

 ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c            |  37 +++
 orte/mca/errmgr/base/errmgr_base_tool.c         |  10 +-
 orte/mca/rml/ftrm/rml_ftrm.h                    |  46 +---
 orte/mca/rml/ftrm/rml_ftrm_component.c          |   4 -
 orte/mca/rml/ftrm/rml_ftrm_module.c             | 141
 orte/mca/snapc/full/snapc_full_app.c            |  32 ++
 orte/mca/snapc/full/snapc_full_global.c         |  37 +--
 orte/mca/snapc/full/snapc_full_local.c          |  28 +++--
 orte/mca/sstore/central/sstore_central_app.c    |  14 +++
 orte/mca/sstore/central/sstore_central_global.c |  15 ++-
 orte/mca/sstore/central/sstore_central_local.c  |  23 +++-
 orte/mca/sstore/stage/sstore_stage_app.c        |  13 +++
 orte/mca/sstore/stage/sstore_stage_global.c     |  15 ++-
 orte/mca/sstore/stage/sstore_stage_local.c      |  27 -
 orte/tools/orte-checkpoint/orte-checkpoint.c    |  13 ++-
 orte/tools/orte-migrate/orte-migrate.c          |  13 ++-
 16 files changed, 254 insertions(+), 214 deletions(-)

--
1.8.4.2
[OMPI devel] [PATCH v2 2/2] Trying to get the C/R code to compile again. (send_*_nb)
From: Adrian Reber This patch changes all send/send_buffer occurrences in the C/R code to send_nb/send_buffer_nb. The old code is still there but disabled using ifdefs (ENABLE_FT_FIXED). The new code compiles but does not work. Changes from V1: * #ifdef out the code (so it is preserved for later re-design) * marked the broken C/R code with ENABLE_FT_FIXED Signed-off-by: Adrian Reber --- ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c| 18 +++ orte/mca/errmgr/base/errmgr_base_tool.c | 4 ++ orte/mca/rml/ftrm/rml_ftrm.h| 19 orte/mca/rml/ftrm/rml_ftrm_component.c | 2 - orte/mca/rml/ftrm/rml_ftrm_module.c | 63 + orte/mca/snapc/full/snapc_full_app.c| 20 orte/mca/snapc/full/snapc_full_global.c | 12 + orte/mca/snapc/full/snapc_full_local.c | 4 ++ orte/mca/sstore/central/sstore_central_app.c| 8 orte/mca/sstore/central/sstore_central_global.c | 4 ++ orte/mca/sstore/central/sstore_central_local.c | 12 + orte/mca/sstore/stage/sstore_stage_app.c| 8 orte/mca/sstore/stage/sstore_stage_global.c | 4 ++ orte/mca/sstore/stage/sstore_stage_local.c | 16 +++ orte/tools/orte-checkpoint/orte-checkpoint.c| 4 ++ orte/tools/orte-migrate/orte-migrate.c | 4 ++ 16 files changed, 130 insertions(+), 72 deletions(-) diff --git a/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c b/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c index cba7586..4f7bd7f 100644 --- a/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c +++ b/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c @@ -5102,7 +5102,11 @@ static int wait_quiesce_drained(void) PACK_BUFFER(buffer, response, 1, OPAL_SIZE, ""); /* JJH - Performance Optimization? - Why not post all isends, then wait? 
*/ +#ifdef ENABLE_FT_FIXED +/* This is the old, now broken code */ if ( 0 > ( ret = ompi_rte_send_buffer(&(cur_peer_ref->proc_name), buffer, OMPI_CRCP_COORD_BOOKMARK_TAG, 0)) ) { +#endif /* ENABLE_FT_FIXED */ +if ( 0 > ( ret = ompi_rte_send_buffer_nb(&(cur_peer_ref->proc_name), buffer, OMPI_CRCP_COORD_BOOKMARK_TAG, orte_rml_send_callback, NULL)) ) { exit_status = ret; goto cleanup; } @@ -5303,7 +5307,11 @@ static int send_bookmarks(int peer_idx) PACK_BUFFER(buffer, (peer_ref->total_msgs_recvd), 1, OPAL_UINT32, "crcp:bkmrk: send_bookmarks: Unable to pack total_msgs_recvd"); +#ifdef ENABLE_FT_FIXED +/* This is the old, now broken code */ if ( 0 > ( ret = ompi_rte_send_buffer(&peer_name, buffer, OMPI_CRCP_COORD_BOOKMARK_TAG, 0)) ) { +#endif /* ENABLE_FT_FIXED */ +if ( 0 > ( ret = ompi_rte_send_buffer_nb(&peer_name, buffer, OMPI_CRCP_COORD_BOOKMARK_TAG, orte_rml_send_callback, NULL)) ) { opal_output(mca_crcp_bkmrk_component.super.output_handle, "crcp:bkmrk: send_bookmarks: Failed to send bookmark to peer %s: Return %d\n", OMPI_NAME_PRINT(&peer_name), @@ -5599,8 +5607,13 @@ static int do_send_msg_detail(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref, /* * Do the send... 
*/ +#ifdef ENABLE_FT_FIXED +/* This is the old, now broken code */ if ( 0 > ( ret = ompi_rte_send_buffer(&peer_ref->proc_name, buffer, OMPI_CRCP_COORD_BOOKMARK_TAG, 0)) ) { +#endif /* ENABLE_FT_FIXED */ +if ( 0 > ( ret = ompi_rte_send_buffer_nb(&peer_ref->proc_name, buffer, + OMPI_CRCP_COORD_BOOKMARK_TAG, orte_rml_send_callback, NULL)) ) { opal_output(mca_crcp_bkmrk_component.super.output_handle, "crcp:bkmrk: do_send_msg_detail: Unable to send message details to peer %s: Return %d\n", OMPI_NAME_PRINT(&peer_ref->proc_name), @@ -6217,8 +6230,13 @@ static int do_recv_msg_detail_resp(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref, "crcp:bkmrk: recv_msg_details: Unable to ask peer for more messages"); PACK_BUFFER(buffer, total_found, 1, OPAL_UINT32, "crcp:bkmrk: recv_msg_details: Unable to ask peer for more messages"); +#ifdef ENABLE_FT_FIXED +/* This is the old, now broken code */ if ( 0 > ( ret = ompi_rte_send_buffer(&peer_ref->proc_name, buffer, OMPI_CRCP_COORD_BOOKMARK_TAG, 0)) ) { +#endif /* ENABLE_FT_FIXED */ + +if ( 0 > ( ret = ompi_rte_send_buffer_nb(&peer_ref->proc_name, buffer, OMPI_CRCP_COORD_BOOKMARK_TAG, orte_rml_send_callback, NULL)) ) { opal_output(mca_crcp_bkmrk_component.super.output_handle, "crcp:bkmrk: recv_msg_detail_resp: Unable to send message detail response to peer %s: Return %d\n", OMPI_NAME_PRINT(&peer_ref->proc_name), diff --git a/orte/mca/errmgr/base/errmgr_base_tool.c b/orte/mca/errmgr/base/errmgr_base_tool.c index b982e46..e274bae 100644 --- a/orte/mca/errmgr/base/errmgr_base_tool.c +
[OMPI devel] [PATCH v2 1/2] Trying to get the C/R code to compile again. (recv_*_nb)
From: Adrian Reber

This patch changes all recv/recv_buffer occurrences in the C/R code
to recv_nb/recv_buffer_nb. The old code is still there but disabled
using ifdefs (ENABLE_FT_FIXED). The new code compiles but does not work.

Changes from V1:
 * #ifdef out the code (so it is preserved for later re-design)
 * marked the broken C/R code with ENABLE_FT_FIXED

Signed-off-by: Adrian Reber
---
 ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c            | 19 ++
 orte/mca/errmgr/base/errmgr_base_tool.c         |  6 +-
 orte/mca/rml/ftrm/rml_ftrm.h                    | 27 ++---
 orte/mca/rml/ftrm/rml_ftrm_component.c          |  2 -
 orte/mca/rml/ftrm/rml_ftrm_module.c             | 78 +++--
 orte/mca/snapc/full/snapc_full_app.c            | 12
 orte/mca/snapc/full/snapc_full_global.c         | 25
 orte/mca/snapc/full/snapc_full_local.c          | 24
 orte/mca/sstore/central/sstore_central_app.c    |  6 ++
 orte/mca/sstore/central/sstore_central_global.c | 11 ++--
 orte/mca/sstore/central/sstore_central_local.c  | 11 ++--
 orte/mca/sstore/stage/sstore_stage_app.c        |  5 ++
 orte/mca/sstore/stage/sstore_stage_global.c     | 11 ++--
 orte/mca/sstore/stage/sstore_stage_local.c      | 11 ++--
 orte/tools/orte-checkpoint/orte-checkpoint.c    |  9 ++-
 orte/tools/orte-migrate/orte-migrate.c          |  9 ++-
 16 files changed, 124 insertions(+), 142 deletions(-)

diff --git a/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c b/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c
index 5d4005f..cba7586 100644
--- a/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c
+++ b/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c
@@ -4739,6 +4739,8 @@ static int ft_event_post_drain_acks(void)
     drain_msg_ack = (ompi_crcp_bkmrk_pml_drain_message_ack_ref_t*)item;

     /* Post the receive */
+#ifdef ENABLE_FT_FIXED
+/* This is the old, now broken code */
     if( OMPI_SUCCESS != (ret = ompi_rte_recv_buffer_nb( &drain_msg_ack->peer,
                                                         OMPI_CRCP_COORD_BOOKMARK_TAG,
                                                         0,
@@ -4750,6 +4752,9 @@ static int ft_event_post_drain_acks(void)
                     OMPI_NAME_PRINT(&(drain_msg_ack->peer)));
         return ret;
     }
+#endif /* ENABLE_FT_FIXED */
+ompi_rte_recv_buffer_nb(&drain_msg_ack->peer, OMPI_CRCP_COORD_BOOKMARK_TAG,
+                        0, drain_message_ack_cbfunc, NULL);
 }

 return OMPI_SUCCESS;
@@ -5330,6 +5335,8 @@ static int recv_bookmarks(int peer_idx)
     peer_name.jobid = OMPI_PROC_MY_NAME->jobid;
     peer_name.vpid  = peer_idx;

+#ifdef ENABLE_FT_FIXED
+/* This is the old, now broken code */
     if ( 0 > (ret = ompi_rte_recv_buffer_nb(&peer_name,
                                             OMPI_CRCP_COORD_BOOKMARK_TAG,
                                             0,
@@ -5342,6 +5349,9 @@ static int recv_bookmarks(int peer_idx)
         exit_status = ret;
         goto cleanup;
     }
+#endif /* ENABLE_FT_FIXED */
+ompi_rte_recv_buffer_nb(&peer_name, OMPI_CRCP_COORD_BOOKMARK_TAG,
+                        0, recv_bookmarks_cbfunc, NULL);

     ++total_recv_bookmarks;

@@ -5616,6 +5626,8 @@ static int do_send_msg_detail(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref,
     /*
      * Recv the ACK msg
      */
+#ifdef ENABLE_FT_FIXED
+/* This is the old, now broken code */
     if ( 0 > (ret = ompi_rte_recv_buffer(&peer_ref->proc_name, buffer,
                                          OMPI_CRCP_COORD_BOOKMARK_TAG, 0) ) ) {
         opal_output(mca_crcp_bkmrk_component.super.output_handle,
@@ -5626,6 +5638,9 @@ static int do_send_msg_detail(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref,
         exit_status = ret;
         goto cleanup;
     }
+#endif /* ENABLE_FT_FIXED */
+ompi_rte_recv_buffer_nb(&peer_ref->proc_name, OMPI_CRCP_COORD_BOOKMARK_TAG, 0,
+                        orte_rml_recv_callback, NULL);

     UNPACK_BUFFER(buffer, recv_response, 1, OPAL_UINT32,
                   "crcp:bkmrk: send_msg_details: Failed to unpack the ACK from peer buffer.");
@@ -5790,6 +5805,8 @@ static int do_recv_msg_detail(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref,
     /*
      * Recv the msg
      */
+#ifdef ENABLE_FT_FIXED
+/* This is the old, now broken code */
     if ( 0 > (ret = ompi_rte_recv_buffer(&peer_ref->proc_name, buffer,
                                          OMPI_CRCP_COORD_BOOKMARK_TAG, 0) ) ) {
         opal_output(mca_crcp_bkmrk_component.super.output_handle,
                     "crcp:bkmrk: do_recv_msg_detail: %s <-- %s Failed to receive buffer from peer. Return %d\n",
@@ -5799,6 +5816,8 @@ static int do_recv_msg_detail(ompi_crcp_bkmrk_pml_peer_ref_t *peer_ref,
         exit_status = ret;
         goto cleanup;
     }
+#endif /* ENABLE_FT_FIXED */
+ompi_rte_recv_buffer_nb(&peer_ref->proc_name, OMPI_CRCP_COORD_BOOKMARK_TAG, 0,
+                        orte_rml_recv_callback, NULL);

     /* Pull out the communicator ID */
     UNPACK_BUFFER(buffer, (*comm_id), 1, OPAL_UI
Re: [OMPI devel] [patch] async-signal-safe signal handler
This patch looks good to me (sorry for the delay in replying -- MPI Forum + OMPI dev meeting got in the way). Brian -- do you have any opinions on it?

On Dec 11, 2013, at 1:43 AM, Kawashima, Takahiro wrote:

> Hi,
>
> Open MPI's signal handler (the show_stackframe function defined in
> opal/util/stacktrace.c) calls non-async-signal-safe functions,
> and this causes a problem.
>
> See the attached mpisigabrt.c. Passing corrupted memory to realloc(3)
> raises SIGABRT, and the show_stackframe function is invoked.
> But on some systems the invoked show_stackframe function deadlocks in
> backtrace_symbols(3), because backtrace_symbols(3) calls malloc(3)
> internally and a deadlock on the realloc/malloc mutex occurs.
>
> The attached mpisigabrt.gstack.txt shows the stacktrace obtained
> with gdb in this deadlock situation on Ubuntu 12.04 LTS (precise)
> x86_64. Though I could not reproduce this behavior on RHEL 5/6,
> I can also reproduce it on the K computer and its successor,
> PRIMEHPC FX10. Passing non-heap memory to free(3), and double-free,
> also cause this deadlock.
>
> malloc (and backtrace_symbols) is not marked async-signal-safe
> in POSIX or in current glibc, though it seems to have been so marked
> in old glibc. So we should not call it in a signal handler.
>
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04
> http://cygwin.com/ml/libc-help/2013-06/msg5.html
>
> I wrote a patch to address this issue. See the attached
> async-signal-safe-stacktrace.patch.
>
> The patch calls backtrace_symbols_fd(3) instead of backtrace_symbols(3).
> Though backtrace_symbols_fd is not declared async-signal-safe either,
> its man page states that it does not call malloc internally, so it
> should be rather safer.
>
> The output format of the show_stackframe function is not changed by
> this patch. But the opal_backtrace_print function (backtrace
> framework) interface is changed to preserve output format
> compatibility. This requires changes in some additional files
> (ompi_mpi_abort.c etc.).
>
> The patch also removes unnecessary fflush(3) calls, which are
> meaningless before the write(2) system call but might cause a similar
> problem.
>
> What do you think about this patch?
>
> Takahiro Kawashima,
> MPI development team,
> Fujitsu
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] Recommended tool to measure packet counters
On Dec 14, 2013, at 8:02 AM, Siddhartha Jana wrote:

> Is there a preferred method/tool among developers of the MPI library for
> checking the count of the packets transmitted by the network card during
> two-sided communication?
>
> Is the use of
>     iptables -I INPUT -i eth0
>     iptables -I OUTPUT -o eth0
> recommended?

If you're using an ethernet, non-OS-bypass transport (e.g., TCP), you might also want to look at ethtool. Note that these counts will include control messages sent by Open MPI, too -- not just raw MPI traffic. They also will not include any traffic sent across shared memory (or other transports).

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI devel] Bus error with openmpi-1.7.4rc1 on Solaris
Hi,

today I installed openmpi-1.7.4rc1 on Solaris 10 Sparc with Sun C 5.12. Unfortunately, the bus errors that I reported on December 4th for openmpi-1.7.4a1r29784 at us...@open-mpi.org are still not solved. Does somebody have time to look into this matter, or has Solaris support been abandoned, so that I will have to stay with openmpi-1.6.x in the future?

Thank you very much in advance for any help.

Kind regards

Siegmar