Re: [OMPI users] opal_pmix_base_select failed for master and 4.0.0

2018-10-12 Thread Ralph H Castain
Hi Siegmar

The patch was merged into the v4.0.0 branch on Oct 10th, so it should be available
in the nightly tarball from that date onward.


> On Oct 6, 2018, at 2:12 AM, Siegmar Gross wrote:
> 
> Hi Jeff, hi Ralph,
> 
> Great, it works again! Thank you very much for your help. I'll be really happy
> if the undefined references for Sun C are resolved and there are no new
> problems for that compiler :-)). Do you know when the pmix patch will be
> integrated into version 4.0.0?
> 
> 
> Best regards
> 
> Siegmar
> 
> 
> On 10/5/18 4:33 PM, Jeff Squyres (jsquyres) via users wrote:
>> Oops!  We had a typo in yesterday's fix -- fixed:
>> https://github.com/open-mpi/ompi/pull/5847
>> Ralph also put double extra super protection to make triple sure that this 
>> error can't happen again in:
>> https://github.com/open-mpi/ompi/pull/5846
>> Both of these should be in tonight's nightly snapshot.
>> Thank you!
>>> On Oct 5, 2018, at 5:45 AM, Ralph H Castain  wrote:
>>> 
>>> Please send Jeff and me the opal/mca/pmix/pmix4x/pmix/config.log again - 
>>> we’ll need to see why it isn’t building. The patch definitely is not in the 
>>> v4.0 branch, but it should have been in master.
>>> 
>>> 
 On Oct 5, 2018, at 2:04 AM, Siegmar Gross wrote:
 
 Hi Ralph, hi Jeff,
 
 
 On 10/3/18 8:14 PM, Ralph H Castain wrote:
> Jeff and I talked and believe the patch in 
> https://github.com/open-mpi/ompi/pull/5836 should fix the problem.
 
 
 Today I've installed openmpi-master-201810050304-5f1c940 and
 openmpi-v4.0.x-201810050241-c079666. Unfortunately, I still get the
 same error for all seven versions that I was able to build.
 
 loki hello_1 114 mpicc --showme
 gcc -I/usr/local/openmpi-master_64_gcc/include -fexceptions -pthread 
 -std=c11 -m64 -Wl,-rpath -Wl,/usr/local/openmpi-master_64_gcc/lib64 
 -Wl,--enable-new-dtags -L/usr/local/openmpi-master_64_gcc/lib64 -lmpi
 
 loki hello_1 115 ompi_info | grep "Open MPI repo revision"
  Open MPI repo revision: v2.x-dev-6262-g5f1c940
 
 loki hello_1 116 mpicc hello_1_mpi.c
 
 loki hello_1 117 mpiexec -np 2 a.out
 [loki:25575] [[64603,0],0] ORTE_ERROR_LOG: Not found in file 
 ../../../../../openmpi-master-201810050304-5f1c940/orte/mca/ess/hnp/ess_hnp_module.c
  at line 320
 --
 It looks like orte_init failed for some reason; your parallel process is
 likely to abort.  There are many reasons that a parallel process can
 fail during orte_init; some of which are due to configuration or
 environment problems.  This failure appears to be an internal failure;
 here's some additional information (which may only be relevant to an
 Open MPI developer):
 
  opal_pmix_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
 --
 loki hello_1 118
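 
 (For reference, a minimal MPI program along the following lines is enough to
 reproduce this; the exact contents of hello_1_mpi.c are not shown in the
 thread, so this is only a sketch. The failure happens in orte_init inside
 mpiexec, before the application itself runs.)
 
 #include <stdio.h>
 #include <mpi.h>
 
 int main(int argc, char **argv)
 {
     int rank, size;
 
     MPI_Init(&argc, &argv);
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     MPI_Comm_size(MPI_COMM_WORLD, &size);
     printf("Hello from process %d of %d\n", rank, size);
     MPI_Finalize();
     return 0;
 }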
 
 
 I don't know whether you have already applied your suggested patch or whether
 the error message is still from a version without that patch. Do you need
 anything else?
 
 
 Best regards
 
 Siegmar
 
 
>> On Oct 2, 2018, at 2:50 PM, Jeff Squyres (jsquyres) via users wrote:
>> 
>> (Ralph sent me Siegmar's pmix config.log, which Siegmar sent to him 
>> off-list)
>> 
>> It looks like Siegmar passed --with-hwloc=internal.
>> 
>> Open MPI's configure understood this and did the appropriate things.
>> PMIx's configure didn't.
>> 
>> I think we need to add an adjustment into the PMIx configure.m4 in 
>> OMPI...
>> 
>> 
>>> On Oct 2, 2018, at 5:25 PM, Ralph H Castain  wrote:
>>> 
>>> Hi Siegmar
>>> 
>>> I honestly have no idea - for some reason, the PMIx component isn’t 
>>> seeing the internal hwloc code in your environment.
>>> 
>>> Jeff, Brice - any ideas?
>>> 
>>> 
 On Oct 2, 2018, at 1:18 PM, Siegmar Gross wrote:
 
 Hi Ralph,
 
 how can I confirm that HWLOC was built? Some hwloc files are present
 in the build directory.
 
 loki openmpi-master-201809290304-73075b8-Linux.x86_64.64_gcc 111 find 
 . -name '*hwloc*'
 ./opal/mca/btl/usnic/.deps/btl_usnic_hwloc.Plo
 ./opal/mca/hwloc
 ./opal/mca/hwloc/external/.deps/hwloc_external_component.Plo
 ./opal/mca/hwloc/base/hwloc_base_frame.lo
 ./opal/mca/hwloc/base/.deps/hwloc_base_dt.Plo
 ./opal/mca/hwloc/base/.deps/hwloc_base_maffinity.Plo
 ./opal/mca/hwloc/base/.deps/hwloc_base_frame.Plo
 ./opal/mca/hwloc/base/.deps/hwloc_base_util.Plo
 ./opal/mca/hwloc/base/hwloc_base_dt.lo
 ./opal/mca/hwloc/base/hwloc_base_util.lo
 ./opal/mca/hwloc/base/hwloc_base_maffinity.lo
 

Re: [OMPI users] [version 2.1.5] invalid memory reference

2018-10-12 Thread Patrick Bégou
I have downloaded the nightly snapshot tarball of October 10th 2018 for
the 3.1 version and it solves the memory problem.
I ran my test case on 1, 2, 4, 10, 16, 20, 32, 40, and 64 cores
successfully.
This version also lets me compile my prerequisite libraries, so we
can use it out of the box and stay in production.
It also gives me time to update hdf5, as well as the petsc/slepc libraries,
to work with the more recent MPI standard and move to the OpenMPI 4.x versions.

Thanks for all your precious advice.

Patrick

On 11/10/2018 at 17:19, Jeff Squyres (jsquyres) wrote:
> Patrick --
>
> You might want to update your HDF code to not use MPI_LB and MPI_UB -- these 
> constants were deprecated in MPI-2.1 in 2009 (an equivalent function, 
> MPI_TYPE_CREATE_RESIZED, was added in MPI-2.0 in 1997), and they were removed 
> from the MPI-3.0 standard in 2012.
>
> Meaning: the writing has been on the wall for these constants since 2009.
>
> That being said, Open MPI v4.0 did not remove these constants -- we just 
> *disabled them by default*, specifically for cases like this.  I.e., we want 
> to make the greater MPI community aware that:
>
> 1) there are MPI-1 constructs that were initially deprecated and finally 
> removed from the standard (this happened years ago)
> 2) MPI applications should start moving away from these removed MPI-1 
> constructs
> 3) Open MPI is disabling these removed MPI-1 constructs by default in Open 
> MPI v4.0.  The current plan is to actually fully remove these MPI-1 
> constructs in Open MPI v5.0 (perhaps in 2019?).
>
> For the v4.0.x series, you can configure/build Open MPI with 
> --enable-mpi1-compatibility to re-activate MPI_LB and MPI_UB.
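>
> To illustrate the migration itself, here is a minimal sketch (the helper
> function, the variable names, and the MPI_INT payload are invented for this
> example, not taken from the real H5Smpio.c): the old MPI_LB/MPI_UB +
> MPI_Type_struct idiom becomes MPI_Type_create_struct plus
> MPI_Type_create_resized.
>
> #include <mpi.h>
>
> /* Sketch only: build a datatype whose lower bound and extent used to be
>    expressed with MPI_LB / MPI_UB entries inside an MPI_Type_struct() call. */
> static MPI_Datatype make_resized_type(MPI_Aint offset, MPI_Aint full_extent)
> {
>     int          block_length[1] = { 1 };
>     MPI_Aint     displacement[1] = { offset };
>     MPI_Datatype inner_types[1]  = { MPI_INT };
>     MPI_Datatype inner, resized;
>
>     /* Describe only the real data... */
>     MPI_Type_create_struct(1, block_length, displacement, inner_types, &inner);
>     /* ...then set the lower bound and extent explicitly, instead of adding
>        MPI_LB / MPI_UB markers. */
>     MPI_Type_create_resized(inner, 0, full_extent, &resized);
>     MPI_Type_commit(&resized);
>     MPI_Type_free(&inner);
>     return resized;
> }
>
> MPI_Type_struct() itself has a drop-in replacement, MPI_Type_create_struct(),
> with the same argument list.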
>
>
>
>> On Oct 11, 2018, at 10:58 AM, Patrick Begou wrote:
>>
>> Hi Jeff and George
>>
>> thanks for your answer. I found some time to work on this problem again and I 
>> have downloaded OpenMPI 4.0.0rc4. It compiles without any problem, but 
>> building the first dependency of my code (hdf5 1.8.12) with this version 4 
>> fails:
>>
>> ../../src/H5Smpio.c:355:28: error: 'MPI_LB' undeclared (first use in this 
>> function); did you mean 'MPI_IO'?
>>  old_types[0] = MPI_LB;
>> ^~
>> MPI_IO
>> ../../src/H5Smpio.c:355:28: note: each undeclared identifier is reported 
>> only once for each function it appears in
>> ../../src/H5Smpio.c:357:28: error: 'MPI_UB' undeclared (first use in this 
>> function); did you mean 'MPI_LB'?
>>  old_types[2] = MPI_UB;
>> ^~
>> MPI_LB
>> ../../src/H5Smpio.c:365:24: warning: implicit declaration of function 
>> 'MPI_Type_struct'; did you mean 'MPI_Type_size_x'? 
>> [-Wimplicit-function-declaration]
>>  mpi_code = MPI_Type_struct(3,   /* count */
>> ^~~
>> MPI_Type_size_x
>>
>> It is not possible for me to use a more recent hdf5 version as the API has 
>> changed and will not work with the code, even in compatibility mode.
>>
>> At this time, I'll try version 3 from the git repo if I have the required 
>> tools available on my server. All prerequisites compile successfully with 
>> 3.1.2.
>>
>> Patrick
>>
>> -- 
>> ===
>> |  Equipe M.O.S.T. |  |
>> |  Patrick BEGOU   | mailto:patrick.be...@grenoble-inp.fr |
>> |  LEGI|  |
>> |  BP 53 X | Tel 04 76 82 51 35   |
>> |  38041 GRENOBLE CEDEX| Fax 04 76 82 52 71   |
>> ===
>>
>
