Re: [OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Saliya Ekanayake
Thank you, Gilles, for the pointer. I see which operations are supported in SM
now.

On Wed, Dec 9, 2015 at 8:05 PM, Gilles Gouaillardet wrote:

> Saliya,
>
> from ompi/mca/coll/sm/coll_sm_module.c in mca_coll_sm_comm_query()
> sm_module->super.coll_allgatherv = NULL;
>
> that means the coll sm module does *not* implement allgatherv, so Open MPI
> will use the next module
> (very likely the default module, which is why there is no
> performance improvement in your specific benchmark)
>
> Cheers,
>
> Gilles
>
>
>
> On 12/10/2015 2:53 AM, Saliya Ekanayake wrote:
>
> Hi,
>
> In a previous email, I wanted to know how to enable shared memory
> collectives and I was told setting the coll_sm_priority to anything over 30
> should do it.
>
> I tested this for a microbenchmark on allgatherv, but it didn't improve
> performance over the default setting. See below, where I tested for
> different number of processes per node on 48 nodes. The total message size
> is kept constant at 240 bytes (or 2.28MB).
>
> Am I doing something wrong here?
>
> Thank you,
> saliya
>
> [image: Inline image 1]
>
> --
> Saliya Ekanayake
> Ph.D. Candidate | Research Assistant
> School of Informatics and Computing | Digital Science Center
> Indiana University, Bloomington
> Cell 812-391-4914
> http://saliya.org
>
>
>
>
>
>



-- 
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org


Re: [OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Gilles Gouaillardet

Saliya,

from ompi/mca/coll/sm/coll_sm_module.c in mca_coll_sm_comm_query()
sm_module->super.coll_allgatherv = NULL;

that means the coll sm module does *not* implement allgatherv, so
Open MPI will use the next module
(very likely the default module, which is why there is no
performance improvement in your specific benchmark)
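
As a quick way to double-check this on a given install (a sketch; the exact
parameter list and output format depend on the Open MPI version), ompi_info can
list the available coll components and the sm component's MCA parameters:

$ ompi_info | grep " coll:"
$ ompi_info --param coll sm --level 9

Raising coll_sm_priority therefore only affects the collectives coll/sm
actually implements; allgatherv still falls back to another component.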


Cheers,

Gilles


On 12/10/2015 2:53 AM, Saliya Ekanayake wrote:

Hi,

In a previous email, I wanted to know how to enable shared memory 
collectives and I was told setting the coll_sm_priority to anything 
over 30 should do it.


I tested this for a microbenchmark on allgatherv, but it didn't 
improve performance over the default setting. See below, where I 
tested for different number of processes per node on 48 nodes. The 
total message size is kept constant at 240 bytes (or 2.28MB).


Am I doing something wrong here?

Thank you,
saliya

Inline image 1

--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org






Re: [OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Aurélien Bouteiller
Try to run with coll_base_verbose 1000, just to see which collective module
actually gets loaded.
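
For example (the benchmark executable name below is just a placeholder):

$ mpirun -np 2 --mca coll_sm_priority 35 --mca coll_base_verbose 1000 ./allgatherv_bench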

Aurélien
--
Aurélien Bouteiller, Ph.D. ~~ https://icl.cs.utk.edu/~bouteill/ 

> On 9 Dec 2015, at 09:53, Saliya Ekanayake wrote:
> 
> Hi,
> 
> In a previous email, I wanted to know how to enable shared memory collectives 
> and I was told setting the coll_sm_priority to anything over 30 should do it.
> 
> I tested this for a microbenchmark on allgatherv, but it didn't improve 
> performance over the default setting. See below, where I tested for different 
> number of processes per node on 48 nodes. The total message size is kept 
> constant at 240 bytes (or 2.28MB).
> 
> Am I doing something wrong here?
> 
> Thank you,
> saliya
> 
> 
> 
> -- 
> Saliya Ekanayake
> Ph.D. Candidate | Research Assistant
> School of Informatics and Computing | Digital Science Center
> Indiana University, Bloomington
> Cell 812-391-4914
> http://saliya.org 



[OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Saliya Ekanayake
Hi,

In a previous email, I wanted to know how to enable shared memory
collectives and I was told setting the coll_sm_priority to anything over 30
should do it.
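
For reference, such a priority can be set either per run or through the
environment, for example (the benchmark binary name is only a placeholder):

$ mpirun --mca coll_sm_priority 35 -np 24 ./allgatherv_bench
$ # or, equivalently
$ export OMPI_MCA_coll_sm_priority=35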

I tested this for a microbenchmark on allgatherv, but it didn't improve
performance over the default setting. See below, where I tested with
different numbers of processes per node on 48 nodes. The total message size
is kept constant at 240 bytes (or 2.28MB).

Am I doing something wrong here?

Thank you,
saliya

[image: Inline image 1]

-- 
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org


Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel
OK, I can confirm that once I update the file_get_position function to
what we have in master and the 2.x series, your test passes with ompio
in the 1.10 series as well. I am happy to provide a patch for testing
and to submit a PR. I am, however, worried, since we know that ompio in the
1.10 series is significantly out of sync with master, so there is
potential for other, similar issues.


It would, however, be interesting to see whether your code works
correctly with ompio in the 2.x release (or master), and I would be
happy to provide any support necessary for testing (including the offer
that I can run the tests if you provide me with the source code).


Thanks
Edgar

On 12/9/2015 9:30 AM, Edgar Gabriel wrote:

what does the mount command return?

On 12/9/2015 9:27 AM, Paul Kapinos wrote:

Dear Edgar,


On 12/09/15 16:16, Edgar Gabriel wrote:

I tested your code in master and v1.10 ( on my local machine), and I get for
both version of ompio exactly the same (correct) output that you had with romio.

I've tested it at local hard disk..

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[529]$ df -h .
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda3   1.1T   16G  1.1T   2% /w0

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[530]$ echo hell-o > out.txt;
./a.out
fileOffset, fileSize 7 7
fileOffset, fileSize2323
ierr0
MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[531]$ export OMPI_MCA_io=ompio

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[532]$ echo hell-o > out.txt;
./a.out
fileOffset, fileSize 0 7
fileOffset, fileSize 016
ierr0
MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128



However, I also noticed that in the ompio version that is in the v1.10 branch,
the MPI_File_get_size function is not implemented on lustre.

Yes we have Lustre in the cluster.
I believe that was one of 'another' issues mentioned, yes some users tend to use
Lustre as HPC file system =)





Thanks
Edgar

On 12/9/2015 8:06 AM, Edgar Gabriel wrote:

I will look at your test case and see what is going on in ompio. That
being said, the vast number of fixes and improvements that went into
ompio over the last two years were not back ported to the 1.8 (and thus
1.10) series, since it would have required changes to the interfaces of
the frameworks involved (and thus would have violated one of rules of
Open MPI release series) . Anyway, if there is a simple fix for your
test case for the 1.10 series, I am happy to provide a patch. It might
take me a day or two however.

Edgar

On 12/9/2015 6:24 AM, Paul Kapinos wrote:

Sorry, forgot to mention: 1.10.1


 Open MPI: 1.10.1
   Open MPI repo revision: v1.10.0-178-gb80f802
Open MPI release date: Nov 03, 2015
 Open RTE: 1.10.1
   Open RTE repo revision: v1.10.0-178-gb80f802
Open RTE release date: Nov 03, 2015
 OPAL: 1.10.1
   OPAL repo revision: v1.10.0-178-gb80f802
OPAL release date: Nov 03, 2015
  MPI API: 3.0.0
 Ident string: 1.10.1


On 12/09/15 11:26, Gilles Gouaillardet wrote:

Paul,

which OpenMPI version are you using ?

thanks for providing a simple reproducer, that will make things much easier
from
now.
(and at first glance, that might not be a very tricky bug)

Cheers,

Gilles

On Wednesday, December 9, 2015, Paul Kapinos > wrote:

Dear Open MPI developers,
did OMPIO (1) reached 'usable-stable' state?

As we reported in (2) we had some trouble in building Open MPI with
ROMIO,
which fact was hidden by OMPIO implementation stepping into the MPI_IO
breach. The fact 'ROMIO isn't AVBL' was detected after users complained
'MPI_IO don't work as expected with version XYZ of OpenMPI' and further
investigations.

Take a look at the attached example. It deliver different result in
case of
using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3).
We've seen more examples of divergent behaviour but this one is quite
handy.

Is that a bug in OMPIO or did we miss something?

Best
Paul Kapinos


1) http://www.open-mpi.org/faq/?category=ompio

2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php

3) (ROMIO is default; on local hard drive at node 'cluster')
$ ompi_info  | grep  romio
   MCA io: romio (MCA v2.0.0, API v2.0.0, Component
v1.10.1)
$ ompi_info  | grep  ompio
   MCA io: ompio (MCA v2.0.0, API v2.0.0, Component
v1.10.1)
$ mpif90 main.f90

$ echo hello1234 > 

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel
OK, forget it, I found the issue. I totally forgot that in the 1.10
series I have to manually force ompio (it is the default on master and
2.x). It fails now for me as well with v1.10; I will let you know what I find.
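
For anyone reproducing this, the io component can also be forced per run,
e.g. (using the component names reported by ompi_info earlier in this thread):

$ mpirun -np 1 --mca io ompio ./a.out
$ mpirun -np 1 --mca io romio ./a.out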


Thanks
Edgar

On 12/9/2015 9:30 AM, Edgar Gabriel wrote:

what does the mount command return?

On 12/9/2015 9:27 AM, Paul Kapinos wrote:

Dear Edgar,


On 12/09/15 16:16, Edgar Gabriel wrote:

I tested your code in master and v1.10 ( on my local machine), and I get for
both version of ompio exactly the same (correct) output that you had with romio.

I've tested it at local hard disk..

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[529]$ df -h .
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda3   1.1T   16G  1.1T   2% /w0

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[530]$ echo hell-o > out.txt;
./a.out
fileOffset, fileSize 7 7
fileOffset, fileSize2323
ierr0
MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[531]$ export OMPI_MCA_io=ompio

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[532]$ echo hell-o > out.txt;
./a.out
fileOffset, fileSize 0 7
fileOffset, fileSize 016
ierr0
MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128



However, I also noticed that in the ompio version that is in the v1.10 branch,
the MPI_File_get_size function is not implemented on lustre.

Yes we have Lustre in the cluster.
I believe that was one of 'another' issues mentioned, yes some users tend to use
Lustre as HPC file system =)





Thanks
Edgar

On 12/9/2015 8:06 AM, Edgar Gabriel wrote:

I will look at your test case and see what is going on in ompio. That
being said, the vast number of fixes and improvements that went into
ompio over the last two years were not back ported to the 1.8 (and thus
1.10) series, since it would have required changes to the interfaces of
the frameworks involved (and thus would have violated one of rules of
Open MPI release series) . Anyway, if there is a simple fix for your
test case for the 1.10 series, I am happy to provide a patch. It might
take me a day or two however.

Edgar

On 12/9/2015 6:24 AM, Paul Kapinos wrote:

Sorry, forgot to mention: 1.10.1


 Open MPI: 1.10.1
   Open MPI repo revision: v1.10.0-178-gb80f802
Open MPI release date: Nov 03, 2015
 Open RTE: 1.10.1
   Open RTE repo revision: v1.10.0-178-gb80f802
Open RTE release date: Nov 03, 2015
 OPAL: 1.10.1
   OPAL repo revision: v1.10.0-178-gb80f802
OPAL release date: Nov 03, 2015
  MPI API: 3.0.0
 Ident string: 1.10.1


On 12/09/15 11:26, Gilles Gouaillardet wrote:

Paul,

which OpenMPI version are you using ?

thanks for providing a simple reproducer, that will make things much easier
from
now.
(and at first glance, that might not be a very tricky bug)

Cheers,

Gilles

On Wednesday, December 9, 2015, Paul Kapinos > wrote:

Dear Open MPI developers,
did OMPIO (1) reached 'usable-stable' state?

As we reported in (2) we had some trouble in building Open MPI with
ROMIO,
which fact was hidden by OMPIO implementation stepping into the MPI_IO
breach. The fact 'ROMIO isn't AVBL' was detected after users complained
'MPI_IO don't work as expected with version XYZ of OpenMPI' and further
investigations.

Take a look at the attached example. It deliver different result in
case of
using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3).
We've seen more examples of divergent behaviour but this one is quite
handy.

Is that a bug in OMPIO or did we miss something?

Best
Paul Kapinos


1) http://www.open-mpi.org/faq/?category=ompio

2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php

3) (ROMIO is default; on local hard drive at node 'cluster')
$ ompi_info  | grep  romio
   MCA io: romio (MCA v2.0.0, API v2.0.0, Component
v1.10.1)
$ ompi_info  | grep  ompio
   MCA io: ompio (MCA v2.0.0, API v2.0.0, Component
v1.10.1)
$ mpif90 main.f90

$ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
  fileOffset, fileSize1010
  fileOffset, fileSize2626
  ierr0
  MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128

$ export OMPI_MCA_io=ompio
$ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
  fileOffset, fileSize

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel

what does the mount command return?

On 12/9/2015 9:27 AM, Paul Kapinos wrote:

Dear Edgar,


On 12/09/15 16:16, Edgar Gabriel wrote:

I tested your code in master and v1.10 ( on my local machine), and I get for
both version of ompio exactly the same (correct) output that you had with romio.

I've tested it at local hard disk..

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[529]$ df -h .
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda3   1.1T   16G  1.1T   2% /w0

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[530]$ echo hell-o > out.txt;
./a.out
   fileOffset, fileSize 7 7
   fileOffset, fileSize2323
   ierr0
   MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[531]$ export OMPI_MCA_io=ompio

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[532]$ echo hell-o > out.txt;
./a.out
   fileOffset, fileSize 0 7
   fileOffset, fileSize 016
   ierr0
   MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128



However, I also noticed that in the ompio version that is in the v1.10 branch,
the MPI_File_get_size function is not implemented on lustre.

Yes we have Lustre in the cluster.
I believe that was one of 'another' issues mentioned, yes some users tend to use
Lustre as HPC file system =)





Thanks
Edgar

On 12/9/2015 8:06 AM, Edgar Gabriel wrote:

I will look at your test case and see what is going on in ompio. That
being said, the vast number of fixes and improvements that went into
ompio over the last two years were not back ported to the 1.8 (and thus
1.10) series, since it would have required changes to the interfaces of
the frameworks involved (and thus would have violated one of rules of
Open MPI release series) . Anyway, if there is a simple fix for your
test case for the 1.10 series, I am happy to provide a patch. It might
take me a day or two however.

Edgar

On 12/9/2015 6:24 AM, Paul Kapinos wrote:

Sorry, forgot to mention: 1.10.1


Open MPI: 1.10.1
  Open MPI repo revision: v1.10.0-178-gb80f802
   Open MPI release date: Nov 03, 2015
Open RTE: 1.10.1
  Open RTE repo revision: v1.10.0-178-gb80f802
   Open RTE release date: Nov 03, 2015
OPAL: 1.10.1
  OPAL repo revision: v1.10.0-178-gb80f802
   OPAL release date: Nov 03, 2015
 MPI API: 3.0.0
Ident string: 1.10.1


On 12/09/15 11:26, Gilles Gouaillardet wrote:

Paul,

which OpenMPI version are you using ?

thanks for providing a simple reproducer, that will make things much easier
from
now.
(and at first glance, that might not be a very tricky bug)

Cheers,

Gilles

On Wednesday, December 9, 2015, Paul Kapinos > wrote:

   Dear Open MPI developers,
   did OMPIO (1) reached 'usable-stable' state?

   As we reported in (2) we had some trouble in building Open MPI with
ROMIO,
   which fact was hidden by OMPIO implementation stepping into the MPI_IO
   breach. The fact 'ROMIO isn't AVBL' was detected after users complained
   'MPI_IO don't work as expected with version XYZ of OpenMPI' and further
   investigations.

   Take a look at the attached example. It deliver different result in
case of
   using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3).
   We've seen more examples of divergent behaviour but this one is quite
handy.

   Is that a bug in OMPIO or did we miss something?

   Best
   Paul Kapinos


   1) http://www.open-mpi.org/faq/?category=ompio

   2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php

   3) (ROMIO is default; on local hard drive at node 'cluster')
   $ ompi_info  | grep  romio
  MCA io: romio (MCA v2.0.0, API v2.0.0, Component
v1.10.1)
   $ ompi_info  | grep  ompio
  MCA io: ompio (MCA v2.0.0, API v2.0.0, Component
v1.10.1)
   $ mpif90 main.f90

   $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
 fileOffset, fileSize1010
 fileOffset, fileSize2626
 ierr0
 MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128

   $ export OMPI_MCA_io=ompio
   $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
 fileOffset, fileSize 010
 fileOffset, fileSize 016
 ierr0
 MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128


   --
   Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
   RWTH Aachen University, IT Center
   

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Paul Kapinos

Dear Edgar,


On 12/09/15 16:16, Edgar Gabriel wrote:

I tested your code in master and v1.10 ( on my local machine), and I get for
both version of ompio exactly the same (correct) output that you had with romio.


I've tested it on the local hard disk:

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[529]$ df -h .
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda3   1.1T   16G  1.1T   2% /w0

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[530]$ echo hell-o > out.txt; 
./a.out

 fileOffset, fileSize            7            7
 fileOffset, fileSize           23           23
 ierr            0
 MPI_MODE_WRONLY,  MPI_MODE_APPEND            4          128

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[531]$ export OMPI_MCA_io=ompio

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[532]$ echo hell-o > out.txt; 
./a.out

 fileOffset, fileSize            0            7
 fileOffset, fileSize            0           16
 ierr            0
 MPI_MODE_WRONLY,  MPI_MODE_APPEND            4          128



However, I also noticed that in the ompio version that is in the v1.10 branch,
the MPI_File_get_size function is not implemented on lustre.


Yes, we have Lustre in the cluster.
I believe that was one of the 'other' issues mentioned; yes, some users tend to use
Lustre as an HPC file system =)







Thanks
Edgar

On 12/9/2015 8:06 AM, Edgar Gabriel wrote:

I will look at your test case and see what is going on in ompio. That
being said, the vast number of fixes and improvements that went into
ompio over the last two years were not back ported to the 1.8 (and thus
1.10) series, since it would have required changes to the interfaces of
the frameworks involved (and thus would have violated one of rules of
Open MPI release series) . Anyway, if there is a simple fix for your
test case for the 1.10 series, I am happy to provide a patch. It might
take me a day or two however.

Edgar

On 12/9/2015 6:24 AM, Paul Kapinos wrote:

Sorry, forgot to mention: 1.10.1


   Open MPI: 1.10.1
 Open MPI repo revision: v1.10.0-178-gb80f802
  Open MPI release date: Nov 03, 2015
   Open RTE: 1.10.1
 Open RTE repo revision: v1.10.0-178-gb80f802
  Open RTE release date: Nov 03, 2015
   OPAL: 1.10.1
 OPAL repo revision: v1.10.0-178-gb80f802
  OPAL release date: Nov 03, 2015
MPI API: 3.0.0
   Ident string: 1.10.1


On 12/09/15 11:26, Gilles Gouaillardet wrote:

Paul,

which OpenMPI version are you using ?

thanks for providing a simple reproducer, that will make things much easier
from
now.
(and at first glance, that might not be a very tricky bug)

Cheers,

Gilles

On Wednesday, December 9, 2015, Paul Kapinos > wrote:

  Dear Open MPI developers,
  did OMPIO (1) reached 'usable-stable' state?

  As we reported in (2) we had some trouble in building Open MPI with
ROMIO,
  which fact was hidden by OMPIO implementation stepping into the MPI_IO
  breach. The fact 'ROMIO isn't AVBL' was detected after users complained
  'MPI_IO don't work as expected with version XYZ of OpenMPI' and further
  investigations.

  Take a look at the attached example. It deliver different result in
case of
  using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3).
  We've seen more examples of divergent behaviour but this one is quite
handy.

  Is that a bug in OMPIO or did we miss something?

  Best
  Paul Kapinos


  1) http://www.open-mpi.org/faq/?category=ompio

  2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php

  3) (ROMIO is default; on local hard drive at node 'cluster')
  $ ompi_info  | grep  romio
 MCA io: romio (MCA v2.0.0, API v2.0.0, Component
v1.10.1)
  $ ompi_info  | grep  ompio
 MCA io: ompio (MCA v2.0.0, API v2.0.0, Component
v1.10.1)
  $ mpif90 main.f90

  $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
fileOffset, fileSize1010
fileOffset, fileSize2626
ierr0
MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128

  $ export OMPI_MCA_io=ompio
  $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
fileOffset, fileSize 010
fileOffset, fileSize 016
ierr0
MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128


  --
  Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
  RWTH Aachen University, IT Center
  Seffenter Weg 23,  D 52074  Aachen (Germany)
  Tel: +49 241/80-24915




Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel

Paul,

I tested your code in master and v1.10 (on my local machine), and I get
for both versions of ompio exactly the same (correct) output that you had
with romio. However, I also noticed that in the ompio version that is in
the v1.10 branch, the MPI_File_get_size function is not implemented on
Lustre. Did you by any chance run your test on a Lustre file system?
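
(If in doubt, one way to check on Linux, as a sketch, is:

$ df -hT .
$ mount | grep -i lustre

which shows the file system type of the current working directory.)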


Thanks
Edgar

On 12/9/2015 8:06 AM, Edgar Gabriel wrote:

I will look at your test case and see what is going on in ompio. That
being said, the vast number of fixes and improvements that went into
ompio over the last two years were not back ported to the 1.8 (and thus
1.10) series, since it would have required changes to the interfaces of
the frameworks involved (and thus would have violated one of rules of
Open MPI release series) . Anyway, if there is a simple fix for your
test case for the 1.10 series, I am happy to provide a patch. It might
take me a day or two however.

Edgar

On 12/9/2015 6:24 AM, Paul Kapinos wrote:

Sorry, forgot to mention: 1.10.1


   Open MPI: 1.10.1
 Open MPI repo revision: v1.10.0-178-gb80f802
  Open MPI release date: Nov 03, 2015
   Open RTE: 1.10.1
 Open RTE repo revision: v1.10.0-178-gb80f802
  Open RTE release date: Nov 03, 2015
   OPAL: 1.10.1
 OPAL repo revision: v1.10.0-178-gb80f802
  OPAL release date: Nov 03, 2015
MPI API: 3.0.0
   Ident string: 1.10.1


On 12/09/15 11:26, Gilles Gouaillardet wrote:

Paul,

which OpenMPI version are you using ?

thanks for providing a simple reproducer, that will make things much easier from
now.
(and at first glance, that might not be a very tricky bug)

Cheers,

Gilles

On Wednesday, December 9, 2015, Paul Kapinos > wrote:

  Dear Open MPI developers,
  did OMPIO (1) reached 'usable-stable' state?

  As we reported in (2) we had some trouble in building Open MPI with ROMIO,
  which fact was hidden by OMPIO implementation stepping into the MPI_IO
  breach. The fact 'ROMIO isn't AVBL' was detected after users complained
  'MPI_IO don't work as expected with version XYZ of OpenMPI' and further
  investigations.

  Take a look at the attached example. It deliver different result in case 
of
  using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3).
  We've seen more examples of divergent behaviour but this one is quite 
handy.

  Is that a bug in OMPIO or did we miss something?

  Best
  Paul Kapinos


  1) http://www.open-mpi.org/faq/?category=ompio

  2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php

  3) (ROMIO is default; on local hard drive at node 'cluster')
  $ ompi_info  | grep  romio
 MCA io: romio (MCA v2.0.0, API v2.0.0, Component 
v1.10.1)
  $ ompi_info  | grep  ompio
 MCA io: ompio (MCA v2.0.0, API v2.0.0, Component 
v1.10.1)
  $ mpif90 main.f90

  $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
fileOffset, fileSize1010
fileOffset, fileSize2626
ierr0
MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128

  $ export OMPI_MCA_io=ompio
  $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
fileOffset, fileSize 010
fileOffset, fileSize 016
ierr0
MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128


  --
  Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
  RWTH Aachen University, IT Center
  Seffenter Weg 23,  D 52074  Aachen (Germany)
  Tel: +49 241/80-24915






--
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science  University of Houston
Philip G. Hoffman Hall, Room 524Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335
--



Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel
I will look at your test case and see what is going on in ompio. That
being said, the vast number of fixes and improvements that went into
ompio over the last two years were not backported to the 1.8 (and thus
1.10) series, since that would have required changes to the interfaces of
the frameworks involved (and thus would have violated one of the rules of
the Open MPI release series). Anyway, if there is a simple fix for your
test case for the 1.10 series, I am happy to provide a patch. It might
take me a day or two, however.


Edgar

On 12/9/2015 6:24 AM, Paul Kapinos wrote:

Sorry, forgot to mention: 1.10.1


  Open MPI: 1.10.1
Open MPI repo revision: v1.10.0-178-gb80f802
 Open MPI release date: Nov 03, 2015
  Open RTE: 1.10.1
Open RTE repo revision: v1.10.0-178-gb80f802
 Open RTE release date: Nov 03, 2015
  OPAL: 1.10.1
OPAL repo revision: v1.10.0-178-gb80f802
 OPAL release date: Nov 03, 2015
   MPI API: 3.0.0
  Ident string: 1.10.1


On 12/09/15 11:26, Gilles Gouaillardet wrote:

Paul,

which OpenMPI version are you using ?

thanks for providing a simple reproducer, that will make things much easier from
now.
(and at first glance, that might not be a very tricky bug)

Cheers,

Gilles

On Wednesday, December 9, 2015, Paul Kapinos > wrote:

 Dear Open MPI developers,
 did OMPIO (1) reached 'usable-stable' state?

 As we reported in (2) we had some trouble in building Open MPI with ROMIO,
 which fact was hidden by OMPIO implementation stepping into the MPI_IO
 breach. The fact 'ROMIO isn't AVBL' was detected after users complained
 'MPI_IO don't work as expected with version XYZ of OpenMPI' and further
 investigations.

 Take a look at the attached example. It deliver different result in case of
 using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3).
 We've seen more examples of divergent behaviour but this one is quite 
handy.

 Is that a bug in OMPIO or did we miss something?

 Best
 Paul Kapinos


 1) http://www.open-mpi.org/faq/?category=ompio

 2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php

 3) (ROMIO is default; on local hard drive at node 'cluster')
 $ ompi_info  | grep  romio
MCA io: romio (MCA v2.0.0, API v2.0.0, Component 
v1.10.1)
 $ ompi_info  | grep  ompio
MCA io: ompio (MCA v2.0.0, API v2.0.0, Component 
v1.10.1)
 $ mpif90 main.f90

 $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
   fileOffset, fileSize1010
   fileOffset, fileSize2626
   ierr0
   MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128

 $ export OMPI_MCA_io=ompio
 $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
   fileOffset, fileSize 010
   fileOffset, fileSize 016
   ierr0
   MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128


 --
 Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
 RWTH Aachen University, IT Center
 Seffenter Weg 23,  D 52074  Aachen (Germany)
 Tel: +49 241/80-24915








--
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science  University of Houston
Philip G. Hoffman Hall, Room 524Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335
--



Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Paul Kapinos

Sorry, forgot to mention: 1.10.1


Open MPI: 1.10.1
  Open MPI repo revision: v1.10.0-178-gb80f802
   Open MPI release date: Nov 03, 2015
Open RTE: 1.10.1
  Open RTE repo revision: v1.10.0-178-gb80f802
   Open RTE release date: Nov 03, 2015
OPAL: 1.10.1
  OPAL repo revision: v1.10.0-178-gb80f802
   OPAL release date: Nov 03, 2015
 MPI API: 3.0.0
Ident string: 1.10.1


On 12/09/15 11:26, Gilles Gouaillardet wrote:

Paul,

which OpenMPI version are you using ?

thanks for providing a simple reproducer, that will make things much easier from
now.
(and at first glance, that might not be a very tricky bug)

Cheers,

Gilles

On Wednesday, December 9, 2015, Paul Kapinos > wrote:

Dear Open MPI developers,
did OMPIO (1) reached 'usable-stable' state?

As we reported in (2) we had some trouble in building Open MPI with ROMIO,
which fact was hidden by OMPIO implementation stepping into the MPI_IO
breach. The fact 'ROMIO isn't AVBL' was detected after users complained
'MPI_IO don't work as expected with version XYZ of OpenMPI' and further
investigations.

Take a look at the attached example. It deliver different result in case of
using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3).
We've seen more examples of divergent behaviour but this one is quite handy.

Is that a bug in OMPIO or did we miss something?

Best
Paul Kapinos


1) http://www.open-mpi.org/faq/?category=ompio

2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php

3) (ROMIO is default; on local hard drive at node 'cluster')
$ ompi_info  | grep  romio
   MCA io: romio (MCA v2.0.0, API v2.0.0, Component v1.10.1)
$ ompi_info  | grep  ompio
   MCA io: ompio (MCA v2.0.0, API v2.0.0, Component v1.10.1)
$ mpif90 main.f90

$ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
  fileOffset, fileSize1010
  fileOffset, fileSize2626
  ierr0
  MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128

$ export OMPI_MCA_io=ompio
$ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
  fileOffset, fileSize 010
  fileOffset, fileSize 016
  ierr0
  MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128


--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915







--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915





Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Gilles Gouaillardet
Paul,

which OpenMPI version are you using ?

Thanks for providing a simple reproducer; that will make things much easier
from now on.
(And at first glance, this might not be a very tricky bug.)

Cheers,

Gilles

On Wednesday, December 9, 2015, Paul Kapinos wrote:

> Dear Open MPI developers,
> did OMPIO (1) reached 'usable-stable' state?
>
> As we reported in (2) we had some trouble in building Open MPI with ROMIO,
> which fact was hidden by OMPIO implementation stepping into the MPI_IO
> breach. The fact 'ROMIO isn't AVBL' was detected after users complained
> 'MPI_IO don't work as expected with version XYZ of OpenMPI' and further
> investigations.
>
> Take a look at the attached example. It deliver different result in case
> of using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3).
> We've seen more examples of divergent behaviour but this one is quite handy.
>
> Is that a bug in OMPIO or did we miss something?
>
> Best
> Paul Kapinos
>
>
> 1) http://www.open-mpi.org/faq/?category=ompio
>
> 2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php
>
> 3) (ROMIO is default; on local hard drive at node 'cluster')
> $ ompi_info  | grep  romio
>   MCA io: romio (MCA v2.0.0, API v2.0.0, Component v1.10.1)
> $ ompi_info  | grep  ompio
>   MCA io: ompio (MCA v2.0.0, API v2.0.0, Component v1.10.1)
> $ mpif90 main.f90
>
> $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
>  fileOffset, fileSize1010
>  fileOffset, fileSize2626
>  ierr0
>  MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128
>
> $ export OMPI_MCA_io=ompio
> $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
>  fileOffset, fileSize 010
>  fileOffset, fileSize 016
>  ierr0
>  MPI_MODE_WRONLY,  MPI_MODE_APPEND4 128
>
>
> --
> Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
> RWTH Aachen University, IT Center
> Seffenter Weg 23,  D 52074  Aachen (Germany)
> Tel: +49 241/80-24915
>


[OMPI users] OMPIO correctnes issues

2015-12-09 Thread Paul Kapinos

Dear Open MPI developers,
has OMPIO (1) reached a 'usable-stable' state?

As we reported in (2), we had some trouble building Open MPI with ROMIO, which
was hidden by the OMPIO implementation stepping into the MPI_IO breach. The
fact that 'ROMIO isn't AVBL' was only detected after users complained that
'MPI_IO doesn't work as expected with version XYZ of OpenMPI' and after
further investigation.


Take a look at the attached example. It delivers different results with ROMIO
and OMPIO, even with 1 MPI rank on a local hard disk, cf. (3). We've
seen more examples of divergent behaviour, but this one is quite handy.


Is that a bug in OMPIO or did we miss something?

Best
Paul Kapinos


1) http://www.open-mpi.org/faq/?category=ompio

2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php

3) (ROMIO is default; on local hard drive at node 'cluster')
$ ompi_info  | grep  romio
  MCA io: romio (MCA v2.0.0, API v2.0.0, Component v1.10.1)
$ ompi_info  | grep  ompio
  MCA io: ompio (MCA v2.0.0, API v2.0.0, Component v1.10.1)
$ mpif90 main.f90

$ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
 fileOffset, fileSize           10           10
 fileOffset, fileSize           26           26
 ierr            0
 MPI_MODE_WRONLY,  MPI_MODE_APPEND            4          128

$ export OMPI_MCA_io=ompio
$ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster  ./a.out;
 fileOffset, fileSize            0           10
 fileOffset, fileSize            0           16
 ierr            0
 MPI_MODE_WRONLY,  MPI_MODE_APPEND            4          128
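
A quick back-of-the-envelope check of these numbers (assuming MPI_MODE_APPEND
is supposed to place the initial file pointer at end of file):

  ROMIO:        initial offset = file size = 10; after writing 2 doubles:
                10 + 2*8 = 26 for both offset and size
  OMPIO (1.10): initial offset = 0; final size = 16 = 0 + 2*8, i.e. the data
                is written at the start of the file instead of being appended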


--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915
program example
   use mpi
   integer :: ierr
   integer(KIND=MPI_OFFSET_KIND) :: fileOffset
   integer(KIND=MPI_OFFSET_KIND) :: fileSize
   real    :: outData(10)
   integer :: resUnit = 565

   call MPI_INIT(ierr)

   ! open the existing file for writing in append mode
   call MPI_FILE_OPEN(MPI_COMM_WORLD, 'out.txt', MPI_MODE_WRONLY + &
                      MPI_MODE_APPEND, MPI_INFO_NULL, resUnit, ierr)

   ! with MPI_MODE_APPEND the initial file pointer is set to end of file,
   ! so offset and size are expected to be equal here
   call MPI_FILE_GET_SIZE(resUnit, fileSize, ierr)
   call MPI_FILE_GET_POSITION(resUnit, fileOffset, ierr)
   print *, 'fileOffset, fileSize', fileOffset, fileSize

   ! seek to the reported position and append two values
   call MPI_FILE_SEEK(resUnit, fileOffset, MPI_SEEK_SET, ierr)
   call MPI_FILE_WRITE(resUnit, outData, 2, &
                       MPI_DOUBLE, MPI_STATUS_IGNORE, ierr)

   ! offset and size should now both have grown by 2*8 = 16 bytes
   call MPI_FILE_GET_POSITION(resUnit, fileOffset, ierr)
   call MPI_FILE_GET_SIZE(resUnit, fileSize, ierr)
   print *, 'fileOffset, fileSize', fileOffset, fileSize

   print *, 'ierr ', ierr
   print *, 'MPI_MODE_WRONLY,  MPI_MODE_APPEND ', MPI_MODE_WRONLY, &
            MPI_MODE_APPEND

   call MPI_FILE_CLOSE(resUnit, ierr)
   call MPI_FINALIZE(ierr)
end




Re: [OMPI users] Odd behavior with subarray datatypes in OpenMPI 1.10.1

2015-12-09 Thread Gilles Gouaillardet

Daniel,

your program works fine with MPICH, and this is very likely an Open MPI bug.

Here is an intermediate patch that solves your problem, but I still have
to test it fully.


Best regards,

Gilles

On 12/9/2015 2:56 AM, GARMANN, DANIEL J DR-02 USAF AFMC AFRL/RQVA wrote:

Hello all,

I've noticed a change in behavior with subarray datatypes in OpenMPI 1.10.1 where the
lower bounds and extents are different than in previous versions. This leads to
incorrect displacements when using the subarrays with MPI-IO. I've attached a sample
code for 2 processors that shows the issue. When run on 2 processors, the program
will decompose a 10x10 array of real(8) elements into two subarrays, dimensioned 5x10
each. It then serially writes a file with two full arrays, x and y, with values
between 1.0 and 10.0, which will then be read in via MPI-IO with the unique subarrays
created on each processor, where rank 0 gets all x <= 5.0 and rank 1 gets all x > 5.0.

--- OpenMPI version 1.10.0 gives the correct behavior with the following output:

FULL ARRAY SIZE: 10 10

RANK = 0
SUBARRAY DIMENSIONS :  5, 10
SUBARRAY INDICES:  0,  0
LOWER BOUND : 0
EXTENT  : 800
X-VALUES :   1.0  2.0  3.0  4.0  5.0
Y-VALUES :   1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0 10.0

RANK = 1
SUBARRAY DIMENSIONS :  5, 10
SUBARRAY INDICES:  5,  0
LOWER BOUND : 0
EXTENT  : 800
X-VALUES :   6.0  7.0  8.0  9.0 10.0
Y-VALUES :   1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0 10.0


--- OpenMPI version 1.10.1 results in the following, where you will notice 
different lower bounds and extents resulting in the wrong y-values on rank 0:

RANK = 0
SUBARRAY DIMENSIONS :  5, 10
SUBARRAY INDICES:  0,  0
LOWER BOUND : 0
EXTENT  : 760
X-VALUES :   1.0  2.0  3.0  4.0  5.0
Y-VALUES :   6.0  1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0

RANK = 1
SUBARRAY DIMENSIONS :  5, 10
SUBARRAY INDICES:  5,  0
LOWER BOUND : 40
EXTENT  : 760
X-VALUES :   6.0  7.0  8.0  9.0 10.0
Y-VALUES :   1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0 10.0
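
For what it is worth, a quick sanity check of those numbers (assuming the MPI-3
definition of MPI_TYPE_CREATE_SUBARRAY, which resizes the result to lower bound 0
and the extent of the full array):

  expected extent = 10 * 10 * 8 bytes = 800   (full 10x10 REAL(KIND=8) array)
  1.10.1, rank 1:  lower bound 40 = 5 * 8     (byte offset of the first subarray element)
                   extent 760 = 800 - 40      (the un-resized 'true' extent)

so 1.10.1 appears to report the un-resized bounds, which then shifts the
displacements in the file view.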


Is this a bug with OpenMPI 1.10.1 or am I assuming something incorrectly in my 
program?  I realize it would be best to read all the data through one call to 
MPI_FILE_READ_ALL; however, this sample program is a very simple example of a 
much more complex program that cannot buffer the I/O so easily, so I am 
compelled to stick with successive reads of the subarray datatypes.

Any help is greatly appreciated!

Best regards,
Dan Garmann



PROGRAM test_extent

IMPLICIT NONE

INCLUDE 'mpif.h'

INTEGER, PARAMETER :: num_dims = 2
INTEGER :: ierr, num_procs, i, j, n, myrank, subarray_type, subsize, &
           arrsize, fid
INTEGER, DIMENSION(num_dims) :: arrdim, subdim, subind
INTEGER(KIND=MPI_ADDRESS_KIND) :: lower_bound, extent
REAL(KIND=8), DIMENSION(10,10) :: x, y
REAL(KIND=8), DIMENSION( 5,10) :: xsub, ysub

CALL MPI_INIT( ierr )
CALL MPI_COMM_RANK( mpi_comm_world, myrank, ierr )
CALL MPI_COMM_SIZE( mpi_comm_world, num_procs, ierr )

arrdim = (/ 10, 10 /)      ! Full array dimensions
subdim = (/  5, 10 /)      ! Sub-array dimensions
subind = (/ myrank*5, 0 /) ! Sub-array starting index in full array (base 0)
arrsize = PRODUCT(arrdim)  ! number of elements in full array
subsize = PRODUCT(subdim)  ! number of elements in sub-array

CALL MPI_TYPE_CREATE_SUBARRAY( num_dims, arrdim, subdim, subind, &
                               MPI_ORDER_FORTRAN, MPI_REAL8, subarray_type, ierr )
CALL MPI_TYPE_COMMIT( subarray_type, ierr )
CALL MPI_TYPE_GET_EXTENT( subarray_type, lower_bound, extent, ierr )

! Write temporary file for testing MPI-IO
IF (myrank == 0) THEN
   DO j = 1, 10
      DO i = 1, 10
         x(i,j) = REAL(i,KIND=8)
         y(i,j) = REAL(j,KIND=8)
      END DO
   END DO
   OPEN(UNIT=1,FILE='io_test.dat',FORM='UNFORMATTED',ACCESS='STREAM',ACTION='WRITE',STATUS='REPLACE')
   WRITE(1) x, y
   CLOSE(1)
END IF

! Open test file and read using MPI-IO with the sub-array datatype
CALL MPI_FILE_OPEN( mpi_comm_world, 'io_test.dat', MPI_MODE_RDONLY, &
                    MPI_INFO_NULL, fid, ierr )
CALL MPI_FILE_SET_VIEW( fid, 0_MPI_OFFSET_KIND, MPI_REAL8, subarray_type, &
                        'NATIVE', MPI_INFO_NULL, ierr )
CALL MPI_FILE_READ_ALL( fid, xsub, subsize, MPI_REAL8, MPI_STATUS_IGNORE, ierr )
CALL MPI_FILE_READ_ALL( fid, ysub, subsize, MPI_REAL8, MPI_STATUS_IGNORE, ierr )
CALL MPI_FILE_CLOSE( fid, ierr )

! Write output to screen
IF (myrank == 0) WRITE(*,'(A,I0,1X,I0)') 'FULL ARRAY SIZE: ', arrdim
DO n = 0, num_procs-1
   IF (myrank == n) THEN
      WRITE(*,'(/A,I0)') 'RANK = ', myrank
      WRITE(*,'(3X,A,I2,", ",I2)') 'SUBARRAY DIMENSIONS : ', subdim
      WRITE(*,'(3X,A,I2,", ",I2)') 'SUBARRAY INDICES: ', subind
      WRITE(*,'(2(3X,A,I0))') 'LOWER BOUND : ', lower_bound