Re: [OMPI users] OMPIO correctness issues
ok, I can confirm that once I update the file_get_position function to what we have in master and the 2.x series, your test passes with ompio in the 1.10 series as well. I am happy to provide a patch for testing, and to submit a PR. I am, however, worried: since we know that ompio in the 1.10 series is significantly out of sync with master, there is potential for other, similar issues. It would nevertheless be interesting to see whether your code works correctly with ompio in the 2.x release (or master), and I would be happy to provide any support necessary for testing (including the offer that I can run the tests myself if you provide me the source code).

Thanks
Edgar
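A possible user-side workaround until a patched 1.10 ompio is available: do not rely on MPI_File_get_position after an append-mode open, and seek to the end of the file explicitly before writing. The sketch below assumes that OMPIO's explicit-seek path in 1.10 behaves correctly, which is untested here; it is not the patch discussed above.

   ! Sketch of a user-side workaround (untested against ompio 1.10):
   ! seek to the end of the file explicitly instead of trusting the
   ! position reported after an MPI_MODE_APPEND open.
   program append_workaround
      use mpi
      implicit none
      integer                  :: ierr, fh
      integer(MPI_OFFSET_KIND) :: zero = 0
      real                     :: outData(10)

      call MPI_Init(ierr)
      call MPI_File_open(MPI_COMM_WORLD, 'out.txt', &
                         MPI_MODE_WRONLY + MPI_MODE_APPEND, MPI_INFO_NULL, fh, ierr)
      ! Do not query MPI_File_get_position here; move the individual file
      ! pointer to the end of the file ourselves.
      call MPI_File_seek(fh, zero, MPI_SEEK_END, ierr)
      call MPI_File_write(fh, outData, 2, MPI_REAL, MPI_STATUS_IGNORE, ierr)
      call MPI_File_close(fh, ierr)
      call MPI_Finalize(ierr)
   end program append_workaround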
Re: [OMPI users] OMPIO correctness issues
ok, forget it, I found the issue. I totally forgot that in the 1.10 series I have to manually force ompio (it is the default on master and 2.x). It fails now for me as well with v1.10; I will let you know what I find.

Thanks
Edgar
Re: [OMPI users] OMPIO correctness issues
What does the mount command return?
Re: [OMPI users] OMPIO correctness issues
Dear Edgar,

On 12/09/15 16:16, Edgar Gabriel wrote:
> I tested your code in master and v1.10 (on my local machine), and I get for
> both versions of ompio exactly the same (correct) output that you had with romio.

I've tested it on a local hard disk:

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[529]$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       1.1T   16G  1.1T   2% /w0

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[530]$ echo hell-o > out.txt; ./a.out
 fileOffset, fileSize    7    7
 fileOffset, fileSize   23   23
 ierr    0
 MPI_MODE_WRONLY, MPI_MODE_APPEND    4  128

pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[531]$ export OMPI_MCA_io=ompio
pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[532]$ echo hell-o > out.txt; ./a.out
 fileOffset, fileSize    0    7
 fileOffset, fileSize    0   16
 ierr    0
 MPI_MODE_WRONLY, MPI_MODE_APPEND    4  128

> However, I also noticed that in the ompio version that is in the v1.10 branch,
> the MPI_File_get_size function is not implemented on lustre.

Yes, we have Lustre in the cluster. I believe that was one of the 'other' issues mentioned; yes, some users tend to use Lustre as an HPC file system =)
Re: [OMPI users] OMPIO correctness issues
Paul,

I tested your code in master and v1.10 (on my local machine), and I get for both versions of ompio exactly the same (correct) output that you had with romio.

However, I also noticed that in the ompio version that is in the v1.10 branch, the MPI_File_get_size function is not implemented on Lustre. Did you run your test by any chance on a Lustre file system?

Thanks
Edgar
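Because the default error handler on file handles is MPI_ERRORS_RETURN, a call into functionality that a given io component does not implement will not necessarily abort the program. A minimal sketch for making such gaps visible is below; it assumes the component actually reports an error code rather than silently returning a wrong value, and the file name is illustrative only.

   ! Sketch: check the error code returned by MPI_File_get_size instead of
   ! trusting the reported size blindly.
   program check_get_size
      use mpi
      implicit none
      integer                              :: ierr, ierr2, fh, reslen
      integer(MPI_OFFSET_KIND)             :: fileSize
      character(len=MPI_MAX_ERROR_STRING)  :: msg

      call MPI_Init(ierr)
      call MPI_File_open(MPI_COMM_WORLD, 'out.txt', MPI_MODE_RDONLY, &
                         MPI_INFO_NULL, fh, ierr)
      call MPI_File_get_size(fh, fileSize, ierr)
      if (ierr /= MPI_SUCCESS) then
         ! The default file error handler is MPI_ERRORS_RETURN, so we get
         ! here instead of aborting.
         call MPI_Error_string(ierr, msg, reslen, ierr2)
         print *, 'MPI_File_get_size failed: ', msg(1:reslen)
      else
         print *, 'file size = ', fileSize
      end if
      call MPI_File_close(fh, ierr)
      call MPI_Finalize(ierr)
   end program check_get_size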
Re: [OMPI users] OMPIO correctness issues
I will look at your test case and see what is going on in ompio. That being said, the vast number of fixes and improvements that went into ompio over the last two years were not back-ported to the 1.8 (and thus 1.10) series, since that would have required changes to the interfaces of the frameworks involved (and thus would have violated one of the rules of the Open MPI release series).

Anyway, if there is a simple fix for your test case for the 1.10 series, I am happy to provide a patch. It might take me a day or two, however.

Edgar

--
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab    http://pstl.cs.uh.edu
Department of Computer Science        University of Houston
Philip G. Hoffman Hall, Room 524      Houston, TX-77204, USA
Tel: +1 (713) 743-3857                Fax: +1 (713) 743-3335
Re: [OMPI users] OMPIO correctness issues
Sorry, forgot to mention: 1.10.1

                Open MPI: 1.10.1
  Open MPI repo revision: v1.10.0-178-gb80f802
   Open MPI release date: Nov 03, 2015
                Open RTE: 1.10.1
  Open RTE repo revision: v1.10.0-178-gb80f802
   Open RTE release date: Nov 03, 2015
                    OPAL: 1.10.1
      OPAL repo revision: v1.10.0-178-gb80f802
       OPAL release date: Nov 03, 2015
                 MPI API: 3.0.0
            Ident string: 1.10.1
Re: [OMPI users] OMPIO correctness issues
Paul,

which Open MPI version are you using?

Thanks for providing a simple reproducer, that will make things much easier from now on.
(and at first glance, that might not be a very tricky bug)

Cheers,

Gilles
[OMPI users] OMPIO correctness issues
Dear Open MPI developers,
has OMPIO (1) reached a 'usable-stable' state?

As we reported in (2), we had some trouble building Open MPI with ROMIO, which was hidden by the OMPIO implementation stepping into the MPI_IO breach. The fact that ROMIO wasn't available was only detected after users complained that 'MPI_IO does not work as expected with version XYZ of Open MPI' and after further investigation.

Take a look at the attached example. It delivers different results with ROMIO and OMPIO, even with one MPI rank on a local hard disk, cf. (3). We've seen more examples of divergent behaviour, but this one is quite handy.

Is that a bug in OMPIO, or did we miss something?

Best
Paul Kapinos


1) http://www.open-mpi.org/faq/?category=ompio

2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php

3) (ROMIO is the default; on a local hard drive at node 'cluster')
$ ompi_info | grep romio
                 MCA io: romio (MCA v2.0.0, API v2.0.0, Component v1.10.1)
$ ompi_info | grep ompio
                 MCA io: ompio (MCA v2.0.0, API v2.0.0, Component v1.10.1)
$ mpif90 main.f90

$ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster ./a.out;
 fileOffset, fileSize   10   10
 fileOffset, fileSize   26   26
 ierr    0
 MPI_MODE_WRONLY, MPI_MODE_APPEND    4  128

$ export OMPI_MCA_io=ompio
$ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster ./a.out;
 fileOffset, fileSize    0   10
 fileOffset, fileSize    0   16
 ierr    0
 MPI_MODE_WRONLY, MPI_MODE_APPEND    4  128

--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915


main.f90 (attached reproducer):

   program example
      use mpi
      integer                        :: ierr
      integer(MPI_OFFSET_KIND)       :: fileOffset
      integer(KIND=MPI_OFFSET_KIND)  :: fileSize
      real                           :: outData(10)
      integer                        :: resUnit = 565

      call MPI_INIT(ierr)
      call MPI_file_open(MPI_COMM_WORLD, 'out.txt', &
                         MPI_MODE_WRONLY + MPI_MODE_APPEND, &
                         MPI_INFO_NULL, resUnit, ierr)
      call MPI_FILE_GET_SIZE(resUnit, fileSize, ierr)
      call MPI_file_get_position(resUnit, fileOffset, ierr)
      print *, 'fileOffset, fileSize', fileOffset, fileSize
      call MPI_file_seek(resUnit, fileOffset, MPI_SEEK_SET, ierr)
      call MPI_file_write(resUnit, outData, 2, &
                          MPI_DOUBLE, MPI_STATUS_IGNORE, ierr)
      call MPI_file_get_position(resUnit, fileOffset, ierr)
      call MPI_FILE_GET_SIZE(resUnit, fileSize, ierr)
      print *, 'fileOffset, fileSize', fileOffset, fileSize
      print *, 'ierr ', ierr
      print *, 'MPI_MODE_WRONLY, MPI_MODE_APPEND ', MPI_MODE_WRONLY, MPI_MODE_APPEND
      call MPI_file_close(resUnit, ierr)
      call MPI_FINALIZE(ierr)
   end
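Incidentally, the reproducer passes MPI_DOUBLE (the C datatype handle) for a default-real buffer, so each run writes 2 x 8 = 16 bytes. That is unrelated to the offset discrepancy, but for reference, a type-consistent variant of the write would look like the following sketch (buffer declared double precision, matched with MPI_DOUBLE_PRECISION); it keeps the same open/seek/write structure as main.f90.

   ! Sketch of a type-consistent variant of the write in main.f90; this is
   ! incidental to the reported offset issue and changes only the datatype.
   program example_typed
      use mpi
      implicit none
      integer                  :: ierr, fh
      integer(MPI_OFFSET_KIND) :: fileOffset
      double precision         :: outData(10)

      call MPI_Init(ierr)
      call MPI_File_open(MPI_COMM_WORLD, 'out.txt', &
                         MPI_MODE_WRONLY + MPI_MODE_APPEND, MPI_INFO_NULL, fh, ierr)
      ! Same logic as the reproducer: position query, explicit seek, write.
      call MPI_File_get_position(fh, fileOffset, ierr)
      call MPI_File_seek(fh, fileOffset, MPI_SEEK_SET, ierr)
      ! double precision buffer matched with MPI_DOUBLE_PRECISION:
      ! still writes 2 x 8 = 16 bytes, as the original did with MPI_DOUBLE.
      call MPI_File_write(fh, outData, 2, MPI_DOUBLE_PRECISION, &
                          MPI_STATUS_IGNORE, ierr)
      call MPI_File_close(fh, ierr)
      call MPI_Finalize(ierr)
   end program example_typed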