BTW: How does redistributing MPI/runtime work with all the choices you have?

For ex: with MS-MPI, Intel-MPI - wouldn't the user have to install these 
packages? [i.e. you can't just copy them over to a folder and have mpiexec work 
- from what I can tell]

And how did you plan on installing MPICH - but make mpiexec from OpenMPI 
redistributable? Did you use OpenMPI from cygwin - or install it manually?

And presumably you don't want users installing cygwin.

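On the launcher question: as far as I know, mpiexec has to come from the same MPI implementation the executable was linked against. A minimal Python sketch (not something PETSc ships) for guessing which family a given mpiexec belongs to from its version banner is below. The helper name and the marker strings are assumptions, based on commonly seen banners, and may vary between releases:

```python
# Hypothetical helper: guess the MPI family of a launcher from the text it
# prints as its version/help banner. The marker strings are assumptions and
# are not guaranteed to match every release of every implementation.

# Order matters: Intel MPI's launcher is Hydra-based, so it is checked
# before the generic "hydra" marker used for MPICH.
MARKERS = [
    ("Intel MPI", ("intel(r) mpi",)),
    ("MS-MPI", ("microsoft mpi",)),
    ("Open MPI", ("open mpi", "openrte", "open-mpi")),
    ("MPICH (Hydra)", ("hydra",)),
]


def classify_launcher(banner: str) -> str:
    """Return the guessed MPI family for a banner string, or 'unknown'."""
    text = banner.lower()
    for name, needles in MARKERS:
        if any(needle in text for needle in needles):
            return name
    return "unknown"
```

The input would be whatever the launcher prints for its version or help option (the exact flag differs between implementations, so check each one). If the result does not match the MPI that PETSc was configured with, the launcher and the executable are probably not compatible.
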
Satish

On Sun, 28 Jun 2020, Satish Balay via petsc-users wrote:

> On Sun, 28 Jun 2020, Paolo Lampitella wrote:
> 
> > Dear PETSc users,
> > 
> > I’ve been a happy PETSc user since version 3.3, using it both under Ubuntu 
> > (from 14.04 up to 20.04) and CentOS (from 5 to 8).
> > 
> > I use it as an optional component for a parallel Fortran code (that, BTW, 
> > also uses metis) and, wherever allowed, I used to install MPI myself (both 
> > MPICH and OpenMPI) and PETSc on top of it without any trouble ever (besides 
> > being, myself, as dumb as one can be in this).
> > 
> > I did this on top of gnu compilers and, less extensively, intel compilers, 
> > both on a range of different systems (from virtual machines, to 
> > workstations to actual clusters).
> > 
> > So far so good.
> > 
> > Today I find myself in the need of deploying my application to Windows 10 
> > users, which means giving them a folder with all the executables and 
> > libraries to make them run in it, including the mpi runtime. Unfortunately, 
> > I also have to rely on free tools (can’t afford Intel for the moment).
> > 
> > To the best of my knowledge, considering also far from optimal solutions, 
> > my options would then be: Virtual machines and WSL1, Cygwin, MSYS2-MinGW64, 
> > Cross compiling with MinGW64 from within Linux, PGI + Visual Studio + 
> > Cygwin (not sure about this one)
> > 
> > I know this is largely unsupported, but I was wondering if there is, 
> > nonetheless, some general (and more official) knowledge available on the 
> > matter. What I tried so far:
> > 
> > 
> >   1.  Virtual machines and WSL1: both work like a charm, just like in the 
> > native OS, but very far from ideal for the distribution purpose
> > 
> > 
> >   2.  Cygwin with gnu compilers (as opposed to using Intel and Visual 
> > Studio): I was unable to compile MPI myself as I am used to on Linux, so I 
> > just tried going all in and let PETSc do everything for me (using static 
> > linking): download and install MPICH, BLAS, LAPACK, METIS and HYPRE. 
> > Everything just worked (for now compiling and making trivial tests) and I 
> > am able to use everything from within a cygwin terminal (even with 
> > executables and dependencies outside cygwin). Still, even within cygwin, I 
> > can’t switch to use, say, the cygwin ompi mpirun/mpiexec for an mpi program 
> > compiled with PETSc mpich (things run but not as expected). Some troubles 
> > start when I try to use cmd.exe (which I pictured as the more natural way 
> > to launch in Windows). In particular, using (note that \ is in cmd.exe, / 
> > was used in cygwin terminal):
> 
> I don't understand. Why build with MPICH - but use mpiexec from OpenMPI?
> 
> If it is because you can easily redistribute OpenMPI - why not build PETSc 
> with OpenMPI?
> 
> You can't use Intel/MS-MPI from cygwin/gcc/gfortran
> 
> Also - even though --download-mpich works with cygwin/gcc - it's no longer 
> supported on windows [by MPICH group].
> 
> > 
> > .\mpiexec.hydra.exe -np 8 .\my.exe
> > 
> > Nothing happens unless I push Enter a second time. Things seem to work 
> > then, but if I try to run a serial executable with the command above I get 
> > the following errors (which, instead, don’t happen using the cygwin 
> > terminal):
> > 
> > [proxy:0:0@Dell7540-Paolo] HYDU_sock_write (utils/sock/sock.c:286): write 
> > error (No such process)
> > [proxy:0:0@Dell7540-Paolo] HYD_pmcd_pmip_control_cmd_cb 
> > (pm/pmiserv/pmip_cb.c:935): unable to write to downstream stdin
> > [proxy:0:0@Dell7540-Paolo] HYDT_dmxu_poll_wait_for_event 
> > (tools/demux/demux_poll.c:76): callback returned error status
> > [proxy:0:0@Dell7540-Paolo] main (pm/pmiserv/pmip.c:206): demux engine error 
> > waiting for event
> > [mpiexec@Dell7540-Paolo] control_cb (pm/pmiserv/pmiserv_cb.c:200): assert 
> > (!closed) failed
> > [mpiexec@Dell7540-Paolo] HYDT_dmxu_poll_wait_for_event 
> > (tools/demux/demux_poll.c:76): callback returned error status
> > [mpiexec@Dell7540-Paolo] HYD_pmci_wait_for_completion 
> > (pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
> > [mpiexec@Dell7540-Paolo] main (ui/mpich/mpiexec.c:336): process manager 
> > error waiting for completion
> > 
> > Just for the sake of completeness, I also tried using the Intel and 
> > Microsoft MPI redistributables, which might be more natural candidates, 
> > instead of the petsc compiled version of the MPI runtime (and they are 
> > MPICH derivatives, after all). But, running with:
> > 
> > mpiexec -np 1 my.exe
> > 
> > I get the following error with Intel:
> > 
> > [cli_0]: write_line error; fd=440 buf=:cmd=init pmi_version=1 
> > pmi_subversion=1
> > :
> > system msg for write_line failure : Bad file descriptor
> > [cli_0]: Unable to write to PMI_fd
> > [cli_0]: write_line error; fd=440 buf=:cmd=get_appnum
> > :
> > system msg for write_line failure : Bad file descriptor
> > Fatal error in MPI_Init: Other MPI error, error stack:
> > MPIR_Init_thread(467):
> > MPID_Init(140).......: channel initialization failed
> > MPID_Init(421).......: PMI_Get_appnum returned -1
> > [cli_0]: aborting job:
> > Fatal error in MPI_Init: Other MPI error, error stack:
> > MPIR_Init_thread(467):
> > MPID_Init(140).......: channel initialization failed
> > MPID_Init(421).......: PMI_Get_appnum returned -1
> > 
> > And the following error with MS-MPI:
> > 
> > [unset]: unable to decode hostport from 44e5747b-d19e-4ea8-ac7a-ec2102cabb21
> > Fatal error in MPI_Init: Other MPI error, error stack:
> > MPIR_Init_thread(467):
> > MPID_Init(140).......: channel initialization failed
> > MPID_Init(403).......: PMI_Init returned -1
> > [unset]: aborting job:
> > Fatal error in MPI_Init: Other MPI error, error stack:
> > MPIR_Init_thread(467):
> > MPID_Init(140).......: channel initialization failed
> > MPID_Init(403).......: PMI_Init returned -1
> > 
> > independently of the number of processes, but more processes produce more 
> > copies of this. However, both Intel and MS-MPI are able to run a serial 
> > fortran executable built with cygwin. I think I did everything correctly 
> > and adding -localhost didn’t help (actually, it caused more problems with the 
> > interpretation of the cmd line arguments for mpiexec)
> > 
> > 
> >   3.  Cygwin with MinGW64 compilers. Never managed to compile MPI, not even 
> > through PETSc.
> > 
> > 
> > 
> >   4.  MSYS2+MinGW64 compilers. I understood that MinGW is not well 
> > supported, probably because of how it handles paths, but I wanted to give 
> > it a try, because it should be more “native” and there seem to be relevant 
> > examples out there that managed to do it. I first tried with the msys2 mpi 
> > distribution, produced the .mod file out of the mpi.f90 file in the 
> > distribution (I tried my best with different hacks from known limitations 
> > of this file as also present in the official MS-MPI distribution) and tried 
> > with my code without petsc, but it failed in compiling the code with some 
> > strange MPI related error (argument mismatch between two unrelated MPI 
> > calls in the code, which is nonsense to me). In contrast, simple mpi tests 
> > (hello world like) worked as expected. Then I decided to follow this:
> > 
> > 
> > 
> > https://doc.freefem.org/introduction/installation.html#compilation-on-windows
> > 
> > 
> > 
> > but the exact same type of error came up (MPI calls in my code were 
> > different, but the error was the same). Trying again from scratch (i.e., 
> > without all the things I did in the beginning to compile my code) the same 
> > error came up in compiling some of the freefem dependencies (this time not 
> > even mpi calls).
> > 
> > 
> > 
> > As a side note, there seems to be an official effort in porting petsc to 
> > msys2 
> > (https://github.com/okhlybov/MINGW-packages/tree/whpc/mingw-w64-petsc), but 
> > it didn’t get into the official packages yet, which I interpret as a warning
> > 
> > 
> > 
> >   5.  Didn’t try cross compiling with MinGW from Linux, as I 
> > thought it couldn’t be any better than doing it from MSYS2
> >   6.  Didn’t try PGI as I actually didn’t know if I would then be able to 
> > make PETSc work.
> > 
> > So, here are some questions I have with respect to where I stand now 
> > and the points above:
> > 
> > 
> >      *   I haven’t seen the MSYS2-MinGw64 toolchain mentioned at all in 
> > official documentation/discussions. Should I definitely abandon it (despite 
> > someone mentioning it as working) because of known issues?
> 
> I don't have experience with MSYS2-MinGW64. However, Pierre does - and perhaps 
> can comment on this. I don't know how things work on the fortran side.
> 
> >      *   What about the PGI route? I don’t see it mentioned as well. I 
> > guess it would require some work on win32fe
> 
> Again - no experience here.
> 
> >      *   For my Cygwin-GNU route (basically what is mentioned in PFLOTRAN 
> > documentation), am I expected to then run from the cygwin terminal or 
> > should the windows prompt work as well? Is the fact that I require a second 
> > Enter hit and the mismanagement of serial executables the sign of something 
> > wrong with the Windows prompt?
> 
> I would think Cygwin-GNU route should work. I'll have to see if I can 
> reproduce the issues you have.
> 
> Satish
> 
> >      *   More generally, is there some known working, albeit non official, 
> > route given my constraints (free+fortran+windows+mpi+petsc)?
> > 
> > Thanks for your attention and your great work on PETSc
> > 
> > Best regards
> > 
> > Paolo Lampitella
> > 
