On Sun, 28 Jun 2020, Paolo Lampitella wrote:

> Dear PETSc users,
> 
> I’ve been a happy PETSc user since version 3.3, using it both under Ubuntu 
> (from 14.04 up to 20.04) and CentOS (from 5 to 8).
> 
> I use it as an optional component for a parallel Fortran code (that, BTW, 
> also uses metis) and, wherever allowed, I used to install MPI myself (both 
> MPICH and OpenMPI) and PETSc on top of it without any trouble ever (besides 
> being, myself, as dumb as one can be in this).
> 
> I did this on top of gnu compilers and, less extensively, intel compilers, 
> both on a range of different systems (from virtual machines, to workstations 
> to actual clusters).
> 
> So far so good.
> 
> Today I find myself in the need of deploying my application to Windows 10 
> users, which means giving them a folder with all the executables and 
> libraries to make them run in it, including the mpi runtime. Unfortunately, I 
> also have to rely on free tools (can’t afford Intel for the moment).
> 
> To the best of my knowledge, considering also far from optimal solutions, my 
> options would then be: Virtual machines and WSL1, Cygwin, MSYS2-MinGW64, 
> Cross compiling with MinGW64 from within Linux, PGI + Visual Studio + Cygwin 
> (not sure about this one)
> 
> I know this is largely unsupported, but I was wondering if there is, 
> nonetheless, some general (and more official) knowledge available on the 
> matter. What I tried so far:
> 
> 
>   1.  Virtual machines and WSL1: both work like a charm, just like in the 
> native OS, but very far from ideal for the distribution purpose
> 
> 
>   1.  Cygwin with gnu compilers (as opposed to using Intel and Visual 
> Studio): I was unable to compile MPI myself as I am used to on Linux, so I 
> just tried going all in and let PETSc do everything for me (using static 
> linking): download and install MPICH, BLAS, LAPACK, METIS and HYPRE. 
> Everything just worked (for now compiling and making trivial tests) and I am 
> able to use everything from within a cygwin terminal (even with executables 
> and dependencies outside cygwin). Still, even within cygwin, I can’t switch 
> to use, say, the cygwin ompi mpirun/mpiexec for an mpi program compiled with 
> PETSc mpich (things run but not as expected). Some troubles start when I try 
> to use cmd.exe (which I pictured as the more natural way to launch in 
> Windows). In particular, using (note that \ is in cmd.exe, / was used in 
> cygwin terminal):

I don't understand. Why build with MPICH - but use mpiexec from OpenMPI?

If it is because you can easily redistribute OpenMPI - why not build PETSc with 
OpenMPI?
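
As a minimal sketch of what such a build could look like, assuming the Cygwin 
OpenMPI development packages are installed and put the mpicc/mpicxx/mpif90 
wrappers on PATH (that package setup is an assumption, and the --download-* 
choices simply mirror the packages you listed). The function only prints the 
command, so it can be inspected before being run from the PETSc source 
directory:

```shell
# Sketch only: print a candidate PETSc configure command instead of running it.
# Assumption: OpenMPI's compiler wrappers (mpicc, mpicxx, mpif90) are on PATH.
petsc_configure_cmd() {
  printf '%s ' ./configure \
    --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 \
    --with-shared-libraries=0 \
    --download-fblaslapack --download-metis --download-hypre
  printf '\n'
}

petsc_configure_cmd
```

With the wrappers given explicitly there is no --download-mpich, so the 
executables would be launched with the mpiexec from that same OpenMPI install.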

You can't use Intel/MS-MPI from cygwin/gcc/gfortran.
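
In general the launcher should come from the same MPI the executable was 
built against. A rough way to check which launcher is first on PATH is 
sketched below; the matched strings are assumptions based on what MPICH 
(Hydra) and Open MPI typically print for `mpiexec --version`, and mpi_flavor 
is just an illustrative helper name:

```shell
# Classify an MPI launcher from the text printed by `mpiexec --version`.
# Assumption: MPICH/Hydra output mentions HYDRA or MPICH, and Open MPI output
# mentions "Open MPI" or OpenRTE; adjust the patterns if yours differs.
mpi_flavor() {
  case "$1" in
    *HYDRA*|*MPICH*)        echo mpich ;;
    *"Open MPI"*|*OpenRTE*) echo openmpi ;;
    *)                      echo unknown ;;
  esac
}

# Report the flavor of whichever mpiexec is found first on PATH, if any.
if command -v mpiexec >/dev/null 2>&1; then
  echo "mpiexec on PATH looks like: $(mpi_flavor "$(mpiexec --version 2>&1)")"
fi
```

If this reports a different flavor than the one PETSc was configured with, 
the PMI errors quoted below are the kind of failure to expect.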

Also - even though --download-mpich works with cygwin/gcc - it's no longer 
supported on Windows [by the MPICH group].

> 
> .\mpiexec.hydra.exe -np 8 .\my.exe
> 
> Nothing happens unless I push Enter a second time. Things seem to work then, 
> but if I try to run a serial executable with the command above I get the 
> following errors (which, instead, don’t happen using the cygwin terminal):
> 
> [proxy:0:0@Dell7540-Paolo] HYDU_sock_write (utils/sock/sock.c:286): write 
> error (No such process)
> [proxy:0:0@Dell7540-Paolo] HYD_pmcd_pmip_control_cmd_cb 
> (pm/pmiserv/pmip_cb.c:935): unable to write to downstream stdin
> [proxy:0:0@Dell7540-Paolo] HYDT_dmxu_poll_wait_for_event 
> (tools/demux/demux_poll.c:76): callback returned error status
> [proxy:0:0@Dell7540-Paolo] main (pm/pmiserv/pmip.c:206): demux engine error 
> waiting for event
> [mpiexec@Dell7540-Paolo] control_cb (pm/pmiserv/pmiserv_cb.c:200): assert 
> (!closed) failed
> [mpiexec@Dell7540-Paolo] HYDT_dmxu_poll_wait_for_event 
> (tools/demux/demux_poll.c:76): callback returned error status
> [mpiexec@Dell7540-Paolo] HYD_pmci_wait_for_completion 
> (pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
> [mpiexec@Dell7540-Paolo] main (ui/mpich/mpiexec.c:336): process manager error 
> waiting for completion
> 
> Just for the sake of completeness, I also tried using the Intel and Microsoft 
> MPI redistributables, which might be more natural candidates, instead of the 
> petsc compiled version of the MPI runtime (and they are MPICH derivatives, 
> after all). But, running with:
> 
> mpiexec -np 1 my.exe
> 
> I get the following error with Intel:
> 
> [cli_0]: write_line error; fd=440 buf=:cmd=init pmi_version=1 pmi_subversion=1
> :
> system msg for write_line failure : Bad file descriptor
> [cli_0]: Unable to write to PMI_fd
> [cli_0]: write_line error; fd=440 buf=:cmd=get_appnum
> :
> system msg for write_line failure : Bad file descriptor
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(421).......: PMI_Get_appnum returned -1
> [cli_0]: aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(421).......: PMI_Get_appnum returned -1
> 
> And the following error with MS-MPI:
> 
> [unset]: unable to decode hostport from 44e5747b-d19e-4ea8-ac7a-ec2102cabb21
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(403).......: PMI_Init returned -1
> [unset]: aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(403).......: PMI_Init returned -1
> 
> independently from the number of processes, but more processes produce more 
> copies of this. However, both Intel and MS-MPI are able to run a serial 
> Fortran executable built with cygwin. I think I did everything correctly and 
> adding -localhost didn’t help (actually, it caused more problems to the 
> interpretation of the cmd line arguments for mpiexec)
> 
> 
>   1.  Cygwin with MinGW64 compilers. Never managed to compile MPI, not even 
> through PETSc.
> 
> 
> 
>   1.  MSYS2+MinGW64 compilers. I understood that MinGW is not well supported, 
> probably because of how it handles paths, but I wanted to give it a try, 
> because it should be more “native” and there seems to be relevant examples 
> out there that managed to do it. I first tried with the msys2 mpi 
> distribution, produced the .mod file out of the mpi.f90 file in the 
> distribution (I tried my best with different hacks from known limitations of 
> this file as also present in the official MS-MPI distribution) and tried with 
> my code without petsc, but it failed in compiling the code with some strange 
> MPI related error (argument mismatch between two unrelated MPI calls in the 
> code, which is nonsense to me). In contrast, simple mpi tests (hello world 
> like) worked as expected. Then I decided to follow this:
> 
> 
> 
> https://doc.freefem.org/introduction/installation.html#compilation-on-windows
> 
> 
> 
> but the exact same type of error came up (MPI calls in my code were 
> different, but the error was the same). Trying again from scratch (i.e., 
> without all the things I did in the beginning to compile my code) the same 
> error came up in compiling some of the freefem dependencies (this time not 
> even mpi calls).
> 
> 
> 
> As a side note, there seems to be an official effort in porting petsc to 
> msys2 (https://github.com/okhlybov/MINGW-packages/tree/whpc/mingw-w64-petsc), 
> but it didn’t get into the official packages yet, which I interpret as a 
> warning
> 
> 
> 
>   1.  Didn’t try cross compiling with MinGW from Linux, as I thought 
> it couldn’t be any better than doing it from MSYS2
>   2.  Didn’t try PGI as I actually didn’t know if I would then be able to 
> make PETSc work.
> 
> So, here are some questions I have with respect to where I stand now 
> and the points above:
> 
> 
>      *   I haven’t seen the MSYS2-MinGw64 toolchain mentioned at all in 
> official documentation/discussions. Should I definitely abandon it (despite 
> someone mentioning it as working) because of known issues?

I don't have experience with MSYS2-MinGW64; however, Pierre does - and perhaps 
can comment on this. I don't know how things work on the Fortran side.

>      *   What about the PGI route? I don’t see it mentioned as well. I guess 
> it would require some work on win32fe

Again - no experience here.

>      *   For my Cygwin-GNU route (basically what is mentioned in PFLOTRAN 
> documentation), am I expected to then run from the cygwin terminal or should 
> the windows prompt work as well? Is the fact that I require a second Enter 
> hit and the mismanagement of serial executables the sign of something wrong 
> with the Windows prompt?

I would think Cygwin-GNU route should work. I'll have to see if I can reproduce 
the issues you have.

Satish

>      *   More generally, is there some known working, albeit non official, 
> route given my constraints (free+fortran+windows+mpi+petsc)?
> 
> Thanks for your attention and your great work on PETSc
> 
> Best regards
> 
> Paolo Lampitella
> 
