On 29/08/16 15:06, Guyer, Jonathan E. Dr. (Fed) wrote:
> Hard to say. Diagnosing parallel race conditions is neither easy nor 
> entertaining.
>
> Could be that you just need a comm.Barrier() call before the comm.bcast() 
> calls.
>
> One possible issue is that FiPy automatically partitions the mesh when it 
> detects that it is running in parallel, so the result of your run() routine 
> is that phi1 gets solved only on partition 1, phi2 only on partition 2, etc. 
> and so it may be waiting for the solutions on the other partitions that are 
> never going to come.
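
For reference, I read the Barrier suggestion as roughly the sketch below,
with a plain mpi4py communicator; the names are only illustrative, not
from my actual script:

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD

# stand-in for the per-rank solve; in my script this is the FiPy sweep
local_result = np.full(4, comm.rank, dtype=float)

comm.Barrier()  # make sure every rank gets here before communicating
result = comm.bcast(local_result if comm.rank == 0 else None, root=0)
print comm.rank, result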

After printing the shapes of phi1...phi4, I found the cause: I want to
solve the four equations with one core per equation, each on the whole
mesh. However, once the script runs under $mpiexec -n 4 python script.py,
FiPy automatically partitions the mesh and the CellVariables into four
subdomains. Moreover, it seems that a user-defined communicator cannot
control FiPy's communicator.
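The partitioning is easy to see directly; if I understand the docs
correctly, fipy.tools exposes a parallelComm object with procID and
Nproc attributes:

import fipy as fp
from fipy.tools import parallelComm

mesh = fp.PeriodicGrid2D(nx=200, ny=200, dx=1., dy=1.)
print "processor %d of %d holds %d cells" % (parallelComm.procID,
                                             parallelComm.Nproc,
                                             mesh.numberOfCells)

Run with mpiexec -n 4, each rank reports only its own slice of the
40000 cells.
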
Anyway, even if I do not do anything tricky and just write normal code
and run it on four cores in parallel, for example the following:

#############

import fipy as fp
import numpy as np

nx = 200
ny = nx
dx = 1.
dy = 1.
mesh = fp.PeriodicGrid2D(nx=nx, ny=ny, dx=dx, dy=dy)

# minimal variable setup (initial values omitted here);
# hasOld=True is needed because updateOld() is called below
phi1 = fp.CellVariable(mesh=mesh, name='phi1', hasOld=True)
phi2 = fp.CellVariable(mesh=mesh, name='phi2', hasOld=True)
phi3 = fp.CellVariable(mesh=mesh, name='phi3', hasOld=True)
phi4 = fp.CellVariable(mesh=mesh, name='phi4', hasOld=True)

solver = fp.LinearGMRESSolver  # any FiPy solver class works for this example

eq1 = fp.TransientTerm(var=phi1, coeff=1.) == fp.DiffusionTerm(var=phi1, coeff=1.)
eq2 = fp.TransientTerm(var=phi2, coeff=1.) == fp.DiffusionTerm(var=phi2, coeff=1.)
eq3 = fp.TransientTerm(var=phi3, coeff=1.) == fp.DiffusionTerm(var=phi3, coeff=1.)
eq4 = fp.TransientTerm(var=phi4, coeff=1.) == fp.DiffusionTerm(var=phi4, coeff=1.)
eq = eq1 & eq2 & eq3 & eq4
res = 1.e4
while res > 1.e-3:
    eq.cacheMatrix()
    eq.cacheRHSvector()
    res = eq.sweep(dt=1, solver=solver())
    mat = eq.matrix
    vec = eq.RHSvector

phi1.updateOld()
phi2.updateOld()
phi3.updateOld()
phi4.updateOld()
print 'phi1 shape = ', np.shape(phi1.value)

#############

I notice that np.shape(phi1.value) is not the whole-mesh shape
(200*200 = 40000); instead it is (10400,), (10400,), (10800,) and
(10800,) on the four subdomains. The question is: how do I gather the
pieces of phi1 from the subdomains into a complete variable on the whole
domain (with shape 40000)? Thanks.
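
The kind of gather I have in mind looks roughly like the sketch below,
continuing from the script above (mpi4py Gatherv; note that this naive
version also collects the ghost cells, so the total comes out larger
than 40000 and in partition order rather than global order):

from mpi4py import MPI

comm = MPI.COMM_WORLD

local = np.asarray(phi1.value, dtype=float)    # this rank's piece of phi1
counts = np.array(comm.allgather(local.size))  # piece sizes from every rank
full = np.empty(counts.sum()) if comm.rank == 0 else None
comm.Gatherv(local, (full, counts), root=0)    # rank 0 receives all the pieces
if comm.rank == 0:
    print 'gathered shape = ', full.shape

Or is phi1.globalValue meant for exactly this?
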
>> On Aug 28, 2016, at 11:50 AM, ronghai wu <[email protected]> wrote:
>>
>>
>> Dear Fipy developers and users,
>>
>> The official way of running in parallel works for me; however, I would like 
>> to try a different approach, shown in the attached script. When run with 
>> $time mpiexec -n 4 python My_Fipy_Parallel.py, it gets stuck at time step 
>> 179 without any error message; all four cores are busy but there is enough 
>> memory left. The bigger the mesh, the sooner it gets stuck. Does anyone 
>> know why, and how to solve this problem? Thanks.
>>
>> Regards
>> Ronghai
>> <My_Fipy_Parallel.py>
-- 
------------------------------------------
Ronghai Wu

Institute of Materials Simulation (WW8)
Department of Materials Science and Engineering
University of Erlangen-Nürnberg
Dr.-Mack-Str. 77, 90762 Fürth, Germany

Tel. +49 (0)911 65078-65064