It doesn't have anything to do with the solver. The issue is that multiprocessing pickles the arguments to the multiprocessed function, and FiPy has known issues with pickling (see https://github.com/usnistgov/fipy/issues/358 and https://github.com/usnistgov/fipy/wiki/MeshPickling#sour-pickles, which didn't survive migration from http://matforge.org/fipy/wiki/MeshPickling#sourpickles).
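The mechanism is easy to see with a toy class (a made-up analogy, not FiPy's actual pickling code): if pickling rebuilds an object from its construction arguments, then any attribute you mutate after construction never reaches the worker process, because multiprocessing.Pool.map() sends every argument through pickle even with processes=1.

```python
import pickle

# Toy stand-in for a mesh whose pickling rebuilds the object from its
# construction arguments (a simplified analogy, NOT FiPy's actual code).
class ToyGrid:
    def __init__(self, nx, ny):
        self.nx = nx
        self.ny = ny
        self.origin = (0.0, 0.0)   # default origin

    def __reduce__(self):
        # Only the constructor arguments are saved; any attribute
        # mutated after construction is silently lost on unpickling.
        return (ToyGrid, (self.nx, self.ny))

mesh = ToyGrid(4, 4)
mesh.origin = (-2.0, -2.0)         # mutated after construction

# multiprocessing.Pool.map() pickles its arguments; a round trip
# through pickle shows what the worker process would actually see:
unpickled = pickle.loads(pickle.dumps(mesh))
print(mesh.origin)        # (-2.0, -2.0)
print(unpickled.origin)   # (0.0, 0.0) -- the shifted origin never arrives
```

This is also why constructing the shifted mesh directly (adding the offset to the Grid2D, as below) survives the trip: the translation then becomes part of the state that the pickling code knows how to reconstruct.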
In this case, the problem is that our pickling code is unaware that you have changed the mesh origin. This situation is easily resolved by replacing

> mesh = fipy.Grid2D(dx=dx, dy=dp, nx=2*nx, ny=2*np, communicator=comm)
> mesh.origin = numpy.array([[-xMax],[-pMax]])

with

> mesh = fipy.Grid2D(dx=dx, dy=dp, nx=2*nx, ny=2*np, communicator=comm) \
>        + [[-xMax],[-pMax]]

As a final remark, I would encourage trimming a *lot* more when preparing diagnostic examples. A single iteration is sufficient to demonstrate the problem and is much easier to follow. I set n1 = n2 = 1 and looped 'for k in range(1)'; nothing more is needed. You can probably throw away a lot more than that.

On Mar 13, 2015, at 4:16 PM, [email protected] wrote:

> There are three possible ways to execute the code. One could execute
> everything in serial, use multiprocessing with processes=1, or use
> multiprocessing with processes=N. With my code, I'm finding that the
> two multiprocessing results are the same but do not agree with the
> serial result.
>
> I'm attaching a trimmed-down version of the program which still
> displays this behavior. It is currently set up to test the impulse
> responses both in serial and in parallel with processes=1, forcing
> both methods to use LinearPCGSolver. It also prints out lists of the
> impulse response after a small time evolution.
>
> Thank you,
> Tony
>
> # Import Modules
> import fipy
> import numpy
> import multiprocessing
>
> ############################## Functions ############################
> def evolution(phi, tStart, tEnd, mesh, xMax, pMax):
>     tEvo = tStart
>     X, P = mesh.cellCenters()
>     xFace, pFace = mesh.faceCenters()
>     dx = mesh.dx
>     dp = mesh.dy
>     nHalf = mesh.shape[1]/2
>     while(tEvo<tEnd):
>         timeStepDuration=((1./(nHalf))*(1./numpy.sqrt(2)))
>         if(tEvo+timeStepDuration>tEnd):
>             timeStepDuration = tEnd - tEvo
>         # Create Diffusion Term
>         gxx = 0.*X
>         gxp = 0.*X
>         gpx = 0.*X
>         gpp = 0.*X + 2.
>         dTerm = fipy.DiffusionTerm(fipy.CellVariable(mesh=mesh,
>                                                      value=[[gxx,gxp],[gpx,gpp]]))
>         del gxx, gxp, gpx, gpp
>         # Create convection term
>         uX = 1.*pFace
>         uP = -1.*xFace - 2.*pFace
>         cCoeff = fipy.FaceVariable(mesh=mesh, value=[uX,uP])
>         cTerm = fipy.ExponentialConvectionTerm(cCoeff)
>         del uX, uP, cCoeff
>         # Create evolution equation
>         eq = fipy.TransientTerm()+cTerm==dTerm
>         # Evolve system
>         solver = fipy.solvers.LinearPCGSolver(tolerance=10**-20, \
>                                               iterations=1000, precon=None)
>         eq.solve(var=phi, dt=timeStepDuration, solver=solver)
>         norm = fipy.numerix.sum(phi.value, axis=0)*dx*dp
>         tEvo = tEvo + timeStepDuration
>         if(abs(norm-1)>10**(-10)):
>             s1 = 'Error: Distribution is no longer normalized.\n'
>             s2 = 'Abs(1-Norm) = '
>             s3 = repr(norm-1)
>             print(s1 + s2 + s3)
>             del s1, s2, s3
>             break
>         del timeStepDuration, cTerm, dTerm, eq, norm, solver
>     del tEvo, X, P, xFace, pFace, dx, dp, nHalf
>     del tStart, tEnd, mesh, xMax, pMax
>     return phi
>
> def impulseResponse(argArray):
>     tStart = argArray[0]
>     tFinal = argArray[1]
>     mesh = argArray[2]
>     xMax = argArray[3]
>     pMax = argArray[4]
>     k = argArray[5]
>     dx = mesh.dx
>     dp = mesh.dy
>     nSide = mesh.nx
>     phi = fipy.CellVariable(mesh=mesh, name=r"$\rho(x,p)$")
>     temp = numpy.zeros(((nSide)**2,))
>     temp[k]=1.
>     norm = dx*dp
>     phi.setValue(temp/norm)
>     del dx, dp, nSide, temp, norm, argArray
>     response = evolution(phi, tStart, tFinal, mesh, xMax, pMax)
>     del phi, tStart, tFinal, mesh, xMax, pMax
>     print(repr(k)+' , '+repr(response.value[k]))
>     del k
>     return response
>
> #####################################################################
> #####################################################################
>
> #################### User Specified Parameters ######################
>
> # Initial time for evolution
> tStart = 0.
>
> # Final time for evolution
> tFinal = 0.001
>
> # Number of divisions per thermal length in phase space
> n1 = 2
>
> # Number of thermal lengths in phase space domain
> n2 = 3
>
> #####################################################################
> #####################################################################
>
> # Initialize phase-space lattice
> nx = n1*n2
> np = nx
> dx = 1./n1
> dp = 1./n1
> xMax = dx*nx
> pMax = dp*np
> comm = fipy.tools.serial
> mesh = fipy.Grid2D(dx=dx, dy=dp, nx=2*nx, ny=2*np, communicator=comm)
> mesh.origin = numpy.array([[-xMax],[-pMax]])
> del np, dx, dp
>
> # Find all impulse response functions in parallel
> pool = multiprocessing.Pool(processes=1)
> responseStore = pool.map(impulseResponse, \
>                          [[tStart, tFinal, mesh, xMax, pMax, k] \
>                           for k in range((2*nx)**2)])
> pool.close()
> pool.join()
> del pool
>
> # Find all impulse response functions in serial
> responseStoreSer = numpy.zeros(((2*nx)**2,(2*nx)**2))
> for k in range((2*nx)**2):
>     responseStoreSer[k,:] = impulseResponse([tStart, tFinal, mesh, \
>                                              xMax, pMax, k])


>> I'm not aware that we've encountered this, but I also don't think we've
>> ever tried.
>>
>> Just to clarify, the issue is that the answer changes between
>> processes=1 and processes=2+?
>>
>> I don't think that any of FiPy's parallel code (e.g., using mpirun to
>> invoke PyTrilinos in parallel) should be triggered by this use of
>> multiprocessing, but we've never tried.
>>
>> I'll try to do some experiments. If you'd like to send me a complete
>> (non)working example, I'll be happy to take a look at it.
>>
>> On Mar 13, 2015, at 2:46 PM, Tony Bartolotta <[email protected]> wrote:
>>
>>> FiPy Developers,
>>>
>>> I am currently using FiPy 3.1 with Python 2.7 on a laptop running
>>> Ubuntu 14.04. I am attempting to find the impulse responses of a
>>> Fokker-Planck equation.
>>> In the snippets of code below, funcCall(args) calculates an impulse
>>> response with FiPy using a non-parallelized solver.
>>>
>>> Serial Calls
>>>
>>> for k in range(N):
>>>     responses[k,:] = funcCall(args)
>>>
>>> Parallel Calls
>>>
>>> pool = multiprocessing.Pool(processes=1)
>>> responses = pool.map(funcCall, [args for k in range(N)])
>>> pool.close()
>>> pool.join()
>>>
>>> I am finding that FiPy returns different numerical results when I
>>> call this method in serial or in parallel. Strangely, even if the
>>> parallel calls are carried out by only one worker using the same
>>> solver, the results are still different.
>>>
>>> Using a debugging tool, all arguments appear to be correctly passed
>>> to FiPy. After considering a few test cases, the serial calls appear
>>> to arrive at the correct numerical solutions while the parallel
>>> version does not. I am unsure what is causing this behavior. Are you
>>> aware of any potential cause of this behavior?
>>>
>>> If necessary, I can post a more complete version of the code.
>>>
>>> Thank you,
>>> Tony
>>>
>>> _______________________________________________
>>> fipy mailing list
>>> [email protected]
>>> http://www.ctcms.nist.gov/fipy
>>> [ NIST internal ONLY: https://email.nist.gov/mailman/listinfo/fipy ]
