> On Feb 25, 2017, at 11:00 AM, Matthew Knepley <[email protected]> wrote:
>
> On Sat, Feb 25, 2017 at 10:12 AM, Gideon Simpson <[email protected]>
> wrote:
>
> -gideon
>
>> On Feb 25, 2017, at 10:44 AM, Jed Brown <[email protected]> wrote:
>>
>> Gideon Simpson <[email protected]> writes:
>>
>>> I’ve been continuing working on implementing a projection method problem,
>>> which, loosely, looks like the following. Vec y contains the state vector
>>> for my system, y’ = f(y) which is solved with a TS, using, for now, rk4.
>>>
>>> I have added to this a TSPostStep which does root finding on the nonlinear
>>> problem
>>>
>>> f(lambda) = f(lambda; y, w) = g(y + lambda * w) - g0
>>
>> You might want to treat this as a DAE.
>
> Yea, we discussed DAE previously. For the sake of a reviewer, I’m trying to
> implement a classical method (explicit time step + projection), rather than
> DAE.
>
>>
>>> y and w are vectors that are the size of the system (numbers of mesh
>>> points), and lambda is a scalar perturbation.
>>>
>>> Right now, the snes function for solving this amounts to:
>>>
>>> PetscErrorCode projector(SNES snes, Vec lam, Vec f, void *ctx){
>>>
>>> AppCtx * user = (AppCtx *) ctx;
>>> const PetscScalar *lam_value;
>>> PetscScalar *f_value;
>>>
>>> VecGetArrayRead(lam, &lam_value);
>>> VecGetArray(f, &f_value);
>>> VecCopy(user->y, user->w);
If you do make SNES run on PETSC_COMM_WORLD where lam is now a parallel
vector with 1 entry on process 0 and zero entries on all the others (and
similar for f)
You will need to broadcast the lam_value[0] to all processes before calling the
VecAXPY()
>>> VecAXPY(user->w, lam_value[0], user->gradMy);
>>>
>>> f_value[0] = g(user->w) -user->g0;
>>> VecRestoreArrayRead(lam, &lam_value);
>>> VecRestoreArray(f, &f_value);
>>>
>>> return 0;
>>> }
>>>
>>> To get this all to work, I constructed the SNES and Vec lam with
>>> PETSC_COMM_SELF, effectively putting a copy of the nonlinear problem on
>>> each process. Basically, the nonlinear problem is low dimensional, but it
>>> parametrically depends on the high dimensional, distributed, vectors y and
>>> w.
>>>
>>> The one thing that bothers me about this is:
>>>
>>> 1. It would seem that each process is is doing all of these vector
>>> operations, which is entirely redundant, as the only thing that really
>>> needs to be shared amongst the processes is the value of lambda. Is that
>>> correct?
>>
>> Only the SNES part is redundant, but it's super cheap because it's a
>> scalar problem. The Vec operations are using global (parallel,
>> distributed) vectors, so they are not redundant. Of course it is
>> critical for the SNES on every process to be configured identically and
>> deterministic so that the processes don't diverge. And g(user->w) must
>> return the same value on each process (it probably finishes with an
>> MPI_Allreduce or similar).
>
> So within the root finding problem, when it calls, for instance, VecCopy
> (which is acting on distributed vectors), it’s not doing that once on process
> 0, once on process 1, once on process 2, etc. Or it is, but you’re saying
> it’s too cheap to matter?
Each process is copying ITs part of the vector; which is exactly what you
need.
>
> No they are collective. That is why Jed said its crucial that the same
> branches are taken on all processes.
>
> Two things I’m mildly concerned about is that, since user->w changes at each
> step of the solver, by virtue of the lambda scalar changing, if the SNES’s
> are running asynchronously, by virtue of them being PETSC_COMM_SELF
> constructs, couldn’t there be a collision?
>
> Yes, as I say above. I think its much safer to run a COMM_WORLD SNES that
> just has nothing on most procs than to do the former
> since you are already synchronizing everywhere.
>
> The operation g(user->w) does indeed conclude with an MPI_Allreduce, followed
> by DMDAVecRestoreArray and DMRestoreLocalVector.
>
> Lastly, in the TSPostStep, it runs as:
>
> SNESSolve(user->snes, NULL,user->lam);
> VecGetArray(user->lam, &lam_value);
Again here you would need to broadcast this user->lam value to all processes
before calling the VecAXPY() since it is only stored on process zero.
> VecAXPY(y, lam_value[0], user->gradMy);
> VecRestoreArray(user->lam, &lam_value);
>
> Is there any communication issue with ensuring that all the SNES’s have
> finished on each process, before proceeding to do the vector operations?
>
> No.
>
> I keep wondering if I should just write this as a global, with the data
> structures stored on process 0, just to avoid this kind of headache.
>
> Yes.
>
> Matt
>
>>
>>> 2. Is there an obvious way around this redundancy?
>>>
>>> -gideon
>
>
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener