So you are saying that E and B don't have any coupling between values at 
different velocities and hence don't need ghost points?
Yes, exactly - E and B depend only on space coordinates.

It is just an optimization and you shouldn't worry about it until you have 
working code.
OK, thanks, that is what I will do.

Oleksandr Koshkarov


On 08/01/2017 10:35 AM, Barry Smith wrote:
On Aug 1, 2017, at 1:20 AM, Oleksandr Koshkarov <[email protected]> wrote:

Thank you, it is very helpful.

I think there is a small problem with this approach. By storing E and B in DMDA 
I will communicate their values each time when I communicate ghost points. 
Moreover, it seems that each processor will have multiples copies of E and B 
(depending on how big velocity range it holds). So it seems it is possible to 
reduce communication and memory usage by 3 (maybe it is not a big number) if 
each process holds only one copy of E and B. However, if E and B will be stored 
in local vector, it is not obvious for me how to construct a time solver so it 
also evolves E and B all together with C.

    So you are saying that E and B don't have any coupling between values at 
different velocities and hence don't need ghost points? DMDA is simple, it 
doesn't have support for only have ghost points for some variables but should 
the communication prove to be time-consuming we could provide a custom version 
that does not ghost-point the E and B. It is just an optimization and you 
shouldn't worry about it until you have working code.

   Barry


p.s. currently, I am still playing with PETSc manual and I started implementing 
the approach you proposed.

Best regards,

Oleksandr Koshkarov.


On 07/30/2017 04:35 PM, Barry Smith wrote:
On Jul 30, 2017, at 2:04 PM, Oleksandr Koshkarov <[email protected]> wrote:

Yes, It seems I am wrong and you are right.

I think I can then make electric and magnetic fields local and do parallel 
partitioning only in v space - it will defiantly restrict the problem size, but 
maybe the code still be still useful and I get petsc experience with it :) The 
question here if I will just parallize along v-space, the 3d DMDA should be 
enough, right? I do not really understand how would I do the partitioning here, 
can you please give me some hints?
   You could do this, it is tempting due to its simplicity. Let Mv,Nv, and Pv 
represent the number of grid points in the three velocity directions and Mx, 
Nx, Px represent the number of grid points in physical space. If you have three 
unknowns at each grid point represented by C, E, B then you would define

     dof = 3*Mx*Nx*Px

Create your DMDACreate3d() using Mv,Nv, and Pv and dof then each process will have a 
certain number of velocity "grid" points and at each grid point there will room 
for all the physical points (times 3 for each unknown). So think of it as a three 
dimensional array with all the dependence on spatial grid point variables stacked up on 
top of each other. If you call

   DMDAVecGetArray()

the resulting array will be indexed as

array[kv][jv][iv][ilocal]

where kv is the third grid component of velocity etc and ilocal goes from 0 to 
3*Mx*Nx*Px. You can decide if you want to store the C, E, B interlaced (meaning 
array[kv][jv][iv][0] is the first C, array[kv][jv][iv][1] is the first E ...) 
or uninterlaced (meaning array[kv][jv][iv][0:Mx*Nx*Px] are the C etc). If you 
use noninterlaced then applying a 3d FFT to a particular component is 
straightforward since they are all stored next to each other.

  This is definitely not a scalable approach but you might be able to have Mx = 
My = Mz = 128 which is not bad for a spectral method.

  I recommend starting by writing a simple code that does this first with 2d 
spatial and 2d, it is just so much easy to reason about a two array as you get 
the kinks out of the code.

   Barry





Another approach is to abandon global Fourier, let say for education purposes I 
will pick some finite difference along spacial dimensions, how would you advice 
me to partition this problem with PETSc then?


I know something like spectral elements or discontinuous Galerkin will be 
ideal, but I want to get petsc experience first with simpler problem and then 
move to discontinuous Galerkin (this will be defiantly the next step). 
(Unfortunately, I have never done any unstructured meshes like spectral 
elements or discontinuous Galerkin myself before, but did some finite 
difference and spectral methods)

On 07/30/2017 12:46 PM, Matthew Knepley wrote:
On Sun, Jul 30, 2017 at 12:44 PM, Oleksandr Koshkarov <[email protected]> 
wrote:
Thanks for the references, and warning.

I am trying to do exactly this
http://www.sciencedirect.com/science/article/pii/S0021999115004738

(I probably should have added the reference in the first place, sorry)

the evolution formula is (15) in the article.

So, I will do convolution only in 3D (while full system is 6D) because velocity 
space is not discretized with Fourier

So As I understand, it will not be N^2 vs N logN. Moreover, I know that algorithm with 
FFT has significantly larger "constant".

Please correct me if I am wrong here :)

I think you are wrong here. First, I do not understand why you think the 
asymptotics have changed. Second, the constant for
FFT is about 6, which would be really hard to beat with a quadrature.

Last, this discretization looks fully coupled to me. I cannot see how you would 
parallelize this right off the bat. This is a problem
with spectral discretizations, and the main motivator for spectral elements.

  Thanks,

    Matt
I know for sure that FFT is a must at the end, but I thought I can start with 
matrix product to get more familiar with PETSc as it will be simpler to 
implement (At least I think so).

I hope it does make sense.

Another question about convolution via matrix product. Is there better way to do it other 
then to construct NxN Toeplitz-like matrix each time step on                   the 
"fly" from state vector and then multiply it to state vector?

p.s. Probably I should do FFT from the start, but I am a little bit intimidated 
by the problem taking into account that I do not have PETSc experience.
Best regards,
Oleksandr Koshkarov.

On 07/30/2017 09:01 AM, Matthew Knepley wrote:
On Sat, Jul 29, 2017 at 7:01 PM, Oleksandr Koshkarov <[email protected]> 
wrote:
Thank you for the response,

Now that you said about additional communication for fft, I want to ask another 
question:

What if I disregard FFT and will compute convolution directly, I will rewrite 
it as vector-matrix-vector product

I would caution you here. The algorithm is the most important thing to get 
right up front. In this case, the effect of
dimension will be dominant, and anything that does not scale well with 
dimension will be thrown away, perhaps
quite quickly, as you want to solve bigger problems.

Small calculation: Take a 6-dim box which is 'n' on a side, so that N = n^6. 
FFT is N log N, whereas direct evaluation of
a convolution is N^2. So if we could do n = 100 points on a side with FFT, you 
can do, assuming the constants are about
equal,

   100^6 (log 100^6) = x^12
   1e12 (6 2) = x^12
   x ~ 10

Past versions of PETSc had a higher dimensional DA construct (adda in prior 
versions), but no one used it so we removed it.

I assume you are proposing something like

  https://arxiv.org/abs/1306.4625

or maybe

  https://arxiv.org/abs/1610.00397

The later is interesting since I think you could use a DMDA for the velocity 
and something like DMPlex for the space.

  Thanks,

    Matt

x^T A x (where T - is transpose, x - is my state vector, and A is a matrix 
which represents convolution and probably it will be not very sparse).

Yes - I will do more work (alghorithm will be slower), but will I win if we 
take the parallelization into account - less communication + probably more 
scalable algorithm?

Best regards,

Alex.



On 07/29/2017 05:23 PM, Barry Smith wrote:
   DMDA provides a simple way to do domain decomposition of structured grid 
meshes in the x, y, and z direction, that is it makes it easy to efficiently 
chop up vectors in all 3 dimensions, allowing very large problems easily. 
Unfortunately it only goes up to 3 dimensions and attempts to produce versions 
for higher dimensions have failed.

   If you would like to do domain decomposition across all six of your 
dimensions (which I think you really need to do to solve large problems) we 
don't have a tool that helps with doing the domain decomposition. To make 
matters more complicated, likely the 3d mpi-fftw that need to be applied 
require rejiggering the data across the nodes to get it into a form where the 
ffts can be applied efficiently.

    I don't think there is necessarily anything theoretically difficult about 
managing the six dimensional domain decomposition I just don't know of any open 
source software out there that helps to do this. Maybe some PETSc users are 
aware of such software and can chime in.

    Aside from these two issues :-) PETSc would be useful for you since you 
could use our time integrators and nonlinear solvers. Pulling out subvectors to 
process on can be done with with either VecScatter or VecStrideScatter, 
VecStrideGather etc depending on the layout of the unknowns, i.e. how the 
domain decomposition is done.

     I wish I had better answers for managing the 6d domain decomposition

    Barry




On Jul 29, 2017, at 5:06 PM, Oleksandr Koshkarov <[email protected]> wrote:

Dear All,

I am a new PETSc user and I am still in the middle of the manual (I have 
finally settled to choose PETSc as my main numerical library) so I am sorry 
beforehand if my questions would be naive.

I am trying to solve 6+1 dimensional Vlasov equation with spectral methods. 
More precisely, I will try to solve half-discretized equations of the form 
(approximate form) with pseudospectral Fourier method:

(Equations are in latex format, the nice website to see them is 
https://www.codecogs.com/latex/eqneditor.php)

\frac{dC_{m,m,p}}{dt} =\\
\partial_x \left ( a_n C_{n+1,m,p} +b_n C_{n,m,p} +c_n C_{n-1,m,p} \right ) \\
+ \partial_y \left ( a_m C_{n,m+1,p} +b_m C_{n,m,p} +c_m C_{n,m-1,p}  \right ) 
\\
+ \partial_z \left ( a_p C_{n,m,p+1} +b_p C_{n,m,p} +c_p C_{n,m,p-1}  \right ) 
\\
+ d_n E_x C_{n-1,m,p} + d_m E_x C_{n,m-1,p} + d_p E_x C_{n,m,p-1} \\
+ B_x (e_{m,p} C_{n,m-1,p-1} + f_{m,p}C_{n,m-1,p+1} + \dots) + B_y (\dots) + 
B_z (\dots)


where a,b,c,d,e,f are some constants which can depend on n,m,p,

n,m,p = 0...N,

C_{n,m,p} = C_{n,m,p}(x,y,z),

E_x = E_x(x,y,z), (same for E_y,B_x,...)

and fields are described with equation of the sort (linear pdes with source 
terms dependent on C):

\frac{dE_x}{dt} = \partial_y B_z - \partial_z B_x + (AC_{1,0,0} + BC_{0,0,0}) \\
\frac{dB_x}{dt} = -\partial_y E_z + \partial_z E_x

with A,B being some constants.

I will need fully implicit Crank–Nicolson method, so my plan is to use SNES or 
TS where

I will use one MPI PETSc vector which describes the state of the system (all, 
C, E, B),  then I can evolve linear part with just matrix multiplication.

The first question is, should I use something like DMDA? if so, it is only 
3dimensional but I need 6 dimensional vectots? Will it be faster then matrix 
multiplication?

Then to do nonlinear part I will need 3d mpi-fftw to compute convolutions. The 
problem is, how do I extract subvectors from full big vector state? (what is 
the best approach?)

If you have some other suggestions, please feel free to share

Thanks and best regards,

Oleksandr Koshkarov.





--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

http://www.caam.rice.edu/~mk51/


--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

http://www.caam.rice.edu/~mk51/

Reply via email to