> On Aug 1, 2017, at 12:50 PM, Oleksandr Koshkarov <[email protected]> wrote:
> 
>>  I didn't understand this. This is very wasteful. I have no good solution.
> Equations for E and B are linear, moreover I know exactly which time solver I 
> need - implicit Crank–Nicolson method, which conserves energy exactly in my 
> case - and I know it works well from the article. So instead of PETSc TS, I 
> can use SNES and do the time discretization myself. Therefore inside SNES, I 
> will solve local linear system for E^{n+1} and B^{n+1} each time.
> 
> And each processor will store only one copy of local E^{n} and B^{n}.
> 
> It will probably even improve the convergence of overall nonlinear solver.
> 
> Does it make sense?

   I am not completely clear. You cannot have the E and B associated with the 
DMDA so the DMDA dof would be 1*Mx*Nx*Px and you'd keep the E and B in some 
other vectors?

  Barry

> 
> Oleksandr Koshkarov.
> 
> 
> On 08/01/2017 10:56 AM, Barry Smith wrote:
>>> On Aug 1, 2017, at 11:43 AM, Oleksandr Koshkarov <[email protected]> 
>>> wrote:
>>> 
>>>> So you are saying that E and B don't have any coupling between values at 
>>>> different velocities and hence don't need ghost points?
>>> Yes, exactly - E and B depend only on space coordinates.
>>    I didn't understand this. This is very wasteful. I have no good solution.
>>>> It is just an optimization and you shouldn't worry about it until you have 
>>>> working code.
>>> OK, thanks, that is what I will do.
>>> 
>>> Oleksandr Koshkarov
>>> 
>>> 
>>> On 08/01/2017 10:35 AM, Barry Smith wrote:
>>>>> On Aug 1, 2017, at 1:20 AM, Oleksandr Koshkarov <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> Thank you, it is very helpful.
>>>>> 
>>>>> I think there is a small problem with this approach. By storing E and B 
>>>>> in DMDA I will communicate their values each time when I communicate 
>>>>> ghost points. Moreover, it seems that each processor will have multiples 
>>>>> copies of E and B (depending on how big velocity range it holds). So it 
>>>>> seems it is possible to reduce communication and memory usage by 3 (maybe 
>>>>> it is not a big number) if each process holds only one copy of E and B. 
>>>>> However, if E and B will be stored in local vector, it is not obvious for 
>>>>> me how to construct a time solver so it also evolves E and B all together 
>>>>> with C.
>>>>> 
>>>>    So you are saying that E and B don't have any coupling between values 
>>>> at different velocities and hence don't need ghost points? DMDA is simple, 
>>>> it doesn't have support for only have ghost points for some variables but 
>>>> should the communication prove to be time-consuming we could provide a 
>>>> custom version that does not ghost-point the E and B. It is just an 
>>>> optimization and you shouldn't worry about it until you have working code.
>>>> 
>>>>   Barry
>>>> 
>>>> 
>>>>> p.s. currently, I am still playing with PETSc manual and I started 
>>>>> implementing the approach you proposed.
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>> Oleksandr Koshkarov.
>>>>> 
>>>>> 
>>>>> On 07/30/2017 04:35 PM, Barry Smith wrote:
>>>>>>> On Jul 30, 2017, at 2:04 PM, Oleksandr Koshkarov <[email protected]> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Yes, It seems I am wrong and you are right.
>>>>>>> 
>>>>>>> I think I can then make electric and magnetic fields local and do 
>>>>>>> parallel partitioning only in v space - it will defiantly restrict the 
>>>>>>> problem size, but maybe the code still be still useful and I get petsc 
>>>>>>> experience with it :) The question here if I will just parallize along 
>>>>>>> v-space, the 3d DMDA should be enough, right? I do not really 
>>>>>>> understand how would I do the partitioning here, can you please give me 
>>>>>>> some hints?
>>>>>>   You could do this, it is tempting due to its simplicity. Let Mv,Nv, 
>>>>>> and Pv represent the number of grid points in the three velocity 
>>>>>> directions and Mx, Nx, Px represent the number of grid points in 
>>>>>> physical space. If you have three unknowns at each grid point 
>>>>>> represented by C, E, B then you would define
>>>>>> 
>>>>>>     dof = 3*Mx*Nx*Px
>>>>>> 
>>>>>> Create your DMDACreate3d() using Mv,Nv, and Pv and dof then each process 
>>>>>> will have a certain number of velocity "grid" points and at each grid 
>>>>>> point there will room for all the physical points (times 3 for each 
>>>>>> unknown). So think of it as a three dimensional array with all the 
>>>>>> dependence on spatial grid point variables stacked up on top of each 
>>>>>> other. If you call
>>>>>> 
>>>>>>   DMDAVecGetArray()
>>>>>> 
>>>>>> the resulting array will be indexed as
>>>>>> 
>>>>>> array[kv][jv][iv][ilocal]
>>>>>> 
>>>>>> where kv is the third grid component of velocity etc and ilocal goes 
>>>>>> from 0 to 3*Mx*Nx*Px. You can decide if you want to store the C, E, B 
>>>>>> interlaced (meaning array[kv][jv][iv][0] is the first C, 
>>>>>> array[kv][jv][iv][1] is the first E ...) or uninterlaced (meaning 
>>>>>> array[kv][jv][iv][0:Mx*Nx*Px] are the C etc). If you use noninterlaced 
>>>>>> then applying a 3d FFT to a particular component is straightforward 
>>>>>> since they are all stored next to each other.
>>>>>> 
>>>>>>  This is definitely not a scalable approach but you might be able to 
>>>>>> have Mx = My = Mz = 128 which is not bad for a spectral method.
>>>>>> 
>>>>>>  I recommend starting by writing a simple code that does this first with 
>>>>>> 2d spatial and 2d, it is just so much easy to reason about a two array 
>>>>>> as you get the kinks out of the code.
>>>>>> 
>>>>>>   Barry
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> Another approach is to abandon global Fourier, let say for education 
>>>>>>> purposes I will pick some finite difference along spacial dimensions, 
>>>>>>> how would you advice me to partition this problem with PETSc then?
>>>>>>> 
>>>>>>> 
>>>>>>> I know something like spectral elements or discontinuous Galerkin will 
>>>>>>> be ideal, but I want to get petsc experience first with simpler problem 
>>>>>>> and then move to discontinuous Galerkin (this will be defiantly the 
>>>>>>> next step). (Unfortunately, I have never done any unstructured meshes 
>>>>>>> like spectral elements or discontinuous Galerkin myself before, but did 
>>>>>>> some finite difference and spectral methods)
>>>>>>> 
>>>>>>> On 07/30/2017 12:46 PM, Matthew Knepley wrote:
>>>>>>>> On Sun, Jul 30, 2017 at 12:44 PM, Oleksandr Koshkarov 
>>>>>>>> <[email protected]> wrote:
>>>>>>>> Thanks for the references, and warning.
>>>>>>>> 
>>>>>>>> I am trying to do exactly this
>>>>>>>> http://www.sciencedirect.com/science/article/pii/S0021999115004738
>>>>>>>> 
>>>>>>>> (I probably should have added the reference in the first place, sorry)
>>>>>>>> 
>>>>>>>> the evolution formula is (15) in the article.
>>>>>>>> 
>>>>>>>> So, I will do convolution only in 3D (while full system is 6D) because 
>>>>>>>> velocity space is not discretized with Fourier
>>>>>>>> 
>>>>>>>> So As I understand, it will not be N^2 vs N logN. Moreover, I know 
>>>>>>>> that algorithm with FFT has significantly larger "constant".
>>>>>>>> 
>>>>>>>> Please correct me if I am wrong here :)
>>>>>>>> 
>>>>>>>> I think you are wrong here. First, I do not understand why you think 
>>>>>>>> the asymptotics have changed. Second, the constant for
>>>>>>>> FFT is about 6, which would be really hard to beat with a quadrature.
>>>>>>>> 
>>>>>>>> Last, this discretization looks fully coupled to me. I cannot see how 
>>>>>>>> you would parallelize this right off the bat. This is a problem
>>>>>>>> with spectral discretizations, and the main motivator for spectral 
>>>>>>>> elements.
>>>>>>>> 
>>>>>>>>  Thanks,
>>>>>>>> 
>>>>>>>>    Matt
>>>>>>>> I know for sure that FFT is a must at the end, but I thought I can 
>>>>>>>> start with matrix product to get more familiar with PETSc as it will 
>>>>>>>> be simpler to implement (At least I think so).
>>>>>>>> 
>>>>>>>> I hope it does make sense.
>>>>>>>> 
>>>>>>>> Another question about convolution via matrix product. Is there better 
>>>>>>>> way to do it other then to construct NxN Toeplitz-like matrix each 
>>>>>>>> time step on                   the "fly" from state vector and then 
>>>>>>>> multiply it to state vector?
>>>>>>>> 
>>>>>>>> p.s. Probably I should do FFT from the start, but I am a little bit 
>>>>>>>> intimidated by the problem taking into account that I do not have 
>>>>>>>> PETSc experience.
>>>>>>>> Best regards,
>>>>>>>> Oleksandr Koshkarov.
>>>>>>>> 
>>>>>>>> On 07/30/2017 09:01 AM, Matthew Knepley wrote:
>>>>>>>>> On Sat, Jul 29, 2017 at 7:01 PM, Oleksandr Koshkarov 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> Thank you for the response,
>>>>>>>>> 
>>>>>>>>> Now that you said about additional communication for fft, I want to 
>>>>>>>>> ask another question:
>>>>>>>>> 
>>>>>>>>> What if I disregard FFT and will compute convolution directly, I will 
>>>>>>>>> rewrite it as vector-matrix-vector product
>>>>>>>>> 
>>>>>>>>> I would caution you here. The algorithm is the most important thing 
>>>>>>>>> to get right up front. In this case, the effect of
>>>>>>>>> dimension will be dominant, and anything that does not scale well 
>>>>>>>>> with dimension will be thrown away, perhaps
>>>>>>>>> quite quickly, as you want to solve bigger problems.
>>>>>>>>> 
>>>>>>>>> Small calculation: Take a 6-dim box which is 'n' on a side, so that N 
>>>>>>>>> = n^6. FFT is N log N, whereas direct evaluation of
>>>>>>>>> a convolution is N^2. So if we could do n = 100 points on a side with 
>>>>>>>>> FFT, you can do, assuming the constants are about
>>>>>>>>> equal,
>>>>>>>>> 
>>>>>>>>>   100^6 (log 100^6) = x^12
>>>>>>>>>   1e12 (6 2) = x^12
>>>>>>>>>   x ~ 10
>>>>>>>>> 
>>>>>>>>> Past versions of PETSc had a higher dimensional DA construct (adda in 
>>>>>>>>> prior versions), but no one used it so we removed it.
>>>>>>>>> 
>>>>>>>>> I assume you are proposing something like
>>>>>>>>> 
>>>>>>>>>  https://arxiv.org/abs/1306.4625
>>>>>>>>> 
>>>>>>>>> or maybe
>>>>>>>>> 
>>>>>>>>>  https://arxiv.org/abs/1610.00397
>>>>>>>>> 
>>>>>>>>> The later is interesting since I think you could use a DMDA for the 
>>>>>>>>> velocity and something like DMPlex for the space.
>>>>>>>>> 
>>>>>>>>>  Thanks,
>>>>>>>>> 
>>>>>>>>>    Matt
>>>>>>>>> 
>>>>>>>>> x^T A x (where T - is transpose, x - is my state vector, and A is a 
>>>>>>>>> matrix which represents convolution and probably it will be not very 
>>>>>>>>> sparse).
>>>>>>>>> 
>>>>>>>>> Yes - I will do more work (alghorithm will be slower), but will I win 
>>>>>>>>> if we take the parallelization into account - less communication + 
>>>>>>>>> probably more scalable algorithm?
>>>>>>>>> 
>>>>>>>>> Best regards,
>>>>>>>>> 
>>>>>>>>> Alex.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 07/29/2017 05:23 PM, Barry Smith wrote:
>>>>>>>>>   DMDA provides a simple way to do domain decomposition of structured 
>>>>>>>>> grid meshes in the x, y, and z direction, that is it makes it easy to 
>>>>>>>>> efficiently chop up vectors in all 3 dimensions, allowing very large 
>>>>>>>>> problems easily. Unfortunately it only goes up to 3 dimensions and 
>>>>>>>>> attempts to produce versions for higher dimensions have failed.
>>>>>>>>> 
>>>>>>>>>   If you would like to do domain decomposition across all six of your 
>>>>>>>>> dimensions (which I think you really need to do to solve large 
>>>>>>>>> problems) we don't have a tool that helps with doing the domain 
>>>>>>>>> decomposition. To make matters more complicated, likely the 3d 
>>>>>>>>> mpi-fftw that need to be applied require rejiggering the data across 
>>>>>>>>> the nodes to get it into a form where the ffts can be applied 
>>>>>>>>> efficiently.
>>>>>>>>> 
>>>>>>>>>    I don't think there is necessarily anything theoretically 
>>>>>>>>> difficult about managing the six dimensional domain decomposition I 
>>>>>>>>> just don't know of any open source software out there that helps to 
>>>>>>>>> do this. Maybe some PETSc users are aware of such software and can 
>>>>>>>>> chime in.
>>>>>>>>> 
>>>>>>>>>    Aside from these two issues :-) PETSc would be useful for you 
>>>>>>>>> since you could use our time integrators and nonlinear solvers. 
>>>>>>>>> Pulling out subvectors to process on can be done with with either 
>>>>>>>>> VecScatter or VecStrideScatter, VecStrideGather etc depending on the 
>>>>>>>>> layout of the unknowns, i.e. how the domain decomposition is done.
>>>>>>>>> 
>>>>>>>>>     I wish I had better answers for managing the 6d domain 
>>>>>>>>> decomposition
>>>>>>>>> 
>>>>>>>>>    Barry
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Jul 29, 2017, at 5:06 PM, Oleksandr Koshkarov 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> Dear All,
>>>>>>>>> 
>>>>>>>>> I am a new PETSc user and I am still in the middle of the manual (I 
>>>>>>>>> have finally settled to choose PETSc as my main numerical library) so 
>>>>>>>>> I am sorry beforehand if my questions would be naive.
>>>>>>>>> 
>>>>>>>>> I am trying to solve 6+1 dimensional Vlasov equation with spectral 
>>>>>>>>> methods. More precisely, I will try to solve half-discretized 
>>>>>>>>> equations of the form (approximate form) with pseudospectral Fourier 
>>>>>>>>> method:
>>>>>>>>> 
>>>>>>>>> (Equations are in latex format, the nice website to see them is 
>>>>>>>>> https://www.codecogs.com/latex/eqneditor.php)
>>>>>>>>> 
>>>>>>>>> \frac{dC_{m,m,p}}{dt} =\\
>>>>>>>>> \partial_x \left ( a_n C_{n+1,m,p} +b_n C_{n,m,p} +c_n C_{n-1,m,p} 
>>>>>>>>> \right ) \\
>>>>>>>>> + \partial_y \left ( a_m C_{n,m+1,p} +b_m C_{n,m,p} +c_m C_{n,m-1,p}  
>>>>>>>>> \right ) \\
>>>>>>>>> + \partial_z \left ( a_p C_{n,m,p+1} +b_p C_{n,m,p} +c_p C_{n,m,p-1}  
>>>>>>>>> \right ) \\
>>>>>>>>> + d_n E_x C_{n-1,m,p} + d_m E_x C_{n,m-1,p} + d_p E_x C_{n,m,p-1} \\
>>>>>>>>> + B_x (e_{m,p} C_{n,m-1,p-1} + f_{m,p}C_{n,m-1,p+1} + \dots) + B_y 
>>>>>>>>> (\dots) + B_z (\dots)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> where a,b,c,d,e,f are some constants which can depend on n,m,p,
>>>>>>>>> 
>>>>>>>>> n,m,p = 0...N,
>>>>>>>>> 
>>>>>>>>> C_{n,m,p} = C_{n,m,p}(x,y,z),
>>>>>>>>> 
>>>>>>>>> E_x = E_x(x,y,z), (same for E_y,B_x,...)
>>>>>>>>> 
>>>>>>>>> and fields are described with equation of the sort (linear pdes with 
>>>>>>>>> source terms dependent on C):
>>>>>>>>> 
>>>>>>>>> \frac{dE_x}{dt} = \partial_y B_z - \partial_z B_x + (AC_{1,0,0} + 
>>>>>>>>> BC_{0,0,0}) \\
>>>>>>>>> \frac{dB_x}{dt} = -\partial_y E_z + \partial_z E_x
>>>>>>>>> 
>>>>>>>>> with A,B being some constants.
>>>>>>>>> 
>>>>>>>>> I will need fully implicit Crank–Nicolson method, so my plan is to 
>>>>>>>>> use SNES or TS where
>>>>>>>>> 
>>>>>>>>> I will use one MPI PETSc vector which describes the state of the 
>>>>>>>>> system (all, C, E, B),  then I can evolve linear part with just 
>>>>>>>>> matrix multiplication.
>>>>>>>>> 
>>>>>>>>> The first question is, should I use something like DMDA? if so, it is 
>>>>>>>>> only 3dimensional but I need 6 dimensional vectots? Will it be faster 
>>>>>>>>> then matrix multiplication?
>>>>>>>>> 
>>>>>>>>> Then to do nonlinear part I will need 3d mpi-fftw to compute 
>>>>>>>>> convolutions. The problem is, how do I extract subvectors from full 
>>>>>>>>> big vector state? (what is the best approach?)
>>>>>>>>> 
>>>>>>>>> If you have some other suggestions, please feel free to share
>>>>>>>>> 
>>>>>>>>> Thanks and best regards,
>>>>>>>>> 
>>>>>>>>> Oleksandr Koshkarov.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -- 
>>>>>>>>> What most experimenters take for granted before they begin their 
>>>>>>>>> experiments is infinitely more interesting than any results to which 
>>>>>>>>> their experiments lead.
>>>>>>>>> -- Norbert Wiener
>>>>>>>>> 
>>>>>>>>> http://www.caam.rice.edu/~mk51/
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> What most experimenters take for granted before they begin their 
>>>>>>>> experiments is infinitely more interesting than any results to which 
>>>>>>>> their experiments lead.
>>>>>>>> -- Norbert Wiener
>>>>>>>> 
>>>>>>>> http://www.caam.rice.edu/~mk51/
> 

Reply via email to