> On Aug 1, 2017, at 12:50 PM, Oleksandr Koshkarov <[email protected]> wrote: > >> I didn't understand this. This is very wasteful. I have no good solution. > Equations for E and B are linear, moreover I know exactly which time solver I > need - implicit Crank–Nicolson method, which conserves energy exactly in my > case - and I know it works well from the article. So instead of PETSc TS, I > can use SNES and do the time discretization myself. Therefore inside SNES, I > will solve local linear system for E^{n+1} and B^{n+1} each time. > > And each processor will store only one copy of local E^{n} and B^{n}. > > It will probably even improve the convergence of overall nonlinear solver. > > Does it make sense?
I am not completely clear. You cannot have the E and B associated with the DMDA so the DMDA dof would be 1*Mx*Nx*Px and you'd keep the E and B in some other vectors? Barry > > Oleksandr Koshkarov. > > > On 08/01/2017 10:56 AM, Barry Smith wrote: >>> On Aug 1, 2017, at 11:43 AM, Oleksandr Koshkarov <[email protected]> >>> wrote: >>> >>>> So you are saying that E and B don't have any coupling between values at >>>> different velocities and hence don't need ghost points? >>> Yes, exactly - E and B depend only on space coordinates. >> I didn't understand this. This is very wasteful. I have no good solution. >>>> It is just an optimization and you shouldn't worry about it until you have >>>> working code. >>> OK, thanks, that is what I will do. >>> >>> Oleksandr Koshkarov >>> >>> >>> On 08/01/2017 10:35 AM, Barry Smith wrote: >>>>> On Aug 1, 2017, at 1:20 AM, Oleksandr Koshkarov <[email protected]> >>>>> wrote: >>>>> >>>>> Thank you, it is very helpful. >>>>> >>>>> I think there is a small problem with this approach. By storing E and B >>>>> in DMDA I will communicate their values each time when I communicate >>>>> ghost points. Moreover, it seems that each processor will have multiples >>>>> copies of E and B (depending on how big velocity range it holds). So it >>>>> seems it is possible to reduce communication and memory usage by 3 (maybe >>>>> it is not a big number) if each process holds only one copy of E and B. >>>>> However, if E and B will be stored in local vector, it is not obvious for >>>>> me how to construct a time solver so it also evolves E and B all together >>>>> with C. >>>>> >>>> So you are saying that E and B don't have any coupling between values >>>> at different velocities and hence don't need ghost points? DMDA is simple, >>>> it doesn't have support for only have ghost points for some variables but >>>> should the communication prove to be time-consuming we could provide a >>>> custom version that does not ghost-point the E and B. It is just an >>>> optimization and you shouldn't worry about it until you have working code. >>>> >>>> Barry >>>> >>>> >>>>> p.s. currently, I am still playing with PETSc manual and I started >>>>> implementing the approach you proposed. >>>>> >>>>> Best regards, >>>>> >>>>> Oleksandr Koshkarov. >>>>> >>>>> >>>>> On 07/30/2017 04:35 PM, Barry Smith wrote: >>>>>>> On Jul 30, 2017, at 2:04 PM, Oleksandr Koshkarov <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>> Yes, It seems I am wrong and you are right. >>>>>>> >>>>>>> I think I can then make electric and magnetic fields local and do >>>>>>> parallel partitioning only in v space - it will defiantly restrict the >>>>>>> problem size, but maybe the code still be still useful and I get petsc >>>>>>> experience with it :) The question here if I will just parallize along >>>>>>> v-space, the 3d DMDA should be enough, right? I do not really >>>>>>> understand how would I do the partitioning here, can you please give me >>>>>>> some hints? >>>>>> You could do this, it is tempting due to its simplicity. Let Mv,Nv, >>>>>> and Pv represent the number of grid points in the three velocity >>>>>> directions and Mx, Nx, Px represent the number of grid points in >>>>>> physical space. If you have three unknowns at each grid point >>>>>> represented by C, E, B then you would define >>>>>> >>>>>> dof = 3*Mx*Nx*Px >>>>>> >>>>>> Create your DMDACreate3d() using Mv,Nv, and Pv and dof then each process >>>>>> will have a certain number of velocity "grid" points and at each grid >>>>>> point there will room for all the physical points (times 3 for each >>>>>> unknown). So think of it as a three dimensional array with all the >>>>>> dependence on spatial grid point variables stacked up on top of each >>>>>> other. If you call >>>>>> >>>>>> DMDAVecGetArray() >>>>>> >>>>>> the resulting array will be indexed as >>>>>> >>>>>> array[kv][jv][iv][ilocal] >>>>>> >>>>>> where kv is the third grid component of velocity etc and ilocal goes >>>>>> from 0 to 3*Mx*Nx*Px. You can decide if you want to store the C, E, B >>>>>> interlaced (meaning array[kv][jv][iv][0] is the first C, >>>>>> array[kv][jv][iv][1] is the first E ...) or uninterlaced (meaning >>>>>> array[kv][jv][iv][0:Mx*Nx*Px] are the C etc). If you use noninterlaced >>>>>> then applying a 3d FFT to a particular component is straightforward >>>>>> since they are all stored next to each other. >>>>>> >>>>>> This is definitely not a scalable approach but you might be able to >>>>>> have Mx = My = Mz = 128 which is not bad for a spectral method. >>>>>> >>>>>> I recommend starting by writing a simple code that does this first with >>>>>> 2d spatial and 2d, it is just so much easy to reason about a two array >>>>>> as you get the kinks out of the code. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> Another approach is to abandon global Fourier, let say for education >>>>>>> purposes I will pick some finite difference along spacial dimensions, >>>>>>> how would you advice me to partition this problem with PETSc then? >>>>>>> >>>>>>> >>>>>>> I know something like spectral elements or discontinuous Galerkin will >>>>>>> be ideal, but I want to get petsc experience first with simpler problem >>>>>>> and then move to discontinuous Galerkin (this will be defiantly the >>>>>>> next step). (Unfortunately, I have never done any unstructured meshes >>>>>>> like spectral elements or discontinuous Galerkin myself before, but did >>>>>>> some finite difference and spectral methods) >>>>>>> >>>>>>> On 07/30/2017 12:46 PM, Matthew Knepley wrote: >>>>>>>> On Sun, Jul 30, 2017 at 12:44 PM, Oleksandr Koshkarov >>>>>>>> <[email protected]> wrote: >>>>>>>> Thanks for the references, and warning. >>>>>>>> >>>>>>>> I am trying to do exactly this >>>>>>>> http://www.sciencedirect.com/science/article/pii/S0021999115004738 >>>>>>>> >>>>>>>> (I probably should have added the reference in the first place, sorry) >>>>>>>> >>>>>>>> the evolution formula is (15) in the article. >>>>>>>> >>>>>>>> So, I will do convolution only in 3D (while full system is 6D) because >>>>>>>> velocity space is not discretized with Fourier >>>>>>>> >>>>>>>> So As I understand, it will not be N^2 vs N logN. Moreover, I know >>>>>>>> that algorithm with FFT has significantly larger "constant". >>>>>>>> >>>>>>>> Please correct me if I am wrong here :) >>>>>>>> >>>>>>>> I think you are wrong here. First, I do not understand why you think >>>>>>>> the asymptotics have changed. Second, the constant for >>>>>>>> FFT is about 6, which would be really hard to beat with a quadrature. >>>>>>>> >>>>>>>> Last, this discretization looks fully coupled to me. I cannot see how >>>>>>>> you would parallelize this right off the bat. This is a problem >>>>>>>> with spectral discretizations, and the main motivator for spectral >>>>>>>> elements. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> I know for sure that FFT is a must at the end, but I thought I can >>>>>>>> start with matrix product to get more familiar with PETSc as it will >>>>>>>> be simpler to implement (At least I think so). >>>>>>>> >>>>>>>> I hope it does make sense. >>>>>>>> >>>>>>>> Another question about convolution via matrix product. Is there better >>>>>>>> way to do it other then to construct NxN Toeplitz-like matrix each >>>>>>>> time step on the "fly" from state vector and then >>>>>>>> multiply it to state vector? >>>>>>>> >>>>>>>> p.s. Probably I should do FFT from the start, but I am a little bit >>>>>>>> intimidated by the problem taking into account that I do not have >>>>>>>> PETSc experience. >>>>>>>> Best regards, >>>>>>>> Oleksandr Koshkarov. >>>>>>>> >>>>>>>> On 07/30/2017 09:01 AM, Matthew Knepley wrote: >>>>>>>>> On Sat, Jul 29, 2017 at 7:01 PM, Oleksandr Koshkarov >>>>>>>>> <[email protected]> wrote: >>>>>>>>> Thank you for the response, >>>>>>>>> >>>>>>>>> Now that you said about additional communication for fft, I want to >>>>>>>>> ask another question: >>>>>>>>> >>>>>>>>> What if I disregard FFT and will compute convolution directly, I will >>>>>>>>> rewrite it as vector-matrix-vector product >>>>>>>>> >>>>>>>>> I would caution you here. The algorithm is the most important thing >>>>>>>>> to get right up front. In this case, the effect of >>>>>>>>> dimension will be dominant, and anything that does not scale well >>>>>>>>> with dimension will be thrown away, perhaps >>>>>>>>> quite quickly, as you want to solve bigger problems. >>>>>>>>> >>>>>>>>> Small calculation: Take a 6-dim box which is 'n' on a side, so that N >>>>>>>>> = n^6. FFT is N log N, whereas direct evaluation of >>>>>>>>> a convolution is N^2. So if we could do n = 100 points on a side with >>>>>>>>> FFT, you can do, assuming the constants are about >>>>>>>>> equal, >>>>>>>>> >>>>>>>>> 100^6 (log 100^6) = x^12 >>>>>>>>> 1e12 (6 2) = x^12 >>>>>>>>> x ~ 10 >>>>>>>>> >>>>>>>>> Past versions of PETSc had a higher dimensional DA construct (adda in >>>>>>>>> prior versions), but no one used it so we removed it. >>>>>>>>> >>>>>>>>> I assume you are proposing something like >>>>>>>>> >>>>>>>>> https://arxiv.org/abs/1306.4625 >>>>>>>>> >>>>>>>>> or maybe >>>>>>>>> >>>>>>>>> https://arxiv.org/abs/1610.00397 >>>>>>>>> >>>>>>>>> The later is interesting since I think you could use a DMDA for the >>>>>>>>> velocity and something like DMPlex for the space. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> x^T A x (where T - is transpose, x - is my state vector, and A is a >>>>>>>>> matrix which represents convolution and probably it will be not very >>>>>>>>> sparse). >>>>>>>>> >>>>>>>>> Yes - I will do more work (alghorithm will be slower), but will I win >>>>>>>>> if we take the parallelization into account - less communication + >>>>>>>>> probably more scalable algorithm? >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> >>>>>>>>> Alex. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 07/29/2017 05:23 PM, Barry Smith wrote: >>>>>>>>> DMDA provides a simple way to do domain decomposition of structured >>>>>>>>> grid meshes in the x, y, and z direction, that is it makes it easy to >>>>>>>>> efficiently chop up vectors in all 3 dimensions, allowing very large >>>>>>>>> problems easily. Unfortunately it only goes up to 3 dimensions and >>>>>>>>> attempts to produce versions for higher dimensions have failed. >>>>>>>>> >>>>>>>>> If you would like to do domain decomposition across all six of your >>>>>>>>> dimensions (which I think you really need to do to solve large >>>>>>>>> problems) we don't have a tool that helps with doing the domain >>>>>>>>> decomposition. To make matters more complicated, likely the 3d >>>>>>>>> mpi-fftw that need to be applied require rejiggering the data across >>>>>>>>> the nodes to get it into a form where the ffts can be applied >>>>>>>>> efficiently. >>>>>>>>> >>>>>>>>> I don't think there is necessarily anything theoretically >>>>>>>>> difficult about managing the six dimensional domain decomposition I >>>>>>>>> just don't know of any open source software out there that helps to >>>>>>>>> do this. Maybe some PETSc users are aware of such software and can >>>>>>>>> chime in. >>>>>>>>> >>>>>>>>> Aside from these two issues :-) PETSc would be useful for you >>>>>>>>> since you could use our time integrators and nonlinear solvers. >>>>>>>>> Pulling out subvectors to process on can be done with with either >>>>>>>>> VecScatter or VecStrideScatter, VecStrideGather etc depending on the >>>>>>>>> layout of the unknowns, i.e. how the domain decomposition is done. >>>>>>>>> >>>>>>>>> I wish I had better answers for managing the 6d domain >>>>>>>>> decomposition >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Jul 29, 2017, at 5:06 PM, Oleksandr Koshkarov >>>>>>>>> <[email protected]> wrote: >>>>>>>>> >>>>>>>>> Dear All, >>>>>>>>> >>>>>>>>> I am a new PETSc user and I am still in the middle of the manual (I >>>>>>>>> have finally settled to choose PETSc as my main numerical library) so >>>>>>>>> I am sorry beforehand if my questions would be naive. >>>>>>>>> >>>>>>>>> I am trying to solve 6+1 dimensional Vlasov equation with spectral >>>>>>>>> methods. More precisely, I will try to solve half-discretized >>>>>>>>> equations of the form (approximate form) with pseudospectral Fourier >>>>>>>>> method: >>>>>>>>> >>>>>>>>> (Equations are in latex format, the nice website to see them is >>>>>>>>> https://www.codecogs.com/latex/eqneditor.php) >>>>>>>>> >>>>>>>>> \frac{dC_{m,m,p}}{dt} =\\ >>>>>>>>> \partial_x \left ( a_n C_{n+1,m,p} +b_n C_{n,m,p} +c_n C_{n-1,m,p} >>>>>>>>> \right ) \\ >>>>>>>>> + \partial_y \left ( a_m C_{n,m+1,p} +b_m C_{n,m,p} +c_m C_{n,m-1,p} >>>>>>>>> \right ) \\ >>>>>>>>> + \partial_z \left ( a_p C_{n,m,p+1} +b_p C_{n,m,p} +c_p C_{n,m,p-1} >>>>>>>>> \right ) \\ >>>>>>>>> + d_n E_x C_{n-1,m,p} + d_m E_x C_{n,m-1,p} + d_p E_x C_{n,m,p-1} \\ >>>>>>>>> + B_x (e_{m,p} C_{n,m-1,p-1} + f_{m,p}C_{n,m-1,p+1} + \dots) + B_y >>>>>>>>> (\dots) + B_z (\dots) >>>>>>>>> >>>>>>>>> >>>>>>>>> where a,b,c,d,e,f are some constants which can depend on n,m,p, >>>>>>>>> >>>>>>>>> n,m,p = 0...N, >>>>>>>>> >>>>>>>>> C_{n,m,p} = C_{n,m,p}(x,y,z), >>>>>>>>> >>>>>>>>> E_x = E_x(x,y,z), (same for E_y,B_x,...) >>>>>>>>> >>>>>>>>> and fields are described with equation of the sort (linear pdes with >>>>>>>>> source terms dependent on C): >>>>>>>>> >>>>>>>>> \frac{dE_x}{dt} = \partial_y B_z - \partial_z B_x + (AC_{1,0,0} + >>>>>>>>> BC_{0,0,0}) \\ >>>>>>>>> \frac{dB_x}{dt} = -\partial_y E_z + \partial_z E_x >>>>>>>>> >>>>>>>>> with A,B being some constants. >>>>>>>>> >>>>>>>>> I will need fully implicit Crank–Nicolson method, so my plan is to >>>>>>>>> use SNES or TS where >>>>>>>>> >>>>>>>>> I will use one MPI PETSc vector which describes the state of the >>>>>>>>> system (all, C, E, B), then I can evolve linear part with just >>>>>>>>> matrix multiplication. >>>>>>>>> >>>>>>>>> The first question is, should I use something like DMDA? if so, it is >>>>>>>>> only 3dimensional but I need 6 dimensional vectots? Will it be faster >>>>>>>>> then matrix multiplication? >>>>>>>>> >>>>>>>>> Then to do nonlinear part I will need 3d mpi-fftw to compute >>>>>>>>> convolutions. The problem is, how do I extract subvectors from full >>>>>>>>> big vector state? (what is the best approach?) >>>>>>>>> >>>>>>>>> If you have some other suggestions, please feel free to share >>>>>>>>> >>>>>>>>> Thanks and best regards, >>>>>>>>> >>>>>>>>> Oleksandr Koshkarov. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which >>>>>>>>> their experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> http://www.caam.rice.edu/~mk51/ >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which >>>>>>>> their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> http://www.caam.rice.edu/~mk51/ >
