> On Aug 1, 2017, at 11:43 AM, Oleksandr Koshkarov <[email protected]> wrote: > >> So you are saying that E and B don't have any coupling between values at >> different velocities and hence don't need ghost points? > Yes, exactly - E and B depend only on space coordinates.
I didn't understand this. This is very wasteful. I have no good solution. > >> It is just an optimization and you shouldn't worry about it until you have >> working code. > OK, thanks, that is what I will do. > > Oleksandr Koshkarov > > > On 08/01/2017 10:35 AM, Barry Smith wrote: >>> On Aug 1, 2017, at 1:20 AM, Oleksandr Koshkarov <[email protected]> >>> wrote: >>> >>> Thank you, it is very helpful. >>> >>> I think there is a small problem with this approach. By storing E and B in >>> DMDA I will communicate their values each time when I communicate ghost >>> points. Moreover, it seems that each processor will have multiples copies >>> of E and B (depending on how big velocity range it holds). So it seems it >>> is possible to reduce communication and memory usage by 3 (maybe it is not >>> a big number) if each process holds only one copy of E and B. However, if E >>> and B will be stored in local vector, it is not obvious for me how to >>> construct a time solver so it also evolves E and B all together with C. >>> >> So you are saying that E and B don't have any coupling between values at >> different velocities and hence don't need ghost points? DMDA is simple, it >> doesn't have support for only have ghost points for some variables but >> should the communication prove to be time-consuming we could provide a >> custom version that does not ghost-point the E and B. It is just an >> optimization and you shouldn't worry about it until you have working code. >> >> Barry >> >> >>> p.s. currently, I am still playing with PETSc manual and I started >>> implementing the approach you proposed. >>> >>> Best regards, >>> >>> Oleksandr Koshkarov. >>> >>> >>> On 07/30/2017 04:35 PM, Barry Smith wrote: >>>>> On Jul 30, 2017, at 2:04 PM, Oleksandr Koshkarov <[email protected]> >>>>> wrote: >>>>> >>>>> Yes, It seems I am wrong and you are right. >>>>> >>>>> I think I can then make electric and magnetic fields local and do >>>>> parallel partitioning only in v space - it will defiantly restrict the >>>>> problem size, but maybe the code still be still useful and I get petsc >>>>> experience with it :) The question here if I will just parallize along >>>>> v-space, the 3d DMDA should be enough, right? I do not really understand >>>>> how would I do the partitioning here, can you please give me some hints? >>>> You could do this, it is tempting due to its simplicity. Let Mv,Nv, and >>>> Pv represent the number of grid points in the three velocity directions >>>> and Mx, Nx, Px represent the number of grid points in physical space. If >>>> you have three unknowns at each grid point represented by C, E, B then you >>>> would define >>>> >>>> dof = 3*Mx*Nx*Px >>>> >>>> Create your DMDACreate3d() using Mv,Nv, and Pv and dof then each process >>>> will have a certain number of velocity "grid" points and at each grid >>>> point there will room for all the physical points (times 3 for each >>>> unknown). So think of it as a three dimensional array with all the >>>> dependence on spatial grid point variables stacked up on top of each >>>> other. If you call >>>> >>>> DMDAVecGetArray() >>>> >>>> the resulting array will be indexed as >>>> >>>> array[kv][jv][iv][ilocal] >>>> >>>> where kv is the third grid component of velocity etc and ilocal goes from >>>> 0 to 3*Mx*Nx*Px. You can decide if you want to store the C, E, B >>>> interlaced (meaning array[kv][jv][iv][0] is the first C, >>>> array[kv][jv][iv][1] is the first E ...) or uninterlaced (meaning >>>> array[kv][jv][iv][0:Mx*Nx*Px] are the C etc). If you use noninterlaced >>>> then applying a 3d FFT to a particular component is straightforward since >>>> they are all stored next to each other. >>>> >>>> This is definitely not a scalable approach but you might be able to have >>>> Mx = My = Mz = 128 which is not bad for a spectral method. >>>> >>>> I recommend starting by writing a simple code that does this first with >>>> 2d spatial and 2d, it is just so much easy to reason about a two array as >>>> you get the kinks out of the code. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> >>>>> Another approach is to abandon global Fourier, let say for education >>>>> purposes I will pick some finite difference along spacial dimensions, how >>>>> would you advice me to partition this problem with PETSc then? >>>>> >>>>> >>>>> I know something like spectral elements or discontinuous Galerkin will be >>>>> ideal, but I want to get petsc experience first with simpler problem and >>>>> then move to discontinuous Galerkin (this will be defiantly the next >>>>> step). (Unfortunately, I have never done any unstructured meshes like >>>>> spectral elements or discontinuous Galerkin myself before, but did some >>>>> finite difference and spectral methods) >>>>> >>>>> On 07/30/2017 12:46 PM, Matthew Knepley wrote: >>>>>> On Sun, Jul 30, 2017 at 12:44 PM, Oleksandr Koshkarov >>>>>> <[email protected]> wrote: >>>>>> Thanks for the references, and warning. >>>>>> >>>>>> I am trying to do exactly this >>>>>> http://www.sciencedirect.com/science/article/pii/S0021999115004738 >>>>>> >>>>>> (I probably should have added the reference in the first place, sorry) >>>>>> >>>>>> the evolution formula is (15) in the article. >>>>>> >>>>>> So, I will do convolution only in 3D (while full system is 6D) because >>>>>> velocity space is not discretized with Fourier >>>>>> >>>>>> So As I understand, it will not be N^2 vs N logN. Moreover, I know that >>>>>> algorithm with FFT has significantly larger "constant". >>>>>> >>>>>> Please correct me if I am wrong here :) >>>>>> >>>>>> I think you are wrong here. First, I do not understand why you think the >>>>>> asymptotics have changed. Second, the constant for >>>>>> FFT is about 6, which would be really hard to beat with a quadrature. >>>>>> >>>>>> Last, this discretization looks fully coupled to me. I cannot see how >>>>>> you would parallelize this right off the bat. This is a problem >>>>>> with spectral discretizations, and the main motivator for spectral >>>>>> elements. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> I know for sure that FFT is a must at the end, but I thought I can start >>>>>> with matrix product to get more familiar with PETSc as it will be >>>>>> simpler to implement (At least I think so). >>>>>> >>>>>> I hope it does make sense. >>>>>> >>>>>> Another question about convolution via matrix product. Is there better >>>>>> way to do it other then to construct NxN Toeplitz-like matrix each time >>>>>> step on the "fly" from state vector and then multiply >>>>>> it to state vector? >>>>>> >>>>>> p.s. Probably I should do FFT from the start, but I am a little bit >>>>>> intimidated by the problem taking into account that I do not have PETSc >>>>>> experience. >>>>>> Best regards, >>>>>> Oleksandr Koshkarov. >>>>>> >>>>>> On 07/30/2017 09:01 AM, Matthew Knepley wrote: >>>>>>> On Sat, Jul 29, 2017 at 7:01 PM, Oleksandr Koshkarov >>>>>>> <[email protected]> wrote: >>>>>>> Thank you for the response, >>>>>>> >>>>>>> Now that you said about additional communication for fft, I want to ask >>>>>>> another question: >>>>>>> >>>>>>> What if I disregard FFT and will compute convolution directly, I will >>>>>>> rewrite it as vector-matrix-vector product >>>>>>> >>>>>>> I would caution you here. The algorithm is the most important thing to >>>>>>> get right up front. In this case, the effect of >>>>>>> dimension will be dominant, and anything that does not scale well with >>>>>>> dimension will be thrown away, perhaps >>>>>>> quite quickly, as you want to solve bigger problems. >>>>>>> >>>>>>> Small calculation: Take a 6-dim box which is 'n' on a side, so that N = >>>>>>> n^6. FFT is N log N, whereas direct evaluation of >>>>>>> a convolution is N^2. So if we could do n = 100 points on a side with >>>>>>> FFT, you can do, assuming the constants are about >>>>>>> equal, >>>>>>> >>>>>>> 100^6 (log 100^6) = x^12 >>>>>>> 1e12 (6 2) = x^12 >>>>>>> x ~ 10 >>>>>>> >>>>>>> Past versions of PETSc had a higher dimensional DA construct (adda in >>>>>>> prior versions), but no one used it so we removed it. >>>>>>> >>>>>>> I assume you are proposing something like >>>>>>> >>>>>>> https://arxiv.org/abs/1306.4625 >>>>>>> >>>>>>> or maybe >>>>>>> >>>>>>> https://arxiv.org/abs/1610.00397 >>>>>>> >>>>>>> The later is interesting since I think you could use a DMDA for the >>>>>>> velocity and something like DMPlex for the space. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> x^T A x (where T - is transpose, x - is my state vector, and A is a >>>>>>> matrix which represents convolution and probably it will be not very >>>>>>> sparse). >>>>>>> >>>>>>> Yes - I will do more work (alghorithm will be slower), but will I win >>>>>>> if we take the parallelization into account - less communication + >>>>>>> probably more scalable algorithm? >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> Alex. >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 07/29/2017 05:23 PM, Barry Smith wrote: >>>>>>> DMDA provides a simple way to do domain decomposition of structured >>>>>>> grid meshes in the x, y, and z direction, that is it makes it easy to >>>>>>> efficiently chop up vectors in all 3 dimensions, allowing very large >>>>>>> problems easily. Unfortunately it only goes up to 3 dimensions and >>>>>>> attempts to produce versions for higher dimensions have failed. >>>>>>> >>>>>>> If you would like to do domain decomposition across all six of your >>>>>>> dimensions (which I think you really need to do to solve large >>>>>>> problems) we don't have a tool that helps with doing the domain >>>>>>> decomposition. To make matters more complicated, likely the 3d mpi-fftw >>>>>>> that need to be applied require rejiggering the data across the nodes >>>>>>> to get it into a form where the ffts can be applied efficiently. >>>>>>> >>>>>>> I don't think there is necessarily anything theoretically difficult >>>>>>> about managing the six dimensional domain decomposition I just don't >>>>>>> know of any open source software out there that helps to do this. Maybe >>>>>>> some PETSc users are aware of such software and can chime in. >>>>>>> >>>>>>> Aside from these two issues :-) PETSc would be useful for you since >>>>>>> you could use our time integrators and nonlinear solvers. Pulling out >>>>>>> subvectors to process on can be done with with either VecScatter or >>>>>>> VecStrideScatter, VecStrideGather etc depending on the layout of the >>>>>>> unknowns, i.e. how the domain decomposition is done. >>>>>>> >>>>>>> I wish I had better answers for managing the 6d domain decomposition >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Jul 29, 2017, at 5:06 PM, Oleksandr Koshkarov <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>> Dear All, >>>>>>> >>>>>>> I am a new PETSc user and I am still in the middle of the manual (I >>>>>>> have finally settled to choose PETSc as my main numerical library) so I >>>>>>> am sorry beforehand if my questions would be naive. >>>>>>> >>>>>>> I am trying to solve 6+1 dimensional Vlasov equation with spectral >>>>>>> methods. More precisely, I will try to solve half-discretized equations >>>>>>> of the form (approximate form) with pseudospectral Fourier method: >>>>>>> >>>>>>> (Equations are in latex format, the nice website to see them is >>>>>>> https://www.codecogs.com/latex/eqneditor.php) >>>>>>> >>>>>>> \frac{dC_{m,m,p}}{dt} =\\ >>>>>>> \partial_x \left ( a_n C_{n+1,m,p} +b_n C_{n,m,p} +c_n C_{n-1,m,p} >>>>>>> \right ) \\ >>>>>>> + \partial_y \left ( a_m C_{n,m+1,p} +b_m C_{n,m,p} +c_m C_{n,m-1,p} >>>>>>> \right ) \\ >>>>>>> + \partial_z \left ( a_p C_{n,m,p+1} +b_p C_{n,m,p} +c_p C_{n,m,p-1} >>>>>>> \right ) \\ >>>>>>> + d_n E_x C_{n-1,m,p} + d_m E_x C_{n,m-1,p} + d_p E_x C_{n,m,p-1} \\ >>>>>>> + B_x (e_{m,p} C_{n,m-1,p-1} + f_{m,p}C_{n,m-1,p+1} + \dots) + B_y >>>>>>> (\dots) + B_z (\dots) >>>>>>> >>>>>>> >>>>>>> where a,b,c,d,e,f are some constants which can depend on n,m,p, >>>>>>> >>>>>>> n,m,p = 0...N, >>>>>>> >>>>>>> C_{n,m,p} = C_{n,m,p}(x,y,z), >>>>>>> >>>>>>> E_x = E_x(x,y,z), (same for E_y,B_x,...) >>>>>>> >>>>>>> and fields are described with equation of the sort (linear pdes with >>>>>>> source terms dependent on C): >>>>>>> >>>>>>> \frac{dE_x}{dt} = \partial_y B_z - \partial_z B_x + (AC_{1,0,0} + >>>>>>> BC_{0,0,0}) \\ >>>>>>> \frac{dB_x}{dt} = -\partial_y E_z + \partial_z E_x >>>>>>> >>>>>>> with A,B being some constants. >>>>>>> >>>>>>> I will need fully implicit CrankâNicolson method, so my plan is to use >>>>>>> SNES or TS where >>>>>>> >>>>>>> I will use one MPI PETSc vector which describes the state of the system >>>>>>> (all, C, E, B), then I can evolve linear part with just matrix >>>>>>> multiplication. >>>>>>> >>>>>>> The first question is, should I use something like DMDA? if so, it is >>>>>>> only 3dimensional but I need 6 dimensional vectots? Will it be faster >>>>>>> then matrix multiplication? >>>>>>> >>>>>>> Then to do nonlinear part I will need 3d mpi-fftw to compute >>>>>>> convolutions. The problem is, how do I extract subvectors from full big >>>>>>> vector state? (what is the best approach?) >>>>>>> >>>>>>> If you have some other suggestions, please feel free to share >>>>>>> >>>>>>> Thanks and best regards, >>>>>>> >>>>>>> Oleksandr Koshkarov. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which >>>>>>> their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> http://www.caam.rice.edu/~mk51/ >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which >>>>>> their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> http://www.caam.rice.edu/~mk51/ >
