Hello, I am running into an error when using the matrix-free framework on the GPU with periodic boundary conditions. I am attaching a minimal example that illustrates the issue. I am using deal.II 9.3.0-pre.
The minimal example is derived from step-64 of the tutorials. The code does the following:
1) Creates a single element using the hypercube function.
2) Creates a HEX27 finite element based dof_handler and the constraint matrices.
3) Creates matrix-free objects on the host and on the GPU.
4) Creates a host input vector compatible with the constraints (I set the value at each unconstrained node to its global id).
5) Sends the input vector from the host to the GPU.
6) Performs a single vmult operation with the Laplace operator on the host and on the GPU.
7) Sends the output from the GPU to the host.
8) Compares the two outputs.

When I run the code in debug mode on a single MPI task and compare the two outputs, the values at the unconstrained nodes do not match.

To make sure there are no bugs in my minimal example, I added a periodicBC flag. When periodicBC is set to true, periodic + homogeneous Dirichlet boundary conditions are imposed. When it is set to false, a homogeneous Dirichlet BC is imposed at the interior node, and in that case the output values do match. This flag affects how the constraint matrix is created and nothing else.

I would be very grateful if someone could tell me what mistake I am making. Any help is greatly appreciated.

Thanks and regards,
Vishal Subramanian
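For readers without the attachment, steps 4 and the norm check can be mimicked with a small standalone sketch in plain C++. This is not the attached code and uses no deal.II calls; the helper names (`make_input`, `l2_norm`, `dirichlet_unconstrained`, `periodic_unconstrained`) are hypothetical, and the unconstrained index sets are an assumption read off the logs further down (the nodes for which a difference is printed).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Step 4 in miniature: every unconstrained entry holds its global index,
// every constrained entry stays zero.
std::vector<double> make_input(std::size_t n_dofs,
                               const std::vector<std::size_t> &unconstrained)
{
  std::vector<double> v(n_dofs, 0.0);
  for (const std::size_t i : unconstrained)
    {
      assert(i < n_dofs);
      v[i] = static_cast<double>(i);
    }
  return v;
}

// Plain Euclidean norm, the quantity the logs print as "vector L2 Norm".
double l2_norm(const std::vector<double> &v)
{
  double sum = 0.0;
  for (const double x : v)
    sum += x * x;
  return std::sqrt(sum);
}

// Assumption: with periodicBC = false only the interior node 26 is constrained.
std::vector<std::size_t> dirichlet_unconstrained()
{
  std::vector<std::size_t> ids;
  for (std::size_t i = 0; i < 26; ++i)
    ids.push_back(i);
  return ids;
}

// Assumption: with periodicBC = true these are the unconstrained nodes,
// taken from the node ids listed in the periodic log.
std::vector<std::size_t> periodic_unconstrained()
{
  return {0, 8, 10, 16, 20, 22, 24};
}
```

With 27 degrees of freedom, `l2_norm(make_input(27, dirichlet_unconstrained()))` is sqrt(5525) and `l2_norm(make_input(27, periodic_unconstrained()))` is sqrt(1880), which agree with the input norms printed in the two logs. So the input vectors themselves look consistent in both cases.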
Creating a single element mesh using hypercube
Applying Dirichlet boundary conditions
Created Constriants
Number of active cells: 1
Number of degrees of freedom: 27
Populating the input vector
vector L2 Norm of the input = 74.33034374
vector L2 Norm of the input = 74.33034374
Value of the input vec at global node id = 0 is : 0
Value of the input vec at global node id = 1 is : 1
Value of the input vec at global node id = 2 is : 2
Value of the input vec at global node id = 3 is : 3
Value of the input vec at global node id = 4 is : 4
Value of the input vec at global node id = 5 is : 5
Value of the input vec at global node id = 6 is : 6
Value of the input vec at global node id = 7 is : 7
Value of the input vec at global node id = 8 is : 8
Value of the input vec at global node id = 9 is : 9
Value of the input vec at global node id = 10 is : 10
Value of the input vec at global node id = 11 is : 11
Value of the input vec at global node id = 12 is : 12
Value of the input vec at global node id = 13 is : 13
Value of the input vec at global node id = 14 is : 14
Value of the input vec at global node id = 15 is : 15
Value of the input vec at global node id = 16 is : 16
Value of the input vec at global node id = 17 is : 17
Value of the input vec at global node id = 18 is : 18
Value of the input vec at global node id = 19 is : 19
Value of the input vec at global node id = 20 is : 20
Value of the input vec at global node id = 21 is : 21
Value of the input vec at global node id = 22 is : 22
Value of the input vec at global node id = 23 is : 23
Value of the input vec at global node id = 24 is : 24
Value of the input vec at global node id = 25 is : 25
Value of the input vec at global node id = 26 is : 0
vector L2 Norm of the output host = 61.65471943
vector L2 Norm of the output dev = 61.65471943
vector L2 Norm from dev in host = 61.65471943
Difference between host and dev output at global node id 0 is : 1.33226763e-15
Difference between host and dev output at global node id 1 is : 4.440892099e-16
Difference between host and dev output at global node id 2 is : -8.881784197e-16
Difference between host and dev output at global node id 3 is : -4.440892099e-16
Difference between host and dev output at global node id 4 is : -6.661338148e-16
Difference between host and dev output at global node id 5 is : -6.661338148e-16
Difference between host and dev output at global node id 6 is : -3.108624469e-15
Difference between host and dev output at global node id 7 is : -1.33226763e-15
Difference between host and dev output at global node id 8 is : -1.554312234e-15
Difference between host and dev output at global node id 9 is : -4.440892099e-16
Difference between host and dev output at global node id 10 is : 4.440892099e-16
Difference between host and dev output at global node id 11 is : 2.553512957e-15
Difference between host and dev output at global node id 12 is : 2.220446049e-15
Difference between host and dev output at global node id 13 is : 2.220446049e-16
Difference between host and dev output at global node id 14 is : 4.440892099e-16
Difference between host and dev output at global node id 15 is : -1.33226763e-15
Difference between host and dev output at global node id 16 is : -5.329070518e-15
Difference between host and dev output at global node id 17 is : -6.217248938e-15
Difference between host and dev output at global node id 18 is : -1.065814104e-14
Difference between host and dev output at global node id 19 is : -1.243449788e-14
Difference between host and dev output at global node id 20 is : -7.105427358e-15
Difference between host and dev output at global node id 21 is : -7.105427358e-15
Difference between host and dev output at global node id 22 is : 1.421085472e-14
Difference between host and dev output at global node id 23 is : 1.421085472e-14
Difference between host and dev output at global node id 24 is : 2.131628207e-14
Difference between host and dev output at global node id 25 is : 2.131628207e-14
Test pass
Creating a single element mesh using hypercube
Applying periodic boundary conditions
Created Constriants
Number of active cells: 1
Number of degrees of freedom: 27
Populating the input vector
vector L2 Norm of the input = 43.35896678
vector L2 Norm of the input = 43.35896678
Value of the input vec at global node id = 0 is : 0
Value of the input vec at global node id = 1 is : 0
Value of the input vec at global node id = 2 is : 0
Value of the input vec at global node id = 3 is : 0
Value of the input vec at global node id = 4 is : 0
Value of the input vec at global node id = 5 is : 0
Value of the input vec at global node id = 6 is : 0
Value of the input vec at global node id = 7 is : 0
Value of the input vec at global node id = 8 is : 8
Value of the input vec at global node id = 9 is : 0
Value of the input vec at global node id = 10 is : 10
Value of the input vec at global node id = 11 is : 0
Value of the input vec at global node id = 12 is : 0
Value of the input vec at global node id = 13 is : 0
Value of the input vec at global node id = 14 is : 0
Value of the input vec at global node id = 15 is : 0
Value of the input vec at global node id = 16 is : 16
Value of the input vec at global node id = 17 is : 0
Value of the input vec at global node id = 18 is : 0
Value of the input vec at global node id = 19 is : 0
Value of the input vec at global node id = 20 is : 20
Value of the input vec at global node id = 21 is : 0
Value of the input vec at global node id = 22 is : 22
Value of the input vec at global node id = 23 is : 0
Value of the input vec at global node id = 24 is : 24
Value of the input vec at global node id = 25 is : 0
Value of the input vec at global node id = 26 is : 0
vector L2 Norm of the output host = 49.43173121
vector L2 Norm of the output dev = 78.00111759
vector L2 Norm from dev in host = 78.00111759
Difference between host and dev output at global node id 0 is : 6.014814815
Difference between host and dev output at global node id 8 is : 3.525925926
Difference between host and dev output at global node id 10 is : 3.271111111
Difference between host and dev output at global node id 16 is : 5.351111111
Difference between host and dev output at global node id 20 is : 17.49333333
Difference between host and dev output at global node id 22 is : 16.87703704
Difference between host and dev output at global node id 24 is : 17.92
Test fail
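To put the two sets of differences on the same scale, the following standalone sketch compares the largest absolute host/device difference against the norm of the host output. It is an illustration in plain C++, not the pass/fail check from the attached file: the function names and the relative tolerance of 1e-12 are assumptions, and the numbers are copied from the logs above (for the Dirichlet case only the four largest entries, node ids 22 to 25, are listed).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Largest absolute entry of a list of host-minus-device differences.
double max_abs(const std::vector<double> &diffs)
{
  double m = 0.0;
  for (const double d : diffs)
    m = std::max(m, std::abs(d));
  return m;
}

// True if the largest difference is negligible relative to the size of the
// reference output. The default tolerance is an assumed value.
bool is_roundoff(const std::vector<double> &diffs,
                 const double               reference_norm,
                 const double               rel_tol = 1e-12)
{
  assert(reference_norm > 0.0);
  return max_abs(diffs) <= rel_tol * reference_norm;
}

// Dirichlet case: the four largest differences from the log (ids 22..25).
const std::vector<double> dirichlet_diffs = {1.421085472e-14,
                                             1.421085472e-14,
                                             2.131628207e-14,
                                             2.131628207e-14};
const double dirichlet_host_norm = 61.65471943;

// Periodic case: all seven differences from the log
// (ids 0, 8, 10, 16, 20, 22, 24).
const std::vector<double> periodic_diffs = {6.014814815,
                                            3.525925926,
                                            3.271111111,
                                            5.351111111,
                                            17.49333333,
                                            16.87703704,
                                            17.92};
const double periodic_host_norm = 49.43173121;
```

With these numbers, `is_roundoff(dirichlet_diffs, dirichlet_host_norm)` holds, while `is_roundoff(periodic_diffs, periodic_host_norm)` does not: the largest periodic difference, 17.92, is roughly 0.36 of the host output norm, so the mismatch is far beyond floating-point round-off.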
Attachment: minimal_example.cu
