Additionally,

I took step-40, changed the rhs term to a constant 1, and added a
flux_calculation() function which loops over the boundary and dots the normal
with the gradient of the solution (akin to calculating forces). I also added a
GridTools::distort_random(0.3, triangulation, true, ...) call (with a seed of
500 or so) after mesh generation to get an unstructured mesh.
I ran it for a single cycle, and there seems to be a discrepancy in the flux
calculation on the unstructured grid when running with different numbers of
cores. This doesn't happen with a structured grid.
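
In short, the additions boil down to something like this (condensed from the
attached file, using the same variable names; the surrounding loops and setup
are left out):

    // Perturb the interior vertices after mesh generation (keep_boundary = true)
    // so the grid is no longer structured:
    GridTools::distort_random(0.3, triangulation, /*keep_boundary=*/true);

    // Core of flux_calculation(): at each face quadrature point, dot the
    // solution gradient with the outward normal, weight by JxW, and accumulate
    // into a per-rank sum:
    fe_face_values.reinit(cell, face);
    fe_face_values.get_function_gradients(locally_relevant_solution, gradu);
    for (unsigned int q = 0; q < n_face_q_points; ++q)
      local_flux += gradu[q] * fe_face_values.normal_vector(q) *
                    fe_face_values.JxW(q);

    // Then sum the per-rank contributions:
    Sigma_flux = Utilities::MPI::sum(local_flux, mpi_communicator);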

The output for the flux calculation on a structured mesh with MPI 1 is:
Running with PETSc on 1 MPI rank(s)...
Cycle 0:
   Number of active cells:       64
   Number of degrees of freedom: 289
   Solved in 9 iterations.
Sigma_flux -0.994394

and for MPI 4 is:
Running with PETSc on 4 MPI rank(s)...
Cycle 0:
   Number of active cells:       64
   Number of degrees of freedom: 289
   Solved in 9 iterations.
Sigma_flux -0.994394

But after activating distort_random, the output with MPI 1 is:
Running with PETSc on 1 MPI rank(s)...
Cycle 0:
   Number of active cells:       64
   Number of degrees of freedom: 289
   Solved in 8 iterations.
Sigma_flux -0.993997

but with MPI 4 the output is:
Running with PETSc on 4 MPI rank(s)...
Cycle 0:
   Number of active cells:       64
   Number of degrees of freedom: 289
   Solved in 9 iterations.
Sigma_flux -0.994323

The file is also attached below.

On Monday, June 12, 2023 at 10:57:03 AM UTC+2 Abbas Ballout wrote:

> I am running step-18.
>
> This is the output I'm getting for a single quasi-static step with mpirun 1:
>
> Cycle 0:
>    Number of active cells:       3712 (by partition: 3712)
>    Number of degrees of freedom: 17226 (by partition: 17226)
>    Assembling system... norm of rhs is 1.88062e+10
>    Solver converged in 103 iterations.
>    Updating quadrature point data...
> Cycle 1:
>    Number of active cells:       12805 (by partition: 12805)
>    Number of degrees of freedom: 51708 (by partition: 51708)
>    Assembling system... norm of rhs is 1.86145e+10
>    Solver converged in 120 iterations.
>    Updating quadrature point data...
>    Moving mesh...
>
> And this is the output I get with mpirun 3:
>
> Timestep 1 at time 1
> Cycle 0:
>    Number of active cells:       3712 (by partition: 1360+1286+1066)
>    Number of degrees of freedom: 17226 (by partition: 6651+5922+4653)
>    Assembling system... norm of rhs is 1.88062e+10
>    Solver converged in 131 iterations.
>    Updating quadrature point data...
> Cycle 1:
>    Number of active cells:       12805 (by partition: 4565+4425+3815)
>    Number of degrees of freedom: 51708 (by partition: 19983+17250+14475)
>    Assembling system... norm of rhs is 3.67161e+10
>    Solver converged in 126 iterations.
>    Updating quadrature point data...
>    Moving mesh...
>
> The norm of the rhs in cycle 1 is different between the runs with mpirun 1
> and mpirun 3. Is this normal?
>
> I am experiencing the same problem in my own code, and the results seem to
> be slightly MPI-dependent.
>
> The code is attached below. It is step-18; the only difference is that I am
> running a single step.
>

/* ---------------------------------------------------------------------
 *
 * Copyright (C) 2009 - 2022 by the deal.II authors
 *
 * This file is part of the deal.II library.
 *
 * The deal.II library is free software; you can use it, redistribute
 * it, and/or modify it under the terms of the GNU Lesser General
 * Public License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 * The full text of the license can be found in the file LICENSE.md at
 * the top level directory of deal.II.
 *
 * ---------------------------------------------------------------------

 *
 * Author: Wolfgang Bangerth, Texas A&M University, 2009, 2010
 *         Timo Heister, University of Goettingen, 2009, 2010
 */


#include <deal.II/numerics/vector_tools.h>
#include <deal.II/base/quadrature_lib.h>
#include <deal.II/base/function.h>
#include <deal.II/base/timer.h>

#include <deal.II/lac/generic_linear_algebra.h>

namespace LA
{
#if defined(DEAL_II_WITH_PETSC) && !defined(DEAL_II_PETSC_WITH_COMPLEX) && \
  !(defined(DEAL_II_WITH_TRILINOS) && defined(FORCE_USE_OF_TRILINOS))
  using namespace dealii::LinearAlgebraPETSc;
#  define USE_PETSC_LA
#elif defined(DEAL_II_WITH_TRILINOS)
  using namespace dealii::LinearAlgebraTrilinos;
#else
#  error DEAL_II_WITH_PETSC or DEAL_II_WITH_TRILINOS required
#endif
} // namespace LA

#include <deal.II/numerics/matrix_tools.h>
#include <deal.II/lac/vector.h>
#include <deal.II/lac/full_matrix.h>
#include <deal.II/lac/solver_cg.h>
#include <deal.II/lac/affine_constraints.h>
#include <deal.II/lac/dynamic_sparsity_pattern.h>

#include <deal.II/grid/grid_generator.h>
#include <deal.II/grid/grid_tools.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/dofs/dof_tools.h>
#include <deal.II/fe/fe_values.h>
#include <deal.II/fe/fe_q.h>
#include <deal.II/numerics/vector_tools.h>
#include <deal.II/numerics/data_out.h>
#include <deal.II/numerics/error_estimator.h>

#include <deal.II/base/utilities.h>
#include <deal.II/base/conditional_ostream.h>
#include <deal.II/base/index_set.h>
#include <deal.II/lac/sparsity_tools.h>
#include <deal.II/distributed/tria.h>
#include <deal.II/distributed/grid_refinement.h>

#include <fstream>
#include <iostream>

namespace Step40
{
  using namespace dealii;


  template <int dim>
  class LaplaceProblem
  {
  public:
    LaplaceProblem();

    void run();

  private:
    void setup_system();
    void assemble_system();
    void solve();
    void refine_grid();
    void output_results(const unsigned int cycle) const;
    void flux_calculation();


    MPI_Comm mpi_communicator;

    parallel::distributed::Triangulation<dim> triangulation;

    FE_Q<dim>       fe;
    DoFHandler<dim> dof_handler;

    IndexSet locally_owned_dofs;
    IndexSet locally_relevant_dofs;

    AffineConstraints<double> constraints;

    LA::MPI::SparseMatrix system_matrix;
    LA::MPI::Vector       locally_relevant_solution;
    LA::MPI::Vector       system_rhs;

    ConditionalOStream pcout;
    TimerOutput        computing_timer;
  };




  template <int dim>
  LaplaceProblem<dim>::LaplaceProblem()
    : mpi_communicator(MPI_COMM_WORLD)
    , triangulation(mpi_communicator,
                    typename Triangulation<dim>::MeshSmoothing(
                      Triangulation<dim>::smoothing_on_refinement |
                      Triangulation<dim>::smoothing_on_coarsening))
    , fe(2)
    , dof_handler(triangulation)
    , pcout(std::cout,
            (Utilities::MPI::this_mpi_process(mpi_communicator) == 0))
    , computing_timer(mpi_communicator,
                      pcout,
                      TimerOutput::never,
                      TimerOutput::wall_times)
  {}




  template <int dim>
  void LaplaceProblem<dim>::setup_system()
  {
    TimerOutput::Scope t(computing_timer, "setup");

    dof_handler.distribute_dofs(fe);

    locally_owned_dofs = dof_handler.locally_owned_dofs();
    locally_relevant_dofs =
      DoFTools::extract_locally_relevant_dofs(dof_handler);

    locally_relevant_solution.reinit(locally_owned_dofs,
                                     locally_relevant_dofs,
                                     mpi_communicator);
    system_rhs.reinit(locally_owned_dofs, mpi_communicator);

    constraints.clear();
    constraints.reinit(locally_relevant_dofs);
    DoFTools::make_hanging_node_constraints(dof_handler, constraints);
    VectorTools::interpolate_boundary_values(dof_handler,
                                             0,
                                             Functions::ZeroFunction<dim>(),
                                             constraints);
    constraints.close();

    DynamicSparsityPattern dsp(locally_relevant_dofs);

    DoFTools::make_sparsity_pattern(dof_handler, dsp, constraints, false);
    SparsityTools::distribute_sparsity_pattern(dsp,
                                               dof_handler.locally_owned_dofs(),
                                               mpi_communicator,
                                               locally_relevant_dofs);

    system_matrix.reinit(locally_owned_dofs,
                         locally_owned_dofs,
                         dsp,
                         mpi_communicator);
  }




  template <int dim>
  void LaplaceProblem<dim>::assemble_system()
  {
    TimerOutput::Scope t(computing_timer, "assembly");

    const QGauss<dim> quadrature_formula(fe.degree + 1);

    FEValues<dim> fe_values(fe,
                            quadrature_formula,
                            update_values | update_gradients |
                              update_quadrature_points | update_JxW_values);

    const unsigned int dofs_per_cell = fe.n_dofs_per_cell();
    const unsigned int n_q_points    = quadrature_formula.size();

    FullMatrix<double> cell_matrix(dofs_per_cell, dofs_per_cell);
    Vector<double>     cell_rhs(dofs_per_cell);

    std::vector<types::global_dof_index> local_dof_indices(dofs_per_cell);

    for (const auto &cell : dof_handler.active_cell_iterators())
      if (cell->is_locally_owned())
        {
          cell_matrix = 0.;
          cell_rhs    = 0.;

          fe_values.reinit(cell);

          for (unsigned int q_point = 0; q_point < n_q_points; ++q_point)
            {

              for (unsigned int i = 0; i < dofs_per_cell; ++i)
                {
                  for (unsigned int j = 0; j < dofs_per_cell; ++j)
                    cell_matrix(i, j) += fe_values.shape_grad(i, q_point) *
                                         fe_values.shape_grad(j, q_point) *
                                         fe_values.JxW(q_point);

                  // Constant right-hand side f = 1 (the rhs modification
                  // described in the message above).
                  cell_rhs(i) += 1.0 *                               //
                                 fe_values.shape_value(i, q_point) * //
                                 fe_values.JxW(q_point);
                }
            }

          cell->get_dof_indices(local_dof_indices);
          constraints.distribute_local_to_global(cell_matrix,
                                                 cell_rhs,
                                                 local_dof_indices,
                                                 system_matrix,
                                                 system_rhs);
        }

    system_matrix.compress(VectorOperation::add);
    system_rhs.compress(VectorOperation::add);
  }




  template <int dim>
  void LaplaceProblem<dim>::solve()
  {
    TimerOutput::Scope t(computing_timer, "solve");
    LA::MPI::Vector    completely_distributed_solution(locally_owned_dofs,
                                                    mpi_communicator);

    SolverControl solver_control(dof_handler.n_dofs(), 1e-12);
    LA::SolverCG  solver(solver_control);


    LA::MPI::PreconditionAMG::AdditionalData data;
#ifdef USE_PETSC_LA
    data.symmetric_operator = true;
#else
    /* Trilinos defaults are good */
#endif
    LA::MPI::PreconditionAMG preconditioner;
    preconditioner.initialize(system_matrix, data);

    solver.solve(system_matrix,
                 completely_distributed_solution,
                 system_rhs,
                 preconditioner);

    pcout << "   Solved in " << solver_control.last_step() << " iterations."
          << std::endl;

    constraints.distribute(completely_distributed_solution);

    locally_relevant_solution = completely_distributed_solution;
  }




  template <int dim>
  void LaplaceProblem<dim>::refine_grid()
  {
    TimerOutput::Scope t(computing_timer, "refine");

    Vector<float> estimated_error_per_cell(triangulation.n_active_cells());
    KellyErrorEstimator<dim>::estimate(
      dof_handler,
      QGauss<dim - 1>(fe.degree + 1),
      std::map<types::boundary_id, const Function<dim> *>(),
      locally_relevant_solution,
      estimated_error_per_cell);
    parallel::distributed::GridRefinement::refine_and_coarsen_fixed_number(
      triangulation, estimated_error_per_cell, 0.3, 0.03);
    triangulation.execute_coarsening_and_refinement();
  }




  template <int dim>
  void LaplaceProblem<dim>::output_results(const unsigned int cycle) const
  {
    DataOut<dim> data_out;
    data_out.attach_dof_handler(dof_handler);
    data_out.add_data_vector(locally_relevant_solution, "u");

    Vector<float> subdomain(triangulation.n_active_cells());
    for (unsigned int i = 0; i < subdomain.size(); ++i)
      subdomain(i) = triangulation.locally_owned_subdomain();
    data_out.add_data_vector(subdomain, "subdomain");

    data_out.build_patches();

    data_out.write_vtu_with_pvtu_record(
      "./", "solution", cycle, mpi_communicator, 2, 8);
  }

  template <int dim>
  void LaplaceProblem<dim>::flux_calculation()
  {
    const QGauss<dim - 1> face_quadrature_formula(5);

    FEFaceValues<dim> fe_face_values(fe,
                                     face_quadrature_formula,
                                     update_values | update_gradients |
                                       update_quadrature_points |
                                       update_normal_vectors |
                                       update_JxW_values);

    const unsigned int n_face_q_points = face_quadrature_formula.size();

    std::vector<Tensor<1, dim>> gradu(n_face_q_points);

    double local_flux = 0.;

    // Accumulate grad(u) . n over the faces of all locally owned cells.
    for (const auto &cell : dof_handler.active_cell_iterators())
      if (cell->is_locally_owned())
        for (unsigned int face = 0; face < GeometryInfo<dim>::faces_per_cell;
             ++face)
          {
            fe_face_values.reinit(cell, face);
            fe_face_values.get_function_gradients(locally_relevant_solution,
                                                  gradu);

            const std::vector<double> &JxW = fe_face_values.get_JxW_values();
            const std::vector<Tensor<1, dim>> &normals =
              fe_face_values.get_normal_vectors();

            for (unsigned int point = 0; point < n_face_q_points; ++point)
              local_flux += gradu[point] * normals[point] * JxW[point];
          }

    // Sum the contributions of all MPI ranks.
    const double Sigma_flux =
      Utilities::MPI::sum(local_flux, mpi_communicator);
    pcout << "Sigma_flux " << Sigma_flux << "\n";
  }


  template <int dim>
  void LaplaceProblem<dim>::run()
  {
    pcout << "Running with "
#ifdef USE_PETSC_LA
          << "PETSc"
#else
          << "Trilinos"
#endif
          << " on " << Utilities::MPI::n_mpi_processes(mpi_communicator)
          << " MPI rank(s)..." << std::endl;

    const unsigned int n_cycles = 1;
    for (unsigned int cycle = 0; cycle < n_cycles; ++cycle)
      {
        pcout << "Cycle " << cycle << ':' << std::endl;

        if (cycle == 0)
          {
            GridGenerator::hyper_cube(triangulation);
            triangulation.refine_global(3);
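            // Uncomment the following line to randomly perturb the interior
            // vertices (keep_boundary = true) and obtain the unstructured
            // mesh discussed in the message above.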
          //    GridTools::distort_random(0.3, triangulation, true);
          }
        else
          refine_grid();

        setup_system();

        pcout << "   Number of active cells:       "
              << triangulation.n_global_active_cells() << std::endl
              << "   Number of degrees of freedom: " << dof_handler.n_dofs()
              << std::endl;

        assemble_system();
        solve();

        {
          TimerOutput::Scope t(computing_timer, "output");
          output_results(cycle);
        }
        flux_calculation();

       // computing_timer.print_summary();
     //   computing_timer.reset();

        pcout << std::endl;
      }
  }
} // namespace Step40




int main(int argc, char *argv[])
{
  try
    {
      using namespace dealii;
      using namespace Step40;

      Utilities::MPI::MPI_InitFinalize mpi_initialization(argc, argv, 1);

      LaplaceProblem<2> laplace_problem_2d;
      laplace_problem_2d.run();
    }
  catch (std::exception &exc)
    {
      std::cerr << std::endl
                << std::endl
                << "----------------------------------------------------"
                << std::endl;
      std::cerr << "Exception on processing: " << std::endl
                << exc.what() << std::endl
                << "Aborting!" << std::endl
                << "----------------------------------------------------"
                << std::endl;

      return 1;
    }
  catch (...)
    {
      std::cerr << std::endl
                << std::endl
                << "----------------------------------------------------"
                << std::endl;
      std::cerr << "Unknown exception!" << std::endl
                << "Aborting!" << std::endl
                << "----------------------------------------------------"
                << std::endl;
      return 1;
    }

  return 0;
}
