I developed a code based on libMesh to optimize components subjected to
extreme thermal loads. I use a gradient-based optimizer, and the
optimization problem has many constraint functions, so I perform a direct
sensitivity analysis in which the two systems (thermal and static) each
have to be solved once per design variable. There are many design
variables, so, for efficiency, I solve each system with all of the RHS
vectors packed into a single matrix. Previously, I did this using PETSc
and SuperLU_DIST. For maintainability and efficiency, I am currently
replacing PETSc with Eigen, and I have run into an issue that I
can't understand.

The systems are evaluated using four separate functions (this is relevant
later):
1) Solve temperatures
2) Solve displacements
3) Differentiate temperatures
4) Differentiate displacements

The solve steps use libMesh's interface and its internal Eigen solver.
Since that solver can't handle multiple RHS vectors at once, in each
differentiation function I obtain a copy or const reference of the system
matrix through a mat() method I added to EigenSparseMatrix, create my own
Eigen solver, pack the RHS vectors for the sensitivity analysis into a
matrix, and solve. I'm testing different variants; some of them work just
fine, but some of them crash, and I see no logical reason why.

- If I use an Eigen::BiCGSTAB<EigenSM> solver for steps #3 and #4,
everything works as expected, even if I use the matrix copy, which has a
column-major storage order.
- If I change the solver in step #4 to Eigen::SparseLU, there is a
significant efficiency improvement, and everything works fine.
- However, if I change the solver in step #3 to Eigen::SparseLU, or even
just to Eigen::BiCGSTAB<Eigen::SparseMatrix<Number>>, I get the following
error message and the code crashes:

double free or corruption (out)
[e2mbacur2018li:27024] *** Process received signal ***
[e2mbacur2018li:27024] Signal: Aborted (6)
[e2mbacur2018li:27024] Signal code:  (-6)
[e2mbacur2018li:27024] [ 0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7fc2af29b890]
[e2mbacur2018li:27024] [ 1]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7fc2aeed6e97]
[e2mbacur2018li:27024] [ 2]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7fc2aeed8801]
[e2mbacur2018li:27024] [ 3]
/lib/x86_64-linux-gnu/libc.so.6(+0x89897)[0x7fc2aef21897]
[e2mbacur2018li:27024] [ 4]
/lib/x86_64-linux-gnu/libc.so.6(+0x9090a)[0x7fc2aef2890a]
[e2mbacur2018li:27024] [ 5]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x525)[0x7fc2aef2fe75]
[e2mbacur2018li:27024] [ 6]
/usr/local/lib/libmesh_opt.so.0(_ZN5Eigen8internal8bicgstabINS_3RefIKNS_12SparseMatrixIdLi1EiEELi0ENS_11OuterStrideILin1EEEEENS_5BlockIKNS_6MatrixIdLin1ELi1ELi0ELin1ELi1EEELin1ELi1ELb1EEENS9_ISB_Lin1ELi1ELb1EEENS_22DiagonalPreconditionerIdEEEEbRKT_RKT0_RT1_RKT2_RlRNSN_10RealScalarE+0x5a5a)[0x7fc2b11c3e8a]
[e2mbacur2018li:27024] [ 7]
/usr/local/lib/libmesh_opt.so.0(_ZN7libMesh23EigenSparseLinearSolverIdE5solveERNS_12SparseMatrixIdEERNS_13NumericVectorIdEES7_dj+0x1946)[0x7fc2b11d7436]
[e2mbacur2018li:27024] [ 8]
/usr/local/lib/libmesh_opt.so.0(_ZN7libMesh20LinearImplicitSystem5solveEv+0x40d)[0x7fc2b1233f7d]
[e2mbacur2018li:27024] [ 9] ./mspfc_td(+0xd635c)[0x557868ecb35c]
[e2mbacur2018li:27024] [10] ./mspfc_td(+0xcdf39)[0x557868ec2f39]
[e2mbacur2018li:27024] [11] ./mspfc_td(+0x9a63b)[0x557868e8f63b]
[e2mbacur2018li:27024] [12] ./mspfc_td(+0x2229a)[0x557868e1729a]
[e2mbacur2018li:27024] [13]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7fc2aeeb9b97]
[e2mbacur2018li:27024] [14] ./mspfc_td(+0x2381a)[0x557868e1881a]
[e2mbacur2018li:27024] *** End of error message ***
Aborted (core dumped)

Valgrind confirms the confusing part: the choice of solver in the function
for step #3 (even the choice of the matrix type template parameter) causes
a memory error in the solver used in step #1. Would anyone have any idea
why changing the type of an Eigen solver external to the libMesh library
would cause a memory error in the internal solver before the external
solver even comes into scope? I'd appreciate any insight. For some reason,
the choice of solver in step #4 seems to have no connection to the issue at
all. Below is an excerpt from the valgrind output.

Regards,
Bailey

==27128== Invalid free() / delete / delete[] / realloc()
==27128==    at 0x4C30D3B: free (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27128==    by 0x57C0E89: bool
Eigen::internal::bicgstab<Eigen::Ref<Eigen::SparseMatrix<double, 1, int>
const, 0, Eigen::OuterStride<-1> >, Eigen::Block<Eigen::Matrix<double, -1,
1, 0, -1, 1> const, -1, 1, true>, Eigen::Block<Eigen::Matrix<double, -1, 1,
0, -1, 1>, -1, 1, true>, Eigen::DiagonalPreconditioner<double>
>(Eigen::Ref<Eigen::SparseMatrix<double, 1, int> const, 0,
Eigen::OuterStride<-1> > const&, Eigen::Block<Eigen::Matrix<double, -1, 1,
0, -1, 1> const, -1, 1, true> const&, Eigen::Block<Eigen::Matrix<double,
-1, 1, 0, -1, 1>, -1, 1, true>&, Eigen::DiagonalPreconditioner<double>
const&, long&, Eigen::Block<Eigen::Matrix<double, -1, 1, 0, -1, 1>, -1, 1,
true>::RealScalar&) (in /usr/local/lib/libmesh_opt.so.0.0.0)
==27128==    by 0x57D4435:
libMesh::EigenSparseLinearSolver<double>::solve(libMesh::SparseMatrix<double>&,
libMesh::NumericVector<double>&, libMesh::NumericVector<double>&, double,
unsigned int) (in /usr/local/lib/libmesh_opt.so.0.0.0)
==27128==    by 0x5830F7C: libMesh::LinearImplicitSystem::solve() (in
/usr/local/lib/libmesh_opt.so.0.0.0)
==27128==    by 0x1DE35B: MinStressPFC_TD::_solve_thermal() (in
/home/bacur/minstresspfc/mspfc_td)
==27128==    by 0x1D5F38: MinStressPFC::eval_sys_equations(unsigned int,
std::vector<double, std::allocator<double> > const&, std::vector<double,
std::allocator<double> >&, std::vector<double, std::allocator<double> >&)
(in /home/bacur/minstresspfc/mspfc_td)
==27128==    by 0x1A263A: OptimizerMMA_eigen::iterate() (in
/home/bacur/minstresspfc/mspfc_td)
==27128==    by 0x12A299: main (in /home/bacur/minstresspfc/mspfc_td)
==27128==  Address 0x1905ee00 is 32 bytes inside a block of size 2,880
alloc'd
==27128==    at 0x4C2FB0F: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27128==    by 0x16C97C: Eigen::internal::aligned_malloc(unsigned long)
(in /home/bacur/minstresspfc/mspfc_td)
==27128==    by 0x1BF31B: Eigen::PlainObjectBase<Eigen::Matrix<double, -1,
1, 0, -1, 1> >::resize(long, long) (in /home/bacur/minstresspfc/mspfc_td)
==27128==    by 0x57BBBC6: bool
Eigen::internal::bicgstab<Eigen::Ref<Eigen::SparseMatrix<double, 1, int>
const, 0, Eigen::OuterStride<-1> >, Eigen::Block<Eigen::Matrix<double, -1,
1, 0, -1, 1> const, -1, 1, true>, Eigen::Block<Eigen::Matrix<double, -1, 1,
0, -1, 1>, -1, 1, true>, Eigen::DiagonalPreconditioner<double>
>(Eigen::Ref<Eigen::SparseMatrix<double, 1, int> const, 0,
Eigen::OuterStride<-1> > const&, Eigen::Block<Eigen::Matrix<double, -1, 1,
0, -1, 1> const, -1, 1, true> const&, Eigen::Block<Eigen::Matrix<double,
-1, 1, 0, -1, 1>, -1, 1, true>&, Eigen::DiagonalPreconditioner<double>
const&, long&, Eigen::Block<Eigen::Matrix<double, -1, 1, 0, -1, 1>, -1, 1,
true>::RealScalar&) (in /usr/local/lib/libmesh_opt.so.0.0.0)
==27128==    by 0x57D4435:
libMesh::EigenSparseLinearSolver<double>::solve(libMesh::SparseMatrix<double>&,
libMesh::NumericVector<double>&, libMesh::NumericVector<double>&, double,
unsigned int) (in /usr/local/lib/libmesh_opt.so.0.0.0)
==27128==    by 0x5830F7C: libMesh::LinearImplicitSystem::solve() (in
/usr/local/lib/libmesh_opt.so.0.0.0)
==27128==    by 0x1DE35B: MinStressPFC_TD::_solve_thermal() (in
/home/bacur/minstresspfc/mspfc_td)
==27128==    by 0x1D5F38: MinStressPFC::eval_sys_equations(unsigned int,
std::vector<double, std::allocator<double> > const&, std::vector<double,
std::allocator<double> >&, std::vector<double, std::allocator<double> >&)
(in /home/bacur/minstresspfc/mspfc_td)
==27128==    by 0x1A263A: OptimizerMMA_eigen::iterate() (in
/home/bacur/minstresspfc/mspfc_td)
==27128==    by 0x12A299: main (in /home/bacur/minstresspfc/mspfc_td)

_______________________________________________
Libmesh-users mailing list
Libmesh-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-users
