Hi John,

I was running an older version of libMesh when I got the previous segfault
in libMesh::DofMap::clear_sparsity(), which happened after I changed the
assembly for CondensedEigenSystem to set only the diagonal DoFs in B. After
upgrading to libMesh 0.9.4, that segfault went away. The code now runs
fine with 1 and 2 processes; however, with 4 or more processes it segfaults
after printing the following message:

PARMETIS ERROR: The sum of tpwgts for constraint #0 is not 1.0

I also get this assertion failure:

Assertion `((min_procid == this->processor_id()) && obj) || (min_procid !=
this->processor_id())' failed.
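For context, the ParMETIS message refers to its tpwgts array (the target
weight of each partition): for every balance constraint those weights must
sum to exactly 1.0 across all parts. A minimal, illustrative Python sketch
of that invariant (hypothetical helper, not libMesh or ParMETIS code):

```python
def normalize_tpwgts(weights, tol=1e-6):
    """Rescale raw per-partition weights so they sum to 1.0,
    which is what ParMETIS requires for each balance constraint."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all partition weights are zero")
    tpwgts = [w / total for w in weights]
    # This is the check the ParMETIS error above corresponds to.
    assert abs(sum(tpwgts) - 1.0) < tol
    return tpwgts

# Four equal parts -> each gets weight 0.25
print(normalize_tpwgts([1, 1, 1, 1]))
```

If libMesh (or an inconsistent ParMETIS build) hands over weights that
don't satisfy this, ParMETIS aborts with exactly that message.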

Here's the stacktrace:

#22 0x0000003f212bcd22 in __cxa_throw () from /usr/lib64/libstdc++.so.6
#23 0x00002b5818f65382 in void libMesh::ParallelMesh::libmesh_assert_valid_parallel_object_ids<libMesh::Elem>(libMesh::mapvector<libMesh::Elem*, unsigned int> const&) const () from /home/hsahasra/NEMO5/libs/libmesh/libmesh/.libs/libmesh_dbg.so.0
#24 0x00002b5818f572b0 in libMesh::ParallelMesh::libmesh_assert_valid_parallel_ids() const () from /home/hsahasra/NEMO5/libs/libmesh/libmesh/.libs/libmesh_dbg.so.0
#25 0x00002b5818f57431 in libMesh::ParallelMesh::renumber_nodes_and_elements() () from /home/hsahasra/NEMO5/libs/libmesh/libmesh/.libs/libmesh_dbg.so.0
#26 0x00002b5818e4224d in libMesh::MeshBase::prepare_for_use(bool, bool) () from /home/hsahasra/NEMO5/libs/libmesh/libmesh/.libs/libmesh_dbg.so.0


I searched for this error and found a few earlier posts about libMesh
being built against a different ParMETIS than the one PETSc uses. Were you
able to find a fix for this issue?
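In case it helps, a hedged sketch of the kind of rebuild that would rule out
that mismatch: point libMesh at the PETSc installation and disable libMesh's
bundled partitioners so both libraries share one ParMETIS. The flag names
below are assumptions for this libMesh version; please verify them against
./configure --help before relying on them.

```shell
# Make libMesh pick up the METIS/ParMETIS that PETSc was configured with
# (paths are placeholders; flag names may differ by libMesh version).
export PETSC_DIR=/path/to/petsc
export PETSC_ARCH=arch-linux-opt
./configure --disable-metis --disable-parmetis
make -j4 && make install
```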

Thanks!
Harshad

On Tue, Feb 16, 2016 at 4:19 PM, Harshad Sahasrabudhe <hsaha...@purdue.edu>
wrote:

> Hi John and David,
>
> Thanks for your comments. I'm working on rebuilding our environment to get
> the full profiling. I've done some preliminary profiling; the scaling
> that I posted above was only for the call "_system->solve()" on the
> CondensedEigenSystem.
>
> I suggest you consult the SLEPc manual, and ask on the PETSc mailing list
>> if needed. As you can see from the SLEPc manual, you normally don't need to
>> worry about preconditioning since Krylov methods are already optimal if
>> you're looking for an extreme eigenvalue.
>
>
> I'm calculating the smallest eigenvalues, so I guess optimizing the
> preconditioner isn't that high on the priority list. One reason for the bad
> scaling could be the high number of basis vectors. I was using 3*nev; I'll
> reduce it to 2*nev as suggested in the SLEPc manual and try again.
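> For reference, a sketch of the SLEPc options involved (the executable name
> and the values are placeholders; nev is the number of requested
> eigenvalues, ncv the size of the working subspace, i.e. the basis vectors
> above):
>
> ```shell
> # Shrinking the subspace from 3*nev to 2*nev, e.g. for nev = 10:
> ./app -eps_nev 10 -eps_ncv 20 -eps_smallest_real
> ```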
>
> I changed the matrix assembly to only add diagonal entries to the B matrix.
> After doing this, I get a segfault in DofMap::clear_sparsity(). Could this
> be because PETSc frees entries that were never set? Here's the stack
> trace:
>
> #0  0x0000003ab5232625 in raise () from /lib64/libc.so.6
> #1  0x0000003ab5233e05 in abort () from /lib64/libc.so.6
> #2  0x0000003ab5270537 in __libc_message () from /lib64/libc.so.6
> #3  0x0000003ab5275f4e in malloc_printerr () from /lib64/libc.so.6
> #4  0x00002b483d3bccdc in my_free_hook (ptr=0x1ddec80, caller=0x2b483ff657fb) at ../../i_rtc_hook.c:113
> #5  0x00002b483ff657fb in libMesh::DofMap::clear_sparsity() () from /apps/group/ncn/carter/PETSc34/libs/libmesh/libmesh/.libs/libmesh_opt.so.0
> #6  0x00002b483ff6569d in libMesh::DofMap::clear() () from /apps/group/ncn/carter/PETSc34/libs/libmesh/libmesh/.libs/libmesh_opt.so.0
> #7  0x00002b48405556b7 in libMesh::System::clear() () from /apps/group/ncn/carter/PETSc34/libs/libmesh/libmesh/.libs/libmesh_opt.so.0
> #8  0x00002b4840521bfa in libMesh::EigenSystem::clear() () from /apps/group/ncn/carter/PETSc34/libs/libmesh/libmesh/.libs/libmesh_opt.so.0
> #9  0x00002b4840521b8c in libMesh::EigenSystem::~EigenSystem() () from /apps/group/ncn/carter/PETSc34/libs/libmesh/libmesh/.libs/libmesh_opt.so.0
> #10 0x00002b4840514d48 in libMesh::CondensedEigenSystem::~CondensedEigenSystem() () from /apps/group/ncn/carter/PETSc34/libs/libmesh/libmesh/.libs/libmesh_opt.so.0
> #11 0x00002b4840524a61 in libMesh::EquationSystems::clear() () from /apps/group/ncn/carter/PETSc34/libs/libmesh/libmesh/.libs/libmesh_opt.so.0
> #12 0x00002b4840524882 in libMesh::EquationSystems::~EquationSystems() () from /apps/group/ncn/carter/PETSc34/libs/libmesh/libmesh/.libs/libmesh_opt.so.0
> #13 0x00002b484052484a in libMesh::EquationSystems::~EquationSystems() () from /apps/group/ncn/carter/PETSc34/libs/libmesh/libmesh/.libs/libmesh_opt.so.0
>
> Thanks,
> Harshad
>
> On Tue, Feb 16, 2016 at 12:35 PM, John Peterson <jwpeter...@gmail.com>
> wrote:
>
>>
>>
>> On Tue, Feb 16, 2016 at 10:12 AM, Harshad Sahasrabudhe <
>> hsaha...@purdue.edu> wrote:
>>
>>> Hi John,
>>>
>>> Thanks for the response.
>>>
>>>
>>>> I'm using KrylovSchur eigensolver
>>>>> in SLEPc and I find that the eigenvalue computation anti-scales when I
>>>>> use
>>>>>
>>>>
>>>> anti-scales?
>>>>
>>>
>>> Yes, I get the following scaling
>>>
>>> Processes   Eigensolver time (s)
>>>         1                68.3037
>>>         2                51.9604
>>>         4                49.5286
>>>         8                66.7834
>>>        16               106.671
>>>        32               128.522
>>>
>>> One node contains 16 processors, so the last time is for 2 nodes.
>>>
>>
>>
>> Even your scaling from 1->2 processors is very bad, let alone out to 32.
>> Have you done any profiling?  Even configuring libmesh with
>> --enable-perflog would be better than nothing, and might help you figure
>> out which parts of the code are not scaling.  If it's not the eigensolve,
>> then you are focusing on the wrong place anyway.
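>> (For reference, a sketch of that rebuild; --enable-perflog is the flag
>> named above, the rest is a generic libMesh rebuild:)
>>
>> ```shell
>> ./configure --enable-perflog
>> make -j4 && make install
>> ```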
>>
>>
>>
>>
>>> Gauss quadrature. Having a diagonal pattern in B might fix the scaling
>>>>> and
>>>>> increase the performance.
>>>>>
>>>>> Do you have any other suggestions on improving the scaling?
>>>>>
>>>>
>>>> Use preconditioners which scale well, like AMG, if they make sense for
>>>> your problem.
>>>>
>>>
>>> I can try using different preconditioners. How do I set the
>>> preconditioner for the eigensolver?
>>>
>>
>> As David mentioned, you can use all the usual PETSc command line options
>> (-pc_type, -ksp_type, etc.) but prefix them with "-st_" and they will get
>> used by SLEPc.
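>> For example (the executable name is a placeholder; the -st_ prefix routes
>> the options to SLEPc's spectral transform, which owns the inner KSP/PC
>> solve):
>>
>> ```shell
>> ./app -st_ksp_type gmres -st_pc_type hypre -eps_monitor
>> ```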
>>
>> --
>> John
>>
>
>
_______________________________________________
Libmesh-users mailing list
Libmesh-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-users