[deal.II] Re: AMR, how to pass solution vector to refined mesh

2017-03-16 Thread Jaekwang Kim
Thank you, Dr. Bangerth!

The step-15 tutorial was really helpful for me, and I tried to use the 
SolutionTransfer class. 

I refined the grid as follows. 

Because I am using a Picard iteration, the solution interpolated onto the 
refined mesh is used as the previous_solution vector there. 



template <int dim>
void nonlinear<dim>::refine_grid ()
{
  Vector<float> estimated_error_per_cell (triangulation.n_active_cells());

  KellyErrorEstimator<dim>::estimate (dof_handler,
                                      QGauss<dim-1>(3),
                                      typename FunctionMap<dim>::type(),
                                      solution,
                                      estimated_error_per_cell);

  GridRefinement::refine_and_coarsen_fixed_number (triangulation,
                                                   estimated_error_per_cell,
                                                   0.3, 0.0);

  /* Transfer the solution on the old (n-1 step) mesh into previous_solution
     on the new (n step) mesh. */
  triangulation.prepare_coarsening_and_refinement ();
  SolutionTransfer<dim> solution_transfer(dof_handler);
  solution_transfer.prepare_for_coarsening_and_refinement(solution);
  triangulation.execute_coarsening_and_refinement();

  dof_handler.distribute_dofs(fe);   // n_dofs() has increased here;
                                     // check this if necessary

  // Vector<double> tmp(dof_handler.n_dofs());   // tmp: n-step dofs

  previous_solution.reinit (dof_handler.n_dofs());

  solution_transfer.interpolate(solution, previous_solution);  // interpolate (input, output)

  solution.reinit (dof_handler.n_dofs());
}


Also, my setup_system function is as follows.



template <int dim>
void nonlinear<dim>::setup_system (const unsigned int refinement_cycle)
{
  if (refinement_cycle == 0)
    {
      dof_handler.distribute_dofs (fe);

      std::cout << "   Number of degrees of freedom: "
                << dof_handler.n_dofs()
                << std::endl;

      solution.reinit (dof_handler.n_dofs());
      previous_solution.reinit (dof_handler.n_dofs());
      system_rhs.reinit (dof_handler.n_dofs());

      /* (a "for (unsigned int i=0; i<..." loop and the first arguments of the
         following call were lost from the archived message) */
      VectorTools::interpolate_boundary_values (dof_handler, 1,
                                                BoundaryValues<dim>(),
                                                constraints);

      constraints.close ();
    }
  else
    {
      // You don't need to reinit previous_solution or the solution vector here
      // (that is done in the refinement step).
      // What you should do: set the boundary conditions again.
      DynamicSparsityPattern dsp(dof_handler.n_dofs());
      DoFTools::make_sparsity_pattern (dof_handler, dsp);
      sparsity_pattern.copy_from(dsp);

      system_matrix.reinit (sparsity_pattern);
      system_rhs.reinit (dof_handler.n_dofs());

      constraints.clear ();
      DoFTools::make_hanging_node_constraints (dof_handler,
                                               constraints);
      VectorTools::interpolate_boundary_values (dof_handler, 1,
                                                BoundaryValues<dim>(),
                                                constraints);
      constraints.close ();

      std::cout << "Set up system finished" << std::endl;
    }
}


I think these two functions do everything I need... but when I run my code, 
I run into an error message when I assemble the system for the second time. 
What might be the problem?

Thank you, as always!!

Jaekwang Kim

An error occurred in line <1668> of file <...> in function

    void dealii::internals::dealiiSparseMatrix::add_value(const LocalType,
        const size_type, const size_type, SparseMatrixIterator &)
    [SparseMatrixIterator = dealii::SparseMatrixIterators::Iterator<...>,
     LocalType = double]

The violated condition was:
    matrix_values->column() == column

The name and call sequence of the exception was:
    typename SparseMatrix<...>::ExcInvalidIndex(row, column)

Additional Information:
You are trying to access the matrix entry with index <0,30>, but this 
entry does not exist in the sparsity pattern of this matrix.

The most common cause for this problem is that you used a method to build 
the sparsity pattern that did not (completely) take into account all of the 
entries you will later try to write into. An example would be building a 
sparsity pattern that does not include the entries you will write into due 
to constraints on degrees of freedom such as hanging nodes or periodic 
boundary conditions. In such cases, building the sparsity pattern will 
succeed, but you will get errors such as the current one at one point or 
other when trying to write into the entries of the matrix.
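
(For reference, a minimal sketch of the setup this message describes, not the 
poster's code: it reuses the member names dof_handler, constraints, dsp, 
sparsity_pattern and system_matrix from the functions above and builds the 
constraints before the sparsity pattern, so that the constrained entries are 
present in the pattern.)

    // Hedged sketch: fill the constraints first, then tell make_sparsity_pattern
    // about them so that entries written through hanging-node/boundary
    // constraints are allocated in the pattern.
    constraints.clear ();
    DoFTools::make_hanging_node_constraints (dof_handler, constraints);
    VectorTools::interpolate_boundary_values (dof_handler, 1,
                                              BoundaryValues<dim>(),
                                              constraints);
    constraints.close ();

    DynamicSparsityPattern dsp (dof_handler.n_dofs());
    DoFTools::make_sparsity_pattern (dof_handler, dsp, constraints,
                                     /*keep_constrained_dofs=*/ false);
    sparsity_pattern.copy_from (dsp);
    system_matrix.reinit (sparsity_pattern);

With keep_constrained_dofs set to false, local contributions then have to be 
added during assembly through ConstraintMatrix::distribute_local_to_global.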




[deal.II] Re: Installation using Spack fails during 'ncurses' installation

2017-03-16 Thread Jean-Paul Pelteret
Dear Stephen,

Since this is an issue not directly related to deal.II but rather to Spack 
itself (ncurses is only a dependency of a deal.II-related package), I would 
suggest that you post your question on their forum. You'll generally get a 
quick response there, and if not then you should open an issue on their 
GitHub page.

Best regards,
Jean-Paul

On Thursday, March 16, 2017 at 4:19:36 PM UTC, Stephen DeWitt wrote:
>
> Hello,
> I'm trying to install dealii on a shared AFS file system. Originally, I
> tried to install everything manually, which I had done successfully on our
> HPC cluster, but ran into a PETSc link error and decided to try Spack.
>
> I followed the instructions on the deal.II wiki, but installation stopped
> during the "install" phase for "ncurses". [...]
>


[deal.II] Re: Installation using Spack fails during 'ncurses' installation

2017-03-16 Thread Bruno Turcksin
Steve,

I have tried to use Spack several times on two clusters and it never worked 
for me (but it works fine on my own machine). I usually have to patch a 
bunch of things, and in the end I still have problems when I load the 
modules. I find it a lot easier to install everything manually. You can 
also try candi, which has an option to build deal.II on a cluster.

Best,

Bruno

On Thursday, March 16, 2017 at 12:19:36 PM UTC-4, Stephen DeWitt wrote:
>
> Hello,
> I'm trying to install dealii on a shared AFS file system. Originally, I
> tried to install everything manually, which I had done successfully on our
> HPC cluster, but ran into a PETSc link error and decided to try Spack.
>
> I followed the instructions on the deal.II wiki, but installation stopped
> during the "install" phase for "ncurses". [...]
>


[deal.II] Installation using Spack fails during 'ncurses' installation

2017-03-16 Thread Stephen DeWitt
Hello,
I'm trying to install dealii on a shared AFS file system. Originally, I 
tried to install everything manually, which I had done successfully on our 
HPC cluster, but ran into a PETSc link error and decided to try Spack.

I followed the instructions on the deal.II wiki, but installation stopped 
during the "install" phase for "ncurses". The packages "bzip2" and 
"muparser" installed without a problem.

Here's the error message:

 
100.0%

==> Staging archive: /afs/umich.edu/user/s/t/stvdwtt/Public/PRISMS/software/spack/spack/var/spack/stage/ncurses-6.0-4wkexyzgaxdwrfs6wqje2ppcm5di263m/ncurses-6.0.tar.gz
==> Created stage in /afs/umich.edu/user/s/t/stvdwtt/Public/PRISMS/software/spack/spack/var/spack/stage/ncurses-6.0-4wkexyzgaxdwrfs6wqje2ppcm5di263m
==> Applied patch sed_pgi.patch
==> Ran patch() for ncurses
==> Building ncurses [AutotoolsPackage]
==> Executing phase : 'autoreconf'
==> Executing phase : 'configure'
==> Executing phase : 'build'
==> Executing phase : 'install'
==> Error: ProcessError: Command exited with status 2:
'make' '-j2' 'install'

/afs/umich.edu/user/s/t/stvdwtt/Public/PRISMS/software/spack/spack/lib/spack/spack/build_systems/autotools.py:282, in install:

 277      def install(self, spec, prefix):
 278          """Makes the install targets specified by
 279          :py:attr:``~.AutotoolsPackage.install_targets``
 280          """
 281          with working_dir(self.build_directory):
  >> 282              inspect.getmodule(self).make(*self.install_targets)

See build log for details:
  /tmp/stvdwtt/spack-stage/spack-stage-7oak9b/ncurses-6.0/spack-build.out

I went to the build log (which is quite long) and saw several errors like 
this:

cd ../lib && (ln -s -f libpanel.so.6.0 libpanel.so.6; ln -s -f libpanel.so.6 libpanel.so; )
/usr/bin/ld: total time in link: 0.021996
/usr/bin/ld: data size 29224512
cd /afs/umich.edu/user/s/t/stvdwtt/Public/PRISMS/software/spack/spack/opt/spack/linux-rhel6-x86_64/gcc-4.4.7/ncurses-6.0-4wkexyzgaxdwrfs6wqje2ppcm5di263m/lib && (ln -s -f libpanel.so.6.0 libpanel.so.6; ln -s -f libpanel.so.6 libpanel.so; )
test -z "" && /sbin/ldconfig
/sbin/ldconfig: Can't create temporary cache file /etc/ld.so.cache~: Permission denied
make[1]: [/afs/umich.edu/user/s/t/stvdwtt/Public/PRISMS/software/spack/spack/opt/spack/linux-rhel6-x86_64/gcc-4.4.7/ncurses-6.0-4wkexyzgaxdwrfs6wqje2ppcm5di263m/lib/libpanel.so.6.0] Error 1 (ignored)

The last few lines of the build log (which I'm not sure are relevant) are:

Running sh /afs/umich.edu/user/s/t/stvdwtt/Public/PRISMS/software/spack/spack/var/spack/stage/ncurses-6.0-4wkexyzgaxdwrfs6wqje2ppcm5di263m/ncurses-6.0/misc/shlib tic to install /afs/umich.edu/user/s/t/stvdwtt/Public/PRISMS/software/spack/spack/opt/spack/linux-rhel6-x86_64/gcc-4.4.7/ncurses-6.0-4wkexyzgaxdwrfs6wqje2ppcm5di263m/share/terminfo ...

You may see messages regarding extended capabilities, e.g., AX.
These are extended terminal capabilities which are compiled using
    tic -x
If you have ncurses 4.2 applications, you should read the INSTALL
document, and install the terminfo without the -x option.

** creating form.pc
** creating ncurses++.pc
touch pc-files
/bin/sh -c 'for name in *.pc; do /usr/bin/install -c -m 644 $name /afs/umich.edu/user/s/t/stvdwtt/Public/PRISMS/software/spack/spack/opt/spack/linux-rhel6-x86_64/gcc-4.4.7/ncurses-6.0-4wkexyzgaxdwrfs6wqje2ppcm5di263m/lib/pkgconfig/$name; done'

Judging from the error, I'm assuming that it's a permissions issue. I 
double-checked that my $SPACK_ROOT environment variable is set correctly. 
The first line of the build log includes a '--prefix=' statement, which 
correctly picks up the $SPACK_ROOT path.

Does anyone have ideas on what the problem is? Digging around the Spack 
documentation didn't turn up anything.

Thanks!
Steve



Re: [deal.II] Re: Internal instability of the GMRES Solver / Trilinos

2017-03-16 Thread Martin Kronbichler
Dear Pascal,

No, you do not need to try the other solution. I'm glad I could help.
(This confirms that we need to be careful with the vector pool between
different calls.)

Best,
Martin


On 16.03.2017 15:21, Pascal Kraft wrote:
> Hi Martin,
>
> I have tried a version with
> GrowingVectorMemory<TrilinosWrappers::MPI::BlockVector>::release_unused_memory()
> at the end of each step and removed my change to trilinos_vector.cc l.247
> (back to the version from the dealii source) and it seems to work fine.
> [...]

Re: [deal.II] Re: Internal instability of the GMRES Solver / Trilinos

2017-03-16 Thread Pascal Kraft
Hi Martin,

I have tried a version with 
GrowingVectorMemory<TrilinosWrappers::MPI::BlockVector>::release_unused_memory() 
at the end of each step and removed my change to trilinos_vector.cc l.247 
(back to the version from the dealii source) and it seems to work fine. I have 
not tried the other solution you proposed, should I? Would the result help 
you?

Thank you a lot for your support! This had been driving me crazy :)

Best,
Pascal

On Thursday, March 16, 2017 at 08:58:53 UTC+1, Martin Kronbichler wrote:
>
> Dear Pascal,
>
> You are right, in your case one needs to call
> GrowingVectorMemory<TrilinosWrappers::MPI::BlockVector>::release_unused_memory()
> rather than for the vector. Can you try that as well?
> [...]

Re: [deal.II] Re: Internal instability of the GMRES Solver / Trilinos

2017-03-16 Thread Pascal Kraft
Dear Martin,

my local machine is tied up with a Valgrind run at the moment, but as soon as 
that is done with one step I will put these changes in right away and post 
the results here (<6 hrs).
From what I make of the call stacks, one process somehow gets out of the 
SameAs() call without being MPI-blocked, and the others are then forced to 
wait during the Allreduce call. How or where that happens I will try to 
figure out later today. SDM is now working well in my Eclipse setup and I 
hope to be able to track the problem down.

Best,
Pascal

On Thursday, March 16, 2017 at 08:58:53 UTC+1, Martin Kronbichler wrote:
>
> Dear Pascal,
>
> You are right, in your case one needs to call
> GrowingVectorMemory<TrilinosWrappers::MPI::BlockVector>::release_unused_memory()
> rather than for the vector. Can you try that as well?
> [...]

Re: [deal.II] Re: Internal instability of the GMRES Solver / Trilinos

2017-03-16 Thread Martin Kronbichler
Dear Pascal,

You are right, in your case one needs to call
GrowingVectorMemory<TrilinosWrappers::MPI::BlockVector>::release_unused_memory()
rather than for the vector. Can you try that as well?
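
(A minimal sketch of this suggestion, to be called between optimization
steps: the helper name below is made up, and the exact set of vector types
to release is an assumption; the release_unused_memory() calls are the ones
discussed in this thread.)

#include <deal.II/lac/vector_memory.h>
#include <deal.II/lac/trilinos_vector.h>
#include <deal.II/lac/trilinos_parallel_block_vector.h>

// Hypothetical helper: drop the temporary vectors cached in the
// GrowingVectorMemory pool so that no stale Epetra maps / MPI communicators
// survive into the next GMRES solve.
void release_solver_memory ()
{
  dealii::GrowingVectorMemory<dealii::TrilinosWrappers::MPI::Vector>
    ::release_unused_memory ();
  dealii::GrowingVectorMemory<dealii::TrilinosWrappers::MPI::BlockVector>
    ::release_unused_memory ();
}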

The problem appears to be that the call to SameAs returns different
results for different processors, which it should not, which is why I
suspect that there might be some stale communicator object around.
Another indication for that assumption is that you get stuck in the
initialization of the temporary vectors of the GMRES solver, which is
exactly this kind of situation.

As to the particular patch I referred to: It does release some memory
that might have stale information but it also changes some of the call
structures slightly. Could you try to change the following:

if (vector->Map().SameAs(v.vector->Map()) == false)

to

if (v.vector->Map().SameAs(vector->Map()) == false)

Best, Martin

On 16.03.2017 01:28, Pascal Kraft wrote:
> Hi Martin,
> that didn't solve my problem. What I have done in the meantime is
> replace the check in line 247 of trilinos_vector.cc with true. I don't
> know if this causes memory leaks or anything but my code seems to be
> working fine with that change. 
> To your suggestion: Would I have also had to call the templated
> version for BlockVectors or only for Vectors? I only tried the latter.
> Would I have had to also apply some patch to my dealii library for it
> to work or is the patch you talked about simply that you included the
> functionality of the call
> GrowingVectorMemory<TrilinosWrappers::MPI::Vector>::release_unused_memory()
> in some places?
> I have also wanted to try MPICH instead of OpenMPI because of a post
> about an internal error in OpenMPI and one of the functions appearing
> in the call stacks sometimes not blocking properly.
> Thank you for your time and your fast responses - the whole library
> and the people developing it and making it available are simply awesome ;)
> Pascal
> On Wednesday, March 15, 2017 at 17:26:23 UTC+1, Martin Kronbichler wrote:
>
> Dear Pascal,
>
> This problem seems related to a problem we recently worked around
> in https://github.com/dealii/dealii/pull/4043
>
> Can you check what happens if you call
> GrowingVectorMemory<TrilinosWrappers::MPI::Vector>::release_unused_memory()
>
> between your optimization steps? If a communicator gets stuck in
> those places it is likely a stale object somewhere that we fail to
> work around for some reason.
>
> Best, Martin
>
> On 15.03.2017 14:10, Pascal Kraft wrote:
>> Dear Timo,
>> I have done some more digging and found out the following. The
>> problems seem to happen in trilinos_vector.cc between the lines
>> 240 and 270.
>> What I see on the call stacks is, that one process reaches line
>> 261 ( ierr = vector->GlobalAssemble (last_action); ) and then
>> waits inside this call at an MPI_Barrier with the following stack:
>> 20  7fffd4d18f56
>> 19 opal_progress()  7fffdc56dfca
>> 18 ompi_request_default_wait_all()  7fffddd54b15
>> 17 ompi_coll_tuned_barrier_intra_recursivedoubling()  7fffcf9abb5d
>> 16 PMPI_Barrier()  7fffddd68a9c
>> 15 Epetra_MpiDistributor::DoPosts()  7fffe4088b4f
>> 14 Epetra_MpiDistributor::Do()  7fffe4089773
>> 13 Epetra_DistObject::DoTransfer()  7fffe400a96a
>> 12 Epetra_DistObject::Export()  7fffe400b7b7
>> 11 int Epetra_FEVector::GlobalAssemble()  7fffe4023d7f
>> 10 Epetra_FEVector::GlobalAssemble()  7fffe40228e3
>> The other (in my case three) processes are stuck in the head of
>> the if/else statement leading up to this point, namely in the line
>> if (vector->Map().SameAs(v.vector->Map()) == false)
>> inside the call to SameAs(...) with stacks like
>> 15 opal_progress()  7fffdc56dfbc
>> 14 ompi_request_default_wait_all()  7fffddd54b15
>> 13 ompi_coll_tuned_allreduce_intra_recursivedoubling()  7fffcf9a4913
>> 12 PMPI_Allreduce()  7fffddd6587f
>> 11 Epetra_MpiComm::MinAll()  7fffe408739e
>> 10 Epetra_BlockMap::SameAs()  7fffe3fb9d74
>> Maybe this helps. Producing a smaller example will likely not be
>> possible in the coming two weeks but if there are no solutions
>> until then I can try.
>> Greetings,
>> Pascal