On Jan 5, 2012, at 6:41 PM, Jed Brown wrote:

> On Thu, Jan 5, 2012 at 17:13, Ravi Kannan <rxk at cfdrc.com> wrote:
> Files are attached.
> 
> Could you try attaching a debugger to get stack traces?
> 
> It is reducing to a smaller communicator for the coarse level. The processes 
> are likely both hung later in gamg.c:createLevel(). Mark, it appears that all 
> procs that call MPI_Comm_create() are also doing things on the newly 
> created communicator, even though it will be MPI_COMM_NULL on processes that 
> are not part of the subgroup. Also, I'm skeptical that you can get correct 
> results with MatPartitioningSetAdjacency(mpart,adj) when mpart and adj are on 
> different communicators. Those other rows of adj are not moved by 
> MatPartitioningApply_Parmetis().

Having two communicators running around is scary, but the processors that are 
dropped out of the new communicator have no rows -- that is why they are 
dropped out.
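
For reference, here is a minimal standalone sketch (not the actual gamg.c 
code) of the pattern Jed is describing: MPI_Comm_create() is collective over 
the parent communicator, but it returns MPI_COMM_NULL on ranks outside the 
group, so every use of the new communicator has to be guarded.

  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char **argv)
  {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Subgroup: the lower half of the ranks, a stand-in for the
       ranks that still own rows on the coarse level. */
    int nsub = (size + 1) / 2;
    int *members = (int *)malloc(nsub * sizeof(int));
    for (int i = 0; i < nsub; i++) members[i] = i;

    MPI_Group world_group, sub_group;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Group_incl(world_group, nsub, members, &sub_group);

    /* Every rank of the parent communicator must make this call ... */
    MPI_Comm subcomm;
    MPI_Comm_create(MPI_COMM_WORLD, sub_group, &subcomm);

    /* ... but excluded ranks get MPI_COMM_NULL, so collective work on
       subcomm must be skipped on them; doing it unconditionally is the
       kind of mistake that produces the hang described above. */
    if (subcomm != MPI_COMM_NULL) {
      int subrank;
      MPI_Comm_rank(subcomm, &subrank);
      printf("world rank %d -> subcomm rank %d\n", rank, subrank);
      MPI_Comm_free(&subcomm);
    }

    MPI_Group_free(&sub_group);
    MPI_Group_free(&world_group);
    free(members);
    MPI_Finalize();
    return 0;
  }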

There are several logical paths through this code. I fixed a bug that looks 
like this one a few weeks ago, but it looks like you have a configuration that 
I have not debugged yet.

It would be very useful if you could give me the lines that each processor is 
hung on.
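
One way to get those per-process locations, assuming a standard PETSc build, 
is PETSc's -start_in_debugger runtime option; a minimal sketch:

  #include <petscsys.h>

  /* Launch as, e.g.:
       mpiexec -n 2 ./app -start_in_debugger gdb
     Each rank opens its own gdb; once the run hangs, interrupt each
     debugger (Ctrl-C) and type "bt" to see the line that rank is
     stuck on. */
  int main(int argc, char **argv)
  {
    PetscErrorCode ierr;
    ierr = PetscInitialize(&argc, &argv, NULL, NULL);
    if (ierr) return ierr;
    /* ... the solve that reproduces the hang goes here ... */
    ierr = PetscFinalize();
    return ierr;
  }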

Mark

> 
> I must be confused about what is actually happening.
