You can avoid lying by seriously tuning UP your code while tuning down your 
competitors' code in testing. We do this all the time when comparing PETSc with 
Trilinos :-) Just kidding. Comparisons are always a dangerous business. 

   Barry

On Aug 16, 2011, at 10:18 PM, Jack Poulson wrote:

> On Tue, Aug 16, 2011 at 9:35 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
> On Aug 16, 2011, at 5:14 PM, Jack Poulson wrote:
> 
> > Hello all,
> >
> > I am working on a project that requires very fast sparse direct solves, and 
> > MUMPS and SuperLU_Dist haven't been cutting it. From what I've read, when 
> > properly tuned, WSMP is significantly faster, particularly with multiple 
> > right-hand sides on large machines. The obvious drawback is that it's not 
> > open source, but the binaries seem to be readily available for most 
> > platforms.
> >
> > Before I reinvent the wheel, I would like to check if anyone has already 
> > done some work on adding it to PETSc. If not, its interface is quite 
> > similar to MUMPS's, and I should be able to mirror most of that code. On the 
> > other hand, there are a large number of platform-specific details that need 
> > to be handled, so keeping things both portable and fast might be a 
> > challenge. It seems that the CSC storage format should be used, since WSMP 
> > requires it for Hermitian matrices.
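> >
> > To make that concrete, here is a minimal sketch of how an external direct 
> > solver is driven through PETSc's factor interface, with MUMPS standing in 
> > for WSMP (a WSMP interface would register its own solver type; routine and 
> > constant names follow the current PETSc API, and error checking is omitted):
> >
> >   #include <petscmat.h>
> >
> >   int main(int argc, char **argv)
> >   {
> >     Mat           A, F;
> >     Vec           b, x;
> >     MatFactorInfo info;
> >     PetscInt      i, n = 100, Istart, Iend, col[3];
> >     PetscScalar   v[3] = {-1.0, 2.0, -1.0};
> >
> >     PetscInitialize(&argc, &argv, NULL, NULL);
> >
> >     /* Assemble a small 1-D Laplacian as a stand-in for the real operator. */
> >     MatCreate(PETSC_COMM_WORLD, &A);
> >     MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
> >     MatSetFromOptions(A);
> >     MatSetUp(A);
> >     MatGetOwnershipRange(A, &Istart, &Iend);
> >     for (i = Istart; i < Iend; i++) {
> >       col[0] = i - 1; col[1] = i; col[2] = i + 1;
> >       if (i == 0)          MatSetValues(A, 1, &i, 2, &col[1], &v[1], INSERT_VALUES);
> >       else if (i == n - 1) MatSetValues(A, 1, &i, 2, col, v, INSERT_VALUES);
> >       else                 MatSetValues(A, 1, &i, 3, col, v, INSERT_VALUES);
> >     }
> >     MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
> >     MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
> >
> >     /* Select the external package; the ordering is left to the package, so
> >        the row/column IS arguments are NULL. A WSMP interface would be
> >        selected with its own (hypothetical) solver type string instead of
> >        MATSOLVERMUMPS. */
> >     MatFactorInfoInitialize(&info);
> >     MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F);
> >     MatLUFactorSymbolic(F, A, NULL, NULL, &info);
> >     MatLUFactorNumeric(F, A, &info);
> >
> >     /* One right-hand side for now. */
> >     MatCreateVecs(A, &x, &b);
> >     VecSet(b, 1.0);
> >     MatSolve(F, b, x);
> >
> >     VecDestroy(&x); VecDestroy(&b); MatDestroy(&F); MatDestroy(&A);
> >     PetscFinalize();
> >     return 0;
> >   }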
> >
> > Thanks,
> > Jack
> 
>  Jack,
> 
>   By all means do it. That would be a nice thing to have. But be aware that 
> the WSMP folks have a reputation for exaggerating how much better their 
> software is, so don't be surprised if, after all that work, it is not much 
> better.
> 
> 
> Good to know. I was somewhat worried about that, but perhaps it is a matter 
> of getting all of the tuning parameters right. The manual does mention that 
> performance is significantly degraded without tuning. I would sincerely hope 
> no one would outright lie in their publications, e.g., in this one: 
> http://portal.acm.org/citation.cfm?id=1654061
>  
>   BTW: are you solving with many right-hand sides? Maybe before you muck with 
> WSMP we should figure out how to get you access to the multiple-right-hand-side 
> support of MUMPS (I don't know if SuperLU_Dist has it) so you can speed 
> up your current computations a good amount? Currently PETSc's MatMatSolve() 
> calls a separate solve for each right-hand side with MUMPS.
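> 
>   For concreteness, the call we'd want to make fast is the flat MatMatSolve() 
> below (a sketch in the same spirit as the one above; names follow the current 
> PETSc API, and B and X must be dense, one column per right-hand side):
> 
>     /* F is a factored Mat and n its global size, as in the earlier sketch.
>        With MUMPS today this loops a separate MatSolve() per column. */
>     Mat      B, X;
>     PetscInt nrhs = 300;             /* hypothetical RHS count */
> 
>     MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, nrhs,
>                    NULL, &B);
>     MatSetRandom(B, NULL);           /* stand-in for the real RHS data */
>     MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, &X);
>     MatMatSolve(F, B, X);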
> 
>   Barry
> 
> 
> I will eventually need to solve against many right-hand sides, but for now I 
> am solving against one and it is still taking too long. In fact, not only 
> does it take too long, but the memory per core also grows for a fixed problem 
> size as I increase the number of MPI processes (with both SuperLU_Dist and 
> MUMPS). This was occurring for quasi-2d Helmholtz problems over a couple 
> hundred cores. The only plausible explanation I have is that each process's 
> communication buffers grow in proportion to the total number of processes, 
> but I stress that this is just a guess. I tried reading through the MUMPS 
> code and quickly gave up.
> 
> Another problem with MUMPS is that it requires the entire set of right-hand 
> sides to reside on the root process... that will clearly not work for a 
> billion degrees of freedom with several hundred RHSs. WSMP gets this part 
> right and actually distributes those vectors.
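> 
> (For scale: 10^9 unknowns times 300 right-hand sides at 8 bytes per real 
> double-precision scalar is already 10^9 * 300 * 8 bytes ~ 2.4 TB of RHS data 
> on that one root process.)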
> 
> Jack

