On Sat, Mar 9, 2019 at 6:28 PM Jack Poulson <[email protected]> wrote:
> Hi Gael,
>
> Interesting! I hadn't run a comparison against CholMod yet (but it is
> obviously the gold standard). There is a slight amount of automatic
> algorithmic switching that needs to be added in
> (https://gitlab.com/hodge_star/catamari/issues/2) before I would assume a
> blackbox comparison across the gamut of matrix types would be competitive.
> Catamari currently implements:
> * Sequential and multithreaded, supernodal right-looking (multifrontal)
>   Cholesky, LDL^T, and LDL^H,
> * Sequential and multithreaded, supernodal left-looking Cholesky, LDL^T,
>   and LDL^H,
> * Sequential scalar left-looking Cholesky, LDL^T, and LDL^H,
> * Sequential scalar up-looking Cholesky, LDL^T, and LDL^H.
>
> At the moment, the first category is used by default, but there are more
> refined categories:
> * For large, reasonably dense, multithreaded environments, the supernodal
>   right-looking approach is preferred.
> * For large, reasonably dense, sequential environments, the supernodal
>   left-looking approach might be preferred.
> * For small/sparse problems, the up-looking approach should be preferred,
>   but there is not yet parallel support for this in catamari (not that it
>   would be difficult).

A parallel simplicial solver would be nice, and would be the first publicly
available implementation AFAIK.

> According to CholMod's publications, it intelligently switches between the
> supernodal left-looking factorization and the scalar up-looking
> factorization (based upon the computational intensity as determined from
> the symbolic factorization). For higher core counts, the supernodal
> right-looking approach becomes critical so that the top of the assembly
> tree is parallelized.

I did not know that CholMod was supposed to be smart in this regard. In my
own experience I always had to explicitly choose between the simplicial and
supernodal modes to get the best performance (and for the simplicial mode I
always use our built-in one, of course!).
> On the subject of Eigen integration: I would defer to Eigen experts on the
> best/most-useful means of integration. If you believe that there would be
> added usefulness in transplanting Catamari's logic into core Eigen (on top
> of its BLAS and matrix classes), I think that could be interesting.

That would definitely be the ideal option, as duplicating code and effort is
rarely a good idea. Of course, it is also more work in the short term.

> I believe that my permissively-licensed AMD implementation (quotient)
> should be roughly equivalent to SuiteSparse's (and, I suppose, Eigen's) in
> terms of performance.

Thanks to Google, we very recently managed to relicense SuiteSparse's AMD as
MPL2. But the code is still horrible to us C++ developers, so yours might
still be very desirable if it is on par in terms of performance and
robustness.

About AMD, I slightly modified our copy to support matrices with explicit
zeros on the diagonal: those must be moved to, or kept at, the bottom-right
corner. A typical use case is equality constraints handled through Lagrange
multipliers.

> A wrapper might be a good first step, but this conversation has gotten me
> thinking that improving the algorithmic switching logic is a priority.

Yes, starting with a wrapper would be a good way to make catamari visible and
easily testable/usable by the Eigen community. If we envision an integration
within Eigen's official modules, though, a large refactoring will be
required.

Gaël.

> With Thanks,
> Jack
>
> On Sat, Mar 9, 2019, at 2:31 AM, Gael Guennebaud wrote:
>
> Hi Jack,
>
> This is a very nice piece of work, congrats! I did one benchmark of
> catamari+MKL on the "ldoor" matrix [1] and got the same speed as
> suitesparse/Cholmod: about 8s on my Haswell quad-core (15s without OpenMP).
>
> Requiring C++14 for new features is not a big deal anymore.
>
> How do you envision this "Eigen/Catamari" module? A simple wrapper, as we
> do with Cholmod/UmfPack/Pastix/Pardiso, or a built-in Eigen-ified version?
> As you probably already figured out, Eigen already provides several
> features redundant with what you developed in Catamari:
> - a BLAS/LAPACK layer with optional compile-time fallback to any
>   BLAS/LAPACKE/MKL,
> - sparse storage,
> - fill-in reorderings: we have AMD and COLAMD, so yours might rather be
>   complementary.
>
> Gaël
>
> [1] https://sparse.tamu.edu/GHS_psdef/ldoor
>
> On Fri, Mar 8, 2019 at 4:59 PM Jack Poulson <[email protected]> wrote:
>
> > Dear Eigen Maintainers,
> >
> > I noticed that Eigen's current documentation only lists a simplicial
> > LDLT sparse-direct solver; I spent my last several months building a
> > header-only C++ (optionally multithreaded) supernodal
> > Cholesky/LDL^T/LDL^H sparse-direct solver (which also implements
> > high-performance Determinantal Point Process sampling with similar
> > techniques) and am curious if it would be seen as a good fit for an
> > unsupported Eigen module. The solver achieves as much as 1 TF in
> > double-precision on my 16-core Skylake workstation.
> >
> > The current project is described here: https://hodgestar.com/catamari/
> > I released v0.1 a few days ago, but I have improved the performance in
> > 'master' substantially since then.
> >
> > My assumption is that the biggest hurdle will be due to the solver
> > (catamari) being implemented in C++14, and that Eigen is currently
> > restricted to C++03. And, while catamari optionally incorporates OpenMP
> > task scheduling for the multithreaded parallelism, this support is only
> > enabled if an appropriate preprocessor directive is defined
> > (CATAMARI_OPENMP).
> >
> > Any advice would be appreciated.
> >
> > Sincerely,
> > Jack Poulson
