On 2020-05-23 14:18, Jed Brown wrote:
Drew Parsons <dpars...@debian.org> writes:

Hi, the Debian project is discussing whether we should start providing a
64 bit build of PETSc (which means we'd have to upgrade our entire
computational library stack, starting from BLAS and going through MPI,
MUMPS, etc).

You don't need to change BLAS or MPI.

I see, the PETSc API allows for PetscBLASInt and PetscMPIInt distinct from PetscInt. That gives us more flexibility. (In any case, the Debian BLAS maintainer is already providing blas64 packages. We've started discussions about MPI).

But what about MUMPS? Would MUMPS need to be built with 64-bit support to work with 64-bit PETSc? (The MUMPS docs indicate that its 64-bit support needs 64-bit versions of BLAS, SCOTCH, METIS and MPI.)

A default PETSc build uses 32 bit addressing to index vectors and
matrices.  64 bit addressing can be switched on by configuring with
--with-64-bit-indices=1, allowing much larger systems to be handled.

My question for petsc-maint is, is there a reason why 64 bit indexing is
not already activated by default on 64-bit systems?  Certainly C
pointers and type int would already be 64 bit on these systems.

Umm, x86-64 Linux is LP64, so int is 32-bit. ILP64 is relatively exotic
these days.

Oh, OK. I had assumed int was 64-bit on x86-64. Thanks for the correction.

Is it a question of performance? Is 32-bit indexing executed faster (in the sense of 2 operations per clock cycle), such that 64-bit addressing is accompanied by a drop in performance?

Sparse iterative solvers are entirely limited by memory bandwidth;
sizeof(double) + sizeof(int64_t) = 16 incurs a performance hit relative
to 12 for int32_t.  It has nothing to do with clock cycles for
instructions, just memory bandwidth (and usage, but that is less often
an issue).
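
The 16-versus-12 figure above can be reproduced with a small sketch in C. It assumes a CSR-style kernel that streams one double value plus one column index per stored nonzero, and it ignores row offsets and vector traffic. The function names are hypothetical and are not part of PETSc.

```c
#include <stddef.h>
#include <stdint.h>

/* Rough model: bytes streamed per stored nonzero, i.e. one double value
 * plus one column index of the given width. Row offsets and vector
 * traffic are deliberately ignored. */
size_t bytes_per_nonzero(size_t index_size)
{
    return sizeof(double) + index_size;
}

/* Ratio of per-nonzero memory traffic with 64-bit indices to the
 * traffic with 32-bit indices. */
double traffic_ratio_64_vs_32(void)
{
    return (double)bytes_per_nonzero(sizeof(int64_t)) /
           (double)bytes_per_nonzero(sizeof(int32_t));
}
```

On platforms where double is 8 bytes, the ratio is 16/12, so under this model a bandwidth-bound kernel moves about one third more data per nonzero with 64-bit indices.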

In that case we'd only want to use 64-bit PETSc if the system being
modelled is large enough to actually need it. Or is there a different
reason that 64 bit indexing is not switched on by default?

It's just about performance, as above.

Thanks Jed. That's a good justification for us to keep our current 32-bit build then, and provide a separate 64-bit build alongside it.

There are two situations in
which 64-bit is needed.  Historically (supercomputing with thinner
nodes), it has been that you're solving problems with more than 2B dofs.
In today's age of fat nodes, it also happens that a matrix on a single
MPI rank has more than 2B nonzeros.  This is especially common when
using direct solvers.  We'd like to address the latter case by only
promoting the row offsets (thereby avoiding the memory hit of promoting
column indices):


An interesting extra challenge.
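
As a rough illustration of what "only promoting the row offsets" could look like, here is a minimal sketch in C, assuming a plain CSR layout. The struct and function names are hypothetical; this is not PETSc's data structure or its planned implementation.

```c
#include <stdint.h>

/* Hypothetical CSR layout (not PETSc's): row offsets are 64-bit so the
 * total number of stored nonzeros may exceed 2^31 - 1, while column
 * indices stay 32-bit because each one only has to address a column. */
typedef struct {
    int32_t        nrows;
    const int64_t *row_ptr;  /* nrows + 1 entries */
    const int32_t *col_idx;  /* row_ptr[nrows] entries */
    const double  *val;      /* row_ptr[nrows] entries */
} csr_mixed;

/* y = A * x for the mixed-width layout above. */
void csr_mixed_matvec(const csr_mixed *A, const double *x, double *y)
{
    for (int32_t i = 0; i < A->nrows; i++) {
        double sum = 0.0;
        /* k runs over nonzero positions, so it must be 64-bit. */
        for (int64_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; k++) {
            sum += A->val[k] * x[A->col_idx[k]];
        }
        y[i] = sum;
    }
}
```

In this layout the per-nonzero traffic stays at 12 bytes, and only the nrows + 1 row offsets pay for the wider type.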

I wonder if you are aware of any static analysis tools that can
flag implicit conversions of this sort:

int64_t n = ...;
for (int32_t i=0; i<n; i++) {

There is -fsanitize=signed-integer-overflow (which generates a runtime
error message), but that requires data to cause overflow at every
possible location.
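
This does not answer the static analysis question, but places where a 64-bit value is narrowed on purpose can at least be made to fail loudly at run time. A minimal sketch in C follows, with a hypothetical helper name that is not taken from PETSc or any other library.

```c
#include <stdint.h>

/* Hypothetical helper: narrow a 64-bit value to 32 bits only if it fits.
 * Returns 0 on success, -1 if the value is out of range for int32_t
 * (in which case *out is left untouched). */
int narrow_to_int32(int64_t value, int32_t *out)
{
    if (value < INT32_MIN || value > INT32_MAX) {
        return -1;
    }
    *out = (int32_t)value;
    return 0;
}
```

Note that a loop such as the one quoted above contains no narrowing for such a helper to guard: the comparison i<n widens i to 64 bits, and the problem only appears when the 32-bit counter itself overflows.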

I'll ask the Debian gcc team and the Science team if they have ideas about this.

