I have started a branch with utilities to help catch/handle these integer overflow issues: https://bitbucket.org/petsc/petsc/pull-requests/389/add-utilities-for-handling-petscint/diff All suggestions are appreciated.

  Barry

> On Nov 16, 2015, at 12:26 PM, Eric Chamberland
> <[email protected]> wrote:
>
> Barry,
>
> I can't launch the code again and retrieve other information, since I am not
> allowed to do so: the cluster has around ~780 nodes and I got a very special
> permission to reserve 530 of them...
>
> So the best I can do is to give you the backtrace PETSc gave me... :/
> (see the first post with the backtrace:
> http://lists.mcs.anl.gov/pipermail/petsc-users/2015-November/027644.html)
>
> And until today, all smaller meshes with the same solver completed
> successfully... (I went up to 219 million unknowns on 64 nodes).
>
> I understand then that there could be some use of PetscInt64 in the actual
> code that would help fix problems like the one I got. I found it is a big
> challenge to track down all occurrences of this kind of overflow in the code,
> due to the size of the systems you have to have to reproduce this problem....
>
> Eric
>
>
> On 16/11/15 12:40 PM, Barry Smith wrote:
>>
>> Eric,
>>
>> The behavior you get with bizarre integers and a crash is not the
>> behavior we want. We would like to detect these overflows appropriately.
>> If you can track through the error and determine the location where the
>> overflow occurs then we would gladly put in additional checks and use of
>> PetscInt64 to handle these things better. So let us know the exact cause and
>> we'll improve the code.
>>
>> Barry
>>
>>
>>
>>> On Nov 16, 2015, at 11:11 AM, Eric Chamberland
>>> <[email protected]> wrote:
>>>
>>> On 16/11/15 10:42 AM, Matthew Knepley wrote:
>>>> Sometimes when we do not have exact counts, we need to overestimate
>>>> sizes. This is especially true
>>>> in sparse MatMat.
>>>
>>> Ok... so, to be sure, am I correct if I say that recompiling petsc with
>>> "--with-64-bit-indices" is the only solution to my problem?
>>>
>>> I mean, no other fixes exist for this overestimation in a more recent
>>> release of petsc, like putting the result in a "long int" instead?
>>>
>>> Thanks,
>>>
>>> Eric
>>>
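
For anyone following the thread, here is a minimal sketch of the kind of check being discussed: do the size arithmetic in a 64-bit intermediate and return an error when the result no longer fits in a 32-bit index, instead of letting it wrap silently. The helper names checked_mult_i32 and checked_add_i32 are hypothetical and are not taken from PETSc or from the pull request above; the sketch also assumes the index type is a 32-bit signed integer, as in a build without --with-64-bit-indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not PETSc API): multiply two 32-bit counts using a
   64-bit intermediate. Returns 0 and stores the product in *out when it fits
   in 32 bits; returns 1 and leaves *out untouched when it would overflow. */
static int checked_mult_i32(int32_t a, int32_t b, int32_t *out)
{
  int64_t wide;

  assert(out != NULL);
  /* The 64-bit product cannot overflow: |a| and |b| are at most 2^31. */
  wide = (int64_t)a * (int64_t)b;
  if (wide > INT32_MAX || wide < INT32_MIN) return 1;
  *out = (int32_t)wide;
  return 0;
}

/* Hypothetical helper (not PETSc API): same idea for a sum, e.g. when
   accumulating an overestimated number of nonzeros row by row. */
static int checked_add_i32(int32_t a, int32_t b, int32_t *out)
{
  int64_t wide;

  assert(out != NULL);
  wide = (int64_t)a + (int64_t)b;
  if (wide > INT32_MAX || wide < INT32_MIN) return 1;
  *out = (int32_t)wide;
  return 0;
}
```

A caller would test the return value at the point where a size estimate is computed and raise a clear error there, so the failure is reported where the overflow happens and not later as a crash with bizarre integers.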
