Someone asked for more details; I'm replying publicly. The code decomposes the domain into many small "components", each characterized by a tuple of 4 (small) integers. We want to use these integers as the key in a C++ map (i.e. a Dict). To simplify things, we calculate a single, unique integer key from these 4 integers, very much like linearizing an array index. As the maximum value of each of the 4 integers is known, this is straightforward. Only a few components need to be handled by a particular node, so this map is very sparsely populated and efficient.
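For readers who want to see the key calculation spelled out, here is a minimal sketch. The extents and all names (kExtent, linear_key, max_key, fits_in_int32, Component, ComponentMap) are made up for illustration and are not taken from the actual code. The key type is deliberately 64-bit, so the sketch itself does not overflow for these extents.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>

// Hypothetical exclusive upper bounds for the 4 integers (illustrative values only).
constexpr std::array<std::int64_t, 4> kExtent = {64, 64, 64, 32000};

// Combine the 4 integers into one key, the same way a 4-d array index
// is linearized: key = ((i0 * n1 + i1) * n2 + i2) * n3 + i3.
std::int64_t linear_key(const std::array<int, 4>& idx) {
    std::int64_t key = 0;
    for (std::size_t d = 0; d < 4; ++d) {
        assert(idx[d] >= 0 && idx[d] < kExtent[d]);
        key = key * kExtent[d] + idx[d];
    }
    return key;
}

// Largest key that can occur: the product of all extents, minus one.
std::int64_t max_key() {
    std::int64_t n = 1;
    for (std::int64_t e : kExtent) n *= e;
    return n - 1;
}

// True if every possible key would also fit into a 32-bit signed integer.
bool fits_in_int32() {
    return max_key() <= std::numeric_limits<std::int32_t>::max();
}

// Sparse storage: only the components handled by this node get an entry.
struct Component { int owner_rank; };  // hypothetical payload
using ComponentMap = std::map<std::int64_t, Component>;
```

With these example extents, max_key() is 8,388,607,999, so fits_in_int32() returns false. A check like this at startup would flag the problem before any key is computed.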
In this case, the maximum key exceeded the capacity of a C++ int. The solution was to use a 64-bit integer instead. (Another option would be to use a C++ tuple, but tuples did not exist in C++ in 2002 when the code was originally developed.)

I realize now that Julia actually would have prevented this issue on Blue Waters, since this is a 64-bit architecture, and Julia uses 64-bit integers there by default. On the other hand, a simulation on a Blue Gene/Q would have exhibited the same issue, and this is a 32-bit architecture.

-erik

On Wed, Jul 13, 2016 at 4:29 PM, Erik Schnetter wrote:

> I'm hoping for a system where integer operations are checked by default.
> If this becomes expensive in a particular function / module, then one can
> there (1) perform respective checks when the function is entered, (2)
> disable further checks inside the function, and (3) (for bonus points) use
> static analysis to prove that the manual checks prevent accidental overflow.
>
> Except for the bonus part, that's how array indices are currently checked,
> and that's how we handle IEEE floating-point conformance.
>
> Anyway -- adding one item to my to-do list: Experiment with a command-line
> flag to Julia that switches on checked integer operations.
>
> -erik
>
> On Wed, Jul 13, 2016 at 4:21 PM, John Myles White wrote:
>
>> This seems more like a use case for static analysis than checked
>> operations to me. The problem IIUC isn't about the usage of
>> high-performance code that is unsafe, but rather that the system was
>> nominally tested, but tested in an imperfect way that didn't cover the
>> failure cases. If you were rewriting this in Rust, it's easy for me to
>> imagine that you would use checked arithmetic at the start until 5 years
>> have passed, then you would decide it's safe and turn off the checks -- all
>> because you had never really tested the obscure cases that only a static
>> analyzer is likely to catch.
>>
>> -- John
>>
>> On Wednesday, July 13, 2016 at 1:07:59 PM UTC-7, Erik Schnetter wrote:
>>>
>>> We have this code <https://einsteintoolkit.org> that simulates black
>>> holes and other astrophysical systems. It's written in C++ (and a few other
>>> languages). I obviously intend to rewrite it in Julia, but that's not the
>>> point here.
>>>
>>> One of the core functions allows evaluating (interpolating) the value of
>>> a function at any point in the domain. That code was originally written in
>>> 2002, and has been used and optimized and tested extensively. So you'd
>>> think it's reasonably bug-free...
>>>
>>> Today, a colleague ran this code on Blue Waters, using 32,000 nodes, and
>>> with some other parameters set to higher resolutions than before. Given the
>>> subject of the email, you can guess what happened.
>>>
>>> Luckily, a debugging routine was active, and caught an inconsistency (an
>>> inconsistent domain decomposition), alerting us to the problem.
>>>
>>> Would Julia have prevented this? I know that everybody wants speed --
>>> and if you are using 32,000 nodes, you want a lot of speed -- but the idea
>>> of bugs that only appear when you are pushing the limits makes me
>>> uncomfortable. So, no -- Julia's unchecked integer arithmetic would not
>>> have caught this bug either.
>>>
>>> Score: Julia vs. C++, both zero.
>>>
>>> -erik
>>>
>>> --
>>> Erik Schnetter
>>> http://www.perimeterinstitute.ca/personal/eschnetter/
