On Thu, Jul 7, 2016 at 8:03 PM, Barry Smith <[email protected]> wrote:
>
> > On Jul 7, 2016, at 7:05 PM, Jeff Hammond <[email protected]> wrote:
> >
> > On Thu, Jul 7, 2016 at 1:04 PM, Matthew Knepley <[email protected]> wrote:
> > On Fri, Jul 1, 2016 at 4:32 PM, Barry Smith <[email protected]> wrote:
> >
> > The DOE SciDAC institutes have supported PETSc linear solver
> > research/code development for the past fifteen years.
> >
> > This email is to solicit ideas for linear solver research/code
> > development work for the next round of SciDAC institutes (which will be
> > a 4 year period) in PETSc. Please send me any ideas, no matter how crazy,
> > on things you feel are missing, broken, or incomplete in PETSc with
> > regard to linear solvers that we should propose to work on. In
> > particular, issues coming from particular classes of applications would
> > be good. Generic "multi physics" coupling types of things are too general
> > (and old :-)) while work for extreme large scale is also out since that
> > is covered under another call (ECP). But particular types of
> > optimizations etc for existing or new codes could be in, just not for the
> > very large scale.
> >
> > Rough ideas and pointers to publications are all useful. There is an
> > extremely short fuse so the sooner the better,
> >
> > I think the suggestions so far are fine, however they all seem to start
> > at the "how", whereas I would prefer we start at the "why". Maybe
> > something like
> >
> > 1) How do we run at bandwidth peak on new architectures like Cori or
> > Aurora?
>
> Huh, there is a how here, not a why?
>

The why is "We need to run at bandwidth peak on new arches". I do not
prescribe the How, just ask for it.

Matt

>
> > Patrick and Rich have good suggestions here. Karl and Rich showed some
> > promising numbers for KNL at the PETSc meeting.
> >
> > Future systems from multiple vendors basically move from 2-tier memory
> > hierarchy of shared LLC and DRAM to a 3-tier hierarchy of fast memory
> > (e.g. HBM), regular memory (e.g. DRAM), and slow (likely nonvolatile)
> > memory on a node.
>
> Jeff,
>
> Would Intel sell me a system that had essentially no regular memory
> (DRAM, which is too slow anyway) and no slow memory (which is absurdly too
> slow)? What cost savings would I get in $ and power usage compared to, say,
> what is going in Theta? 10% and 20%, 5% and 30%, 5% and 5%? If it is a
> significant savings then get the cut down machine; if it is insignificant
> then realize the cost of not using it (the DRAM you paid so little for) is
> insignificant and not worth worrying about, just like cruise control when
> you don't use the highway. Actually I could use the DRAM to store the
> history needed for the adjoints; so maybe it is ok to keep, but surely not
> useful for data that is continuously involved in the computation.
>
> Barry
>
> > Xeon Phi and some GPUs have caches, but it is unclear to me if it
> > actually benefits software like PETSc to consider them. Figuring out how
> > to run PETSc effectively on KNL should be generally useful...
> >
> > Jeff
> >
> > --
> > Jeff Hammond
> > [email protected]
> > http://jeffhammond.github.io/
>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
