I submit it is actually a good amount of additional work and requires real creativity and very good judgment; it is not a good intro or undergrad project, especially for someone without a huge amount of hands-on experience already. Look who had to do the new SpecHPC multigrid benchmark. The last time I checked, Sam was not an undergrad (Senior Scientist, Lawrence Berkeley National Laboratory; cited by 11,194).

I definitely do not plan to involve myself in any brand-new serious benchmarking studies in my current lifetime; doing one correctly is a massive undertaking, IMHO.
> On Jan 22, 2022, at 6:43 PM, Jed Brown <j...@jedbrown.org> wrote:
>
> This isn't so much more or less work, but work in more useful places. Maybe this is a good undergrad or intro project to make a clean workflow for these experiments.
>
> Barry Smith <bsm...@petsc.dev> writes:
>
>> Performance studies are enormously difficult to do well, which is why there are so few good ones out there. And unless you fall into the LINPACK benchmark or hit upon Streams, the rewards of doing an excellent job are pretty thin. Even Streams was not properly maintained for many years; you could not just get it and use it out of the box for a variety of purposes (which is why PETSc has its hacked-up ones). I submit a proper performance study is a full-time job, and everyone always has those.
>>
>>> On Jan 22, 2022, at 2:11 PM, Jed Brown <j...@jedbrown.org> wrote:
>>>
>>> Barry Smith <bsm...@petsc.dev> writes:
>>>
>>>>> On Jan 22, 2022, at 12:15 PM, Jed Brown <j...@jedbrown.org> wrote:
>>>>>
>>>>> Barry, when you did the tech reports, did you make an example to reproduce on other architectures? Like, run this one example (it'll run all the benchmarks across different sizes) and then run this script on the output to make all the figures?
>>>>
>>>> It is documented in https://www.overleaf.com/project/5ff8f7aca589b2f7eb81c579. You may need to dig through the submit scripts etc. to find out exactly.
>>>
>>> This runs a ton of small jobs and each job doesn't really preload, but instead of loops in job submission scripts, the loops could be inside the C code and it could directly output tabular data. This would run faster and be easier to submit and analyze.
>>>
>>> https://gitlab.com/hannah_mairs/summit-performance/-/blob/master/summit-submissions/submit_gpu1.lsf
>>>
>>> It would hopefully also avoid writing the size range manually over here in the analysis script, where it has to match the job submission exactly.
>>>
>>> https://gitlab.com/hannah_mairs/summit-performance/-/blob/master/python/graphs.py#L8-9
>>>
>>> We'd make our lives a lot easier understanding new machines if we put into the design of performance studies just a fraction of the kind of thought we put into public library interfaces.
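For concreteness, here is a minimal sketch of the workflow Jed describes above: the size sweep lives inside the C program and the results come out on stdout as CSV, so the analysis script just reads whatever sizes were actually run instead of hard-coding the range to match the submission script. The triad kernel and the size/repetition choices are placeholders, not the actual PETSc benchmarks.

/* Sketch: size sweep inside the program, tabular (CSV) output to stdout.
 * The triad kernel is a stand-in for the real benchmarks. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double wtime(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return ts.tv_sec + 1e-9 * ts.tv_nsec;
}

int main(void)
{
  printf("size,reps,seconds,MB_per_s\n");           /* header for the analysis script */
  for (size_t n = 1000; n <= 100000000; n *= 2) {   /* the loop lives here, not in the LSF script */
    double *a = malloc(n * sizeof *a), *b = malloc(n * sizeof *b), *c = malloc(n * sizeof *c);
    if (!a || !b || !c) { fprintf(stderr, "allocation failed at n=%zu\n", n); return 1; }
    for (size_t i = 0; i < n; i++) { b[i] = 1.0; c[i] = 2.0; }
    int reps = 50;
    double t0 = wtime();
    for (int r = 0; r < reps; r++)                   /* repeat within one job so the data is preloaded */
      for (size_t i = 0; i < n; i++) a[i] = b[i] + 3.0 * c[i];
    double t = wtime() - t0;
    fprintf(stderr, "# checksum %g\n", a[n / 2]);    /* keep the kernel from being optimized away */
    printf("%zu,%d,%g,%g\n", n, reps, t, 3.0 * n * sizeof(double) * reps / t / 1e6);
    free(a); free(b); free(c);
  }
  return 0;
}

The analysis side then only has to parse the CSV it is given, rather than reconstructing the size list that the submission script happened to use.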