Excellent stuff I can work with here. Thanks! Another thing I was thinking of is that besides full-system simulation, we can do full-system emulation -- no simulation at all. Drop OpenShader into a huge FPGA (or an array of them), and maybe it's not as fast as a full-custom design, but then you have a real damn GPU you can mess with. Need to collect statistics we didn't think of? Drop in some new architectural counters, resynthesize, and go. If you have a good energy model, you can project full-custom energy from the FPGA energy measurements plus the architectural counters. And it's 100% cycle-accurate. Moreover, if we make this work with Verilator, we get a reasonably high-performance GPU simulator straight from our hardware design.
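To make the Verilator angle concrete, here's roughly what the harness plus counters-plus-energy-model idea could look like. Treat it as a sketch only: "Vopenshader_top", the counter port names, and the per-event energy numbers are all invented for illustration, since we haven't pinned down the real interface or the energy model yet.

#include <cstdint>
#include <cstdio>
#include "verilated.h"
#include "Vopenshader_top.h"   // generated by: verilator --cc openshader_top.v

int main(int argc, char** argv) {
    Verilated::commandArgs(argc, argv);
    Vopenshader_top dut;   // hypothetical OpenShader top-level module

    // Hold reset for a few cycles, then let the design run.
    dut.rst = 1;
    for (int i = 0; i < 10; ++i) { dut.clk = 0; dut.eval(); dut.clk = 1; dut.eval(); }
    dut.rst = 0;

    const uint64_t cycles = 1000000;
    for (uint64_t c = 0; c < cycles; ++c) {
        dut.clk = 0; dut.eval();
        dut.clk = 1; dut.eval();
    }

    // Architectural counters exposed as ordinary top-level outputs (made-up names).
    uint64_t instr   = dut.ctr_instr_retired;
    uint64_t sram_rd = dut.ctr_sram_reads;
    uint64_t dram_rd = dut.ctr_dram_accesses;

    // Made-up per-event energy coefficients (nJ); in practice these would come
    // from whatever full-custom energy model we calibrate, possibly against
    // FPGA power measurements.
    double energy_nj = 0.05 * instr + 0.5 * sram_rd + 20.0 * dram_rd;

    printf("cycles=%llu  instructions=%llu  projected energy=%.3f mJ\n",
           (unsigned long long)cycles, (unsigned long long)instr,
           energy_nj * 1e-6);
    dut.final();
    return 0;
}

The nice part is that the counter-plus-coefficients arithmetic should work the same whether the counters come out of a Verilated model or get read back from the actual FPGA; only the "run for N cycles" part changes.
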
The big outstanding question for me is how writing our own simulator (e.g. adding various GPU ISAs to MARSS and making some other necessary changes) would differ from integrating GPGPU-Sim with MARSS. Adding various system overheads like PCIe isn't revolutionary (rough sketch below). Going fusion-style isn't a big deal either. So what is it that we can (or would) do that they can't (or won't)? A case can be made for full-system simulation with all the interconnects, shared memory spaces, network-on-chip, and so on. But can we be truly innovative in this space?
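To be concrete about the PCIe point: a first-cut overhead model is basically a fixed per-transfer latency plus a bandwidth term charged to every host<->device copy. The struct name and constants below are made up, nothing here is measured:

#include <cstdint>
#include <cstdio>

// Rough per-transfer cost model for a host<->device copy over PCIe.
struct PcieLinkModel {
    double per_transfer_latency_ns;  // DMA setup, doorbell write, completion interrupt
    double bytes_per_ns;             // effective bandwidth after protocol overhead

    // Core-clock cycles to charge for moving 'bytes' at 'ghz'.
    uint64_t transfer_cycles(uint64_t bytes, double ghz) const {
        double ns = per_transfer_latency_ns + (double)bytes / bytes_per_ns;
        return (uint64_t)(ns * ghz);
    }
};

int main() {
    // Ballpark PCIe 2.0 x16-era numbers; would need calibration against real hardware.
    PcieLinkModel pcie{1000.0, 6.0};
    printf("64 KiB copy ~ %llu cycles at 2 GHz\n",
           (unsigned long long)pcie.transfer_cycles(64 * 1024, 2.0));
    return 0;
}

Anyone can bolt something like that onto an existing simulator in an afternoon, which is exactly why it isn't a differentiator by itself.
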
On Mon, Apr 15, 2013 at 12:28 PM, hozer <[email protected]> wrote:
> On Mon, Apr 15, 2013 at 11:18:53AM -0400, Timothy Normand Miller wrote:
> > http://www.gpgpu-sim.org
> >
> > I'm not sure when this stuff popped up, although the MICRO-45 tutorial
> > appeared after I put the first set of students on it. Anyhow, GPGPU-Sim
> > is pretty sophisticated now, with native NVIDIA ISA support, more
> > accurate timing models, and even a power model. If we're going to get
> > funding to adapt MARSS to support GPUs, then we have some serious
> > competition. Not sure yet how to distinguish ourselves.
>
> GPGPU-Sim is under a BSD-style license. It only does one microarchitecture.
>
> Take a look at http://www.fusionsim.ca/home
>
> I cannot boot an OS and run a workload on a cycle-accurate simulator.
>
> I have booted an OS and debugged the whole software stack of hardware
> for which the silicon was still at the fab with QEMU+MARSS.
>
> If you are talking about power, you MUST be able to include what the
> operating system does, and how it handles interrupts, etc. It is an
> extremely powerful validation tool to be able to boot Linux on your
> simulator, then get to your computation kernel, *then* turn on the
> cycle-accurate sim mode. (Some combination of MARSS+QEMU+PTLsim does
> this.)
>
> > But where we still have the edge is in having our own hardware
> > implementation. Somehow, we need to leverage that, and I'm feeling a
> > sense of urgency -- I don't want some other better-funded group to
> > steal our thunder on this one.
>
> +1
>
> So... uh, how about a project on http://coinfunder.com/ or
> https://bitcoinstarter.com/faq ;)
>
> > Anyhow, so I'm reading through the tutorials and looking for
> > limitations we can overcome. Either we'll make our own, or we'll join
> > them for simulation. But I think it's hardware (as a design at least,
> > in gate-level simulation) where we need to focus, at least for the
> > moment. Having a functionally verified design is excellent motivation
> > to take us seriously.
>
> You have a pretty good roadmap to get to a gate-level design.
>
> What I think might have a much broader benefit, however, is a *device
> functional* model, so we can boot an OS in QEMU and work on the
> whole-system interactions.
>
> This is where the current GPGPU world falls flat on its face. The vendor
> software provided to access AMD and NVIDIA GPUs has all kinds of bugs,
> and is getting in the way of the potential of the microarchitecture.
>
> If someone wants an easy paper, take this:
>
> http://static.usenix.org/event/usenix04/tech/sigs/full_papers/benjegerdes/benjegerdes_html/usenix-ib-04.html
>
> and replace every mention of 'InfiniBand' with GP-GPU, and then detail
> all the issues with the vendor proprietary drivers and software stacks
> that are the gatekeepers to actually accessing the hardware.
>
> Case in point: I have a teraflop of double-precision GPU on my desk that
> draws under 300 watts, but I have to run X Windows on a system that I
> really just want a text console on, just to be able to find out what the
> temperature, clock speed, and voltage of the GPU are.

-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
