Victor Eijkhout <[email protected]> writes:

> On Jan 2, 2014, at 10:50 AM, Jed Brown <[email protected]> wrote:
>
>> I find simple demonstrations as unconvincing as most patents. 99% of
>> the work remains in extending the idea into something practical. It may
>> or may not pan out, but we can't say anything from the simple
>> demonstration alone.
>
> Maybe you and I disagree on what I'm demonstrating. My goal was to
> show that my notion of parallelism generalizes MPI & tasking
> notions. Not that I have a better notation for VecScatters.
My impression is that your transformation recognizes a common pattern of
communication into temporary buffers, followed by computation, followed by
post-communication, and puts a declarative syntax on it (with callbacks for
the computation). The same operations can also be written imperatively, and
I'm not seeing the profound advantage of converting to your callback system.

> And from this demonstration we can definitely say something: namely
> that I've shown how one API can address multiple types of
> parallelism. That's more than any other system I know of.

There are systems like AMPI that run MPI programs on top of threads, and the
MPI implementations already optimize shared-memory communication. If you
want separate work buffers per thread, those systems will give you a single
abstraction. But the reason people want native interfaces to threads is so
that they can use large shared data structures and techniques like
cooperative prefetch. Your abstraction is not uniform if you need to index
into owned parts of shared data structures or perform optimizations like
cooperative prefetch. If you're going to use separate buffers, why is a
system like MPI not sufficient? What semantics does your abstraction provide
for hybrid distributed/shared memory that imperative communication systems
cannot?

> But let's be constructive: I want to use this demonstration to get
> funding. NSF/DOE/Darpa, I don't know. Now if you can't say anything
> from this simple demonstration, then what would convince you as a
> reviewer?

Make a precise and concrete (falsifiable) statement about what semantics
your system can provide that others cannot. Show examples that are hard to
express with existing solutions (such as MPI) but are cleanly represented by
your system. Be fair and show examples of the converse if they exist (silver
bullets are rare).

>> another layer of callbacks
>
> If you have mentioned that objection before it escaped my attention.

See earlier message on "Callback Hell".

> Yes, I agree that in that respect (which has little to do with the
> parallelism part) my demonstration is not optimal. The unification of
> MPI & tasks is going too far there. For MPI it would be possible to
> have calls like VecScatterBegin/End and instead of a callback just
> have the local node code in place. For task models that is not
> possible (afaik). See for instance Quark, where each task contains a
> function pointer and a few data pointers.

Quark does this because it is a dynamic scheduler. (When people compare,
static schedules are usually as good or better, though if the dependency
graph changes from run to run, you may still want to write it as a DAG and
have callbacks.) But your model is not a general DAG and the execution model
is BSP, so it's not clear what is gained by switching to callbacks.
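To make the imperative pattern concrete, here is a minimal sketch (not from
the thread) of pre-communication into a ghosted buffer, local computation
written in place with no callback, and an optional reverse scatter, using
PETSc's VecScatter API. The vector names, the assumption that owned entries
occupy the first n slots of the ghosted buffer, and the averaging update are
illustrative only; error checking is omitted for brevity.

    #include <petscvec.h>

    /* Sketch: the imperative communicate/compute/communicate pattern.
       Assumption: 'ctx' scatters owned+ghost values of xglobal into xlocal. */
    PetscErrorCode LocalUpdateSketch(Vec xglobal, Vec xlocal, VecScatter ctx)
    {
      const PetscScalar *ghosted;
      PetscScalar       *owned;
      PetscInt           i, n;

      /* pre-communication: fill the local (ghosted) work buffer */
      VecScatterBegin(ctx, xglobal, xlocal, INSERT_VALUES, SCATTER_FORWARD);
      VecScatterEnd(ctx, xglobal, xlocal, INSERT_VALUES, SCATTER_FORWARD);

      /* computation: the local node code sits in place, no callback.
         Illustrative assumption: owned entries are the first n slots of xlocal. */
      VecGetLocalSize(xglobal, &n);
      VecGetArrayRead(xlocal, &ghosted);
      VecGetArray(xglobal, &owned);
      for (i = 0; i < n; i++) owned[i] = 0.5*(owned[i] + ghosted[i]);
      VecRestoreArray(xglobal, &owned);
      VecRestoreArrayRead(xlocal, &ghosted);

      /* post-communication: if ghost contributions had to be accumulated
         back to their owners, a reverse scatter with ADD_VALUES would go here. */
      return 0;
    }

A callback-based system would instead register the loop body as a function
pointer and let the runtime drive the exchange, which is the contrast being
discussed above.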
