Hi Amir,

Yes, if there are N load-balanced filters and they are mapped to N cores, then one would expect hardware pipelining to perform similarly to software pipelining.
However, there is still a difference in the buffering strategy. Hardware pipelining relies on some implementation of a FIFO queue, where filters block upon reading from an empty queue or writing to a full queue. Software pipelining does not need these checks, since the buffering is established during the prolog schedule and synchronization is enforced with some kind of barrier at iteration boundaries.

The performance implications of this difference in buffering depend on the cost of implementing a blocking queue versus the cost of filter synchronization on a given architecture. For programs with a large amount of computation (relative to communication), this difference is unlikely to be significant.

If other members of the StreamIt team have anything to add, please feel free to chime in.

-Bill

On Fri, Oct 31, 2008 at 11:32 AM, Amir Hossein Hormati <[EMAIL PROTECTED]> wrote:
> Hi all,
> I have a question about hardware and software pipelining (as they are
> described in the ASPLOS'06 paper). Consider a scenario in which each filter is
> mapped to a separate physical core (assume this mapping is load-balanced). In
> that case, is software pipelining the same as hardware pipelining? Basically,
> in that case the issue of finding contiguous sets of filters for performing
> efficient hardware pipelining does not exist.
>
> Thanks,
> --
> Amir Hormati
>
> _______________________________________________
> StreamIt-users mailing list
> [email protected]
> https://lists.csail.mit.edu/mailman/listinfo/streamit-users
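To make the buffering difference in the reply concrete, here is a minimal sketch of a two-filter pipeline written both ways. It is in Python rather than StreamIt, and it is not how the StreamIt compiler implements either scheme. All names (`run_fifo`, `run_barrier`, `upstream_work`, `downstream_work`) are hypothetical, and the double buffer is just one assumed way of realizing "buffering established during the prolog". Python threads also do not run in parallel on separate cores, so this only illustrates the synchronization structure, not the performance.

```python
import queue
import threading


def upstream_work(i):
    return i * i      # hypothetical stand-in for the first filter


def downstream_work(x):
    return x + 1      # hypothetical stand-in for the second filter


def run_fifo(n, capacity=2):
    """Hardware-pipelining style: a bounded, blocking FIFO between filters."""
    fifo = queue.Queue(maxsize=capacity)
    out = []

    def upstream():
        for i in range(n):
            fifo.put(upstream_work(i))               # blocks while the queue is full

    def downstream():
        for _ in range(n):
            out.append(downstream_work(fifo.get()))  # blocks while the queue is empty

    threads = [threading.Thread(target=upstream),
               threading.Thread(target=downstream)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


def run_barrier(n):
    """Software-pipelining style: a prolog fills the buffer, then both filters
    run one iteration apart and meet at a barrier at each iteration boundary."""
    if n == 0:
        return []
    slots = [None, None]              # double buffer (an assumption of this sketch)
    out = []
    barrier = threading.Barrier(2)

    def upstream():
        for i in range(n):
            slots[i % 2] = upstream_work(i)          # no "is it full?" check
            barrier.wait()                           # iteration boundary

    def downstream():
        barrier.wait()                               # prolog: upstream runs iteration 0 alone
        for i in range(n):
            out.append(downstream_work(slots[i % 2]))  # no "is it empty?" check
            if i < n - 1:
                barrier.wait()                       # iteration boundary

    threads = [threading.Thread(target=upstream),
               threading.Thread(target=downstream)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    print(run_fifo(5))
    print(run_barrier(5))
```

Both functions produce the same output. What differs is where the synchronization cost sits: on every `put`/`get` in the FIFO version, and once per iteration at the barrier in the software-pipelined version. This is the tradeoff the reply says depends on the target architecture.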
