Hi Amir,

Yes, if there are N load-balanced filters and they are mapped to N
cores, then one would expect hardware pipelining to perform similarly
to software pipelining.

However, there is still a difference in the buffering strategy.
Hardware pipelining relies on some implementation of a FIFO queue,
where filters block upon reading from an empty queue or writing to a
full queue.  Software pipelining does not need these checks, since the
buffering is established during the prolog schedule and
synchronization is enforced with some kind of barrier at iteration
boundaries.  The performance implications of this difference in
buffering depend on the cost of implementing a blocking queue versus
the cost of filter synchronization on a given architecture.  For
programs with a large amount of computation (relative to
communication), this difference is unlikely to be significant.
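
As a rough illustration of the two buffering strategies, here is a toy
Python sketch. It is not how the StreamIt compiler generates code: the
two filters (stage_a, stage_b), the function names, the queue capacity,
and the double buffer are all hypothetical, and Python threads stand in
for cores. run_fifo joins the filters with a bounded blocking queue;
run_barrier fills the buffer in a prolog and then synchronizes only at
iteration boundaries.

```python
import queue
import threading


def stage_a(x):
    """Hypothetical upstream filter; any pure function would do."""
    return x + 1


def stage_b(x):
    """Hypothetical downstream filter."""
    return x * 2


def run_fifo(inputs, capacity=2):
    """Hardware-pipelining style: a bounded blocking FIFO between filters."""
    fifo = queue.Queue(maxsize=capacity)
    done = object()  # sentinel marking end of stream
    out = []

    def producer():
        for x in inputs:
            fifo.put(stage_a(x))  # blocks while the queue is full
        fifo.put(done)

    def consumer():
        while True:
            item = fifo.get()  # blocks while the queue is empty
            if item is done:
                break
            out.append(stage_b(item))

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


def run_barrier(inputs):
    """Software-pipelining style: prolog, then barrier per iteration."""
    n = len(inputs)
    if n == 0:
        return []
    out = []
    buf = [None, None]  # double buffer shared by the two filters
    barrier = threading.Barrier(2)

    # Prolog: run the upstream filter once so the downstream filter
    # already has input in the first steady-state iteration.
    buf[0] = stage_a(inputs[0])

    def filter_a():
        for i in range(1, n):
            buf[i % 2] = stage_a(inputs[i])  # no full/empty check
            barrier.wait()  # iteration boundary

    def filter_b():
        for i in range(1, n):
            out.append(stage_b(buf[(i - 1) % 2]))  # previous iteration's item
            barrier.wait()  # iteration boundary
        # Epilog: drain the last buffered item.
        out.append(stage_b(buf[(n - 1) % 2]))

    threads = [threading.Thread(target=filter_a),
               threading.Thread(target=filter_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Both functions compute the same results. The difference is where the
synchronization cost lands: on every put/get in run_fifo, versus once
per steady-state iteration in run_barrier.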

If other members of the StreamIt team have anything to add, please
feel free to chime in.

-Bill

On Fri, Oct 31, 2008 at 11:32 AM, Amir Hossein Hormati
<[EMAIL PROTECTED]> wrote:
> Hi all,
> I have a question about hardware and software pipelining (as they are
> described in the ASPLOS'06 paper). Consider a scenario in which each filter is
> mapped to a separate physical core (assume this mapping is load-balanced). In
> that case, is software pipelining the same as hardware pipelining? Basically,
> in that case the issue of finding contiguous sets of filters for performing
> efficient hardware pipelining does not exist.
>
> Thanks,
> --
> Amir Hormati
>
> _______________________________________________
> StreamIt-users mailing list
> [email protected]
> https://lists.csail.mit.edu/mailman/listinfo/streamit-users
>
>
