Some number of ages ago, I posted some things to the list where I described how to handle all of this. We'll have to dig it out of the archive. And some of this may have ended up on the wiki, but I'm not sure.
In short, I described a method for merging the pipeline segment (a segment being a set of stages) with the fifo such that (a) the pipeline can't 'overfill' and lose anything, (b) it 'packs' bubbles, meaning that earlier stages can progress even when later stages are busy as long as there are bubbles in the pipeline, and (c) the busy signal handling is kept from becoming a critical path (with some careful planning).

I described generic modules called "HEADER", "FOOTER", and two others that can perform processing: one that is a simple pipeline stage, and another that can communicate with external logic (such as reading/writing a FIFO).

Each pipeline stage has a 'busy' or 'valid' bit that indicates that something is IN that pipeline stage. It means that the next stage should accept the output if it can; if it can't, earlier stages must wait. The last stage of a segment pays attention only to the busy input from the next segment (unless it performs some multi-step process). The second-to-last stage pays attention to that busy input and to the valid status of the last stage. As you move backward, the number of inputs a stage needs in order to decide whether or not to do something grows, and it eventually reaches a point where it can become a critical path. At that point, you break the pipeline into another segment.

The FOOTER's job is basically just to interpret the signals from a subsequent segment. The HEADER's job is to break the combinatorial path between all of the busy signals in one segment and those in another by making the segment's busy signal registered. To do this, it acts as a queue that can hold 0 or 1 entries, multiplexing between either the input from the previous segment (if this segment is not busy) or a registered copy of an earlier input (if the segment is busy and therefore holding something). Due to the multiplexing, you typically want to put a DUMMY stage after a HEADER. All it does is remove the multiplexing from the critical paths.
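To make the busy/valid rules a bit more concrete, here is a rough cycle-level model in Python. It is not the source code mentioned in this mail, just a minimal sketch under my own assumptions: `None` stands for a bubble, and the names `step_segment` and `step_header` are hypothetical. The backward loop is the busy chain that gets longer the further back a stage sits, and `step_header` models the 0-or-1 entry holding register.

```python
from typing import List, Optional, Tuple

Slot = Optional[int]  # None models a bubble (valid bit clear)


def step_segment(stages: List[Slot], next_busy: bool,
                 offered: Slot) -> Tuple[List[Slot], Slot, bool]:
    """Advance a segment by one clock.

    Returns (new_stages, output, accepted), where `accepted` says whether
    `offered` was taken into the first stage this cycle.
    """
    n = len(stages)
    moves = [False] * n
    room = not next_busy  # can whatever follows stage i take data?
    # Walk backward from the last stage: each stage looks at the busy
    # input plus the valid bits of every stage after it.
    for i in reversed(range(n)):
        moves[i] = stages[i] is not None and room
        room = stages[i] is None or moves[i]  # empty, or emptying now
    accepted = room and offered is not None

    new_stages: List[Slot] = []
    for i in range(n):
        if i == 0:
            arriving = offered if accepted else None
        else:
            arriving = stages[i - 1] if moves[i - 1] else None
        if arriving is not None:
            new_stages.append(arriving)   # take data from the stage before
        elif moves[i]:
            new_stages.append(None)       # moved on, leaving a bubble
        else:
            new_stages.append(stages[i])  # hold (or stay empty)
    output = stages[-1] if moves[-1] else None
    return new_stages, output, accepted


def step_header(held: Slot, incoming: Slot, accepted: bool) -> Slot:
    """Next value of the HEADER's one-entry holding register.

    The registered busy seen by the previous segment is `held is not None`,
    so the previous segment only presents `incoming` when `held` is None.
    """
    offered = held if held is not None else incoming
    return None if accepted else offered
```

For example, with `stages = [1, None, 2]`, a busy next segment, and `3` offered, `step_segment` returns `[3, 1, 2]` with no output: the bubble is packed and nothing is lost. On the following cycle nothing can move, so a newly offered item is not accepted and `step_header` keeps it until the next segment stops being busy.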
The nice thing about HEADER and FOOTER stages is that they "speak fifo language", which means that it is trivial to bolt a fifo onto the end of a pipeline segment, because a pipeline segment looks just like a fifo. It has exactly the same sorts of external control signals.

I provided source code for all of this. We'll have to dig it out. So far, I haven't been able to get Google to find it.

A stage that performs a multi-step process (like what we'll have to do with textures in certain modes) will basically function as a pipeline segment all its own, managing a registered busy signal for prior stages. That busy signal will be a combination of state machine states and busy signals from subsequent stages.

On 3/4/09, Kenneth Ostby <[email protected]> wrote:
> In the series on architectural / performance rants.
>
> The last week I committed some basic verilog skeleton files which marks
> the beginning of the rasterization module of the 3D engine. However, it
> was one problem which struck me with the implementation of the logic
> itself, namely how to solve pipeline stalls. A good example would be in
> the horizontal rasterizer[1], where you can see that I have divided it
> up into 3 basic stages, corresponding to the calculations found in the
> new_model code. Basically:
>
> 1) adjustment = some math
> 2) initial point = more math * adjustment
> 3) for each step in width:
>      calculate values for step.
>
> Now, the two first stages are easy, the cycle count from the entry of
> data into stage 1 until it's ready for output in stage 2 is fixed, and
> mainly depends on the latency introduced by the floating point
> operations involved. However, the problem comes with stage 3. Since
> stage 3 logically involves a loop over the width, it might have to stall
> the pipeline while waiting for the while loop to finish processing. Thus
> the question arise, what are we going to do with the data already
> introduced into the pipeline?
>
> The naive solution to the problem/challenge is to introduce a queue at
> the end of each stage. This means that the depth of the queue has to be
> at least the same number as the cycles used by the stage itself. This is
> explained by the case where we have filled the entire pipeline of the
> floating point module and we encounter a pipeline stall. In that case we
> have to be able to store all of the output generated by the floating
> points modules, since the FP modules have no mechanisms for stalling
> themselves. Then the stage has to stall, not being able to accept any
> data, nor processing anything until the stall ends.
>
> My major problem with this solution is that it, as far as I can see, in
> the case of a stall will introduce a latency in terms of startup costs.
> If the queues are full, and if we encounter a stall from the next step
> in the pipeline we cannot accept new incoming data either, since we
> cannot guarantee that we will have space in the outgoing queue for the
> results. Hence we will have a startup cost in terms of cycles equal to
> the cycle count data uses through the pipeline step.
>
> Anybody have some good solutions? The easy answer is to say that it
> doesn't cost that much compared to feature Y, but it feels a bit like
> cheating. Other solution is to, in every pipeline step, incorporate a
> "store/do not continue" flag and store the output in a register local to
> the step.
>
> If you read through this entire mail,
> Thank you for your attention :)
>
> Regards,
> Kenneth
>
>
> [1] http://langly.org/og/rastHori.png
>
>
>
> --
> Life on the earth might be expensive, but it
> includes an annual free trip around the sun.
>
> Kenneth Østby
> http://langly.org
>
> _______________________________________________
> Open-graphics mailing list
> [email protected]
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
>
>

--
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
