On 9/4/07, Patrick McNamara <[EMAIL PROTECTED]> wrote:

> > Much worse than that.  The requests have to cross the bridge from the
> > XP10 to the 3S4000, through some other processing logic, through the
> > memory controller, and back.
> >
>
> How much worse?

Not sure.  But good enough to make PCI accesses plenty fast.

> My concern is that since we don't have a whole lot of
> scratch RAM to play with in the XP10, we are going to do a pretty
> sizable number of reads from main card memory.

We don't really have to do the translation in less than the time of
one video frame.

> > I think we don't need interrupts.  I think we need to put subroutine
> > calls at appropriate places in the nanocontroller code to a routine
> > that polls for an outstanding PCI transaction and handles it.  Is it a
> > problem that we'd impose significant delays (and many retries on the
> > PCI bus)?  No more than already happens if you try to read the
> > framebuffer at the same time the video controller is reading a chunk.
> >
> >
>
> This would kill PCI bus throughput and/or really kill our update rate.

It won't do much to our update rate, which isn't very important
anyway.  And it'll hurt PCI throughput only in a situation where we
don't care too much about that either!

And like I said, video itself has a significant impact, and yet it
doesn't seem like a big deal in practice.  If you do a PCI read
of the graphics memory, and the video controller has started reading a
burst, you have to wait for that burst to finish.

> The card has 16 bus clock cycles to be ready to handle data from the
> time the PCI master initiates a transfer.

After which it terminates with retry, and it can do that any number of
times.  (There's a limit of like 32k or 64k cycles or something.)

> Assuming the nanocontroller
> doesn't poll in time to see the transaction waiting, then the card has to
> issue a retry and the PCI master has to give up the bus and wait for its
> next time slice.  In this case, the card will be required to memorize
> the transaction and be ready to respond promptly if the master retries.

When doing delayed transactions, it's suggested that the target
remember the request.  But it isn't strictly necessary.  Of course, in
our case, we'd better, because otherwise, we might miss it the next
time.
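
To make the "remember the request" idea concrete, here's a minimal
sketch of a target that latches a read, answers with retry until the
data has been fetched, and then completes the repeated request.  It's a
toy model, not the PCI state machine: the class and method names are
made up for illustration, and memory is just a dict.

```python
class DelayedReadTarget:
    """Toy model of a target that memorizes one outstanding read."""

    def __init__(self, memory):
        self.memory = memory        # stand-in for card memory (assumption)
        self.latched_addr = None    # address of the memorized request
        self.data = None            # fetched data, once available

    def pci_read(self, addr):
        """Master attempts a read; returns ("RETRY", None) or ("DATA", value)."""
        if self.latched_addr == addr and self.data is not None:
            value = self.data
            self.latched_addr = None    # request completed, forget it
            self.data = None
            return ("DATA", value)
        if self.latched_addr is None:
            self.latched_addr = addr    # memorize the request for later
        return ("RETRY", None)

    def poll(self):
        """Called by the controller at a safe point to service the request."""
        if self.latched_addr is not None and self.data is None:
            self.data = self.memory[self.latched_addr]


target = DelayedReadTarget({0x10: 0xAB})
first = target.pci_read(0x10)    # nothing fetched yet, so the master is told to retry
target.poll()                    # controller gets around to fetching the data
second = target.pci_read(0x10)   # the repeated request now completes
```

The point of latching the address is that the retried read can complete
as soon as it shows up, even if the poll that fetched the data happened
long before.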

> This means we effectively get one bus transaction every 25 or so PCI
> clock cycles at best.  This drops our throughput to around 1.2M reads or
> writes per second.    That sounds like a bunch but unfortunately every
> read and write to VGA memory is a separate PCI transaction.  It also
> means that every time a read or write is done to VGA memory the PCI
> bus is held busy (but idle) for at least 16 cycles.

How about we put a gap between frames where we do nothing but poll?

But I also would like to see an analysis that suggests something will
be terribly hurt by us having slow performance in VGA mode.
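
As a starting point for that analysis, here's a back-of-the-envelope
check of the throughput numbers quoted above.  It's only a sketch: it
assumes a nominal 33.33 MHz PCI clock and one data word per
transaction, and the function name is made up.  At exactly 25 clocks
per transaction it comes out at about 1.33M per second; the quoted
figure of around 1.2M corresponds to roughly 28 clocks.

```python
PCI_CLOCK_HZ = 33_333_333  # nominal 33 MHz PCI clock (assumed)


def transactions_per_second(clocks_per_transaction, pci_clock_hz=PCI_CLOCK_HZ):
    """Upper bound on single-word transactions per second."""
    return pci_clock_hz / clocks_per_transaction


# Compare the bare 16-clock window, the quoted 25 clocks, and 28 clocks.
for clocks in (16, 25, 28):
    rate = transactions_per_second(clocks)
    print(f"{clocks} clocks/transaction: {rate / 1e6:.2f}M transactions/s")
```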

>
> The solution is, of course, to call the PCI check frequently enough that
> we can claim every PCI transaction.  But, we also have to be ready to
> accept or transfer data within that same 16 clock window.

Why are you so concerned about the 16-cycle window?  We blow that all the time.

> Assuming a
> 200MHz

We MIGHT manage 100MHz.

> nanocontroller clock and a 33MHz PCI clock we would have a total
> of 96 nanocontroller clock cycles in which to claim the transaction and
> have data ready to send or be ready to receive.
>
> We run into a further problem since the PCI spec requires that if a
> target generally can't respond in time to a request that it memorize
> that request so that it can respond in the necessary window on the next

This isn't really a problem.  I mean, yeah, it's a problem from a
kernel perspective in that we don't want to hog the CPU, and we'll be
doing that.  But DMA transactions (from other devices) can still go on
in the gaps, at least.
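
Going back to the cycle budget for a moment, here's a quick sketch of
how the 16-clock window translates into nanocontroller cycles.  It
assumes a nominal 33.33 MHz PCI clock, and the helper name is
hypothetical.  At 200MHz you get the 96 cycles mentioned above; at
100MHz it's half that.

```python
def nano_cycles_in_window(nano_clock_hz, pci_clock_hz=33_333_333,
                          window_pci_clocks=16):
    """Whole nanocontroller cycles that fit in the PCI response window."""
    return window_pci_clocks * nano_clock_hz // pci_clock_hz


budget_200 = nano_cycles_in_window(200_000_000)  # the quoted 200MHz case
budget_100 = nano_cycles_in_window(100_000_000)  # the more realistic 100MHz case
```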

> call.  This means that if we miss a PCI read transaction, we still have
> to get the data if possible.  This could cause problems if we have other
> queued memory requests unless we have separate memory pipelines for PCI
> and VGA nanocontroller memory requests.  We have to be able to determine
> whether a data value coming from a memory read is for a prior request by
> the VGA code or by the PCI code.

My suggestion is to check for PCI requests only at those points where
we have no pending memory requests.  There are a significant number of
those points.  We can check between every pair of characters when
translating 80x25.  Then between frames, we can poll a few thousand
times before going on to the next.
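
Roughly, the schedule I have in mind looks like the sketch below.  This
is illustration only, not nanocontroller code: the function names are
made up, the inter-frame poll count is a placeholder for "a few
thousand", and translate_char is assumed to finish all of its memory
requests before returning, so every poll happens with nothing pending.

```python
COLS, ROWS = 80, 25
INTER_FRAME_POLLS = 4000  # "a few thousand"; the exact number is an assumption


def translate_frame(translate_char, poll_pci):
    """Translate one 80x25 text frame, polling for PCI work at safe points.

    Returns the number of polls performed during the frame.
    """
    polls = 0
    total = COLS * ROWS
    for index in range(total):
        translate_char(index)       # no memory requests pending after this returns
        if index < total - 1:       # poll between each pair of characters
            poll_pci()
            polls += 1
    for _ in range(INTER_FRAME_POLLS):  # the gap between frames: nothing but polling
        poll_pci()
        polls += 1
    return polls


# Count how many polling opportunities one frame gives us.
translated = []
polls_per_frame = translate_frame(translated.append, lambda: None)
```

With these placeholder numbers that's 1999 polls inside the frame plus
4000 in the gap, so a pending PCI transaction never waits longer than
one character's worth of translation work.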


-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
