Re: FB-Driver-HOWTO

Andreas Beck Sun, 10 Oct 1999 11:32:13 -0700
> > But of course I can write up some theory about mode calculation.
> Thats what I need. I also need to get down what exactly all the clocks are
> and what they do. This will help for future fb driver writers.

O.K. - here we go. Steve - could you put that somewhere as kind of
"mode calculation howto - with examples" ?

How to calculate video modes:
-----------------------------

1. A look from the monitor's point of view:

To understand how modeline calculation works, we should first have a look at
how CRTs (cathode ray tubes - the main part of common monitors) work.

Basically a CRT builds a picture sequentially from the data it gets on its
input lines. To achieve this, the cathode ray scans over the screen in a
kind of "zig-zag" pattern that covers the whole visible part of the screen
once per frame.
This motion is so quick that the eye is unable to follow it, and we see a
steady picture.

The zig-zag pattern I talked about needs a closer look:
It would be pretty inconvenient for data ordering if the ray really went
left->right, then right->left. Thus the real movement is always
left->right, with a quick jump to the left again at the end.
The same goes for the top->bottom motion, that of course happens much more
slowly, as the ray draws line by line.

Obviously, the displayed data needs to be synchronized with the current
position of the CRT to be able to build a steady picture.

This is what the HSYNC and VSYNC signals do. On PC monitors, there are
two hardware lines that say "move the ray to the left now" and "move the ray
to the top now". (Some systems encode that information e.g. in the green
channel, which is called SyncOnGreen, but that doesn't change the
principle).

From the monitor's perspective, all information about a mode it gets is
contained in the frequency with which those signals return. These
frequencies are called the horizontal and vertical frequency (the latter 
is also often called refresh rate, as it determines how often per second a
whole picture gets drawn).

That means: The monitor has no idea about the other properties of a mode,
like clock, depth, borders, ... If two modes share the same frequencies, the
monitor will consider them the same. This is why you usually can not store
different centering data for e.g. 640x480x16 and 640x480x32 in the monitor.
The monitor cannot distinguish those modes.

Now let's look at what the signal on the RGB lines should look like in
between two HSYNC pulses:

HSYNC __/~~~\______________________________________________/~~~\___
RGB   ___________datadatadatadatadatadatadatadatadatad_____________
time    1   2    3                                    4    5

At 1, the HSYNC pulse gets raised. The CRT ray will now quickly move to the
left. During that time, the rgb lines should be black (ray off), as otherwise 
it would leave a noticeable trace while moving, which would look ugly.

At 2, the HSYNC pulse ends. This point isn't of much interest, as you
cannot tell whether the ray is already at the left edge. The only important
thing about point 2 is that the time between points 1 and 2 must be long
enough for the monitor to detect the HSYNC signal. Usually the HSYNC pulse
can be made very short.

At some point after 1, the ray will start flying to the right again. When
point 3 comes, it will actually start to display data. Point 3 can thus be
adjusted to change the left border location. If you wait longer until you
start sending data, the left border will move to the right.

When you have sent all data you reach point 4. As a HSYNC pulse should then
be sent, to start a new line, we set the RGB lines to black again.
At point 5 we have completed a cycle and start the next line.

Now to specify a complete mode timing, we just need to give the times
between all those points.

2. A look from the graphics card's point of view:

From the graphics card's point of view, the graphics data could just be
streamed out continuously without interruption. However, as the monitor
needs time for retracing, the graphics card has to put in some "delays"
at specific times. To be precise, between point 4 and the following point 3
(i.e. the equivalent of point 3 on the next line) in the previous diagram.

For the graphics card, the "natural" coordinate system starts at point 3,
when it starts emitting data. This point usually causes some confusion
with modeline-calculation:

HSYNC __/~~~\______________________________________________/~~~\___
RGB   ___________datadatadatadatadatadatadatadatadatad_____________
time    1   2    3                                    4    5   6
grc     SS  SE   0                                    W    SS  SE

From the graphics card's point of view, a line starts at "0". From that point
onward, it will output the data in its video ram. There is a counter 
that will limit the number of pixels that are put on one line. This is what
we call the width of a mode. On a 640x480 standard VGA mode, this is 640
pixels.

Now we will usually want a small right border to allow the monitor to
prepare for the following SYNC pulse we will generate. The aforementioned
counter will run on (but data output from video RAM will be suppressed)
until we reach the point SS (SyncStart). On a 640x480 standard VGA mode,
this happens at 664 pixels. That is, we left a border of 24 pixels.

Now we raise the HSYNC to tell the monitor to go left. This signal remains
asserted until we reach the point SE (SyncEnd). (760 pixels on VGA -
i.e. 96 pixels of sync pulse. This is pretty long, but VGA monitors weren't
very quick.)

We will also want some left border, so we wait until we reach the next "0"
point before starting to generate a signal again. On standard VGA this
happens at 800 pixels (40 pixels left border). At that point, we reset the
counter to 0 and start over. This point is usually called the "total" 
for this reason.
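
As a small sketch of the bookkeeping above (the function name is mine, not
anything a driver API provides), the four horizontal values W, SS, SE and
total split a line into right border, sync pulse and left border like this:

```python
def horizontal_segments(width, sync_start, sync_end, total):
    """Split one scanline (counted in pixels) into its blanking parts."""
    return {
        "right_border": sync_start - width,   # W  .. SS
        "sync": sync_end - sync_start,        # SS .. SE
        "left_border": total - sync_end,      # SE .. next "0"
    }


# Standard VGA 640x480 numbers quoted in the text above.
print(horizontal_segments(640, 664, 760, 800))
# {'right_border': 24, 'sync': 96, 'left_border': 40}
```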

Now let us look at how we can change the picture's appearance by changing
values in such a modeline.

Moving SE should not cause any difference at all, except if you make the
sync pulse too small for the monitor to recognize.

Moving SS and SE together will move the location of the sync pulse within
the picture. Let us assume we move them both to the "left", i.e. we
decrease their startpoints. What happens is, that we decrease the distance
W-SS (which determines the right border) and increase 0-SE (the left
border). As a result, the picture moves to the right.
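
To make the effect of moving SS and SE together concrete, here is a minimal
sketch (hypothetical helper, values taken from the VGA example above) that
shifts the sync pulse and reports the new borders:

```python
def shift_sync(width, sync_start, sync_end, total, delta):
    """Move the whole sync pulse by delta pixels (negative = earlier).

    Returns (right_border, left_border) after the shift.
    """
    ss, se = sync_start + delta, sync_end + delta
    if not (width <= ss < se <= total):
        raise ValueError("sync pulse must stay between W and total")
    return ss - width, total - se


# Moving the pulse 8 pixels "left": the right border shrinks, the left
# border grows, so the picture ends up further to the right.
print(shift_sync(640, 664, 760, 800, -8))   # (16, 48)
```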

Now what happens, if you change W ? You get extra pixels at the right
border. As usually borders are pretty large for standard VGA modes, 
you can usually display something like 648x486 without a problem on a
standard VGA monitor. It will not be able to see the difference.

Of course there are limits to this: If you go too far, you will produce
pixels beyond the visible area of the monitor, which is useless, or
interfere with the retrace, which gives ugly stripes from the retracing
CRT ray.

Now let's shed some light on a few remaining terms:

Blankstart BS and blankend BE. Between SE and 0, you can put a BE point on
many graphics cards. At that point, the RGB lines are no longer clamped to
black (to avoid interfering with the retrace), but can be programmed to a
border color. The same goes for BS, which can be placed between W and SS.
Usually one doesn't use that feature nowadays, as we have tuneable monitors
that allow stretching the mode to the physical limits of the monitor.

On old monitors, one used large borders to ensure the data was always
visible. There the border color made some sense as kind of eye candy.

Pixelclock. That is the rate at which pixels are output to the RGB lines.
It is usually the basic unit for all timing values in a graphics card.

3. Actually calculating a mode
------------------------------

This is more complex, than one might think, but not as complex, as some
people might want to make you believe. 

An unrelated note first: It is a common misconception, that graphics cards 
cannot do anything but the VGA/VESA induced "standard" modes like 640x480, 
800x600, 1024x768, 1280x960, ...

With most cards, just about any resolution can be programmed, though many
accelerators have been optimized for the abovementioned modes, so that it is
usually not wise to use other widths, as one might need to turn off
accelerator support ... :-(.

So if you write a driver, don't cling to these modes. If the card can handle
it, allow any mode.

O.K. - Let us first consider a VGA-standard, fixed frequency monitor.

It is called fixed frequency, as it can only handle a single horizontal
frequency (they usually can do multiple vertical frequencies, though, which
are much less critical, as they are much lower ...). 

The monitor manual says the horizontal frequency is 31.5 kHz.

And we want to display a given mode, say 640x480.

We can now already determine the absolute minimum dotclock we need, as

dotclock = horiz_total * hfreq

and

horiz_total = width + right_border + sync + left_border > width

So the minimum dotclock computes to 640 * 31.5 kHz = 20.16 MHz. We will
probably need something around 20% "slack" for borders and sync, so let's
say we need about a 24MHz clock. Now we look at the next higher clock our
card can handle, which is 25.175 MHz, as we have a VGA card.

Now we can compute the horizontal total:

horiz_total = dotclock / hfreq = 799.2

We can only program this in multiples of 8, so we round to 800.
Now we still need to determine the length and placement of the sync pulse,
which will give all remaining parameters.
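
The arithmetic so far, as a short sketch. The function names and the
"round to the nearest multiple of 8" policy are my assumptions; check what
granularity your card really needs.

```python
def min_dotclock_hz(width, hfreq_hz):
    """Lower bound for the dotclock: horiz_total can never be below width."""
    return width * hfreq_hz


def horiz_total(dotclock_hz, hfreq_hz, granularity=8):
    """horiz_total = dotclock / hfreq, rounded to the nearest multiple."""
    exact = dotclock_hz / hfreq_hz
    return int((exact + granularity / 2) // granularity) * granularity


print(min_dotclock_hz(640, 31_500))        # 20160000, i.e. 20.16 MHz
print(horiz_total(25_175_000, 31_500))     # 800 (exact value is about 799.2)
```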

There is no clean mathematical requirement for those. Technically, the sync
must be long enough to be detected, and placed in a way that the mode is
centered. The centering issue is not very pressing anymore, as digitally
controlled monitors are common, which allow controlling that externally.
Generally you should place the sync pulse early (i.e. keep right_border
small), as this will usually not cause artifacts that would arise from 
turning on the output again too early, when the sync pulse is placed too
late.

So if we as a "rule-of-thumb" use a half of the blanking period for the sync
and divide the rest as 1/3 right-border, 2/3 left border, we get a modeline
of 

"640x480" 25.175   640 664 744 800  ??? ??? ??? ???

While this is not perfectly the same as a standard VGA timing, it should run
as well on VGA monitors. The sync is a bit shorter, but that shouldn't be a
real problem.
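
The rule of thumb can be written down as a few lines of code. This is only
a sketch of the heuristic described above (half the blanking for the sync,
one third of the rest as right border); the alignment to 8 pixels and the
function name are assumptions.

```python
def place_hsync(width, htotal, align=8):
    """Return (sync_start, sync_end) following the rule of thumb."""
    blanking = htotal - width
    sync_len = blanking // 2                   # half the blanking is sync
    right_border = (blanking - sync_len) / 3   # 1/3 right, 2/3 left
    # round the sync start to the nearest multiple of `align`
    hss = int((width + right_border + align / 2) // align) * align
    return hss, hss + sync_len


print(place_hsync(640, 800))   # (664, 744)
```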

Now for the vertical timing. At 480 lines, a VGA monitor uses 60Hz.

hfreq = vfreq * vert_total

yields vert_total=525. The vertical timings usually use much less overhead
than the horizontal ones, as here we count whole _lines_, which means much
longer delays than just pixels. 10% overhead suffices here, and we again
split the borders 1/3 : 2/3, with only a few lines (say 2) for the sync
pulse, as this is already much longer than a HSYNC and thus easily
detectable.

"640x480" 25.175   640 664 744 800   480 495 497 525

Let us compare that to an XF86 Modeline that claims to be standard VGA:

Modeline "640x480"     25.175 640  664  760  800   480  491  493  525

Not much difference - right ? They should both work well, just a little
shifted against each other vertically.
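
For completeness, the vertical half of the calculation above as a sketch.
The function name is made up; the 1/3 split and the 2-line sync pulse are
the heuristics from the text.

```python
def vertical_timing(height, hfreq_hz, vfreq_hz, sync_lines=2):
    """Return (sync_start, sync_end, vert_total) in lines."""
    vtotal = round(hfreq_hz / vfreq_hz)        # hfreq = vfreq * vert_total
    vss = height + (vtotal - height) // 3      # 1/3 of the overhead below
    return vss, vss + sync_lines, vtotal


print(vertical_timing(480, 31_500, 60))   # (495, 497, 525)
```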

4. The common case:
-------------------

Now let us consider a multiscan monitor, that will do hfreq 31-95kHz and
vfreq 50-130Hz.

Now let's look at a 640x480 mode. Our heuristics say, that we will need
about 768x528 (20% and 10% more) for borders and sync.

We as well want maximum refresh rate, so let's look what limits the mode:

hfreq = vfreq * vtotal = 130 * 528 = 68.6 kHz

Oh - we cannot use the full hfreq of our monitor ... well no problem. What
counts is the vfreq, as it determines how much flicker we see.

O.K. - what pixelclock will we need ? I leave that as an exercise to the
reader. The calculation yields 52.7MHz, which should be enough for you to
determine the formula, in case you don't know it yet :-).
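
If you want to check your answer to the exercise: here is a sketch using
the same relations as in section 3 (dotclock = horiz_total * hfreq and
hfreq = vfreq * vert_total). Integer arithmetic is used for the 20% / 10%
estimates; the function name is an assumption.

```python
def limiting_frequencies(width, height, vfreq_max_hz):
    """Estimate totals, hfreq and pixel clock for the maximum refresh."""
    htotal = width * 12 // 10      # width  + 20%
    vtotal = height * 11 // 10     # height + 10%
    hfreq = vfreq_max_hz * vtotal
    pixclock = hfreq * htotal
    return htotal, vtotal, hfreq, pixclock


print(limiting_frequencies(640, 480, 130))
# (768, 528, 68640, 52715520)  ->  68.6 kHz and about 52.7 MHz
```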

Now we look what the card can do. Say we have a fixed set of clocks. We look
what clocks we have close by. Assume the card can do 50 and 60 MHz.

Now we have the choice: We can either use a lower clock, thus scaling down
the refresh rate a bit (by 5% ... so what ...): This is what one usually
does.

Or we can use a higher clock, but this would exceed the monitor
specifications. That can be countered by adding more border, but this is
usually not done, as it is a waste of space on the screen.
However, keep it in mind as a trick for displaying 320x200, when you do not
have doubling features. It will display in a tiny window in the middle of
the screen, but it will display.
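
Picking the lower of the two neighbouring clocks and seeing what it costs
could look like this. Just a sketch; the clock list is the assumed 50/60 MHz
set from the example, and the helper name is mine.

```python
def next_lower_clock(wanted_hz, available_hz):
    """Pick the highest available clock that does not exceed wanted_hz."""
    candidates = [c for c in available_hz if c <= wanted_hz]
    if not candidates:
        raise ValueError("no clock at or below the requested value")
    return max(candidates)


wanted = 52_715_520                       # 68.64 kHz * 768 pixels
chosen = next_lower_clock(wanted, [50_000_000, 60_000_000])
loss = 1 - chosen / wanted                # refresh scales with the clock
print(chosen, f"{loss:.1%}")              # 50000000 5.2%
```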

O.K. - what will our calculation yield ?

"640x480"    50   640 664 728 768   480 496 498 528 # 65kHz 123Hz

I just mentioned doubling features. This is how VGA does 320x200. It
displays each pixel twice in both directions. Thus it effectively is a
640x400 mode. If this would not be done, you would need a pixelclock of
12.59MHz and you would still have the problem of needing a 140Hz refresh, if
hsync should stay at 31.5kHz.

A horizontal doubling feature allows using the 25.175MHz clock intended
for 640, and a line doubling feature keeps the vsync the same as for 400
lines. Actually the line-doubler is programmable, so you can as well use
modes as sick as 640x50.
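
The numbers for an undoubled 320x200 can be reproduced roughly as follows.
Note the assumption: I take a vertical total of 225 lines (200 visible plus
blanking), which is not stated above but gives the quoted 140Hz.

```python
def undoubled_320x200(clock_640_hz=25_175_000, hfreq_hz=31_500,
                      vtotal_lines=225):
    """Clock and refresh needed for 320x200 without any doubling.

    vtotal_lines=225 is an assumed value, not taken from the text.
    """
    pixclock = clock_640_hz / 2        # half as many pixels per line
    vfreq = hfreq_hz / vtotal_lines    # hfreq stays fixed at 31.5 kHz
    return pixclock, vfreq


print(undoubled_320x200())   # (12587500.0, 140.0)
```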

O.K. - another example. Same monitor, 1280x1024.

Now we need about 1536x1126 total (same rule of thumb).
That yields 130Hz*1126lines = 146 kHz. We would exceed the maximum hfreq
(95 kHz) with that, so now the hfreq is the limit and we can only reach a
refresh rate of about 95kHz/1126lines = 84 Hz.
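
The decision "which limit applies" is the same in both examples, so here it
is as a sketch (names are mine; the monitor limits are the 95kHz / 130Hz of
the example monitor):

```python
def refresh_limit(vtotal, hfreq_max_hz, vfreq_max_hz):
    """Return (limiting side, hfreq, vfreq) for the highest refresh."""
    vfreq2 = hfreq_max_hz / vtotal
    if vfreq2 > vfreq_max_hz:
        # monitor cannot refresh that fast: vfreq is the limit
        return "vfreq", vfreq_max_hz * vtotal, vfreq_max_hz
    return "hfreq", hfreq_max_hz, vfreq2


print(refresh_limit(528, 95_000, 130))    # 640x480:   vfreq-limited
print(refresh_limit(1126, 95_000, 130))   # 1280x1024: hfreq-limited, ~84 Hz
print(95_000 * 1536)                      # 145920000, i.e. about 146 MHz
```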

The required clock is now 146MHz. That would yield:

"1280x1024"   146   1280 1320 1448 1536   1024 1058 1060 1126 # 95kHz 84Hz

Now the clock might be programmable, but keep in mind, that there may be
limits to the clock. DO NOT OVERCLOCK a graphics card. This will result in
the RAMDAC producing blurry images (as it cannot cope with the high speed),
and more importantly, the RAMDAC will OVERHEAT and might be destroyed.

Another issue is memory bandwidth. The video memory can only give a certain
amount of data per time unit. This often limits the maximum clock at modes
with high depth (i.e. much data per pixel). In the case of my card it limits
the clock to 130MHz at 16 bit depth, which would produce:

"1280x1024"   130   1280 1320 1448 1536   1024 1058 1060 1126 # 85kHz 75Hz

which is pretty much, what my monitor shows now, if I ask it.
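
The frequencies in the modeline comments can be recomputed from the clock
and the two totals alone. A minimal sketch (function name assumed):

```python
def mode_frequencies(pixclock_hz, htotal, vtotal):
    """Return (hfreq, vfreq) in Hz for a given pixel clock and totals."""
    hfreq = pixclock_hz / htotal
    return hfreq, hfreq / vtotal


for clock, htotal, vtotal in [(50_000_000, 768, 528),
                              (130_000_000, 1536, 1126)]:
    hfreq, vfreq = mode_frequencies(clock, htotal, vtotal)
    print(f"{hfreq / 1000:.1f} kHz  {vfreq:.1f} Hz")
# 65.1 kHz  123.3 Hz
# 84.6 kHz  75.2 Hz
```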

5. Recipe for multisync monitors:

a) determine the totals by calculating htotal=width*1.2 and vtotal=height*1.1 .
b) Check what limits the refresh by calculating vfreq2=hfreqmax/vtotal.
   If that exceeds vfreqmax, the limit is on the vfreq side, and we use
   vfreq=vfreqmax and hfreq=vfreqmax*vtotal. If it doesn't, the mode is
   limited by hfreq and we have to use vfreq=vfreq2 and hfreq=hfreqmax.
   Note that if vfreq2 falls below vfreqmin, the mode cannot be displayed.
   In the vfreq-limited case, you might fall below hfreqmin, which can be
   countered by using linedoubling facilities, if available. You can also
   add extra blank lines (increase vtotal) as a last-resort alternative.
c) Now that you have hfreq and vfreq, calculate the pixel clock using
   pixclock=hfreq*htotal. Use the next lower pixel clock. If you are below
   the lowest clock, you might want to try horizontal doubling features or
   you will have to pad the mode by increasing htotal.
d) Again check the monitor limits. You might be violating lower bounds now
   ... In that case you might need to resort to choosing a higher clock
   and padding as well.
e) You now have pixclock, width, height, htotal and vtotal. Calculate the
   sync start positions: hss=width+(htotal-width)/2/3 ;
   vss=height+(vtotal-height)/3; Make sure to properly align them as
   required by the video card hardware. hss usually has to be a multiple
   of 8.
f) SyncEnd is calculated similarly: hse=hss+(htotal-width)/2 and vse=vss+2.

Voila.
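
The whole recipe as one function, as a sketch only. Assumptions: all names
are mine, the card has a fixed list of clocks, horizontal values are aligned
to 8 pixels, and the fallbacks from steps b) to d) (line doubling, padding,
choosing a higher clock) are not implemented; the function simply gives up.

```python
def align(value, step=8):
    """Round to the nearest multiple of step."""
    return int((value + step / 2) // step) * step


def multisync_mode(width, height, hfreq_min, hfreq_max,
                   vfreq_min, vfreq_max, clocks):
    # a) totals from the 20% / 10% rule of thumb
    htotal = align(width * 12 / 10)
    vtotal = height * 11 // 10
    # b) hfreq-limited or vfreq-limited?
    vfreq = min(hfreq_max / vtotal, vfreq_max)
    if vfreq < vfreq_min:
        raise ValueError("mode cannot be displayed on this monitor")
    # c) wanted clock, then the next lower one the card offers
    wanted = vfreq * vtotal * htotal
    lower = [c for c in clocks if c <= wanted]
    if not lower:
        raise ValueError("below lowest clock: try doubling or padding")
    pixclock = max(lower)
    # d) re-check the lower monitor bounds with the real clock
    hfreq = pixclock / htotal
    if hfreq < hfreq_min or hfreq / vtotal < vfreq_min:
        raise ValueError("lower monitor limit violated: pad the mode")
    # e) + f) sync start and end
    hblank = htotal - width
    hss = align(width + hblank / 2 / 3)
    vss = height + (vtotal - height) // 3
    return (pixclock,
            (width, hss, hss + hblank // 2, htotal),
            (height, vss, vss + 2, vtotal))


print(multisync_mode(640, 480, 31_000, 95_000, 50, 130,
                     [50_000_000, 60_000_000]))
# (50000000, (640, 664, 728, 768), (480, 496, 498, 528))
```

With a clock list of 130 and 160 MHz, the same call for 1280x1024 gives the
130MHz modeline from section 4.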

6. Recipe for Monosync:

a) Calculate the number of lines. As hfreq and vfreq are given, vtotal is
   fixed: vtotal=hfreq/vfreq. If there are multiple vfreqs allowed, choose
   them according to your desired vtotal (i.e. one that gives the lowest
   vtotal above what you really need).
b) Calculate the pixelclock. pixclock=hfreq*htotal. htotal starts at the
   same estimate (width*1.2) we used above.
c) Adjust the pixelclock to hardware-limits. Adjust _UP_. Now recalculate
   the htotal=pixclock/hfreq.
d) Go to 5. e)
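
And the monosync variant, again only as a sketch under assumptions: the
names are mine, the clock list is an assumed two-clock VGA-style set, a
single vfreq is passed in, and alignment is to 8 pixels.

```python
def align(value, step=8):
    """Round to the nearest multiple of step."""
    return int((value + step / 2) // step) * step


def monosync_mode(width, height, hfreq, vfreq, clocks):
    # a) vtotal is fixed by the two frequencies
    vtotal = round(hfreq / vfreq)
    if vtotal <= height:
        raise ValueError("not enough lines at this vfreq")
    # b) first clock estimate with htotal = width * 1.2
    wanted = hfreq * (width * 12 // 10)
    # c) adjust UP to a clock the card has, then recalculate htotal
    higher = [c for c in clocks if c >= wanted]
    if not higher:
        raise ValueError("card has no clock that is high enough")
    pixclock = min(higher)
    htotal = align(pixclock / hfreq)
    # d) continue with steps e) and f) of the multisync recipe
    hblank = htotal - width
    hss = align(width + hblank / 2 / 3)
    vss = height + (vtotal - height) // 3
    return (pixclock,
            (width, hss, hss + hblank // 2, htotal),
            (height, vss, vss + 2, vtotal))


print(monosync_mode(640, 480, 31_500, 60, [25_175_000, 28_322_000]))
# (25175000, (640, 664, 744, 800), (480, 495, 497, 525))
```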

CU, ANdy

-- 
= Andreas Beck                    |  Email :  <[EMAIL PROTECTED]> =