Re: Vertical retrace

1999-10-22 Thread Niklas Höglund

On Wed, Oct 20, 1999 at 03:46:42PM -0700, Jon M. Taylor wrote:
 On Wed, 20 Oct 1999, Rubén wrote:
 
  On 1999/Oct/20, Brian S. Julin wrote:
  
   Basically there are two situations to worry about.  One is trying
   to find the ray position from userspace when the process is running.
   The other is trying to get the scheduler to run the process 
   promptly at certain ray positions.
  
  I think there is an easy solution for the first thing. Why don't you
  share a flag variable between the kernel and the process, that becomes 1 at
  the vertical retrace and 0 when it finishes? For programmers it would be
  even more efficient than reading from the IO port directly (which is
  impossible under Linux, obviously). It could be managed in a similar way
  to the graphics context, couldn't it?
 
   I think that putting this flag in the context map is the way to
 go.  The flag can be quickly toggled by the interrupt handler, and nothing
 else is necessary.  This doesn't get around the scheduling Hz problem, but
 that is a separate issue anyway, and needs to be fixed for a lot more than
 just GGI.

It'd be even better to increment a value every vblank so applications
can find out if they have missed any frames, and also to make sure
applications don't redraw more than once per frame.
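
A minimal sketch of how an application might use such a counter, assuming the kernel exposes a single 32-bit word that it increments once per vblank. The struct and function names here are hypothetical and not part of GGI or KGI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical word shared between kernel and process. The kernel side
 * (not shown) would increment vblank_count once per vertical blank. */
struct vblank_shared {
    volatile uint32_t vblank_count;
};

/* Vblanks that have passed since the application last drew. Unsigned
 * subtraction keeps the result correct when the counter wraps. */
static uint32_t vblanks_since(const struct vblank_shared *s, uint32_t last_seen)
{
    assert(s != NULL);
    return s->vblank_count - last_seen;
}

/* Redraw only if at least one vblank has passed since the last draw. */
static int should_redraw(const struct vblank_shared *s, uint32_t last_seen)
{
    return vblanks_since(s, last_seen) != 0;
}

/* Frames missed: every elapsed vblank beyond the first was not drawn. */
static uint32_t frames_missed(const struct vblank_shared *s, uint32_t last_seen)
{
    uint32_t elapsed = vblanks_since(s, last_seen);
    return elapsed > 1 ? elapsed - 1 : 0;
}
```

The application would remember the counter value at each draw and pass it back as last_seen on the next pass.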

-- 
Niklas



Re: Vertical retrace

1999-10-22 Thread Rubén

On 1999/Oct/22, Niklas Höglund wrote:

   I think there is an easy solution for the first thing. Why don't you
   share a flag variable between the kernel and the process, that becomes 1 at
   the vertical retrace and 0 when it finishes? For programmers it would be
   even more efficient than reading from the IO port directly (which is
   impossible under Linux, obviously). It could be managed in a similar way
   to the graphics context, couldn't it?
  
  I think that putting this flag in the context map is the way to
  go.  The flag can be quickly toggled by the interrupt handler, and nothing
  else is necessary.  This doesn't get around the scheduling Hz problem, but
  that is a separate issue anyway, and needs to be fixed for a lot more than
  just GGI.
 
 It'd be even better to increment a value every vblank so applications
 can find out if they have missed any frames, and also to make sure
 applications don't redraw more than once per frame.

I don't agree; applications don't have to use the same refresh rate
as the monitor. Your monitor can be refreshing at 100 Hz, and 50 frames
per second is more than enough for an animation to look smooth.
And if you do this, some people would synchronize their applications
by testing this number in infinite loops, instead of using alarms (the
right method, IMHO).
Another reason is that a flag is tested much faster than a number (at
least on Intel).

I think that the number isn't needed at all: your alarm system
should figure out when a frame is lost (this is how GGL does frame
skipping), and it need not worry about real card-monitor frames.
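
A rough sketch of timer-based frame skipping as described above, independent of the monitor refresh. This is not GGL's actual code; the function names and the microsecond timestamps are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of animation frames that have become due between the last
 * drawn frame and now. Timestamps are in microseconds. */
static uint32_t frames_due(uint64_t now_us, uint64_t last_frame_us,
                           uint32_t frame_period_us)
{
    assert(frame_period_us > 0);
    if (now_us <= last_frame_us)
        return 0;
    return (uint32_t)((now_us - last_frame_us) / frame_period_us);
}

/* If more than one frame is due, all but the newest are skipped. */
static uint32_t frames_to_skip(uint64_t now_us, uint64_t last_frame_us,
                               uint32_t frame_period_us)
{
    uint32_t due = frames_due(now_us, last_frame_us, frame_period_us);
    return due > 1 ? due - 1 : 0;
}
```

With a 20000 microsecond period (50 frames per second), an alarm that fires 65 ms late finds three frames due and skips two of them, whatever rate the monitor is running at.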
-- 
  _
 /_) \/ / /  email: mailto:[EMAIL PROTECTED]
/ \  / (_/   www  : http://programacion.mundivia.es/ryu
[ GGL developer ]  [ IRCore developer ]  [ GPUL member ]



Re: Vertical retrace

1999-10-22 Thread Jon M. Taylor

On Fri, 22 Oct 1999, Brian S. Julin wrote:

 On Fri, 22 Oct 1999, Rubén wrote:
 
  On 1999/Oct/22, Niklas Höglund wrote:
   It'd be even better to increment a value every vblank so applications
   can find out if they have missed any frames, and also to make sure
   applications don't redraw more than once per frame.
  
  I don't agree; applications don't have to use the same refresh rate
  as the monitor. Your monitor can be refreshing at 100 Hz, and 50 frames
  per second is more than enough for an animation to look smooth.
  And if you do this, some people would synchronize their applications
  by testing this number in infinite loops, instead of using alarms (the
  right method, IMHO).
 
 I think an incrementing number is more flexible and leaves the
 application programmer more leeway.

I see no reason why we could not have both.  And we _should_ have
both, since lots of hardware supports both.  We have not even filled
up one 4K Intel page with context info yet.
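
As a sketch of what "both" could look like in the context page: the layout below is hypothetical (the real context map is not shown in this thread), and how the flag gets cleared at the end of retrace is an assumption, since many chips only raise an interrupt at the start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONTEXT_PAGE_SIZE 4096u   /* one 4K Intel page */

/* Hypothetical vblank fields inside the shared context page. */
struct vblank_context {
    volatile uint32_t in_retrace;    /* 1 during vertical retrace, else 0 */
    volatile uint32_t vblank_count;  /* incremented once per vblank */
};

/* Compile-time check that the fields fit in a single page. */
_Static_assert(sizeof(struct vblank_context) <= CONTEXT_PAGE_SIZE,
               "context must fit in one page");

/* What the interrupt handler would do at the start of retrace. */
static void on_retrace_start(struct vblank_context *c)
{
    c->in_retrace = 1;
    c->vblank_count++;
}

/* What some end-of-retrace event (assumed to exist) would do. */
static void on_retrace_end(struct vblank_context *c)
{
    c->in_retrace = 0;
}

/* Bytes of the page still free for other context info. */
static size_t context_bytes_free(void)
{
    return CONTEXT_PAGE_SIZE - sizeof(struct vblank_context);
}
```

Applications that only want the flag read in_retrace; those that want to count frames read vblank_count. Either way the cost is two words out of the page.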

Jon 

---
'Cloning and the reprogramming of DNA is the first serious step in 
becoming one with God.'
- Scientist G. Richard Seed



Re: Vertical retrace

1999-10-20 Thread Rubén

On 1999/Oct/20, Brian S. Julin wrote:

 Basically there are two situations to worry about.  One is trying
 to find the ray position from userspace when the process is running.
 The other is trying to get the scheduler to run the process 
 promptly at certain ray positions.

I think there is an easy solution for the first thing. Why don't you
share a flag variable between the kernel and the process, that becomes 1 at
the vertical retrace and 0 when it finishes? For programmers it would be
even more efficient than reading from the IO port directly (which is
impossible under Linux, obviously). It could be managed in a similar way
to the graphics context, couldn't it?
-- 
  _
 /_) \/ / /  email: mailto:[EMAIL PROTECTED]
/ \  / (_/   www  : http://programacion.mundivia.es/ryu
[ GGL developer ]  [ IRCore developer ]  [ GPUL member ]



Re: Vertical retrace

1999-10-20 Thread Brian S. Julin

On Wed, 20 Oct 1999, Jon M. Taylor wrote:
 On Wed, 20 Oct 1999, Rubén wrote:
 
  On 1999/Oct/20, Brian S. Julin wrote:
  
   Basically there are two situations to worry about.  One is trying
   to find the ray position from userspace when the process is running.
   The other is trying to get the scheduler to run the process 
   promptly at certain ray positions.
  
  I think there is an easy solution for the first thing. Why don't you
  share a flag variable between the kernel and the process, that becomes 1 at
  the vertical retrace and 0 when it finishes? For programmers it would be
  even more efficient than reading from the IO port directly (which is
  impossible under Linux, obviously). It could be managed in a similar way
  to the graphics context, couldn't it?
 
   I think that putting this flag in the context map is the way to
 go.  The flag can be quickly toggled by the interrupt handler, and nothing
 else is necessary.  This doesn't get around the scheduling Hz problem, but
 that is a separate issue anyway, and needs to be fixed for a lot more than
 just GGI.

This is sort of what I intend, but with more than a simple IRQ-driven
flag because, as I said in the previous e-mail, anything IRQ-driven will
miss/waste the front porch entirely.

Right now I have a stand-alone system so I can develop without
entanglement in the GGI source tree; it is a module implementing 
an MMAPable /proc/rt page that is altered by the kernel.  The page is
altered by a configurable combination of an IRQ handler, a standard linux
periodic task, or a user-space initiated IOCTL.  The routine is always
the same: store the TSC, read a value from the card, store the TSC 
again.  That way, userspace will be able to evaluate a formula that gives
a pretty accurate value for WaitRayPos based on the value of the
processor TSC register, and that means finding out the ray position
using very few cycles and no context switch.
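
A sketch of the kind of formula userspace could evaluate, assuming a chipset with a line counter and a constant-rate TSC. The struct layout, field names and the calibration value tsc_per_frame are all hypothetical; the actual /proc/rt page format is not given here. Reading the TSC itself is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical contents of the shared page after one sampling pass. */
struct ray_sample {
    uint64_t tsc_before;    /* TSC stored just before reading the card */
    uint64_t tsc_after;     /* TSC stored just after reading the card */
    uint32_t line;          /* scanline value read from the card */
    uint32_t total_lines;   /* lines per frame, including blanking */
    uint64_t tsc_per_frame; /* calibrated TSC ticks per full frame */
};

/* Estimate the current scanline from the current TSC value alone. */
static uint32_t estimate_ray_line(const struct ray_sample *s, uint64_t tsc_now)
{
    assert(s->total_lines > 0 && s->tsc_per_frame > 0);
    /* Assume the card was read midway between the two TSC reads. */
    uint64_t tsc_sample = s->tsc_before + (s->tsc_after - s->tsc_before) / 2;
    assert(tsc_now >= tsc_sample);
    /* Only the position within the current frame matters. */
    uint64_t elapsed = (tsc_now - tsc_sample) % s->tsc_per_frame;
    uint64_t lines_advanced = elapsed * s->total_lines / s->tsc_per_frame;
    return (uint32_t)((s->line + lines_advanced) % s->total_lines);
}
```

The estimate drifts as calibration error accumulates, so the sample would need refreshing from time to time, which is presumably what the periodic task is for.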

For chipsets that have line counters and such goodies this is very simple,
but I'm cracking the toughest nut (VGA retrace detect bits only) first.
This involves convergence of a frequency detection algorithm that is working
with data taken at more or less random time locations, and it has defied
my first few naive attempts.  Still, I am confident that when I find time to
finish it I can get it to converge on most systems pretty quickly.  It should
be pretty easy to hook into LibGGIMisc once I'm done; the only potential
snag I see is somehow hooking it so whenever a SetMode or console switch-to 
happens the userspace code that recalibrates gets called.  KGI metalanguages 
should make short work of the kernel side of the code.

--
Brian

P.S. Actually the toughest nut would be if I was using the CTC via IOCTL,
or working on a system with a variable/halting TSC register, but for now
I'm assuming the TSC is there and constant.



Re: Vertical retrace

1999-10-20 Thread James Simmons


 For the latter, there is no elegant solution under standard Linux.
 The closest interval a periodic task can be run at to check the
 ray position without hacking the kernel (on i386) is 100 times a 
 second.  This can hardly give an accurate reading/signal.  So we are
 stuck with either losing the back porch (using IRQs) or using a 
 different flavor of Linux scheduler (RTLinux, or maybe something 
 customized and collaborative with James's planned SMP resource scheduler.)

You really need RTLinux for VLB. My scheduler has *HOOKS* that help manage
any threads that are using the accel engine. You don't want a
process that owns the accel engine to do a sleep(60) and leave other
graphics processes waiting around.



Re: KGI_COMMANDS, was: Re: Doesn't need vertical retrace!

1999-10-07 Thread Jon M. Taylor

On Wed, 6 Oct 1999, Andreas Beck wrote:

struct kgi_3dtriangle {int x0,y0,z0,x1,y1,z1,x2,y2,z2};
 
  What about the extra fields (W, specular/diffuse color, texture
  coords, vertex fog)?  
 
 Extra commands (i.e. DRAW3DTRIANGLE_TEXTURED, *_GORAUD). We'd need to take 
 up too much bandwidth on PingPong or similar, if we always transfer the 
 full set.

Not if we used two pipes like I was suggesting 

Jon




Re: Doesn't need vertical retrace!

1999-10-07 Thread Rubén

On 1999/Oct/06, Andreas Beck wrote:

  screen blinking a lot, I think. Anyway, there is another bigger problem,
  IMO, that switching to kernel mode, copying data structures, and returning
  back into user mode, may take too much time, and maybe when the ioctl returns,
  you haven't enough time to copy your buffer.
 
 No. That shouldn't matter much. I once measured a full ioctl round trip on
 my old 486 to take 600 cycles. Given even that machine (486/66), this gives
 an extra delay of about 10 microseconds. Shouldn't be the crucial point.

If you are drawing objects made of small polygons (e.g. a torus with
a high polygon count), where each polygon can take 300 cycles or less,
it's very crucial, I think.
In fact, in 2D accel, I only draw hlines by hardware when they are
more than 100 pixels in length, because if they are shorter, drawing by
hardware is much slower (because of the ioctl call?).
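
To put numbers on both sides of the argument, here is a small worked calculation using only the figures quoted above (600 cycles per ioctl round trip, a 66 MHz 486, 300 cycles per polygon). The function names are made up for the example.

```c
#include <assert.h>

/* Convert a cycle count to microseconds: 1 MHz is 1 cycle per microsecond. */
static double cycles_to_us(double cycles, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return cycles / clock_mhz;
}

/* Fraction of total time spent in the call itself when one call
 * carries `batch` primitives of `work_cycles` each. */
static double call_overhead_fraction(double call_cycles, double work_cycles,
                                     unsigned batch)
{
    assert(batch > 0);
    double work = work_cycles * batch;
    return call_cycles / (call_cycles + work);
}
```

600 cycles at 66 MHz is about 9.1 microseconds, which is small per frame. But with one call per 300-cycle polygon, two thirds of the time goes to the call; batching ten polygons per call brings that down to one sixth.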
-- 
Come to GPUL http://ceu.fi.udc.es/GPUL
  _
 /_) \/ / /  email: mailto:[EMAIL PROTECTED]
/ \  / (_/   www  : http://programacion.mundivia.es/ryu
[ GGL developer ]  [ IRCore developer ]  [ GPUL member ]



Re: KGI_COMMANDS, was: Re: Doesn't need vertical retrace!

1999-10-06 Thread Jos Hulzink

On Tue, 5 Oct 1999, Andreas Beck wrote:

 
   struct kgi_3dtriangle {int x0,y0,z0,x1,y1,z1,x2,y2,z2};
   Comments please !
 
  I don't like this kind of 3dtriangle at all: it needs 9 copies of
  data to draw a triangle. Maybe that's insignificant when you must call
  ioctl afterwards, which surely eats up more CPU. But when you implement
  multiple commands (one call for each triangle is very slow), it will be
  more significant. I would propose this alternative:
 
  struct kgi_3dvertex { int x,y,z; };
  struct kgi_3dtriangle { struct kgi_3dvertex *v0, *v1, *v2; };
 
 The passing of pointers is undefined for KGI. It is not possible to
 transparently pass pointers across protection ring boundaries.
 
 Something similar would be possible, though by allowing a kind of "upload"
 of a vertex array that is accessed by (numeric) indexes later on.
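
A sketch of the "uploaded vertex array plus numeric indexes" idea, assuming the receiving side keeps the array and validates every index before use. Only kgi_3dvertex comes from this thread; the indexed triangle struct and the resolve function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct kgi_3dvertex { int x, y, z; };

/* Hypothetical: a triangle that names its vertices by index into a
 * previously uploaded array, so no pointer crosses the boundary. */
struct kgi_3dtriangle_idx { uint32_t v0, v1, v2; };

/* Look up the three vertices. Returns 0 on success, or -1 if any index
 * is outside the uploaded array (out[] is then left untouched). */
static int resolve_triangle(const struct kgi_3dvertex *verts, size_t nverts,
                            const struct kgi_3dtriangle_idx *t,
                            struct kgi_3dvertex out[3])
{
    assert(verts != NULL && t != NULL && out != NULL);
    if (t->v0 >= nverts || t->v1 >= nverts || t->v2 >= nverts)
        return -1;
    out[0] = verts[t->v0];
    out[1] = verts[t->v1];
    out[2] = verts[t->v2];
    return 0;
}
```

Shared vertices are then uploaded once and each triangle costs three small integers, at the price of a bounds check per index.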
 
 However I suppose the above call might be enough for the simple "common
 ioctl" layer. If you want to be really fast, you will need a card-specific
 communications layer anyway.

Which call ? :)

 
  And... int? Aren't there cards that use float?
 
 Good point. I do not know about cards that use float. Should be pretty rare,
 as floats are very expensive to handle and rarely needed, unless the card
 has an internal geometry processor. 

ViRGE uses some fixedpoint 16 bit value, that's all I know... The format
of this fixpoint value can be modified though (if anyone can tell me the
use of this...) so you can call it floating point :)

 However most cards allow for fixedpoint, as this is how they work
 internally. Would 16.16 fixedpoint be o.k. ?

Well... All I can say is that on my ViRGE I'd have to drop the fractional
part of the Z value, and create a signed fixpoint value of the integer
part... Still thinking if this would have consequences besides loss of
resolution. I know... I should buy another videocard... :)
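
For reference, a minimal sketch of 16.16 fixed point in C, including the "keep only the integer part" step described above. The type and helper names are hypothetical, and nothing here reflects the ViRGE's actual register format.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fix16_16;   /* hypothetical name: 16 integer bits, 16 fraction bits */
#define FIX_SCALE 65536     /* 2^16 */

/* Integer to 16.16. The integer part must fit in 16 signed bits. */
static fix16_16 fix_from_int(int v)
{
    assert(v >= -32768 && v <= 32767);
    return (fix16_16)(v * FIX_SCALE);
}

/* Double to 16.16; the cast truncates toward zero. */
static fix16_16 fix_from_double(double d)
{
    assert(d >= -32768.0 && d < 32768.0);
    return (fix16_16)(d * FIX_SCALE);
}

/* 16.16 back to double, exact for every representable value. */
static double fix_to_double(fix16_16 f)
{
    return (double)f / FIX_SCALE;
}

/* Drop the fractional part, truncating toward zero. */
static int fix_int_part(fix16_16 f)
{
    return f / FIX_SCALE;
}
```

Dropping the fraction turns a Z of 2.75 into 2, so surfaces closer together than one unit of Z can no longer be told apart.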

Jos



Re: KGI_COMMANDS, was: Re: Doesn't need vertical retrace!

1999-10-06 Thread Andreas Beck

   struct kgi_3dtriangle {int x0,y0,z0,x1,y1,z1,x2,y2,z2};

   What about the extra fields (W, specular/diffuse color, texture
 coords, vertex fog)?  

Extra commands (i.e. DRAW3DTRIANGLE_TEXTURED, *_GORAUD). We'd need to take 
up too much bandwidth on PingPong or similar, if we always transfer the 
full set.
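
To illustrate the bandwidth point, here is a sketch with one command per triangle variant, each carrying only the fields it needs. The command names, struct layouts and field choices are invented for the example; only the nine-coordinate flat triangle follows the struct quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command codes, one per triangle variant. */
enum tri_cmd { CMD_TRI_FLAT, CMD_TRI_GOURAUD, CMD_TRI_TEXTURED };

/* Flat triangle: the nine coordinates from the proposal. */
struct tri_flat { int32_t x0, y0, z0, x1, y1, z1, x2, y2, z2; };

/* Gouraud: coordinates plus one packed color per vertex (assumed). */
struct tri_gouraud { struct tri_flat pos; uint32_t c0, c1, c2; };

/* Textured: coordinates plus one (u, v) pair per vertex (assumed). */
struct tri_textured { struct tri_flat pos; int32_t u0, v0, u1, v1, u2, v2; };

/* Bytes that have to be transferred for one command of each kind. */
static size_t tri_payload_bytes(enum tri_cmd cmd)
{
    switch (cmd) {
    case CMD_TRI_FLAT:     return sizeof(struct tri_flat);
    case CMD_TRI_GOURAUD:  return sizeof(struct tri_gouraud);
    case CMD_TRI_TEXTURED: return sizeof(struct tri_textured);
    }
    assert(!"unknown command");
    return 0;
}
```

With these layouts a flat triangle costs 36 bytes against 60 for a textured one, so always sending the full set would add about two thirds to the traffic for flat-shaded scenes.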

CU, ANdy

-- 
= Andreas Beck|  Email :  [EMAIL PROTECTED] =



Re: Doesn't need vertical retrace!

1999-10-04 Thread Jon M. Taylor

On Mon, 4 Oct 1999, Jos Hulzink wrote:

 On Fri, 1 Oct 1999, Rubén wrote:
 
  Ah, ok, well, it does what I want, it's enough for me. Anyway, I will
  continue reading docs and learning how to include vertical retrace support
  into KGIcon, it's best (I have read the GGI tech. docs, and it seems to be
  a bit difficult).
 
 Actually, most code is already in the KGIcon ViRGE driver to support that.
 The ViRGE has completely functional vertical retrace interrupt code, but
 at the moment, nothing is done with it. Reason: I have not seen a
 mechanism to get the sync to userspace neatly. (Maybe some
 CHIP_WAITFORRECTRACE ioctl ? We never intended to be real-time, so some
 while not in retrace schedule (); can do the job here)

Perhaps we need a new class of exportable KGI data structures, which
can in turn be exported by fbcon-kgi.c's procfs code so that userspace can
select() on a file to wait for any interrupt the chip can generate?
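
From the userspace side, waiting on such a file could look like the sketch below. select() and its macros are standard POSIX; the idea that a procfs file becomes readable when the chip raises an interrupt is the proposal above, not an existing interface, and wait_for_event is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait until fd becomes readable or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_for_event(int fd, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int r = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (r < 0)
        return -1;
    return (r > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

Because select() takes a set of descriptors, the same loop could wait on a vblank file, a FIFO-level file and the input devices at once, which is the attraction over a dedicated wait-for-retrace ioctl.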

Jon




Re: Doesn't need vertical retrace!

1999-10-04 Thread Jon M. Taylor

On Mon, 4 Oct 1999, Rubén wrote:

 On 1999/Oct/04, Jos Hulzink wrote:
 
   into KGIcon, it's best (I have read the GGI tech. docs, and it seems to be
   a bit difficult).
  
  The ViRGE has completely functional vertical retrace interrupt code, but
  at the moment, nothing is done with it. Reason: I have not seen a
  mechanism to get the sync to userspace neatly. (Maybe some
  CHIP_WAITFORRECTRACE ioctl ? We never intended to be real-time, so some
 
   No, FB now has a good way to use it:
 [ /usr/include/linux/fb.h ]
 #define FB_ACTIVATE_VBL 16   /* activate values on next vbl  */
 
   It can be *very* useful for page flipping. 
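
A sketch of how that flag might be used for a flip. FB_ACTIVATE_VBL, FBIOPUT_VSCREENINFO and struct fb_var_screeninfo come from linux/fb.h; the helper names are made up, the pages are assumed to be stacked vertically in the virtual screen, and whether a given driver actually waits for the vblank is driver-dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/fb.h>
#include <sys/ioctl.h>

/* Fill in `var` so that page number `page` becomes visible on the next
 * vblank. Returns -1 if the page lies outside the virtual screen. */
static int prepare_flip(struct fb_var_screeninfo *var, unsigned page)
{
    assert(var != NULL);
    if ((page + 1) * var->yres > var->yres_virtual)
        return -1;
    var->yoffset = page * var->yres;
    var->activate = FB_ACTIVATE_VBL;
    return 0;
}

/* Hand the prepared settings to an already opened framebuffer device. */
static int flip_page(int fd, struct fb_var_screeninfo *var, unsigned page)
{
    if (prepare_flip(var, page) < 0)
        return -1;
    return ioctl(fd, FBIOPUT_VSCREENINFO, var);
}
```

In real use, var would first be filled by FBIOGET_VSCREENINFO on the same descriptor, and yres_virtual must be at least twice yres for double buffering to fit.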

*Simple* page flipping, sure.  Basic double buffering.  But what
if you want to triple-buffer?  What if you have to use the vblank
interrupt to balance a FIFO or flush texture state in addition to buffer
swapping?  What is needed is a full user- and kernel-level asynchronous
notification and kernel-only callback system.

Modern video chipsets can do quite a bit of stuff with interrupts,
hardware semaphores, AGP fault notification, etc.  It seems a shame to
restrict this wonderful generalized capability to the ancient idea of VGA
pageflip-on-vsync.  See my other post on 3D KGI ioctls for how I propose
to handle this in KGI.

 You still have the
 problem of dumping buffers into video memory, but at least, this is
 something... I understand that a waitretrace ioctl makes no sense.

It makes sense with manual userspace buffer flipping using VGA
splitline.  IIRC, that is what informed the original design.  However,
these days the vblank IRQ is just one of many sources of hardware RT
events that can come from a video card, and I do not see why it must be
treated differently or the async notification subsystem should not be
generalized.
 
   Well, let me imagine a bit. Applications have two main ways of
 writing their data into video memory: either with page-flipping (very easy to
 handle, I think) or with double (triple, ...) buffering. Right, or do I have
 *very* old ideas? 

I have always considered the terms "pageflipping" and "double
buffering" to mean basically the same thing.

 So, _maybe_ it would be a good idea if the kernel could
 handle and dump the memory buffer for you...
   You only call the FBIODUMP_BUFFER ioctl, passing as a parameter a
 structure like this:
 
 struct kgi_buffer{
   void *buffer;
   uint16 x0, y0;
   uint16 width, height;
 };
 
 and the buffer is dumped when it's possible, with the process blocked
 until this is done. 

Blocking I/O sucks and should be avoided like the plague wherever
possible, especially in quasi-RT situations like this.

 On some cards it could be done with HW accel by AGP?

Yes, on some cards you can schedule pageflips on vsync.  On lots
of cards you cannot, but the vsync IRQ is still available.  You will drown
in a mess of special cases if you don't abstract this stuff carefully.

   It's a very simple idea, and surely other people have thought of it
 before, what's the problem with it?

The problem is that it is a card-specific feature.  Card-specific
features must be abstracted carefully in order to be able to support all
features of all cards while not losing performance.  In particular,
though, exporting random asynchronous callback hooks from the kernel is a
real PITA unless you have something like devfs or procfs where you can
dynamically create files at will.  We have this now, so I figure we might
as well do it right.
 
   I know that many people wouldn't like the idea of the kernel doing it,
 and I know that on some systems it isn't possible to dump big buffers in a
 vertical retrace/blank, but I can't see a better solution.

Dynamic asynchronous notification selector files.
 
 I'm still porting 3D routines from DOS to Linux, I have finished
   flat and gouraud, and will start with zbuffer. And a cube is the easiest
   way, IMHO, of finding bugs in the ported routines, it isn't the demo itself
   :-D
  
  You're talking about software emulation of the functions you describe or
  hardware support ?
 
   Both. In the port I'm making extensive use of hardware accel if possible
 (it's for demos, I _may_ use as many resources as are available)

On new cards, you cannot really squeeze performance optimizations
unless you do extensive run-time balancing of the hardware state, and
without a good system to get the hardware info to the kernel and/or
userspace quickly much of the benefits will be lost.
 
  Want to help writing 3D acceleration in the KGIcon S3 Virge driver ? 
 
   Yes, I really do. Where can I start searching for the HW specs of the
 3D commands?

See the draft proposal for 3D kgicommands that was recently
posted.  Also read the technical documentation on whatever video chipset
you want to work with.

Jon
