Re: In pursuit of Dead Wild Cat

2008-05-16 Thread Colin Piggot
Thomas wrote:
 I'm not doing too terribly (see
 http://www.youtube.com/watch?v=hcMiB1ZkukM
  for my attempt at a Cobra Mk 3 versus the Spectrum original),

Great work!


 Dead Wild Cat

Never looked at the disassembly of that, but can remember the distorted
perspective when it was running.

Colin
=
Quazar : Hardware, Software, Spares and Repairs for the SAM Coupe
1995-2008 - Celebrating 14 Years of developing for the SAM Coupe
Website: http://www.samcoupe.com/



Re: In pursuit of Dead Wild Cat

2008-05-16 Thread Thomas Harte
In fairness, I'm not completely comparing like for like in that video.

I am doing realistic 3d engine stuff of creating one view matrix to
represent the orientation and location of the camera, a separate
transformation matrix to represent the orientation and location of the
model, then composing the two in order to transform and project the
geometry.
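
A minimal sketch of the pipeline described above, in floating-point Python rather than Z80 fixed point. The function names, the 256-unit focal length and the 256x192 screen centre are illustrative assumptions, not values taken from the actual program.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    """Matrix that moves a point by (tx, ty, tz)."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def rotation_y(angle):
    """Matrix that rotates a point about the Y axis by angle radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def transform(m, p):
    """Apply a 4x4 matrix to a 3D point (implicit w = 1)."""
    x, y, z = p
    return tuple(m[i][0] * x + m[i][1] * y + m[i][2] * z + m[i][3]
                 for i in range(3))

def project(p, focal=256.0, cx=128.0, cy=96.0):
    """Perspective-project a camera-space point; assumes z > 0."""
    x, y, z = p
    return (cx + focal * x / z, cy - focal * y / z)

# Model sits 100 units in front of a camera at the origin (assumed values).
model = mat_mul(translation(0, 0, 100), rotation_y(0.0))
view = translation(0, 0, 0)
combined = mat_mul(view, model)      # composed once per object per frame
screen = project(transform(combined, (10, 20, 0)))
```

Composing the two matrices first means each vertex pays for one matrix multiply rather than two.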

But, unlike Elite, I am using Euler angles to represent
orientation. That means that I regenerate both matrices from scratch
every single frame. Elite is more likely to be maintaining an
orientation matrix for each object and adjusting it every frame, which
means substantially more work there but allows the Elite sort of
rotation and movement. So I'm not doing some expensive per-object
stuff that Elite is.
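
A rough sketch of the two approaches being contrasted. The small-angle update is an assumption about how a stored orientation matrix might be adjusted each frame; it is not taken from Elite's code, and both function names are hypothetical.

```python
import math

def from_euler(yaw, pitch):
    """Rebuild a 3x3 rotation matrix from scratch: R = Ry(yaw) * Rx(pitch)."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    return [[cy, sy * sp, sy * cp],
            [0.0, cp, -sp],
            [-sy, cy * sp, cy * cp]]

def nudge_yaw(m, a):
    """Adjust a stored matrix by a small yaw step, using cos(a) ~ 1 and
    sin(a) ~ a, so no trig tables are needed. Error accumulates, so a real
    implementation would re-orthonormalise the matrix periodically."""
    row0 = [m[0][j] + a * m[2][j] for j in range(3)]
    row2 = [m[2][j] - a * m[0][j] for j in range(3)]
    return [row0, m[1][:], row2]
```

The first form needs only two stored angles per object; the second keeps nine matrix entries per object but lets rotations be applied relative to the object's current orientation.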

Also, I'm still not quite yet doing any vector clipping to the
viewport. So I'm not yet paying for some cheap extra tests and some
not-so-bad linear algebra whenever a line crosses a plane.
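
For illustration, a sketch of clipping one line against one plane, showing the cheap tests and the linear interpolation mentioned above. It handles only a near plane at an assumed distance; a full viewport clip would repeat the same idea for each edge. The function name is hypothetical.

```python
def clip_to_near(p0, p1, near=1.0):
    """Clip the segment p0-p1 to the half-space z >= near.

    Returns the clipped segment, or None if it lies entirely behind the plane.
    """
    z0, z1 = p0[2], p1[2]
    if z0 >= near and z1 >= near:    # cheap test: fully visible
        return p0, p1
    if z0 < near and z1 < near:      # cheap test: fully hidden
        return None
    t = (near - z0) / (z1 - z0)      # one divide per crossing
    hit = tuple(a + t * (b - a) for a, b in zip(p0, p1))
    return (hit, p1) if z0 < near else (p0, hit)
```

Most lines take one of the two early exits, so the divide is only paid when a line actually crosses the plane.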

However, I do have several avenues of future optimisation to pursue,
some algorithmic (mostly to do with mirroring), some machine level (I
reckon I can cut 20-30% off the cost of all my multiplies but it means
changing number representation slightly).

I almost certainly am not actually going to implement anything very
much like Elite, as I've not only seen the original source code but
was also a listed contributor to Elite: The New Kind, the direct C
conversion of the 6502 original that David Braben exercised his
legitimate copyright powers to remove from the 'net. So, legally, I'd
just be asking for trouble.

The key point is: even when I do all that, I'll still be at best 50%
as fast as Dead Wild Cat. And I'm still curious as to why.





Re: In pursuit of Dead Wild Cat

2008-05-16 Thread David Brant
The DWC demo is about a 450K file, and I believe that a lot of this is tables
for multiplying and dividing etc. You say that your divide routine costs
2000 cycles; is this each time you divide? How fast is the multiply routine?


Dave

- Original Message - 
From: Thomas Harte [EMAIL PROTECTED]

To: sam-users@nvg.ntnu.no
Sent: Thursday, May 15, 2008 10:48 PM
Subject: In pursuit of Dead Wild Cat


As I previously said, and as mentioned in the current Sam Revival, I'm
experimenting with 3d on the Sam. At the minute I'm just playing with
vector graphics, since they move reasonably quickly, making it easy to
observe problems with the algorithms.

I'm not doing too terribly (see http://www.youtube.com/watch?v=hcMiB1ZkukM
 for my attempt at a Cobra Mk 3 versus the Spectrum original), but
I'm intimidated by the revelation by Marc Broster on this list 12
years ago that "with my dead wild cat demo I had about 70-80 points
being calculated in 3D with lines being plotted in 25fps".

70 points at 25 fps would be just over 3,428 cycles/point, even if
line drawing were completely free and memory was uncontended. Even my
divide routine costs something like 2,000 cycles. Part of it seems to
be that the perspective in Dead Wild Cat isn't correct — when objects
transition in and out they are obviously zooming rather than actually
moving (because the relative perspective of points doesn't change
correctly; though it's mostly hidden by the fact that many of the
objects have a flat front), and moving them around manually shows some
very odd effects. But even eliminating the divide in favour of some
weird limited range (i.e. unsuitable for a real game) table approach
doesn't account for everything.  I would imagine that even when I've
pulled out all the stops, I'd still be spending at least 7,000 to
8,000 (pencil calculated, uncontended and hence unrealistic) cycles to
process each individual point all the way from world space to a
location on screen.
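
The per-point budget quoted above can be reproduced as follows. The 6 MHz figure is the SAM's nominal CPU clock and ignores memory contention, as the post itself does; the function name is illustrative.

```python
CPU_HZ = 6_000_000  # nominal SAM Coupe clock, uncontended

def cycles_per_point(points, fps, cpu_hz=CPU_HZ):
    """Cycles available per point if nothing else costs any time."""
    return cpu_hz / (fps * points)

budget_70 = cycles_per_point(70, 25)   # about 3,428.6
budget_80 = cycles_per_point(80, 25)   # 3,000
```

Against that budget, a single 2,000-cycle divide per point would use more than half of the time available at 70 points.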

So, is Marc on the list? If not, has anyone done any significant
disassembly on the demo? I've been told before that most of RAM is
given over to pre-calculated line drawing routines. I'm only spending
something like 15% of my time on drawing so it isn't the major
concern. Is there anything else that can be learnt from it that isn't
essentially limited to doing small objects with local perspective?



Re: In pursuit of Dead Wild Cat

2008-05-16 Thread Thomas Harte
Yep, I think I'm on approximately 2000 cycles for every divide. There  
are no special cases and no tables are used. That number was arrived  
at through the Sim Coupe debugger, so shouldn't be too inaccurate,  
though it obviously won't be completely dependable because of the  
usual RAM access timing issues.


I have two separate types of multiply. Both deal with multiplying 8.8  
numbers, but one only pays attention to the bottom 10 bits of both  
numbers. I could adjust the other one to only use the bottom 10 bits  
of one of the numbers but I haven't yet.


The 16x16 multiply uses no tables whatsoever and costs between 700 and  
1200 cycles depending on RAM timings. The 10x10 multiply uses a 4kb  
table and costs between 270 and 400 cycles. By changing my number  
format slightly and inlining those, I think I can chop the 10x10s down  
to fewer than 200 paper cycles. The vast majority of my multiplies are  
10x10s, and the proportion goes up as models become more detailed,  
which hopefully will occur if everything else speeds up.
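
A sketch of an 8.8 fixed-point multiply, plus one well-known way of doing a table-driven multiply (quarter squares). The post does not say how its 4kb table is laid out, so the table scheme here is an assumption and is not claimed to be the method actually used; all names are hypothetical.

```python
def to_fixed(x):
    """Convert a float to 8.8 fixed point (8 integer bits, 8 fraction bits)."""
    return int(round(x * 256))

def fixed_mul(a, b):
    """Multiply two 8.8 values, returning an 8.8 result."""
    return (a * b) >> 8

BITS = 10
# floor(n*n / 4) for every possible sum of two unsigned 10-bit operands
QUARTER_SQ = [(n * n) // 4 for n in range(2 << BITS)]

def table_mul(a, b):
    """Multiply two unsigned 10-bit values with two lookups and a subtract,
    using a*b = floor((a+b)^2 / 4) - floor((a-b)^2 / 4)."""
    hi, lo = (a, b) if a >= b else (b, a)
    return QUARTER_SQ[hi + lo] - QUARTER_SQ[hi - lo]
```

The identity is exact because a+b and a-b always have the same parity, so the two rounding errors cancel.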


So far I've been a good functional programmer and put lots of code in nice
callable places. I think I'm spending 2–3% of my processing time on
call/ret pairs. Just inlining the 10x10 multiplies would eliminate  
most of that.


Divides would be nice to optimise, especially once I'm line clipping,  
but they occur very infrequently compared to multiplies, so are not so  
much of a worry.


If you accept that the Cobra Mk 3 is of a similar level of detail to  
the objects in DWC then I guess the most relevant number to compare is  
frame rate. My program goes between 11 and 25 fps when the object is  
approximately screen sized, depending which side you're looking at and  
therefore how many points actually need to be processed. My program is  
currently 12.5kb in size, but I'm being quite wasteful in some areas.  
I have 4kb of sine and cosine tables for example, which is just silly.  
Though part of my general idea to speed up 10x10 multiplies involves  
doubling the size of that table.


Re: mirroring, I'm still thinking about how I want to do that in my  
head. At the minute I have a fairly traditional data structure that  
stores a separate location for each vertex and fairly traditional code  
that transforms and projects each of those separately, on demand.  
Since that object and indeed all but one of the Elite objects (so far as I
recall) are symmetrical, I could easily cut a whole bunch of multiplies
there without much recoding. A more radical idea is to have each point
index tables, e.g. so that instead of saying that a point is at (10,
128, 30), it says that a point is at (vec1, -vec2, vec3) and the
various vectors are calculated once for the model then summed and/or
negated to get the location of each vertex. That'd cut down even
more on the number of multiplies required and take better advantage of  
symmetry across many more axes, but I'm not sure that the  
administrative costs wouldn't be more troublesome than is worthwhile  
if multiplies really are going to significantly drop in cost.
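
One possible reading of the "more radical idea", sketched under the assumption that each distinct coordinate magnitude on each axis is multiplied into its matrix column once, and vertices are then built purely from additions and negations. The data layout and names are hypothetical.

```python
def scaled_columns(matrix, magnitudes):
    """For each axis, pre-multiply that column of the 3x3 matrix by every
    distinct coordinate magnitude used on that axis (3 multiplies each)."""
    return [{m: tuple(matrix[r][axis] * m for r in range(3)) for m in mags}
            for axis, mags in enumerate(magnitudes)]

def build_vertex(cols, spec):
    """spec holds one (sign, magnitude) pair per axis; no multiplies here."""
    out = [0, 0, 0]
    for axis, (sign, mag) in enumerate(spec):
        v = cols[axis][mag]
        for r in range(3):
            out[r] += v[r] if sign > 0 else -v[r]
    return tuple(out)

# 90-degree rotation about Z, chosen so the result is easy to check by hand.
rot_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
cols = scaled_columns(rot_z, [[10], [128], [30]])
vertex = build_vertex(cols, ((1, 10), (-1, 128), (1, 30)))  # point (10, -128, 30)
```

The mirrored vertex (-10, -128, 30) would reuse the same three scaled columns, which is where the saving comes from; the bookkeeping per vertex is the "administrative cost" the post worries about.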


Incidentally, having watched DWC again, I have the feeling that Marc  
counted the stars in his claim of 70-80 points. Which is silly since
you really don't need to put them through a real 3d transformation and
projection.



Re: In pursuit of Dead Wild Cat

2008-05-16 Thread Edwin Blink


From: Thomas Harte

... dead wild cat demo ...

I don't remember this demo. Where can it be found?

BTW, if you need some help with optimizing your (multiply/divide) code,
I'm always in for some byte/T-state banging :-)

Edwin


Re: In pursuit of Dead Wild Cat

2008-05-16 Thread Thomas Harte
It's originally on Fred 50, but not really, in the sense that you have
to expand it to another floppy disk and then boot it off that. But,  
here, I'll save you the effort of all that:


http://members.allegro.cc/ThomasHarte/temp/DWC.DSK

The more I watch it, the more I become convinced that it's  
sufficiently much of a demo technique as not to be usable in a full  
game — the perspective is too far off for separate objects to appear  
to move correctly relative to each other. But then, I want to believe  
that so I'm probably not being at all critical.






RE: In pursuit of Dead Wild Cat

2008-05-16 Thread Adrian Brown
If I remember correctly, a lot of divides in 3D systems on older platforms
can actually be a nice big table ;)
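
A sketch of what such a table might look like: a precomputed reciprocal per depth value, so that a divide becomes one lookup and one multiply. The table range and the 16-bit scaling are assumptions chosen for illustration.

```python
RECIP_BITS = 16
Z_MAX = 1024
# RECIP[z] holds floor(65536 / z); index 0 is a placeholder, never a valid depth
RECIP = [0] + [(1 << RECIP_BITS) // z for z in range(1, Z_MAX + 1)]

def divide_by_table(x, z):
    """Approximate x / z for integer z in 1..Z_MAX."""
    return (x * RECIP[z]) >> RECIP_BITS
```

Because each reciprocal is truncated, results can come out one below the true quotient, which is usually tolerable for screen coordinates. The limited range of z is the trade-off against a general divide routine.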
