Re: [Dri-devel] Maya Testing on New Radeon Driver

2002-04-10 Thread Keith Whitwell

Ian Romanick wrote:
 
> On Tue, Apr 09, 2002 at 06:13:30PM -0600, Brian Paul wrote:
> > Ian Romanick wrote:
> >
> > > The issue is that Maya is requesting at least 23-bits of Z-buffer.  In
> > > 16-bit mode, we only have 16-bits of Z-buffer, so it can't find a visual.
> >
> > Just for grins you could hack glXChooseVisual so that a 16-bit Z buffer
> > satisfies the request for 23 bits.
>
> Heh...I think I'd like to see it work properly with the TCL driver before I
> go do any hacking like that. :)
>
> > > On the Radeon, is depth bpp == color bpp a hardware limitation or just the
> > > way the drivers were done?
> >
> > I think it's a limitation of some hardware, but not Radeon (IIRC).
> >
> > In general, it's kind of natural to increase Z buffer depth with color
> > depth.  If an application cares about color precision then it probably
> > cares about Z precision too (DCC).  Similarly, an app that isn't sensitive
> > to color precision probably isn't sensitive to Z precision either (some
> > games).
>
> Fair enough.  The only exception that I can think of with existing hardware
> is the case where you want 16-bit framebuffer to save memory for textures
> and you want a stencil buffer (which typically requires 24-bits of Z-buffer
> on e.g. Radeon and G400).

I believe you can get 15+1 depth/stencil, but then you lose even more depth
and you still have to write some code.

Keith

___
Dri-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dri-devel



Re: [Dri-devel] Mesa MMX blend code finished

2002-04-10 Thread Sergey V. Udaltsov

> I've finally (hopefully) finished the rewrite of Mesa's MMX blend code.
Is it already in binary snapshots?

Cheers,

Sergey




Re: [Dri-devel] Mesa MMX blend code finished

2002-04-10 Thread José Fonseca

On 2002.04.10 09:03 Sergey V. Udaltsov wrote:
> > I've finally (hopefully) finished the rewrite of Mesa's MMX blend code.
> Is it already in binary snapshots?
>
> Cheers,
>
> Sergey
 

Nope. It's really a small drop in the ocean so there is no need to rush. I 
hope Brian will integrate it into the Mesa trunk soon. That way there are fewer 
places to fix eventual bugs.

Regards,

José Fonseca




[Dri-devel] radeon-20020409-i386-Linux

2002-04-10 Thread Martin Spott

Wohooo, now I see clouds in FlightGear even _with_ TCL enabled. And it's
pretty fast - even at 1600x1024 (Apple Cinema Display) I still get 45 fps.

There appears to be another issue (besides the somewhat dark window content)
that I don't see with SuSE's 4.2.0 driver modules: screen updates appear
not to happen on a linear time scale at large resolutions. Maybe the effect
is still there in a 'small' 800x600 window, but the amount of data that has
to be moved is much smaller, so the effect is hardly noticeable.

The scenery underneath moves at a varying speed that does not depend only
on the actual airspeed of my plane. It looks as if the plane moves somewhat
slower for half a second and then a little faster for another half second,
so the picture is back in sync with the simulated reality about every second.
Does anyone understand what I'm talking about? It's a little bit difficult
for me to describe this 'feature' in a foreign language.

I'll do a couple of test runs over the next few days with different driver
modules, with/without TCL and at different colour depths.

Martin.
-- 
 Unix _IS_ user friendly - it's just selective about who its friends are !
--




Re: [Dri-devel] Mesa MMX blend code finished

2002-04-10 Thread Brian Paul

José Fonseca wrote:
 
> On 2002.04.10 09:03 Sergey V. Udaltsov wrote:
> > > I've finally (hopefully) finished the rewrite of Mesa's MMX blend code.
> > Is it already in binary snapshots?
> >
> > Cheers,
> >
> > Sergey
>
> Nope. It's really a small drop in the ocean so there is no need to rush. I
> hope Brian will integrate it into the Mesa trunk soon. That way there are fewer
> places to fix eventual bugs.

I've checked it into Mesa CVS, both the trunk and the mesa_4_0_branch
in case there's a 4.0.3 release.

-Brian




Re: [Dri-devel] Mesa MMX blend code finished

2002-04-10 Thread Brian Paul


José,

I've checked in the code after testing with Glean and the OpenGL conformance
tests.

Was I supposed to change something in the C code?  It passes the conformance
tests as-is.

Thanks for your work!

-Brian




Re: [Mesa3d-dev] Re: [Dri-devel] Mesa software blending

2002-04-10 Thread Ian Romanick

On Tue, Apr 09, 2002 at 06:29:54PM -0700, Raystonn wrote:

> First off, current market leaders began their hardware designs back when the
> main CPU was much much slower.  They have an investment in this technology
> and likely do not want to throw it away.  Back when these companies were
> founded, such 3d rendering could not be performed on the main processor at
> all.  The computational power of the main processor has since increased
> dramatically.  The algorithmic approach to 3d rendering should be reexamined
> with current and future hardware in mind.  What was once true may no longer
> be so.
>
> Second, if a processor intensive algorithm was capable of better efficiency
> than a bandwidth intensive algorithm, there is a good chance these
> algorithms would be moved back over to the main CPU.  If the main processor
> took over 3D rendering, what would the 3D card manufacturers sell?  It would
> put them out of business essentially.  Therefore you cannot gauge what is
> the most efficient algorithm based on what the 3D card manufacturers decide
> to push.  They will push whatever is better for their bottom line and their
> own future.

I'm getting very tired of this thread.  If modern CPUs are so much better
for 3D, then why does Intel, of all companies, still make its own 3D
hardware in addition to CPUs?!?  If the main CPU was so wonderful for 3D
rendering, Intel would be all over it.  In fact, they tried to push that
agenda once when MMX first became available.  Remember?  Had it come out
before the original Voodoo Graphics, things might have been different for a
time.

You know what they found out with all of the hundreds of millions of dollars
they spent?  Dedicated hardware still does it faster and cheaper.  Period.
It's just like writing a custom routine to sort an array will pretty much
always be faster than using the generic qsort.  When you hand-tune for a
specific data set you will always get better performance.  This is not to
say that the generic implementation will not perform well or even acceptably
well, but only to say that it will never, ever, ever perform better.

-- 
Tell that to the Marines!




Re: [Mesa3d-dev] Re: [Dri-devel] Mesa MMX blend code finished

2002-04-10 Thread José Fonseca

On 2002.04.10 17:42 Brian Paul wrote:
 
> José,
>
> I've checked in the code after testing with Glean and the OpenGL conformance
> tests.
 

Great.

> Was I supposed to change something in the C code?  It passes the
> conformance tests as-is.
 

I was surprised that the C code passed the conformance tests, because with 
the signed arithmetic it doesn't give the same results as before. So I've 
made a small comparison of the several methods (test program attached):

// Nathan's method - unsigned 24bit arithmetic
// NOTE: this was the original Mesa code
t1 = p*a + q*(255 - a);
s1 = (t1 + (t1 << 8) + 256) >> 16;
 
// Nathan's method - signed 24bit arithmetic (less one multiply)
// NOTE: this is how I changed it and how it is now
t2 = (p - q)*a;
s2 = (t2 + (t2 << 8) + 256) >> 16;
s2 += q;
s2 &= 0xff;
 
// Blinn's method - unsigned 16bit arithmetic
// NOTE: is exact
t3 = p*a + q*(255-a) + 128;
s3 = (t3 + (t3 >> 8)) >> 8;
 
// Blinn's method - signed 16bit arithmetic (less one multiply)
// NOTE: is exact because the negative sign is considered
t4 = ((p - q)*a + (p > q ? 128 : -128)) & 0xffff;
s4 = (t4 + (t4 >> 8)) >> 8;
s4 += q;
s4 &= 0xff;

When one compares with the exact result

// exact result - rounded
s = (unsigned) (((double)p)*(((double)a)/255.0) + 
((double)q)*(1.0-((double)a)/255.0) + 0.5);

one gets:

1: 8164890 differences in 16777216
2: 8148697 differences in 16777216
3: 0 differences in 16777216
4: 0 differences in 16777216

So, in spite of the different results between 1 and 2, method 2 gives better 
results overall!!

What happens is that method 1 is designed to follow the truncated result and 
not the rounded one. If one compares with the truncated result

// truncated result
s = (unsigned) (((double)p)*(((double)a)/255.0) + 
((double)q)*(1.0-((double)a)/255.0));

one gets:

1: 15467 differences in 16777216
2: 31660 differences in 16777216
3: 8180357 differences in 16777216
4: 8180357 differences in 16777216

Notice that, from this point of view, method 2 is indeed worse, but this 
really doesn't matter because it is the wrong point of view.

This explains why the current C code passes the conformance tests.

At the moment the MMX code implements method 4, which is very fast. There 
is no point in implementing method 2, despite it being a little faster than 
method 4 (because of the simpler rounding), because it would require 24bit 
arithmetic instead of 16, so fewer numbers could be multiplied at the same 
time.

So, contrary to what I thought, there is no need to switch to method 1. 
When I implement the double blend trick I will have to use method 4, 
again for the same reasons as above.

But since the specs give some tolerance it would be nice to run the 
conformance tests with different settings in mmx_blend.S, especially the 
single multiply w/o rounding, which would give at least a 5% improvement (it 
will be a little more because it would free some registers, allowing some 
necessary constants to be kept there).

For that it is only necessary to change

#define GMBT_ROUNDOFF   0

leaving the rest as before

#define GMBT_ALPHA_PLUS_ONE 0
#define GMBT_GEOMETRIC_SERIES   1
#define GMBT_SIGNED_ARITHMETIC  1

Using the alpha+1 method and not using the geometric series would be 
even faster, but it is already marked in the C code as rejected by glean...

> Thanks for your work!
>
> -Brian
 

Regards,

José Fonseca


#include <stdio.h>
#include <stdlib.h>

int main()
{
	unsigned short p, q, a;
	unsigned c1 = 0, c2 = 0, c3 = 0, c4 = 0;
	
	for (p = 0; p <= 255; ++p)
	for (q = 0; q <= 255; ++q)
	for (a = 0; a <= 255; ++a)
	{
		unsigned s;
		unsigned s1, s2, s3, s4;
		unsigned t1, t2, t3, t4;

#if 1
		// exact result - rounded
		s = (unsigned) (((double)p)*(((double)a)/255.0) + ((double)q)*(1.0-((double)a)/255.0) + 0.5);
#else
		// truncated result
		s = (unsigned) (((double)p)*(((double)a)/255.0) + ((double)q)*(1.0-((double)a)/255.0));
#endif

		// Nathan's method - unsigned 24bit arithmetic
		t1 = p*a + q*(255 - a);
		s1 = (t1 + (t1 << 8) + 256) >> 16;
		
		// Nathan's method - signed 24bit arithmetic
		t2 = (p - q)*a;
		s2 = (t2 + (t2 << 8) + 256) >> 16;
		s2 += q;
		s2 &= 0xff;
		
		// Blinn's method - unsigned 16bit arithmetic
		// NOTE: is exact
		t3 = p*a + q*(255-a) + 128;
		s3 = (t3 + (t3 >> 8)) >> 8;
		
		// Blinn's method - signed 16bit arithmetic
		// NOTE: is exact because the negative sign is considered
		t4 = ((p - q)*a + (p > q ? 128 : -128)) & 0xffff;
		s4 = (t4 + (t4 >> 8)) >> 8;
		s4 += q;
		s4 &= 0xff;
		
		if(s1 != s) ++c1;
		if(s2 != s) ++c2;
		if(s3 != s) ++c3;
		if(s4 != s) ++c4;
		if (s1 != s || s2 != s || s3 != s || s4 != s)
		{
//			printf("%3ux%3ux%3u:\t(%3u)\t%3u\t%3u\t%3u\t%3u\n", p, a, q, s, s1, s2, 

Re: [Dri-devel] Mesa MMX blend code finished

2002-04-10 Thread José Fonseca

On 2002.04.10 11:41 Sergey V. Udaltsov wrote:
> > Nope. It's really a small drop in the ocean so there is no need to rush. I
> > hope Brian will integrate it into the Mesa trunk soon. That way there are fewer
> > places to fix eventual bugs.
> I see. Actually, AFAIU it would not exactly be a mach64 snapshot but
> rather a libGL snapshot (since it is about an indirect rendering speedup). Am
> I right?
 

Not really. It would speed up indirect rendering, and the mach64 driver when 
it falls back to software, which doesn't happen very often yet because we're 
not really striving for OpenGL conformance *yet*.

> Looking forward to hearing some news from the mach64 front.
>
> Regards,
>
> Sergey
 

José Fonseca




[Dri-devel] dri and gatos?

2002-04-10 Thread Steven Paul Lilly

Hi! Is it possible to have both the video features of my aiw radeon
working and the latest (tcl) 3d working at the same time? The two don't
seem to want to work at the same time for me.

Steven





[Dri-devel] Savage 4 thorough search results

2002-04-10 Thread José Fonseca

After sending a couple of emails to S3 Graphics and getting no reply whatsoever 
to date, I made a thorough search for any Savage 4 related documentation on 
the web. Although I didn't find any holy grail, I did find this link:

http://www.geocrawler.com/archives/3/4856/2001/1/0/5043174/

This SDK includes the header for the MeTaL API - a rather low level API 
for interfacing with the Savage chip based cards. This, and perhaps a 
little reverse engineering of some MeTaL drivers, should be sufficient 
to get us started when the time comes.

This SDK also has a good deal of examples, code and tools regarding S3TC. 
In fact, the same thread also has some links to information about 
S3TC:

http://www.geocrawler.com/archives/3/4856/2001/1/0/5032050/
http://www.geocrawler.com/archives/3/4856/2001/1/0/5032122/

and here some assorted information too:

http://www.ee.ed.ac.uk/~cdh/publications/dissertation.pdf

Perhaps there is something useful on the XFree86 private ftp too. If so, 
it's just a question of becoming an XFree86 developer. Could one of the 
XFree86 developers tell us if there is?


I confess that I find all this [trying to overcome the lack of 
documentation] a little sad. I would be much happier if there were some 
kind of feedback from the vendors, even if just a negative [and well-founded] 
answer..!

From what I've understood, it seems that S3's IP has passed through so many 
hands (Diamond, SonicBlue, VIA) that it has fallen into a kind of nowhere 
land [or at least nobody-cares land]. This is reflected, for instance, in 
the drivers for other OSes, where most of the card manufacturers (e.g., 
Creative Labs) have stopped supporting it, making it very difficult to find 
good quality drivers for the recent versions of Windows. This was 
something I had hoped to change in Linux but I really don't know...


José Fonseca




[Dri-devel] DRM causes video lock on virtual console switching

2002-04-10 Thread K. Petersen

I have come upon a reproducible lockup on my system when switching from a
console virtual terminal to X.  It can be produced as follows:

Begin X
Switch back to a virtual console
Switch back to X

This causes the X display to appear on the screen, with a strip along the
top deformed.  In this state, the X server is unresponsive to
ctrl-alt-backspace, and I am unable to switch to any other virtual
terminals.  However, I can still log in remotely.
After alt-sysrq-k, the display changes colors to what I believe might be a
reduced color palette, though I cannot be certain.  After this, I
am able to switch virtual terminals, log in as root, run mode3, and
have the display restored.
At this point, vc/7 has not been restored to an unused state.  Instead
there is any text I typed while the display was locked, followed by a
blinking cursor.
If I now attempt to restart X, the system will lock completely.  The
monitor will go into power saving mode, the keyboard is unresponsive, and
I am unable to log in remotely.

I hope this has been a reasonably thorough description of the problem.  Now
for my hardware and software configuration.

Hardware:
ATI Radeon 64DDR
AMD AthlonMP in SMP configuration
AMD 760MP chipset on a Tyan S2460 Motherboard

Software:
Linux kernel 2.4.18
Latest DRI cvs, with kernel module from the dri

This problem has occurred since X410, and kernel versions back at least
through 2.4.16, both with kernel DRM, and the DRM provided with the DRI.
I should also note that this only occurs when DRI is enabled.  There is no
problem switching between X and other virtual terminals when DRI is
disabled.

I believe this to be the correct forum for this issue, but if it is not,
then feel free to forward me to the linux kernel lists or whatever's more
appropriate.

I thank you in advance for your assistance,

--Kalen Petersen





Re: [Dri-devel] new Radeon DRI driver binaries not compatible with SuSE

2002-04-10 Thread Michael

On Tue, Apr 09, 2002 at 02:38:43PM +0200, Martin Spott wrote:
> Nah, not everything
> I see sudden lockups, mostly on displaying complex structures, i.e. in
> FlightGear when enabling panel display.
> This is obviously a graphics card lockup. The picture freezes but I can log
> into the machine via network. When I restart the X server also the rest of
> the machine gets locked up and my network connection freezes,

If you can ssh in, can you run the program under gdb from an ssh session?

export DISPLAY=:0
gdb `which fgfs`
run

and see if you get 'Error could not get dma buffer... exiting' when your
main machine locks up? If not, you might be able to ctrl-c and get a bt
at the point the main machine locks. Don't get the sack though ;)

I've just noticed this doing the above in rtcw. I haven't really looked
further yet, but it's possibly a subtle bug in the new maos stuff (or
the drm discard changes) losing buffers somewhere...I note a similar
theme where a lot needs to be happening on screen to trigger it which
did make me wonder if the buffers are all genuinely in use?
 
-- 
Michael.




[Dri-devel] Mach64 PCI support added to 2D driver

2002-04-10 Thread José Fonseca

I've just committed support for PCI-only Mach64 cards to the 2D driver. It 
should recognize a PCI card and allocate memory from the PCIGART. Support 
for PCI operation of an AGP card when agpgart isn't available is still 
missing. It isn't much, just a question of book-keeping, so it will 
be done shortly.

I'm still not sure if everything is in place for DRI on Mach64 PCI cards. 
I'm in the process of updating and recompiling the kernel to have agpgart 
as a module to be able to check it myself.

Regards,

José Fonseca




Re: [Dri-devel] dri and gatos?

2002-04-10 Thread Jens Owen

Steven Paul Lilly wrote:
 
> Hi! Is it possible to have both the video features of my aiw radeon
> working and the latest (tcl) 3d working at the same time? The two don't
> seem to want to work at the same time for me.

I don't believe the Gatos and TCL functionality have been merged yet. 
They have a special version of the Radeon DRM, as do we.  We haven't yet
pushed the TCL functionality to the main branch of the DRI.

Regards,
Jens

-- /\
 Jens Owen/  \/\ _
  [EMAIL PROTECTED]  /\ \ \   Steamboat Springs, Colorado
