Re: [Xpert]Using MMX assembly (for video card drivers)

2002-01-11 Thread Ewald Snel

[...]

> BTW - does anyone know why the mga driver internally converts to 422
> format ? It seems to me that mga 400 and 450 chips do support 420
> planar format... (I saw some sample code using it, I can probably find
> it back if needed). I think XFree would benefit from using this
> feature instead of converting to nonplanar 422.

I also wrote a patch for this several months ago (even before XFree86-4.1.0).
If you're interested, I've uploaded it here :

http://rambo.its.tudelft.nl/~ewald/XFree86-4.0.99.3-mga-xv-planar-data.patch

Using planar format instead of converting to YUY2 makes DVD decoding about
13% faster on a PII-350. Unfortunately, the Matrox hardware cannot filter the
chrominance component in the vertical direction, so you can't have both at
the same time.

> Cheers,

bye,

ewald
___
Xpert mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/xpert



Re: [Xpert]Using MMX assembly (for video card drivers)

2002-01-11 Thread Ewald Snel

> At 11:26 AM 4/01/02 +0100, Ewald Snel wrote:

(sorry for the duplicate message, it was delayed for one week (see date))

[...]

> It would be interesting to see if the same could be achieved with 3DNow!
> instructions, as this would provide a welcome boost for anyone with an AMD
> K6-2 or K6-3 or any of the other 3DNow! capable CPU's. I'm sure there are

Using MMX will benefit any CPU capable of MMX instructions, including AMD K6, 
K6-2, K6-3 and Athlon/Duron processors. That's why I did not use SSE or 
3DNow!.

> also a number of other platforms that could use in-line assembly to do the
> same (eg: PPC/Altivec).
>
> Out of interest, how much in-line assembly code are you referring to?
> Anywhere some of us can get a look-see?

Here's an image of what it looks like ...
http://rambo.its.tudelft.nl/~ewald/xfree86-chrominance-filter.jpg

And here are some patches ...
http://rambo.its.tudelft.nl/~ewald/

bye,

ewald



[Xpert]Using MMX assembly (for video card drivers)

2002-01-10 Thread Ewald Snel

Hi,

Could I use MMX assembly for improving the mga video driver? I wrote a 
vertical chrominance filter (*) for the XVideo module using inline MMX 
assembly. This allows me to improve output quality without any speed penalty.

Of course, I'm using "#ifdef USE_MMX_ASM" and the original C code as an 
alternative for other CPU architectures. Runtime detection of MMX support is 
not included yet, but will be added if MMX is allowed.
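
A minimal sketch of how the compile-time guard and the planned runtime check could fit together. Only the USE_MMX_ASM macro comes from the message above; the names cpu_has_mmx, copy_c, copy_mmx and select_copy are hypothetical, and copy_mmx is a stand-in that simply calls the C code where the real inline MMX routine would go. The check reads CPUID leaf 1 through GCC's <cpuid.h> and tests EDX bit 23, which is the MMX feature flag.

```c
#include <stdint.h>
#include <stddef.h>

/* The MMX path is built only when requested and only for x86 with GCC/clang. */
#if defined(USE_MMX_ASM) && defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__))
#define HAVE_MMX_BUILD 1
#include <cpuid.h>
#else
#define HAVE_MMX_BUILD 0
#endif

typedef void (*copy_fn)(uint8_t *dst, const uint8_t *src, int n);

/* Portable C fallback, used on every other architecture. */
static void copy_c(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i];
}

#if HAVE_MMX_BUILD
/* Returns 1 if CPUID leaf 1 reports MMX (EDX bit 23), otherwise 0. */
static int cpu_has_mmx(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((edx >> 23) & 1u);
}

/* Hypothetical placeholder: the inline MMX version would go here. */
static void copy_mmx(uint8_t *dst, const uint8_t *src, int n)
{
    copy_c(dst, src, n);
}
#endif

/* Pick the implementation once, e.g. at driver initialisation. */
static copy_fn select_copy(void)
{
#if HAVE_MMX_BUILD
    if (cpu_has_mmx())
        return copy_mmx;
#endif
    return copy_c;
}
```

Calling select_copy() once and keeping the returned pointer keeps the CPUID check out of the per-frame path.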

Thanks in advance,

ewald

(*) This fixes red blockiness (2x2 pixels) for DVD/MPEG movies (attachment).


Re: [Xpert]Using MMX assembly (for video card drivers)

2002-01-04 Thread Ewald Snel

Hi,

[...]

> > Something like that, the filter uses 0.75x nearest chrominance sample
> > and 0.25x second nearest chrominance sample. This is more accurate as
> > it doesn't shift the chrominance signal by 1 pixel.
>
>   Please, please correct me if I'm wrong here.  In MPEG sampling, the
> chrominance sample is halfway between the two luminance samples on the
> same vertical scanline (per ISO/IEC 13818-2):

I think you're right, my interpolation looks like this :

o   o   (c=.75*c1 + .25*c0)
 c1
o   o   (c=.75*c1 + .25*c2)

o   o   (c=.75*c2 + .25*c1)
 c2
o   o   (c=.75*c2 + .25*c3)

[...]

>   So, are not the chroma samples above and below the same distance away?
> I thought this was the purpose of MPEG sampling, that is, it's
> reasonable to convert to 4:2:2 sampling by doubling the scanlines.

It's reasonable, but doubling the scanlines will make the image look a little 
blocky as both scanlines use the same chrominance values. That's why you 
should use filtering.
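
For illustration, a plain C sketch of the 0.75/0.25 vertical filter drawn above. This is not the code from the patch: the function name, the parameters, the rounding (+2 before the shift) and the edge handling (the first and last chroma rows are repeated) are all assumptions.

```c
#include <stdint.h>

/*
 * Upsample one 4:2:0 chroma plane of h2 rows to 2*h2 rows.
 * Each output row takes 3/4 of the nearest chroma row and 1/4 of the
 * second nearest one. Rows beyond the plane are clamped to the edge
 * row (an assumption, the thread does not say how edges are handled).
 */
static void chroma_upsample_v(const uint8_t *src, int src_pitch,
                              uint8_t *dst, int dst_pitch,
                              int w2, int h2)
{
    for (int y = 0; y < h2; y++) {
        const uint8_t *cur  = src + y * src_pitch;
        const uint8_t *prev = src + (y > 0 ? y - 1 : 0) * src_pitch;
        const uint8_t *next = src + (y < h2 - 1 ? y + 1 : y) * src_pitch;
        uint8_t *top = dst + (2 * y) * dst_pitch;   /* row above the sample */
        uint8_t *bot = top + dst_pitch;             /* row below the sample */

        for (int x = 0; x < w2; x++) {
            top[x] = (uint8_t)((3 * cur[x] + prev[x] + 2) >> 2);
            bot[x] = (uint8_t)((3 * cur[x] + next[x] + 2) >> 2);
        }
    }
}
```

With two chroma rows holding 0 and 100, the four output rows come out as 0, 25, 75 and 100, a smooth ramp instead of the 0, 0, 100, 100 that line doubling gives.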

>   Are you sure that maybe the images where you see that nasty chroma
> artifact aren't from when the DVD is using interlaced encoding?  In this
> case, each second chroma sample is from a different field, and you can
> get blocky errors because you don't correlate samples correctly.

The source was a non-interlaced MPEG-1 video file. The red blocks are very 
small for (high resolution) DVD movies, but they are still visible.

>   What do you mean by shifting the chroma by one pixel?

It's actually 0.5 pixel (my mistake :)) using the following filter :

o   o   (c=c1)
 c1
o   o   (c=.5*c1 + .5*c2)

o   o   (c=c2)
 c2
o   o   (c=.5*c2 + .5*c3)
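
The two sets of weights can be derived from the sample positions. The notation below is an assumption for illustration: luma rows sit at integer positions and, as in the MPEG sampling quoted above, chroma sample c_k sits halfway between luma rows 2k and 2k+1.

```latex
% Chroma sample c_k is at vertical position 2k + 1/2; neighbouring
% chroma samples are 2 luma rows apart.
% Luma row 2k+1 is 1/2 row from c_k and 3/2 rows from c_{k+1},
% so linear interpolation gives
\[
  c(2k+1) = \left(1 - \tfrac{1/2}{2}\right) c_k + \tfrac{1/2}{2}\, c_{k+1}
          = \tfrac{3}{4}\, c_k + \tfrac{1}{4}\, c_{k+1},
\]
% and by symmetry, for the luma row just above c_k,
\[
  c(2k) = \tfrac{3}{4}\, c_k + \tfrac{1}{4}\, c_{k-1}.
\]
% The simpler filter
\[
  c(2k) = c_k, \qquad c(2k+1) = \tfrac{1}{2}\, c_k + \tfrac{1}{2}\, c_{k+1}
\]
% is exact linear interpolation only if c_k sat at position 2k, so it
% displaces the chroma signal by 1/2 luma row.
```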

bye,

ewald



Re: [Xpert]Using MMX assembly (for video card drivers)

2002-01-04 Thread Ewald Snel

Hi,

> > I wrote a vertical chrominance filter (*) for the XVideo module using
> > inline MMX assembly. This allows me to improve output quality without
> > any speed penalty.
>
>   Do you mean for upsampling to 4:2:2 ?  How do you filter?  Do you
> average to create the new chroma line?

Something like that, the filter uses 0.75x nearest chrominance sample and 
0.25x second nearest chrominance sample. This is more accurate as it doesn't 
shift the chrominance signal by 1 pixel.

Here are the patches, the second one is for enabling the horizontal filtering 
in hardware:

http://rambo.its.tudelft.nl/~ewald/XFree86-4.1.99.4-mga-xv-mmx-chromafilter.patch
http://rambo.its.tudelft.nl/~ewald/XFree86-4.2.0-mga-xv-uvfilter.patch

These are not paired for Pentium MMX, but performance is already better than 
the C version (which compiles to slow "movzx" instructions). It's nearly 
optimal for AMD Athlon though (about 2 IPC using L1-cache).

bye,

ewald



[Xpert]Using MMX assembly (for video card drivers)

2002-01-04 Thread Ewald Snel

Hi,

Could I use MMX assembly for improving the mga video driver? I wrote a 
vertical chrominance filter (*) for the XVideo module using inline MMX 
assembly. This allows me to improve output quality without any speed penalty.

Of course, I'm using "#ifdef USE_MMX_ASM" and the original C code as an 
alternative for other CPU architectures. Runtime detection of MMX support is 
not included yet, but will be added if MMX is allowed.

Thanks in advance,

ewald

(*) This fixes red blockiness (2x2 pixels) for DVD/MPEG movies
http://rambo.its.tudelft.nl/~ewald/xfree86-chrominance-filter.jpg



[Xpert][patch] Matrox XVideo chrominance filtering (horizontal)

2001-12-23 Thread Ewald Snel

Hi,

For some reason, this hardware feature has been disabled in the previous 
release of XFree86 (4.1.0). I don't know if there was a good reason for this 
(don't think so), but videos look better with this bit on.

It will eliminate "red blockiness" in the horizontal direction. Vertical 
filtering should be done in software (patching MGACopyMungedData).

bye,

ewald

--- XFree86-4.2.0/xc/programs/Xserver/hw/xfree86/drivers/mga/mga_video.c.orig	Sun Dec 23 11:36:19 2001
+++ XFree86-4.2.0/xc/programs/Xserver/hw/xfree86/drivers/mga/mga_video.c	Sun Dec 23 11:36:29 2001
@@ -720,9 +720,9 @@
 OUTREG(MGAREG_BESA1ORG, offset);
 
 if(y1 & 0x0001)
-	OUTREG(MGAREG_BESCTL, 0x00040c41);
+	OUTREG(MGAREG_BESCTL, 0x00050c41);
 else 
-	OUTREG(MGAREG_BESCTL, 0x00040c01);
+	OUTREG(MGAREG_BESCTL, 0x00050c01);
  
 OUTREG(MGAREG_BESHCOORD, (dstBox->x1 << 16) | (dstBox->x2 - 1));
 OUTREG(MGAREG_BESVCOORD, (dstBox->y1 << 16) | (dstBox->y2 - 1));
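
To see exactly what the patch changes, the old and new register values can be compared bit by bit. This is a small sketch: the four constants are taken from the diff above, while the macro names and the changed_bits helper are made up for illustration and are not names from the driver or the Matrox documentation.

```c
#include <stdint.h>

/* BESCTL values from the diff; the macro names are hypothetical. */
#define BESCTL_OLD_ODD   0x00040c41u
#define BESCTL_OLD_EVEN  0x00040c01u
#define BESCTL_NEW_ODD   0x00050c41u
#define BESCTL_NEW_EVEN  0x00050c01u

/* Returns a mask of the bits that differ between two register values. */
static uint32_t changed_bits(uint32_t before, uint32_t after)
{
    return before ^ after;
}
```

Both changed_bits(BESCTL_OLD_ODD, BESCTL_NEW_ODD) and changed_bits(BESCTL_OLD_EVEN, BESCTL_NEW_EVEN) evaluate to 0x00010000, so the patch sets a single bit, bit 16, and leaves everything else alone. The odd and even variants differ only in bit 6 (0x40), which the existing code selects from the low bit of y1.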



Re: [Xpert]Tearing on overlay surfaces

2001-12-03 Thread Ewald Snel

Hi,

[...]

> > It's a little more than that because the driver is using 4:2:2
> > internally.  Copying the way it is doing you can't get much more than
> > 160 MB/sec and uses the CPU the whole time.
>
>   Hrm, I hope it doesn't just double each chroma scanline.  I'm in fear
> now.  Wish that was documented somewhere, I would have written a better
> chroma upsampler sooner.

A few months ago, I tried to add support for planar modes to the mga xvideo 
extension. It had some problems (noise) with downscaled DVD movies, but it 
works and is indeed faster than the "copy munged data" method.

I have attached the patch. You might need to make some changes to use it with 
the current CVS version though (only tried it with XFree86-4.1.0).

bye,

ewald

--- xc/programs/Xserver/hw/xfree86/drivers/mga/mga_video.c.mga-xv-planar-data	Fri Apr 27 20:59:53 2001
+++ xc/programs/Xserver/hw/xfree86/drivers/mga/mga_video.c	Fri Apr 27 21:16:41 2001
@@ -622,6 +622,42 @@
}
 }
 
+static void
+MGACopyPlanarData(
+   unsigned char *src1,
+   unsigned char *src2,
+   unsigned char *src3,
+   unsigned char *dst1,
+   unsigned char *dst2,
+   unsigned char *dst3,
+   int srcPitch,
+   int srcPitch2,
+   int dstPitch,
+   int h,
+   int w
+){
+int w2, dstPitch2;
+
+w2 = w >> 1;
+dstPitch2 = dstPitch >> 1;
+h >>= 1;
+
+while(h--) {
+	memcpy(dst1, src1, w);
+	src1 += srcPitch;
+	dst1 += dstPitch;
+	memcpy(dst1, src1, w);
+	src1 += srcPitch;
+	dst1 += dstPitch;
+	memcpy(dst2, src2, w2);
+	src2 += srcPitch2;
+	dst2 += dstPitch2;
+	memcpy(dst3, src3, w2);
+	src3 += srcPitch2;
+	dst3 += dstPitch2;
+}
+}
+
 
 static FBLinearPtr
 MGAAllocateMemory(
@@ -690,21 +726,44 @@
 hzoom = (pScrn->currentMode->Clock > 135000) ? 1 : 0;
 
 switch(id) {
+case FOURCC_YV12:
+case FOURCC_I420:
+	if(pMga->Chipset == PCI_CHIP_MGAG400)
+	OUTREG(MGAREG_BESGLOBCTL, 0x00a0 | (3 * hzoom) | (tmp << 16));
+	else
+	OUTREG(MGAREG_BESGLOBCTL, 0x0080 | (3 * hzoom) | (tmp << 16));
+	break;
 case FOURCC_UYVY:
 	OUTREG(MGAREG_BESGLOBCTL, 0x00c0 | (3 * hzoom) | (tmp << 16));
 	break;
 case FOURCC_YUY2:
-default:
 	OUTREG(MGAREG_BESGLOBCTL, 0x0080 | (3 * hzoom) | (tmp << 16));
 	break;
 }
 
 OUTREG(MGAREG_BESA1ORG, offset);
 
-if(y1 & 0x0001)
-	OUTREG(MGAREG_BESCTL, 0x00040c41);
-else 
-	OUTREG(MGAREG_BESCTL, 0x00040c01);
+if((id == FOURCC_UYVY) || (id == FOURCC_YUY2) ||
+   (pMga->Chipset != PCI_CHIP_MGAG400))
+{
+	if(y1 & 0x0001)
+	OUTREG(MGAREG_BESCTL, 0x00050c41);
+	else
+	OUTREG(MGAREG_BESCTL, 0x00050c01);
+} else {
+	int n = (pitch * height);
+	int corg = offset + n - (3 * (pitch >> 1) * (y1 >> 17));
+
+	if(y1 & 0x0001)
+	OUTREG(MGAREG_BESCTL, 0x00070c41);
+	else
+	OUTREG(MGAREG_BESCTL, 0x00070c01);
+
+	OUTREG(MGAREG_BESA1CORG, corg);
+	OUTREG(MGAREG_BESA1C3ORG, corg + (n >> 2));
+
+	pitch <<= 1;
+}
  
 OUTREG(MGAREG_BESHCOORD, (dstBox->x1 << 16) | (dstBox->x2 - 1));
 OUTREG(MGAREG_BESVCOORD, (dstBox->y1 << 16) | (dstBox->y2 - 1));
@@ -712,7 +771,7 @@
 OUTREG(MGAREG_BESHSRCST, x1 & 0x03fc);
 OUTREG(MGAREG_BESHSRCEND, (x2 - 0x0001) & 0x03fc);
 OUTREG(MGAREG_BESHSRCLST, (width - 1) << 16);
-   
+
 OUTREG(MGAREG_BESPITCH, pitch >> 1);
 
 OUTREG(MGAREG_BESV1WGHT, y1 & 0xfffc);
@@ -871,6 +930,10 @@
switch(id) {
case FOURCC_YV12:
case FOURCC_I420:
+	if(pMga->Chipset == PCI_CHIP_MGAG400) {
+	dstPitch = (width + 31) & ~31;
+	new_size = (((3 * dstPitch * height) >> 1) + bpp - 1) / bpp;
+	}
 	srcPitch = (width + 3) & ~3;
 	offset2 = srcPitch * height;
 	srcPitch2 = ((width >> 1) + 3) & ~3;
@@ -914,9 +977,23 @@
 	   offset3 = tmp;
 	}
 	nlines = ((((y2 + 0xffff) >> 16) + 1) & ~1) - top;
-	MGACopyMungedData(buf + (top * srcPitch) + (left >> 1), 
-			  buf + offset2, buf + offset3, dst_start,
-			  srcPitch, srcPitch2, dstPitch, nlines, npixels);
+	if(pMga->TexturedVideo || (pMga->Chipset != PCI_CHIP_MGAG400))
+	MGACopyMungedData(buf + (top * srcPitch) + (left >> 1),
+			  buf + offset2, buf + offset3, dst_start,
+			  srcPitch, srcPitch2, dstPitch, nlines,
+			  npixels);
+	else {
+	int corg, n;
+	left >>= 1;
+	n = (dstPitch * height);
+	corg = n - ((3 * top * dstPitch) >> 2) - (left >> 1);
+	dst_start -= left;
+	MGACopyPlanarData(buf + (top * srcPitch) + left,
+			  buf + offset2, buf + offset3,
+			  dst_start, dst_start + corg + (n >> 2),
+			  dst_start + corg, srcPitch, srcPitch2,
+			  dstPitch, nlines, npixels);
+	}
 	break;
 case FOURCC_UYVY:
 case FOURCC_YUY2:
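
The buffer arithmetic in the patch (dstPitch, n, n >> 2 and the 3/2 size factor) amounts to the following planar 4:2:0 layout when top and left are zero. This is a sketch for illustration only: the struct and function names are hypothetical, height is assumed to be even, and the two chroma planes are simply called first and second because which one holds U or V depends on the FOURCC.

```c
#include <stddef.h>

/* Hypothetical description of a planar 4:2:0 buffer. */
struct planar_layout {
    int    luma_pitch;      /* bytes per luma row, 32-byte aligned      */
    int    chroma_pitch;    /* bytes per chroma row, half of luma_pitch */
    size_t chroma1_offset;  /* first chroma plane, right after the luma */
    size_t chroma2_offset;  /* second chroma plane                      */
    size_t total_bytes;     /* luma plus both chroma planes             */
};

static struct planar_layout planar420_layout(int width, int height)
{
    struct planar_layout l;
    size_t n;

    l.luma_pitch     = (width + 31) & ~31;    /* as dstPitch in the patch */
    l.chroma_pitch   = l.luma_pitch >> 1;     /* as dstPitch2             */
    n                = (size_t)l.luma_pitch * (size_t)height;
    l.chroma1_offset = n;                     /* corg with top = left = 0 */
    l.chroma2_offset = n + (n >> 2);          /* corg + (n >> 2)          */
    l.total_bytes    = (3 * n) >> 1;          /* 1.5 bytes per pixel      */
    return l;
}
```

For a 720x480 frame this gives a luma pitch of 736 bytes, chroma planes at offsets 353280 and 441600, and 529920 bytes in total.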



Re: [Xpert]Wheel mouse does not work with USB mouse when using input core driver

2001-11-25 Thread Ewald Snel

Hi.

> Hi,
>
> I'm using a Logitech MouseManPlus USB (also works when using a USB-PS/2
> adapter).
> Current setup: Linux kernel 2.4.15-pre5, XFree86 from CVS (Nov. 24th).
>
> Bug: Mouse wheel does not work with xterm/mozilla/any X app when using
> USB mouse and Input core driver whereas it works with the PS/2 adapter.

I have exactly the same mouse and almost the same setup, but I have the 
following lines in XF86Config (slightly different from your version) :

Section "InputDevice"
    Identifier  "Mouse1"
    Driver      "mouse"
    Option      "Protocol" "ImPS/2"
    Option      "Device" "/dev/input/mice"
    Option      "ZAxisMapping" "4 5"
    Option      "Emulate3Buttons" "no"
    Option      "SampleRate" "200"
EndSection

I hope this helps.
bye,

ewald