Sounds like a cool effort. May I suggest that you use sysdep_display_params.effect for now to let the user choose the effect, and sysdep_display_params.widthscale and .heightscale to choose between scale2x and scale4x, for example. That way it will work the same way for end users as the software effects do.
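A dispatch along those lines might look like the following C sketch. This is purely illustrative: the struct here only mirrors the three fields named above (the real sysdep_display_params has more members), and the helper name and .fp file names are my inventions.

```c
#include <stddef.h>

/* Minimal sketch, assuming a struct shaped like the fields named above;
 * the real sysdep_display_params in xmame has many more members, and
 * the shader file names here are hypothetical. */
struct display_params {
    int effect;      /* chosen with -effect (or ctrl-page-up/down) */
    int widthscale;  /* horizontal scaling factor */
    int heightscale; /* vertical scaling factor */
};

/* Reuse the existing scaling options to pick a fragment program:
 * widthscale/heightscale 2 -> scale2x, 4 -> scale4x. */
const char *pick_shader(const struct display_params *p)
{
    if (p->widthscale == 2 && p->heightscale == 2) return "scale2x.fp";
    if (p->widthscale == 4 && p->heightscale == 4) return "scale4x.fp";
    return NULL; /* fall back to plain (unshaded) scaling */
}
```

The point is just that no new command-line switches would be needed; the user's existing -effect/-widthscale/-heightscale choices drive the shader selection.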

Be aware that the user may change the effect and/or scaling options on the fly (the effect with ctrl-page-up/down).

I also would like to support F.J.McCloud's call to keep it portable and to test it on multiple cards. I volunteer to test it on a Radeon 9250 (aka 9200 Pro) with the open-source drivers, although I wonder whether:
1) my card has a shader
2) the shader is supported by the open-source drivers.


---

Luckily the OpenGL code doesn't use the generic blit and effect code, so my effect rewrite and your OpenGL shader work shouldn't cause any conflicts.

I talked to Lawrence this weekend; I'll submit my effect rewrite to CVS soon. It will be missing hq2x, lq2x, 6tap and fakescan.

It will, however, have much cleaner code for the rest, fix a few bugs, and be much faster on DGA, and it no longer ties each effect to a fixed scaling factor; currently it can do:
- normal: 1x1 - 8x8
- scale2x: 2x2 - 3x6
- scan2: 1x2 - 4x2
- rgbscan: 1x3 - 6x3
- scan3: 1x3 - 6x3


All the scanline effects come in both a horizontal and a vertical version. The scaling factors given are for the horizontal version; for the vertical version the widthscale takes the role of the heightscale, and the heightscale supports -arbheight, so it can do basically anything. The new code automatically selects vertical scanlines for rotated games.
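A rough C sketch of that orientation handling, under my reading of the rules above; the function and field names are invented for illustration, and -arbheight handling is omitted.

```c
#include <stdbool.h>

/* Hedged sketch: for rotated games the vertical scanline variant is
 * selected, and the widthscale takes over the role of the heightscale.
 * Names are made up; -arbheight handling is left out. */
struct scale { int w, h; };

struct scale effect_scale(bool rotated, int widthscale, int heightscale)
{
    struct scale s;
    if (rotated) {            /* vertical scanlines: factors swap roles */
        s.w = heightscale;
        s.h = widthscale;
    } else {                  /* horizontal scanlines */
        s.w = widthscale;
        s.h = heightscale;
    }
    return s;
}
```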

You can choose what you want as usual: -effect selects the effect, -widthscale and -heightscale the scaling factor.

Regards,

Hans



Matthew Earl wrote:
Having recently acquired a suitably powerful graphics card, I thought
it would be fun to have MAME do the scale2x resize effect through a
fragment shader. It started off as an exercise to learn how to write
GPU programs, but I quickly realised implementing the scale2x
algorithm off the CPU could be a desirable feature in an emulator such
as MAME:

- Allows the CPU to spend most of its time actually emulating hardware
- Takes advantage of advanced graphics card functions which for the
most part go unused when scaling algorithms are performed on the CPU.
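For reference, the per-pixel rule that scale2x applies (and that a fragment program has to reproduce) can be written in plain C. The border clamping here is a simplification of whatever the real implementations do at the edges.

```c
#include <stdint.h>

/* Plain-C reference of the scale2x rule: each source pixel E expands to
 * a 2x2 block chosen from its up (B), left (D), right (F) and down (H)
 * neighbours.  Border pixels are clamped here for brevity. */
static uint32_t px(const uint32_t *src, int w, int h, int x, int y)
{
    if (x < 0) x = 0;
    if (x >= w) x = w - 1;
    if (y < 0) y = 0;
    if (y >= h) y = h - 1;
    return src[y * w + x];
}

/* dst must hold (2*w) * (2*h) pixels. */
void scale2x(const uint32_t *src, uint32_t *dst, int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            uint32_t B = px(src, w, h, x, y - 1);
            uint32_t D = px(src, w, h, x - 1, y);
            uint32_t E = px(src, w, h, x, y);
            uint32_t F = px(src, w, h, x + 1, y);
            uint32_t H = px(src, w, h, x, y + 1);
            uint32_t E0 = (D == B && B != F && D != H) ? D : E;
            uint32_t E1 = (B == F && B != D && F != H) ? F : E;
            uint32_t E2 = (D == H && D != B && H != F) ? D : E;
            uint32_t E3 = (H == F && D != H && B != F) ? F : E;
            uint32_t *row0 = dst + (2 * y) * (2 * w) + 2 * x;
            uint32_t *row1 = row0 + 2 * w;
            row0[0] = E0; row0[1] = E1;
            row1[0] = E2; row1[1] = E3;
        }
    }
}
```

Each output pixel depends only on a fixed 5-pixel neighbourhood of the input, which is exactly why the algorithm maps so naturally onto a fragment program.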

Fragment shaders could also be used to implement other effects such as
RGB effects (scanline, pixel triad, etc.) and possibly HQ/LQ resize
algorithms. RGB effects should be trivial to code very efficiently.
There are potential stumbling blocks with implementing the HQ
algorithm, because of the constraints of fragment programs and my
inexperience with writing them.
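As a concrete example of how simple such a scanline rule is, here is the per-pixel arithmetic in C; the GPU version would key off the fragment's texture coordinate instead of an explicit y, and the function name and darkening factor are my own choices.

```c
#include <stdint.h>

/* Sketch of the kind of per-pixel rule a scanline fragment shader would
 * apply: rows with odd y are darkened by halving each 8-bit channel of
 * an 0x00RRGGBB pixel.  Factor and layout are illustrative only. */
uint32_t scanline_pixel(uint32_t rgb, int y)
{
    if (y % 2 == 0)
        return rgb;                  /* even rows untouched */
    return (rgb >> 1) & 0x007F7F7F;  /* halve R, G and B */
}
```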

By applying multiple effects, or the same effect many times, the full
set of effects available in xmame could be implemented on the GPU with
only a handful of fragment programs. For example, scale4x with a
scanline effect could be implemented by applying the scale2x FP twice,
and then the scanline FP.
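A pass planner for that composition idea could be as small as this; it is purely illustrative, and the .fp file names are assumptions.

```c
/* Hypothetical pass planner for the multi-pass idea above: compose a
 * handful of fragment programs per frame.  scale4x plus scanlines
 * becomes scale2x, scale2x, scanline. */
int plan_passes(int scale, int scanlines, const char *out[], int max)
{
    int n = 0;
    for (int s = 1; s < scale && n < max; s *= 2)
        out[n++] = "scale2x.fp";      /* each pass doubles the size */
    if (scanlines && n < max)
        out[n++] = "scanline.fp";
    return n;
}
```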

My proof-of-concept implementation of the scale2x algorithm works on
the xmame-0.88 source on top of the existing opengl code. Modification
to the actual source code was minimal: all that was required was that
the fragment program be loaded with the rest of the opengl
initialisation, and that a single call to glProgramLocalParameter4fARB
be made at render time to pass parameters to the program. I have
uploaded the fragment program
here:

http://users.ox.ac.uk/~newc2303/scale2x.fp

To work correctly, regular opengl bilinear filtering must be disabled.

Functionally, the effect exactly mimics the CPU implementation of the
algorithm. As far as speed is concerned, I get 90fps compared with
60fps, on a run through half a level of mslug, on my 1.9GHz P4/Geforce
6800 GT (I would be interested to know if there are any more precise
benchmarking methods that people use).

To allow multiple shaders to be applied at the same time, the opengl
driver could be modified to render one pass per shader. The first pass
would be rendered as normal. The remaining passes would then take the
previous pass as a texture, and render it applied to a quad filling
the screen.

The scale2x algorithm on its own may not be particularly useful as a
GPU implementation; people with hardware capable of running fragment
programs most likely have CPUs capable of running a MAME emulation and
a scale2x resize. However, I suspect there will be a lot of people
whose computers fall into the category of not being able to run
scale4x or HQ algorithms on the CPU in real time, but have a graphics
card suitable to allow it to be run on the GPU (I myself fall into
this category).

I would be interested in hearing what other people think about this,
developers and end users alike. Would it be worth my time continuing and
attempting to implement the HQ algorithm and other effects? Should I
set about implementing my changes to the opengl driver so that it
neatly merges with the rest of the code? Implementing this algorithm
such that the fragment programs are used when in GL mode, and the
current effects code when in any other mode, seems sensible, but might
be difficult for me to do as I have little experience with the xmame
source. It could also be the case that my proposals would require
xmame to be restructured a little; is it the case that currently
opengl mode itself is implemented as a filter? This approach would
probably be incompatible with my changes.

Looking forward to hearing what people think,

Matthew Earl

_______________________________________________
Xmame mailing list
[EMAIL PROTECTED]
http://toybox.twisted.org.uk/mailman/listinfo/xmame

