Be aware that the user may change the effect and/or scaling options on the fly (the effect with ctrl-page-up/down).
I would also like to support F.J.McCloud's call to keep it portable and test it on multiple cards. I volunteer to test it on a Radeon 9250 (aka 9200 Pro) with the open-source drivers, although I wonder whether:
1) my card has a shader
2) the shader is supported by the opensource drivers.
---
Luckily the opengl code doesn't use the generic blit and effect code, so my effect rewrite and your opengl shader work shouldn't cause any conflicts.
I talked to Lawrence this weekend, and I'll submit my effect rewrite to CVS soon. It will be missing hq2x, lq2x, 6tap and fakescan.
It will, however, have much cleaner code for the rest, fix a few bugs, be much faster on DGA, and no longer tie each effect to a single fixed scaling factor. Currently it can do:
-normal 1x1 - 8x8
-scale2x 2x2 - 3x6
-scan2 1x2 - 4x2
-rgbscan 1x3 - 6x3
-scan3 1x3 - 6x3
All the scanline effects come in both a horizontal and a vertical version. The scaling factors given are for the horizontal version; for the vertical version the widthscale is the heightscale, and the heightscale supports -arbheight, so it can do basically anything. The new code automatically selects vertical scanlines for rotated games.
You can choose what you want as usual: -effect selects the effect, and -widthscale and -heightscale select the scaling factor.
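To make the ranges above concrete, here is a rough sketch of a lookup that checks whether a requested scale fits an effect. It is not the xmame code: the names SCALE_RANGES, SCANLINE_EFFECTS and scale_supported are made up for illustration, it assumes each entry in the list reads as minimum WxH to maximum WxH, it models the vertical variants by simply swapping the two factors, and it ignores -arbheight entirely.

```python
# Ranges copied from the list above, read as ((min_w, min_h), (max_w, max_h)).
# That reading is an assumption, not something taken from the xmame source.
SCALE_RANGES = {
    "normal":  ((1, 1), (8, 8)),
    "scale2x": ((2, 2), (3, 6)),
    "scan2":   ((1, 2), (4, 2)),
    "rgbscan": ((1, 3), (6, 3)),
    "scan3":   ((1, 3), (6, 3)),
}

# Effects that exist in both a horizontal and a vertical version.
SCANLINE_EFFECTS = {"scan2", "rgbscan", "scan3"}


def scale_supported(effect, widthscale, heightscale, vertical=False):
    """Return True if the requested scale lies inside the effect's range.

    Hypothetical helper: for a vertical scanline effect the two factors
    are swapped before the check. -arbheight is not modelled.
    """
    (min_w, min_h), (max_w, max_h) = SCALE_RANGES[effect]
    if vertical and effect in SCANLINE_EFFECTS:
        widthscale, heightscale = heightscale, widthscale
    return (min_w <= widthscale <= max_w) and (min_h <= heightscale <= max_h)
```

With this reading, scale_supported("scan2", 3, 2) holds for the horizontal version, while the vertical version accepts the swapped request scale_supported("scan2", 2, 3, vertical=True).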
Regards,
Hans
Matthew Earl wrote:
Having recently acquired a suitably powerful graphics card, I thought it would be fun to have MAME do the scale2x resize effect through a fragment shader. It started off as an exercise to learn how to write GPU programs, but I quickly realised implementing the scale2x algorithm off the CPU could be a desirable feature in an emulator such as MAME:
- Allows the CPU to spend most of its time actually emulating hardware
- Takes advantage of advanced graphics cards functions which for the most part are unused when scaling algorithms are performed on CPU.
Fragment shaders could also be used to implement other effects such as rgb effects (scanline, pixel triad, etc) and possibly HQ/LQ resize algorithms. RGB effects should be trivial to code very efficiently. There are potential stumbling blocks with implementing the HQ algorithm, because of the constraints of fragment programs and my inexperience with writing them.
By applying multiple effects, or the same effect many times, the full set of effects available in xmame could be implemented on the GPU with only a handful of fragment programs. For example, scale4x with a scanline effect could be implemented by applying the scale2x FP twice, and then the scanline FP.
My proof-of-concept implementation of the scale2x algorithm works on the xmame-0.88 source on top of the existing opengl code. Modification to the actual source code was minimal; all that was required was the fragment program be loaded with the rest of opengl initialisation, and a single call to glProgramLocalParameter4fARB made at render time, to pass parameters to the program. I have uploaded the fragment program here:
http://users.ox.ac.uk/~newc2303/scale2x.fp
To work correctly, regular opengl bilinear filtering must be disabled.
Functionally, the effect exactly mimics the CPU implementation of the algorithm. As far as speed is concerned, I get 90fps compared with 60fps, on a run through half a level of mslug, on my 1.9GHz P4/Geforce 6800 GT (I would be interested to know if there are any more precise benchmarking methods that people use).
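For readers who do not know the algorithm, the per-pixel rule that both the CPU and GPU versions compute can be sketched in a few lines of Python. This is a plain reference sketch of the published scale2x rule, not the fragment program linked above and not the xmame CPU code. The function name scale2x, the list-of-rows image format and the clamping of neighbours at the image border are assumptions made for the example.

```python
def scale2x(src):
    """Return a copy of src enlarged 2x with the scale2x rule.

    src is a non-empty list of equal-length rows of pixel values that can
    be compared with == (for example palette indices or RGB tuples).
    """
    h = len(src)
    w = len(src[0])
    dst = [[None] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            e = src[y][x]
            # Four direct neighbours; coordinates are clamped at the
            # border (an assumption, other border policies are possible).
            b = src[max(y - 1, 0)][x]        # above
            d = src[y][max(x - 1, 0)]        # left
            f = src[y][min(x + 1, w - 1)]    # right
            n = src[min(y + 1, h - 1)][x]    # below
            if b != n and d != f:
                e0 = d if d == b else e      # top-left output pixel
                e1 = f if b == f else e      # top-right
                e2 = d if d == n else e      # bottom-left
                e3 = f if n == f else e      # bottom-right
            else:
                e0 = e1 = e2 = e3 = e
            dst[2 * y][2 * x] = e0
            dst[2 * y][2 * x + 1] = e1
            dst[2 * y + 1][2 * x] = e2
            dst[2 * y + 1][2 * x + 1] = e3
    return dst
```

Each output pixel depends only on a fixed set of five source pixels, which is why the rule maps naturally onto a fragment program that samples the source texture at neighbouring texels.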
To allow multiple shaders to be applied at the same time, the opengl driver could be modified to render one pass per shader. The first pass would be rendered as normal. The remaining passes would then take the previous pass as a texture, and render it applied to a quad filling the screen.
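The pass-per-shader idea can be modelled on the CPU as a simple chain in which each pass takes the previous pass's output as its input, which is the role the intermediate texture would play in the opengl driver. The sketch below is only a model of that data flow. The names run_passes, double_pass and scanline_pass are hypothetical, double_pass is plain pixel doubling standing in for a real resize shader, and scanline_pass assumes a scanline effect that halves the brightness of every odd row of greyscale values.

```python
def run_passes(frame, passes):
    """Apply each pass in order, feeding its output to the next pass."""
    for render_pass in passes:
        frame = render_pass(frame)
    return frame


def double_pass(frame):
    """Stand-in for a 2x resize pass: plain pixel doubling, not scale2x."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in (0, 1)]  # repeat each pixel twice
        out.append(wide)
        out.append(list(wide))                   # repeat each row twice
    return out


def scanline_pass(frame):
    """Hypothetical scanline effect: halve the value of every odd row."""
    return [
        [p // 2 for p in row] if y % 2 else list(row)
        for y, row in enumerate(frame)
    ]


# Two resize passes followed by a scanline pass, mirroring the
# "2x pass twice, then scanline" example from the message.
result = run_passes([[200]], [double_pass, double_pass, scanline_pass])
```

In the real driver every pass after the first would render a screen-filling quad textured with the previous result, so the cost of a chain grows with the number of passes and the size of the intermediate images.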
The scale2x algorithm on its own may not be particularly useful as a GPU implementation; people with hardware capable of running fragment programs most likely have CPUs capable of running a MAME emulation and a scale2x resize. However, I suspect there will be a lot of people whose computers fall into the category of not being able to run scale4x or HQ algorithms on the CPU in real time, but have a graphics card suitable to allow it to be run on the GPU (I myself fall into this category).
I would be interested in hearing what other people think about this: Developers and end users. Would it be worth my time continuing and attempting to implement the HQ algorithm and other effects? Should I set about implementing my changes to the opengl driver so that it neatly merges with the rest of the code? Implementing this algorithm such that the fragment programs are used when in GL mode, and the current effects code when in any other mode, seems sensible, but might be difficult for me to do as I have little experience with the xmame source. It could also be the case that my propositions would require xmame to be restructured a little; is it the case that currently opengl mode itself is implemented as a filter? This approach would probably be incompatible with my changes.
Looking forward to hearing what people think,
Matthew Earl
_______________________________________________ Xmame mailing list [EMAIL PROTECTED] http://toybox.twisted.org.uk/mailman/listinfo/xmame