Re: [Freedos-devel] VGA frame rates and mouse
Hi again,

at the risk of going too much into detail... A good compiler might support MMX and even newer stuff as long as you use smart data types or macros or whatever else to hint it about the possibility to use vector data.

The idea of conditional move and set (cmovcc and setcc in Assembly slang) is to avoid conditional jumps, as those tend to spoil hardware efficiency, even given the attempts of modern hardware to predict how often or even when a condition is likely to be true or false.

The idea of multi byte data types for byte pixels is, obviously, to do multiple pixels in one operation :-) So for example you could do

    for ... {
        long m = mask[i];               // e.g. 0xff00 at one moment
        screen[i] &= ~m;                // e.g. screen[i] is 00?? now
        screen[i] |= (sprite[i] & m);   // sprite replaces 00s
    }

I think generating the mask by unpacking a mask of one bit per pixel into one byte per pixel, using bitstring operations combined with setcc or maybe something with a shift loop, would already undo most of the speed gain. Of course it takes a lot of memory to have 1 Byte of mask for every 1 Byte pixel and not even having real alpha blending. But memory is easily available today.

I still think the PCX based approach with one color being reserved for "transparent" would have a very nice balance between memory use and speed and would not even need permanent buffers for sprite & mask :-)

Regards, Eric

--
Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Freedos-devel mailing list
Freedos-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-devel
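Eric's masked-write loop could be fleshed out roughly as follows. This is only a sketch: the function name blit_masked32 is made up for illustration, and it assumes a C99 compiler with <stdint.h>, one mask byte (0x00 or 0xFF) per pixel, and buffers that are a whole number of 32-bit words long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy sprite pixels to the screen through a
 * byte mask, four 8-bit pixels per 32-bit word. Each mask byte must
 * be 0x00 (keep the screen pixel) or 0xFF (take the sprite pixel). */
static void blit_masked32(uint32_t *screen, const uint32_t *sprite,
                          const uint32_t *mask, size_t nwords)
{
    size_t i;
    assert(screen != NULL && sprite != NULL && mask != NULL);
    for (i = 0; i < nwords; i++) {
        uint32_t m = mask[i];
        /* Clear the masked screen pixels, then drop the sprite in. */
        screen[i] = (screen[i] & ~m) | (sprite[i] & m);
    }
}
```

There is no branch per pixel here; the only jump is the loop itself, which is the point of keeping the mask the same width as the pixel data.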
Re: [Freedos-devel] VGA frame rates and mouse
Hi,

On Mon, Jul 23, 2018 at 1:15 PM, Bret Johnson wrote:
>
> Multiple issues. First, if you're going to do it in "pure C", you can't depend on anything like MMX.

MMX is deprecated in favor of "better" SSE2, though. Of course, that demands P4/AMD64, but most people have those by now. (I hate to be that guy, but the war is lost, nobody cares about anything less these days, and most aren't sympathetic to IA-32 anymore either, preferring AMD64.)

You can of course have fallbacks for both MMX and generic, whether separate files (modules) or #ifdef. So you can support both if you do the extra work. But it's (usually) only worth it for a (very) small number of routines.

PAQ8 became literally twice as fast by rewriting two functions for MMX, which barely made up a few hundred bytes once assembled. Even including all i386, i686, i786 versions of those two functions, plus detecting the cpu at runtime, didn't make the .EXE noticeably bigger. For me, that was an obvious "win", where everyone was supported instead of excluding others for vain speed reasons. (But it was fairly slow overall, I'll give you that. 7-Zip is a better compromise in speed and efficiency.)

Also, there are "intrinsics" (macros in headers?) that many compilers (e.g. Intel, GNU, MS) use to allow access to such instructions. Or you could just use some third-party library that abstracts it all away for you (GMP?).

> You're going to need to virtualize the multiple-byte-functions-at-the-same-time manually,
> taking advantage of CPU and data storage characteristics (little-endian, two's complement, etc.).
> That pretty much defeats the purpose of sticking with "pure C".

Pure/portable/strictly conformant C is only good for programs/utilities that people actually could benefit from using in different environments, e.g. 7-Zip. If you're targeting DOS or VGA specifically, then being portable is only good for maybe supporting other DOS compilers (for whatever minor benefits).
> What you're trying to avoid is (conditional) JMPing and multiplication/division, since they are costly
> in terms of speed, even though they will work just fine.

Division can sometimes be avoided by multiplying by the inverse. I don't know the math behind it, but many articles and people have talked about it before, so I assume you recognize what I mean here.

> You are probably also going to want to minimize the number of loops, since loops are also a type of JMP.
> But, in modern CPU's with caches and branch prediction and pipelining and similar enhancements,
> loops generally aren't that bad in terms of overall speed.

486s were usually the first ones that had very small internal caches on the cpu itself. The 486 was pipelined, unlike the 386, so yes it was faster (but it was very sensitive to code and data alignment). But the Pentium was the superscalar one (U and V pipes), which could be much faster with correct scheduling of instructions to pair properly (e.g. GCC 2.8.1). Out-of-order execution didn't come until 686, I believe (4-1-1 micro-ops?), and even that had to be tuned a certain way.

The term "blended" means your generic code provably works well enough for all target cpus (no obvious penalties). If that still isn't good enough, you have to determine the appropriate cpu and run specific code (for a very few select routines that matter, after profiling) via function pointers. Even GCC itself has supported "-march=native" since 4.2 or so.

> Any kind of speed or size optimization you do in C (whether it's the compiler doing the optimization or
> you doing it manually) again depends on specific CPU characteristics and features, and again defeats
> the purpose of using "pure C".

Just assume the compiler sucks (because it probably does). It doesn't mean they're all bad or that it doesn't have some virtues. But overall compilers don't know much (or assume wrongly). If you want speed, you have to do it yourself.
It won't be handed to you on a silver platter.

P.S. Avoid (186) ENTER/LEAVE, they are much slower on new machines than the equivalent 8086 code.
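The "multiplication of the inverse" trick mentioned in this message can be shown for one fixed divisor. The sketch below divides by 10 and is not taken from the thread; it assumes a compiler with a 64-bit integer type for the intermediate product (so not every 16-bit DOS compiler will take it as written). The constant 0xCCCCCCCD is 2^35 / 10 rounded up.

```c
#include <assert.h>
#include <stdint.h>

/* Divide an unsigned 32-bit value by 10 without a DIV instruction:
 * multiply by ceil(2^35 / 10) and keep the bits above bit 35. */
static uint32_t div10(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}

/* Spot checks against results worked out by hand. */
static void div10_demo(void)
{
    assert(div10(0u) == 0u);
    assert(div10(99u) == 9u);
    assert(div10(4294967295u) == 429496729u);
}
```

Optimizing compilers generally do this transformation themselves when the divisor is a compile-time constant, so writing it by hand mostly matters for older or weaker compilers.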
Re: [Freedos-devel] Starting FreeDOS 1.3
On Mon, 23 Jul 2018, Rugxulo wrote:

> I'm just saying, I suspect that Jim wants to (eventually?) mirror this to iBiblio, and he will of course want "full" sources. But I realize this isn't finalized yet, just a preview snapshot. It's sadly too easy to overlook full sources and dependencies, so most people ignore it, and that makes things harder for us. :-/

Not just that, but there's a lot of frequently overlooked quirks of the GPL, and that's one of them...

When I released FreeDOS ODIN many moons ago, I was helpfully informed that a link to the source wasn't actually good enough and I did have to mirror every. single. file. myself either as part of the package, or in the same folder with obvious links, to not be in violation of the GPL, and putting the project in compliance back then was a bit of a pain in the keester.

Best to do it from the beginning.

-uso.
Re: [Freedos-devel] VGA frame rates and mouse
On 2018-07-23 13:15, Bret Johnson wrote:
> If you think you can do it with subtraction and bit-shifting (without requiring MMX or something similar) please show it to us.

With multiplication:

    new_mask = color & alpha_mask;
    new_mask >>= 6;
    new_mask *= 0x7F;

Without multiplication:

    new_mask = color & alpha_mask;
    new_mask <<= 1;
    new_mask -= new_mask >> 7;

This gets the same result with only two shifts and a subtraction.

Happy Hacking,

David E. McMackins II
Supporting Member, Electronic Frontier Foundation (#2296972)
Associate Member, Free Software Foundation (#12889)

www.mcmackins.org www.delwink.com
www.eff.org www.gnu.org www.fsf.org
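The two variants can be checked against each other on four packed pixels. This is a sketch, assuming (as the shift by 6 suggests) that the alpha bit is bit 6 of every pixel byte; the names ALPHA_MASK, mask_mul and mask_shift_sub are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define ALPHA_MASK 0x40404040u  /* assumption: bit 6 of each byte is the alpha bit */

/* Variant with multiplication. */
static uint32_t mask_mul(uint32_t color)
{
    uint32_t m = color & ALPHA_MASK;
    m >>= 6;      /* 0x01 in every opaque byte */
    m *= 0x7F;    /* 0x7F in every opaque byte, no carry between bytes */
    return m;
}

/* Variant with two shifts and a subtraction. */
static uint32_t mask_shift_sub(uint32_t color)
{
    uint32_t m = color & ALPHA_MASK;
    m <<= 1;      /* 0x80 in every opaque byte */
    m -= m >> 7;  /* 0x80 - 0x01 = 0x7F, no borrow between bytes */
    return m;
}

/* Pixels 1, 3 and 4 opaque, pixel 2 transparent. */
static void check_masks(void)
{
    assert(mask_mul(0x40004040u) == 0x7F007F7Fu);
    assert(mask_shift_sub(0x40004040u) == 0x7F007F7Fu);
}
```

Each byte is either 0x80 - 0x01 or 0x00 - 0x00 in the subtraction, which is why the bytes never interfere with one another.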
Re: [Freedos-devel] VGA frame rates and mouse
Multiple issues. First, if you're going to do it in "pure C", you can't depend on anything like MMX. You're going to need to virtualize the multiple-byte-functions-at-the-same-time manually, taking advantage of CPU and data storage characteristics (little-endian, two's complement, etc.). That pretty much defeats the purpose of sticking with "pure C".

What you're trying to avoid is (conditional) JMPing and multiplication/division, since they are costly in terms of speed, even though they will work just fine. You are probably also going to want to minimize the number of loops, since loops are also a type of JMP. But, in modern CPUs with caches and branch prediction and pipelining and similar enhancements, loops generally aren't that bad in terms of overall speed.

Any kind of speed or size optimization you do in C (whether it's the compiler doing the optimization or you doing it manually) again depends on specific CPU characteristics and features, and again defeats the purpose of using "pure C".

If you think you can do it with subtraction and bit-shifting (without requiring MMX or something similar) please show it to us.
Re: [Freedos-devel] Starting FreeDOS 1.3
Hi,

On Tue, Jul 17, 2018 at 6:21 AM, TK Chia wrote:
>
> I have uploaded updated packages at
> https://github.com/tkchia/build-ia16/releases/tag/20180616-update-20180708 .

Okay, I've downloaded this now but haven't tried it yet.

Please don't feel pressure from me about this, but "build-ia16-20180616-update-20180708.zip" (35 kb) is only some small files, not the full sources. Obviously the rest is on Github. But what exactly do you need to rebuild this? (I probably won't try, just curious anyways.) Ubuntu (18.04?) host OS (and various dependencies)? Full GCC 6.x sources (etc)?

I also noticed that the binaries (.EXEs) seem forcibly 8.3 compatible, will that still work in raw DOS? I mean, I know it can and should, but did you test it? GCC usually demands LFNs (although DJGPP seems to work both ways, thankfully). I guess I need to try to do a temporary install and test it myself for us.

Okay, I do see "fetch.sh", which halfway shows what to do to rebuild. I'm just saying, I suspect that Jim wants to (eventually?) mirror this to iBiblio, and he will of course want "full" sources. But I realize this isn't finalized yet, just a preview snapshot. It's sadly too easy to overlook full sources and dependencies, so most people ignore it, and that makes things harder for us. :-/

Thanks for your efforts.
Re: [Freedos-devel] VGA frame rates and mouse
Hi,

On Mon, Jul 23, 2018 at 11:54 AM, David McMackins wrote:
>
> I have two oppositions to this. First, I'd like to be able to do this in
> pure C. Second, this appears to be a byte-level operation, but the whole
> point of doing this is to work on multiple bytes simultaneously.
>
> I think I might be able to do it with a subtraction and two bitshifts,
> though. Think that would be meaningfully faster?

"Pure C" doesn't exist. You're still relying on your environment (hardware, userland software, memory, APIs, OS, etc). Even "standard" C was intentionally meant to be both portable and unportable (if needed/desired). So you're allowed to shoot yourself in the foot for extra speed or features or whatever.

It's a noble goal to be as "strictly conformant" as possible, but it's not wrong to have non-portable routines, optional or mandatory. Lots of things aren't well-supported across platforms and compilers (e.g. bitfields).

Also, if you're targeting 16-bit, you're at the mercy of your compiler and which exact cpu you're running on. Like I've mentioned, the 8086 isn't as efficient as the 286, much less the 486 or Pentium. And I don't just mean clock speed, I mean that overall many things (internally) were implemented better/smarter/faster in later cpus. And most compilers of the era weren't very good or only targeted a small subset of those cpus.

Basically, your compiler probably sucks, so don't treat it like it knows everything. Don't expect compiled output to be optimal by default. So "faster" means nothing to a C compiler. You have to make it fast, work around its lacks, bugs, misfeatures, quirks, omissions, etc.

I'll admit that inline asm is annoyingly incompatible across compilers, and even external .ASM files take a lot of effort to work across compilers too. It's not really worth it unless you're desperate or extremely diligent (bored!). But (some minimal amount of) assembly is the only true way to get it faster.
Well, technically you need a good algorithm first (and must avoid any obvious pitfalls)! So assembly alone is no panacea.

Sorry, I know this post isn't directly helpful, just some simple advice.
Re: [Freedos-devel] VGA frame rates and mouse
Hi,

On Mon, Jul 23, 2018 at 5:58 AM, Eric Auer wrote:
>
> Hi! I am not sure whether I understand your method, so
> maybe you can explain it in more detail. Is the alpha
> mask 1 byte per pixel, either 00 or ff per pixel? The
> multiplication is costly.

Since when is MUL costly? Or only because you're doing it for every pixel (i.e. thousands of times)?

I know he's targeting 16-bit machines for fun, but indeed, most people will use 386 or newer cpus, where MUL (etc.) have an "early out" algorithm, so they don't take nearly as long as you'd think. It's still much faster than DIV, of course. The normal workaround for MUL, even further, is using the faster (386) LEA to do add/shift/mul in one instruction.

BTW, Bret mentions SHL, which is indeed a teeny bit slower on many processors, so you may wish to just use a simple ADD instead.

* http://web.archive.org/web/20150920114055fw_/http://dflund.se/~john_e/gems/gem0009.html

But it's not worth counting cycles, even for ancient machines, until you've finalized exactly what you're trying to do. Premature optimization is usually a waste of time (but a bit of forethought beforehand doesn't hurt). See Agner Fog's manuals (although for an actual 8088/8086, you might just want to email a guru like Jim Leonard).

> You can also use bit test
> and "set conditionally" (to 0 or 255) and "move
> conditionally" byte sized 386 operations, but then
> you are back to pixel wise processing. The good
> thing about conditional setting and moving is that
> you avoid conditional jumps which are always more
> time-consuming than a fixed calculation which can
> involve conditional setting and moving :-)

CMOVxx needs a 686 (PPro) or later. SETxx indeed needs a 386 or later, but you can halfway fake it on 8086 / 16-bit cpus. I don't know of a perfect example offhand, but even I've done it (barely). Basically, you combine boolean results into one and only jump when absolutely needed.
Or else you use a mask, "or" onto it in certain cases, then do your operation with that value (where false is a no-op). Something like that, it's hard to explain.

BTW, jumps aren't really slow except on 8086, so newer processors (e.g. 486) make it not worth worrying about (except maybe due to small cpu instruction cache or no branch prediction or slow cpu clock or such other problem).

Actually, I forgot that (barely documented) SALC is basically SBB AL,AL, which is similar to (386) SETC AL. So you're basically moving/extending into a register from a flag result of some operation then using that mask to do some further conditional bitwise operation.

* http://web.archive.org/web/20150920114042fw_/http://dflund.se/~john_e/gems/gem0013.html
* http://web.archive.org/web/20150920114042fw_/http://dflund.se/~john_e/gems/gem000f.html

I don't know if this explains it, but I gleaned this from some old Usenet posting:

    xchg ah,al
    cmp ah,10
    sbb bh,bh
    cmp al,10
    sbb bl,bl
    and bx,0707h
    add ax,'77'
    sub ax,bx

See what I mean? Here's another example I wrote myself (but it's a bit sloppy/confusing):

    mov cx,'az' ; check if lowercase alpha
    push cx
    call rangecheck
    ...
    ret
    int 20h
    ...
    rangecheck: ; in: (upper_limit shl 8) + lower_limit
    pop bp
    pop bx ; mov bx,[sp+2]
    push bp
    .check:
    ; int3
    cmp al,bl
    sbb ch,ch
    inc bh
    cmp al,bh
    cmc
    sbb bl,bl
    or bl,ch
    cmp bl,1 ; set CF if BL == 0
    cmc ; return NC if AL within valid range
    ret

But of course even CALL/RET is slow on 8086, too, but newer cpus make it not a problem. Again, I don't know if he really truly cares about every single old cpu. I only pretend to care (for fun, completeness, etc.) because I don't even have any 8086s or similar old cpus. (But I do heavily prefer backwards compatible software!) Even my old 486 is disconnected, probably broken. But it doesn't hurt to be careful and try to be compatible in software anyways (in theory).
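The first snippet above (the one from Usenet) turns two values 0..15 into hex digits by building a mask from the carry flag instead of jumping. A rough C rendering for a single value might look like this; hex_digit is a made-up name, and whether a given compiler actually emits SBB or SETcc rather than a jump for the comparison is not guaranteed.

```c
#include <assert.h>

/* Convert a value 0..15 to an uppercase hex digit without an if/else.
 * Mirrors CMP r,10 / SBB m,m / AND m,7 / ADD r,'7' / SUB r,m. */
static char hex_digit(unsigned v)
{
    unsigned below10;
    assert(v < 16u);
    /* All ones when v < 10 (carry set after CMP), otherwise zero. */
    below10 = 0u - (unsigned)(v < 10u);
    /* '7' + 10 is 'A'; digits need 7 less, so subtract 7 only for them. */
    return (char)(v + '7' - (below10 & 7u));
}
```

The same pattern (compare, turn the result into an all-ones or all-zero mask, AND it with the adjustment) is what the pixel masks elsewhere in this thread rely on.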
>> With 4 pixels loaded in a 32-bit register:
>>
>> AND the input pixels with the alpha mask
>> SHR this result so that the bit is in position 0
>> Multiply so that this bit is expanded to a full byte of 1s
>> AND the input and screen with this mask
>> OR the modified input onto the screen

I don't think he cares as much about 386, but it doesn't hurt to tell him anyways. In particular, it's fairly easy (even before CPUID) to detect cpu at runtime (see Eric's CPULEVEL tool) ... or at least let the user manually enable it via cmdline, if that isn't feasible.

So the optimal solution, if you're diligent enough, is to optimize very frequently used routines for both 8086 and 386 (or 686 or whatever). Dynamic cpu dispatch via function pointers (or whatever you want to call it).
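Dispatch through a function pointer, as suggested at the end of this message, might be arranged along these lines. Everything here is illustrative: the routine names are invented, the "tuned" routine is only a plain-C stand-in for hand-written 386 code, and cpu_level is a hypothetical number (0 for 8086 up to 3 for 386 and later) that would come from runtime detection or a command-line switch. It is not the interface of the CPULEVEL tool.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*fill_fn)(unsigned char *dst, unsigned char value, size_t n);

/* Baseline routine: one byte per iteration, fine for any cpu. */
static void fill_generic(unsigned char *dst, unsigned char value, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = value;
}

/* Stand-in for a tuned routine: four bytes per iteration,
 * then whatever is left over. */
static void fill_unrolled(unsigned char *dst, unsigned char value, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i] = value;
        dst[i + 1] = value;
        dst[i + 2] = value;
        dst[i + 3] = value;
    }
    for (; i < n; i++)
        dst[i] = value;
}

/* Pick the routine once at startup; callers only see the pointer. */
static fill_fn select_fill(int cpu_level)
{
    assert(cpu_level >= 0);
    return cpu_level >= 3 ? fill_unrolled : fill_generic;
}
```

The selection happens once, so the hot loop pays only for an indirect call, and the generic routine remains available as the fallback for every machine.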
Re: [Freedos-devel] VGA frame rates and mouse
Hi,

On Mon, Jul 23, 2018 at 5:27 AM, Eric Auer wrote:
>
> Hi, just a quick extra idea: You could read about
> the PCX file format for 8-bit colors and define one
> color to be "opaque".
>
> https://www.fileformat.info/format/pcx/egff.htm

This reminds me of Benjamin David Lunt's webpage, where I think he once described (RLE) PCX:

http://www.fysnet.net/#video

"Video/Graphics Programming - Info/source/etc. that has to do with the Video and Graphics Programming."

"PCX files explained (31 Oct 1999) - Describes the PCX files' header and how it stores graphics and attributes. Also explains Run-length encoding (RLE), which is used in PCX files."

http://www.fysnet.net/pcxfile.htm

Just FYI.

P.S. While very popular, I don't think (or at least can't remember) whether most modern web browsers (or even MS Paint) still support such an "old" format.
Re: [Freedos-devel] VGA frame rates and mouse
I have two objections to this. First, I'd like to be able to do this in pure C. Second, this appears to be a byte-level operation, but the whole point of doing this is to work on multiple bytes simultaneously.

I think I might be able to do it with a subtraction and two bitshifts, though. Think that would be meaningfully faster?

Happy Hacking,

David E. McMackins II

On 2018-07-23 11:44, Bret Johnson wrote:
> One way around this might be to use CBW, which essentially copies the high bit of AL into all the bits of AH. Using your example, if this is the value in AL:
>
>     01000000
>      ^ the alpha bit
>
> You can put a saturated mask in AH with two instructions:
>
>     SHL AL,1
>     CBW
>
> Or put a saturated mask in AH and leave AL unchanged with three instructions:
>
>     ROL AL,1
>     CBW
>     ROR AL,1
>
> Those instructions will work even on an 8086 CPU.
Re: [Freedos-devel] VGA frame rates and mouse
One way around this might be to use CBW, which essentially copies the high bit of AL into all the bits of AH. Using your example, if this is the value in AL:

    01000000
     ^ the alpha bit

You can put a saturated mask in AH with two instructions:

    SHL AL,1
    CBW

Or put a saturated mask in AH and leave AL unchanged with three instructions:

    ROL AL,1
    CBW
    ROR AL,1

Those instructions will work even on an 8086 CPU.
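For anyone who wants the same saturated mask from C rather than assembly, one possible sketch is below. It does not use CBW itself; it negates the alpha bit, which gives the same 0x00 or 0xFF result. The name alpha_to_mask and the choice of bit 6 as the alpha bit are assumptions taken from the example value above.

```c
#include <assert.h>
#include <stdint.h>

/* Return 0xFF if bit 6 (the assumed alpha bit) of the pixel is set,
 * otherwise 0x00, without a conditional jump in the source. */
static uint8_t alpha_to_mask(uint8_t pixel)
{
    /* (pixel >> 6) & 1 is 0 or 1; 0 minus that, cut to 8 bits,
     * is 0x00 or 0xFF. */
    return (uint8_t)(0u - (((unsigned)pixel >> 6) & 1u));
}

/* The example value from the message, plus a transparent pixel. */
static void alpha_to_mask_demo(void)
{
    assert(alpha_to_mask(0x40) == 0xFF);
    assert(alpha_to_mask(0x3F) == 0x00);
}
```

Whether a compiler turns this into SHL/CBW, NEG, or something else depends on the compiler and target, so it is worth checking the generated code before relying on it for speed.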
Re: [Freedos-devel] VGA frame rates and mouse
The multiplication does appear to be costly indeed. I'm now trying to find a way to get around it.

I'll explain the method using a 1-byte example, but the logic scales up:

Our dear color:

    01000000
     ^ the alpha bit

Alpha mask:

    01000000

The color on screen doesn't matter, as will be demonstrated.

First, we AND the color with the alpha mask, yielding the alpha mask itself:

    01000000 & 01000000 -> 01000000

Then we shift this bit to the right:

    01000000 >> 6 -> 00000001

Then multiply by 255 so that the whole byte is filled:

    00000001 * 255 -> 11111111

Next we invert the mask (this is a modification I considered after writing my previous email):

    ~11111111 -> 00000000

Now since this color is opaque, when we AND the screen with it, it sets this pixel on screen to 0 (if our color were transparent, the mask would all be 1s here, so ANDing it would have no effect on the screen).

    screen & 00000000 -> 00000000

Finally, with the assumption that transparent pixels in our input are all 0s, we can just OR the color onto the screen:

    01000000 | 00000000 -> 01000000

That's the method. Suggestions on improving it are greatly appreciated.

P.S. In response to the other criticism of cutting down my color depth, I'm really not concerned about that. In the interest of making my library support many different color formats, it's not feasible at this time to change it all up just to squeeze out some more colors.

Happy Hacking,

David E. McMackins II

On 07/23/2018 05:58 AM, Eric Auer wrote:
>
> Hi! I am not sure whether I understand your method, so
> maybe you can explain it in more detail. Is the alpha
> mask 1 byte per pixel, either 00 or ff per pixel? The
> multiplication is costly. You can also use bit test
> and "set conditionally" (to 0 or 255) and "move
> conditionally" byte sized 386 operations, but then
> you are back to pixel wise processing. The good
> thing about conditional setting and moving is that
> you avoid conditional jumps which are always more
> time-consuming than a fixed calculation which can
> involve conditional setting and moving :-)
>
> Eric
>
>> With 4 pixels loaded in a 32-bit register:
>>
>> AND the input pixels with the alpha mask
>> SHR this result so that the bit is in position 0
>> Multiply so that this bit is expanded to a full byte of 1s
>> AND the input and screen with this mask
>> OR the modified input onto the screen
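Scaled up to four pixels per 32-bit word, the steps David walks through might be written as below. This is a sketch of the method as described, not code from his library: blend4 and ALPHA_BITS are invented names, bit 6 is assumed to be the alpha bit, and transparent source pixels are assumed to be all zero bits, as the message states.

```c
#include <assert.h>
#include <stdint.h>

#define ALPHA_BITS 0x40404040u  /* assumption: bit 6 of each pixel byte means "opaque" */

/* Draw four packed 8-bit source pixels over four screen pixels. */
static uint32_t blend4(uint32_t screen, uint32_t color)
{
    uint32_t mask = color & ALPHA_BITS;  /* isolate the alpha bits      */
    mask >>= 6;                          /* 0x01 per opaque pixel       */
    mask *= 255u;                        /* 0xFF per opaque pixel       */
    screen &= ~mask;                     /* clear screen where opaque   */
    return screen | color;               /* transparent pixels are zero */
}

/* Pixels 1 and 3 opaque (0x40, 0x4B), pixels 2 and 4 transparent. */
static void blend4_demo(void)
{
    assert(blend4(0x11223344u, 0x40004B00u) == 0x40224B44u);
}
```

The multiplication cannot overflow a byte here, because each byte of the shifted mask is 0 or 1 and 1 * 255 still fits in eight bits.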
Re: [Freedos-devel] VGA frame rates and mouse
Hi! I am not sure whether I understand your method, so maybe you can explain it in more detail. Is the alpha mask 1 byte per pixel, either 00 or ff per pixel? The multiplication is costly.

You can also use bit test and "set conditionally" (to 0 or 255) and "move conditionally" byte sized 386 operations, but then you are back to pixel wise processing. The good thing about conditional setting and moving is that you avoid conditional jumps which are always more time-consuming than a fixed calculation which can involve conditional setting and moving :-)

Eric

> With 4 pixels loaded in a 32-bit register:
>
> AND the input pixels with the alpha mask
> SHR this result so that the bit is in position 0
> Multiply so that this bit is expanded to a full byte of 1s
> AND the input and screen with this mask
> OR the modified input onto the screen
Re: [Freedos-devel] VGA frame rates and mouse
I'll take a look at that.

An idea I came up with last night while I was asleep was a method that uses my current encoding.

With 4 pixels loaded in a 32-bit register:

AND the input pixels with the alpha mask
SHR this result so that the bit is in position 0
Multiply so that this bit is expanded to a full byte of 1s
AND the input and screen with this mask
OR the modified input onto the screen

I'm trying to think of how I may be able to skip something near the end, but this is pretty good so far I think.

Happy Hacking,

David E. McMackins II

On 07/23/2018 05:27 AM, Eric Auer wrote:
>
> Hi, just a quick extra idea: You could read about
> the PCX file format for 8-bit colors and define one
> color to be "opaque". Then you can store your image
> in PCX format in RAM and do run length coded BLOCKS
> of either overwriting or not overwriting pixels on
> screen, without needing per-pixel decisions at all!
>
> https://www.fileformat.info/format/pcx/egff.htm
>
> Note that PCX compression uses values 192 to 255
> for special codes, so it helps to optimize your
> palette to use mainly pixel colors 0 to 191...
>
> Regards, Eric
Re: [Freedos-devel] VGA frame rates and mouse
Hi, just a quick extra idea: You could read about the PCX file format for 8-bit colors and define one color to be "opaque". Then you can store your image in PCX format in RAM and do run length coded BLOCKS of either overwriting or not overwriting pixels on screen, without needing per-pixel decisions at all!

https://www.fileformat.info/format/pcx/egff.htm

Note that PCX compression uses values 192 to 255 for special codes, so it helps to optimize your palette to use mainly pixel colors 0 to 191...

Regards, Eric
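A run-based blit in the spirit of this suggestion could look roughly like the following. It is a sketch under assumptions: the function name and parameters are invented, the data is plain PCX-style RLE for one run of 8-bit pixels (a byte with the top two bits set carries a run length in its low six bits and is followed by the pixel value), and one palette index is reserved to mean "leave the screen alone".

```c
#include <assert.h>
#include <stddef.h>

/* Decode PCX-style RLE data onto `screen`, skipping every run whose
 * value is the reserved index `skip`. Returns the number of screen
 * pixels covered. */
static size_t blit_pcx_rle(unsigned char *screen, size_t npixels,
                           const unsigned char *rle, size_t rle_len,
                           unsigned char skip)
{
    size_t in = 0, out = 0;
    assert(screen != NULL && rle != NULL);
    while (in < rle_len && out < npixels) {
        unsigned char value = rle[in++];
        size_t count = 1;
        if ((value & 0xC0u) == 0xC0u) {   /* 192..255: run marker */
            count = value & 0x3Fu;        /* low six bits = length */
            if (in >= rle_len)
                break;                    /* truncated input */
            value = rle[in++];
        }
        if (count > npixels - out)
            count = npixels - out;        /* clip to the destination */
        if (value != skip) {
            size_t i;
            for (i = 0; i < count; i++)
                screen[out + i] = value;  /* overwrite a whole run */
        }
        out += count;                     /* skipped runs only advance */
    }
    return out;
}
```

The decision to draw or skip is made once per run rather than once per pixel, and it also shows why pixel values 192 to 255 are awkward: each one has to be stored as a run marker plus a value even when it occurs only once.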
Re: [Freedos-devel] VGA frame rates and mouse
Hi!

1) new pixels = (old pixels & mask1) | (new pixels & mask2)
Where mask1 and mask2 are the negated forms of each other.

>> That one only works for boolean masks, but it works on 386.

> By boolean mask, do you mean something like all 1s over the colors for
> opaque and all 0s for transparent? The way I've got it now is just 1 bit
> in the color byte represents opacity. It's either opaque or transparent,
> and then the remaining bits are for colors.

Actually one BYTE, either 0xff or 0x00, otherwise the mask thing does not work. I know that this wastes RAM, but it makes the computations faster to have the opacity boolean the same size as the pixel data which is 1 byte per pixel in your MCGA scenario.

You can also have a run length code for storing the opacity masks in a small way on disk and then dynamically decide whether you want run length based, per-4-pixel, per-2-pixel or per-pixel computations etc.

There are always many ways, depending on the scenario :-)

Eric