The multiplication does appear to be costly indeed. I'm now trying to find a way to get around it. I'll explain the method using a 1-byte example, but the logic scales up:

Our dear color: 01011110
                 ^ the alpha bit
Alpha mask:     01000000

The color on screen doesn't matter, as will be demonstrated.

First, we AND the color with the alpha mask, yielding the alpha mask itself:

01011110 & 01000000 -> 01000000

Then we shift this bit to the right:

01000000 >> 6 -> 00000001

Then multiply by 255 so that the whole byte is filled:

00000001 * 255 -> 11111111

Next we invert the mask (this is a modification I considered after writing my previous email):

~11111111 -> 00000000

Now since this color is opaque, when we AND the screen with it, it sets this pixel on screen to 0 (if our color were transparent, the mask would all be 1s here, so ANDing it would have no effect on the screen).

screen & 00000000 -> 00000000

Finally, with the assumption that transparent pixels in our input are all 0s, we can just OR the color onto the screen:

01011110 | 00000000 -> 01011110

That's the method. Suggestions on improving it are greatly appreciated.

P.S. In response to the other criticism of cutting down my color depth, I'm really not concerned about that. In the interest of making my library support many different color formats, it's not feasible at this time to change it all up just to squeeze out some more colors.

Happy Hacking,

David E. McMackins II
Supporting Member, Electronic Frontier Foundation (#2296972)
Associate Member, Free Software Foundation (#12889)

www.mcmackins.org
www.delwink.com
www.eff.org
www.gnu.org
www.fsf.org

On 07/23/2018 05:58 AM, Eric Auer wrote:
>
> Hi! I am not sure whether I understand your method, so
> maybe you can explain it in more detail. Is the alpha
> mask 1 byte per pixel, either 00 or ff per pixel? The
> multiplication is costly. You can also use bit test
> and "set conditionally" (to 0 or 255) and "move
> conditionally" byte sized 386 operations, but then
> you are back to pixel wise processing.
> The good thing about conditional setting and moving is that
> you avoid conditional jumps which are always more
> time-consuming than a fixed calculation which can
> involve conditional setting and moving :-)
>
> Eric
>
>> With 4 pixels loaded in a 32-bit register:
>>
>> AND the input pixels with the alpha mask
>> SHR this result so that the bit is in position 0
>> Multiply so that this bit is expanded to a full byte of 1s
>> AND the input and screen with this mask
>> OR the modified input onto the screen
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Freedos-devel mailing list
> Freedos-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/freedos-devel
>
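
For reference, here is a rough C sketch of the 1-byte walkthrough and of the 4-pixels-per-32-bit-register outline quoted above. The function names and the ALPHA_BIT/ALPHA_SHIFT constants are made up for illustration (the alpha bit is assumed to sit at bit 6 of every byte, as in the example), and it relies on the same assumption that transparent input pixels are all 0s. The last function is not part of the method as described: it is one possible way to get the same mask with a shift and a subtract instead of the multiply, since x * 255 equals (x << 8) - x.

```c
#include <stdint.h>

/* Hypothetical constants: alpha bit at bit 6, as in the 1-byte example. */
#define ALPHA_BIT    0x40u
#define ALPHA_BITS4  0x40404040u  /* the same bit in each of 4 packed pixels */
#define ALPHA_SHIFT  6

/* One pixel, following the steps in the walkthrough. */
uint8_t blend_pixel(uint8_t color, uint8_t screen)
{
    uint8_t bit  = (uint8_t)((color & ALPHA_BIT) >> ALPHA_SHIFT); /* 0 or 1 */
    uint8_t mask = (uint8_t)~(bit * 255u); /* 0x00 if opaque, 0xFF if transparent */
    return (uint8_t)((screen & mask) | color);
}

/* Four packed pixels at once, still using the multiply. */
uint32_t blend4(uint32_t colors, uint32_t screen)
{
    uint32_t bits = (colors & ALPHA_BITS4) >> ALPHA_SHIFT; /* 0x01 per opaque byte */
    uint32_t mask = ~(bits * 255u);  /* each byte becomes 0x00 or 0xFF, no carries */
    return (screen & mask) | colors;
}

/* Assumed alternative (not from the thread): bits * 255 == (bits << 8) - bits,
 * computed modulo 2^32, so the multiply becomes a shift and a subtract. */
uint32_t blend4_nomul(uint32_t colors, uint32_t screen)
{
    uint32_t bits = (colors & ALPHA_BITS4) >> ALPHA_SHIFT;
    uint32_t mask = ~((bits << 8) - bits);
    return (screen & mask) | colors;
}
```

With the example color, blend_pixel(0x5E, s) gives 0x5E for any screen byte s, and blend_pixel(0x00, s) leaves s untouched. Whether the shift-and-subtract form is actually faster than the multiply on a given CPU would need measuring.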