Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Eric Auer


Hi again, at the risk of going too much into detail...

A good compiler might support MMX and even newer stuff
as long as you use smart data types or macros or whatever
else to hint it about the possibility to use vector data.

The idea of conditional move and set (cmovcc and setcc
in assembly slang) is to avoid conditional jumps, as
those tend to spoil hardware efficiency, even given
the attempts of modern hardware to predict how often
or even when a condition is likely to be true or false.

The idea of multi byte data types for byte pixels is,
obviously, to do multiple pixels in one operation :-)

So for example you could do

for ... {
long m = mask[i]; // e.g. 0xff00 at one moment
screen[i] &= ~m; // e.g. screen[i] is 00?? now
screen[i] |= (sprite[i] & m); // sprite replaces 00s
}
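
A minimal, runnable sketch of the loop above, assuming a C99 compiler with
<stdint.h>, four byte-sized pixels packed per 32-bit word, and mask bytes
that are always 0xFF or 0x00. The name blit_masked and its parameters are
made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Blend n 32-bit words (4 byte-sized pixels each) of sprite onto screen.
   Each mask byte must be 0xFF (take the sprite pixel) or 0x00 (keep the
   screen pixel); any other value would mix bits of both. */
void blit_masked(uint32_t *screen, const uint32_t *sprite,
                 const uint32_t *mask, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        uint32_t m = mask[i];
        /* Clear the masked bytes on screen, then drop the sprite bytes in. */
        screen[i] = (screen[i] & ~m) | (sprite[i] & m);
    }
}
```

With a mask word of 0x0000FF00, only the second-lowest pixel of that word
is replaced; the other three screen pixels are left alone.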

I think generating the mask by unpacking a mask of one
bit per pixel into one byte per pixel, using bitstring
operations combined with setcc or maybe something with
a shift loop, would already undo most of the speed gain.

Of course it takes a lot of memory to have 1 Byte of
mask for every 1 Byte pixel and not even having real
alpha blending. But memory is easily available today.

I still think the PCX based approach with one color
being reserved for "transparent" would have a very
nice balance between memory use and speed and would
not even need permanent buffers for sprite & mask :-)

Regards, Eric


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Freedos-devel mailing list
Freedos-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-devel


Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Rugxulo
Hi,

On Mon, Jul 23, 2018 at 1:15 PM, Bret Johnson  wrote:
>
> Multiple issues.  First, if you're going to do it in "pure C", you can't
depend on anything like MMX.

MMX is deprecated in favor of "better" SSE2, though. Of course, that demands
P4/AMD64, but most people have those by now. (I hate to be that guy, but
the war is lost, nobody cares about anything less these days, and most
aren't sympathetic to IA-32 anymore either, preferring AMD64.)

You can of course have fallbacks for both MMX and generic, whether separate
files (modules) or #ifdef. So you can support both if you do the extra
work. But it's only (usually) worth it for a (very) small number of
routines.

PAQ8 became literally twice as fast after two functions were rewritten for
MMX, which added barely a few hundred bytes once assembled. Even including all
i386, i686, i786 versions of those two functions, plus detecting the cpu at
runtime, didn't noticeably increase the .EXE size. For me, that was an obvious
"win", where everyone was supported instead of excluding others for vain
speed reasons. (But it was fairly slow overall, I'll give you that. 7-Zip
is a better compromise in speed and efficiency.)

Also, there are "intrinsics" (macros in headers?) that many compilers (e.g.
Intel, GNU, MS) use to allow access to such instructions. Or you could just
use some third-party library that abstracts it all away for you (GMP?).

> You're going to need to virtualize the
multiple-byte-functions-at-the-same-time manually,
> taking advantage of CPU and data storage characteristics (little-endian,
two's complement, etc.).
> That pretty much defeats the purpose of sticking with "pure C".

Pure/portable/strictly conformant C is only good for programs/utilities
that people actually could benefit from using in different environments,
e.g. 7-Zip. If you're targeting DOS or VGA specifically, then being
portable is only good for maybe supporting other DOS compilers (for
whatever minor benefits).

> What you're trying to avoid is (conditional) JMPing and
multiplication/division, since they are costly
> in terms of speed, even though they will work just fine.

Division can sometimes be avoided by doing multiplication of the inverse. I
don't know the math behind it, but many articles and people have talked
about it before, so I assume you recognize what I mean here.
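
As a small illustration of that trick (a sketch only, assuming a C99
compiler with a 64-bit integer type, which many 16-bit DOS compilers lack):
dividing a 32-bit unsigned value by 10 can be done with one multiplication
by a precomputed scaled reciprocal and one shift. The function name div10
is made up here.

```c
#include <stdint.h>

/* x / 10 for any 32-bit unsigned x, using a multiply and a shift.
   0xCCCCCCCD is 2^35 / 10 rounded up; the 64-bit product keeps the
   intermediate result from overflowing. */
uint32_t div10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

Optimizing compilers often perform this transformation on their own when
the divisor is a compile-time constant, so it mostly matters for
hand-written assembly or very simple compilers.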

> You are probably also going to want to minimize the number of loops,
since loops are also a type of JMP.
> But, in modern CPU's with caches and branch prediction and pipelining and
similar enhancements,
> loops generally aren't that bad in terms of overall speed.

486s were usually the first ones that had very small internal caches on the
cpu itself. The 486 was pipelined, unlike the 386, so yes it was faster
(but it was very sensitive to code and data alignment). But the Pentium was
the superscalar one (U and V pipes), which could be much faster with
correct scheduling of instructions to pair properly (e.g. GCC 2.8.1).
Out-of-order execution didn't come until 686, I believe (4-1-1 micro-ops?),
and even that had to be tuned a certain way. The term "blended" means your
generic code provably works well enough for all target cpus (no obvious
penalties). If that still isn't good enough, you have to determine the
appropriate cpu and run specific code (for a very few select routines that
matter, after profiling) via function pointers. Even GCC itself has
supported "-march=native" since 4.4.0 or such.

> Any kind of speed or size optimization you do in C (whether it's the
compiler doing the optimization or
> you doing it manually) again depends on specific CPU characteristics and
features, and again defeats
> the purpose of using "pure C".

Just assume the compiler sucks (because it probably does). It doesn't mean
they're all bad or that it doesn't have some virtues. But overall compilers
don't know much (or assume wrongly). If you want speed, you have to do it
yourself. It won't be handed to you on a silver platter.

P.S. Avoid (186) ENTER/LEAVE, they are much slower on new machines than the
equivalent 8086 code.


Re: [Freedos-devel] Starting FreeDOS 1.3

2018-07-23 Thread Steve Nickolas

On Mon, 23 Jul 2018, Rugxulo wrote:


I'm just saying, I suspect that Jim wants to (eventually?) mirror this
to iBiblio, and he will of course want "full" sources. But I realize
this isn't finalized yet, just a preview snapshot. It's sadly too easy
to overlook full sources and dependencies, so most people ignore it,
and that makes things harder for us.  :-/


Not just that, but there are a lot of frequently overlooked quirks of the 
GPL, and that's one of them...  When I released FreeDOS ODIN many moons 
ago, I was helpfully informed that a link to the source wasn't actually 
good enough and I did have to mirror every. single. file. myself either as 
part of the package, or in the same folder with obvious links, to not be 
in violation of the GPL, and putting the project in compliance back then 
was a bit of a pain in the keester.  Best to do it from the beginning.


-uso.



Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread David McMackins

If you think you can do it with subtraction and bit-shifting (without
requiring MMX or something similar) please show it to us.


With multiplication:

new_mask = color & alpha_mask;
new_mask >>= 6;
new_mask *= 0x7F;

Without multiplication:

new_mask = color & alpha_mask;
new_mask <<= 1;
new_mask -= new_mask >> 7;

This gets the same result with only two shifts and a subtraction.
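
For reference, here is a self-contained sketch of both variants applied to
four pixels packed in one 32-bit word. It assumes the alpha bit is bit 6 of
each pixel byte (so alpha_mask is 0x40404040) and a C99 <stdint.h>; the
function names are made up for this example.

```c
#include <stdint.h>

#define ALPHA_BITS 0x40404040u  /* assumed: bit 6 of every pixel byte */

/* Variant with multiplication: each opaque byte becomes 0x7F. */
uint32_t expand_alpha_mul(uint32_t pixels)
{
    uint32_t m = pixels & ALPHA_BITS;  /* 0x40 or 0x00 per byte */
    m >>= 6;                           /* 0x01 or 0x00 per byte */
    return m * 0x7Fu;                  /* 0x7F or 0x00 per byte, no carries */
}

/* Variant without multiplication: two shifts and a subtraction. */
uint32_t expand_alpha_sub(uint32_t pixels)
{
    uint32_t m = pixels & ALPHA_BITS;  /* 0x40 or 0x00 per byte */
    m <<= 1;                           /* 0x80 or 0x00 per byte */
    m -= m >> 7;                       /* 0x80 - 0x01 = 0x7F, or 0 - 0 = 0 */
    return m;
}
```

No borrow crosses a byte boundary, because each byte subtracts 0x01 only
when it holds 0x80. Note that both variants fill only the low seven bits of
each byte (0x7F), leaving bit 7 clear.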


Happy Hacking,

David E. McMackins II
Supporting Member, Electronic Frontier Foundation (#2296972)
Associate Member, Free Software Foundation (#12889)

www.mcmackins.org www.delwink.com
www.eff.org www.gnu.org www.fsf.org

On 2018-07-23 13:15, Bret Johnson wrote:

Multiple issues. First, if you're going to do it in "pure C", you
can't depend on anything like MMX. You're going to need to virtualize
the multiple-byte-functions-at-the-same-time manually, taking
advantage of CPU and data storage characteristics (little-endian,
two's complement, etc.). That pretty much defeats the purpose of
sticking with "pure C".

What you're trying to avoid is (conditional) JMPing and
multiplication/division, since they are costly in terms of speed, even
though they will work just fine. You are probably also going to want
to minimize the number of loops, since loops are also a type of JMP.
But, in modern CPU's with caches and branch prediction and pipelining
and similar enhancements, loops generally aren't that bad in terms of
overall speed. Any kind of speed or size optimization you do in C
(whether it's the compiler doing the optimization or you doing it
manually) again depends on specific CPU characteristics and features,
and again defeats the purpose of using "pure C".

If you think you can do it with subtraction and bit-shifting (without
requiring MMX or something similar) please show it to us.




Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Bret Johnson
Multiple issues.  First, if you're going to do it in "pure C", you can't depend
on anything like MMX.  You're going to need to virtualize the
multiple-byte-functions-at-the-same-time manually, taking advantage of CPU and
data storage characteristics (little-endian, two's complement, etc.).  That
pretty much defeats the purpose of sticking with "pure C".

What you're trying to avoid is (conditional) JMPing and
multiplication/division, since they are costly in terms of speed, even though
they will work just fine.  You are probably also going to want to minimize the
number of loops, since loops are also a type of JMP.  But, in modern CPU's with
caches and branch prediction and pipelining and similar enhancements, loops
generally aren't that bad in terms of overall speed.  Any kind of speed or size
optimization you do in C (whether it's the compiler doing the optimization or
you doing it manually) again depends on specific CPU characteristics and
features, and again defeats the purpose of using "pure C".

If you think you can do it with subtraction and bit-shifting (without
requiring MMX or something similar) please show it to us.



Re: [Freedos-devel] Starting FreeDOS 1.3

2018-07-23 Thread Rugxulo
Hi,

On Tue, Jul 17, 2018 at 6:21 AM, TK Chia  wrote:
>
> I have uploaded updated packages at
> https://github.com/tkchia/build-ia16/releases/tag/20180616-update-20180708 .

Okay, I've downloaded this now but haven't tried it yet.

Please don't feel pressure from me about this, but
"build-ia16-20180616-update-20180708.zip" (35 kb) is only some small
files, not the full sources. Obviously the rest is on Github. But what
exactly do you need to rebuild this? (I probably won't try, just
curious anyways.) Ubuntu (18.04?) host OS (and various dependencies)?
Full GCC 6.x sources (etc)?

I also noticed that the binaries (.EXEs) seem forcibly 8.3 compatible,
will that still work in raw DOS? I mean, I know it can and should, but
did you test it? GCC usually demands LFNs (although DJGPP seems to
work both ways, thankfully). I guess I need to try to do a temporary
install and test it myself for us.

Okay, I do see "fetch.sh", which halfway shows what to do to rebuild.
I'm just saying, I suspect that Jim wants to (eventually?) mirror this
to iBiblio, and he will of course want "full" sources. But I realize
this isn't finalized yet, just a preview snapshot. It's sadly too easy
to overlook full sources and dependencies, so most people ignore it,
and that makes things harder for us.  :-/

Thanks for your efforts.



Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Rugxulo
Hi,

On Mon, Jul 23, 2018 at 11:54 AM, David McMackins  wrote:
>
> I have two objections to this. First, I'd like to be able to do this in
> pure C. Second, this appears to be a byte-level operation, but the whole
> point of doing this is to work on multiple bytes simultaneously.
>
> I think I might be able to do it with a subtraction and two bitshifts,
> though. Think that would be meaningfully faster?

"Pure C" doesn't exist. You're still relying on your environment
(hardware, userland software, memory, APIs, OS, etc). Even "standard"
C was intentionally meant to be both portable and unportable (if
needed/desired). So you're allowed to shoot yourself in the foot for
extra speed or features or whatever. It's a noble goal to be as
"strictly conformant" as possible, but it's not wrong to have
non-portable routines, optional or mandatory. Lots of things aren't
well-supported across platforms and compilers (e.g. bitfields).

Also, if you're targeting 16-bit, you're at the mercy of your compiler
and which exact cpu you're running on. Like I've mentioned, the 8086
isn't as efficient as the 286, much less the 486 or Pentium. And I
don't just mean clock speed, I mean overall many things (internally)
were implemented better/smarter/faster in later cpus. And most
compilers of the era weren't very good or only targeted a small subset
of those cpus. Basically, your compiler probably sucks, so don't treat
it like it knows everything. Don't expect compiled output to be
optimal by default.

So "faster" means nothing to a C compiler. You have to make it fast,
work around its shortcomings, bugs, misfeatures, quirks, omissions, etc.
I'll admit that inline asm is annoyingly incompatible across
compilers, and even external .ASM files take a lot of effort to work
across compilers too. It's not really worth it unless you're desperate
or extremely diligent (bored!). But (some minimal amount of) assembly
is the only true way to get it faster. Well, technically you need a
good algorithm first (and must avoid any obvious pitfalls)! So
assembly alone is no panacea.

Sorry, I know this post isn't directly helpful, just some simple advice.



Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Rugxulo
Hi,

On Mon, Jul 23, 2018 at 5:58 AM, Eric Auer  wrote:
>
> Hi! I am not sure whether I understand your method, so
> maybe you can explain it in more detail. Is the alpha
> mask 1 byte per pixel, either 00 or ff per pixel? The
> multiplication is costly.

Since when is MUL costly? Or only because you're doing it for every
pixel (i.e. thousands of times)? I know he's targeting 16-bit machines
for fun, but indeed, most people will use 386 or newer cpus, where MUL
(etc.) have an "early out" algorithm, so they don't take nearly as
long as you'd think. It's still much faster than DIV, of course.

Going even further, the normal workaround for MUL is to use the faster
(386) LEA, which does add/shift/mul in one instruction. BTW, Bret mentions
SHL, which is indeed a teeny bit slower on many processors, so you may wish
to just use a simple ADD instead.

* 
http://web.archive.org/web/20150920114055fw_/http://dflund.se/~john_e/gems/gem0009.html

But it's not worth counting cycles, even for ancient machines, until
you've finalized exactly what you're trying to do. Premature
optimization is usually a waste of time (but a bit of forethought
beforehand doesn't hurt). See Agner Fog's manuals (although for an
actual 8088/8086, you might just want to email a guru like Jim
Leonard).

> You can also use bit test
> and "set conditionally" (to 0 or 255) and "move
> conditionally" byte sized 386 operations, but then
> you are back to pixel wise processing. The good
> thing about conditional setting and moving is that
> you avoid conditional jumps which are always more
> time-consuming than a fixed calculation which can
> involve conditional setting and moving :-)

CMOVxx needs a 686 (PPro) or newer. SETxx indeed needs a 386 or newer, but
you can halfway fake it on 8086 / 16-bit cpus. I don't know of a perfect
example offhand, but even I've done it (barely). Basically, you
combine boolean results into one and only jump when absolutely needed.
Or else you use a mask, "or" onto it in certain cases, then do your
operation with that value (where false is a no-op). Something like
that, it's hard to explain. BTW, jumps aren't really slow except on
8086, so newer processors (e.g. 486) make it not worth worrying about
(except maybe due to small cpu instruction cache or no branch
prediction or slow cpu clock or such other problem).

Actually, I forgot that (barely documented) SALC is basically SBB
AL,AL, which is similar to (386) SETC AL. So you're basically
moving/extending into a register from a flag result of some operation
then using that mask to do some further conditional bitwise operation.

* 
http://web.archive.org/web/20150920114042fw_/http://dflund.se/~john_e/gems/gem0013.html
* 
http://web.archive.org/web/20150920114042fw_/http://dflund.se/~john_e/gems/gem000f.html

I don't know if this explains it, but I gleaned this from some old
Usenet posting:

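  xchg ah,al      ; swap the two values (each 0..15)
  cmp ah,10       ; CF = 1 if AH < 10
  sbb bh,bh       ; BH = FFh if AH < 10, else 00h
  cmp al,10       ; CF = 1 if AL < 10
  sbb bl,bl       ; BL = FFh if AL < 10, else 00h
  and bx,0707h    ; 7 for each value below 10, else 0
  add ax,'77'     ; add 37h ('A' - 10) to both bytes
  sub ax,bx       ; values below 10 end up as '0' + value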
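
The same mask idea can be written in C. This is only a sketch of the
technique for a single value, assuming an ASCII character set; the function
name hex_digit is made up, and a compiler is free to emit a branch anyway.

```c
#include <stdint.h>

/* Convert a value 0..15 to an uppercase ASCII hex digit without an
   explicit branch in the source. */
char hex_digit(uint8_t v)
{
    /* (v < 10) is 1 or 0; negating it gives an all-ones or all-zero
       byte, the C counterpart of CMP followed by SBB reg,reg. */
    uint8_t mask = (uint8_t)-(v < 10);

    /* Add 0x37 ('A' - 10), then take 7 back off for values below 10
       so that they land on '0'..'9'. */
    return (char)(v + '7' - (mask & 7));
}
```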

See what I mean? Here's another example I wrote myself (but it's a bit
sloppy/confusing):

mov cx,'az' ; check if lowercase alpha
push cx
call rangecheck
...
ret
int 20h
...
rangecheck: ; in: (upper_limit shl 8) + lower_limit
pop bp
pop bx ; mov bx,[sp+2]
push bp
.check:
; int3
cmp al,bl
sbb ch,ch
inc bh
cmp al,bh
cmc
sbb bl,bl
or bl,ch
cmp bl,1 ; set CF if BL == 0
cmc ; return NC if AL within valid range
ret
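
A rough C rendering of the same range check, given only to show the logic:
two masks are built from the two comparisons, ORed together, and tested
once at the end. The name in_range is made up for this sketch, and the
result is returned as 1/0 instead of in the carry flag.

```c
/* Return 1 if lo <= c <= hi, else 0. */
int in_range(unsigned char c, unsigned char lo, unsigned char hi)
{
    unsigned below = (unsigned)-(c < lo);  /* all ones if c is below the range */
    unsigned above = (unsigned)-(c > hi);  /* all ones if c is above the range */

    /* Combine both results and decide only once. */
    return (below | above) == 0;
}
```

For example, in_range('m', 'a', 'z') yields 1 and in_range('A', 'a', 'z')
yields 0.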

But of course even CALL/RET is slow on 8086, too, but newer cpus make
it not a problem. Again, I don't know if he really truly cares about
every single old cpu. I only pretend to care (for fun, completeness,
etc.) because I don't even have any 8086s or similar old cpus. (But I
do heavily prefer backwards compatible software!) Even my old 486 is
disconnected, probably broken. But it doesn't hurt to be careful and
try to be compatible in software anyways (in theory).

>> With 4 pixels loaded in a 32-bit register:
>>
>> AND the input pixels with the alpha mask
>> SHR this result so that the bit is in position 0
>> Multiply so that this bit is expanded to a full byte of 1s
>> AND the input and screen with this mask
>> OR the modified input onto the screen

I don't think he cares as much about 386, but it doesn't hurt to tell
him anyways. In particular, it's fairly easy (even before CPUID) to
detect cpu at runtime (see Eric's CPULEVEL tool) ... or at least let
the user manually enable it via cmdline, if that isn't feasible. So
the optimal solution, if you're diligent enough, is to optimize very
frequently used routines for both 8086 and 386 (or 686 or whatever).
Dynamic cpu dispatch via function pointers (or whatever you want to
call it).
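
A minimal sketch of such dispatch in C. Everything here is hypothetical:
the cpu_level value stands in for whatever a detection routine or a
command-line switch reports (3 is assumed to mean "386 or newer"), and
blit_branchless is only a plain-C placeholder for a routine that would
really be tuned for that cpu. Color 0 is assumed to mean "transparent".

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*blit_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Generic fallback: one pixel at a time, with a conditional jump. */
static void blit_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (src[i] != 0)
            dst[i] = src[i];
}

/* Placeholder for a cpu-specific version: same result, mask-based. */
static void blit_branchless(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        uint8_t m = (uint8_t)-(src[i] != 0);   /* 0xFF if opaque, else 0x00 */
        dst[i] = (uint8_t)((dst[i] & ~m) | src[i]);
    }
}

/* Pick the routine once at startup; callers then go through the pointer. */
blit_fn select_blit(int cpu_level)
{
    return cpu_level >= 3 ? blit_branchless : blit_generic;
}
```

A program would call select_blit() once, store the returned pointer, and
use it for every sprite, so the detection cost is paid only at startup.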



Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Rugxulo
Hi,

On Mon, Jul 23, 2018 at 5:27 AM, Eric Auer  wrote:
>
> Hi, just a quick extra idea: You could read about
> the PCX file format for 8-bit colors and define one
> color to be "transparent".
>
> https://www.fileformat.info/format/pcx/egff.htm

This reminds me of Benjamin David Lunt's webpage, where I think he
once described (RLE) PCX:

http://www.fysnet.net/#video

"Video/Graphics Programming - Info/source/etc. that has to do with the
Video and Graphics Programming."

"PCX files explained (31 Oct 1999)
- Describes the PCX files' header and how it stores graphics and
attributes. Also explains Run-length encoding (RLE), which is used in
PCX files."

http://www.fysnet.net/pcxfile.htm

Just FYI.

P.S. Although the format was once very popular, I don't know (or at least
can't remember) whether most modern web browsers (or even MS Paint) still
support such an "old" format.



Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread David McMackins
I have two objections to this. First, I'd like to be able to do this in 
pure C. Second, this appears to be a byte-level operation, but the whole 
point of doing this is to work on multiple bytes simultaneously.


I think I might be able to do it with a subtraction and two bitshifts, 
though. Think that would be meaningfully faster?



Happy Hacking,

David E. McMackins II
Supporting Member, Electronic Frontier Foundation (#2296972)
Associate Member, Free Software Foundation (#12889)

www.mcmackins.org www.delwink.com
www.eff.org www.gnu.org www.fsf.org

On 2018-07-23 11:44, Bret Johnson wrote:

One way around this might be to use CBW, which essentially copies the
high bit of AL into all the bits of AH. Using your example, if this is
the value in AL:

01000000
 ^ the alpha bit

You can put a saturated mask in AH with two instructions:

SHL AL,1

CBW

Or put a saturated mask in AH and leave AL unchanged with three
instructions:

ROL AL,1

CBW

ROR AL,1

Those instructions will work even on an 8086 CPU.




Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Bret Johnson
One way around this might be to use CBW, which essentially copies the high bit
of AL into all the bits of AH.  Using your example, if this is the value in AL:

01000000
 ^ the alpha bit

You can put a saturated mask in AH with two instructions:

SHL AL,1
CBW

Or put a saturated mask in AH and leave AL unchanged with three instructions:

ROL AL,1
CBW
ROR AL,1

Those instructions will work even on an 8086 CPU.
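
C has no direct equivalent of CBW, but the same saturated mask can be
obtained portably by isolating the bit and negating it. The sketch below
assumes the alpha bit is bit 6 of the pixel byte, as in the example value
above; the function name is made up, and which instructions the compiler
actually emits is up to the compiler.

```c
#include <stdint.h>

/* Return 0xFF if bit 6 (the assumed alpha bit) of pixel is set,
   otherwise 0x00. */
uint8_t alpha_to_mask(uint8_t pixel)
{
    /* (pixel >> 6) & 1 is 1 or 0; negating gives -1 or 0, which
       truncates to 0xFF or 0x00 when converted to an unsigned byte. */
    return (uint8_t)-((pixel >> 6) & 1);
}
```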



Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread David McMackins
The multiplication does appear to be costly indeed. I'm now trying to
find a way to get around it. I'll explain the method using a 1-byte
example, but the logic scales up:

Our dear color:

01000000
 ^ the alpha bit

Alpha mask:

01000000

The color on screen doesn't matter, as will be demonstrated.

First, we AND the color with the alpha mask, yielding the alpha mask itself:

01000000 & 01000000 -> 01000000

Then we shift this bit to the right:

01000000 >> 6 -> 00000001

Then multiply by 255 so that the whole byte is filled:

00000001 * 255 -> 11111111

Next we invert the mask (this is a modification I considered after
writing my previous email):

~11111111 -> 00000000

Now since this color is opaque, when we AND the screen with it, it sets
this pixel on screen to 0 (if our color were transparent, the mask would
all be 1s here, so ANDing it would have no effect on the screen).

screen & 00000000 -> 00000000

Finally, with the assumption that transparent pixels in our input are
all 0s, we can just OR the color onto the screen:

01000000 | 00000000 -> 01000000

That's the method. Suggestions on improving it are greatly appreciated.
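
The steps above, scaled up to four pixels in one 32-bit word, can be
sketched as follows. This assumes the alpha bit is bit 6 of every pixel
byte and that transparent input pixels are all zero bits, as described;
the names ALPHA_BITS and blend4 are made up for the sketch.

```c
#include <stdint.h>

#define ALPHA_BITS 0x40404040u  /* bit 6 of each of the four pixel bytes */

/* Draw four byte-sized input pixels over four screen pixels. */
uint32_t blend4(uint32_t screen, uint32_t pixels)
{
    /* AND with the alpha mask, shift the bit down to position 0 of its
       byte, then multiply by 255 to fill each opaque byte with 1s.
       Every byte is at most 1 * 255, so nothing carries into the
       neighbouring byte. */
    uint32_t mask = ((pixels & ALPHA_BITS) >> 6) * 255u;

    /* Invert the mask to clear the opaque positions on screen, then
       OR the input on top (transparent input bytes are zero). */
    return (screen & ~mask) | pixels;
}
```

For instance, blend4(0x11223344, 0x41000042) replaces only the highest and
lowest bytes and gives 0x41223342.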

P.S. In response to the other criticism of cutting down my color depth,
I'm really not concerned about that. In the interest of making my
library support many different color formats, it's not feasible at this
time to change it all up just to squeeze out some more colors.


Happy Hacking,

David E. McMackins II
Supporting Member, Electronic Frontier Foundation (#2296972)
Associate Member, Free Software Foundation (#12889)

www.mcmackins.org www.delwink.com
www.eff.org www.gnu.org www.fsf.org

On 07/23/2018 05:58 AM, Eric Auer wrote:
> 
> Hi! I am not sure whether I understand your method, so
> maybe you can explain it in more detail. Is the alpha
> mask 1 byte per pixel, either 00 or ff per pixel? The
> multiplication is costly. You can also use bit test
> and "set conditionally" (to 0 or 255) and "move
> conditionally" byte sized 386 operations, but then
> you are back to pixel wise processing. The good
> thing about conditional setting and moving is that
> you avoid conditional jumps which are always more
> time-consuming than a fixed calculation which can
> involve conditional setting and moving :-)
> 
> Eric
> 
>> With 4 pixels loaded in a 32-bit register:
>>
>> AND the input pixels with the alpha mask
>> SHR this result so that the bit is in position 0
>> Multiply so that this bit is expanded to a full byte of 1s
>> AND the input and screen with this mask
>> OR the modified input onto the screen
> 
> 


Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Eric Auer


Hi! I am not sure whether I understand your method, so
maybe you can explain it in more detail. Is the alpha
mask 1 byte per pixel, either 00 or ff per pixel? The
multiplication is costly. You can also use bit test
and "set conditionally" (to 0 or 255) and "move
conditionally" byte sized 386 operations, but then
you are back to pixel wise processing. The good
thing about conditional setting and moving is that
you avoid conditional jumps which are always more
time-consuming than a fixed calculation which can
involve conditional setting and moving :-)

Eric

> With 4 pixels loaded in a 32-bit register:
> 
> AND the input pixels with the alpha mask
> SHR this result so that the bit is in position 0
> Multiply so that this bit is expanded to a full byte of 1s
> AND the input and screen with this mask
> OR the modified input onto the screen




Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread David McMackins
I'll take a look at that. An idea I came up with last night while I was
asleep was a method that uses my current encoding.

With 4 pixels loaded in a 32-bit register:

AND the input pixels with the alpha mask
SHR this result so that the bit is in position 0
Multiply so that this bit is expanded to a full byte of 1s
AND the input and screen with this mask
OR the modified input onto the screen

I'm trying to think of how I may be able to skip something near the end,
but this is pretty good so far I think.



Happy Hacking,

David E. McMackins II
Supporting Member, Electronic Frontier Foundation (#2296972)
Associate Member, Free Software Foundation (#12889)

www.mcmackins.org www.delwink.com
www.eff.org www.gnu.org www.fsf.org

On 07/23/2018 05:27 AM, Eric Auer wrote:
> 
> Hi, just a quick extra idea: You could read about
> the PCX file format for 8-bit colors and define one
> color to be "transparent". Then you can store your image
> in PCX format in RAM and do run length coded BLOCKS
> of either overwriting or not overwriting pixels on
> screen, without needing per-pixel decisions at all!
> 
> https://www.fileformat.info/format/pcx/egff.htm
> 
> Note that PCX compression uses values 192 to 255
> for special codes, so it helps to optimize your
> palette to use mainly pixel colors 0 to 191...
> 
> Regards, Eric
> 
> 


Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Eric Auer


Hi, just a quick extra idea: You could read about
the PCX file format for 8-bit colors and define one
color to be "transparent". Then you can store your image
in PCX format in RAM and do run length coded BLOCKS
of either overwriting or not overwriting pixels on
screen, without needing per-pixel decisions at all!

https://www.fileformat.info/format/pcx/egff.htm

Note that PCX compression uses values 192 to 255
for special codes, so it helps to optimize your
palette to use mainly pixel colors 0 to 191...
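
A rough sketch of what such a run-based blit could look like in C. It
assumes PCX-style run-length data (a byte with its top two bits set holds
a run count in its low six bits and is followed by the pixel value; any
other byte is a single literal pixel), a flat destination buffer, and one
palette index reserved as the "skip" color. The function name and
parameters are made up for illustration, and scanline handling is left
out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode PCX-style run-length data straight onto dst, skipping runs of
   the reserved color. Returns the number of destination pixels covered. */
size_t blit_pcx_rle(uint8_t *dst, size_t dst_len,
                    const uint8_t *src, size_t src_len, uint8_t skip_color)
{
    size_t in = 0, out = 0;

    while (in < src_len && out < dst_len) {
        uint8_t value = src[in++];
        size_t count = 1;

        if ((value & 0xC0) == 0xC0) {        /* values 192..255: run marker */
            count = value & 0x3F;
            if (in >= src_len)
                break;                       /* truncated input */
            value = src[in++];
        }
        if (count > dst_len - out)
            count = dst_len - out;           /* clip to the destination */

        if (value != skip_color)
            memset(dst + out, value, count); /* one decision per run */
        out += count;
    }
    return out;
}
```

The decision whether to overwrite is made once per run instead of once per
pixel, which is where the saving comes from when sprites have long
stretches of the reserved color.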

Regards, Eric




Re: [Freedos-devel] VGA frame rates and mouse

2018-07-23 Thread Eric Auer



Hi!

 1) new pixels = (old pixels & mask1) | (new pixels & mask2)

 Where mask1 and mask2 are the negated forms of each other.

>> That one only works for boolean masks, but it works on 386.

> By boolean mask, do you mean something like all 1s over the colors for
> opaque and all 0s for transparent? The way I've got it now is just 1 bit
> in the color byte represents opacity. It's either opaque or transparent,
> and then the remaining bits are for colors.

Actually one BYTE, either 0xff or 0x00, otherwise the mask
thing does not work. I know that this wastes RAM, but it
makes the computations faster to have the opacity boolean
the same size as the pixel data which is 1 byte per pixel
in your MCGA scenario. You can also have a run length code
for storing the opacity masks in a small way on disk and
then dynamically decide whether you want run length based,
per-4-pixel, per-2-pixel or per-pixel computations etc.

There are always many ways, depending on the scenario :-)

Eric

