Re: [Freedos-devel] Slowdown-Units ratings and a CPU-bound depacker benchmark

2023-04-12 Thread Bret Johnson
> However, the topic of port I/O and MMIO reminds me of some activity
> by RayeR and others regarding low-level configuration to speed up
> access to PCI / PCIe graphics in DOS. I think the issue was that no
> fast defaults were applied by the BIOS, slowing down VESA LFB a lot.

Video is kind of a different animal than most other things.  There is I/O that 
controls the hardware (things like video modes, display resolutions, etc.), and 
that can be either MMIO or PMIO.  But the memory-mapping of what actually gets 
displayed on screen (like segment B800h for color text modes) really isn't 
MMIO, at least not in the traditional sense.  It is actually called 
dual-ported RAM or Video RAM, and allows both reads and writes from the 
CPU-side while also allowing the graphics hardware to read the same RAM at the 
same time so the graphics hardware knows what to display on the screen.  
Regular RAM can't do that.  In addition, like MMIO, Video RAM can't be cached 
since that would effectively delay what the graphics hardware needs to see to 
know what to display on the screen.
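
To make the B800h layout concrete, here is a minimal sketch in Python.  It 
assumes the standard 80x25 color text mode, where each cell is a character 
byte followed by an attribute byte.  The bytearray only stands in for the 
Video RAM, and all of the names are made up for illustration.

```python
# Toy model of the color text-mode buffer at segment B800h.
# Assumption: standard 80x25 mode, 2 bytes per cell (character, attribute).
SEGMENT = 0xB800
COLUMNS = 80
ROWS = 25


def cell_offset(row, col):
    """Offset of a text cell within the segment."""
    return (row * COLUMNS + col) * 2


def physical_address(row, col):
    """Real-mode physical address: segment * 16 + offset."""
    return SEGMENT * 16 + cell_offset(row, col)


def put_char(vram, row, col, char, attribute=0x07):
    """Write one character cell into a bytearray standing in for Video RAM."""
    offset = cell_offset(row, col)
    vram[offset] = ord(char)
    vram[offset + 1] = attribute


vram = bytearray(COLUMNS * ROWS * 2)
put_char(vram, 1, 2, "A")  # row 1, column 2, light grey on black
```

On real hardware the CPU-side write and the display-side read of that cell 
would hit the same RAM, which is the dual-ported part described above.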

I do remember seeing some stuff from RayeR that I think was related to video 
and modifying something in the Model Specific Registers (MSRs) of the CPUs, but 
don't remember all the details.


___
Freedos-devel mailing list
Freedos-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-devel


Re: [Freedos-devel] Slowdown-Units ratings and a CPU-bound depacker benchmark

2023-04-06 Thread Bret Johnson
>> I tried posting a much longer response to this, but it was
>> apparently rejected by the moderators.  Here's a shorter one.

> As I already mentioned in the other reply to this thread, feel free
> to send me more specific replies to that article.

I'm not sure why it didn't get posted -- it got "lost in the Ether" somehow, 
and that may have had nothing to do with the moderation.

It was in part a response to your specific article, but also a much longer 
response to the overall thread which has taken a lot of circuitous tangents.  I 
know it will piss off some people and will spur an even longer thread, so am 
unsure that I should try to post it again.  But I also think it says some 
things that need to be said.

If you're willing, I'll send it to you privately and see if you think I should 
try posting it again.

>>>>> I/O has also vastly speedup (we have SSD speeds of up to 6
>>>>> GB/sec).  Just not by doing IN/OUT, but by using memory mapped
>>>>> PCI devices.
>>>>
>>>> I think you're confusing two different things -- MMIO and
>>>> DMA/Bus-Mastering.
>>>
>>> He is NOT.
>> 
>> I wasn't talking about ecm being confused, I was talking about you.
>> AFAIK, ecm never tested either MMIO or bus-mastering so never said
>> anything about them.
>
> Yes, the only tests I did involved running Slowdown with and without
> the one port I/O instruction patched out in the waste loop. However,
> Tom is correct that this specific *port* I/O access to that
> particular port is not representative of all possible ways of doing
> I/O.

That is correct, but when I pointed out there really is not any difference in 
speed between MMIO and PMIO (the two general categories of doing I/O), Tom 
accused me of spewing BS.  I gave him the opportunity to defend himself with 
some data, and he so far hasn't done that.  Let me explain why I said what I 
did.

In the Intel architecture, there is only one address bus and one data bus (in 
some of the older CPUs the address and data bus were the same physical pins on 
the CPU, but that is a peripheral discussion).  The exact same address and data 
bus(es) are used for access to both ports (PMIO) and memory (RAM or MMIO).

There is a pin (I've seen it referred to both as IO/M and M/IO) on the CPU that 
is also part of the bus that tells the external devices whether the address on 
the bus is a port address or a memory address.  The device (or RAM) with both 
the correct address _and_ address type is the only one allowed to respond.  For 
PMIO, of course, only the first 16 address lines are valid and the rest are 
ignored (there are only 64k possible PMIO addresses).  Also, instead of a single 
pin on the CPU, some of them have a set of pins where some combination of highs 
and lows on the pins designates port vs. memory addresses, but the concept is 
the same.  The point is that there is only one bus. 
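
The decoding described above can be sketched as a small Python model.  This 
is only an illustration of the idea, not how any particular chipset does it: 
the Device class, the device list, and the address ranges are assumptions 
made up for the example (port 201h and the B800h text buffer are borrowed 
from elsewhere in this thread).

```python
from dataclasses import dataclass

PORT_ADDRESS_MASK = 0xFFFF  # only the low 16 address lines matter for ports


@dataclass
class Device:
    name: str
    is_port: bool  # which address type the device decodes (the IO/M state)
    base: int
    size: int

    def responds_to(self, address, is_port):
        """True only if both the address type and the address match."""
        if is_port != self.is_port:
            return False
        if is_port:
            address &= PORT_ADDRESS_MASK  # upper address lines are ignored
        return self.base <= address < self.base + self.size


def select_device(devices, address, is_port):
    """Name of the one device that claims this bus cycle, or None."""
    for device in devices:
        if device.responds_to(address, is_port):
            return device.name
    return None


# Hypothetical devices sharing the one address bus.
devices = [
    Device("joystick", is_port=True, base=0x201, size=1),
    Device("text buffer", is_port=False, base=0xB8000, size=0x8000),
]
```

For example, select_device(devices, 0x201, True) picks the joystick, while 
the same address with the memory address type selects nothing.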

When the CPU wants to send or receive data from a device (whether it is RAM or 
I/O) it sets the pins on the address bus appropriately (the address, address 
type, direction of transfer, and some others as well) and waits for the device 
to respond.  The fact that the IO/M pin is set or not has nothing to do with 
the transfer speed -- it all uses the same bus.  The CPU must simply wait 
however long it takes for the device to respond.  The simple fact that a device 
responds to port address(es) versus memory address(es) has nothing to do with 
the speed of transfer.

However, there is a little more "flexibility" in the CPU instructions that can 
be used with MMIO vs. PMIO.  With PMIO the only way you can transfer data is 
with IN, INS, OUT, or OUTS.  With MMIO you can use (at least theoretically) all 
the CPU instructions that have memory pointers (which a lot of them do).  But 
some devices have limitations on that, too (e.g., with some devices you can 
only modify an entire Word or DWord at once and not individual Bits or Bytes, 
and on some devices reading data from a port can actually change some 
configuration parameter and you don't even need to write to it).  In theory, 
MMIO can make the code a little faster (or at least more efficient) if it is 
done properly, but if done improperly can actually make things slower.
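
As a rough illustration of that last caveat, here is a Python model of a 
hypothetical register whose bits clear when they are read.  The class and 
its counters are made up for the example.  The point is that a 
read-modify-write costs one read cycle plus one write cycle on the bus 
either way, whether it is coded as a single memory instruction (MMIO) or as 
an IN / OR / OUT sequence (PMIO), and that the read can have side effects.

```python
class ReadToClearRegister:
    """Hypothetical device register whose pending bits clear when read."""

    def __init__(self, value=0):
        self.value = value
        self.reads = 0   # bus read cycles seen by the device
        self.writes = 0  # bus write cycles seen by the device

    def read(self):
        self.reads += 1
        current = self.value
        self.value = 0  # side effect: the read itself clears the bits
        return current

    def write(self, value):
        self.writes += 1
        self.value = value & 0xFF


def set_bit(register, bit):
    """Read-modify-write: one read cycle followed by one write cycle."""
    register.write(register.read() | (1 << bit))


reg = ReadToClearRegister(0b00000101)
first = reg.read()   # returns the pending bits and clears them
second = reg.read()  # nothing left to report
set_bit(reg, 3)      # the hidden read finds 0, so only bit 3 ends up set
```

On a device like this, a careless "set one bit" would silently throw away 
whatever the read returned, no matter which kind of I/O mapping is used.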




Re: [Freedos-devel] Slowdown-Units ratings and a CPU-bound depacker benchmark

2023-04-05 Thread C. Masloch

On at 2023-04-05 03:32 +, Bret Johnson wrote:

>>>>> The article is found at
>>>>> https://pushbx.org/ecm/dokuwiki/doku.php?id=blog:pushbx:2023:0321_cpu_performance_comparison
>>>>
>>>> I mostly agree with you and your article, but:
>>
>> fine that you agree,  but at most 50% of the article is even close to
>> 'right'.
>
> You're the one who said, "I mostly agree with you and your article, but:", not
> me.
>
>>>>> Conclusion
>>>>>
>>>>> CPU-bound benchmarks are much faster on a modern machine than they
>>>>> are on older ones.
>>>>> The frequency increase does not actually suffice to explain the
>>>>> speedup.
>>>>> Some things, like doing I/O, were not sped up nearly as much
>>>>> however.
>>>
>>> I tried posting a much longer response to this, but it was
>>> apparently rejected by the moderators.  Here's a shorter one.

As I already mentioned in the other reply to this thread, feel free to 
send me more specific replies to that article.

>>>> I/O has also vastly speedup (we have SSD speeds of up to 6 GB/sec).
>>>> Just not by doing IN/OUT, but by using memory mapped PCI devices.
>>>
>>> I think you're confusing two different things -- MMIO and
>>> DMA/Bus-Mastering.
>>
>> He is NOT.
>
> I wasn't talking about ecm being confused, I was talking about you.  AFAIK, ecm
> never tested either MMIO or bus-mastering so never said anything about them.

Yes, the only tests I did involved running Slowdown with and without the 
one port I/O instruction patched out in the waste loop. However, Tom is 
correct that this specific *port* I/O access to that particular port is 
not representative of all possible ways of doing I/O.


Regards,
ecm




Re: [Freedos-devel] Slowdown-Units ratings and a CPU-bound depacker benchmark

2023-04-05 Thread C. Masloch

On at 2023-04-04 21:09 +0200, tom ehlert wrote:

> Dear Bret,
>
>>>> The article is found at
>>>> https://pushbx.org/ecm/dokuwiki/doku.php?id=blog:pushbx:2023:0321_cpu_performance_comparison
>>>
>>> I mostly agree with you and your article, but:
>
> fine that you agree,  but at most 50% of the article is even close to
> 'right'.
>
>>>> Conclusion
>>>>
>>>> CPU-bound benchmarks are much faster on a modern machine than they
>>>> are on older ones.
>>>> The frequency increase does not actually suffice to explain the
>>>> speedup.
>>>> Some things, like doing I/O, were not sped up nearly as much
>>>> however.
>>
>> I tried posting a much longer response to this, but it was
>> apparently rejected by the moderators.  Here's a shorter one.

I don't think that the mailing lists are moderated that way? Anyway, 
Bret, you can send any additional comments to me per email or as a 
comment on the blog.

>>> I/O has also vastly speedup (we have SSD speeds of up to 6 GB/sec).
>>> Just not by doing IN/OUT, but by using memory mapped PCI devices.
>>
>> I think you're confusing two different things -- MMIO and DMA/Bus-Mastering.
>
> He is NOT.

Who's "he"? In case this was meant to refer to me, it is wrong because 
I'm not a "he".

>> Whether I/O is PMIO or MMIO is pretty much irrelevant to the speed.
>> For example, I/O port 201h (the analog joystick) and I/O port 92h
>> (which controls A20 on some computers) are both VERY slow and would
>> not be any faster if they were MMIO instead of PMIO.
>
> this is plain bullshit.
>
>> The speed depends on the I/O device, not the type of I/O mapping.
>
> which is nonsense.
>
>> Plus, I/O _can't_ be cached, whether PMIO or MMIO, so the cache(s)
>> are irrelevant to I/O.
>
> yes. I/O device data can't be cached. you are such a clever person to
> discover this fact. WOW.
>
>> SSD speeds are fast because they use bus-mastering, not because
>> they use MMIO.  The I/O ports are used to "control" the device, but
>> the data from the SSD is transferred in and out of RAM using
>> bus-mastering (which is fast because it doesn't use the CPU at all).
>
> I understand that you don't have the faintest clue how modern PCI devices
> work. Just go ahead with entertaining us ...
>
> Tom


Regards,
ecm






Re: [Freedos-devel] Slowdown-Units ratings and a CPU-bound depacker benchmark

2023-04-04 Thread Bret Johnson
>>> The article is found at
>>> https://pushbx.org/ecm/dokuwiki/doku.php?id=blog:pushbx:2023:0321_cpu_performance_comparison
>>
>> I mostly agree with you and your article, but:
>
> fine that you agree,  but at most 50% of the article is even close to
> 'right'.

You're the one who said, "I mostly agree with you and your article, but:", not 
me.

>>> Conclusion
>>>
>>> CPU-bound benchmarks are much faster on a modern machine than they
>>> are on older ones.
>>> The frequency increase does not actually suffice to explain the
>>> speedup.
>>> Some things, like doing I/O, were not sped up nearly as much
>>> however.

> I tried posting a much longer response to this, but it was
> apparently rejected by the moderators.  Here's a shorter one.

>> I/O has also vastly speedup (we have SSD speeds of up to 6 GB/sec).
>> Just not by doing IN/OUT, but by using memory mapped PCI devices.

>> I think you're confusing two different things -- MMIO and DMA/Bus-
>> Mastering.

> He is NOT.

I wasn't talking about ecm being confused, I was talking about you.  AFAIK, ecm 
never tested either MMIO or bus-mastering so never said anything about them.

>> Whether I/O is PMIO or MMIO is pretty much irrelevant to the speed.
>> For example, I/O port 201h (the analog joystick) and I/O port 92h
>> (which controls A20 on some computers) are both VERY slow and would
>> not be any faster if they were MMIO instead of PMIO.

> this is plain bullshit.

Care to explain in more detail?

>> The speed depends on the I/O device, not the type of I/O mapping.

> which is nonsense.

Care to explain in more detail?

>> Plus, I/O _can't_ be cached, whether PMIO or MMIO, so the cache(s)
>> are irrelevant to I/O.

> yes. I/O device data can't be cached. you are such a clever person to
> discover this fact. WOW.

Glad you agree that I/O can't be cached.

>> SSD speeds are fast because they use bus-mastering, not because
>> they use MMIO.  The I/O ports are used to "control" the device, but
>> the data from the SSD is transferred in and out of RAM using
>> bus-mastering (which is fast because it doesn't use the CPU at all).

> I understand that you don't have the faintest clue how modern PCI
> devices work. Just go ahead with entertaining us ...

Please enlighten me.




Re: [Freedos-devel] Slowdown-Units ratings and a CPU-bound depacker benchmark

2023-04-04 Thread tom ehlert
Dear Bret,

> The article is found at
>> https://pushbx.org/ecm/dokuwiki/doku.php?id=blog:pushbx:2023:0321_cpu_performance_comparison

> I mostly agree with you and your article, but:

fine that you agree,  but at most 50% of the article is even close to
'right'.



>>> Conclusion
>>>
>>> CPU-bound benchmarks are much faster on a modern machine than they
>>> are on older ones.
>>> The frequency increase does not actually suffice to explain the
>>> speedup.
>>> Some things, like doing I/O, were not sped up nearly as much
>>> however.

> I tried posting a much longer response to this, but it was
> apparently rejected by the moderators.  Here's a shorter one.

>> I/O has also vastly speedup (we have SSD speeds of up to 6 GB/sec).
>> Just not by doing IN/OUT, but by using memory mapped PCI devices.

> I think you're confusing two different things -- MMIO and DMA/Bus-Mastering.

He is NOT.

> Whether I/O is PMIO or MMIO is pretty much irrelevant to the speed.
> For example, I/O port 201h (the analog joystick) and I/O port 92h
> (which controls A20 on some computers) are both VERY slow and would
> not be any faster if they were MMIO instead of PMIO.

this is plain bullshit.

> The speed depends on the I/O device, not the type of I/O mapping.
which is nonsense.

>  Plus, I/O
> _can't_ be cached, whether PMIO or MMIO, so the cache(s) are irrelevant to 
> I/O.

yes. I/O device data can't be cached. you are such a clever person to
discover this fact. WOW.


> SSD speeds are fast because they use bus-mastering, not because
> they use MMIO.  The I/O ports are used to "control" the device, but
> the data from the SSD is transferred in and out of RAM using
> bus-mastering (which is fast because it doesn't use the CPU at all).

I understand that you don't have the faintest clue how modern PCI devices
work. Just go ahead with entertaining us ...


Tom





Re: [Freedos-devel] Slowdown-Units ratings and a CPU-bound depacker benchmark

2023-04-04 Thread Bret Johnson
Hi,

on Tuesday, 21 March 2023 at 23:26, you wrote:


> Hello!

The article is found at
> https://pushbx.org/ecm/dokuwiki/doku.php?id=blog:pushbx:2023:0321_cpu_performance_comparison

I mostly agree with you and your article, but:


>> Conclusion
>>
>> CPU-bound benchmarks are much faster on a modern machine than they
>> are on older ones.
>> The frequency increase does not actually suffice to explain the
>> speedup.
>> Some things, like doing I/O, were not sped up nearly as much
>> however.

I tried posting a much longer response to this, but it was apparently rejected 
by the moderators.  Here's a shorter one.

> I/O has also vastly speedup (we have SSD speeds of up to 6 GB/sec).
> Just not by doing IN/OUT, but by using memory mapped PCI devices.

I think you're confusing two different things -- MMIO and DMA/Bus-Mastering.

Whether I/O is PMIO or MMIO is pretty much irrelevant to the speed.  For 
example, I/O port 201h (the analog joystick) and I/O port 92h (which controls 
A20 on some computers) are both VERY slow and would not be any faster if they 
were MMIO instead of PMIO.  The speed depends on the I/O device, not the type 
of I/O mapping.  Plus, I/O _can't_ be cached, whether PMIO or MMIO, so the 
cache(s) are irrelevant to I/O.

SSD speeds are fast because they use bus-mastering, not because they use MMIO.  
The I/O ports are used to "control" the device, but the data from the SSD is 
transferred in and out of RAM using bus-mastering (which is fast because it 
doesn't use the CPU at all).
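
To illustrate the split between "control" and data, here is a toy Python 
model of a bus-mastering transfer.  Everything in it is hypothetical (the 
BusMasterDisk class, the register names, the 512-byte sector size); it does 
not describe any real controller.  It only shows that the CPU touches a 
handful of control registers while the device itself moves the data into 
RAM.

```python
class BusMasterDisk:
    """Hypothetical controller: the CPU programs a few registers,
    then the device copies the data into RAM on its own."""

    SECTOR_SIZE = 512  # assumed sector size for this sketch

    def __init__(self, ram, media):
        self.ram = ram      # bytearray standing in for system RAM
        self.media = media  # bytes standing in for the stored data
        self.registers = {"dest_address": 0, "sector": 0, "count": 0}
        self.cpu_register_writes = 0
        self.bytes_transferred = 0

    def write_register(self, name, value):
        """The only CPU involvement: writing control registers."""
        self.registers[name] = value
        self.cpu_register_writes += 1
        if name == "command" and value == 1:
            self._transfer()

    def _transfer(self):
        """Device-side copy; the CPU executes nothing per data byte."""
        start = self.registers["sector"] * self.SECTOR_SIZE
        length = self.registers["count"] * self.SECTOR_SIZE
        dest = self.registers["dest_address"]
        self.ram[dest:dest + length] = self.media[start:start + length]
        self.bytes_transferred += length


ram = bytearray(4096)
media = bytes(range(256)) * 8  # 2048 bytes, i.e. 4 sectors
disk = BusMasterDisk(ram, media)
disk.write_register("dest_address", 1024)
disk.write_register("sector", 1)
disk.write_register("count", 2)
disk.write_register("command", 1)  # start: the device now fills RAM itself
```

In this model four register writes move 1024 bytes.  Whether those four 
writes go through ports or through memory-mapped registers makes no 
difference to how the data itself gets into RAM.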

