Re: AMD64 buffer cache 4GB cap anything new, multiqueueing plans? ("64bit DMA on amd64" cont)

2018-11-06 Thread Philip Guenther
On Tue, Nov 6, 2018 at 9:51 PM Joseph Mayer wrote:

> Previously there was a years-long thread about a 4GB (32bit) buffer
> cache constraint on AMD64, ref
> https://marc.info/?t=14682443664&r=1&w=2 .
>
> What I gather is,
>
>  * The problem is that on AMD64, DMA is limited to 32-bit
>    addressing. I guess this is because, unlike AMD64 CPUs, which
>    all support 64-bit DMA, popular PCI peripherals and supporting
>    hardware out there, like bridges, have their DMA limited to
>    32-bit addressing.
>

My read of that thread, particularly Theo's comments, is that no one
actually demonstrated a case where lack of 64bit DMA caused any problems or
limitations.

If you have a system and use case where the lack of 64-bit DMA creates
a performance limitation, then describe it and, *more importantly*,
explain *why* you think the DMA limit is involved.


Philip Guenther


AMD64 buffer cache 4GB cap anything new, multiqueueing plans? ("64bit DMA on amd64" cont)

2018-11-06 Thread Joseph Mayer
Hi,

Previously there was a years-long thread about a 4GB (32bit) buffer
cache constraint on AMD64, ref
https://marc.info/?t=14682443664&r=1&w=2 .

What I gather is,

 * The problem is that on AMD64, DMA is limited to 32-bit
   addressing. I guess this is because, unlike AMD64 CPUs, which
   all support 64-bit DMA, popular PCI peripherals and supporting
   hardware out there, like bridges, have their DMA limited to
   32-bit addressing.

   (Is this a feature of lower-quality hardware, or of very old PCI
   devices, or is it systemic to the whole AMD64 ecosystem today?

   Could a system be configured to use 64-bit DMA on AMD64 and be
   expected to work, presuming recent or well-selected,
   higher-quality hardware?)

 * The OS asks the disk hardware to load disk data into given memory
   locations via DMA, and then userland fread() and mmap() are fed
   that data directly, with no further copying or remapping needed.
   This is the dynamic that leads to the 4GB cap.

   And the 4GB cap is quite constraining for any computer with a lot
   of RAM and heavy disk reading: it means many reads that wouldn't
   need to hit the disk (since the data could be cached in all that
   free memory) aren't cached and go to disk anyway, which costs a
   lot of time, yes?

 * This was recognized a long time ago and Bob wrote a solution in
   the form of a buffer cache "flipper" that would push buffer cache
   data out of the 32-bit area (to "high memory", i.e. above 4GB),
   hence lifting the limit, via a "(generic) backpressure" mechanism
   that as a bonus used the DMA engine to do the memory moves. I
   guess this means the buffer cache would be pretty much zero-cost
   to the CPU - sounds incredibly neat!

   And then it didn't really work: it malfunctioned and irritated
   people (was "busted" - for reasons I don't know; why was it,
   actually?), and Theo wrote that it would be fixed in the future.


Has it been fixed since?


Also - once fixed, fread() and mmap() reads of data that's in the
buffer cache will be incredibly fast, right? In optimal conditions the
mmap:ed addresses will already be mapped to the buffer cache pages,
so reads of mmap:ed, cached data will run at the speed of ordinary
memory access, right?


(The ML thread also mentioned an undeadly.org post discussing this
topic; however, both searching and browsing, I can't find it. The
closest I find is five words here:
https://undeadly.org/cgi?action=article;sid=20170815171854 - do you
have a URL?)


Last, OpenBSD's biggest limitation as an OS seems to be that the
disk/file subsystem is sequential. A modern SSD can read at 2.8GB/sec,
but that requires parallelism; without multiqueueing, and with small
reads of e.g. 4KB or less, speeds stay around 70-120MB/sec, i.e.
~3.5% of the hardware's potential performance. This would be a really
worthy goal to donate toward, in particular as OpenBSD leads the way
in many other areas.

Are there any thoughts about implementing this in the future?

Thanks,
Joseph