Re: Buffer cache made to use >32bit mem addresses (i.e. >~3GB support for the buffer cache) nowadays or planned soon?
On 2016-02-15 10:15, Constantine A. Murenin wrote:
> ..
> > I think it got reverted by:
> ..
> > but I'm not an expert so would wait on confirmation by Bob Beck.
>
> Yes, I think you are correct, and it was indeed reverted.
> ..
> But it looks like the functions that were introduced in the above commit
> are still WIP and don't actually flip anything yet:
>
> http://bxr.su/o/sys/kern/vfs_bio.c#buf_flip_high
>
>     buf_flip_high(struct buf *bp)
>     ..
>     /* XXX does nothing to buffer for now */
>     ..
>     buf_flip_dma(struct buf *bp)
>     ..
>     /* XXX does not flip buffer for now */

Thank you for clarifying.

This is quite a big deal for anyone with lots of disk IO and RAM. This
will be #2 on my OpenBSD wishlist for the year.

How complex is this to implement, and who would be able to do it?

May donor powers come to me or someone else this year to contribute.
OpenBSD is the finest OS out there.
Re: Buffer cache made to use >32bit mem addresses (i.e. >~3GB support for the buffer cache) nowadays or planned soon?
On 14 February 2016 at 10:29, Karel Gardas wrote:
> On Sat, Feb 13, 2016 at 9:39 PM, Stuart Henderson wrote:
>> There was this commit, I don't *think* it got reverted.
>>
>> CVSROOT:        /cvs
>> Module name:    src
>> Changes by:     b...@cvs.openbsd.org    2013/06/11 13:01:20
>>
>> Modified files:
>>     sys/kern       : kern_sysctl.c spec_vnops.c vfs_bio.c
>>                      vfs_biomem.c vfs_vops.c
>>     sys/sys        : buf.h mount.h
>>     sys/uvm        : uvm_extern.h uvm_page.c
>>     usr.bin/systat : iostat.c
>>
>> Log message:
>> High memory page flipping for the buffer cache.
>>
>> This change splits the buffer cache free lists into lists of dma reachable
>> buffers and high memory buffers based on the ranges returned by pmemrange.
>> Buffers move from dma to high memory as they age, but are flipped to dma
>> reachable memory if IO is needed to/from and high mem buffer. The total
>> amount of buffers allocated is now bufcachepercent of both the dma and
>> the high memory region.
>>
>> This change allows the use of large buffer caches on amd64 using more than
>> 4 GB of memory
>>
>> ok tedu@ krw@ - testing by many.
>
> I think it got reverted by:
>
> commit ac77fb26761065b7f6031098e6a182cacfaf7437
> Author: beck
> Date: Tue Jul 9 15:37:43 2013 +
>
>     back out the cache flipper temporarily to work out of tree.
>     will come back soon.
>     ok deraadt@
>
> but I'm not an expert so would wait on confirmation by Bob Beck.

Yes, I think you are correct, and it was indeed reverted.

Some parts have since been reimplemented and brought back by
http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/kern/vfs_bio.c#rev1.170
on 2015/07/19:

http://marc.info/?l=openbsd-cvs=143732292523715=2

> CVSROOT:        /cvs
> Module name:    src
> Changes by:     b...@cvs.openbsd.org    2015/07/19 10:21:11
>
> Modified files:
>     sys/kern       : vfs_bio.c vfs_vops.c
>     sys/sys        : buf.h
>
> Log message:
> Use two 2q caches for the buffer cache, moving previously warm buffers
> from the first queue to the second.
> Mark the first queue as DMA in preparation for being able to use more
> memory by flipping. Flipper code currently only sets and clears the flag.
> ok tedu@ guenther@

But it looks like the functions that were introduced in the above commit
are still WIP and don't actually flip anything yet:

http://bxr.su/o/sys/kern/vfs_bio.c#buf_flip_high

    buf_flip_high(struct buf *bp)
    {
        KASSERT(ISSET(bp->b_flags, B_BC));
        KASSERT(ISSET(bp->b_flags, B_DMA));
        KASSERT(bp->cache == DMA_CACHE);
        CLR(bp->b_flags, B_DMA);
        /* XXX does nothing to buffer for now */
    }

http://bxr.su/o/sys/kern/vfs_bio.c#buf_flip_dma

    buf_flip_dma(struct buf *bp)
    {
        KASSERT(ISSET(bp->b_flags, B_BC));
        KASSERT(ISSET(bp->b_flags, B_BUSY));
        if (!ISSET(bp->b_flags, B_DMA)) {
            KASSERT(bp->cache > DMA_CACHE);
            KASSERT(bp->cache < NUM_CACHES);
            /* XXX does not flip buffer for now */

Cheers,
Constantine.
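For readers who want to see what "only sets and clears the flag" amounts to, here is a small user-space sketch modelled on the excerpt above. It is not the kernel code: the flag values, NUM_CACHES, struct model_buf and the model_* function names are all made up for illustration, and the tail of model_flip_dma (setting B_DMA and moving the buffer back to DMA_CACHE) is an assumption, since the quoted excerpt stops at the XXX comment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch; the real ones are in sys/sys/buf.h. */
#define B_BUSY 0x01u
#define B_BC   0x02u
#define B_DMA  0x04u

/* Cache indices; NUM_CACHES = 2 is an assumption of this sketch. */
#define DMA_CACHE  0
#define NUM_CACHES 2

#define ISSET(t, f) ((t) & (f))
#define SET(t, f)   ((t) |= (f))
#define CLR(t, f)   ((t) &= ~(f))

/* Stand-in for struct buf, reduced to the two fields the excerpt touches. */
struct model_buf {
	uint32_t b_flags;
	int cache;
};

/* Mark a cached buffer as "high": only the flag changes, no data moves. */
void
model_flip_high(struct model_buf *bp)
{
	assert(ISSET(bp->b_flags, B_BC));
	assert(ISSET(bp->b_flags, B_DMA));
	assert(bp->cache == DMA_CACHE);
	CLR(bp->b_flags, B_DMA);
}

/* Mark a busy buffer as DMA reachable again before I/O. */
void
model_flip_dma(struct model_buf *bp)
{
	assert(ISSET(bp->b_flags, B_BC));
	assert(ISSET(bp->b_flags, B_BUSY));
	if (!ISSET(bp->b_flags, B_DMA)) {
		assert(bp->cache > DMA_CACHE);
		assert(bp->cache < NUM_CACHES);
		/* Assumption: flag set and queue reset; no page is copied. */
		SET(bp->b_flags, B_DMA);
		bp->cache = DMA_CACHE;
	}
}
```

The point of the sketch is that neither function touches the buffer's pages, so the bookkeeping for two queues is in place while the data itself stays where it was allocated.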
Re: Buffer cache made to use >32bit mem addresses (i.e. >~3GB support for the buffer cache) nowadays or planned soon?
On Sat, Feb 13, 2016 at 9:39 PM, Stuart Henderson wrote:
> There was this commit, I don't *think* it got reverted.
>
> CVSROOT:        /cvs
> Module name:    src
> Changes by:     b...@cvs.openbsd.org    2013/06/11 13:01:20
>
> Modified files:
>     sys/kern       : kern_sysctl.c spec_vnops.c vfs_bio.c
>                      vfs_biomem.c vfs_vops.c
>     sys/sys        : buf.h mount.h
>     sys/uvm        : uvm_extern.h uvm_page.c
>     usr.bin/systat : iostat.c
>
> Log message:
> High memory page flipping for the buffer cache.
>
> This change splits the buffer cache free lists into lists of dma reachable
> buffers and high memory buffers based on the ranges returned by pmemrange.
> Buffers move from dma to high memory as they age, but are flipped to dma
> reachable memory if IO is needed to/from and high mem buffer. The total
> amount of buffers allocated is now bufcachepercent of both the dma and
> the high memory region.
>
> This change allows the use of large buffer caches on amd64 using more than
> 4 GB of memory
>
> ok tedu@ krw@ - testing by many.

I think it got reverted by:

commit ac77fb26761065b7f6031098e6a182cacfaf7437
Author: beck
Date: Tue Jul 9 15:37:43 2013 +

    back out the cache flipper temporarily to work out of tree.
    will come back soon.
    ok deraadt@

but I'm not an expert so would wait on confirmation by Bob Beck.
Re: Buffer cache made to use >32bit mem addresses (i.e. >~3GB support for the buffer cache) nowadays or planned soon?
I think you would also like to investigate this one:
http://www.undeadly.org/cgi?action=article=2006061416

> Some quite deep reading [1] taught me that at least quite recently, there
> was a ~3GB cap on the buffer cache, independent of architecture and system
> RAM size.
Re: Buffer cache made to use >32bit mem addresses (i.e. >~3GB support for the buffer cache) nowadays or planned soon?
Dear Karel,

Thanks - wait - this post from 2006 you mentioned now, is it saying that
actually >32bit/>~3GB buffer cache IS SUPPORTED/WORKS on any AMD64 *with
IOMMU* support in the CPU, and was working all the time??

(That would mean that I misunderstood those references I posted in the
previous email because in actuality the 32bit/~3GB constraint only is in
certain usecases that is on non-IOMMU CPU:s.)

Please clarify!

So if so, any Intel processor with the "Intel® Virtualization Technology for
Directed I/O (VT-d)" feature such as for example this one
http://ark.intel.com/products/81061/ , or any of the CPU:s listed on
https://en.wikipedia.org/wiki/List_of_IOMMU-supporting_hardware , has that
support?

Does the motherboard need specific support too as Wikipedia indicates -
though at least any Xeon server motherboard from the last 3-4 years must for
sure have it right?

Thank you everyone for your excellent work with OpenBSD!

Best regards,
Tinker

[1]
David Mazieres linked to two docs in there, those are only on archive.org
now:
https://web.archive.org/web/20150814051509/http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
https://web.archive.org/web/20081218031805/http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/34434.pdf

On 2016-02-14 01:15, Karel Gardas wrote:
> I think you would also like to investigate this one:
> http://www.undeadly.org/cgi?action=article=2006061416
>
>> Some quite deep reading [1] taught me that at least quite recently, there
>> was a ~3GB cap on the buffer cache, independent of architecture and
>> system RAM size.
Re: Buffer cache made to use >32bit mem addresses (i.e. >~3GB support for the buffer cache) nowadays or planned soon?
I'm afraid you read too quickly and w/o attention to detail, please
reread and pay special attention to the last paragraph. Especially to:

"IOMMU is present in all "real" AMD64 machines, but not the Intel clones.
Unfortunately, OpenBSD support for IOMMU on the AMD machines is not quite
ready for primetime (code exists, but "real life" has consorted against me
finishing it)."

On Sat, Feb 13, 2016 at 7:35 PM, Tinker wrote:
> Dear Karel,
>
> Thanks - wait - this post from 2006 you mentioned now, is it saying that
> actually >32bit/>~3GB buffer cache IS SUPPORTED/WORKS on any AMD64 *with
> IOMMU* support in the CPU, and was working all the time??
>
> (That would mean that I misunderstood those references I posted in the
> previous email because in actuality the 32bit/~3GB constraint only is in
> certain usecases that is on non-IOMMU CPU:s.)
>
> Please clarify!
>
> So if so, any Intel processor with the "Intel® Virtualization Technology for
> Directed I/O (VT-d)" feature such as for example this one
> http://ark.intel.com/products/81061/ , or any of the CPU:s listed on
> https://en.wikipedia.org/wiki/List_of_IOMMU-supporting_hardware , has that
> support?
>
> Does the motherboard need specific support too as Wikipedia indicates -
> though at least any Xeon server motherboard from the last 3-4 years must for
> sure have it right?
>
> Thank you everyone for your excellent work with OpenBSD!
>
> Best regards,
> Tinker
>
> [1]
> David Mazieres linked to two docs in there, those are only on archive.org
> now:
> https://web.archive.org/web/20150814051509/http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
> https://web.archive.org/web/20081218031805/http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/34434.pdf
>
> On 2016-02-14 01:15, Karel Gardas wrote:
>>
>> I think you would also like to investigate this one:
>> http://www.undeadly.org/cgi?action=article=2006061416
>>
>>> Some quite deep reading [1] taught me that at least quite recently, there
>>> was a ~3GB cap on the buffer cache, independent of architecture and
>>> system RAM size.
Re: Buffer cache made to use >32bit mem addresses (i.e. >~3GB support for the buffer cache) nowadays or planned soon?
Aha.

So the article is saying that full IOMMU support is waiting on all AMD64
machines (so that would mean any Intel and AMD-manufactured processor with
VT-d etc.), and you're saying that this is what needs to be implemented for
the buffer cache to finally get >32bit/>~3GB support?

Are there plans today to implement this and then get the >32bit/>~3GB
support going?

(I don't understand exactly how these things fit together, this is why I
asked tentatively.)

Thanks,
Tinker

On 2016-02-14 02:03, Karel Gardas wrote:
> I'm afraid you read too quickly and w/o attention to detail, please
> reread and pay special attention to the last paragraph. Especially to:
>
> "IOMMU is present in all "real" AMD64 machines, but not the Intel clones.
> Unfortunately, OpenBSD support for IOMMU on the AMD machines is not quite
> ready for primetime (code exists, but "real life" has consorted against me
> finishing it)."
>
> On Sat, Feb 13, 2016 at 7:35 PM, Tinker wrote:
>> Dear Karel,
>>
>> Thanks - wait - this post from 2006 you mentioned now, is it saying that
>> actually >32bit/>~3GB buffer cache IS SUPPORTED/WORKS on any AMD64 *with
>> IOMMU* support in the CPU, and was working all the time??
>>
>> (That would mean that I misunderstood those references I posted in the
>> previous email because in actuality the 32bit/~3GB constraint only is in
>> certain usecases that is on non-IOMMU CPU:s.)
>>
>> Please clarify!
>>
>> So if so, any Intel processor with the "Intel® Virtualization Technology
>> for Directed I/O (VT-d)" feature such as for example this one
>> http://ark.intel.com/products/81061/ , or any of the CPU:s listed on
>> https://en.wikipedia.org/wiki/List_of_IOMMU-supporting_hardware , has that
>> support?
>>
>> Does the motherboard need specific support too as Wikipedia indicates -
>> though at least any Xeon server motherboard from the last 3-4 years must
>> for sure have it right?
>>
>> Thank you everyone for your excellent work with OpenBSD!
>>
>> Best regards,
>> Tinker
>>
>> [1]
>> David Mazieres linked to two docs in there, those are only on archive.org
>> now:
>> https://web.archive.org/web/20150814051509/http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
>> https://web.archive.org/web/20081218031805/http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/34434.pdf
>>
>> On 2016-02-14 01:15, Karel Gardas wrote:
>>> I think you would also like to investigate this one:
>>> http://www.undeadly.org/cgi?action=article=2006061416
>>>
>>>> Some quite deep reading [1] taught me that at least quite recently,
>>>> there was a ~3GB cap on the buffer cache, independent of architecture
>>>> and system RAM size.
Re: Buffer cache made to use >32bit mem addresses (i.e. >~3GB support for the buffer cache) nowadays or planned soon?
On 2016-02-13, Tinker wrote:
> Hi,
>
> Some quite deep reading [1] taught me that at least quite recently,
> there was a ~3GB cap on the buffer cache, independent of architecture
> and system RAM size.
>
> Reading the source history of vfs_bio.c [2] gives me a vague impression
> that this cap is there also today.
>
> Just wanted to check, has this cap been removed, or is there any plan to
> remove it next months from now?

There was this commit, I don't *think* it got reverted.

CVSROOT:        /cvs
Module name:    src
Changes by:     b...@cvs.openbsd.org    2013/06/11 13:01:20

Modified files:
    sys/kern       : kern_sysctl.c spec_vnops.c vfs_bio.c
                     vfs_biomem.c vfs_vops.c
    sys/sys        : buf.h mount.h
    sys/uvm        : uvm_extern.h uvm_page.c
    usr.bin/systat : iostat.c

Log message:
High memory page flipping for the buffer cache.

This change splits the buffer cache free lists into lists of dma reachable
buffers and high memory buffers based on the ranges returned by pmemrange.
Buffers move from dma to high memory as they age, but are flipped to dma
reachable memory if IO is needed to/from and high mem buffer. The total
amount of buffers allocated is now bufcachepercent of both the dma and
the high memory region.

This change allows the use of large buffer caches on amd64 using more than
4 GB of memory

ok tedu@ krw@ - testing by many.
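To make the two ideas in that log message concrete (which memory a 32-bit DMA engine can reach, and a cache budget taken as a percentage of both regions), here is a rough user-space sketch. Everything in it is hypothetical: DMA_LIMIT, dma_reachable() and bufcache_budget() are names invented for illustration, and the fixed 4 GB boundary is an assumption. The kernel takes its ranges from pmemrange rather than from a constant.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: devices doing 32-bit DMA can only address the first 4 GB. */
#define DMA_LIMIT (1ULL << 32)

/*
 * Return 1 if the physical range [pa, pa + len) lies entirely below
 * DMA_LIMIT, 0 otherwise. Written to avoid overflow in pa + len.
 */
int
dma_reachable(uint64_t pa, uint64_t len)
{
	return len <= DMA_LIMIT && pa <= DMA_LIMIT - len;
}

/*
 * Page budget for the cache when the same percentage is applied to both
 * the DMA reachable region and the high memory region.
 */
uint64_t
bufcache_budget(uint64_t dma_pages, uint64_t high_pages, unsigned percent)
{
	return dma_pages * percent / 100 + high_pages * percent / 100;
}
```

With 1000 DMA reachable pages, 3000 high pages and 20 percent, bufcache_budget() gives 200 + 600 = 800 pages, where a DMA-only cache at the same percentage would stop at 200.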
Re: Buffer cache made to use >32bit mem addresses (i.e. >~3GB support for the buffer cache) nowadays or planned soon?
On 2016-02-14 03:39, Stuart Henderson wrote:
> On 2016-02-13, Tinker wrote:
>> Hi,
>>
>> Some quite deep reading [1] taught me that at least quite recently,
>> there was a ~3GB cap on the buffer cache, independent of architecture
>> and system RAM size.
>>
>> Reading the source history of vfs_bio.c [2] gives me a vague impression
>> that this cap is there also today.
>>
>> Just wanted to check, has this cap been removed, or is there any plan to
>> remove it next months from now?
>
> There was this commit, I don't *think* it got reverted.
>
> CVSROOT:        /cvs
> Module name:    src
> Changes by:     b...@cvs.openbsd.org    2013/06/11 13:01:20
>
> Modified files:
>     sys/kern       : kern_sysctl.c spec_vnops.c vfs_bio.c
>                      vfs_biomem.c vfs_vops.c
>     sys/sys        : buf.h mount.h
>     sys/uvm        : uvm_extern.h uvm_page.c
>     usr.bin/systat : iostat.c
>
> Log message:
> High memory page flipping for the buffer cache.
>
> This change splits the buffer cache free lists into lists of dma reachable
> buffers and high memory buffers based on the ranges returned by pmemrange.
> Buffers move from dma to high memory as they age, but are flipped to dma
> reachable memory if IO is needed to/from and high mem buffer. The total
> amount of buffers allocated is now bufcachepercent of both the dma and
> the high memory region.
>
> This change allows the use of large buffer caches on amd64 using more than
> 4 GB of memory
>
> ok tedu@ krw@ - testing by many.

Indeed it's in the box e.g.
http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/kern/vfs_bio.c etc.,

So >32bit/>~3GB support has been in the box since OpenBSD 5.6 or 5.7.

Awesome, thank you so much for clarifying!

Tinker