Re: expanding past 1 TB on amd64
Hi Chris, Alan:

On Sun, Jul 21, 2013 at 1:19 AM, Chris Torek <chris.to...@gmail.com> wrote:
> (Apologies for delay in reply, family issues)
>
> I'd be fine with 4 TB instead of 16; and, at this point, with the latest
> patches, it's easily tuned. The auto-sizing of the direct map is not
> affected by sparse space, as it keys off Maxmem, which is not actually
> the physical size but rather one past the last valid physical page.

Here is the patch that I intend to commit:
http://people.freebsd.org/~neel/patches/amd64_pmap_4TB.patch

This patch bumps up the direct map to 4TB and the KVA to 2TB. It is
identical to the patch you posted on hackers except for changing the limits:
http://lists.freebsd.org/pipermail/freebsd-hackers/2013-July/043139.html

Tested inside a bhyve virtual machine with a *very* sparse memory layout,
with a memory segment that cuddles up with the 4TB limit.

best
Neel

> The direct map limit might not need to be twice the kernel virtual size,
> but on Intel memory-controller systems it needs to be greater than the
> KVM size due to DRAM being moved up past the PCI hole. Unless the
> restriction that the direct-map area be a power-of-two size is removed,
> that winds up meaning twice. (Removing the restriction seems easy
> enough: instead of pa | highbits to obtain the VA and va & ~highbits to
> obtain the PA, just use phys + offset and virt - offset. I didn't see a
> reason to bother with the effort, though.)
>
> Chris
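For readers following along, the slot arithmetic behind those numbers is easy
to check: one PML4 entry maps 512 GB, so a 2 TB kernel map and a 4 TB direct
map presumably correspond to 4 and 8 PML4 slots (the NKPML4E and NDMPML4E
constants discussed elsewhere in the thread). A small stand-alone check,
purely illustrative and not taken from the patch itself:

    #include <stdio.h>

    #define GB ((unsigned long long)1 << 30)
    #define PML4E_SPAN (512 * GB)          /* bytes mapped by one PML4 entry */

    int main(void)
    {
        unsigned long long kva  = 2048 * GB;  /* 2 TB kernel virtual address space */
        unsigned long long dmap = 4096 * GB;  /* 4 TB direct map */

        /* Prints 4 and 8: the NKPML4E/NDMPML4E-style slot counts. */
        printf("kernel map PML4 slots: %llu\n", kva / PML4E_SPAN);
        printf("direct map PML4 slots: %llu\n", dmap / PML4E_SPAN);
        return 0;
    }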
Re: expanding past 1 TB on amd64
(Apologies for delay in reply, family issues)

I'd be fine with 4 TB instead of 16; and, at this point, with the latest
patches, it's easily tuned. The auto-sizing of the direct map is not
affected by sparse space, as it keys off Maxmem, which is not actually the
physical size but rather one past the last valid physical page.

The direct map limit might not need to be twice the kernel virtual size,
but on Intel memory-controller systems it needs to be greater than the KVM
size due to DRAM being moved up past the PCI hole. Unless the restriction
that the direct-map area be a power-of-two size is removed, that winds up
meaning twice. (Removing the restriction seems easy enough: instead of
pa | highbits to obtain the VA and va & ~highbits to obtain the PA, just
use phys + offset and virt - offset. I didn't see a reason to bother with
the effort, though.)

Chris
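To make the power-of-two point concrete, here is a stand-alone sketch of the
two conversion styles Chris describes. The mask form matches the shape of the
existing PHYS_TO_DMAP/DMAP_TO_PHYS macros (DMAP_MIN_ADDRESS here is the 1 TB
direct-map base from the vmparam.h map quoted later in the thread); the
offset form is the alternative that drops the power-of-two size/alignment
requirement. Illustrative only, not the actual vmparam.h:

    #include <stdio.h>

    #define DMAP_MIN_ADDRESS 0xfffffe0000000000ULL   /* direct map base */

    /* Mask form (pa | highbits, va & ~highbits): needs a power-of-two,
     * naturally aligned direct-map region. */
    #define PHYS_TO_DMAP_MASK(pa) ((pa) | DMAP_MIN_ADDRESS)
    #define DMAP_TO_PHYS_MASK(va) ((va) & ~DMAP_MIN_ADDRESS)

    /* Offset form (phys + offset, virt - offset): works for any base. */
    #define PHYS_TO_DMAP_OFF(pa)  ((pa) + DMAP_MIN_ADDRESS)
    #define DMAP_TO_PHYS_OFF(va)  ((va) - DMAP_MIN_ADDRESS)

    int main(void)
    {
        unsigned long long pa = 0x12345000ULL;  /* arbitrary physical address */

        printf("mask form:   va=%#llx, back to pa=%#llx\n",
            PHYS_TO_DMAP_MASK(pa), DMAP_TO_PHYS_MASK(PHYS_TO_DMAP_MASK(pa)));
        printf("offset form: va=%#llx, back to pa=%#llx\n",
            PHYS_TO_DMAP_OFF(pa), DMAP_TO_PHYS_OFF(PHYS_TO_DMAP_OFF(pa)));
        return 0;
    }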
Re: expanding past 1 TB on amd64
On 7/16/2013 2:12 PM, Alan Cox wrote:
> [...]
>> The Haswell line of CPUs is widely reported to support DIMMs twice as
>> large, and it's due in September. That would make the systems of late
>> 2013 hold up to 1536GB of memory.

I'd point you at stuff like the Supermicro X8QB6 series of mainboards.
QP E7-8800 systems with 1 TB of memory have been around since 2011.
Re: expanding past 1 TB on amd64
> On 7/16/2013 2:12 PM, Alan Cox wrote:
>> [...]
>
> I'd point you at stuff like the Supermicro X8QB6 series of mainboards.
> QP E7-8800 systems with 1 TB of memory have been around since 2011.

That might have been true, but I did check SuperMicro's motherboard matrix
of available products before posting. The largest listed memory
configuration on any of their current products is 768GB.

http://www.supermicro.com/products/motherboard/matrix/?cpuclass=all&sorton=memory

-Kurt
Re: expanding past 1 TB on amd64
Hi,

On 17 Jul 2013, at 15:17, Kurt Lidl wrote:
> [...]
>
> That might have been true, but I did check SuperMicro's motherboard
> matrix of available products before posting. The largest listed memory
> configuration on any of their current products is 768GB.
>
> http://www.supermicro.com/products/motherboard/matrix/?cpuclass=all&sorton=memory
>
> -Kurt

http://www.supermicro.com/products/motherboard/Xeon7000

Looks like their matrix is not up-to-date.

--
Bob Bishop
r...@gid.co.uk
Re: expanding past 1 TB on amd64
On Wed, Jul 17, 2013 at 7:50 AM, Bob Bishop <r...@gid.co.uk> wrote:
> [...]
>
> http://www.supermicro.com/products/motherboard/Xeon7000
>
> Looks like their matrix is not up-to-date.

There are also several AMD motherboards that support 1 TB of RAM:
http://www.supermicro.com/products/nfo/AMD_G34.cfm?pg=MOBO

You know, the CPUs that started the 64-bit x86 support ... :)

--
Freddie Cash
fjwc...@gmail.com
Re: expanding past 1 TB on amd64
On 7/17/13 11:26 AM, Freddie Cash wrote:
> [...]
>
> There are also several AMD motherboards that support 1 TB of RAM:
> http://www.supermicro.com/products/nfo/AMD_G34.cfm?pg=MOBO
>
> You know, the CPUs that started the 64-bit x86 support ... :)

Searching a bit harder, it looks like Intel is shipping a quad-socket
board that supports 1500GB of memory:

http://ark.intel.com/products/61033

So, 1500GB is now, and the next doubling will probably come soon, assuming
Intel revs their quad-processor boards for Haswell and that support for
64GB DIMMs is there.

I'm not trying to find the biggest motherboard out there; I'm just trying
to say that Chris' patch for support up to 16TB isn't too farfetched. And
within the next 5-year window, it's entirely likely that 4TB systems will
be available.

-Kurt
Re: expanding past 1 TB on amd64
> On Wed, Jun 19, 2013 at 1:32 AM, Chris Torek <chris.torek at gmail.com> wrote:
>> [...]
>>
>> For that matter, if these are changed to make space for future
>> expansion, what would be a good expansion size? Perhaps multiply the
>> sizes by 16? (If memory doubles roughly every 18 months, that should
>> give room for at least 5 years.)
>
> Chris, Neel,
>
> The actual data that I've seen shows that DIMMs are doubling in size at
> about half that pace, about every three years. For example, see
> http://users.ece.cmu.edu/~omutlu/pub/mutlu_memory-scaling_imw13_invited-talk.pdf,
> slide #8. So, I think that a factor of 16 is a lot more than we'll need
> in the next five years.
>
> I would suggest configuring the kernel virtual address space for 4 TB.
> Once you go beyond 512 GB, 4 TB is the next plateau in terms of address
> translation cost. At 4 TB all of the PML4 entries for the kernel virtual
> address space will reside in the same L2 cache line, so a page table walk
> on a TLB miss for an instruction fetch will effectively prefetch the PML4
> entry for the kernel heap and vice versa.

The largest commodity motherboards that are shipping today support 24
DIMMs, at a max size of 32GB per DIMM. That's 768GB, right now. (So
FreeBSD is already out of bits in terms of supporting current shipping
hardware.)

The Haswell line of CPUs is widely reported to support DIMMs twice as
large, and it's due in September. That would make the systems of late 2013
hold up to 1536GB of memory. Using your figure of doubling in 3 years,
we'll see 3072GB systems by ~2016. And in ~2019, we'll see 6TB systems,
and need to finally expand to using more than a single cache line to hold
all the PML4 entries.

Of course, that's speculating furiously about two generations out, and
assumes keeping the current memory architecture / board design
constraints.

-Kurt
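For anyone who wants to check Alan's cache-line figure, the arithmetic is
short. Assuming the standard x86-64 numbers (8-byte PML4 entries, 512 GB
mapped per entry, 64-byte cache lines), 4 TB of kernel virtual address space
needs 8 entries, which is 64 bytes, exactly one cache line:

    #include <stdio.h>

    #define PML4E_SIZE 8ULL             /* bytes per PML4 entry (x86-64) */
    #define PML4E_SPAN (512ULL << 30)   /* bytes mapped per PML4 entry: 512 GB */
    #define CACHE_LINE 64ULL            /* bytes per cache line */

    int main(void)
    {
        unsigned long long kva = 4ULL << 40;            /* 4 TB of kernel VA */
        unsigned long long entries = kva / PML4E_SPAN;  /* -> 8 */

        printf("PML4 entries for 4 TB: %llu -> %llu bytes (%llu cache line(s))\n",
            entries, entries * PML4E_SIZE,
            entries * PML4E_SIZE / CACHE_LINE);
        return 0;
    }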
Re: expanding past 1 TB on amd64
On Tue, Jul 16, 2013 at 7:08 AM, Kurt Lidl <l...@pix.net> wrote:
> [...]
>
> The largest commodity motherboards that are shipping today support 24
> DIMMs, at a max size of 32GB per DIMM. That's 768GB, right now. (So
> FreeBSD is already out of bits in terms of supporting current shipping
> hardware.)

Actually, this scenario with 768 GB of RAM on amd64 as it is today is
analogous to the typical 32-bit i386 machine, where the amount of RAM has
long exceeded the default 1 GB size of the kernel virtual address space.
In theory, we could currently handle up to 1 TB of RAM, but the kernel
virtual address space would only be 512 GB.

> The Haswell line of CPUs is widely reported to support DIMMs twice as
> large, and it's due in September. That would make the systems of late
> 2013 hold up to 1536GB of memory. Using your figure of doubling in 3
> years, we'll see 3072GB systems by ~2016. And in ~2019, we'll see 6TB
> systems, and need to finally expand to using more than a single cache
> line to hold all the PML4 entries.

Yes, this is a reasonable prognostication.

Alan

> Of course, that's speculating furiously about two generations out, and
> assumes keeping the current memory architecture / board design
> constraints.
Max mmap space (was: expanding past 1 TB on amd64)
On Tue, Jul 16, 2013 at 02:12:42PM -0700, Alan Cox wrote:
> [...]
>
> Actually, this scenario with 768 GB of RAM on amd64 as it is today is
> analogous to the typical 32-bit i386 machine, where the amount of RAM
> has long exceeded the default 1 GB size of the kernel virtual address
> space. In theory, we could currently handle up to 1 TB of RAM, but the
> kernel virtual address space would only be 512 GB.

Talking about virtual address space: I plan to permanently mmap several
multi-GB files (all together 6-8TB) into a single process address space.
Actually, I see the user map is 128TB, so I shouldn't get into trouble by
doing this, and I'd also have a lot of additional space left over to avoid
problems from fragmentation. The system has 192G physical memory, so the
mapping tables have enough space. Is there anything else to worry about?

--
B.Walter <be...@bwct.de>                          http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD machines, and more.
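For what it's worth, the kind of long-lived mapping Bernd describes is just a
series of large MAP_SHARED file mappings; a minimal sketch (the file paths
and the read/write protection are made up for illustration):

    #include <err.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Map one large file into the process and leave it mapped; called once
     * per data file.  With a 128 TB user map, several multi-GB (or even
     * multi-TB) mappings like this fit comfortably. */
    static void *
    map_big_file(const char *path, size_t *lenp)
    {
        struct stat st;
        int fd;
        void *p;

        if ((fd = open(path, O_RDWR)) == -1)
            err(1, "open %s", path);
        if (fstat(fd, &st) == -1)
            err(1, "fstat %s", path);
        p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
            MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            err(1, "mmap %s", path);
        close(fd);              /* the mapping survives the close */
        *lenp = (size_t)st.st_size;
        return (p);
    }

    int
    main(void)
    {
        /* Hypothetical data files; substitute the real ones. */
        const char *files[] = { "/data/chunk0", "/data/chunk1", "/data/chunk2" };
        size_t len;

        for (size_t i = 0; i < sizeof(files) / sizeof(files[0]); i++) {
            void *base = map_big_file(files[i], &len);
            printf("%s mapped at %p (%zu bytes)\n", files[i], base, len);
        }
        return (0);
    }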
Re: expanding past 1 TB on amd64
On Wed, Jun 19, 2013 at 1:32 AM, Chris Torek <chris.to...@gmail.com> wrote:
> [...]
>
> I have not wrapped my head around the amd64 pmap code but figured I'd
> ask: what might need to change to support larger spaces? Obviously
> NKPML4E in amd64/include/pmap.h, for the kernel start address; and
> NDMPML4E for the direct map. It looks like this would adjust KERNBASE
> and the direct map appropriately. But would that suffice, or have I
> missed something?
>
> For that matter, if these are changed to make space for future
> expansion, what would be a good expansion size? Perhaps multiply the
> sizes by 16? (If memory doubles roughly every 18 months, that should
> give room for at least 5 years.)

Chris, Neel,

The actual data that I've seen shows that DIMMs are doubling in size at
about half that pace, about every three years. For example, see
http://users.ece.cmu.edu/~omutlu/pub/mutlu_memory-scaling_imw13_invited-talk.pdf,
slide #8. So, I think that a factor of 16 is a lot more than we'll need in
the next five years.

I would suggest configuring the kernel virtual address space for 4 TB.
Once you go beyond 512 GB, 4 TB is the next plateau in terms of address
translation cost. At 4 TB all of the PML4 entries for the kernel virtual
address space will reside in the same L2 cache line, so a page table walk
on a TLB miss for an instruction fetch will effectively prefetch the PML4
entry for the kernel heap and vice versa.

Also, I don't know if this is immediately relevant to the patch, but the
reason that the direct map is currently twice the size of the kernel
virtual address space is that the largest machine (in terms of physical
memory) that we were running on a couple of years ago had a sparse
physical address space. Specifically, we needed to have a direct map
spanning 1 TB in order to support 256 GB of RAM on that machine. This may,
for example, become an issue if you try to autosize the direct map based
upon the amount of DRAM.

Alan
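A small illustration of the sparse-layout point, and of why sizing from the
highest valid physical address (the Maxmem-style figure Neel mentions earlier
in the thread) rather than from the amount of installed DRAM sidesteps it;
this is a toy calculation, not the pmap code:

    #include <stdio.h>

    #define GB ((unsigned long long)1 << 30)
    #define PML4E_SPAN (512 * GB)          /* one PML4 entry maps 512 GB */

    /* Direct-map PML4 slots needed to reach a given highest physical
     * address, rounding up. */
    static unsigned long long
    dmap_slots(unsigned long long highest_phys)
    {
        return ((highest_phys + PML4E_SPAN - 1) / PML4E_SPAN);
    }

    int main(void)
    {
        /* Dense layout: 256 GB of RAM packed from address 0 -> 1 slot.
         * Sparse layout: the same 256 GB scattered across a 1 TB physical
         * range -> 2 slots, i.e. a 1 TB direct map. */
        printf("dense 256 GB of RAM:              %llu slot(s)\n",
            dmap_slots(256 * GB));
        printf("256 GB spread across a 1 TB span: %llu slot(s)\n",
            dmap_slots(1024 * GB));
        return 0;
    }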
expanding past 1 TB on amd64
In src/sys/amd64/include/vmparam.h is this handy map:

 * 0x0000000000000000 - 0x00007fffffffffff   user map
 * 0x0000800000000000 - 0xffff7fffffffffff   does not exist (hole)
 * 0xffff800000000000 - 0xffff804020100fff   recursive page table (512GB slot)
 * 0xffff804020101000 - 0xfffffdffffffffff   unused
 * 0xfffffe0000000000 - 0xfffffeffffffffff   1TB direct map
 * 0xffffff0000000000 - 0xffffff7fffffffff   unused
 * 0xffffff8000000000 - 0xffffffffffffffff   512GB kernel map

showing that the system can deal with at most 1 TB of address space
(because of the direct map), using at most half of that for kernel memory
(less, really, due to the inevitable VM fragmentation).

New boards are coming soonish that will have the ability to go past that
(24 DIMMs of 64 GB each = 1.5 TB). Or, if some crazy people :-) might want
to use most of a 768 GB board (24 DIMMs of 32 GB each, possible today
although the price is kind of staggering) as wired-down kernel memory, the
512 GB VM area is already a problem.

I have not wrapped my head around the amd64 pmap code but figured I'd ask:
what might need to change to support larger spaces? Obviously NKPML4E in
amd64/include/pmap.h, for the kernel start address; and NDMPML4E for the
direct map. It looks like this would adjust KERNBASE and the direct map
appropriately. But would that suffice, or have I missed something?

For that matter, if these are changed to make space for future expansion,
what would be a good expansion size? Perhaps multiply the sizes by 16?
(If memory doubles roughly every 18 months, that should give room for at
least 5 years.)

Chris
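As a rough guide to how those two constants relate to the map above (a
simplified sketch, not the actual pmap.h): each PML4 entry covers 2^39 bytes
= 512 GB, so the 512 GB kernel map and the 1 TB direct map correspond to one
and two slots respectively, and growing either region means bumping its slot
count:

    #include <stdio.h>

    /* Illustrative values matching the map above: 1 kernel-map slot and
     * 2 direct-map slots, each slot covering 2^39 bytes (512 GB). */
    #define PML4SHIFT 39ULL     /* log2(bytes mapped per PML4 entry) */
    #define NKPML4E   1ULL      /* kernel map PML4 slots */
    #define NDMPML4E  2ULL      /* direct map PML4 slots */

    int main(void)
    {
        printf("kernel map: %llu GB\n", (NKPML4E << PML4SHIFT) >> 30);
        printf("direct map: %llu GB\n", (NDMPML4E << PML4SHIFT) >> 30);
        return 0;
    }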