Re: expanding past 1 TB on amd64

2013-08-16 Thread Neel Natu
Hi Chris, Alan:

On Sun, Jul 21, 2013 at 1:19 AM, Chris Torek chris.to...@gmail.com wrote:

 (Apologies for delay in reply, family issues)

 I'd be fine with 4 TB instead of 16; and, at this point, with the latest
 patches, it's easily tuned.  The auto-sizing of the direct map is not
 affected by a sparse layout, since it keys off Maxmem, which is not the
 actual physical memory size but one past the last valid physical page.


Here is the patch that I intend to commit:
http://people.freebsd.org/~neel/patches/amd64_pmap_4TB.patch

This patch bumps up the direct map to 4TB and the KVA to 2TB.

It is identical to the patch you posted on hackers except for changing the
limits:
http://lists.freebsd.org/pipermail/freebsd-hackers/2013-July/043139.html
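
Concretely, each PML4 slot spans 512GB, so those sizes imply bumping the
slot counts in amd64/include/pmap.h roughly as follows (a sketch
reconstructed from the sizes above, not copied from the patch; the patch
itself is authoritative):

    /*
     * Sketch only.  Each PML4 entry maps 512GB (2^39 bytes) of virtual
     * address space, so 2TB of KVA and a 4TB direct map imply:
     */
    #define NKPML4E   4   /* kernel map slots: 4 x 512GB = 2TB (was 1) */
    #define NDMPML4E  8   /* direct map slots: 8 x 512GB = 4TB (was 2) */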

Tested inside a bhyve virtual machine with a *very* sparse memory layout,
with a memory segment that cuddles up against the 4TB limit.

best
Neel



 The direct map limit might not need to be twice the kernel virtual size,
 but on Intel memory-controller systems it needs to be greater than the
 KVM size due to moving DRAM up past the PCI hole.  Unless the restriction
 that the direct-map area be a power-of-two size is removed, that winds up
 meaning twice.  (Removing the restriction seems easy enough: instead of
 pa | highbits to obtain the VA and va & ~highbits to obtain the PA, just
 use phys + offset and virt - offset.  I didn't see a reason to bother
 with the effort, though.)

 Chris



Re: expanding past 1 TB on amd64

2013-07-21 Thread Chris Torek
(Apologies for delay in reply, family issues)

I'd be fine with 4 TB instead of 16; and, at this point, with the latest
patches, it's easily tuned.  The auto-sizing of the direct map is not
affected by a sparse layout, since it keys off Maxmem, which is not the
actual physical memory size but one past the last valid physical page.

The direct map limit might not need to be twice the kernel virtual size,
but on Intel memory-controller systems it needs to be greater than the
KVM size due to moving DRAM up past the PCI hole.  Unless the restriction
that the direct-map area be a power-of-two size is removed, that winds up
meaning twice.  (Removing the restriction seems easy enough: instead of
pa | highbits to obtain the VA and va & ~highbits to obtain the PA, just
use phys + offset and virt - offset.  I didn't see a reason to bother
with the effort, though.)
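
For concreteness, a standalone sketch of the two translation schemes
(the base constant is the 1TB direct map base from the vmparam.h layout;
the function names are illustrative, not the kernel's):

    #include <stdint.h>

    #define DMAP_BASE 0xfffffe0000000000ULL   /* direct map base */

    /* Power-of-two scheme: with a suitably sized and aligned direct
     * map, the high bits are a pure tag, set with OR and cleared
     * with AND. */
    static inline uint64_t pa_to_va_pow2(uint64_t pa) { return pa | DMAP_BASE; }
    static inline uint64_t va_to_pa_pow2(uint64_t va) { return va & ~DMAP_BASE; }

    /* Offset scheme: plain add/subtract; no power-of-two size
     * restriction, only a contiguous mapping. */
    static inline uint64_t pa_to_va_off(uint64_t pa) { return pa + DMAP_BASE; }
    static inline uint64_t va_to_pa_off(uint64_t va) { return va - DMAP_BASE; }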

Chris


Re: expanding past 1 TB on amd64

2013-07-17 Thread Darren Pilgrim

On 7/16/2013 2:12 PM, Alan Cox wrote:
 ... The Haswell line of CPUs is widely reported to
 support DIMMs twice as large, and it's due in September.  That would
 make the systems of late 2013 hold up to 1536GB of memory.


I'd point you at stuff like the Supermicro X8BQ6 series of mainboards. 
QP E5-8800 systems with 1 TB of memory have been around since 2011.




Re: expanding past 1 TB on amd64

2013-07-17 Thread Kurt Lidl

On 7/17/2013, Darren Pilgrim wrote:

 On 7/16/2013 2:12 PM, Alan Cox wrote:
  ... The Haswell line of CPUs is widely reported to
  support DIMMs twice as large, and it's due in September.  That would
  make the systems of late 2013 hold up to 1536GB of memory.

 I'd point you at stuff like the Supermicro X8BQ6 series of mainboards.
 QP E5-8800 systems with 1 TB of memory have been around since 2011.


That might have been true, but I did check SuperMicro's
motherboard matrix of available products before posting.

The largest listed memory configuration on
any of their current products is 768GB.

http://www.supermicro.com/products/motherboard/matrix/?cpuclass=all&sorton=memory

-Kurt



Re: expanding past 1 TB on amd64

2013-07-17 Thread Bob Bishop
Hi,

On 17 Jul 2013, at 15:17, Kurt Lidl wrote:

 On 7/16/2013 2:12 PM, Alan Cox wrote:
 ... The Haswell line of CPUs is widely reported to
 support DIMMs twice as large, and it's due in September.  That would
 make the systems of late 2013 hold up to 1536GB of memory.
 
 I'd point you at stuff like the Supermicro X8BQ6 series of mainboards.
 QP E5-8800 systems with 1 TB of memory have been around since 2011.
 
 That might have been true, but I did check SuperMicro's
 motherboard matrix of available products before posting.
 
 The largest listed memory configuration on
 any of their current products is 768GB.
 
 http://www.supermicro.com/products/motherboard/matrix/?cpuclass=all&sorton=memory
 
 -Kurt

http://www.supermicro.com/products/motherboard/Xeon7000

Looks like their matrix is not up-to-date.

--
Bob Bishop
r...@gid.co.uk






Re: expanding past 1 TB on amd64

2013-07-17 Thread Freddie Cash
On Wed, Jul 17, 2013 at 7:50 AM, Bob Bishop r...@gid.co.uk wrote:

 Hi,

 On 17 Jul 2013, at 15:17, Kurt Lidl wrote:

  On 7/16/2013 2:12 PM, Alan Cox wrote:
  ... The Haswell line of CPUs is widely reported to
  support DIMMs twice as large, and it's due in September.  That would
  make the systems of late 2013 hold up to 1536GB of memory.
 
  I'd point you at stuff like the Supermicro X8BQ6 series of mainboards.
  QP E5-8800 systems with 1 TB of memory have been around since 2011.
 
  That might have been true, but I did check SuperMicro's
  motherboard matrix of available products before posting.
 
  The largest listed memory configuration on
  any of their current products is 768GB.
 
 
  http://www.supermicro.com/products/motherboard/matrix/?cpuclass=all&sorton=memory
 
  -Kurt

 http://www.supermicro.com/products/motherboard/Xeon7000

 Looks like their matrix is not up-to-date.


There are also several AMD motherboards that support 1 TB of RAM:
http://www.supermicro.com/products/nfo/AMD_G34.cfm?pg=MOBO

You know, the CPUs that started the 64-bit x86 support ... :)

-- 
Freddie Cash
fjwc...@gmail.com


Re: expanding past 1 TB on amd64

2013-07-17 Thread Kurt Lidl

On 7/17/13 11:26 AM, Freddie Cash wrote:

On Wed, Jul 17, 2013 at 7:50 AM, Bob Bishop r...@gid.co.uk wrote:

 Hi,

 On 17 Jul 2013, at 15:17, Kurt Lidl wrote:

  On 7/16/2013 2:12 PM, Alan Cox wrote:
   ... The Haswell line of CPUs is widely reported to
   support DIMMs twice as large, and it's due in September.  That
   would make the systems of late 2013 hold up to 1536GB of memory.

  I'd point you at stuff like the Supermicro X8BQ6 series of mainboards.
  QP E5-8800 systems with 1 TB of memory have been around since 2011.
 
  That might have been true, but I did check SuperMicro's
  motherboard matrix of available products before posting.
 
  The largest listed memory configuration on
  any of their current products is 768GB.
 
 

  http://www.supermicro.com/products/motherboard/matrix/?cpuclass=all&sorton=memory
 
  -Kurt

http://www.supermicro.com/products/motherboard/Xeon7000

Looks like their matrix is not up-to-date.


There are also several AMD motherboards that support 1 TB of RAM:
http://www.supermicro.com/products/nfo/AMD_G34.cfm?pg=MOBO

You know, the CPUs that started the 64-bit x86 support ... :)


Searching a bit harder, it looks like Intel is shipping a quad-socket
board that supports 1500GB of memory.

http://ark.intel.com/products/61033

So, 1500GB is available now, and the next doubling will probably come
soon, assuming Intel revs their quad-processor boards for Haswell and
that support for 64GB DIMMs is there.

I'm not trying to find the biggest motherboard out there; I'm just
trying to say that Chris's patch for support up to 16TB isn't too
farfetched.  And within the next 5-year window, it's entirely likely
that >4TB systems will be available.

-Kurt



Re: expanding past 1 TB on amd64

2013-07-16 Thread Kurt Lidl

On Wed, Jun 19, 2013 at 1:32 AM, Chris Torek chris.torek at gmail.com wrote:


In src/sys/amd64/include/vmparam.h is this handy map:

 * 0x0000000000000000 - 0x00007fffffffffff   user map
 * 0x0000800000000000 - 0xffff7fffffffffff   does not exist (hole)
 * 0xffff800000000000 - 0xffff804020100fff   recursive page table (512GB slot)
 * 0xffff804020101000 - 0xfffffdffffffffff   unused
 * 0xfffffe0000000000 - 0xfffffeffffffffff   1TB direct map
 * 0xffffff0000000000 - 0xffffff7fffffffff   unused
 * 0xffffff8000000000 - 0xffffffffffffffff   512GB kernel map

showing that the system can deal with at most 1 TB of address space
(because of the direct map), using at most half of that for kernel
memory (less, really, due to the inevitable VM fragmentation).

New boards are coming soonish that will have the ability to go
past that (24 DIMMs of 64 GB each = 1.5 TB).  Or, if some crazy
people :-) might want to use most of a 768 GB board (24 DIMMs of
32 GB each, possible today although the price is kind of
staggering) as wired-down kernel memory, the 512 GB VM area is
already a problem.

I have not wrapped my head around the amd64 pmap code but figured
I'd ask: what might need to change to support larger spaces?
Obviously NKPML4E in amd64/include/pmap.h, for the kernel start
address; and NDMPML4E for the direct map.  It looks like this
would adjust KERNBASE and the direct map appropriately.  But would
that suffice, or have I missed something?

For that matter, if these are changed to make space for future
expansion, what would be a good expansion size?  Perhaps multiply
the sizes by 16?  (If memory doubles roughly every 18 months,
that should give room for at least 5 years.)



On Mon, Jul 15, 2013, Alan Cox wrote:

Chris, Neel,

The actual data that I've seen shows that DIMMs are doubling in size at
about half that pace, about every three years.  For example, see
http://users.ece.cmu.edu/~omutlu/pub/mutlu_memory-scaling_imw13_invited-talk.pdf,
slide #8.  So, I think that a factor of 16 is a lot more than we'll need in
the next five years.  I would suggest configuring the kernel virtual
address space for 4 TB.  Once you go beyond 512 GB, 4 TB is the next
plateau in terms of address translation cost.  At 4 TB all of the PML4
entries for the kernel virtual address space will reside in the same L2
cache line, so a page table walk on a TLB miss for an instruction fetch
will effectively prefetch the PML4 entry for the kernel heap and vice versa.


The largest commodity motherboards that are shipping today support
24 DIMMs, at a max size of 32GB per DIMM.  That's 768GB, right now.
(So FreeBSD is already out of bits in terms of supporting current
shipping hardware.) The Haswell line of CPUs is widely reported to
support DIMMs twice as large, and it's due in September.  That would
make the systems of late 2013 hold up to 1536GB of memory.

Using your figure of doubling in 3 years, we'll see 3072GB systems by
~2016.  And in ~2019, we'll see 6TB systems, and need to finally expand
to using more than a single cache line to hold all the PML4 entries.

Of course, that's speculating furiously about two generations out, and
assumes keeping the current memory architecture / board design
constraints.

-Kurt



Re: expanding past 1 TB on amd64

2013-07-16 Thread Alan Cox
On Tue, Jul 16, 2013 at 7:08 AM, Kurt Lidl l...@pix.net wrote:

 On Wed, Jun 19, 2013 at 1:32 AM, Chris Torek chris.torek at gmail.com
 wrote:

  In src/sys/amd64/include/vmparam.h is this handy map:

  * 0x0000000000000000 - 0x00007fffffffffff   user map
  * 0x0000800000000000 - 0xffff7fffffffffff   does not exist (hole)
  * 0xffff800000000000 - 0xffff804020100fff   recursive page table (512GB slot)
  * 0xffff804020101000 - 0xfffffdffffffffff   unused
  * 0xfffffe0000000000 - 0xfffffeffffffffff   1TB direct map
  * 0xffffff0000000000 - 0xffffff7fffffffff   unused
  * 0xffffff8000000000 - 0xffffffffffffffff   512GB kernel map

 showing that the system can deal with at most 1 TB of address space
 (because of the direct map), using at most half of that for kernel
 memory (less, really, due to the inevitable VM fragmentation).

 New boards are coming soonish that will have the ability to go
 past that (24 DIMMs of 64 GB each = 1.5 TB).  Or, if some crazy
 people :-) might want to use most of a 768 GB board (24 DIMMs of
 32 GB each, possible today although the price is kind of
 staggering) as wired-down kernel memory, the 512 GB VM area is
 already a problem.

 I have not wrapped my head around the amd64 pmap code but figured
 I'd ask: what might need to change to support larger spaces?
 Obviously NKPML4E in amd64/include/pmap.h, for the kernel start
 address; and NDMPML4E for the direct map.  It looks like this
 would adjust KERNBASE and the direct map appropriately.  But would
 that suffice, or have I missed something?

 For that matter, if these are changed to make space for future
 expansion, what would be a good expansion size?  Perhaps multiply
 the sizes by 16?  (If memory doubles roughly every 18 months,
 that should give room for at least 5 years.)


  Chris, Neel,

 The actual data that I've seen shows that DIMMs are doubling in size at
 about half that pace, about every three years.  For example, see
 http://users.ece.cmu.edu/~omutlu/pub/mutlu_memory-scaling_imw13_invited-talk.pdf,
 slide #8.  So, I think that a factor of 16 is a lot more than we'll need in
 the next five years.  I would suggest configuring the kernel virtual
 address space for 4 TB.  Once you go beyond 512 GB, 4 TB is the next
 plateau in terms of address translation cost.  At 4 TB all of the PML4
 entries for the kernel virtual address space will reside in the same L2
 cache line, so a page table walk on a TLB miss for an instruction fetch
 will effectively prefetch the PML4 entry for the kernel heap and vice
 versa.


 The largest commodity motherboards that are shipping today support
 24 DIMMs, at a max size of 32GB per DIMM.  That's 768GB, right now.
 (So FreeBSD is already out of bits in terms of supporting current
 shipping hardware.)



Actually, this scenario with 768 GB of RAM on amd64 as it is today is
analogous to the typical 32-bit i386 machine, where the amount of RAM has
long exceeded the default 1 GB size of the kernel virtual address space.
 In theory, we could currently handle up to 1 TB of RAM, but the kernel
virtual address space would only be 512 GB.


... The Haswell line of CPUs is widely reported to
 support DIMMs twice as large, and it's due in September.  That would
 make the systems of late 2013 hold up to 1536GB of memory.

 Using your figure of doubling in 3 years, we'll see 3072GB systems by
 ~2016.  And in ~2019, we'll see 6TB systems, and need to finally expand
 to using more than a single cache line to hold all the PML4 entries.


Yes, this is a reasonable prognostication.

Alan


 Of course, that's speculating furiously about two generations out, and
 assumes keeping the current memory architecture / board design
 constraints.

 -Kurt




Max mmap space (was: expanding past 1 TB on amd64)

2013-07-16 Thread Bernd Walter
On Tue, Jul 16, 2013 at 02:12:42PM -0700, Alan Cox wrote:
 On Tue, Jul 16, 2013 at 7:08 AM, Kurt Lidl l...@pix.net wrote:
 
  On Wed, Jun 19, 2013 at 1:32 AM, Chris Torek chris.torek at gmail.com
  wrote:
 
   In src/sys/amd64/include/vmparam.h is this handy map:

   * 0x0000000000000000 - 0x00007fffffffffff   user map
   * 0x0000800000000000 - 0xffff7fffffffffff   does not exist (hole)
   * 0xffff800000000000 - 0xffff804020100fff   recursive page table (512GB slot)
   * 0xffff804020101000 - 0xfffffdffffffffff   unused
   * 0xfffffe0000000000 - 0xfffffeffffffffff   1TB direct map
   * 0xffffff0000000000 - 0xffffff7fffffffff   unused
   * 0xffffff8000000000 - 0xffffffffffffffff   512GB kernel map
 
  The actual data that I've seen shows that DIMMs are doubling in size at
  about half that pace, about every three years.  For example, see
   http://users.ece.cmu.edu/~omutlu/pub/mutlu_memory-scaling_imw13_invited-talk.pdf,
   slide #8.  So, I think that a factor of 16 is a lot more than we'll need in
  the next five years.  I would suggest configuring the kernel virtual
   address space for 4 TB.  Once you go beyond 512 GB, 4 TB is the next
  plateau in terms of address translation cost.  At 4 TB all of the PML4
  entries for the kernel virtual address space will reside in the same L2
  cache line, so a page table walk on a TLB miss for an instruction fetch
  will effectively prefetch the PML4 entry for the kernel heap and vice
  versa.
 
 
  The largest commodity motherboards that are shipping today support
  24 DIMMs, at a max size of 32GB per DIMM.  That's 768GB, right now.
  (So FreeBSD is already out of bits in terms of supporting current
  shipping hardware.)
 
 
 
 Actually, this scenario with 768 GB of RAM on amd64 as it is today is
 analogous to the typical 32-bit i386 machine, where the amount of RAM has
 long exceeded the default 1 GB size of the kernel virtual address space.
  In theory, we could currently handle up to 1 TB of RAM, but the kernel
 virtual address space would only be 512 GB.

Talking about virtual address space:
I plan to permanently mmap several multi-GB files (6-8TB altogether)
into a single process address space.
Actually, I see the user map is 128TB, so I shouldn't get into trouble
doing this, and I'd also have a lot of additional space left over to
avoid problems from fragmentation.
The system has 192G of physical memory, so the page tables have enough
space.  Is there anything else to worry about?
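
For what it's worth, a minimal sketch of that kind of mapping (the file
name is hypothetical; one such mapping per file):

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <err.h>
    #include <fcntl.h>
    #include <unistd.h>

    int
    main(void)
    {
            const char *path = "/data/segment0";    /* hypothetical */
            struct stat st;
            void *p;
            int fd;

            if ((fd = open(path, O_RDWR)) == -1)
                    err(1, "open %s", path);
            if (fstat(fd, &st) == -1)
                    err(1, "fstat");

            /* One multi-GB shared mapping; the 128TB user map leaves
             * ample room for 6-8TB worth of these. */
            p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, 0);
            if (p == MAP_FAILED)
                    err(1, "mmap");
            close(fd);              /* the mapping survives the close */

            /* ... use ((char *)p)[0 .. st.st_size - 1] ... */
            return (0);
    }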

-- 
B.Walter be...@bwct.de http://www.bwct.de
Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm.


Re: expanding past 1 TB on amd64

2013-07-15 Thread Alan Cox
On Wed, Jun 19, 2013 at 1:32 AM, Chris Torek chris.to...@gmail.com wrote:

 In src/sys/amd64/include/vmparam.h is this handy map:

  * 0x0000000000000000 - 0x00007fffffffffff   user map
  * 0x0000800000000000 - 0xffff7fffffffffff   does not exist (hole)
  * 0xffff800000000000 - 0xffff804020100fff   recursive page table (512GB slot)
  * 0xffff804020101000 - 0xfffffdffffffffff   unused
  * 0xfffffe0000000000 - 0xfffffeffffffffff   1TB direct map
  * 0xffffff0000000000 - 0xffffff7fffffffff   unused
  * 0xffffff8000000000 - 0xffffffffffffffff   512GB kernel map

 showing that the system can deal with at most 1 TB of address space
 (because of the direct map), using at most half of that for kernel
 memory (less, really, due to the inevitable VM fragmentation).

 New boards are coming soonish that will have the ability to go
 past that (24 DIMMs of 64 GB each = 1.5 TB).  Or, if some crazy
  people :-) might want to use most of a 768 GB board (24 DIMMs of
 32 GB each, possible today although the price is kind of
 staggering) as wired-down kernel memory, the 512 GB VM area is
 already a problem.

 I have not wrapped my head around the amd64 pmap code but figured
 I'd ask: what might need to change to support larger spaces?
 Obviously NKPML4E in amd64/include/pmap.h, for the kernel start
 address; and NDMPML4E for the direct map.  It looks like this
 would adjust KERNBASE and the direct map appropriately.  But would
 that suffice, or have I missed something?

 For that matter, if these are changed to make space for future
 expansion, what would be a good expansion size?  Perhaps multiply
 the sizes by 16?  (If memory doubles roughly every 18 months,
 that should give room for at least 5 years.)


Chris, Neel,

The actual data that I've seen shows that DIMMs are doubling in size at
about half that pace, about every three years.  For example, see
http://users.ece.cmu.edu/~omutlu/pub/mutlu_memory-scaling_imw13_invited-talk.pdf,
slide #8.  So, I think that a factor of 16 is a lot more than we'll need in
the next five years.  I would suggest configuring the kernel virtual
address space for 4 TB.  Once you go beyond 512 GB, 4 TB is the next
plateau in terms of address translation cost.  At 4 TB all of the PML4
entries for the kernel virtual address space will reside in the same L2
cache line, so a page table walk on a TLB miss for an instruction fetch
will effectively prefetch the PML4 entry for the kernel heap and vice versa.
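
(The arithmetic behind that claim: a PML4 entry is 8 bytes and maps
512GB, so

    4TB / 512GB per entry = 8 PML4 entries
    8 entries x 8 bytes   = 64 bytes, exactly one 64-byte cache line

i.e. all eight kernel PML4 entries share a single cache line, and a
walk that loads any one of them brings in the other seven.)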

Also, I don't know if this is immediately relevant to the patch, but the
reason that the direct map is currently twice the size of the kernel
virtual address space is that the largest machine (in terms of physical
memory) that we were running on a couple of years ago had a sparse physical
address space.  Specifically, we needed to have a direct map spanning 1 TB
in order to support 256 GB of RAM on that machine.  This may, for example,
become an issue if you try to autosize the direct map based upon the amount
of DRAM.
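
(A worked instance of that pitfall: the machine in question had 256GB of
RAM, but part of it sat just below the 1TB physical mark, so the direct
map had to span the highest physical address, about 1TB, i.e. two 512GB
PML4 slots; sizing it off the 256GB of installed DRAM would have yielded
a single slot and left the high segment outside the direct map.)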

Alan


expanding past 1 TB on amd64

2013-06-19 Thread Chris Torek
In src/sys/amd64/include/vmparam.h is this handy map:

 * 0x0000000000000000 - 0x00007fffffffffff   user map
 * 0x0000800000000000 - 0xffff7fffffffffff   does not exist (hole)
 * 0xffff800000000000 - 0xffff804020100fff   recursive page table (512GB slot)
 * 0xffff804020101000 - 0xfffffdffffffffff   unused
 * 0xfffffe0000000000 - 0xfffffeffffffffff   1TB direct map
 * 0xffffff0000000000 - 0xffffff7fffffffff   unused
 * 0xffffff8000000000 - 0xffffffffffffffff   512GB kernel map

showing that the system can deal with at most 1 TB of address space
(because of the direct map), using at most half of that for kernel
memory (less, really, due to the inevitable VM fragmentation).
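
For orientation, the base addresses in that map fall directly out of
512GB PML4 slot indices; a standalone sketch (illustrative, not kernel
code):

    #include <stdio.h>

    /* Sign-extended 48-bit address for a given PML4 index; each index
     * covers 512GB (2^39 bytes). */
    static unsigned long long
    kvaddr(unsigned l4)
    {
            return ((~0ULL << 47) | ((unsigned long long)l4 << 39));
    }

    int
    main(void)
    {
            printf("0x%016llx  recursive page table (slot 256)\n", kvaddr(256));
            printf("0x%016llx  direct map base      (slot 508)\n", kvaddr(508));
            printf("0x%016llx  kernel map base      (slot 511)\n", kvaddr(511));
            return (0);
    }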

New boards are coming soonish that will have the ability to go
past that (24 DIMMs of 64 GB each = 1.5 TB).  Or, if some crazy
people :-) might want to use most of a 768 GB board (24 DIMMs of
32 GB each, possible today although the price is kind of
staggering) as wired-down kernel memory, the 512 GB VM area is
already a problem.

I have not wrapped my head around the amd64 pmap code but figured
I'd ask: what might need to change to support larger spaces?
Obviously NKPML4E in amd64/include/pmap.h, for the kernel start
address; and NDMPML4E for the direct map.  It looks like this
would adjust KERNBASE and the direct map appropriately.  But would
that suffice, or have I missed something?

For that matter, if these are changed to make space for future
expansion, what would be a good expansion size?  Perhaps multiply
the sizes by 16?  (If memory doubles roughly every 18 months,
that should give room for at least 5 years.)

Chris