vm.kmem_size controls the maximum size of the kernel's heap, i.e., the region where the kernel's slab and malloc()-like memory allocators obtain their memory. While this heap may occupy the largest portion of the kernel's virtual address space, it cannot occupy the entirety of the address space. There are other things that must be given space within the kernel's address space, for example, the file system buffer map.

ZFS does not, however, use the regular file system buffer cache. The ARC takes its place, and the ARC abuses the kernel's heap like nothing else. So, if you are running a machine that makes only trivial use of a non-ZFS file system, e.g., you boot from UFS but store all of your data in ZFS, then you can dramatically reduce the size of the buffer map via boot loader tunables and proportionately increase vm.kmem_size.
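For example, something along these lines in /boot/loader.conf would express that trade-off (the tunable names are the stock buffer-cache and kmem knobs; the values are purely illustrative, not a recommendation for any particular machine):

# /boot/loader.conf sketch -- illustrative values only
# Shrink the buffer map, since UFS is only used for booting here:
kern.nbuf="4096"
kern.maxbcache="67108864"       # 64MB
# ...and hand the reclaimed kernel address space to the kmem arena / ARC:
vm.kmem_size="400G"
vfs.zfs.arc_max="380G"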

Any further increases in the kernel virtual address space size will, however, require code changes. Small changes, but changes nonetheless.

Alan

On 8/17/2012 3:16 PM, Gezeala M. Bacuño II wrote:
On Fri, Aug 17, 2012 at 7:38 AM, Gezeala M. Bacuño II <[email protected]> wrote:
On Thu, Aug 16, 2012 at 11:55 PM, Andrey Zonov <[email protected]> wrote:
On 8/17/12 7:15 AM, Marie Bacuno II wrote:

On Aug 16, 2012, at 18:47, Garrett Cooper <[email protected]> wrote:

On Thu, Aug 16, 2012 at 6:44 PM, Garrett Cooper <[email protected]> wrote:
On Thu, Aug 16, 2012 at 5:46 PM, Gezeala M. Bacuño II <[email protected]> wrote:
Hello fellow listers,

On a server with 512GB RAM it appears that vm.kmem_size_max is not
being auto-tuned to use >329853485875 (~307GB).

On this machine vm.kmem_size is equal to vm.kmem_size_max

# from sysctl
vm.kmem_size_max: 329853485875
vm.kmem_size: 329853485875

On a machine with 1GB of RAM, I have successfully set vm.kmem_size_max
to 330GB and vm.kmem_size automatically adjusts to 1GB even if I
manually set it in /boot/loader.conf.

But the machine with 512GB of RAM just resets. To get it to boot, we need
to go to the loader prompt and issue:

OK set vm.kmem_size_max="300G"
OK boot

On all PCBSD (8,9) or FreeBSD (8.1,8.2,9) machines we have,
vm.kmem_size_max is always set to 329853485875.

How can I increase vm.kmem_size_max to at least 500GB? And how is
329853485875 determined (what's the formula)? I need to increase
vm.kmem_size_max and vm.kmem_size so I can set vfs.zfs.arc_max (the ZFS
ARC) to, say, 490GB.

I'm browsing through the source code at
http://fxr.watson.org/fxr/search?v=FREEBSD9&string=vm.kmem_size_max
and I'm still trying to make sense of how vm.kmem_size_max is
computed.

I have posted the same topic on forums.freebsd.org but I'm not getting
any recommendations.

Please see the link for additional details:
http://forums.freebsd.org/showthread.php?t=33977

Have you tried defining VM_KMEM_SIZE_MAX to your target value?

It's architecture-specific BTW... see
sys/<architecture>/include/vmparam.h -- look for `VM_KMEM_SIZE_MAX`.
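
For instance, a custom kernel config along these lines (an untested sketch -- it only has an effect if VM_KMEM_SIZE_MAX is actually wired through sys/conf/options into opt_vm.h on your branch, and the value shown is purely illustrative):

# sys/amd64/conf/BIGKMEM -- hypothetical custom kernel config
include         GENERIC
ident           BIGKMEM
# Raise the compile-time ceiling on the kmem arena (value in bytes).
options         VM_KMEM_SIZE_MAX="(450UL*1024*1024*1024)"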

Also, it's a tunable, not a sysctl... so you need to set the value in
/boot/loader.conf.
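e.g.:

# /boot/loader.conf
vm.kmem_size_max="300G"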
-Garrett

Thanks for the quick reply.

Yes, I had it set in /boot/loader.conf and also tried values by trial and
error at the loader prompt.

We were able to bump it to 400G successfully. We tried 500G and 450G, and
the machine just spews out garbage on the screen.

The latest output from "zfs-stats -a" with vm.kmem_size_max="400G" is in
the forum: http://forums.freebsd.org/showthread.php?t=33977

About the code: I am looking at the amd64 architecture and still checking the
values of the variables; I can't just retrieve them using getconf. If you can
point me to Doxygen-like documentation, I'd appreciate it a lot.

Where does the constant value 329853485875 come from?

It comes from this macro:

#define VM_KMEM_SIZE_MAX        ((VM_MAX_KERNEL_ADDRESS - \
     VM_MIN_KERNEL_ADDRESS + 1) * 3 / 5)

((1UL << 39) * 3 / 5) = 329853488332

(The exact auto-tuned value is slightly smaller, 329853485875, because
VM_MAX_KERNEL_ADDRESS is the base address of the last kernel page rather than
the last byte, so VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS + 1 works out
to 549755809793 bytes, and 549755809793 * 3 / 5 = 329853485875 with integer
division.)

AFAIK, VM_MAX_KERNEL_ADDRESS is limited to 512GB.  Maybe it's time to
increase it again.  I would ask kib@ or alc@ about that.

--
Andrey Zonov
Thanks! That's great (great for deriving 512GB), but it looks like bad
news for us; we've really hit some limits there (FreeBSD auto-tuning
wise). As I've stated above, we have tried setting vm.kmem_size_max to
500GB/450GB unsuccessfully, so there may be some part of the code
that's breaking. Is there any thread or discussion you can point me to
as to why they used only 60% (hard-coded) of 512GB?

Some relevant code I've gathered on this machine:

/usr/src/sys/amd64/include/vmparam.h
/*
  * Virtual addresses of things.  Derived from the page directory and
  * page table indexes from pmap.h for precision.
  *
  * 0x0000000000000000 - 0x00007fffffffffff   user map
  * 0x0000800000000000 - 0xffff7fffffffffff   does not exist (hole)
  * 0xffff800000000000 - 0xffff804020100fff   recursive page table (512GB slot)
  * 0xffff804020101000 - 0xfffffdffffffffff   unused
  * 0xfffffe0000000000 - 0xfffffeffffffffff   1TB direct map
  * 0xffffff0000000000 - 0xffffff7fffffffff   unused
  * 0xffffff8000000000 - 0xffffffffffffffff   512GB kernel map
  *
  * Within the kernel map:
  *
  * 0xffffffff80000000                        KERNBASE
  */

#define VM_MAX_KERNEL_ADDRESS   KVADDR(KPML4I, NPDPEPG-1, NPDEPG-1, NPTEPG-1)
#define VM_MIN_KERNEL_ADDRESS   KVADDR(KPML4I, NPDPEPG-512, 0, 0)

/usr/src/sys/amd64/include/pmap.h
/*
  * Pte related macros.  This is complicated by having to deal with
  * the sign extension of the 48th bit.
  */
#define KVADDR(l4, l3, l2, l1) ( \
         ((unsigned long)-1 << 47) | \
         ((unsigned long)(l4) << PML4SHIFT) | \
         ((unsigned long)(l3) << PDPSHIFT) | \
         ((unsigned long)(l2) << PDRSHIFT) | \
         ((unsigned long)(l1) << PAGE_SHIFT))

/usr/src/sys/amd64/include/param.h
#define PAGE_SHIFT      12              /* LOG2(PAGE_SIZE) */
#define PDRSHIFT        21              /* LOG2(NBPDR) */
#define PDPSHIFT        30              /* LOG2(NBPDP) */
#define PML4SHIFT       39              /* LOG2(NBPML4) */

Yet to derive: KPML4I, NPDPEPG, NPDEPG, and NPTEPG. Checking pmap.c,
vm_machdep.c, etc.; see the sketch below.
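
In the meantime, here is a quick userland sketch that plugs those macros together. It assumes NPTEPG = NPDEPG = NPDPEPG = 512 and KPML4I = 511 (the top 512GB PML4 slot), which is what I expect amd64 pmap.h to define -- please double-check against your tree:

/*
 * Userland sketch (not kernel code): reproduce the auto-tuned
 * VM_KMEM_SIZE_MAX from the amd64 vmparam.h / pmap.h macros.
 * Assumed constants: NPTEPG = NPDEPG = NPDPEPG = 512, KPML4I = 511.
 */
#include <stdio.h>

#define PAGE_SHIFT      12              /* LOG2(PAGE_SIZE) */
#define PDRSHIFT        21              /* LOG2(NBPDR) */
#define PDPSHIFT        30              /* LOG2(NBPDP) */
#define PML4SHIFT       39              /* LOG2(NBPML4) */

#define NPTEPG          512             /* page-table entries per page */
#define NPDEPG          512             /* page-directory entries per page */
#define NPDPEPG         512             /* PDP entries per page */
#define KPML4I          511             /* kernel's (top) PML4 slot */

#define KVADDR(l4, l3, l2, l1) ( \
        ((unsigned long)-1 << 47) | \
        ((unsigned long)(l4) << PML4SHIFT) | \
        ((unsigned long)(l3) << PDPSHIFT) | \
        ((unsigned long)(l2) << PDRSHIFT) | \
        ((unsigned long)(l1) << PAGE_SHIFT))

#define VM_MAX_KERNEL_ADDRESS   KVADDR(KPML4I, NPDPEPG-1, NPDEPG-1, NPTEPG-1)
#define VM_MIN_KERNEL_ADDRESS   KVADDR(KPML4I, NPDPEPG-512, 0, 0)

int
main(void)
{
        unsigned long span;

        span = VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS + 1;
        printf("VM_MIN_KERNEL_ADDRESS = %#lx\n", VM_MIN_KERNEL_ADDRESS);
        printf("VM_MAX_KERNEL_ADDRESS = %#lx\n", VM_MAX_KERNEL_ADDRESS);
        printf("kernel map span       = %lu bytes\n", span);
        printf("VM_KMEM_SIZE_MAX      = %lu bytes\n", span * 3 / 5);
        return (0);
}

If those constants are right, it should print VM_KMEM_SIZE_MAX = 329853485875 bytes, matching the auto-tuned sysctl value above.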

Additional Info:
1] Installed using PCBSD-9 Release amd64.

2] uname -a
FreeBSD fmt-iscsi-stg1.musicreports.com 9.0-RELEASE FreeBSD 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011     [email protected]:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC  amd64

3] first few lines from /var/run/dmesg.boot:
FreeBSD 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
    [email protected]:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC amd64
CPU: Intel(R) Xeon(R) CPU E7- 8837  @ 2.67GHz (2666.82-MHz K8-class CPU)
   Origin = "GenuineIntel"  Id = 0x206f2  Family = 6  Model = 2f  Stepping = 2
   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
   Features2=0x29ee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI>
   AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
   AMD Features2=0x1<LAHF>
   TSC: P-state invariant, performance statistics
real memory  = 549755813888 (524288 MB)
avail memory = 530339893248 (505771 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <ALASKA A M I>
FreeBSD/SMP: Multiprocessor System Detected: 64 CPUs
FreeBSD/SMP: 8 package(s) x 8 core(s)

4] relevant sysctls (manually tuned values are marked):
kern.maxusers: 384
kern.maxvnodes: 8222162
vfs.numvnodes: 675740
vfs.freevnodes: 417524
kern.ipc.somaxconn: 128
kern.openfiles: 5238
vfs.zfs.arc_max: 428422987776
vfs.zfs.arc_min: 53552873472
vfs.zfs.arc_meta_used: 3167391088
vfs.zfs.arc_meta_limit: 107105746944
vm.kmem_size_max: 429496729600    ==>> manually tuned
vm.kmem_size: 429496729600    ==>> manually tuned
vm.kmem_map_free: 107374727168
vm.kmem_map_size: 144625156096
vfs.wantfreevnodes: 2055540
kern.minvnodes: 2055540
kern.maxfiles: 197248    ==>> manually tuned
vm.vmtotal:
System wide totals computed every five seconds: (values in kilobytes)
===============================================
Processes:              (RUNQ: 1 Disk Wait: 1 Page Wait: 0 Sleep: 150)
Virtual Memory:         (Total: 1086325716K Active: 12377876K)
Real Memory:            (Total: 144143408K Active: 803432K)
Shared Virtual Memory:  (Total: 81384K Active: 37560K)
Shared Real Memory:     (Total: 32224K Active: 27548K)
Free Memory Pages:      365565564K

hw.availpages: 134170294
hw.physmem: 549561524224
hw.usermem: 391395241984
hw.realmem: 551836188672
vm.kmem_size_scale: 1
kern.ipc.nmbclusters: 2560000    ==>> manually tuned
kern.ipc.maxsockbuf: 2097152
net.inet.tcp.sendbuf_max: 2097152
net.inet.tcp.recvbuf_max: 2097152
kern.maxfilesperproc: 18000
net.inet.ip.intr_queue_maxlen: 256
kern.maxswzone: 33554432
kern.ipc.shmmax: 10737418240    ==>> manually tuned
kern.ipc.shmall: 2621440    ==>> manually tuned
vfs.zfs.write_limit_override: 0
vfs.zfs.prefetch_disable: 0
hw.pagesize: 4096
kern.ipc.maxpipekva: 8586895360
kern.ipc.shm_use_phys: 1    ==>> manually tuned
vfs.vmiodirenable: 1
debug.numcache: 632148
vfs.ncsizefactor: 2
vm.kvm_size: 549755809792
vm.kvm_free: 54456741888
kern.ipc.semmni: 256
kern.ipc.semmns: 512
kern.ipc.semmnu: 256


Thanks!


