On 08/18/2012 19:57, Gezeala M. Bacuño II wrote:
> On Sat, Aug 18, 2012 at 12:14 PM, Alan Cox <[email protected]> wrote:
>> On 08/17/2012 17:08, Gezeala M. Bacuño II wrote:
>>> On Fri, Aug 17, 2012 at 1:58 PM, Alan Cox <[email protected]> wrote:
>>>> vm.kmem_size controls the maximum size of the kernel's heap, i.e.,
>>>> the region where the kernel's slab and malloc()-like memory
>>>> allocators obtain their memory. While this heap may occupy the
>>>> largest portion of the kernel's virtual address space, it cannot
>>>> occupy the entirety of the address space. There are other things
>>>> that must be given space within the kernel's address space, for
>>>> example, the file system buffer map.
>>>>
>>>> ZFS does not, however, use the regular file system buffer cache.
>>>> The ARC takes its place, and the ARC abuses the kernel's heap like
>>>> nothing else. So, if you are running a machine that only makes
>>>> trivial use of a non-ZFS file system, like you boot from UFS but
>>>> store all of your data in ZFS, then you can dramatically reduce the
>>>> size of the buffer map via boot loader tuneables and
>>>> proportionately increase vm.kmem_size. Any further increases in the
>>>> kernel virtual address space size will, however, require code
>>>> changes. Small changes, but changes nonetheless.
>>>>
>>>> Alan
>>>
>>> <<snip>>
>>>
>>> Additional info:
>>>
>>> 1] Installed using PCBSD-9 Release amd64.
>>>
>>> 2] uname -a:
>>> FreeBSD fmt-iscsi-stg1.musicreports.com 9.0-RELEASE FreeBSD
>>> 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>> [email protected]:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>>> amd64
>>>
>>> 3] First few lines from /var/run/dmesg.boot:
>>> FreeBSD 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>> [email protected]:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>>> amd64
>>> CPU: Intel(R) Xeon(R) CPU E7- 8837 @ 2.67GHz (2666.82-MHz K8-class CPU)
>>> Origin = "GenuineIntel" Id = 0x206f2 Family = 6 Model = 2f Stepping = 2
>>> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>>> Features2=0x29ee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI>
>>> AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
>>> AMD Features2=0x1<LAHF>
>>> TSC: P-state invariant, performance statistics
>>> real memory  = 549755813888 (524288 MB)
>>> avail memory = 530339893248 (505771 MB)
>>> Event timer "LAPIC" quality 600
>>> ACPI APIC Table: <ALASKA A M I>
>>> FreeBSD/SMP: Multiprocessor System Detected: 64 CPUs
>>> FreeBSD/SMP: 8 package(s) x 8 core(s)
>>>
>>> 4] Relevant sysctls, with manual tuning noted:
>>>
>>> kern.maxusers: 384
>>> kern.maxvnodes: 8222162
>>> vfs.numvnodes: 675740
>>> vfs.freevnodes: 417524
>>> kern.ipc.somaxconn: 128
>>> kern.openfiles: 5238
>>> vfs.zfs.arc_max: 428422987776
>>> vfs.zfs.arc_min: 53552873472
>>> vfs.zfs.arc_meta_used: 3167391088
>>> vfs.zfs.arc_meta_limit: 107105746944
>>> vm.kmem_size_max: 429496729600 ==>> manually tuned
>>> vm.kmem_size: 429496729600 ==>> manually tuned
>>> vm.kmem_map_free: 107374727168
>>> vm.kmem_map_size: 144625156096
>>> vfs.wantfreevnodes: 2055540
>>> kern.minvnodes: 2055540
>>> kern.maxfiles: 197248 ==>> manually tuned
>>> vm.vmtotal:
>>> System wide totals computed every five seconds: (values in kilobytes)
>>> ===============================================
>>> Processes: (RUNQ: 1 Disk Wait: 1 Page Wait: 0 Sleep: 150)
>>> Virtual Memory: (Total: 1086325716K Active: 12377876K)
>>> Real Memory: (Total: 144143408K Active: 803432K)
>>> Shared Virtual Memory: (Total: 81384K Active: 37560K)
>>> Shared Real Memory: (Total: 32224K Active: 27548K)
>>> Free Memory Pages: 365565564K
>>> hw.availpages: 134170294
>>> hw.physmem: 549561524224
>>> hw.usermem: 391395241984
>>> hw.realmem: 551836188672
>>> vm.kmem_size_scale: 1
>>> kern.ipc.nmbclusters: 2560000 ==>> manually tuned
>>> kern.ipc.maxsockbuf: 2097152
>>> net.inet.tcp.sendbuf_max: 2097152
>>> net.inet.tcp.recvbuf_max: 2097152
>>> kern.maxfilesperproc: 18000
>>> net.inet.ip.intr_queue_maxlen: 256
>>> kern.maxswzone: 33554432
>>> kern.ipc.shmmax: 10737418240 ==>> manually tuned
>>> kern.ipc.shmall: 2621440 ==>> manually tuned
>>> vfs.zfs.write_limit_override: 0
>>> vfs.zfs.prefetch_disable: 0
>>> hw.pagesize: 4096
>>> hw.availpages: 134170294
>>> kern.ipc.maxpipekva: 8586895360
>>> kern.ipc.shm_use_phys: 1 ==>> manually tuned
>>> vfs.vmiodirenable: 1
>>> debug.numcache: 632148
>>> vfs.ncsizefactor: 2
>>> vm.kvm_size: 549755809792
>>> vm.kvm_free: 54456741888
>>> kern.ipc.semmni: 256
>>> kern.ipc.semmns: 512
>>> kern.ipc.semmnu: 256
>>>
>>> Thanks. It will be mainly used for PostgreSQL and Java. We have a
>>> huge DB (3 TB and growing), and we need to keep as much of it as we
>>> can in ZFS's ARC. All data resides on zpools, while root is on UFS.
>>>
>>> On our 8.2 and 9 machines, vm.kmem_size is always auto-tuned to
>>> almost the same size as the installed RAM. What I've tuned on those
>>> machines is to lower vfs.zfs.arc_max to 50% or 75% of vm.kmem_size;
>>> that has worked well for us, and the machines do not swap. On this
>>> machine, though, I think I need to adjust my formula for tuning
>>> vfs.zfs.arc_max: reserving 25% for everything else is probably
>>> overkill.
>>>
>>> We were able to successfully bump vm.kmem_size_max and vm.kmem_size
>>> to 400 GB:
>>>
>>> vm.kmem_size_max: 429496729600 ==>> manually tuned
>>> vm.kmem_size: 429496729600 ==>> manually tuned
>>> vfs.zfs.arc_max: 428422987776 ==>> auto-tuned (vm.kmem_size - 1G)
>>> vfs.zfs.arc_min: 53552873472 ==>> auto-tuned
>>>
>>> Which other tuneables do I need to set in /boot/loader.conf so we
>>> can boot the machine with vm.kmem_size > 400G? As I don't know which
>>> part of the boot-up process is failing with vm.kmem_size/_max set to
>>> 450G or 500G, I have no idea which to tune next.
>>
>> Your objective should be to reduce the value of "sysctl
>> vfs.maxbufspace". You can do this by setting the loader.conf tuneable
>> "kern.maxbcache" to the desired value. What does your machine
>> currently report for "sysctl vfs.maxbufspace"?
>
> Here you go:
>
> vfs.maxbufspace: 54967025664
> kern.maxbcache: 0
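[Editor's note: a back-of-the-envelope check, using only the values reported in this thread, of why the larger settings refuse to boot. It approximates the "other things" in the kernel address space by vfs.maxbufspace alone, so treat the remainders as optimistic upper bounds.]

```shell
#!/bin/sh
# The kernel heap (vm.kmem_size) and the buffer map (vfs.maxbufspace)
# must both fit inside the kernel virtual address space (vm.kvm_size),
# so their sum bounds how far kmem_size can grow.
kvm_size=549755809792      # vm.kvm_size reported above (512 GiB KVA)
maxbufspace=54967025664    # vfs.maxbufspace reported above (~51 GiB)

for kmem in $((450 * 1073741824)) $((500 * 1073741824)); do
    left=$((kvm_size - kmem - maxbufspace))
    echo "kmem_size=${kmem}: ${left} bytes of KVA left for everything else"
done
```

At 450 GiB roughly 11.6 GB of KVA remains for everything else, which is tight; at 500 GiB the remainder is negative, so that setting cannot work at all without shrinking the buffer map first.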
Try setting kern.maxbcache to two billion and adding 50 billion to the setting of vm.kmem_size{,_max}.
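[Editor's note: the suggestion above translates into /boot/loader.conf lines roughly like the following sketch. The byte counts are simply this thread's numbers (the existing 400 GiB vm.kmem_size plus the suggested 50 billion), not universal defaults.]

```shell
#!/bin/sh
# Cap the UFS buffer cache near 2 GB and hand the reclaimed kernel
# virtual address space back to the kernel heap for the ARC.
maxbcache=2000000000                   # "two billion", per the advice above
kmem=$((429496729600 + 50000000000))   # current setting + 50 billion bytes

cat <<EOF
kern.maxbcache="${maxbcache}"
vm.kmem_size="${kmem}"
vm.kmem_size_max="${kmem}"
EOF
```

Running the script prints the candidate loader.conf fragment; the new vm.kmem_size works out to 479496729600 bytes.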
> Other (probably) relevant values:
>
> vfs.hirunningspace: 16777216
> vfs.lorunningspace: 11206656
> vfs.bufdefragcnt: 0
> vfs.buffreekvacnt: 2
> vfs.bufreusecnt: 320149
> vfs.hibufspace: 54966370304
> vfs.lobufspace: 54966304768
> vfs.maxmallocbufspace: 2748318515
> vfs.bufmallocspace: 0
> vfs.bufspace: 10490478592
> vfs.runningbufspace: 0
>
> Let me know if you need other tuneables or sysctl values. Thanks a lot
> for looking into this.
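[Editor's note: the ARC sizing rules mentioned earlier in the thread, worked through with this machine's numbers. This is an illustrative sketch, not an official formula: the 50-75% figures are the poster's rule of thumb, and the "auto" line is simply what FreeBSD's auto-tune came out to on this box.]

```shell
#!/bin/sh
# vfs.zfs.arc_max candidates as fractions of vm.kmem_size.
kmem=429496729600                     # vm.kmem_size as tuned here (400 GiB)
echo "50%:  $((kmem / 2))"
echo "75%:  $((kmem * 3 / 4))"
echo "auto: $((kmem - 1073741824))"   # kmem_size - 1 GiB; matches the
                                      # reported vfs.zfs.arc_max above
```

The auto-tuned value leaves only 1 GiB of heap headroom, which is why the poster caps arc_max well below it on busy machines.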
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "[email protected]"
