Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday
2012-06-14 19:11, tpc...@mklab.ph.rhul.ac.uk wrote:
>>> In message <201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk>, tpc...@mklab.ph.rhul.ac.uk writes:
>>>> Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
>>> My WAG is that your zpool history is hanging due to lack of RAM.
>> Interesting. In the problem state the system is usually quite responsive, i.e. not thrashing. Under Linux, which I'm more familiar with, 'used memory' = 'total memory' - 'free memory' refers to physical memory being used for data caching by the kernel (which is still available for processes to allocate as needed) together with memory allocated to processes, as opposed to only physical memory already allocated and therefore really 'used'. Does this mean something different under Solaris?
> Well, it is roughly similar. In Solaris there is a general notion
> [snipped]

Dear Jim,

Thanks for the detailed explanation of ZFS memory usage. Special thanks also to John D Groenveld for the initial suggestion of a lack-of-RAM problem. Since upping the RAM from 2 GB to 4 GB the machine has sailed through the last two Sunday mornings w/o problem. I was interested to subsequently discover the Solaris command 'echo ::memstat | mdb -k', which reveals just how much memory ZFS can use.

Best regards
Tom.
--
Tom Crane, Dept. Physics, Royal Holloway, University of London,
Egham Hill, Egham, Surrey, TW20 0EX, England.
Email: T.Crane@rhul dot ac dot uk
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
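[For reference, the command Tom mentions can be run as below. This is a sketch for Solaris-family systems only and needs root; the kstat name for the ARC size is the commonly documented one and is an assumption about this particular release:]

```shell
# Break down physical memory by consumer (Kernel, ZFS File Data,
# Anon, Page cache, Free) -- shows how much RAM the ARC is holding.
echo ::memstat | mdb -k

# The current ARC size (in bytes) is also exposed as a kstat:
kstat -p zfs:0:arcstats:size
```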
Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday
In Solaris, ZFS caches many things, so you should have more RAM. If you set up 16 GB of swap then, IMHO, RAM should be higher than 4 GB.

Regards

Sent from my iPad

On Jun 25, 2012, at 5:58, tpc...@mklab.ph.rhul.ac.uk wrote:
> [full quote of the previous message snipped]
Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday morning
> Offlist/OT - Sheer guess, straight out of my parts - maybe a cronjob to rebuild the locate db or something similar is hammering it once a week?

In the problem condition, there appears to be very little going on on the system, e.g.:

root@server5:/tmp# /usr/local/bin/top
last pid: 3828;  load avg: 4.29, 3.95, 3.84;  up 6+23:11:44  07:12:47
79 processes: 78 sleeping, 1 on cpu
CPU states: 73.0% idle, 0.0% user, 27.0% kernel, 0.0% iowait, 0.0% swap
Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap

   PID USERNAME LWP PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
   784 root      17  60  -20   88M  632K sleep 270:03 13.02% nfsd
  2694 root       1  59    0 1376K  672K sleep   1:45  0.69% touch
  3814 root       5  59    0   30M 3928K sleep   0:00  0.32% pkgserv
  3763 root       1  60    0 8400K 1256K sleep   0:02  0.20% zfs
  3826 root       1  52    0 3516K 2004K cpu/1   0:00  0.05% top
  3811 root       1  59    0 7668K 1732K sleep   0:00  0.02% pkginfo
  1323 noaccess  18  59    0  119M 1660K sleep   4:47  0.01% java
   174 root      50  59    0 8796K 1208K sleep   1:47  0.01% nscd
   332 root       1  49    0 2480K  456K sleep   0:06  0.01% dhcpagent
     8 root      15  59    0   14M  640K sleep   0:07  0.01% svc.startd
  1236 root       1  59    0   15M 5172K sleep   2:06  0.01% Xorg
  1281 root       1  59    0   11M  544K sleep   1:00  0.00% dtgreet
 26068 root       1 100  -20 2680K 1416K sleep   0:01  0.00% xntpd
   582 root       4  59    0 6884K 1232K sleep   1:22  0.00% inetd
   394 daemon     2  60  -20 2528K  508K sleep   5:54  0.00% lockd

Regards
Tom Crane

> On 6/13/12 3:47 AM, tpc...@mklab.ph.rhul.ac.uk wrote:
>> Dear All,
>> I have been advised to enquire here on zfs-discuss with the ZFS problem described below, following discussion on the Usenet NG comp.unix.solaris. The full thread should be available here:
>> https://groups.google.com/forum/#!topic/comp.unix.solaris/uEQzz1t-G1s
>> Many thanks
>> Tom Crane
>> --
>> Tom Crane, Dept. Physics, Royal Holloway, University of London,
>> Egham Hill, Egham, Surrey, TW20 0EX, England.
>> Email: t.cr...@rhul.ac.uk
>> Fax: +44 (0) 1784 472794
Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday morning
In message <201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk>, tpc...@mklab.ph.rhul.ac.uk writes:
> Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap

My WAG is that your zpool history is hanging due to lack of RAM.

John
groenv...@acm.org
Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday
> In message <201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk>, tpc...@mklab.ph.rhul.ac.uk writes:
>> Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
> My WAG is that your zpool history is hanging due to lack of RAM.

Interesting. In the problem state the system is usually quite responsive, i.e. not thrashing. Under Linux, which I'm more familiar with, 'used memory' = 'total memory' - 'free memory' refers to physical memory being used for data caching by the kernel (which is still available for processes to allocate as needed) together with memory allocated to processes, as opposed to only physical memory already allocated and therefore really 'used'. Does this mean something different under Solaris?

Cheers
Tom

> John
> groenv...@acm.org
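[For comparison, the Linux figure Tom describes can be recomputed directly from /proc/meminfo. A minimal sketch, Linux-only; the 'naive_used' label is ours, not a kernel term:]

```shell
# Naive "used = total - free": this counts the kernel page cache as
# used, even though the kernel will release it to processes on demand.
awk '/^MemTotal:/ {t=$2} /^MemFree:/ {f=$2}
     END {printf "total=%d kB free=%d kB naive_used=%d kB\n", t, f, t-f}' /proc/meminfo
```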
Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday
2012-06-14 19:11, tpc...@mklab.ph.rhul.ac.uk wrote:
>> In message <201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk>, tpc...@mklab.ph.rhul.ac.uk writes:
>>> Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
>> My WAG is that your zpool history is hanging due to lack of RAM.
> Interesting. In the problem state the system is usually quite responsive, i.e. not thrashing. Under Linux, which I'm more familiar with, 'used memory' = 'total memory' - 'free memory' refers to physical memory being used for data caching by the kernel (which is still available for processes to allocate as needed) together with memory allocated to processes, as opposed to only physical memory already allocated and therefore really 'used'. Does this mean something different under Solaris?

Well, it is roughly similar. In Solaris there is a general notion of "swap", or virtual memory (so as not to confuse adepts of other systems), which is a combination of RAM and on-disk swap spaces. Tools imported from other environments, like the top above, use the common notions of physical memory and on-disk swap; tools like vmstat under Solaris print the swap (= VM) and free (= RAM) columns.

Processes are allocated their memory requirements from the generic swap (= virtual memory), though some tricks are possible: some pages may be marked as not swappable to disk, while others may require a reservation of on-disk swap space even if all the data still lives in RAM. Kernel memory, for example that used by ZFS, does not go into on-disk swap (which can cause system freezes due to a shortage of RAM for operations, if some big ZFS task is not ready to just release that virtual memory). The ZFS ARC cache may release its memory on request for RAM from other processes, but this takes some time (and some programs check for lack of free memory, conclude they can't get more, and break without even trying), so a reserve of free memory is usually kept by the OS.
To have the free RAM go as low as the 32 MB low watermark, some strong hammering must be going on...

Now, back to the 2 GB RAM problem: ZFS has lots of metadata. Both reads and writes to the pool have to traverse a large tree of block pointers, with the leaves of the tree containing the pieces of your user data. Updates to user data cause rewriting of the whole path through the tree from the updated blocks to the root (metadata blocks must be read, modified, and re-checksummed at their parents - recurse to root). Metadata blocks are also stored on disk, but in several copies per block (double or triple the IOPS cost).

ZFS works fast when the hot paths through the needed portions of the block-pointer tree, or, even better, the whole tree, are cached in RAM. Otherwise, the least-used blocks are evicted to accommodate the recent newcomers. If you are low on RAM and useful blocks get evicted, this causes re-reads from disk to get them back (and evict some others), which may cause the lags you're seeing. The high proportion of kernel time also indicates that it is not some userspace computation hogging the CPUs, but likely waiting on hardware I/O.

Running 'iostat 1' or 'zpool iostat 1' can help you see some patterns (at least, whether there are many disk reads while the system is hung). Perhaps the pool is getting scrubbed, or the slocate database gets updated, or several machines begin dumping their backups onto the fileserver at once - and with so little cache the machine nearly dies, in terms of performance and responsiveness at least.

This lack of RAM is especially deadly upon writes into deduped pools, because DDT tables tend to be large (tens of GB for moderate-sized pools of tens of TB). Your box seems to have a 12 TB pool with just a little of it used, yet already the shortage of RAM is plain to see...

Hope this helps (understanding at least),
//Jim Klimov
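[To watch for the read-storm pattern Jim describes, something like the following can be left running during a hang. Solaris command syntax; pool and device names will differ per system:]

```shell
# Per-device throughput and service times, one-second samples:
iostat -xn 1

# Pool-level read/write ops and bandwidth. Sustained heavy reads while
# the box feels "hung" point at metadata re-reads from an evicted cache:
zpool iostat -v 1
```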