Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-25 Thread TPCzfs

 2012-06-14 19:11, tpc...@mklab.ph.rhul.ac.uk wrote:
 
  In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk, tpc...@mklab.ph.rhul.ac.uk writes:
  Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
  My WAG is that your zpool history is hanging due to lack of
  RAM.
 
  Interesting.  In the problem state the system is usually quite responsive,
  e.g. no memory thrashing.  Under Linux, which I'm more familiar with,
  'used memory' = 'total memory' - 'free memory' covers physical memory the
  kernel is using for data caching (still available for processes to allocate
  as needed) together with memory already allocated to processes, as opposed
  to only physical memory already allocated and therefore really 'used'.
  Does this mean something different under Solaris ?

 Well, it is roughly similar. In Solaris there is a general notion

[snipped]

Dear Jim,
Thanks for the detailed explanation of ZFS memory usage.  Special
thanks also to John D Groenveld for the initial suggestion of a lack-of-RAM
problem.  Since upping the RAM from 2GB to 4GB the machine has sailed through
the last two Sunday mornings w/o problem.  I was interested subsequently to
discover the Solaris command 'echo ::memstat | mdb -k', which reveals just
how much memory ZFS can use.
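
For anyone searching the archives later, the invocation and output look
like this (the figures below are illustrative, not a capture from this
box, and the exact rows vary by Solaris release):

# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     120000               468   23%
ZFS File Data              300000              1171   57%
Anon                        60000               234   11%
Exec and libs                5000                19    1%
Page cache                  10000                39    2%
Free (cachelist)            12000                46    2%
Free (freelist)             17288                67    3%
Total                      524288              2048
Physical                   524288              2048

On older releases without the 'ZFS File Data' line, the ARC is accounted
under 'Kernel' instead.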

Best regards
Tom.

--
Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
Egham, Surrey, TW20 0EX, England.
Email:  T.Crane@rhul dot ac dot uk


Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-25 Thread Hung-Sheng Tsao (LaoTsao) Ph.D
In Solaris, ZFS caches many things, so you should have more RAM.
If you set up 16GB of swap, IMHO, RAM should be higher than 4GB.
Regards

Sent from my iPad

On Jun 25, 2012, at 5:58, tpc...@mklab.ph.rhul.ac.uk wrote:

 
 Thanks for the detailed explanation of ZFS memory usage.  Since upping
 the RAM from 2GB to 4GB the machine has sailed through the last two
 Sunday mornings w/o problem.
 
 [snipped]


Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday morning

2012-06-14 Thread TPCzfs
 
 Offlist/OT - Sheer guess, straight out of my parts - maybe a cronjob to 
 rebuild the locate db or something similar is hammering it once a week?

In the problem condition there appears to be very little activity on the
system, e.g.:

root@server5:/tmp# /usr/local/bin/top
last pid:  3828;  load avg:  4.29,  3.95,  3.84;  up 6+23:11:44    07:12:47
79 processes: 78 sleeping, 1 on cpu
CPU states: 73.0% idle,  0.0% user, 27.0% kernel,  0.0% iowait,  0.0% swap
Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap

   PID USERNAME  LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
   784 root       17  60  -20   88M  632K sleep  270:03 13.02% nfsd
  2694 root        1  59    0 1376K  672K sleep    1:45  0.69% touch
  3814 root        5  59    0   30M 3928K sleep    0:00  0.32% pkgserv
  3763 root        1  60    0 8400K 1256K sleep    0:02  0.20% zfs
  3826 root        1  52    0 3516K 2004K cpu/1    0:00  0.05% top
  3811 root        1  59    0 7668K 1732K sleep    0:00  0.02% pkginfo
  1323 noaccess   18  59    0  119M 1660K sleep    4:47  0.01% java
   174 root       50  59    0 8796K 1208K sleep    1:47  0.01% nscd
   332 root        1  49    0 2480K  456K sleep    0:06  0.01% dhcpagent
     8 root       15  59    0   14M  640K sleep    0:07  0.01% svc.startd
  1236 root        1  59    0   15M 5172K sleep    2:06  0.01% Xorg
  1281 root        1  59    0   11M  544K sleep    1:00  0.00% dtgreet
 26068 root        1 100  -20 2680K 1416K sleep    0:01  0.00% xntpd
   582 root        4  59    0 6884K 1232K sleep    1:22  0.00% inetd
   394 daemon      2  60  -20 2528K  508K sleep    5:54  0.00% lockd

Regards
Tom Crane

 
 On 6/13/12 3:47 AM, tpc...@mklab.ph.rhul.ac.uk wrote:
  Dear All,
  I have been advised to enquire here on zfs-discuss with the
  ZFS problem described below, following discussion on Usenet NG
  comp.unix.solaris.  The full thread should be available here
  https://groups.google.com/forum/#!topic/comp.unix.solaris/uEQzz1t-G1s
 
  Many thanks
  Tom Crane
 


-- 
Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
Egham, Surrey, TW20 0EX, England. 
Email:  t.cr...@rhul.ac.uk
Fax:+44 (0) 1784 472794


Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday morning

2012-06-14 Thread John D Groenveld
In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk, tpc...@mklab.ph.rhul.ac.uk writes:
Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap


My WAG is that your zpool history is hanging due to lack of
RAM.

John
groenv...@acm.org



Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-14 Thread TPCzfs
 
 In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk, tpc...@mklab.ph.rhul.ac.uk writes:
 Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
 
 
 My WAG is that your zpool history is hanging due to lack of
 RAM.

Interesting.  In the problem state the system is usually quite responsive,
e.g. no memory thrashing.  Under Linux, which I'm more familiar with,
'used memory' = 'total memory' - 'free memory' covers physical memory the
kernel is using for data caching (still available for processes to allocate
as needed) together with memory already allocated to processes, as opposed
to only physical memory already allocated and therefore really 'used'.
Does this mean something different under Solaris ?

Cheers
Tom

 
 John
 groenv...@acm.org


Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-14 Thread Jim Klimov

2012-06-14 19:11, tpc...@mklab.ph.rhul.ac.uk wrote:


In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk, tpc...@mklab.ph.rhul.ac.uk writes:

Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap

My WAG is that your zpool history is hanging due to lack of
RAM.


Interesting.  In the problem state the system is usually quite responsive,
e.g. no memory thrashing.  Under Linux, which I'm more familiar with,
'used memory' = 'total memory' - 'free memory' covers physical memory the
kernel is using for data caching (still available for processes to allocate
as needed) together with memory already allocated to processes, as opposed
to only physical memory already allocated and therefore really 'used'.
Does this mean something different under Solaris ?


Well, it is roughly similar. In Solaris there is a general notion
of 'swap', or virtual memory (a frequent source of confusion for
adepts of other systems), which is the combination of RAM and
on-disk swap space. Tools imported from other environments, like
top above, use the common notions of physical memory and on-disk
swap; native tools like vmstat would print the 'swap' (= virtual
memory) and 'free' (= RAM) columns...

Processes are allocated their memory requirements from this generic
swap = virtual memory, though some tricks are possible - some pages
may be marked as not swappable to disk, others may require a
reservation of on-disk swap space even if all the data still
lives in RAM. Kernel memory, such as that used by ZFS, does
not go into on-disk swap (which can cause system freezes due to
a shortage of RAM for operations if some big ZFS task is not ready
to just release that virtual memory).
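
If adding RAM is not an option, a common workaround is to cap the ARC
via /etc/system and reboot. A sketch only - the value below is purely
illustrative; pick one that leaves room for your NFS workload:

* /etc/system: cap the ZFS ARC at 512MB (example value, not a recommendation)
set zfs:zfs_arc_max=0x20000000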

The ZFS ARC cache may release its memory on request for RAM
from other processes, but this takes some time (and some programs
check for a lack of free memory, conclude they can't get more,
and break without even trying), so a reserve of free memory
is usually kept by the OS. To have the free RAM go as low as
the 32MB low watermark, some strong hammering must be going on...
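
You can watch the current ARC size with kstat - the module:instance:name
path below is the usual one on Solaris; the byte figure shown is
illustrative:

# kstat -p zfs:0:arcstats:size
zfs:0:arcstats:size     1595932672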

Now, back to the 2GB RAM problem: ZFS has lots of metadata.
Both reads and writes to the pool have to traverse a large tree
of block pointers, with the leaves of the tree containing pieces of
your user-data. Updates to user-data cause rewriting of the
whole path through the tree from the updated blocks to the root
(metadata blocks must be read, modified, and re-checksummed at
their parents - recurse to root).

Metadata blocks are also stored on disk, but in several
copies per block (doubling or tripling the IOPS cost).

ZFS works fast when the hot paths through the needed portions
of the block-pointer tree, or, even better, the whole tree, are
cached in RAM. Otherwise, the least-used blocks are evicted
to accommodate the recent newcomers. If you are low on RAM and
useful blocks get evicted, this causes re-reads from disk to
get them back (and evict some others), which may cause the lags
you're seeing. The high share of kernel time also indicates that
it is not some userspace computation hogging the CPUs, but
likely the kernel waiting for hardware I/O.
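
One per-dataset knob that can help a metadata-bound box is the
'primarycache' property, which can be set to 'metadata' so the ARC
keeps block pointers rather than file contents - the dataset name
below is illustrative, and whether this wins depends on the workload,
since it trades data caching away:

# zfs set primarycache=metadata tank/export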

Running 'iostat 1' or 'zpool iostat 1' can help you see some
patterns (at least, whether there are many disk reads when the
system is hung). Perhaps the pool is getting scrubbed, or
the slocate database gets updated, or several machines begin
dumping their backups onto the fileserver at once - and
with so little cache the machine nearly dies, in terms of
performance and responsiveness at least.
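
For example (pool name illustrative):

# zpool iostat tank 1 10      (ten one-second samples of pool I/O)
# zpool status tank           (shows whether a scrub is in progress)

And since the hangs are on Sunday mornings, it is worth listing the
crontabs ('ls -l /var/spool/cron/crontabs') to see what runs weekly.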

This lack of RAM is especially deadly upon writes into
deduped pools, because DDT tables tend to be large (tens
of GB for moderately sized pools of tens of TB).
Your box seems to have a 12TB pool with just a little bit
used, yet already the shortage of RAM is plainly visible...
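
If dedup were enabled on the pool, 'zdb -DD <poolname>' (e.g.
'zdb -DD tank' - pool name illustrative) would print DDT statistics
from which the in-core table size can be estimated; with no dedup
in use this particular concern does not apply.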

Hope this helps (understanding at least),
//Jim Klimov