Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-25 Thread TPCzfs

 2012-06-14 19:11, tpc...@mklab.ph.rhul.ac.uk wrote:
 
  In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk,
  tpc...@mklab.ph.rhul.ac.uk writes:
  Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
  My WAG is that your zpool history is hanging due to lack of
  RAM.
 
  Interesting.  In the problem state the system is usually quite responsive,
  e.g. not memory thrashing.  Under Linux, which I'm more familiar with,
  'used memory' = 'total memory' - 'free memory' refers to physical memory
  used by the kernel for data caching (still available for processes to
  allocate as needed) together with memory allocated to processes, as
  opposed to only physical memory already allocated and therefore really
  'used'.  Does this mean something different under Solaris ?

 Well, it is roughly similar. In Solaris there is a general notion

[snipped]

Dear Jim,
Thanks for the detailed explanation of ZFS memory usage.  Special thanks
also to John D Groenveld for the initial suggestion of a lack-of-RAM problem.
Since upping the RAM from 2 GB to 4 GB the machine has sailed through the
last two Sunday mornings without problem.  I was interested to subsequently
discover the Solaris command 'echo ::memstat | mdb -k', which reveals just
how much memory ZFS can use.
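
For reference, a minimal way to take that measurement (run as root; depending
on the Solaris 10 update, ZFS file data is reported either inside the 'Kernel'
row or as a separate 'ZFS File Data' row):

  # Summarise physical-memory usage by consumer, from the live kernel.
  echo ::memstat | mdb -k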

Best regards
Tom.

--
Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
Egham, Surrey, TW20 0EX, England.
Email:  T.Crane@rhul dot ac dot uk


Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-25 Thread Hung-Sheng Tsao (LaoTsao) Ph.D
In Solaris, ZFS caches many things, so you should have more RAM.
If you set up 18 GB of swap, IMHO, RAM should be higher than 4 GB.
regards

Sent from my iPad

On Jun 25, 2012, at 5:58, tpc...@mklab.ph.rhul.ac.uk wrote:

 
 2012-06-14 19:11, tpc...@mklab.ph.rhul.ac.uk wrote:
 
 In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk,
 tpc...@mklab.ph.rhul.ac.uk writes:
 Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
 My WAG is that your zpool history is hanging due to lack of
 RAM.
 
 Interesting.  In the problem state the system is usually quite responsive,
 e.g. not memory thrashing.  Under Linux, which I'm more familiar with,
 'used memory' = 'total memory' - 'free memory' refers to physical memory
 used by the kernel for data caching (still available for processes to
 allocate as needed) together with memory allocated to processes, as
 opposed to only physical memory already allocated and therefore really
 'used'.  Does this mean something different under Solaris ?
 
 Well, it is roughly similar. In Solaris there is a general notion
 
 [snipped]
 
 Dear Jim,
 Thanks for the detailed explanation of ZFS memory usage.  Special thanks
 also to John D Groenveld for the initial suggestion of a lack-of-RAM problem.
 Since upping the RAM from 2 GB to 4 GB the machine has sailed through the
 last two Sunday mornings without problem.  I was interested to subsequently
 discover the Solaris command 'echo ::memstat | mdb -k', which reveals just
 how much memory ZFS can use.
 
 Best regards
 Tom.
 
 --
 Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
 Egham, Surrey, TW20 0EX, England.
 Email:  T.Crane@rhul dot ac dot uk


Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday morning

2012-06-14 Thread TPCzfs
 
 Offlist/OT - Sheer guess, straight out of my parts - maybe a cronjob to 
 rebuild the locate db or something similar is hammering it once a week?

In the problem condition, there appears to be very little going on on the
system, e.g.:

root@server5:/tmp# /usr/local/bin/top
last pid:  3828;  load avg:  4.29,  3.95,  3.84;  up 6+23:11:44  07:12:47
79 processes: 78 sleeping, 1 on cpu
CPU states: 73.0% idle,  0.0% user, 27.0% kernel,  0.0% iowait,  0.0% swap
Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap

   PID USERNAME  LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
   784 root       17  60  -20   88M  632K sleep  270:03 13.02% nfsd
  2694 root        1  59    0 1376K  672K sleep    1:45  0.69% touch
  3814 root        5  59    0   30M 3928K sleep    0:00  0.32% pkgserv
  3763 root        1  60    0 8400K 1256K sleep    0:02  0.20% zfs
  3826 root        1  52    0 3516K 2004K cpu/1    0:00  0.05% top
  3811 root        1  59    0 7668K 1732K sleep    0:00  0.02% pkginfo
  1323 noaccess   18  59    0  119M 1660K sleep    4:47  0.01% java
   174 root       50  59    0 8796K 1208K sleep    1:47  0.01% nscd
   332 root        1  49    0 2480K  456K sleep    0:06  0.01% dhcpagent
     8 root       15  59    0   14M  640K sleep    0:07  0.01% svc.startd
  1236 root        1  59    0   15M 5172K sleep    2:06  0.01% Xorg
  1281 root        1  59    0   11M  544K sleep    1:00  0.00% dtgreet
 26068 root        1 100  -20 2680K 1416K sleep    0:01  0.00% xntpd
   582 root        4  59    0 6884K 1232K sleep    1:22  0.00% inetd
   394 daemon      2  60  -20 2528K  508K sleep    5:54  0.00% lockd
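
One way to test the weekly-cronjob theory mentioned above (a sketch only;
paths assume a stock Solaris 10 install and root access) is to look for jobs
whose day-of-week field is 0, i.e. Sunday, in the system crontabs:

  # Print every crontab entry scheduled for Sunday (5th field == 0).
  # Per-user crontabs live under /var/spool/cron/crontabs on Solaris 10.
  awk '$5 == "0" { print FILENAME ": " $0 }' /var/spool/cron/crontabs/*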

Regards
Tom Crane

 
 On 6/13/12 3:47 AM, tpc...@mklab.ph.rhul.ac.uk wrote:
  Dear All,
  I have been advised to enquire here on zfs-discuss with the
  ZFS problem described below, following discussion on Usenet NG
  comp.unix.solaris.  The full thread should be available here
  https://groups.google.com/forum/#!topic/comp.unix.solaris/uEQzz1t-G1s
 
  Many thanks
  Tom Crane
 


-- 
Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
Egham, Surrey, TW20 0EX, England. 
Email:  t.cr...@rhul.ac.uk
Fax:+44 (0) 1784 472794


Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday morning

2012-06-14 Thread John D Groenveld
In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk,
tpc...@mklab.ph.rhul.ac.uk writes:
Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap


My WAG is that your zpool history is hanging due to lack of
RAM.

John
groenv...@acm.org



Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-14 Thread TPCzfs
 
 In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk,
 tpc...@mklab.ph.rhul.ac.uk writes:
 Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
 
 
 My WAG is that your zpool history is hanging due to lack of
 RAM.

Interesting.  In the problem state the system is usually quite responsive,
e.g. not memory thrashing.  Under Linux, which I'm more familiar with,
'used memory' = 'total memory' - 'free memory' refers to physical memory
used by the kernel for data caching (still available for processes to
allocate as needed) together with memory allocated to processes, as opposed
to only physical memory already allocated and therefore really 'used'.
Does this mean something different under Solaris ?

Cheers
Tom

 
 John
 groenv...@acm.org


Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-14 Thread Jim Klimov

2012-06-14 19:11, tpc...@mklab.ph.rhul.ac.uk wrote:


In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk,
tpc...@mklab.ph.rhul.ac.uk writes:

Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap

My WAG is that your zpool history is hanging due to lack of
RAM.


Interesting.  In the problem state the system is usually quite responsive,
e.g. not memory thrashing.  Under Linux, which I'm more familiar with,
'used memory' = 'total memory' - 'free memory' refers to physical memory
used by the kernel for data caching (still available for processes to
allocate as needed) together with memory allocated to processes, as opposed
to only physical memory already allocated and therefore really 'used'.
Does this mean something different under Solaris ?


Well, it is roughly similar. Solaris has a general notion of 'swap' (or, so
as not to confuse adepts of other systems, virtual memory), which is the
combination of RAM and on-disk swap space. Tools imported from other
environments, like top above, use the common notions of physical memory and
on-disk swap; tools like vmstat under Solaris print the 'swap' (= virtual
memory) and 'free' (= free RAM) columns...
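
For instance (a minimal illustration; both columns are reported in kilobytes):

  # Print VM statistics every 5 seconds: 'swap' is available virtual
  # memory (RAM + on-disk swap), 'free' is free physical memory.
  vmstat 5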

Processes are allocated their memory requirements from this generic
swap = virtual memory, though some tricks are possible: some pages may be
marked as not swappable to disk, others may require a reservation of on-disk
swap space even if all the data still lives in RAM. Kernel memory, for
example the memory used by ZFS, does not go into on-disk swap (which can
cause system freezes due to a shortage of RAM for operations if some big
ZFS task is not ready to just release that virtual memory).

The ZFS ARC cache may release its memory when other processes request RAM,
but this takes some time (and some programs check for lack of free memory,
decide they cannot get more, and break without even trying), so the OS
usually keeps a reserve of free memory. For free RAM to go as low as the
32 MB low watermark, some heavy hammering must be going on...
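
To watch how much of that RAM the ARC itself is holding, one option (assuming
the standard zfs:0:arcstats kstat is present on this Solaris 10 build):

  # Current ARC size and its configured ceiling, in bytes.
  kstat -p zfs:0:arcstats:size zfs:0:arcstats:c_max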

Now, back to the 2 GB RAM problem: ZFS has lots of metadata. Both reads and
writes to the pool have to traverse a large tree of block pointers, with the
leaves of the tree containing pieces of your user data. Updates to user data
cause rewriting of the whole path through the tree from the updated blocks
up to the root (metadata blocks must be read, modified and re-checksummed at
their parents, recursing up to the root).

Metadata blocks are also stored on disk, but in several copies per block
(doubling or tripling the IOPS cost).
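
If you are curious how much of the pool is metadata versus user data, zdb can
produce a per-blocktype breakdown (a read-only but I/O-heavy traversal, so
best run during a quiet period; the exact output format varies across builds):

  # Walk the pool and print space-usage statistics per block type.
  zdb -bb pptank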

ZFS works fast when the hot paths through the needed portions of the
block-pointer tree, or, even better, the whole tree, are cached in RAM.
Otherwise, the least-used blocks are evicted to accommodate the recent
newcomers. If you are low on RAM and useful blocks get evicted, this causes
re-reads from disk to get them back (and evict some others), which may cause
the lags you're seeing. The high share of kernel time also indicates that it
is not some userspace computation hogging the CPUs, but more likely waiting
on hardware I/O.

Running 'iostat 1' or 'zpool iostat 1' can help you see some patterns (at
least, whether there are many disk reads while the system is hung). Perhaps
the pool is getting scrubbed, or the slocate database gets updated, or
several machines begin dumping their backups onto the fileserver at once,
and with so little cache the machine nearly dies, in terms of performance
and responsiveness at least.
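
A minimal sketch of that kind of monitoring (the pool name is taken from the
zpool output earlier in the thread; -xn gives per-device extended statistics):

  # Per-device extended statistics at one-second intervals.
  iostat -xn 1
  # Pool-wide and per-vdev throughput for the affected pool.
  zpool iostat -v pptank 1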

This lack of RAM is especially deadly for writes into deduplicated pools,
because DDTs (dedup tables) tend to be large (tens of GB for moderately
sized pools of tens of TB). Your box seems to have a 12 TB pool with only a
little of it used, yet the shortage of RAM is already apparent...
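
If there is any doubt about whether dedup is in play here, a quick check
(assuming the installed build is recent enough to know about dedup at all;
a ratio of 1.00x means no deduplicated data):

  # dedupratio stays at 1.00x unless dedup has ever been enabled and used.
  zpool get dedupratio pptank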

Hope this helps (understanding at least),
//Jim Klimov


[zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday morning problem

2012-06-13 Thread TPCzfs
Dear All,
I have been advised to enquire here on zfs-discuss with the
ZFS problem described below, following discussion on Usenet NG 
comp.unix.solaris.  The full thread should be available here 
https://groups.google.com/forum/#!topic/comp.unix.solaris/uEQzz1t-G1s

Many thanks
Tom Crane



-- forwarded message

cindy.swearin...@oracle.com wrote:
: On Tuesday, May 29, 2012 5:39:11 AM UTC-6, (unknown) wrote:
:  Dear All,
: Can anyone give any tips on diagnosing the following recurring problem?
:  
:  I have a Solaris box (server5, SunOS server5 5.10 Generic_147441-15
:  i86pc i386 i86pc) whose NFS-exported ZFS filesystems fail every so
:  often, always in the early hours of Sunday morning. I am barely
:  familiar with Solaris, but here is what I have managed to discern when
:  the problem occurs;
:  
:  Jobs on other machines which access server5's shares (via automounter)
:  hang and attempts to manually remote-mount shares just timeout.
:  
:  Remotely, showmount -e server5 shows all the exported FS are available.
:  
:  On server5, the following services are running;
:  
:  root@server5:/var/adm# svcs | grep nfs 
:  online May_25   svc:/network/nfs/status:default
:  online May_25   svc:/network/nfs/nlockmgr:default
:  online May_25   svc:/network/nfs/cbd:default
:  online May_25   svc:/network/nfs/mapid:default
:  online May_25   svc:/network/nfs/rquota:default
:  online May_25   svc:/network/nfs/client:default
:  online May_25   svc:/network/nfs/server:default
:  
:  On server5, I can list and read files on the affected FSs w/o problem
:  but any attempt to write to the FS (eg. copy a file to or rm a file
:  on the FS) just hangs the cp/rm process.
:  
:  On server5, the zfs command 'zfs get sharenfs pptank/local_linux'
:  displays the expected list of hosts/IPs with remote ro and rw access.
:  
:  Here is the O/P from some other hopefully relevant commands;
:  
:  root@server5:/# zpool status
:pool: pptank
:   state: ONLINE
:  status: The pool is formatted using an older on-disk format.  The pool can
:  still be used, but some features are unavailable.
:  action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
:  pool will no longer be accessible on older software versions.
:   scan: none requested
:  config:
:  
:  NAMESTATE READ WRITE CKSUM
:  pptank  ONLINE   0 0 0
:raidz1-0  ONLINE   0 0 0
:  c3t0d0  ONLINE   0 0 0
:  c3t1d0  ONLINE   0 0 0
:  c3t2d0  ONLINE   0 0 0
:  c3t3d0  ONLINE   0 0 0
:  c3t4d0  ONLINE   0 0 0
:  c3t5d0  ONLINE   0 0 0
:  c3t6d0  ONLINE   0 0 0
:  
:  errors: No known data errors
:  
:  root@server5:/# zpool list
:  NAME SIZE  ALLOC   FREECAP  HEALTH  ALTROOT
:  pptank  12.6T   384G  12.3T 2%  ONLINE  -
:  
:  root@server5:/# zpool history
:  History for 'pptank':
:  just hangs here
:  
:  root@server5:/# zpool iostat 5
:                 capacity     operations    bandwidth
:  pool        alloc   free   read  write   read  write
:  ----------  -----  -----  -----  -----  -----  -----
:  pptank       384G  12.3T     92    115  3.08M  1.22M
:  pptank       384G  12.3T  1.11K    629  35.5M  3.03M
:  pptank       384G  12.3T    886    889  27.1M  3.68M
:  pptank       384G  12.3T    837    677  24.9M  2.82M
:  pptank       384G  12.3T  1.19K    757  37.4M  3.69M
:  pptank       384G  12.3T  1.02K    759  29.6M  3.90M
:  pptank       384G  12.3T    952    707  32.5M  3.09M
:  pptank       384G  12.3T  1.02K    831  34.5M  3.72M
:  pptank       384G  12.3T    707    503  23.5M  1.98M
:  pptank       384G  12.3T    626    707  20.8M  3.58M
:  pptank       384G  12.3T    816    838  26.1M  4.26M
:  pptank       384G  12.3T    942    800  30.1M  3.48M
:  pptank       384G  12.3T    677    675  21.7M  2.91M
:  pptank       384G  12.3T    590    725  19.2M  3.06M
:  
:  
:  top shows the following runnable processes.  Nothing excessive here AFAICT?
:  
:  last pid: 25282;  load avg:  1.98,  1.95,  1.86;  up 1+09:02:05  07:46:29
:  72 processes: 67 sleeping, 1 running, 1 stopped, 3 on cpu
:  CPU states: 81.5% idle,  0.1% user, 18.3% kernel,  0.0% iowait,  0.0% swap
:  Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
:  
:     PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
:     748 root      18  60  -20  103M 9752K cpu/1  78:44  6.62% nfsd
:   24854 root       1  54    0 1480K  792K cpu/1   0:42  0.69% cp
:   25281 root       1  59    0 3584K 2152K cpu/0   0:00  0.02% top
:  
:  The above cp job, which as mentioned above is attempting to copy a file
:  to an affected FS, is, I've noticed, apparently not completely hung.
:  
:  The only thing that appears specific to Sunday morning is a