Bug#519586: Huge Slab Unreclaimable and continually growing

2013-02-16 Thread Josip Rodin
On Sat, Feb 16, 2013 at 03:13:06AM +, Ben Hutchings wrote:
 On Fri, 2013-02-15 at 08:56 +0100, Josip Rodin wrote:
   I appear to be experiencing a serious problem with a 768 MB RAM Xen domU
   machine running an NFS client - every now and then (for months now), often
   in the middle of the night, it enters some kind of a broken state where a
   few semi-random processes (mainly apache2's and vsftpd's which are told to
   serve files from the NFS mount)
 [...]
  I caught it earlier just now, at:
  
  [950084.590733] active_anon:2805 inactive_anon:11835 isolated_anon:0
  [950084.590735]  active_file:76 inactive_file:516 isolated_file:32
  [950084.590737]  unevictable:783 dirty:1 writeback:0 unstable:0
  [950084.590739]  free:26251 slab_reclaimable:15733 slab_unreclaimable:128868
  [950084.590741]  mapped:938 shmem:75 pagetables:651 bounce:0
  
  And snuck in a few slabtops (even some -o invocations were getting killed,
  along with my shell and pretty much everything else):
 [...]
   65390  65390 100%  2.06K  13338   15 426816K net_namespace
 [...]
 
 Looks like CVE-2011-2189, for which there was a fix/workaround in:
 
 vsftpd (2.3.2-3+squeeze2) stable-security; urgency=high
 
* Non-maintainer upload by the Security Team.
* Disable network isolation due to a problem with cleaning up network
  namespaces fast enough in kernels < 2.6.35 (CVE-2011-2189).
  Thanks Ben Hutchings for the patch!
* Fix possible DoS via glob expressions in STAT commands by
  limiting the matching loop (CVE-2011-0762; Closes: #622741).
 
  -- Nico Golde n...@debian.org  Wed, 07 Sep 2011 20:39:59 +
 
 Do you have an old version of vsftpd, or perhaps an upstream version
 which doesn't include the workaround?

No, 2.3.2-3+squeeze2 is there, has been since 2012-03-22.

 Anyway, I'm closing the bug report; please don't hijack closed bugs.

Eh? It was not closed as fixed; it was closed en masse for a procedural
reason that could easily be wrong, and I don't believe I was hijacking it;
you just confirmed above that this is a kernel problem, so how could this
possibly be improper?!

-- 
 2. That which causes joy or happiness.


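(A minimal monitoring sketch, not from the thread itself: since the leak shows up as `net_namespace` slab growth, the relevant line of `/proc/slabinfo` can be watched directly. The script name and output format here are my own; reading `/proc/slabinfo` typically needs root.)

```shell
# Print active/total object counts for the net_namespace cache.
# Takes an optional file argument so it can be pointed at a saved copy
# of /proc/slabinfo; defaults to the live file (root required on most
# systems). In /proc/slabinfo, $2 is active_objs and $3 is num_objs.
slabinfo="${1:-/proc/slabinfo}"
if [ -r "$slabinfo" ]; then
    awk '$1 == "net_namespace" { printf "%s active=%s total=%s\n", $1, $2, $3 }' "$slabinfo"
fi
```

Run periodically (e.g. from cron or `watch`), a steadily climbing total with ~100% active is the signature seen in the slabtops above.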



Bug#519586: Huge Slab Unreclaimable and continually growing

2013-02-16 Thread Ben Hutchings
On Sat, 2013-02-16 at 22:37 +0100, Josip Rodin wrote:
 On Sat, Feb 16, 2013 at 03:13:06AM +, Ben Hutchings wrote:
  On Fri, 2013-02-15 at 08:56 +0100, Josip Rodin wrote:
I appear to be experiencing a serious problem with a 768 MB RAM Xen domU
machine running an NFS client - every now and then (for months now), often
in the middle of the night, it enters some kind of a broken state where a
few semi-random processes (mainly apache2's and vsftpd's which are told to
serve files from the NFS mount)
  [...]
   I caught it earlier just now, at:
   
   [950084.590733] active_anon:2805 inactive_anon:11835 isolated_anon:0
   [950084.590735]  active_file:76 inactive_file:516 isolated_file:32
   [950084.590737]  unevictable:783 dirty:1 writeback:0 unstable:0
   [950084.590739]  free:26251 slab_reclaimable:15733 slab_unreclaimable:128868
   [950084.590741]  mapped:938 shmem:75 pagetables:651 bounce:0
   
   And snuck in a few slabtops (even some -o invocations were getting killed,
   along with my shell and pretty much everything else):
  [...]
65390  65390 100%  2.06K  13338   15 426816K net_namespace
  [...]
  
  Looks like CVE-2011-2189, for which there was a fix/workaround in:
  
  vsftpd (2.3.2-3+squeeze2) stable-security; urgency=high
  
 * Non-maintainer upload by the Security Team.
 * Disable network isolation due to a problem with cleaning up network
   namespaces fast enough in kernels < 2.6.35 (CVE-2011-2189).
   Thanks Ben Hutchings for the patch!
 * Fix possible DoS via glob expressions in STAT commands by
   limiting the matching loop (CVE-2011-0762; Closes: #622741).
  
   -- Nico Golde n...@debian.org  Wed, 07 Sep 2011 20:39:59 +
  
  Do you have an old version of vsftpd, or perhaps an upstream version
  which doesn't include the workaround?
 
 No, 2.3.2-3+squeeze2 is there, has been since 2012-03-22.

  Anyway, I'm closing the bug report; please don't hijack closed bugs.
 
 Eh? It was not closed as fixed; it was closed en masse for a procedural
 reason that could easily be wrong, and I don't believe I was hijacking it;
 you just confirmed above that this is a kernel problem, so how could this
 possibly be improper?!

It's not the same bug.  Open a new bug report.

Ben.

-- 
Ben Hutchings
Computers are not intelligent.  They only think they are.




Bug#519586: Huge Slab Unreclaimable and continually growing

2013-02-15 Thread Ben Hutchings
On Fri, 2013-02-15 at 08:56 +0100, Josip Rodin wrote:
  I appear to be experiencing a serious problem with a 768 MB RAM Xen domU
  machine running an NFS client - every now and then (for months now), often
  in the middle of the night, it enters some kind of a broken state where a
  few semi-random processes (mainly apache2's and vsftpd's which are told to
  serve files from the NFS mount)
[...]
 I caught it earlier just now, at:
 
 [950084.590733] active_anon:2805 inactive_anon:11835 isolated_anon:0
 [950084.590735]  active_file:76 inactive_file:516 isolated_file:32
 [950084.590737]  unevictable:783 dirty:1 writeback:0 unstable:0
 [950084.590739]  free:26251 slab_reclaimable:15733 slab_unreclaimable:128868
 [950084.590741]  mapped:938 shmem:75 pagetables:651 bounce:0
 
 And snuck in a few slabtops (even some -o invocations were getting killed,
 along with my shell and pretty much everything else):
[...]
  65390  65390 100%  2.06K  13338   15 426816K net_namespace
[...]

Looks like CVE-2011-2189, for which there was a fix/workaround in:

vsftpd (2.3.2-3+squeeze2) stable-security; urgency=high

   * Non-maintainer upload by the Security Team.
   * Disable network isolation due to a problem with cleaning up network
     namespaces fast enough in kernels < 2.6.35 (CVE-2011-2189).
     Thanks Ben Hutchings for the patch!
   * Fix possible DoS via glob expressions in STAT commands by
     limiting the matching loop (CVE-2011-0762; Closes: #622741).

 -- Nico Golde n...@debian.org  Wed, 07 Sep 2011 20:39:59 +

Do you have an old version of vsftpd, or perhaps an upstream version
which doesn't include the workaround?

Anyway, I'm closing the bug report; please don't hijack closed bugs.

Ben.

-- 
Ben Hutchings
Computers are not intelligent.  They only think they are.




Bug#519586: Huge Slab Unreclaimable and continually growing

2013-02-14 Thread Josip Rodin
On Tue, Jan 22, 2013 at 10:59:17AM +0100, Josip Rodin wrote:
 I appear to be experiencing a serious problem with a 768 MB RAM Xen domU
 machine running an NFS client - every now and then (for months now), often
 in the middle of the night, it enters some kind of a broken state where a
 few semi-random processes (mainly apache2's and vsftpd's which are told to
 serve files from the NFS mount) start battling it out for the memory, and
 everything including sshd starts invoking the OOM killer, over and over
 again. Nothing seems to halt the downward spiral; manual invocation of the
 OOM killer does nothing of any use. Terminating all processes is the only
 thing that makes it go quiet, but then that's effectively the same as a
 reboot.
 
 This is the SysRq+M output on the machine once it's been in the broken state
 for a while:
[...]
 active_anon:394 inactive_anon:3197 isolated_anon:0
  active_file:25 inactive_file:176 isolated_file:32
  unevictable:2659 dirty:1 writeback:0 unstable:0
  free:21456 slab_reclaimable:16177 slab_unreclaimable:143165
  mapped:677 shmem:76 pagetables:455 bounce:0
[...]
 The thing I noticed was the slab_unreclaimable explosion, by a factor
 of 122. That... doesn't sound like something that should be happening.
 
 Googling for slab_unreclaimable turned up this old bug report about
 slab_unreclaimable domU problems that was mass-closed with the switch to
 the new paravirt-ops Xen release. Granted, our use case is not Samba, as
 with the original reporter, but the pattern of a file server was close
 enough to make me uncomfortable :|

I caught it earlier just now, at:

[950084.590733] active_anon:2805 inactive_anon:11835 isolated_anon:0
[950084.590735]  active_file:76 inactive_file:516 isolated_file:32
[950084.590737]  unevictable:783 dirty:1 writeback:0 unstable:0
[950084.590739]  free:26251 slab_reclaimable:15733 slab_unreclaimable:128868
[950084.590741]  mapped:938 shmem:75 pagetables:651 bounce:0
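For scale (my arithmetic, not part of the original report): the counters in the dump above are in 4 KiB pages, so slab_unreclaimable alone accounts for roughly 500 MiB of the 768 MB domU. A minimal sketch of the conversion:

```python
# Convert show_mem / SysRq-M page counts to MiB (assumes 4 KiB pages).
PAGE_KIB = 4

def pages_to_mib(pages: int) -> float:
    return pages * PAGE_KIB / 1024

# Figures from the dump above.
print(f"slab_unreclaimable: {pages_to_mib(128868):.0f} MiB")  # ~503 MiB
print(f"slab_reclaimable:   {pages_to_mib(15733):.0f} MiB")   # ~61 MiB
```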

And snuck in a few slabtops (even some -o invocations were getting killed,
along with my shell and pretty much everything else):

 Active / Total Objects (% used): 555753 / 587128 (94.7%)
 Active / Total Slabs (% used)  : 49430 / 49430 (100.0%)
 Active / Total Caches (% used) : 65 / 76 (85.5%)
 Active / Total Size (% used)   : 546613.78K / 553025.01K (98.8%)
 Minimum / Average / Maximum Object : 0.01K / 0.94K / 8.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
 90993  66836  73%  0.19K   4333   21  17332K dentry
 75840  73664  97%  0.12K   2370   32   9480K kmalloc-128
 68096  68092  99%  0.01K    133  512    532K kmalloc-8
 65888  65655  99%  0.25K   4118   16  16472K kmalloc-256
 65820  65778  99%  1.00K   4767   16  76272K kmalloc-1024
 65436  65414  99%  0.63K   5454   12  43632K proc_inode_cache
 65419  65419 100%  4.00K  14179    8 453728K kmalloc-4096
 65390  65390 100%  2.06K  13338   15 426816K net_namespace
  4998   4990  99%  0.08K     98   51    392K sysfs_dir_cache
  4224   2018  47%  0.06K     66   64    264K kmalloc-64
  2288   2107  92%  0.18K    104   22    416K vm_area_struct
  1792   1789  99%  0.02K      7  256     28K kmalloc-16
  1470   1203  81%  0.19K     70   21    280K kmalloc-192
  1300    402  30%  0.79K     65   20   1040K ext3_inode_cache
   896    731  81%  0.03K      7  128     28K anon_vma
   784    532  67%  0.55K     56   14    448K radix_tree_node
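A rough cross-check (my own arithmetic, under the assumption that net_namespace here uses order-3 SLUB slabs, i.e. 32 KiB each, which is what the CACHE SIZE column implies): the SLABS count alone pins down over 400 MiB:

```python
# Cross-check the net_namespace row: SLABS * slab size should match
# the CACHE SIZE column. Assumption: order-3 SLUB slabs of 32 KiB
# (8 pages * 4 KiB) - not stated in the thread, inferred from the numbers.
slabs = 13338      # SLABS column for net_namespace above
slab_kib = 32      # assumed slab size
cache_kib = slabs * slab_kib
print(f"{cache_kib} KiB ~= {cache_kib // 1024} MiB")  # 426816 KiB ~= 416 MiB
```

That matches the reported cache size and lines up with the ~503 MiB of unreclaimable slab in the SysRq-M dump: essentially all of it is leaked network namespaces.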

A bit later:

 Active / Total Objects (% used): 555403 / 586704 (94.7%)
 Active / Total Slabs (% used)  : 49394 / 49394 (100.0%)
 Active / Total Caches (% used) : 65 / 76 (85.5%)
 Active / Total Size (% used)   : 546552.82K / 552827.43K (98.9%)
 Minimum / Average / Maximum Object : 0.01K / 0.94K / 8.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
 90993  66779  73%  0.19K   4333   21  17332K dentry
 75840  73654  97%  0.12K   2370   32   9480K kmalloc-128
 68096  68092  99%  0.01K    133  512    532K kmalloc-8
 65888  65601  99%  0.25K   4118   16  16472K kmalloc-256
 65852  65741  99%  1.00K   4760   16  76160K kmalloc-1024
 65436  65409  99%  0.63K   5454   12  43632K proc_inode_cache
 65428  65428 100%  4.00K  14181    8 453792K kmalloc-4096
 65391  65391 100%  2.06K  13339   15 426848K net_namespace
  4998   4986  99%  0.08K     98   51    392K sysfs_dir_cache
  4224   2017  47%  0.06K     66   64    264K kmalloc-64
  2134   2108  98%  0.18K     97   22    388K vm_area_struct
  1792   1789  99%  0.02K      7  256     28K kmalloc-16
  1449   1078  74%  0.19K     69   21    276K kmalloc-192
  1100    376  34%  0.79K     55   20    880K ext3_inode_cache
   896    639  71%  0.03K      7  128     28K anon_vma
   714    554  77%  0.55K     51   14