I came across a PDF document from HP, a recent white paper on LVM
snapshots.

http://h20000.www2.hp.com/bizsupport/TechSupport/CoreRedirect.jsp?redirectReason=DocIndexPDF&prodSeriesId=4296010&targetPage=http%3A%2F%2Fbizsupport2.austin.hp.com%2Fbc%2Fdocs%2Fsupport%2FSupportManual%2Fc02054539%2Fc02054539.pdf

Most interesting part:
"In very low system memory conditions, deletion of a single snapshot can hang 
indefinitely for memory to become available. Ensure that sufficient memory is 
available during deletion of a single snapshot that requires data to be copied 
to its predecessor. If the lvremove command hangs in these cases, increase the 
system memory or free some existing system memory to proceed with the snapshot 
deletion."

No further explanation is given.

Our host has 64GB RAM and two 6-core Intel CPUs.
We're using Munin to graph memory usage. The graphs are updated every 5 
minutes, so we don't have exact numbers for the moment the snapshot 
was removed.
At the moment the removal of the snapshot was initiated the host used 
approximately 51GB RAM, 6GB buffers, 10GB unused and 3GB swap.
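
Since the 5-minute Munin interval misses the moment of removal, one
option is to sample /proc/meminfo once per second while lvremove runs.
This is only a rough sketch: the function names, the default log path
and the choice of fields are my own assumptions, not something from the
HP paper or from LVM.

```shell
#!/bin/sh
# Print one timestamped sample of selected meminfo fields (values in kB).
# The optional argument is the file to read; it defaults to /proc/meminfo.
sample_meminfo() {
    file=${1:-/proc/meminfo}
    printf '%s ' "$(date +%s)"
    awk '/^(MemFree|Buffers|Cached|SwapFree):/ { printf "%s%s ", $1, $2 }' "$file"
    printf '\n'
}

# Append one sample per second to a log file.
# $1 = number of samples (default 60)
# $2 = log file (default /tmp/meminfo.log, a hypothetical location)
log_meminfo() {
    count=${1:-60}
    logfile=${2:-/tmp/meminfo.log}
    i=0
    while [ "$i" -lt "$count" ]; do
        sample_meminfo >> "$logfile"
        i=$((i + 1))
        sleep 1
    done
}
```

Starting "log_meminfo 300 /tmp/lvremove-mem.log &" just before the
lvremove call would give per-second numbers to compare with the graphs.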

I'm wondering whether this is related to some NUMA issues I have been
researching over the last few weeks, though it may well have nothing to
do with this issue.

Some memory statistics:
# free -m
             total       used       free     shared    buffers     cached
Mem:         64549      64062        487          0      23579        780
-/+ buffers/cache:      39702      24847
Swap:         7627        377       7250

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22
node 0 size: 32768 MB
node 0 free: 63 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23
node 1 size: 32758 MB
node 1 free: 437 MB
node distances:
node   0   1 
  0:  10  20 

The host is only swapping a little right now, but every day it swaps
out 4GB of RAM, even though vm.swappiness=0 is set.
We run swapoff -a && swapon -a a couple of times a day.
It should not swap at all, but this seems to be an issue with multiple
CPU sockets and processes not staying on the same NUMA node (CPU
pinning). Hosts with multiple sockets (not cores) seem to swap out a
lot more.
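
To see which processes actually end up in swap, the VmSwap field in
/proc/<pid>/status can be listed per process. A hedged sketch: it
assumes a kernel that reports VmSwap there (2.6.34 or later, as far as
I know), and the function name is made up for illustration.

```shell
#!/bin/sh
# Print "<kB swapped> <process name>" for every process with nonzero
# VmSwap, largest first. The optional argument is the proc root
# directory; it defaults to /proc.
swap_by_process() {
    procroot=${1:-/proc}
    for f in "$procroot"/[0-9]*/status; do
        [ -r "$f" ] || continue
        awk '/^Name:/   { name = $2 }
             /^VmSwap:/ { if ($2 > 0) print $2, name }' "$f"
    done | sort -rn
}
```

Running "swap_by_process | head" right after the daily swap-out would
show whether the swapped pages belong to a few large processes.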

It is possible that the lvremove action thinks there is not enough
RAM and hangs indefinitely.

Hopefully someone can confirm some of this.

-- 
lvremove fails
https://bugs.launchpad.net/bugs/533493
