Re: [Lxc-users] concurrent aptitude/dpkg runs in separate containers -- bork bork bork

2011-02-02 Thread Trent W. Buck
Daniel Lezcano <daniel.lezc...@free.fr> writes:

> On 01/12/2011 07:39 AM, Trent W. Buck wrote:
>> Mike <deb...@good-with-numbers.com> writes:
>>
>>> Trent W. Buck wrote:
>>>> I can provision a new LXC container, which includes running a few
>>>> aptitude install foo lines (inside the containers), and it Just Works.
>>>> If I try to provision two containers at the same time, both containers
>>>> appear to hang with a dpkg process in the D state[0].
>>>>
>>>> My config for each container looks like this:
>>>> lxc.mount.entry  = /home   /srv/lxc/proud/home   none bind
>>> Doesn't aptitude write into the home directory that you're sharing
>>> across containers? locks ~/.aptitude/cache?
>> Possibly, but dhclient's running aptitude as uid 0, whose home is in
>> /root, which is not shared between hosts.
>>
>> In any case, IIRC I export HOME=`mktemp -d` before aptitude runs, as a
>> fugly workaround for etckeeper.
>>
>> I suppose if containers' /tmp are tmpfses, aptitude might be running
>> into some problem with locking on tmpfs, although I'm not aware of any
>> bogus operational semantics for tmpfs...
>
> Can you give, for each process in D state, the content of
> /proc/pid/stack?
> That may help to understand where the lock occurs in the kernel.

I ran into it again today, accidentally.

I was recently reminded that dpkg doesn't mix well with ext4 and btrfs,
because it calls sync/fsync all the time.  Now, my containers are on
different ext4 LVs, but maybe 1 process calling f/sync all the time is
enough to slow them both down.
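
To get a rough feel for how much per-file fsync costs on a given
filesystem, something like the following could be run inside a
container. It is only a sketch: write_files, the file count and the
payload size are made up for illustration and are not how dpkg itself
does its I/O. Point it at a directory on the ext4 LV rather than a
tmpfs, since fsync is close to free on tmpfs.

```python
import os
import tempfile
import time


def write_files(directory, count, payload, use_fsync):
    """Write `count` small files, optionally fsync'ing each one.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"file{i:04d}")
        with open(path, "wb") as f:
            f.write(payload)
            if use_fsync:
                f.flush()             # push Python's buffer to the kernel
                os.fsync(f.fileno())  # block until the kernel says it is on disk
    return time.perf_counter() - start


if __name__ == "__main__":
    payload = b"x" * 4096
    for use_fsync in (False, True):
        # Assumption: the default temp dir is on the filesystem under test;
        # pass dir=... to TemporaryDirectory to choose another location.
        with tempfile.TemporaryDirectory() as d:
            elapsed = write_files(d, 200, payload, use_fsync)
            print(f"fsync={use_fsync}: {elapsed:.3f}s for 200 files")
```

Running it in two containers at once, and then in only one, would show
whether the fsync'ing writers really do slow each other down across
separate LVs.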

I'm being a bit more patient than last time, and I think they ARE
proceeding, just REALLY slowly.  Meanwhile aptitude consumes 100% of a
core busy-waiting for a response from dpkg :-/

They look like this:

$ ssh omega cat /proc/7713/stack
Warning: Permanently added 'omega,192.168.155.22' (RSA) to the list of known hosts.
[<ffffffff811669b7>] sync_inodes_sb+0x87/0xb0
[<ffffffff8116b292>] __sync_filesystem+0x82/0x90
[<ffffffff8116b379>] sync_filesystems+0xd9/0x130
[<ffffffff8116b431>] sys_sync+0x21/0x40
[<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

$ ssh omega cat /proc/5619/stack
Warning: Permanently added 'omega,192.168.155.22' (RSA) to the list of known hosts.
[<ffffffff81222865>] jbd2_log_wait_commit+0xc5/0x150
[<ffffffff811d7a2c>] ext4_sync_file+0x13c/0x2e0
[<ffffffff8116b051>] vfs_fsync_range+0xa1/0xe0
[<ffffffff8116b0fd>] vfs_fsync+0x1d/0x20
[<ffffffff8116b13e>] do_fsync+0x3e/0x60
[<ffffffff8116b190>] sys_fsync+0x10/0x20
[<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

I couldn't capture more than one PID from the same ps run before one or
the other vanished.  The ps output of D states looks like this:

$ ssh omega lxc-ps auxf | grep ' D '
               root   465  0.0  0.0      0     0 ?  D  Jan12  0:04  \_ [kdmflush]
               root   476  0.0  0.0      0     0 ?  D  Jan12  0:01  \_ [kdmflush]
               root   591  0.0  0.0      0     0 ?  D  Jan12  0:10  \_ [jbd2/dm-0-8]
               root  1437  0.0  0.0      0     0 ?  D  Jan12  0:11  \_ [jbd2/dm-3-8]
               root  2278  0.0  0.0      0     0 ?  D  Jan12  0:21  \_ [kdmflush]
               root  2293  0.0  0.0      0     0 ?  D  Jan12  1:06  \_ [jbd2/dm-21-8]
               root 26611  0.0  0.0      0     0 ?  D  15:27  0:00  \_ [kdmflush]
               root 26635  0.0  0.0      0     0 ?  D  15:27  0:00  \_ [jbd2/dm-6-8]
template-amd64 root  4763  0.7  0.0  17708  6064 ?  D  15:47  0:00  |   |   \_ /usr/bin/dpkg --status-fd 34 --unpack --auto-deconfigure
    /srv/mirror/ubuntu/pool/main/u/udev/udev_151-12.3_amd64.deb
    /srv/mirror/ubuntu/pool/main/l/lvm2/dmsetup_1.02.39-1ubuntu4.1_amd64.deb
    /srv/mirror/ubuntu/pool/main/g/glib2.0/libglib2.0-0_2.24.1-0ubuntu1_amd64.deb
    /srv/mirror/ubuntu/pool/main/libu/libusb/libusb-0.1-4_0.1.12-14ubuntu0.2_amd64.deb
    /srv/mirror/ubuntu/pool/main/p/plymouth/plymouth_0.8.2-2ubuntu2.2_amd64.deb
    /srv/mirror/ubuntu/pool/main/libp/libpng/libpng12-0_1.2.42-1ubuntu2.1_amd64.deb
    /srv/mirror/ubuntu/pool/main/p/plymouth/libplymouth2_0.8.2-2ubuntu2.2_amd64.deb
    /srv/mirror/ubuntu/pool/main/m/mountall/mountall_2.15.3_amd64.deb
    /srv/mirror/ubuntu/pool/main/u/upstart/upstart_0.6.5-8_amd64.deb
    /srv/mirror/ubuntu/pool/main/u/util-linux/bsdutils_2.17.2-0ubuntu1.10.04.2_amd64.deb
greed          root  4325  0.7  0.1  14296  8880 ?  D  15:45  0:01  |   \_ /usr/bin/dpkg --status-fd 35 --unpack --auto-deconfigure
    /srv/mirror/ubuntu/pool/main/x/xorg/x11-common_7.5+5ubuntu1_all.deb
    /srv/mirror/ubuntu/pool/main/f/freetype/libfreetype6_2.3.11-1ubuntu2.4_amd64.deb
    /srv/mirror/ubuntu/pool/main/t/ttf-dejavu/ttf-dejavu-core_2.30-2_all.deb
    /srv/mirror/ubuntu/pool/main/f/fontconfig/fontconfig-config_2.8.0-2ubuntu1_all.deb
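
Since the D-state processes come and go faster than a ps followed by a
manual cat, the request for /proc/pid/stack of every D-state process
could be scripted so that each stack is read right after its state is
seen. A minimal sketch, run as root on the host; the function names are
invented here, and it only looks at thread-group leaders, not at
individual threads:

```python
import os


def parse_state(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or ')', so split on the last ')' before reading the state.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def d_state_stacks(proc="/proc"):
    """Yield (pid, comm, stack) for each process in uninterruptible sleep."""
    if not os.path.isdir(proc):
        return
    for entry in sorted(os.listdir(proc)):
        if not entry.isdigit():
            continue  # skip /proc/self, /proc/meminfo and friends
        base = os.path.join(proc, entry)
        try:
            with open(os.path.join(base, "stat"), errors="replace") as f:
                stat_text = f.read()
            if parse_state(stat_text) != "D":
                continue
            comm = stat_text[stat_text.index("(") + 1:stat_text.rindex(")")]
            # Reading the stack file usually requires root.
            with open(os.path.join(base, "stack"), errors="replace") as f:
                stack = f.read()
        except OSError:
            continue  # process exited in the meantime, or permission denied
        yield int(entry), comm, stack


if __name__ == "__main__":
    for pid, comm, stack in d_state_stacks():
        print(f"== {pid} ({comm}) ==")
        print(stack, end="")
```

This is still not an atomic snapshot, but the window between seeing the
D state and reading the stack is much smaller than with separate ps and
cat invocations over ssh.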
 

Re: [Lxc-users] concurrent aptitude/dpkg runs in separate containers -- bork bork bork

2011-02-02 Thread Trent W. Buck
t...@cybersource.com.au (Trent W. Buck) writes:

> I'm being a bit more patient than last time, and I think they ARE
> proceeding, just REALLY slowly.  Meanwhile aptitude consumes 100% of a
> core busy-waiting for a response from dpkg :-/
>
> They look like this:
>
> $ ssh omega cat /proc/7713/stack
> Warning: Permanently added 'omega,192.168.155.22' (RSA) to the list of known hosts.
> [<ffffffff811669b7>] sync_inodes_sb+0x87/0xb0
> [<ffffffff8116b292>] __sync_filesystem+0x82/0x90
> [<ffffffff8116b379>] sync_filesystems+0xd9/0x130
> [<ffffffff8116b431>] sys_sync+0x21/0x40
> [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> $ ssh omega cat /proc/5619/stack
> Warning: Permanently added 'omega,192.168.155.22' (RSA) to the list of known hosts.
> [<ffffffff81222865>] jbd2_log_wait_commit+0xc5/0x150
> [<ffffffff811d7a2c>] ext4_sync_file+0x13c/0x2e0
> [<ffffffff8116b051>] vfs_fsync_range+0xa1/0xe0
> [<ffffffff8116b0fd>] vfs_fsync+0x1d/0x20
> [<ffffffff8116b13e>] do_fsync+0x3e/0x60
> [<ffffffff8116b190>] sys_fsync+0x10/0x20
> [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff

And here's one that is well and truly wedged:

root@omega:~# cat /proc/31430/stack
[<ffffffff811669b7>] sync_inodes_sb+0x87/0xb0
[<ffffffff8116b292>] __sync_filesystem+0x82/0x90
[<ffffffff8116b379>] sync_filesystems+0xd9/0x130
[<ffffffff8116b431>] sys_sync+0x21/0x40
[<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

In that case, even kill -SEGV'ing upstart won't stop it.  I got that
with only a single dpkg run (i.e. no concurrency), after switching the
container's rootfs from ext4 to ext3, and forcing dpkg[0] to be upgraded
before anything else.  Sigh...

I'm THIS CLOSE to giving up and wrapping apt-get in libeatmydata.
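
For the record, the libeatmydata route would look roughly like this in
a provisioning script. The eatmydata command is the wrapper shipped
with libeatmydata; it preloads a library that turns fsync() and
friends into no-ops for the wrapped process, so a crash in the middle of
an install can leave the container's filesystem inconsistent. The
helper names and the package name "foo" are placeholders, not part of
any existing tool.

```python
import shutil
import subprocess


def wrap_with_eatmydata(cmd, wrapper="eatmydata"):
    """Prefix cmd with the wrapper if it is on PATH, else return cmd unchanged."""
    if shutil.which(wrapper):
        return [wrapper] + list(cmd)
    return list(cmd)


def install(packages):
    """Run apt-get install for the given packages, unsynced when possible."""
    cmd = wrap_with_eatmydata(["apt-get", "install", "-y"] + list(packages))
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


if __name__ == "__main__":
    # Only print the command here; call install(["foo"]) to actually run it.
    print(" ".join(wrap_with_eatmydata(["apt-get", "install", "-y", "foo"])))
```

Falling back to the plain command when the wrapper is missing keeps the
same script usable in containers where libeatmydata has not been
installed yet.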

[0] I did this because I noticed that lucid's dpkg still suffers from

  http://bugs.debian.org/578635
  http://bugs.debian.org/605009
  https://launchpad.net/bugs/570805

But lucid-updates & lucid-security both contain a version that
CLAIMS to address the first of those.

