Same issue here.  I have a set of backup scripts that create LVM snapshots
of XFS volumes, and they have been failing randomly since around the time
I upgraded to 7.10.  I use different backup programs (tar | bzip2, 7z,
rdiff-backup, ...) and they all fail the same way.  My backups run as
daily cron jobs, so it has been difficult to track down the real issue,
since my PC would simply be locked up the next morning.  As for how
often I've seen the problem, it varies:  I've seen it crash after a
single snapshot creation right after a reboot, and another time it ran
for over a week without a problem.  On average, the system crashes at
least once every 4 snapshots (several crashes a week).

Has anyone experienced this with file systems other than XFS?  I have
several XFS partitions on LVs, and it seems like once one of them gets
hung, the others are soon to follow.  (Perhaps XFS shares some kernel
worker threads across XFS file systems?)  I set up a simple test script
to snapshot, mount, list the top-level directory, unmount, and remove the
snapshot.  I ran the script probably 20 times for both EXT3 and ReiserFS
LVs without any issues.  I then ran the script for an XFS LV, and it
crashed on the 3rd snapshot creation attempt.  So I suspect that this
issue is specific to XFS.  The last LVM message shown (in verbose mode)
says:  "Suspending <LV> (<MAJ>:<MIN>) with filesystem sync without device
flush".  And then it just sits there forever.
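
For anyone who wants to reproduce this, the test cycle described above
could be sketched roughly as follows.  This is not my original script,
only a minimal reconstruction of the same steps.  The snapshot size, the
mount point, and the mount options are assumptions; for an XFS snapshot
whose origin is still mounted, "ro,nouuid" is normally needed instead of
plain "ro".  It defaults to a dry run that only prints the commands.

```python
import subprocess


def snapshot_cycle_commands(vg, lv, snap, mountpoint, size="1G", mount_opts="ro"):
    """Return the commands for one snapshot/mount/list/unmount/remove cycle."""
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # Create the snapshot (verbose, so the last LVM message is visible on a hang).
        ["lvcreate", "--verbose", "--snapshot", "--size", size,
         "--name", snap, f"/dev/{vg}/{lv}"],
        ["mount", "-o", mount_opts, snap_dev, mountpoint],
        ["ls", mountpoint],
        ["umount", mountpoint],
        ["lvremove", "--force", snap_dev],
    ]


def run_cycles(commands, count, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for i in range(count):
        for cmd in commands:
            print(f"[{i + 1}/{count}] {' '.join(cmd)}", flush=True)
            if not dry_run:
                subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical mount point; the VG/LV names are taken from the logs in this report.
    cmds = snapshot_cycle_commands("system", "vmware4", "vmware-snap4bak",
                                   "/mnt/snaptest", mount_opts="ro,nouuid")
    run_cycles(cmds, count=3, dry_run=True)
```

Running it for real needs root and dry_run=False, and on an affected
system it may well hang the machine, so a serial console or a spare box
is advisable.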

To elaborate on what Andreas and Mike wrote: After a failed (hung)
snapshot creation, various processes on the system enter state "D"
(disk sleep).  The number of processes stuck in this waiting-on-disk
state just keeps growing, and eventually the entire system stops
working.  Since some of these processes are kernel threads, such as
"xfsdata/0" and "xfssyncd", it makes sense that the whole system
eventually stops responding as new requests get blocked indefinitely.
Alt+SysRq commands still respond.  My root file system is a non-LVM
partition using ext3, so my root partition seems to stay accessible for
some time.  (This may explain why other users are reporting an immediate
lock-up, while it doesn't always happen that way for me.)
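
To watch the list of stuck processes grow, something like the following
sketch could be run repeatedly from a shell on the still-working root
partition.  It is Linux-specific and simply reads the state field from
/proc/<pid>/stat; the function names are my own and are only
illustrative.

```python
import os


def parse_stat(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx])
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for processes in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while being read, or the line was unparsable
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(f"{pid:>7}  {comm}")
```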


I have the verbose output from two "lvm lvcreate" commands: the first one
was successful, and the second one failed and caused the system to hang:

Successfully created snapshot:

    Logging initialised at Tue Apr  1 10:50:48 2008
    Set umask to 0077
    Setting chunksize to 16 sectors.
    Finding volume group "system"
    Creating logical volume vmware-snap4bak
  WARNING: This metadata update is NOT backed up
    Found volume group "system"
    Creating system-vmware--snap4bak
    Loading system-vmware--snap4bak table
    Resuming system-vmware--snap4bak (254:6)
    Clearing start of logical volume "vmware-snap4bak"
    Found volume group "system"
    Found volume group "system"
    Creating system-vmware4-real
    Loading system-vmware4-real table
    Resuming  (254:7)
    Loading system-vmware4 table
    Creating system-vmware--snap4bak-cow
    Loading system-vmware--snap4bak-cow table
    Resuming  (254:8)
    Loading system-vmware--snap4bak table
    Suspending system-vmware4 (254:5) with filesystem sync without device flush
    Suspending system-vmware4-real (254:7) with filesystem sync without device flush
    Found volume group "system"
    Loading system-vmware4-real table
    Suppressed system-vmware4-real identical table reload.
    Resuming system-vmware4-real (254:7)
    Loading system-vmware--snap4bak-cow table
    Suppressed system-vmware--snap4bak-cow identical table reload.
    Resuming system-vmware--snap4bak (254:6)
    Resuming system-vmware4 (254:5)
  WARNING: This metadata update is NOT backed up
  Logical volume "vmware-snap4bak" created
    Wiping internal VG cache


Failed snapshot (hangs at the end, there is no error message):

    Logging initialised at Tue Apr  1 10:51:16 2008
    Set umask to 0077
    Setting chunksize to 16 sectors.
    Finding volume group "system"
    Creating logical volume vmware-snap4bak
  WARNING: This metadata update is NOT backed up
    Found volume group "system"
    Creating system-vmware--snap4bak
    Loading system-vmware--snap4bak table
    Resuming system-vmware--snap4bak (254:6)
    Clearing start of logical volume "vmware-snap4bak"
    Found volume group "system"
    Found volume group "system"
    Creating system-vmware4-real
    Loading system-vmware4-real table
    Resuming  (254:7)
    Loading system-vmware4 table
    Creating system-vmware--snap4bak-cow
    Loading system-vmware--snap4bak-cow table
    Resuming  (254:8)
    Loading system-vmware--snap4bak table
    Suspending system-vmware4 (254:5) with filesystem sync without device flush
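
The two logs are identical up to the point where the failed run stops,
apart from the timestamp.  A small sketch like the one below (my own
helper names, with a few lines from the logs above as sample data)
shows which step the hung run never reached.  Assuming LVM prints each
message before carrying out the step, that would mean the suspend of
system-vmware4 itself never returned.

```python
def normalise(log_text):
    """Strip indentation; drop blank lines and the timestamped first line."""
    lines = [ln.strip() for ln in log_text.splitlines()]
    return [ln for ln in lines if ln and not ln.startswith("Logging initialised")]


def first_missing_step(good_log, hung_log):
    """Return the first line of the good log that the hung log lacks or differs on."""
    good, hung = normalise(good_log), normalise(hung_log)
    for i, line in enumerate(good):
        if i >= len(hung) or line != hung[i]:
            return line
    return None  # the hung log contains every step of the good log


GOOD = """
    Loading system-vmware--snap4bak table
    Suspending system-vmware4 (254:5) with filesystem sync without device flush
    Suspending system-vmware4-real (254:7) with filesystem sync without device flush
    Found volume group "system"
"""

HUNG = """
    Loading system-vmware--snap4bak table
    Suspending system-vmware4 (254:5) with filesystem sync without device flush
"""

if __name__ == "__main__":
    print(first_missing_step(GOOD, HUNG))
```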


Here is output from the "dmsetup" utility, which perhaps can help someone more 
knowledgeable track what is going on here.  I ran the "info" command for each 
of the relevant dm devices.

[EMAIL PROTECTED]:~$ sudo dmsetup ls
system-vmware--snap4bak (254, 6)
system-backups  (254, 4)
system-vmware--snap4bak-cow     (254, 8)
system-vmware4-real     (254, 7)
system-home2    (254, 3)
system-cache    (254, 2)
system-shared--files    (254, 0)
system-downloads        (254, 1)
system-vmware4  (254, 5)

[EMAIL PROTECTED]:~$ sudo dmsetup info system-vmware--snap4bak
Name:              system-vmware--snap4bak
State:             ACTIVE
Tables present:    LIVE & INACTIVE
Open count:        0
Event number:      0
Major, minor:      254, 6
Number of targets: 1
UUID: LVM-7dwhJG9gqkES5zWHYk2BSmWUhnVxlE28MFjNe4f5wRP3VpMoopRHDM5ngCecAaDf

[EMAIL PROTECTED]:~$ sudo dmsetup info system-vmware--snap4bak-cow
Name:              system-vmware--snap4bak-cow
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      254, 8
Number of targets: 1
UUID: LVM-7dwhJG9gqkES5zWHYk2BSmWUhnVxlE28MFjNe4f5wRP3VpMoopRHDM5ngCecAaDf-cow

[EMAIL PROTECTED]:~$ sudo dmsetup info system-vmware4-real
Name:              system-vmware4-real
State:             ACTIVE
Tables present:    LIVE
Open count:        2
Event number:      0
Major, minor:      254, 7
Number of targets: 10
UUID: LVM-7dwhJG9gqkES5zWHYk2BSmWUhnVxlE28NX4r1PFzSnNx2wW81r4rsHOq5V28rgHR-real

[EMAIL PROTECTED]:~$ sudo dmsetup info system-vmware4
Name:              system-vmware4
State:             ACTIVE
Tables present:    LIVE & INACTIVE
Open count:        1
Event number:      0
Major, minor:      254, 5
Number of targets: 10
UUID: LVM-7dwhJG9gqkES5zWHYk2BSmWUhnVxlE28NX4r1PFzSnNx2wW81r4rsHOq5V28rgHR
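
One thing that stands out in the output above is that the two devices
reporting "LIVE & INACTIVE" tables are the same two for which the failed
log shows a table load with no resume after it.  Below is a rough sketch
for picking those out of "dmsetup info" output automatically.  The
parser is my own and assumes the plain "Key: value" layout shown above,
with the shell prompt lines removed.

```python
def parse_dmsetup_info(text):
    """Parse blank-line-separated 'Key: value' blocks into a list of dicts."""
    devices = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                devices.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        devices.append(current)
    return devices


def pending_table_devices(devices):
    """Names of devices that still have an INACTIVE table loaded."""
    return [d["Name"] for d in devices if "INACTIVE" in d.get("Tables present", "")]


# Abbreviated sample taken from the output in this report.
SAMPLE = """Name:              system-vmware4-real
State:             ACTIVE
Tables present:    LIVE

Name:              system-vmware4
State:             ACTIVE
Tables present:    LIVE & INACTIVE
"""

if __name__ == "__main__":
    print(pending_table_devices(parse_dmsetup_info(SAMPLE)))
```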

-- 
LVM snapshot freezes system since 7.10 upgrade
https://bugs.launchpad.net/bugs/163807