Hello,

We're running into an issue with merging snapshots and need some direction on
how to debug it further.  We have some Jenkins jobs that repeatedly create and
merge snapshots.  After a number of consecutive runs they fail inside the
lvconvert command.  Sometimes it takes 3 runs, sometimes hundreds, but
eventually it fails.  This is on CentOS 6.8 with
lvm2-2.02.143-7.el6.x86_64, and I can reproduce the same problem after
upgrading to lvm2-2.02.143-12.el6.x86_64 from 6.9.  The error is:

  journal: Merging snapshot invalidated. Aborting merge.

This is thrown by lvconvert in its progress polling code when it gets
back -1 (DM_PERCENT_INVALID) as the merge percentage.  The snapshot is healthy
before merging, there are very few (if any) block changes in the LV while our
Jenkins jobs are running, and the snapshot and source volume are quite small
(a few hundred MiB).  I've managed to reproduce the issue outside of the
Jenkins tests with this Bash loop.  It generally takes 15-30 minutes of running
before it fails:


while [[ 1 ]]; do
  date
  /sbin/lvremove -f vg_os/journal_reserved_snap
  /sbin/lvcreate -s -n journal_snap -L 160.00m vg_os/journal
  /bin/umount /mnt
  /sbin/lvconvert --merge -i 5 vg_os/journal_snap
  /bin/mount /mnt
  /sbin/lvcreate -L 160.00m -n vg_os/journal_reserved_snap
  echo
done
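
A variant of the loop above that stops at the first failed merge and saves the
LVM and Device-Mapper state could look roughly like this.  It is only a sketch:
the helper names (capture_state, run_cycle, run_until_failure) and the output
directory are made up for illustration, it assumes lvconvert exits non-zero
when the merge is aborted, and the extra -vvvv logging may change the timing
enough to affect how quickly the problem reproduces.

```shell
#!/bin/sh
# Sketch only: helper names and the default output directory are hypothetical.
VG=vg_os
ORIGIN=journal
SNAP=journal_snap
RESERVED=journal_reserved_snap
MNT=/mnt

# Save LVM and device-mapper state into the directory given as $1.
capture_state() {
  mkdir -p "$1"
  lvs --all "$VG"   > "$1/lvs.txt" 2>&1
  dmsetup status    > "$1/dmsetup-status.txt" 2>&1
  dmsetup table     > "$1/dmsetup-table.txt" 2>&1
  dmesg | tail -n 50 > "$1/dmesg-tail.txt" 2>&1
}

# One create/merge cycle, same steps as the original loop.
# $1 is the file that receives lvconvert's output.
# Returns lvconvert's exit status (assumed non-zero on an aborted merge).
run_cycle() {
  local rc
  lvremove -f "$VG/$RESERVED" || true
  lvcreate -s -n "$SNAP" -L 160.00m "$VG/$ORIGIN" || return 1
  umount "$MNT" || return 1
  lvconvert --merge -i 5 -vvvv "$VG/$SNAP" > "$1" 2>&1
  rc=$?
  mount "$MNT"
  [ "$rc" -eq 0 ] || return "$rc"
  lvcreate -L 160.00m -n "$VG/$RESERVED"
  return 0
}

# Repeat cycles until one fails, then capture state without cleaning up.
run_until_failure() {
  local dir n
  dir=${1:-/tmp/merge-debug}
  n=0
  mkdir -p "$dir"
  while run_cycle "$dir/lvconvert-last.log"; do
    n=$((n + 1))
  done
  echo "cycle $((n + 1)) failed; state saved in $dir"
  capture_state "$dir"
}
```

Running `run_until_failure /root/merge-debug` as root would leave the failed
snapshot in place, so the saved files and the live state can be compared.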


When the merge fails, the snapshot is left in the merging state but invalid, 
and oddly 100% full:


# lvs --all
...
  [journal_snap]        vg_os Swi-I-s--- 160.00m      journal 100.00            
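
For scripting around this, the attribute string can be checked directly.  As I
read lvs(8), the first character 'S' means "merging snapshot" and the fifth
character 'I' means "invalid snapshot", which matches the Swi-I-s--- shown
above.  A small sketch (the function name is made up):

```shell
#!/bin/sh
# Return 0 if an lv_attr string describes a merging snapshot that is invalid.
# Assumes the lvs(8) layout: char 1 'S' = merging snapshot,
# char 5 'I' = invalid snapshot. Other failure states (e.g. 'm', merge
# failed) are deliberately not matched here.
is_invalid_merging_snapshot() {
  case $1 in
    S???I*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Possible usage (not run here; needs root and the real volume group):
#   attr=$(lvs --all --noheadings -o lv_attr vg_os/journal_snap | tr -d ' ')
#   is_invalid_merging_snapshot "$attr" && echo "merge was invalidated"
```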
                     


Device-Mapper says this about the merge:

May  5 10:45:55 lddev-build-scotty04 kernel: device-mapper: snapshots: Cancelling snapshot handover.
May  5 10:45:55 lddev-build-scotty04 kernel: device-mapper: snapshots: Snapshot is invalid: can't merge


And the *-cow and *-real DMs still exist:

# dmsetup ls | grep journal
vg_os-journal   (253:3)
vg_os-journal_snap      (253:6)
vg_os-journal_snap-cow  (253:5)
vg_os-journal-real      (253:4)


I can clean up the snapshot with 'lvremove' and start the process all over 
again.  I can also reproduce the problem on bare metal hardware and in a KVM 
instance (not that it should make a difference).

I'm at a bit of a loss on how to debug this any further.  I've experimented a
little with rolling back metadata changes to before the merge, but I don't
really know what I'm looking for, and I nearly always end up locking up
Device-Mapper in some way and having to reboot :-)

Can anyone suggest a way forward here?

As an aside, while ruling out possible causes I accidentally ran the same Bash
loop on a CentOS 6.8 machine with a non-standard kernel.  The result is
different: the merge never fails, but the LVM operations start out fast and
get slower and slower overnight.  A loop that initially completed in less than
a second was taking 30+ seconds by the end, and printing an interesting
message about reserved memory.  A resource leak?  I mention it just in case it
sheds light on the first problem; I don't really expect you to help me when
using our custom kernel :-)

  Internal error: Reserved memory (15560704) not enough: used 25010176. Increase activation/reserved_memory?
  Logical volume "journal_reserved_snap" successfully removed

Thanks,

-- 
Luke Bigum
Lead Engineer

Information Systems
---

LMAX Exchange, Yellow Building, 1A Nicholas Road, London W11 4AN
http://www.LMAX.com/