On 09/28/2017 01:46 PM, Tom Buskey wrote:
> I work with OpenStack. It manages images in Glance which sit above its
> object storage, Swift.
>
> On the POC clouds, you can use LVM as a backend for Glance. Snapshotting is
> *very* slow. 30 minutes for a snap of an 80GB VM that's shut down.

OK..., that surprises me. A lot.

For comparison, I just made an LVM snapshot of a volume 50% larger than that,
that's *in use* (and mostly not in cache, if that even makes a difference,
since my buffer+cache shows as only 17GB *total*), and the whole operation
took only a fraction of a second:

    rozzin@zuul:~ $ time sudo lvcreate --name home_snap --size 128G --snapshot zuul-vg/home
      Using default stripesize 64.00 KiB.
      Logical volume "home_snap" created.

    real    0m0.349s
    user    0m0.028s
    sys     0m0.060s

How in the world does that translate to 30 minutes (*5 thousand* times as long)
for a volume only 0.63x as big?

When you say "snapshotting on top of LVM", does that entail actually making
a full copy after the LVM snapshot is made--or something like that?

> You can use other storage backends in OpenStack that are faster. A full
> non-LVM Swift. Ceph and glusterfs are common choices where performance
> matters. They wouldn't be using ZFS but probably something using their
> S3 object store.
>
> On Thu, Sep 28, 2017 at 1:32 PM, Ken D'Ambrosio <k...@jots.org> wrote:
>
> I would say it's unlikely to be LVM, because LVM is content-ignorant; it
> snapshots the entire volume, which is inefficient, and when you're
> Amazon, you care a LOT about being efficient. Instead, I imagine
> they're using some content-aware CoW solution such as ZFS. But,
> whatever mechanism, I agree with your opinion: I doubt that their
> solution -- almost certainly CoW of some sort -- stands a chance of
> being more than even slightly impactful.
>
> $.02, YMMV and other assorted disclaimers,
>
> -Ken
>
> On 2017-09-28 13:16, Joshua Judson Rosen wrote:
> > I'm working on a project that uses Amazon AWS-provided VPS instances,
> > and the other guy on the project is telling me that "snapshotting
> > hourly may degrade performance", and I'm trying to determine whether
> > that's actually true. My gut feeling is that it sounds kind of bogus.
> >
> > From the information I've been able to find about how Amazon's stuff
> > works (either in terms of how it's _implemented_ [for which I'm finding
> > basically no insight] or how it's _characterized_ [in the engineering
> > sense, not the literary sense]...), it really sounds a _lot_ like Amazon
> > is just using LVM snapshots, e.g. from
> > <https://aws.amazon.com/ebs/faqs/>:
> >
> >     "snapshots can be done in real time while the volume is attached
> >     and in use. However, snapshots only capture data that has been
> >     written to your Amazon EBS volume, which might exclude any data
> >     that has been locally cached by your application or OS."
> >
> >     "By design, an EBS Snapshot of an entire 16 TB volume should take
> >     no longer than the time it takes to snapshot an entire 1 TB volume.
> >     However, the actual time taken to create a snapshot depends on
> >     several factors including the amount of data that has changed
> >     since the last snapshot of the EBS volume."
> >
> > ... though I'm not entirely sure how to interpret that last bit about
> > "time taken to create a snapshot depends on... the amount of data that
> > has changed since the last snapshot"; the _first half of that statement_
> > reads as "creating a snapshot is constant time", which basically screams
> > to me "copy-on-write just like LVM, and they're probably implemented
> > in terms of LVM".
> >
> > Any insight here as to whether my gut is correct on this, or whether
> > I'm actually likely to notice an impact from hourly snapshots of, say,
> > a 200-GB volume? How about a 1-TB volume?
> >
> > The only thing I'm seeing from Amazon that seems to _vaguely_ support
> > (maybe) the notion that `snapshotting too often' would be something to
> > worry about is this bit from elsewhere in that same FAQ page (under the
> > heading of "performance" and a subheading of "performance consistency
> > of my HDD-backed volumes", whereas the others were under the heading
> > of "snapshots"):
> >
> >     Another factor is taking a snapshot which will decrease expected
> >     write performance down to the baseline rate, until the snapshot
> >     completes.
> >
> > ... and, taken in the context of the previously-cited notes about
> > snapshots being `not based on volume-size but maybe influenced by
> > changed-since-last-snapshot set size' (and in the context of the
> > explanations they give for HDD-backed vs. SSD-backed storage),
> > I'm basically reading that as:
> >
> >     `if you're using HDD-backed storage then it's because you care
> >     about *throughput* more than *response time* and are likely to be
> >     monitoring throughput, and if you're monitoring throughput you may
> >     notice a *momentary dip in throughput* as the *HDDs* need to seek
> >     around to find the volume boundaries and set up the COW records.'
> >
> > Even if you don't have any insight into what's actually happening
> > under the covers at Amazon, does my reading of all of this sound
> > right to you?
> >
> > And, perhaps more interestingly, are these same caveats from Amazon
> > generally applicable to LVM?
> _______________________________________________
> gnhlug-discuss mailing list
> gnhlug-discuss@mail.gnhlug.org
> http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/

-- 
"Don't be afraid to ask (λf.((λx.xx) (λr.f(rr))))."
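
For anyone wanting to double-check the ratios quoted in the reply above,
here is a minimal sketch in POSIX shell. It assumes the 128G passed to
lvcreate is the size figure behind the "0.63x" (the thread does not state
the origin volume's exact size), and the variable names are made up for
illustration.

```shell
#!/bin/sh
# Figures taken from the thread: a 30-minute Glance snapshot of an 80GB VM
# versus the 0.349s "real" time reported for `lvcreate --snapshot`.
# Assumption: 128 (the --size argument) is used as the size of the fast case.
slow_seconds=$((30 * 60))   # 30 minutes, expressed in seconds
fast_seconds=0.349          # wall-clock time from time(1)
slow_gb=80
fast_gb=128

# POSIX shell arithmetic is integer-only, so awk does the divisions.
slowdown=$(awk -v s="$slow_seconds" -v f="$fast_seconds" \
    'BEGIN { printf "%.0f", s / f }')
size_ratio=$(awk -v a="$slow_gb" -v b="$fast_gb" \
    'BEGIN { printf "%.3f", a / b }')

echo "slowdown:   ${slowdown}x"
echo "size ratio: ${size_ratio}x"
```

Run as written, this should report a slowdown of roughly 5158x and a size
ratio of 0.625x, in line with the "5 thousand" and "0.63x" figures above.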