On Fri, Jul 22, 2011 at 3:21 PM, Frediano Ziglio <fredd...@gmail.com> wrote:
> 2011/7/22 Stefan Hajnoczi <stefa...@gmail.com>:
>> On Fri, Jul 22, 2011 at 10:13 AM, Frediano Ziglio <fredd...@gmail.com> wrote:
>>> 2011/7/22 Kevin Wolf <kw...@redhat.com>:
>>>> Am 21.07.2011 18:17, schrieb Frediano Ziglio:
>>>>> Hi,
>>>>> after a snapshot is taken currently many write operations are quite
>>>>> slow due to
>>>>> - refcount updates (decrement old and increment new )
>>>>> - cluster allocation and file expansion
>>>>> - read-modify-write on partial clusters
>>>>>
>>>>> I found 2 way to improve refcount performance
>>>>>
>>>>> Method 1 - Lazy count
>>>>> Mainly do not take into account count for current snapshot, that is
>>>>> current snapshot counts as 0. This would require to add a
>>>>> current_snapshot in header and update refcount when current is changed.
>>>>> So for these operation
>>>>> - creating snapshot, performance are the same, just increment for old
>>>>> snapshot instead of the new one
>>>>> - normal write operations. As current snaphot counts as 0 there is not
>>>>> operations here so do not write any data
>>>>> - changing current snapshot, this is the worst case, you have to
>>>>> increment for the current snapshot and decrement for the new so it will
>>>>> take twice
>>>>> - deleting snapshot, if is the current just set current_snapshot to a
>>>>> dummy not existing value, if is not the current just decrement counters,
>>>>> no performance changes
>>>>
>>>> How would you do cluster allocation if you don't have refcounts any more
>>>> that can tell you if a cluster is used or not?
>>>>
>>>
>>> You have refcount, is only that current snapshot counts as 0. An
>>> example may help, start with "A" snapshot A counts as zero so all
>>> refcounts are 0, now we create a snapshot "B" and make it current so
>>> refcounts are 1
>>>
>>> A --- B
>>>
>>> If you change a cluster in snapshot "B" counts are still 1. If you go
>>> back to "A" counters are increment (cause you leave B) and then
>>> decrement (cause you enter in A).
>>>
>>> Perhaps the problem is how to distinguish 0 from "allocated in
>>> current" and "not allocated". Yes, with which I suppose above it's a
>>> problem, but we can easily use -1 as not allocated. If current and
>>> refcount 0 mark as -1, if not current we would have to increment
>>> counters of current, mark current as -1 than decrement for deleting,
>>> yes in this case you have twice the time.
>>
>> I'm not sure I follow your last sentence but just having a different
>> refcount value for "not allocated" vs "allocated" means allocating
>> write requests will need to update refcounts.
>>
>
> Now you have 0 for not allocated and >0 for allocated. If you assume
> current snapshot counting as 0 a 0 in refcount could mean an allocated
> cluster in current snapshot not shared with other snapshots and if you
> don't use -1 could be also a not allocated cluster.
> Thinking in another way is not that you don't update refcounts but you
> update refcounts with 0 addend (that's practically not changing
> refcounts).
> Question was: is possible to use this trick?
>
>> But are non-append allocations common enough that we should bother
>> with them in the allocating write path? Can we append to the end of
>> the image file for allocating writes and handle defragmentation
>> elsewhere (i.e. get rid of unallocated clusters in the middle of the
>> file)?
>>
>> Stefan
>>
>
> I think so but is better to have a way to know if a cluster is
> allocated without having to scan all l2 tables.

You don't need to scan *all* L2 tables. You only need to scan the L2
tables of the current snapshot, because under the lazy scheme the
current "snapshot" is the only one whose references are not reflected
in the refcounts.

Stefan
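
To make the point about scanning only the current L2 tables concrete,
here is a rough sketch in Python. It is only an illustration under
stated assumptions, not qcow2 code: refcounts is a plain list holding
the lazy refcount of each cluster (references from the current snapshot
are not counted), each L2 table is a list of cluster indexes with None
for unmapped entries, and the function names are made up for this
example. Metadata clusters (header, L1/L2 tables, refcount blocks) are
ignored for simplicity.

```python
from typing import Iterable, List, Optional, Set


def clusters_used_by_current(
    current_l2_tables: Iterable[List[Optional[int]]],
) -> Set[int]:
    """Collect every cluster referenced by the current snapshot's L2 tables."""
    used: Set[int] = set()
    for table in current_l2_tables:
        for entry in table:
            if entry is not None:  # None models an unmapped L2 entry
                used.add(entry)
    return used


def find_free_cluster(
    refcounts: List[int],
    current_l2_tables: Iterable[List[Optional[int]]],
) -> int:
    """Return the index of a reusable cluster, or the append position.

    With lazy counting, refcount 0 is ambiguous: the cluster is either
    free or owned only by the current snapshot. Scanning the current L2
    tables (and no others) is enough to tell the two cases apart.
    """
    used = clusters_used_by_current(current_l2_tables)
    for index, count in enumerate(refcounts):
        if count == 0 and index not in used:
            return index
    # No hole found: allocate by appending to the end of the image.
    return len(refcounts)
```

The set of clusters used by the current snapshot could be built once
when the image is opened and then kept up to date in memory, so the
scan would not have to be repeated on every allocating write.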