On 11/17/2014 05:06 AM, Max Reitz wrote: >> Umm, that sounds backwards from what you document. It's a good test of >> the _new_ reftable needing a second round of allocations. So keep it >> with corrected comments. But I think you _intended_ to write a test >> that starts with a refcount_width=64 image and resize to a >> refcount_width=1, where the _old_ reftable then suffers a reallocation >> as part of allocating refblocks for the new table. It may even help if >> you add a tracepoint for every iteration through the walk function >> callback, to prove we are indeed executing it 3 times instead of the >> usual 2, for these test cases. > > I'm currently thinking about a way to test the old reftable reallocation > issue, and I can't find any. So, for the old reftable to require a > reallocation it must grow. For it to grow we need some allocation beyond > what it can currently represent. For this to happen during the refblock > allocation walk, this allocation must be the allocation of a new refblock. > > If the refblock is allocated beyond the current reftable's limit, this > means that either all clusters between free_cluster_index and that point > are already taken. If the reftable is then reallocated, it will > therefore *always* be allocated behind that refblock, which is beyond > its old limit. Therefore, that walk through the old reftable will never > miss that new allocation. > > So the issue can only occur if the old reftable is resized after the > walk through it, that is, when allocating the new reftable. That is > indeed an issue but I think it manifests itself basically like the issue > I'm testing here: There is now an area in the old refcount structures > which was free before but has is used now, and the allocation causing > that was the allocation of the new reftable. The only difference is > whether the it's the old or the new reftable that resides in the > previously free area. Thus, I think I'll leave it at this test – but if > you can describe to me how to create an image for a different "rewalk" > path, I'm all ears.
===== The test you wrote does: original image, pre-walk: reftable is one cluster; with one refblock and 63 zero entries that refblock holds 4096 width-1 refcounts; of those, the first 63 are non-zero, the remaining are zero. Image is 32256 bytes long During the first walk, we call operation() 64 times - the first time with refblock_empty false, the remaining 63 times with refblock_empty true. after first walk but before reftable allocation, we have allocated one refblock that holds 64 width-64 refcounts (all zero, because we don't populate them until the final walk); and the old table now has 64 refcounts populated. Image is 32768 bytes long. Then we allocating a new reftable; so far, we only created one refblock for it to hold, so one cluster is sufficient. The allocation causes the old table to now have 65 refcounts populated. Image is now 33280 bytes long. On the second pass, we call operation() 64 times; now the first two walks have refblock_empty as false, which means we allocate a new refblock. This allocation causes the old table to now have 66 refcounts populated. Image is now 33792 bytes long. So we free our first attempt at a new reftable, and allocate another (a single cluster is still sufficient to hold two refblocks); I'm not sure whether this free/realloc will reuse cluster 65 or if it will pick up cluster 67 and leave a hole in 65. [I guess it depends on whether cluster allocation is done by first-fit analysis or whether it blindly favors allocating at the end of the image]. Either way, we have to do a third iteration, because the second iteration allocated a refblock and "reallocated" a reftable. On the third pass, operation() is still called 64 times, but because the only two calls with refblock_empty as false already have an allocated refblock, no further allocations are needed, and we are done with the do loop; the fourth walk can set refcounts. ===== The test I thought you were writing would start original image, pre-walk: reftable is one cluster; with one refblock and 63 zero entries that refblock holds 64 width-64 refcounts; of those, the first 63 are non-zero, the remaining are zero. Image is 32256 bytes long During the first walk, we call operation() 1 time, with refblock_empty false. after first walk but before reftable allocation, we have allocated one refblock that holds 4096 width-1 refcounts (all zero, because we don't populate them until the final walk); and the old table now has 64 refcounts populated. Image is 32768 bytes long. Then we allocating a new reftable; so far, we only created one refblock for it to hold, so one cluster is sufficient. The allocation causes the old table to now have 66 refcounts populated (one for the new refblock, but also one for an additional refblock in the old table because the first refblock was full). Image is now 33792 bytes long. On the second pass, we call operation() 1 time with refblock_empty as false, so we don't need any allocation. Which means the test you wrote is correct, while my idea does NOT trigger the third walk, at least not for the initial file size of 32256. You've been vindicated, you did it correctly :) ===== Now, in response to your question about some other 3-pass inducing pattern, let's think back to v1, where you questioned what would happen if a hole in the reftable gets turned into data due to a later allocation. Let's see if I can come up with a scenario for that... Let's stick with a cluster size of 512, and use 32-bit and 64-bit widths as our two sizes. If we downsize from 64 to 32 bits, then every two refblock clusters in the old table results in one call to operation() for the new table; conversely, if we upsize, then every refblock cluster in the old table gives two calls to operation() in the new table. The trick at hand is to come up with some image where we punch a hole so that on the first pass, we call operation() with refblock_empty true for one iteration (necessarily a call later than the first, since the image header guarantees the first refblock is not empty), but where we have data after the hole, where it is the later data that triggers the allocation that will finally start to fill the hole. How about starting with an image that occupies between 1.5 and 2 refblocks worth of 32-width clusters (so an image anywhere between 193 and 256 clusters, or between 98816 and 131072 bytes). You should be able to figure out how many clusters this consumes for L1, L2, plus 1 for header, reftable, and 2 for refblocks, in order to figure out how many remaining clusters are dedicated to data; ideally, the data clusters are contiguous, and occupy a swath that covers at least clusters 126 through 192. Widening to 64-bit width will require 4 refblocks instead of 2, if all refblocks are needed. But the whole idea of punching a hole is that we don't need a refblock if it will be all-zero entries. So take this original image, and discard the data clusters from physical index 126 through 192, (this is NOT the data visible at guest offset 31744, but whatever actual offset of guest data that maps to physical offset 31744). The old reftable now looks like { refblock_o1 [0-125 occupied, 126 and 127 empty], refblock_o2 [128-191 empty, 192-whatever occupied, tail empty] }. With no allocations required, this would in turn would map to the following new refblocks: { refblock_n1 [0-64 occupied], refblock_n2 [65-125 occupied, 126-127 empty], NULL, refblock_n4 [192-whatever occupied] }. Note that we do not need to allocate refblock_n3 because of the hole in the old refblock; we DO end up allocating three refblocks, but in the sequence of things, refblock_n1 and refblock_n2 are allocated while we are visiting refblock_o1 and still fit in refblock_o1, while refblock_n4 is not allocated until after we have already passed over the first half of refblock_o2. Thus, the second walk over the image will see that we need to allocate refblock_n3 because it now contains entries (in particular, the entry for refblock_n4, but also the 1-cluster entry for the proposed reftable that is allocated between the walks). So, while your test used the allocation of the reftable as the spillover point, my scenario here uses the allocation of later refblocks as the spillover point that got missed during the first iteration. which means the reftable now looks like { refblock1, NULL, refblock3, NULL... }; and where refblock1 now has at least two free entries (possibly three, if the just-freed refblock2 happened to live before cluster 62). is we can also free refblock2 -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature