Hi, I added a few clarifications inline. The relevant calculation is: header_size + l2_tables_size + refcounts_tables_size + data_size
and it is also described inline. I will be happy if you can confirm this
calculation is acceptable.

Thanks,
Maor

On Wed, Feb 22, 2017 at 6:15 PM, Maor Lipchuk <mlipc...@redhat.com> wrote:
> Hi all,
>
> Thank you very much for your help, it was very helpful.
> We adopted John Snow's advice and implemented our own calculation so we
> can resolve the issue now.
> We plan to drop this code once we can get this estimation from qemu-img.
>
> This is the link to the patch introducing the calculation:
> https://gerrit.ovirt.org/#/c/65039/14/lib/vdsm/storage/qcow2.py
>
> And here are links to the tests that we added:
> https://gerrit.ovirt.org/#/c/65039/14/tests/storage_qcow2_test.py
>
> Here is how the calculation goes:
>
> We first use qemuimg.map to get the used clusters, counting all the
> clusters for each run returned from qemuimg.map(filename):
>
> def count_clusters(runs):

Just a clarification: runs are the output of qemuimg.map(filename).
Here is the code on GitHub that implements it:
https://github.com/oVirt/vdsm/blob/master/lib/vdsm/qemuimg.py

>     count = 0
>     last = -1
>     for r in runs:
>         # Find the clusters where start and end are located.
>         start = r["start"] // CLUSTER_SIZE
>         end = (r["start"] + r["length"]) // CLUSTER_SIZE
>         if r["data"]:
>             if start == end:
>                 # This run is smaller than a cluster. If we have several
>                 # runs in the same cluster, we want to count the cluster
>                 # only once.
>                 if start != last:
>                     count += 1
>             else:
>                 # This run spans multiple clusters - we want to count all
>                 # the clusters this run touches.
>                 count += end - start
>             last = end
>     return count
>
>
> The following calculation is based on Kevin's comments on the original
> bug, and the qcow2 spec:
> https://github.com/qemu/qemu/blob/master/docs/specs/qcow2.txt:
>
> header_size = 3 * CLUSTER_SIZE
>
> virtual_size = os.stat(filename).st_size
>
> # Each 512 MiB has one l2 table (64K).
> l2_tables = (virtual_size + (512 * 1024**2) - 1) // (512 * 1024**2)
> l2_tables_size = l2_tables * CLUSTER_SIZE
>
> # Each cluster has a refcount entry (16 bits) in the refcount tables. It
> # is not clear what the size of the refcount table is; let's assume it is
> # the same size as the l2 tables.
> refcounts_tables_size = l2_tables_size

The calculation is missing two more lines:

    data_size = used_clusters * CLUSTER_SIZE
    return header_size + l2_tables_size + refcounts_tables_size + data_size

>
>
> After we calculate the estimated size we apply the following logic and
> multiply by 1.1:
>
>     chunk_size = config.getint("irs",
>                                "volume_utilization_chunk_mb")
>     chunk_size = chunk_size * sc.MEGAB
>     newsize = (estimate_size + chunk_size) / sc.BLOCK_SIZE
>     self.log.debug("Estimated allocation for qcow2 volume:"
>                    "%d bytes", newsize)
>     newsize = newsize * 1.1
>
> This last calculation can be ignored; it is mainly safety space we add.
>
> Please let me know if that calculation is acceptable and makes sense
> for this use case.
>
> Thanks,
> Maor
>
>
>>> On Mon, Feb 20, 2017 at 1:07 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>>>> On Wed, Feb 15, 2017 at 05:49:58PM +0200, Nir Soffer wrote:
>>>>> On Wed, Feb 15, 2017 at 5:14 PM, Stefan Hajnoczi <stefa...@gmail.com>
>>>>> wrote:
>>>>> > On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
>>>>> >> I was wondering if it is possible to provide a new API that
>>>>> >> estimates the size of a
>>>>> >> qcow2 image converted from a raw image. We could use this new API to
>>>>> >> allocate the
>>>>> >> size more precisely before the convert operation.
>>>>> >>
>>>>> > [...]
>>>>> >> We think that the best way to solve this issue is to return this info
>>>>> >> from qemu-img, maybe as a flag to qemu-img convert that will
>>>>> >> calculate the size of the converted image without doing any writes.
>>>>> >
>>>>> > Sounds reasonable. qcow2 actually already does some of this calculation
>>>>> > internally for image preallocation in qcow2_create2().
>>>>> >
>>>>> > Let's try this syntax:
>>>>> >
>>>>> >   $ qemu-img query-max-size -f raw -O qcow2 input.raw
>>>>> >   1234678000
>>>>>
>>>>> This is a little bit verbose compared to other commands
>>>>> (e.g. info, check, convert).
>>>>>
>>>>> Since this is needed only during convert, maybe this can be
>>>>> a convert flag?
>>>>>
>>>>>   qemu-img convert -f xxx -O yyy src dst --estimate-size --output json
>>>>>   {
>>>>>       "estimated size": 1234678000
>>>>>   }
>>>>
>>>> What is dst? It's a dummy argument.
>>>>
>>>> Let's not try to shoehorn this new sub-command into qemu-img convert.
>>>>
>>>>> > As John explained, it is only an estimate. But it will be a
>>>>> > conservative maximum.
>>>>> >
>>>>> > Internally BlockDriver needs a new interface:
>>>>> >
>>>>> > struct BlockDriver {
>>>>> >     /*
>>>>> >      * Return a conservative estimate of the maximum host file size
>>>>> >      * required by a new image given an existing BlockDriverState (not
>>>>> >      * necessarily opened with this BlockDriver).
>>>>> >      */
>>>>> >     uint64_t (*bdrv_query_max_size)(BlockDriverState *other_bs,
>>>>> >                                     Error **errp);
>>>>> > };
>>>>> >
>>>>> > This interface allows individual block drivers to probe other_bs in
>>>>> > whatever way necessary (e.g. querying block allocation status).
>>>>> >
>>>>> > Since this is a conservative max estimate there's no need to read all
>>>>> > data to check for zero regions. We should give the best estimate that
>>>>> > can be generated quickly.
>>>>>
>>>>> I think we need to check allocation (e.g. with SEEK_DATA); I hope this
>>>>> is what you mean by not reading all data.
>>>>
>>>> Yes, allocation data must be checked. But it will not read data
>>>> clusters from disk to check if they contain only zeroes.
>>>>
>>>> Stefan
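
---

For reference, the pieces quoted above can be combined into one self-contained
sketch. CLUSTER_SIZE and the function names here are illustrative assumptions;
the authoritative implementation lives in lib/vdsm/storage/qcow2.py.

```python
CLUSTER_SIZE = 64 * 1024  # qcow2 default cluster size


def count_clusters(runs):
    """Count clusters holding data, given runs shaped like the output of
    qemu-img map --output json: dicts with "start", "length", "data" keys."""
    count = 0
    last = -1
    for r in runs:
        # Find the clusters where the run starts and ends.
        start = r["start"] // CLUSTER_SIZE
        end = (r["start"] + r["length"]) // CLUSTER_SIZE
        if r["data"]:
            if start == end:
                # Run smaller than a cluster: count the cluster only once
                # even if several runs fall into it.
                if start != last:
                    count += 1
            else:
                # Run spans multiple clusters: count every one it touches.
                count += end - start
            last = end
    return count


def estimate_qcow2_size(virtual_size, used_clusters):
    """Conservative qcow2 file size estimate, per the calculation above."""
    header_size = 3 * CLUSTER_SIZE
    # One 64K L2 table per 512 MiB of virtual size, rounded up.
    l2_tables = (virtual_size + (512 * 1024**2) - 1) // (512 * 1024**2)
    l2_tables_size = l2_tables * CLUSTER_SIZE
    # Assume the refcount tables need about as much space as the L2 tables.
    refcounts_tables_size = l2_tables_size
    data_size = used_clusters * CLUSTER_SIZE
    return header_size + l2_tables_size + refcounts_tables_size + data_size
```

As a sanity check: for a 1 GiB raw image whose map reports three used
clusters, this yields 3 * 64K header + two 64K L2 tables + the same for
refcounts + 3 * 64K data = 655360 bytes.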