On 07.05.2014 at 03:45, Fam Zheng wrote:
> On Tue, 05/06 10:32, Fam Zheng wrote:
> > On mounted NFS filesystem, ftruncate is much much slower than doing a
> > zero write. Changing this significantly speeds up cluster allocation.
> >
> > Comparing by converting a cirros image (296M) to VMDK on an NFS mount
> > point, over 1Gbe LAN:
> >
> >     $ time qemu-img convert cirros-0.3.1.img /mnt/a.raw -O vmdk
> >
> > Before:
> >     real    0m26.464s
> >     user    0m0.133s
> >     sys     0m0.527s
> >
> > After:
> >     real    0m2.120s
> >     user    0m0.080s
> >     sys     0m0.197s
> >
> > Signed-off-by: Fam Zheng <f...@redhat.com>
> >
> > ---
> > V2: Fix cluster_offset check. (Kevin)
> >
> > Signed-off-by: Fam Zheng <f...@redhat.com>
> > ---
> >  block/vmdk.c | 19 ++++++++++++++-----
> >  1 file changed, 14 insertions(+), 5 deletions(-)
> >
> > diff --git a/block/vmdk.c b/block/vmdk.c
> > index 06a1f9f..98d2d56 100644
> > --- a/block/vmdk.c
> > +++ b/block/vmdk.c
> > @@ -1037,6 +1037,7 @@ static int get_cluster_offset(BlockDriverState *bs,
> >      int min_index, i, j;
> >      uint32_t min_count, *l2_table;
> >      bool zeroed = false;
> > +    int64_t ret;
> >
> >      if (m_data) {
> >          m_data->valid = 0;
> > @@ -1110,12 +1111,20 @@ static int get_cluster_offset(BlockDriverState *bs,
> >              }
> >
> >              /* Avoid the L2 tables update for the images that have snapshots. */
> > -            *cluster_offset = bdrv_getlength(extent->file);
> > +            ret = bdrv_getlength(extent->file);
> > +            if (ret < 0 ||
> > +                ret & ((extent->cluster_sectors << BDRV_SECTOR_BITS) - 1)) {
> > +                return VMDK_ERROR;
> > +            }
> > +            *cluster_offset = ret;
> >              if (!extent->compressed) {
> > -                bdrv_truncate(
> > -                    extent->file,
> > -                    *cluster_offset + (extent->cluster_sectors << 9)
> > -                );
> > +                ret = bdrv_write_zeroes(extent->file,
> > +                        *cluster_offset >> BDRV_SECTOR_BITS,
> > +                        extent->cluster_sectors,
> > +                        0);
>
> Hi Stefan,
>
> By considering a bdrv_write_zeroes as a pre-write, it in general doubles the
> write for the whole image, so it's not a good solution.
>
> A better way would be removing the bdrv_truncate and require the caller to do
> full cluster write (with a bounce buffer if necessary).

Doesn't get_whole_cluster() already ensure that you write a full cluster
to the image file?

However, it might be better not to call bdrv_getlength() each time you
need a new cluster, but instead to keep the next free cluster offset in a
field in VmdkExtent (rounded up to a cluster boundary in vmdk_open). This
will ensure that the next cluster allocation doesn't overlap the current
one in case get_whole_cluster() fails halfway through. (In fact, the L2
table should only be updated after get_whole_cluster() has succeeded, but
we can do both to be on the safe side...)

Kevin
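
To make the suggestion above concrete, here is a rough, standalone sketch of the bookkeeping. It is not the actual block/vmdk.c code: ExtentSketch, next_cluster_offset, extent_open() and extent_alloc_cluster() are hypothetical names made up for illustration, and SECTOR_BITS merely stands in for BDRV_SECTOR_BITS.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS 9  /* 512-byte sectors; stands in for BDRV_SECTOR_BITS */

/* Hypothetical stand-in for VmdkExtent with only the fields needed here. */
typedef struct {
    int64_t cluster_sectors;      /* cluster size in sectors */
    int64_t next_cluster_offset;  /* hypothetical: next free byte offset */
} ExtentSketch;

/* Round a file length up to the next cluster boundary. */
static int64_t round_up_to_cluster(int64_t length, int64_t cluster_bytes)
{
    return (length + cluster_bytes - 1) / cluster_bytes * cluster_bytes;
}

/* Done once at open time, in place of calling bdrv_getlength() on
 * every allocation. */
static void extent_open(ExtentSketch *e, int64_t file_length,
                        int64_t cluster_sectors)
{
    assert(cluster_sectors > 0 && file_length >= 0);
    e->cluster_sectors = cluster_sectors;
    e->next_cluster_offset =
        round_up_to_cluster(file_length, cluster_sectors << SECTOR_BITS);
}

/* Hand out the next free cluster and advance the counter right away.
 * If the subsequent write of this cluster fails halfway through, the
 * next allocation still starts at a fresh, aligned offset. */
static int64_t extent_alloc_cluster(ExtentSketch *e)
{
    int64_t offset = e->next_cluster_offset;
    e->next_cluster_offset += e->cluster_sectors << SECTOR_BITS;
    return offset;
}
```

Because the offset is rounded up once at open time and only ever advanced by whole clusters, the alignment check from the patch would no longer be needed on each allocation in such a scheme.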