On 2014-10-23 at 09:03, Kevin Wolf wrote:
Am 23.10.2014 um 00:18 hat Eric Blake geschrieben:
On 10/21/2014 03:24 AM, Max Reitz wrote:
On 2014-10-21 at 10:04, Zhang Haoyu wrote:
Use local variable to bdrv_pwrite_sync L1 table,
needless to make conversion of cached L1 table between
big-endian and host style.
Signed-off-by: Zhang Haoyu <zhan...@sangfor.com>
---
block/qcow2-refcount.c | 22 +++++++---------------
1 file changed, 7 insertions(+), 15 deletions(-)
I know we're up to v5 and that Max already took it into his branch, but...
l1_size2 = l1_size * sizeof(uint64_t);
+ l1_table = g_try_malloc0(align_offset(l1_size2, 512));
I wanted to propose using qemu_try_blockalign(), but since it'd require
a memset() afterwards, it gets rather ugly.
Not after this recent patch:
https://lists.gnu.org/archive/html/qemu-devel/2014-10/msg02499.html
We can switch to qemu_try_blockalign0() in a follow-up patch.
But this is far from being a fast path, so there is very little to gain
anyway.
We don't need the 0 variant anyway. In v5, the one I applied, it's just
qemu_try_blockalign(). The size does not have to be aligned to
BDRV_SECTOR_SIZE, since any access index is below l1_size and only
bdrv_pread() and bdrv_pwrite() are used, so we can first omit the
align_offset(). Second, all the rest (0 .. l1_size - 1) is overwritten
either by memcpy() or bdrv_pread(). Therefore, no 0-ing required and
qemu_try_blockalign() was fine.
There are things we can do in follow-up patches to optimize v5 (getting
read of the memcpy()s), but as Kevin said (and myself before that in
reply to v5), this function is not performance-critical, so there's no
real point in doing so (other than "Clearly unoptimized code makes my
fingers twitch").
Max