Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2013-01-23 Thread Kevin Wolf
On 22.01.2013 15:14, Kevin Wolf wrote:
 The series of requests to get into this state is somewhat tricky as it
 requires a specific physical ordering of clusters in the image file:
 
 Host cluster h: Guest cluster g is mapped to here
 Host cluster h + 1: Free, can be allocated for not yet mapped g + 1
 Host cluster h + 2: Guest cluster g + 2 is mapped to here
 
 I can get this with this command on a fresh qcow2 image (64k cluster size):
 
 ./qemu-io \
   -c 'write -P 0x66 0 64k' \
   -c 'write 64k 64k' \
   -c 'write -P 0x88 128k 64k' \
   -c 'discard 64k 64k' \
   -c 'write -P 0x77 0 160k' \
   /tmp/test.qcow2
 
 Now I get an additional COW for the area from 96k to 128k. However, this
 alone doesn't corrupt an image - a copy on write by itself is harmless
 because it only rewrites the data that is already there.
 
 The two things I'm going to check next are:
 
 a) What happens with concurrent requests now, are requests for the
same area still correctly serialised?
 
 b) Can I modify the parameters so that copy_sectors() is working
with values it wasn't designed for so that the data is copied to
a wrong place or something like that. m->nb_available is used as
an argument to it, so it certainly seems realistic.

I can reliably reproduce a bug with a) at least. It requires some
backports to qemu-io in order to get more control of the timing of AIO
requests, but then it works like this:

write -P 0x66 0 320k
write 320k 128k
write -P 0x88 448k 128k
discard 320k 128k
aio_flush

break cow_write A
aio_write -P 0x77 0 480k
wait_break A
write -P 0x99 480k 32k
resume A
aio_flush

read -P 0x77 0 480k
read -P 0x99 480k 32k
read -P 0x88 512k 64k

What happens is that the write of 0x99 to 480k doesn't detect an overlap
with the running AIO request (because obviously it doesn't overlap), but
the COW is actually taking place in the wrong cluster and therefore
unprotected. Stop the AIO request between the COW read and the COW write
to write 0x99 to that area, and the wrong COW will overwrite it, i.e.
corrupt the image.
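
As a rough illustration (this is a hypothetical sketch, not the actual qcow2
code; all names here are made up), the serialisation boils down to a range
check keyed on the guest clusters the in-flight allocation claims to cover,
so a COW landing outside that range escapes it:

/* Hypothetical sketch, not the qcow2 code: overlap detection keyed on the
 * guest cluster range of an in-flight allocation. A COW that (because of
 * the bogus nb_available) touches a cluster outside this range is not
 * covered by the check and therefore not serialised against new writes. */
#include <stdbool.h>
#include <stdint.h>

struct inflight_alloc {
    uint64_t first_cluster;   /* first guest cluster of the allocation */
    uint64_t nb_clusters;     /* number of guest clusters it claims */
};

static bool ranges_overlap(uint64_t a, uint64_t a_len,
                           uint64_t b, uint64_t b_len)
{
    return a < b + b_len && b < a + a_len;
}

/* A new allocating write must wait only if its own guest range overlaps
 * the range tracked for the in-flight allocation. */
static bool must_wait(const struct inflight_alloc *old,
                      uint64_t new_first_cluster, uint64_t new_nb_clusters)
{
    return ranges_overlap(old->first_cluster, old->nb_clusters,
                          new_first_cluster, new_nb_clusters);
}

int main(void)
{
    /* Hypothetical numbers: an in-flight allocation covering guest
     * clusters 5..6; a new write touching only cluster 7 does not
     * overlap it, so it is not serialised against the in-flight COW. */
    struct inflight_alloc a = { .first_cluster = 5, .nb_clusters = 2 };
    return must_wait(&a, 7, 1) ? 1 : 0;
}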

Basing a real test case on this is not trivial, though, because I have
to stop on the blkdebug event cow_write - which obviously isn't even
emitted in the fixed version.

I still have to look into option b)

Kevin



Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2013-01-23 Thread Kevin Wolf
On 23.01.2013 12:43, Kevin Wolf wrote:
 On 22.01.2013 15:14, Kevin Wolf wrote:
 The series of requests to get into this state is somewhat tricky as it
 requires a specific physical ordering of clusters in the image file:

 Host cluster h: Guest cluster g is mapped to here
 Host cluster h + 1: Free, can be allocated for not yet mapped g + 1
 Host cluster h + 2: Guest cluster g + 2 is mapped to here

 I can get this with this command on a fresh qcow2 image (64k cluster size):

 ./qemu-io \
   -c 'write -P 0x66 0 64k' \
   -c 'write 64k 64k' \
   -c 'write -P 0x88 128k 64k' \
   -c 'discard 64k 64k' \
   -c 'write -P 0x77 0 160k' \
   /tmp/test.qcow2

 Now I get an additional COW for the area from 96k to 128k. However, this
 alone doesn't corrupt an image - a copy on write by itself is harmless
 because it only rewrites the data that is already there.

 The two things I'm going to check next are:

 a) What happens with concurrent requests now, are requests for the
same area still correctly serialised?

 b) Can I modify the parameters so that copy_sectors() is working
with values it wasn't designed for so that the data is copied to
a wrong place or something like that. m->nb_available is used as
an argument to it, so it certainly seems realistic.
 
 I can reliably reproduce a bug with a) at least. It requires some
 backports to qemu-io in order to get more control of the timing of AIO
 requests, but then it works like this:
 
 write -P 0x66 0 320k
 write 320k 128k
 write -P 0x88 448k 128k
 discard 320k 128k
 aio_flush
 
 break cow_write A
 aio_write -P 0x77 0 480k
 wait_break A
 write -P 0x99 480k 32k
 resume A
 aio_flush
 
 read -P 0x77 0 480k
 read -P 0x99 480k 32k
 read -P 0x88 512k 64k
 
 What happens is that the write of 0x99 to 480k doesn't detect an overlap
 with the running AIO request (because obviously it doesn't overlap), but
 the COW is actually taking place in the wrong cluster and therefore
 unprotected. Stop the AIO request between the COW read and the COW write
 to write 0x99 to that area, and the wrong COW will overwrite it, i.e.
 corrupt the image.
 
 Basing a real test case on this is not trivial, though, because I have
 to stop on the blkdebug event cow_write - which obviously isn't even
 emitted in the fixed version.
 
 I still have to look into option b)

This is the reproducer using b):

write -P 0x66 0 320k
write 320k 128k
write -P 0x55 1M 128k
write -P 0x88 448k 128k
discard 320k 128k
aio_flush

write -P 0x77 0 480k
aio_flush

read -P 0x77 0 480k
read -P 0x88 480k 96k
read -P 0x55 1M 128k

Not sure if a stable branch for 1.1 is still maintained, but I'm copying
qemu-stable just in case. Commit b7ab0fea actually fixes a qcow2 data
corruption issue (even if only under rare circumstances) and should
definitely be cherry-picked for any new 1.1.x release.

qemu 1.0 and older don't have the bug.

Kevin



Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2013-01-23 Thread Michael Tokarev
23.01.2013 17:54, Kevin Wolf wrote:
[...]
 Not sure if a stable branch for 1.1 is still maintained, but I'm copying
 qemu-stable just in case. Commit b7ab0fea actually fixes a qcow2 data
 corruption issue (even if only under rare circumstances) and should
 definitely be cherry-picked for any new 1.1.x release.

I picked it up once Philipp found it after bisection.
I understand that at that time it wasn't known what actually
happens, but since Philipp tested it in his tree, it is a
bugfix, and it was applied to master (actually to a released
version), I thought it was okay. It is also included in the
current Debian packages.

Speaking of 1.1, I really want to make a 1.1.3 release, but I'm
just unsure how to do that... ;)

Thank you Kevin!

/mjt



Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2013-01-22 Thread Kevin Wolf
On 18.12.2012 10:46, Philipp Hahn wrote:
 Hello Kevin, hello Michael,
 
 On Wednesday 12 December 2012 17:54:58 Kevin Wolf wrote:
 On 12.12.2012 15:09, Philipp Hahn wrote:
 On Wednesday 12 December 2012 14:41:49 Kevin Wolf wrote:
 As you can see in the commit message of that patch I was convinced that
 no bug did exist in practice and this was only dangerous with respect to
 future changes. Therefore my first question is if you're using an
 unmodified upstream qemu or if some backported patches are applied to
 it? If it's indeed unmodified, we should probably review the code once
 again to understand why it makes a difference.

 These were all unmodified versions directly from git between
 qemu-kvm-1.1.0 and qemu-kvm-1.2.0

 git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c works,
 git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c~1 is broken.
 git checkout qemu-kvm-1.1.2  is broken,
 git checkout qemu-kvm-1.1.2 ; git cherry-pick
 b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c  works

 Ok, thanks for clarifying. Then I must have missed some interesting case
 while doing the patch.
 
 I think I found your missing link:
 After filling in QCowL2Meta *m, that request is queued:
   QLIST_INSERT_HEAD(&s->cluster_allocs, m, next_in_flight);
 to prevent double-allocating the same cluster for overlapping requests, which
 is checked in do_alloc_cluster_offset().
 I guess that since the sector count was wrong, the overlap detection didn't 
 work and the two concurrent write requests to the same cluster overwrote each 
 other.

I'm still not seeing it. The overlap detection code doesn't use
m->nb_available at all, so why would it make a difference?

I guess I need some people who can decently review code - Laszlo, Paolo,
any idea why upstream commit b7ab0fea does change the behaviour,
contrary to what I claimed in the commit message?

 Ideally we would find a sequence of qemu-io commands to reliably
 reproduce this.
 
 You're the block guru, so I leave that to you (or anybody else who knows more 
 about the working of qemu-io.) ;-)

Yeah, as soon as I know what's happening, calling qemu-io with the right
options shouldn't be a problem.

Kevin



Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2013-01-22 Thread Laszlo Ersek
On 01/22/13 11:25, Kevin Wolf wrote:
 On 18.12.2012 10:46, Philipp Hahn wrote:
 Hello Kevin, hello Michael,

 On Wednesday 12 December 2012 17:54:58 Kevin Wolf wrote:
 On 12.12.2012 15:09, Philipp Hahn wrote:
 On Wednesday 12 December 2012 14:41:49 Kevin Wolf wrote:
 As you can see in the commit message of that patch I was convinced that
 no bug did exist in practice and this was only dangerous with respect to
 future changes. Therefore my first question is if you're using an
 unmodified upstream qemu or if some backported patches are applied to
 it? If it's indeed unmodified, we should probably review the code once
 again to understand why it makes a difference.

 These were all unmodified versions directly from git between
 qemu-kvm-1.1.0 and qemu-kvm-1.2.0

 git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c works,
 git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c~1 is broken.
 git checkout qemu-kvm-1.1.2  is broken,
 git checkout qemu-kvm-1.1.2 ; git cherry-pick
 b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c  works

 Ok, thanks for clarifying. Then I must have missed some interesting case
 while doing the patch.

 I think I found your missing link:
 After filling in QCowL2Meta *m, that request is queued:
   QLIST_INSERT_HEAD(&s->cluster_allocs, m, next_in_flight);
 to prevent double-allocating the same cluster for overlapping requests,
 which is checked in do_alloc_cluster_offset().
 I guess that since the sector count was wrong, the overlap detection didn't
 work and the two concurrent write requests to the same cluster overwrote
 each other.
 
 I'm still not seeing it. The overlap detection code doesn't use
 m->nb_available at all, so why would it make a difference?
 
 I guess I need some people who can decently review code - Laszlo, Paolo,
 any idea why upstream commit b7ab0fea does change the behaviour,
 contrary to what I claimed in the commit message?

As I understand it, the question is not whether b7ab0fea works or not
(it does), the question is whether *without* b7ab0fea there is a real
problem (user visible bug).

Commit b7ab0fea decreased avail_sectors by

keep_clusters << (s->cluster_bits - BDRV_SECTOR_BITS)

which should make a difference for m->nb_available in two cases:

(a)

avail_sectors_patched <
requested_sectors <=
avail_sectors_unpatched

(ie. the decrease in avail_sectors flipped MIN to prefer it over
requested_sectors, lowering m->nb_available)

(b)

avail_sectors_patched <
avail_sectors_unpatched <=
requested_sectors

(ie. MIN had already preferred avail_sectors, and now it went lower).


It seems that

  1 << (s->cluster_bits - BDRV_SECTOR_BITS) == s->cluster_sectors   [1]

Therefore both avail_sectors_patched and avail_sectors_unpatched are
an integral multiple of s->cluster_sectors. Whenever MIN() selects
either of them, the condition in qcow2_alloc_cluster_link_l2() will
evaluate to false.
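
A minimal sketch of the computation under discussion may help here
(reconstructed from the quantities above; it is not the literal
qcow2_alloc_cluster_offset() code, and the surrounding variable handling
is simplified):

/* Sketch reconstructed from the quantities in this analysis; not the
 * literal qcow2_alloc_cluster_offset() code. */
#include <stdbool.h>
#include <stdio.h>

#define BDRV_SECTOR_BITS 9   /* 512-byte sectors */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

static int nb_available(int keep_clusters, int nb_clusters, int n_end,
                        int cluster_bits, bool patched)
{
    int cluster_sectors   = 1 << (cluster_bits - BDRV_SECTOR_BITS);  /* [1] */
    int requested_sectors = n_end - keep_clusters * cluster_sectors;

    /* b7ab0fea drops the keep_clusters term from avail_sectors */
    int avail_sectors = (patched ? nb_clusters : keep_clusters + nb_clusters)
                        << (cluster_bits - BDRV_SECTOR_BITS);

    /* avail_sectors is a multiple of cluster_sectors in both versions, so
     * whenever MIN() picks it, the tail-COW condition in
     * qcow2_alloc_cluster_link_l2() evaluates to false. */
    return MIN(requested_sectors, avail_sectors);
}

int main(void)
{
    /* Example with 64k clusters, keep = 1, nb = 1, n_end = 320 sectors
     * (the 64k-cluster, 160k-write case discussed elsewhere in the
     * thread): unpatched yields 192 (not cluster aligned -> tail COW),
     * patched yields 128 (cluster aligned -> no COW). */
    printf("unpatched: %d, patched: %d\n",
           nb_available(1, 1, 320, 16, false),
           nb_available(1, 1, 320, 16, true));
    return 0;
}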


In case (b) there seems to be no difference between patched & unpatched
in this regard: MIN() always selects an integral multiple, and so
copy_sectors() is never invoked.

In case (a), the patched version skips copy_sectors(), while the
unpatched version invokes it.

I'm not sure if case (a) is possible at all -- let's expand it:

avail_sectors_patched <
requested_sectors <=
avail_sectors_unpatched

Substitute the assigned values:

  nb_clusters << (s->cluster_bits - BDRV_SECTOR_BITS) <
n_end - keep_clusters * s->cluster_sectors <=
(keep_clusters + nb_clusters) << (s->cluster_bits - BDRV_SECTOR_BITS)

Using [1], replace the shifts with multiplication:

nb_clusters * s->cluster_sectors <
  n_end - keep_clusters * s->cluster_sectors <=
  (keep_clusters + nb_clusters) * s->cluster_sectors

Distribute the addition:

nb_clusters * s->cluster_sectors <
  n_end - keep_clusters * s->cluster_sectors <=
  keep_clusters * s->cluster_sectors +
    nb_clusters * s->cluster_sectors

N :=   nb_clusters * s->cluster_sectors [sectors]
K := keep_clusters * s->cluster_sectors [sectors]

  N < n_end - K <= K + N

Add K:

  K + N < n_end <= 2*K + N   [2]

I have no clue about qcow2, but *each one* of K and N seem to be a
function of *both* n_end and the current qcow2 disk state (ie. sectors
being allocated vs. just referenced). IOW, I suspect that case (a) is
possible for some (disk state, n_end) pairs. And case (a) -- if indeed
possible -- seems to contradict the commit message:

  allocation. A COW occurs only if the request wasn't cluster aligned,
  which in turn would imply that requested_sectors was less than
  avail_sectors (both in the original and in the fixed version). In this

In case (a) -- expanded as [2] --, requested_sectors is greater than
avail_sectors_patched, and less than or equal to
avail_sectors_unpatched.

Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2013-01-22 Thread Kevin Wolf
On 22.01.2013 14:17, Laszlo Ersek wrote:
 As I understand it, the question is not whether b7ab0fea works or not
 (it does), the question is whether *without* b7ab0fea there is a real
 problem (user visible bug).

Correct. Or actually not whether, but why.

 Commit b7ab0fea decreased avail_sectors by
 
 keep_clusters << (s->cluster_bits - BDRV_SECTOR_BITS)

 which should make a difference for m->nb_available in two cases:
 
 (a)
 
 avail_sectors_patched <
 requested_sectors <=
 avail_sectors_unpatched

 (ie. the decrease in avail_sectors flipped MIN to prefer it over
 requested_sectors, lowering m->nb_available)

 (b)

 avail_sectors_patched <
 avail_sectors_unpatched <=
 requested_sectors

 (ie. MIN had already preferred avail_sectors, and now it went lower).
 
 
 It seems that
 
   1 << (s->cluster_bits - BDRV_SECTOR_BITS) == s->cluster_sectors   [1]

 Therefore both avail_sectors_patched and avail_sectors_unpatched are
 an integral multiple of s->cluster_sectors. Whenever MIN() selects
 either of them, the condition in qcow2_alloc_cluster_link_l2() will
 evaluate to false.
 
 
 In case (b) there seems to be no difference between patched & unpatched
 in this regard: MIN() always selects an integral multiple, and so
 copy_sectors() is never invoked.
 
 In case (a), the patched version skips copy_sectors(), while the
 unpatched version invokes it.
 
 I'm not sure if case (a) is possible at all -- let's expand it:
 
 avail_sectors_patched <
 requested_sectors <=
 avail_sectors_unpatched

 Substitute the assigned values:

   nb_clusters << (s->cluster_bits - BDRV_SECTOR_BITS) <
 n_end - keep_clusters * s->cluster_sectors <=
 (keep_clusters + nb_clusters) << (s->cluster_bits - BDRV_SECTOR_BITS)

 Using [1], replace the shifts with multiplication:

 nb_clusters * s->cluster_sectors <
   n_end - keep_clusters * s->cluster_sectors <=
   (keep_clusters + nb_clusters) * s->cluster_sectors

 Distribute the addition:

 nb_clusters * s->cluster_sectors <
   n_end - keep_clusters * s->cluster_sectors <=
   keep_clusters * s->cluster_sectors +
     nb_clusters * s->cluster_sectors

 N :=   nb_clusters * s->cluster_sectors [sectors]
 K := keep_clusters * s->cluster_sectors [sectors]

   N < n_end - K <= K + N

 Add K:

   K + N < n_end <= 2*K + N   [2]

Thank you so much, Laszlo, you are great!

I more or less completely failed to see case (a). With your excellent
analysis I can indeed trigger a copy on write when it shouldn't happen.
The series of requests to get into this state is somewhat tricky as it
requires a specific physical ordering of clusters in the image file:

Host cluster h: Guest cluster g is mapped to here
Host cluster h + 1: Free, can be allocated for not yet mapped g + 1
Host cluster h + 2: Guest cluster g + 2 is mapped to here

I can get this with this command on a fresh qcow2 image (64k cluster size):

./qemu-io \
  -c 'write -P 0x66 0 64k' \
  -c 'write 64k 64k' \
  -c 'write -P 0x88 128k 64k' \
  -c 'discard 64k 64k' \
  -c 'write -P 0x77 0 160k' \
  /tmp/test.qcow2

Now I get an additional COW for the area from 96k to 128k. However, this
alone doesn't corrupt an image - a copy on write by itself is harmless
because it only rewrites the data that is already there.
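
As a quick sanity check, these numbers plug straight into inequality [2];
the cluster counts below are my reading of the request (keep_clusters = 1
for the already-mapped first cluster, nb_clusters = 1 for the freshly
allocated one), not values taken from a trace:

/* Check of Laszlo's case (a), i.e. inequality [2], with the numbers from
 * this reproducer (64k clusters, final 'write -P 0x77 0 160k'). */
#include <assert.h>

int main(void)
{
    int cluster_sectors = 64 * 1024 / 512;   /* 128 sectors per cluster */
    int keep_clusters   = 1;                 /* guest cluster g, already mapped */
    int nb_clusters     = 1;                 /* g + 1, freshly allocated */
    int n_end           = 160 * 1024 / 512;  /* 320 sectors requested */

    int K = keep_clusters * cluster_sectors; /* 128 */
    int N = nb_clusters   * cluster_sectors; /* 128 */

    /* [2]: K + N < n_end <= 2*K + N, i.e. 256 < 320 <= 384 -> case (a) */
    assert(K + N < n_end && n_end <= 2 * K + N);
    return 0;
}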

The two things I'm going to check next are:

a) What happens with concurrent requests now, are requests for the
   same area still correctly serialised?

b) Can I modify the parameters so that copy_sectors() is working
   with values it wasn't designed for so that the data is copied to
   a wrong place or something like that. m->nb_available is used as
   an argument to it, so it certainly seems realistic.

 I have no clue about qcow2, but *each one* of K and N seem to be a
 function of *both* n_end and the current qcow2 disk state (ie. sectors
 being allocated vs. just referenced). IOW, I suspect that case (a) is
 possible for some (disk state, n_end) pairs. And case (a) -- if indeed
 possible -- seems to contradict the commit message:
 
   allocation. A COW occurs only if the request wasn't cluster aligned,
   which in turn would imply that requested_sectors was less than
   avail_sectors (both in the original and in the fixed version). In this
 
 In case (a) -- expanded as [2] --, requested_sectors is greater than
 avail_sectors_patched, and less than or equal to
 avail_sectors_unpatched.

Yes, I agree. The commit message is definitely wrong and I understand
now why, so that was a great step forward. There's still a missing piece
about how it creates a user-visible bug, but hopefully it won't be that
hard to find.

Kevin



Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2012-12-18 Thread Philipp Hahn
Hello Kevin, hello Michael,

On Wednesday 12 December 2012 17:54:58 Kevin Wolf wrote:
  On 12.12.2012 15:09, Philipp Hahn wrote:
   On Wednesday 12 December 2012 14:41:49 Kevin Wolf wrote:
  As you can see in the commit message of that patch I was convinced that
  no bug did exist in practice and this was only dangerous with respect to
  future changes. Therefore my first question is if you're using an
  unmodified upstream qemu or if some backported patches are applied to
  it? If it's indeed unmodified, we should probably review the code once
  again to understand why it makes a difference.
 
  These were all unmodified versions directly from git between
  qemu-kvm-1.1.0 and qemu-kvm-1.2.0
 
  git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c works,
  git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c~1 is broken.
  git checkout qemu-kvm-1.1.2  is broken,
  git checkout qemu-kvm-1.1.2 ; git cherry-pick
  b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c  works

 Ok, thanks for clarifying. Then I must have missed some interesting case
 while doing the patch.

I think I found your missing link:
After filling in QCowL2Meta *m, that request is queued:
  QLIST_INSERT_HEAD(&s->cluster_allocs, m, next_in_flight);
to prevent double-allocating the same cluster for overlapping requests, which
is checked in do_alloc_cluster_offset().

I guess that since the sector count was wrong, the overlap detection didn't 
work and the two concurrent write requests to the same cluster overwrote each 
other.

 Ideally we would find a sequence of qemu-io commands to reliably
 reproduce this.

You're the block guru, so I leave that to you (or anybody else who knows more 
about the working of qemu-io.) ;-)

Sincerely
Philipp
-- 
Philipp Hahn           Open Source Software Engineer      h...@univention.de
Univention GmbH        be open.                           fon: +49 421 22 232-0
Mary-Somerville-Str. 1 D-28359 Bremen                     fax: +49 421 22 232-99
   http://www.univention.de/




Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2012-12-18 Thread Michael Tokarev
On 18.12.2012 13:46, Philipp Hahn wrote:

 I think I found your missing link:
 After filling in QCowL2Meta *m, that request is queued:
   QLIST_INSERT_HEAD(&s->cluster_allocs, m, next_in_flight);
 to prevent double-allocating the same cluster for overlapping requests, which
 is checked in do_alloc_cluster_offset().
 
 I guess that since the sector count was wrong, the overlap detection didn't 
 work and the two concurrent write requests to the same cluster overwrote each 
 other.

Meh.  And I already closed the Debian bug report... :)

But thank you Philipp for your excellent work on the matter!

/mjt



Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2012-12-14 Thread Philipp Hahn
Hello Kevin,

On Wednesday 12 December 2012 18:29:48 Philipp Hahn wrote:
 I just re-ran my "git bisect run ~/bisect.sh" case, but it again arrived
 at that patch. I just queued another run for tonight to make sure the test
 is reliable:

The run from last night again arrived at the referenced patch.

  Ideally we would find a sequence of qemu-io commands to reliably
  reproduce this. First thing worth trying would be running the current
  qemu-iotests suite on the old versions. If we don't find it this way, I
  guess we need to catch it with code review. I'm not sure if I can get to
  it this week, and starting next week I'll be on vacation, so any help
  with finding a reproducer would be appreciated.

I took a closer look at what gets corrupted; I've attached my notes.
Please notice that the partitions are not aligned properly.

If you would like to look at the full qcow2_alloc_clusters_offset trace, I can 
provide you with a link to the trace file.

BYtE
Philipp
-- 
Philipp Hahn           Open Source Software Engineer      h...@univention.de
Univention GmbH        be open.                           fon: +49 421 22 232-0
Mary-Somerville-Str. 1 D-28359 Bremen                     fax: +49 421 22 232-99
   http://www.univention.de/
# FILE=/var/cache/apt/archives/liblqr-1-0_0.4.1-1.3.201104131631_amd64.deb
# debugfs /dev/vg_ucs/rootfs -R "stat $FILE"
debugfs 1.41.12 (17-May-2010)
Inode: 1385717   Type: regular    Mode:  0644   Flags: 0x0
Generation: 2854262870    Version: 0x
User: 0   Group: 0   Size: 38884
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 80
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x50979ae9 -- Mon Nov  5 11:54:33 2012
atime: 0x50979b0d -- Mon Nov  5 11:55:09 2012
mtime: 0x4de1dbce -- Sun May 29 07:38:22 2011
Size of extra inode fields: 4
BLOCKS:
(0-9):5728401-5728410
TOTAL: 10

# BSIZE=4096
# BOFFSET=0
# dd bs=$BSIZE count=1 if=$FILE skip=$BOFFSET 2>/dev/null | md5sum
065b19ba6e9153dcc88003ea06076f9f  -

# BLOCK=5728401
# dd bs=$BSIZE count=1 if=/dev/vg_ucs/rootfs skip=$BLOCK 2>/dev/null | md5sum
065b19ba6e9153dcc88003ea06076f9f  -

# dmsetup table
vg_ucs-rootfs: 0 97910784 linear 254:3 384
# LV_SOFFSET=384
# dd bs=512c count=8 if=/dev/vda3 skip=$((BLOCK*8+LV_SOFFSET)) 2>/dev/null | md5sum
065b19ba6e9153dcc88003ea06076f9f  -

# fdisk -l -u /dev/vda
/dev/vda3         4739175   104856254    50058540   8e  Linux LVM
# PART_SOFFSET=4739175
# dd bs=512c count=8 if=/dev/vda skip=$((BLOCK*8+LV_SOFFSET+PART_SOFFSET)) 2>/dev/null | md5sum
065b19ba6e9153dcc88003ea06076f9f  -

# debugfs /dev/vg_ucs/rootfs -R "icheck $(seq 5728387 5728403 | tr '\n' ' ')"
debugfs 1.41.12 (17-May-2010)
Block   Inode number
5728387 1385715  
5728388 1385715  
5728389 1385715  
5728390 1385715  
5728391 1385715  
5728392 1385715  
5728393 1385715  
5728394 1385715  
5728395 1385715  
5728396 1385715  
5728397 1385716  
5728398 1385716  
5728399 1385716  
5728400 1385716  
5728401 1385717  
5728402 1385717  
5728403 1385717  

# debugfs /dev/vg_ucs/rootfs -R "ncheck 1385715 1385716 1385717"
debugfs 1.41.12 (17-May-2010)
Inode   Pathname 
1385715 
/var/cache/apt/archives/libhtml-template-perl_2.9-2.7.201104290220_all.deb
1385717 /var/cache/apt/archives/liblqr-1-0_0.4.1-1.3.201104131631_amd64.deb
1385716 
/var/cache/apt/archives/libio-socket-inet6-perl_2.65-1.1.3.201104291113_all.deb

# md5sum 
/var/cache/apt/archives/libhtml-template-perl_2.9-2.7.201104290220_all.deb 
/var/cache/apt/archives/liblqr-1-0_0.4.1-1.3.201104131631_amd64.deb 
/var/cache/apt/archives/libio-socket-inet6-perl_2.65-1.1.3.201104291113_all.deb
123f4c338bd875825d5d762e8bb48b2a  
/var/cache/apt/archives/libhtml-template-perl_2.9-2.7.201104290220_all.deb
eadbf53d7313df2560f4741fbe982008  
/var/cache/apt/archives/liblqr-1-0_0.4.1-1.3.201104131631_amd64.deb
d0c7bf2d62c125409e00e459ac94277a  
/var/cache/apt/archives/libio-socket-inet6-perl_2.65-1.1.3.201104291113_all.deb

# apt-cache show libhtml-template-perl liblqr-1-0 libio-socket-inet6-perl | 
egrep 'MD5sum|Package'
Package: libhtml-template-perl
MD5sum: 123f4c338bd875825d5d762e8bb48b2a
Package: liblqr-1-0
MD5sum: 94ccb97b38bedef97072fdcc7bce1872
Package: libio-socket-inet6-perl
MD5sum: 8f04f2da2a7d2eefb9e77bf87a0b8972



# IMAGE=/var/lib/libvirt/images/stefan_UCS-3.0-2-13.3-Kolab-Slave.qcow2
# BLOCK=5728401 BSIZE=4096 BOFFSET=0 LV_SOFFSET=384 PART_SOFFSET=4739175
# OFFSET=$((BLOCK*BSIZE + LV_SOFFSET*512 + PART_SOFFSET*512))
# echo $((OFFSET >> 16 >> 13)) $(((OFFSET >> 16) & ((1 << 13) - 1))) $(((OFFSET >> 0) & ((1 << 16) - 1)))
48 1836 56832
# qemu-io -c "read -v $OFFSET $BSIZE" $IMAGE | sed -ne 's/^\([0-9a-f]\+:\) /\1/p' | xxd -r -seek -$OFFSET | md5sum
065b19ba6e9153dcc88003ea06076f9f  -
# qcow2.py -r -s $IMAGE --read $OFFSET $BSIZE | xxd -r | md5sum
l1=0x30 l2=0x72c c=0xde00
065b19ba6e9153dcc88003ea06076f9f  -

# qcow2.py -r -s $IMAGE | egrep "L1\[$((0x30))\]|L2\[$((0x72c))\]|cluster_bits| size|l1_size"
: + cluster_bits=16 (64 KiB)
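
The shell arithmetic above that splits the offset into L1 index, L2 index
and in-cluster offset can also be written out directly; a small sketch,
assuming the usual qcow2 layout with 8-byte L2 entries (so l2_bits =
cluster_bits - 3 = 13 for 64 KiB clusters):

/* Sketch of the offset decomposition done with shell arithmetic above.
 * Assumes 8-byte L2 entries, i.e. l2_bits = cluster_bits - 3. */
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    int cluster_bits = 16;                   /* 64 KiB clusters */
    int l2_bits      = cluster_bits - 3;     /* 8192 entries per L2 table */
    uint64_t offset  = 5728401ULL * 4096 + (384 + 4739175) * 512ULL;

    uint64_t l1_index   = offset >> (cluster_bits + l2_bits);
    uint64_t l2_index   = (offset >> cluster_bits) & ((1 << l2_bits) - 1);
    uint64_t in_cluster = offset & ((1 << cluster_bits) - 1);

    /* Matches the qcow2.py output above: l1=0x30 l2=0x72c c=0xde00 */
    printf("%" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
           l1_index, l2_index, in_cluster);
    return 0;
}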

Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2012-12-12 Thread Kevin Wolf
Hi Philipp,

On 12.12.2012 14:25, Philipp Hahn wrote:
 Hello Kevin, hello Michael, hello *,
 
 we noticed a data corruption bug in qemu-1.1.2, which will be shipped by 
 Debian and our own Debian-based distribution.
 The corruption mostly manifests while installing large Debian package files
 and seems to be related to memory pressure: As long as the file is still in
 the page cache, everything looks fine, but when the file is re-read from the 
 virtual hard disk using a qcow2 file backed by another qcow2 file, the file 
 is corrupted: dpkg complains that the .tar.gz file inside the Debian archive 
 file is corrupted and the md5sum no longer matches.
 
 I tracked this down using git bisect to your patch attached below, which 
 fixed this bug, so everything is fine with qemu-kvm-1.2.0.
 From my reading this seems to explain our problems, since during my own 
 testing during development I never used backing chains and the problem only 
 showed up when my colleagues started using qemu-kvm-1.1.2 with their VMs using
 backing chains.
 
 @Kevin: Do you think that's a valid explanation and your patch should fix
 that problem?
 I'd like to get your expertise before filing a bug with Debian and asking 
 Michael to include that patch with his next stable update for 1.1.

As you can see in the commit message of that patch I was convinced that
no bug did exist in practice and this was only dangerous with respect to
future changes. Therefore my first question is if you're using an
unmodified upstream qemu or if some backported patches are applied to
it? If it's indeed unmodified, we should probably review the code once
again to understand why it makes a difference.

In any case, this is the cluster allocation code. It's probably not
related to rereading things from disk, but rather to the writeout of the
page cache.

Kevin



Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2012-12-12 Thread Philipp Hahn
Hello Kevin, 

On Wednesday 12 December 2012 14:41:49 Kevin Wolf wrote:
 As you can see in the commit message of that patch I was convinced that
 no bug did exist in practice and this was only dangerous with respect to
 future changes. Therefore my first question is if you're using an
 unmodified upstream qemu or if some backported patches are applied to
 it? If it's indeed unmodified, we should probably review the code once
 again to understand why it makes a difference.

These were all unmodified versions directly from git between qemu-kvm-1.1.0
and qemu-kvm-1.2.0

git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c works,
git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c~1 is broken.
git checkout qemu-kvm-1.1.2  is broken,
git checkout qemu-kvm-1.1.2 ; git cherry-pick 
b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c  works

 In any case, this is the cluster allocation code. It's probably not
 related to rereading things from disk, but rather to the writeout of the
 page cache.

Yes, the problem is probably write-related. But as the write doesn't explode
with some spectacular error, I only notice the error on the following read 
by comparing md5 sums.
I just re-checked it: After a reboot the md5sums are still invalid, so I guess 
the data is corrupted on writeout.

Sincerely
Philipp
-- 
Philipp Hahn           Open Source Software Engineer      h...@univention.de
Univention GmbH        be open.                           fon: +49 421 22 232-0
Mary-Somerville-Str. 1 D-28359 Bremen                     fax: +49 421 22 232-99
   http://www.univention.de/




Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2012-12-12 Thread Kevin Wolf
On 12.12.2012 15:09, Philipp Hahn wrote:
 Hello Kevin, 
 
 On Wednesday 12 December 2012 14:41:49 Kevin Wolf wrote:
 As you can see in the commit message of that patch I was convinced that
 no bug did exist in practice and this was only dangerous with respect to
 future changes. Therefore my first question is if you're using an
 unmodified upstream qemu or if some backported patches are applied to
 it? If it's indeed unmodified, we should probably review the code once
 again to understand why it makes a difference.
 
 These were all unmodified versions directly from git between qemu-kvm-1.1.0
 and qemu-kvm-1.2.0
 
 git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c works,
 git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c~1 is broken.
 git checkout qemu-kvm-1.1.2  is broken,
 git checkout qemu-kvm-1.1.2 ; git cherry-pick 
 b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c  works

Ok, thanks for clarifying. Then I must have missed some interesting case
while doing the patch.

Ideally we would find a sequence of qemu-io commands to reliably
reproduce this. First thing worth trying would be running the current
qemu-iotests suite on the old versions. If we don't find it this way, I
guess we need to catch it with code review. I'm not sure if I can get to
it this week, and starting next week I'll be on vacation, so any help
with finding a reproducer would be appreciated.

 In any case, this is the cluster allocation code. It's probably not
 related to rereading things from disk, but rather to the writeout of the
 page cache.
 
 Yes, the problem is probably write-related. But as the write doesn't explode
 with some spectacular error, I only notice the error on the following read
 by comparing md5 sums.
 I just re-checked it: After a reboot the md5sums are still invalid, so I
 guess the data is corrupted on writeout.

Yes, it's really the only thing that makes sense in connection with this
patch.

Kevin



Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2012-12-12 Thread Philipp Hahn
Hello Kevin,

On Wednesday 12 December 2012 17:54:58 Kevin Wolf wrote:
 On 12.12.2012 15:09, Philipp Hahn wrote:
  On Wednesday 12 December 2012 14:41:49 Kevin Wolf wrote:
  As you can see in the commit message of that patch I was convinced that
  no bug did exist in practice and this was only dangerous with respect to
  future changes. Therefore my first question is if you're using an
  unmodified upstream qemu or if some backported patches are applied to
  it? If it's indeed unmodified, we should probably review the code once
  again to understand why it makes a difference.
 
  These were all unmodified versions directly from git between
  qemu-kvm-1.1.0 and qemu-kvm-1.2.0
 
  git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c works,
  git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c~1 is broken.
  git checkout qemu-kvm-1.1.2  is broken,
  git checkout qemu-kvm-1.1.2 ; git cherry-pick
  b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c  works

 Ok, thanks for clarifying. Then I must have missed some interesting case
 while doing the patch.

I just re-ran my "git bisect run ~/bisect.sh" case, but it again arrived at
that patch. I just queued another run for tonight to make sure the test is
reliable:

# bad: [4c3e02beed9878a5f760eeceb6cd42c475cf0127] Merge tag 'v1.2.0'
# good: [bd11ac4feb54d32653e5d4eb7994bed18be0609c] fdc: fix implied seek while 
there is no media in drive
git bisect start 'qemu-kvm-1.2.0' 'qemu-kvm-1.1.0'
# good: [15ecf28f39e2b6fba359ed094770c8fa4ad8dc60] Merge tag 'v1.1.0' into 
next
git bisect good 15ecf28f39e2b6fba359ed094770c8fa4ad8dc60
# bad: [2fa5008ffd49e51540756adccf966a2fcde6e6c1] hd-geometry: Factor out 
guess_chs_for_size()
git bisect bad 2fa5008ffd49e51540756adccf966a2fcde6e6c1
# bad: [306a571a2d75e32cd2eae5486c2714b7b7792a63] hw/arm_gic: Add qdev 
property for GIC revision
git bisect bad 306a571a2d75e32cd2eae5486c2714b7b7792a63
# good: [5c6f4f178ba542358c012ca033985f73e61b8ae5] z2: Rename PXA2xxState 
variable
git bisect good 5c6f4f178ba542358c012ca033985f73e61b8ae5
# good: [833e40858cb9501c5e76b3aa345e4bb5be34385a] qcow2: remove a line of 
unnecessary code
git bisect good 833e40858cb9501c5e76b3aa345e4bb5be34385a
# bad: [0b0cb9d310edfe2b2d108f18be4f013a1e552cfd] Merge remote-tracking 
branch 'kwolf/for-anthony' into staging
git bisect bad 0b0cb9d310edfe2b2d108f18be4f013a1e552cfd
# bad: [0446919dcab51e7468f346c0a009a88632c5c5e0] qemu-iotests: COW with many 
AIO requests on the same cluster
git bisect bad 0446919dcab51e7468f346c0a009a88632c5c5e0
# good: [b75a02829dde98723dfe16fa098338cb267b28b9] Prevent disk data loss when 
closing qemu
git bisect good b75a02829dde98723dfe16fa098338cb267b28b9
# good: [c4a248a138028bee63a099410c79b428db0c4779] block: copy 
enable_write_cache in bdrv_append
git bisect good c4a248a138028bee63a099410c79b428db0c4779
# good: [6af4e9ead4ec9491259c9861b1b35f9abee24a66] qcow2: always operate 
caches in writeback mode
git bisect good 6af4e9ead4ec9491259c9861b1b35f9abee24a66
# bad: [b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c] qcow2: Fix avail_sectors in 
cluster allocation code
git bisect bad b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c
# good: [cdba7fee1daa8865bac2d69da288171fe7c21aae] qcow2: Simplify calculation 
for COW area at the end
git bisect good cdba7fee1daa8865bac2d69da288171fe7c21aae

 Ideally we would find a sequence of qemu-io commands to reliably
 reproduce this. First thing worth trying would be running the current
 qemu-iotests suite on the old versions. If we don't find it this way, I
 guess we need to catch it with code review. I'm not sure if I can get to
 it this week, and starting next week I'll be on vacation, so any help
 with finding a reproducer would be appreciated.

I'll have a look at it tomorrow.

Thank you for your fast replies, and have a nice vacation in case we don't hear
from each other again this week.

BYtE
Philipp
-- 
Philipp Hahn           Open Source Software Engineer      h...@univention.de
Univention GmbH        be open.                           fon: +49 421 22 232-0
Mary-Somerville-Str. 1 D-28359 Bremen                     fax: +49 421 22 232-99
   http://www.univention.de/

