Re: [Qemu-devel] [PATCH 1/2] ahci: Fix ahci cdrom read corruptions for reads > 128k
On Fri, Jul 27, 2012 at 09:59:01AM +0200, Kevin Wolf wrote: > Am 26.07.2012 21:40, schrieb Jason Baron: > > While testing q35, which has its cdrom attached to the ahci controller, I > > found > > that the Fedora 17 install would panic on boot. The panic occurs while > > squashfs is trying to read from the cdrom. The errors are: > > > > [8.622711] SQUASHFS error: xz_dec_run error, data probably corrupt > > [8.625180] SQUASHFS error: squashfs_read_data failed to read block > > 0x20be48a > > > > I was also able to produce corrupt data reads using an installed piix based > > qemu machine, using 'dd'. I found that the corruptions were only occuring > > when > > the read size was greater than 128k. For example, the following command > > results in corrupted reads: > > > > dd if=/dev/sr0 of=/tmp/blah bs=256k iflag=direct > > > > The > 128k size reads exercise a different code path than 128k and below. In > > ide_atapi_cmd_read_dma_cb() s->io_buffer_size is capped at 128k. Thus, > > ide_atapi_cmd_read_dma_cb() is called a second time when the read is > 128k. > > However, ahci_dma_rw_buf() restart the read from offset 0, instead of at > > 128k. > > Thus, resulting in a corrupted read. > > > > To fix this, I've introduced 'io_buffer_offset' field in IDEState to keep > > track of the offset. I've also modified ahci_populate_sglist() to take a new > > 3rd offset argument, so that the sglist is property initialized. > > > > I've tested this patch using 'dd' testing, and Fedora 17 now correctly boots > > and installs on q35 with the cdrom ahci controller. > > > > Signed-off-by: Jason Baron > > --- > > hw/ide/ahci.c | 41 ++--- > > hw/ide/internal.h |1 + > > 2 files changed, 35 insertions(+), 7 deletions(-) > > > > diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c > > index efea93f..9c95714 100644 > > --- a/hw/ide/ahci.c > > +++ b/hw/ide/ahci.c > > @@ -636,7 +636,7 @@ static void ahci_write_fis_d2h(AHCIDevice *ad, uint8_t > > *cmd_fis) > > } > > } > > > > -static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist) > > +static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int > > offset) > > { > > AHCICmdHdr *cmd = ad->cur_cmd; > > uint32_t opts = le32_to_cpu(cmd->opts); > > @@ -647,6 +647,10 @@ static int ahci_populate_sglist(AHCIDevice *ad, > > QEMUSGList *sglist) > > uint8_t *prdt; > > int i; > > int r = 0; > > +int sum = 0; > > +int off_idx = -1; > > +int off_pos = 0; > > +int tbl_entry_size; > > > > if (!sglist_alloc_hint) { > > DPRINTF(ad->port_no, "no sg list given by guest: 0x%08x\n", opts); > > @@ -669,10 +673,31 @@ static int ahci_populate_sglist(AHCIDevice *ad, > > QEMUSGList *sglist) > > /* Get entries in the PRDT, init a qemu sglist accordingly */ > > if (sglist_alloc_hint > 0) { > > AHCI_SG *tbl = (AHCI_SG *)prdt; > > - > > -qemu_sglist_init(sglist, sglist_alloc_hint, ad->hba->dma); > > +sum = 0; > > for (i = 0; i < sglist_alloc_hint; i++) { > > /* flags_size is zero-based */ > > +tbl_entry_size = (le32_to_cpu(tbl[i].flags_size) + 1); > > +if (offset <= (sum + tbl_entry_size)) { > > +off_idx = i; > > +off_pos = offset - sum; > > +break; > > +} > > +sum += tbl_entry_size; > > +} > > +if ((off_idx == -1) || (off_pos < 0) || (off_pos > > > tbl_entry_size)) { > > +fprintf(stderr, "%s: Incorrect offset! " > > +"off_idx: %d, off_pos: %d\n", > > +__func__, off_idx, off_pos); > > I would use DPRINTF here, it's not a good idea to allow the guest to > trigger fprintfs. > ok. I'm also wondering if we need to 0 s->io_buffer_offset in more places than 'ahci_start_dma()'. For example, do we need it in ahci_dma_reset(), or ahci_dma_prepare_buf()? hw/ide/pci.c seems to do that. I didn't include those places in my original patch, because I saw that start_dma was always called before rw_buf. But perhaps, there is another codepath that I missed. Thanks, -Jason
Re: [Qemu-devel] [PATCH 1/2] ahci: Fix ahci cdrom read corruptions for reads > 128k
Hi Jason, Am 26.07.2012 21:40, schrieb Jason Baron: > While testing q35, which has its cdrom attached to the ahci controller, I > found > that the Fedora 17 install would panic on boot. The panic occurs while > squashfs is trying to read from the cdrom. The errors are: > > [8.622711] SQUASHFS error: xz_dec_run error, data probably corrupt > [8.625180] SQUASHFS error: squashfs_read_data failed to read block > 0x20be48a > > I was also able to produce corrupt data reads using an installed piix based > qemu machine, using 'dd'. I found that the corruptions were only occuring when > the read size was greater than 128k. For example, the following command > results in corrupted reads: > > dd if=/dev/sr0 of=/tmp/blah bs=256k iflag=direct > > The > 128k size reads exercise a different code path than 128k and below. In > ide_atapi_cmd_read_dma_cb() s->io_buffer_size is capped at 128k. Thus, > ide_atapi_cmd_read_dma_cb() is called a second time when the read is > 128k. > However, ahci_dma_rw_buf() restart the read from offset 0, instead of at 128k. > Thus, resulting in a corrupted read. > > To fix this, I've introduced 'io_buffer_offset' field in IDEState to keep > track of the offset. I've also modified ahci_populate_sglist() to take a new > 3rd offset argument, so that the sglist is property initialized. > > I've tested this patch using 'dd' testing, and Fedora 17 now correctly boots > and installs on q35 with the cdrom ahci controller. > > Signed-off-by: Jason Baron Tested-by: Andreas Färber This also fixes my test case of running the "Check Installation Media" boot menu option on, e.g., openSUSE-12.1-GNOME-LiveCD-x86_64.iso with ich9-ahci (BNC#725008). Thanks a lot for finding the root cause! If you resubmit, please cc qemu-sta...@nongnu.org, I would very much like to backport this bugfix to stable-0.15 branch. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 1/2] ahci: Fix ahci cdrom read corruptions for reads > 128k
Am 26.07.2012 21:40, schrieb Jason Baron: > While testing q35, which has its cdrom attached to the ahci controller, I > found > that the Fedora 17 install would panic on boot. The panic occurs while > squashfs is trying to read from the cdrom. The errors are: > > [8.622711] SQUASHFS error: xz_dec_run error, data probably corrupt > [8.625180] SQUASHFS error: squashfs_read_data failed to read block > 0x20be48a > > I was also able to produce corrupt data reads using an installed piix based > qemu machine, using 'dd'. I found that the corruptions were only occuring when > the read size was greater than 128k. For example, the following command > results in corrupted reads: > > dd if=/dev/sr0 of=/tmp/blah bs=256k iflag=direct > > The > 128k size reads exercise a different code path than 128k and below. In > ide_atapi_cmd_read_dma_cb() s->io_buffer_size is capped at 128k. Thus, > ide_atapi_cmd_read_dma_cb() is called a second time when the read is > 128k. > However, ahci_dma_rw_buf() restart the read from offset 0, instead of at 128k. > Thus, resulting in a corrupted read. > > To fix this, I've introduced 'io_buffer_offset' field in IDEState to keep > track of the offset. I've also modified ahci_populate_sglist() to take a new > 3rd offset argument, so that the sglist is property initialized. > > I've tested this patch using 'dd' testing, and Fedora 17 now correctly boots > and installs on q35 with the cdrom ahci controller. > > Signed-off-by: Jason Baron > --- > hw/ide/ahci.c | 41 ++--- > hw/ide/internal.h |1 + > 2 files changed, 35 insertions(+), 7 deletions(-) > > diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c > index efea93f..9c95714 100644 > --- a/hw/ide/ahci.c > +++ b/hw/ide/ahci.c > @@ -636,7 +636,7 @@ static void ahci_write_fis_d2h(AHCIDevice *ad, uint8_t > *cmd_fis) > } > } > > -static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist) > +static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int > offset) > { > AHCICmdHdr *cmd = ad->cur_cmd; > uint32_t opts = le32_to_cpu(cmd->opts); > @@ -647,6 +647,10 @@ static int ahci_populate_sglist(AHCIDevice *ad, > QEMUSGList *sglist) > uint8_t *prdt; > int i; > int r = 0; > +int sum = 0; > +int off_idx = -1; > +int off_pos = 0; > +int tbl_entry_size; > > if (!sglist_alloc_hint) { > DPRINTF(ad->port_no, "no sg list given by guest: 0x%08x\n", opts); > @@ -669,10 +673,31 @@ static int ahci_populate_sglist(AHCIDevice *ad, > QEMUSGList *sglist) > /* Get entries in the PRDT, init a qemu sglist accordingly */ > if (sglist_alloc_hint > 0) { > AHCI_SG *tbl = (AHCI_SG *)prdt; > - > -qemu_sglist_init(sglist, sglist_alloc_hint, ad->hba->dma); > +sum = 0; > for (i = 0; i < sglist_alloc_hint; i++) { > /* flags_size is zero-based */ > +tbl_entry_size = (le32_to_cpu(tbl[i].flags_size) + 1); > +if (offset <= (sum + tbl_entry_size)) { > +off_idx = i; > +off_pos = offset - sum; > +break; > +} > +sum += tbl_entry_size; > +} > +if ((off_idx == -1) || (off_pos < 0) || (off_pos > tbl_entry_size)) { > +fprintf(stderr, "%s: Incorrect offset! " > +"off_idx: %d, off_pos: %d\n", > +__func__, off_idx, off_pos); I would use DPRINTF here, it's not a good idea to allow the guest to trigger fprintfs. The rest looks good to me. Kevin
[Qemu-devel] [PATCH 1/2] ahci: Fix ahci cdrom read corruptions for reads > 128k
While testing q35, which has its cdrom attached to the ahci controller, I found that the Fedora 17 install would panic on boot. The panic occurs while squashfs is trying to read from the cdrom. The errors are: [8.622711] SQUASHFS error: xz_dec_run error, data probably corrupt [8.625180] SQUASHFS error: squashfs_read_data failed to read block 0x20be48a I was also able to produce corrupt data reads using an installed piix based qemu machine, using 'dd'. I found that the corruptions were only occuring when the read size was greater than 128k. For example, the following command results in corrupted reads: dd if=/dev/sr0 of=/tmp/blah bs=256k iflag=direct The > 128k size reads exercise a different code path than 128k and below. In ide_atapi_cmd_read_dma_cb() s->io_buffer_size is capped at 128k. Thus, ide_atapi_cmd_read_dma_cb() is called a second time when the read is > 128k. However, ahci_dma_rw_buf() restart the read from offset 0, instead of at 128k. Thus, resulting in a corrupted read. To fix this, I've introduced 'io_buffer_offset' field in IDEState to keep track of the offset. I've also modified ahci_populate_sglist() to take a new 3rd offset argument, so that the sglist is property initialized. I've tested this patch using 'dd' testing, and Fedora 17 now correctly boots and installs on q35 with the cdrom ahci controller. Signed-off-by: Jason Baron --- hw/ide/ahci.c | 41 ++--- hw/ide/internal.h |1 + 2 files changed, 35 insertions(+), 7 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index efea93f..9c95714 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -636,7 +636,7 @@ static void ahci_write_fis_d2h(AHCIDevice *ad, uint8_t *cmd_fis) } } -static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist) +static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset) { AHCICmdHdr *cmd = ad->cur_cmd; uint32_t opts = le32_to_cpu(cmd->opts); @@ -647,6 +647,10 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist) uint8_t *prdt; int i; int r = 0; +int sum = 0; +int off_idx = -1; +int off_pos = 0; +int tbl_entry_size; if (!sglist_alloc_hint) { DPRINTF(ad->port_no, "no sg list given by guest: 0x%08x\n", opts); @@ -669,10 +673,31 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist) /* Get entries in the PRDT, init a qemu sglist accordingly */ if (sglist_alloc_hint > 0) { AHCI_SG *tbl = (AHCI_SG *)prdt; - -qemu_sglist_init(sglist, sglist_alloc_hint, ad->hba->dma); +sum = 0; for (i = 0; i < sglist_alloc_hint; i++) { /* flags_size is zero-based */ +tbl_entry_size = (le32_to_cpu(tbl[i].flags_size) + 1); +if (offset <= (sum + tbl_entry_size)) { +off_idx = i; +off_pos = offset - sum; +break; +} +sum += tbl_entry_size; +} +if ((off_idx == -1) || (off_pos < 0) || (off_pos > tbl_entry_size)) { +fprintf(stderr, "%s: Incorrect offset! " +"off_idx: %d, off_pos: %d\n", +__func__, off_idx, off_pos); +r = -1; +goto out; +} + +qemu_sglist_init(sglist, (sglist_alloc_hint - off_idx), ad->hba->dma); +qemu_sglist_add(sglist, le64_to_cpu(tbl[off_idx].addr + off_pos), +le32_to_cpu(tbl[off_idx].flags_size) + 1 - off_pos); + +for (i = off_idx + 1; i < sglist_alloc_hint; i++) { +/* flags_size is zero-based */ qemu_sglist_add(sglist, le64_to_cpu(tbl[i].addr), le32_to_cpu(tbl[i].flags_size) + 1); } @@ -745,7 +770,7 @@ static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis, ncq_tfs->lba, ncq_tfs->lba + ncq_tfs->sector_count - 2, s->dev[port].port.ifs[0].nb_sectors - 1); -ahci_populate_sglist(&s->dev[port], &ncq_tfs->sglist); +ahci_populate_sglist(&s->dev[port], &ncq_tfs->sglist, 0); ncq_tfs->tag = tag; switch(ncq_fis->command) { @@ -970,7 +995,7 @@ static int ahci_start_transfer(IDEDMA *dma) goto out; } -if (!ahci_populate_sglist(ad, &s->sg)) { +if (!ahci_populate_sglist(ad, &s->sg, 0)) { has_sglist = 1; } @@ -1015,6 +1040,7 @@ static void ahci_start_dma(IDEDMA *dma, IDEState *s, DPRINTF(ad->port_no, "\n"); ad->dma_cb = dma_cb; ad->dma_status |= BM_STATUS_DMAING; +s->io_buffer_offset = 0; dma_cb(s, 0); } @@ -1023,7 +1049,7 @@ static int ahci_dma_prepare_buf(IDEDMA *dma, int is_write) AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma); IDEState *s = &ad->port.ifs[0]; -ahci_populate_sglist(ad, &s->sg); +ahci_populate_sglist(ad, &s->sg, 0); s->io_buffer_size = s->sg.size; DPRINTF(ad->port_no, "len=%#x\n", s->io_buffer