On 27/11/20 14:57, P J P wrote:
+-- On Wed, 18 Nov 2020, P J P wrote --+
| During data transfer via packet command in 'ide_atapi_cmd_reply_end'
| 's->io_buffer_index' could exceed the 's->io_buffer' length, leading
| to OOB access issue. Add check to avoid it.
| ...
| #9 ahci_pio_transfer ../hw/ide/ahci.c:1383
| #10 ide_transfer_start_norecurse ../hw/ide/core.c:553
| #11 ide_atapi_cmd_reply_end ../hw/ide/atapi.c:284
| #12 ide_atapi_cmd_read_pio ../hw/ide/atapi.c:329
| #13 ide_atapi_cmd_read ../hw/ide/atapi.c:442
| #14 cmd_read ../hw/ide/atapi.c:988
| #15 ide_atapi_cmd ../hw/ide/atapi.c:1352
| #16 ide_transfer_start ../hw/ide/core.c:561
| #17 cmd_packet ../hw/ide/core.c:1729
| #18 ide_exec_cmd ../hw/ide/core.c:2107
| #19 handle_reg_h2d_fis ../hw/ide/ahci.c:1267
| #20 handle_cmd ../hw/ide/ahci.c:1318
| #21 check_cmd ../hw/ide/ahci.c:592
| #22 ahci_port_write ../hw/ide/ahci.c:373
| #23 ahci_mem_write ../hw/ide/ahci.c:513
|
| Reported-by: Wenxiang Qian <leonwxq...@gmail.com>
| Signed-off-by: Prasad J Pandit <p...@fedoraproject.org>
Prasad, please try to put yourself in the reviewer's shoes.
1) Obviously this condition was not in the mind of whoever wrote the
code. For this reason the first thing to understand if how the bug came
to happen, which precondition was not respected. Your backtraces shows
that you came from ide_atapi_cmd_read_pio, so:
- ide_atapi_cmd_reply_end is first entered with s->io_buffer_index ==
s->cd_sector_size
- s->lba is assumed to be != -1. from there you go to cd_read_sector ->
cd_read_sector_cb and then reenter ide_atapi_cmd_reply_end with
s->io_buffer_index == 0. Or to cd_read_sector_sync and then continue
down ide_atapi_cmd_reply_end, again with s->io_buffer_index == 0
- if s->elementary_transfer_size > 0, the number of bytes that are read
is bounded to s->cd_sector_size - s->io_buffer_index
- if s->elementary_transfer_size == 0, the size is again bounded to
s->cd_sector_size - s->io_buffer_index in this code:
/* we cannot transmit more than one sector at a time */
if (s->lba != -1) {
if (size > (s->cd_sector_size - s->io_buffer_index))
size = (s->cd_sector_size - s->io_buffer_index);
}
So my understanding is that, for the bug to happen, you need to enter
ide_atapi_cmd_reply_end with s->lba == -1 despite being in the read CD
path. This might be possible by passing 0xFFFFFFFF as the LBA in
cmd_read. The correct fix would be to check lba against the media size
in cmd_read.
This is reasoning that you should understand even before starting to
write a patch. Did you do this step? If so, no problem---I still
believe the patch is incorrect, but please explain how my reasoning is
wrong and we'll take it from there. If not, how do you know your patch
is not papering over a bigger issue somewhere?
2) The maintainer is not the one that knows the code best: it's only
someone who is trusted to make judgment calls that are good enough. My
job is to poke holes in your reasoning, not to reverse engineer it. So
if you did do step 1, you need to explain it to me, because at this
point you know this part of the code better than I do. This is a step
that you did not do, because your commit message has no such explanation.
There's also another reason to do this: recording the details of the bug
in the commit message helps anyone who wants to understand the story of
the code.
3) We also need to ensure that the bug will not happen again. Did you
get a reproducer from the reporter? If not, how did you trust the
report to be correct? If so, did you try to include it in the QEMU
qtest testsuite?
If my understanding of the bug is correct, the patch is not the correct
fix. The correct fix is to add an assertion here *and* to fix the
incorrect assumption up in the call chain. For example:
- add an assertion in ide_atapi_cmd_read_{dma,pio} that s->lba <=
s->nb_sectors && s->lba != -1
- add a range check in cmd_read and cmd_read_cd similar to what is
already done in cmd_seek (but checking the full range from lba to
lba+nb_sectors-1.
Even if the patch were correct, however, bullets (2) and (3) are two
reasons why this patch is not acceptable for QEMU's standards---not even
for a security fix.
This is nothing new. Even though I have sometimes applied (or redone_
your fixes, I have told you a variation of the above every time I saw a
a patch of yours. The output of "git log --author pjp tests" tells me
you didn't heed the advice though; I'm calling you out this time because
it's especially clear that you didn't do these steps and because the
result is especially wrong.
Thanks,
Paolo