On 07/27/2018 12:36 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2018-07-26 at 11:03 +0200, Cédric Le Goater wrote:
>> Ben,
>>
>> I have found out recently that the QEMU PowerNV could hang while accessing
>> the disk.
>>
>> The issue seems to be the phb3_msi_try_send() routine when called from
>> the resend() handler. The 'P' is ignored in that case but not the 'Q'
>> bit which means that no interrupt will be resent if P|Q are set.
>
> I'd have to remember how PQ works on P8 ... my gut feeling is that we
> should resend if P=1 but I'm not 100% certain.

This is not exactly what the code does. To force a resend, it ignores P,
but if Q=1 it bails out without doing anything, as a normal trigger
would. So I think that in the resend case we should ignore Q as well.

Thanks,

C.

>
>> See the log extract below :
>>
>> PHB3(phb3_msi_try_send): MSI 0: try_send, ive=0x2005000000fd eff pq=0
>> prio=5 server=8 ignore_p=0
>> PHB3(phb3_msi_set_p): MSI 0: setting P
>> PHB3(phb3_msi_set_p): IVE readback: 0x2005010000fd
>> PHB3(phb3_msi_reject): MSI 0 rejected
>> PHB3(phb3_msi_resend): MSI resend...
>> PHB3(phb3_msi_try_send): MSI 0: try_send, ive=0x2005010000fd eff pq=0
>> prio=5 server=8 ignore_p=1
>> PHB3(phb3_msi_set_p): MSI 0: setting P
>> PHB3(phb3_msi_set_p): IVE readback: 0x2005010000fd
>> PHB3(phb3_msi_reject): MSI 0 rejected
>> PHB3(phb3_msi_try_send): MSI 0: try_send, ive=0x2005010000fd eff pq=2
>> prio=5 server=8 ignore_p=0
>> PHB3(phb3_msi_set_q): MSI 0: setting Q
>> PHB3(phb3_msi_set_q): IVE readback: 0x2005010100fd
>> PHB3(phb3_msi_resend): MSI resend...
>> PHB3(phb3_msi_try_send): MSI 0: try_send, ive=0x2005010100fd eff pq=1
>> prio=5 server=8 ignore_p=1
>> PHB3(phb3_msi_try_send): MSI 0: try_send, ive=0x2005010100fd eff pq=3
>> prio=5 server=8 ignore_p=0
>> PHB3(phb3_msi_try_send): MSI 0: try_send, ive=0x2005010100fd eff pq=3
>> prio=5 server=8 ignore_p=0
>> PHB3(phb3_msi_try_send): MSI 0: try_send, ive=0x2005010100fd eff pq=3
>> prio=5 server=8 ignore_p=0
>> PHB3(phb3_msi_try_send): MSI 0: try_send, ive=0x2005010100fd eff pq=3
>> prio=5 server=8 ignore_p=0
>> ... goes on and on ...
>> hangs
>>
>> I have added the relevant code at the bottom of the email.
>>
>>
>> If the 'Q' bit is ignored also, the results are good with a SATA drive
>> or a SCSI drive using the megasas model. Do you think this is correct ?
>> I would say so but I am still discovering that part.
>>
>> I have no idea why it didn't show up before. Maybe because we mostly
>> used virtio-blk.
>>
>> Thanks,
>>
>> C.
>>
>>> +static void phb3_msi_try_send(Phb3MsiState *msi, int srcno, bool ignore_p)
>>> +{
>>> +    ICSState *ics = ICS_BASE(msi);
>>> +    uint64_t ive;
>>> +    uint64_t server, prio, pq, gen;
>>> +
>>> +    if (!phb3_msi_read_ive(msi->phb, srcno, &ive)) {
>>> +        return;
>>> +    }
>>> +
>>> +    server = GETFIELD(IODA2_IVT_SERVER, ive);
>>> +    prio = GETFIELD(IODA2_IVT_PRIORITY, ive);
>>> +    pq = GETFIELD(IODA2_IVT_Q, ive);
>>> +    if (!ignore_p) {
>>> +        pq |= GETFIELD(IODA2_IVT_P, ive) << 1;
>>> +    }
>>> +    gen = GETFIELD(IODA2_IVT_GEN, ive);
>>> +
>>> +    /*
>>> +     * The low order 2 bits are the link pointer (Type II interrupts).
>>> +     * Shift back to get a valid IRQ server.
>>> +     */
>>> +    server >>= 2;
>>> +
>>> +    switch (pq) {
>>> +    case 0: /* 00 */
>>> +        if (prio == 0xff) {
>>> +            /* Masked, set Q */
>>> +            phb3_msi_set_q(msi, srcno);
>>> +        } else {
>>> +            /* Enabled, set P and send */
>>> +            phb3_msi_set_p(msi, srcno, gen);
>>> +            icp_irq(ics, server, srcno + ics->offset, prio);
>>> +        }
>>> +        break;
>>> +    case 2: /* 10 */
>>> +        /* Already pending, set Q */
>>> +        phb3_msi_set_q(msi, srcno);
>>> +        break;
>>> +    case 1: /* 01 */
>>> +    case 3: /* 11 */
>>> +    default:
>>> +        /* Just drop stuff if Q already set */
>>> +        break;
>>> +    }
>>> +}
>>> +
>>> +static void phb3_msi_set_irq(void *opaque, int srcno, int val)
>>> +{
>>> +    Phb3MsiState *msi = PHB3_MSI(opaque);
>>> +
>>> +    if (val) {
>>> +        phb3_msi_try_send(msi, srcno, false);
>>> +    }
>>> +}
>>
>> [ ... ]
>>
>>> +static void phb3_msi_resend(ICSState *ics)
>>> +{
>>> +    Phb3MsiState *msi = PHB3_MSI(ics);
>>> +    unsigned int i, j;
>>> +
>>> +    if (msi->rba_sum == 0) {
>>> +        return;
>>> +    }
>>> +
>>> +    for (i = 0; i < 32; i++) {
>>> +        if ((msi->rba_sum & (1u << i)) == 0) {
>>> +            continue;
>>> +        }
>>> +        msi->rba_sum &= ~(1u << i);
>>> +        for (j = 0; j < 64; j++) {
>>> +            if ((msi->rba[i] & (1ull << j)) == 0) {
>>> +                continue;
>>> +            }
>>> +            msi->rba[i] &= ~(1ull << j);
>>> +            phb3_msi_try_send(msi, i * 64 + j, true);
>>> +        }
>>> +    }
>>> +}
>>
>>
>
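
To make the proposal concrete, here is a minimal standalone sketch of the PQ handling with Q ignored as well on resend. It is only a model, not the QEMU code: the `struct src` type, its fields, and the `resend` parameter are hypothetical stand-ins for the IVE fields and for `ignore_p`, and the `sent` counter replaces the call to icp_irq().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one IVE entry; not the real IODA2 layout. */
struct src {
    bool p;          /* 'P' bit: interrupt presented */
    bool q;          /* 'Q' bit: another trigger queued behind P */
    uint8_t prio;    /* 0xff means masked */
    unsigned sent;   /* how many times the interrupt was presented */
};

/*
 * Model of the try_send logic. When 'resend' is true, both P and Q are
 * ignored (the change proposed above); otherwise both bits are used.
 * Returns true if the interrupt was presented.
 */
static bool try_send(struct src *s, bool resend)
{
    unsigned pq = 0;

    assert(s != NULL);

    if (!resend) {
        pq = ((unsigned)s->p << 1) | (unsigned)s->q;
    }

    switch (pq) {
    case 0: /* 00, or forced resend */
        if (s->prio == 0xff) {
            /* Masked, set Q */
            s->q = true;
            return false;
        }
        /* Enabled, set P and send */
        s->p = true;
        s->sent++;
        return true;
    case 2: /* 10: already pending, set Q */
        s->q = true;
        return false;
    default: /* 01 or 11: drop, Q already set */
        return false;
    }
}
```

Starting from the state seen at the end of the log (P and Q both set, prio=5), a normal trigger is still dropped in this model, while a forced resend presents the interrupt again instead of bailing out. The sketch leaves Q untouched on resend; whether it should be cleared at that point is a separate question that it does not try to answer.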