On Mon, Aug 31, 2020 at 11:40:12AM -0700, Richard Henderson wrote: > Based-on: <20200831160601.833692-1-richard.hender...@linaro.org> > ("[PULL 00/76] target/microblaze improvements") > > Hello again, Edgar. > > I had dropped the tcg_gen_lookup_and_goto_ptr patch from the > previous omnibus patch set, as you had reported lockups. > > I have identified, by inspection, two cases in which we failed > to return to the main loop even though we should have: > > (1) Return-from-exception type instructions. > > I had missed these before because they hadn't set cpustate_changed. > This still worked fine because they are all indirect branches, and > had exited immediately. > > Fixed by distinguishing these cases from normal indirect branches > before we start using lookup_and_goto_ptr. > > (2) MTS in a branch delay slot. > > We did not check dc->cpustate_changed before setting > dc->base.is_jmp to DISAS_JUMP, which lost the fact that we > need to return to the main loop. > > This mostly works fine without lookup_and_goto_ptr, because > we either (a) finished an indirect branch and returned to the > main loop anyway or (b) we'd return to the main loop via some > subsequent indirect branch, which would happen "soon enough". > > We should have been able to see soft-lockup with the existing > code in the case of a cpustate_changed in the delay slot of > a loop of direct branches that all use goto_tb. E.g. > > brid 0 > msrset MSR_IE > > I.e. an immediate branch back to the same branch insn, > re-enabling interrupts in the delay slot. Probably not > something that shows up in the wild. > > ---- > > Follow-up question: The manual says that several classes of > instructions are invalid in a branch delay slot, but does > not say what action is taken, if any. > > Some of these invalid cases could leave qemu in an inconsistent > state. Would it be legal for us to diagnose these cases with > trap_illegal? If not, what *should* we be doing? We could also > LOG_GUEST_ERROR for these either way.
What I found out is that these result in undefined and undocumented behaviour but that the behaviour is deterministic, i.e the cores won't lock-up or exposed security issues or anything like that. RTL will not raise exceptions on these either. So I think LOG_GUEST_ERROR and treating as NOP is probably a good approach. We can fix that in follow-up patches. Cheers, Edgar