bibo mao <maob...@loongson.cn> writes: On 2025/3/20 下午2:16, Markus Armbruster wrote: >> Bibo Mao <maob...@loongson.cn> writes: >> >>> In function virt_cpu_plug(), it will send cpu plug message to interrupt >>> controller extioi and ipi irqchip. If there is problem in this function, >>> system should continue to run and keep state the same before cpu is >>> added. >>> >>> Object cpuslot::cpu is set at last only when there is no any error. >>> If there is, send cpu unplug message to extioi and ipi irqchip, and then >>> return immediately. >>> >>> Fixes: ab9935d2991e (hw/loongarch/virt: Implement cpu plug interface) >>> Signed-off-by: Bibo Mao <maob...@loongson.cn>
[...] >> Hmm. >> >> You're right about the problem: virt_cpu_plug() neglects to revert >> changes when it fails. >> >> You're probably right to move the assignment to cpu_slot->cpu to the >> end. Anything you can delay until success is assured you don't have to >> revert. I say "probably" because the code that now runs before the >> assignment might theoretically "see" the assignment, and I didn't >> examine it to exclude that. >> >> Where I have doubts is the code to revert changes. >> >> The hotplug_handler_plug() error checkign suggests it can fail. >> >> Can hotplug_handler_unplug() fail, too? The error checking in >> virt_cpu_unplug() right above suggests it can. > > Basically from existing code about ipi/extioi hotplug handler, it is > impossible to there is error, here is only for future use. Aha. More at the end of my reply. > If there is error in function virt_cpu_plug(), undo() such as > hotplug_handler_unplug() should be called. However if undo() reports > error, I do not know how to handle it, here just discard error in > function undo(). Steinbach's Guideline for Systems Programming: Never test for an error condition you don't know how to handle. This old quip is a funny way to say that errors we don't know how to handle are *bad*, and should be avoided. > Regards > Bibo Mao >> >> What happens if it fails in virt_cpu_plug()? You assure us this can't happen today. Because of that, broken error recovery is not an actual problem. However, if things change some day so it can happen, broken error recovery becomes an actual problem. so, broken error recovery just "for future use" is actually just for silent future breakage. But is it broken? This is what I'm trying to find out with my "what happens if" question. If it is broken, then passing &error_abort would likely be less bad: crash instead of silent breakage. Also makes it completely obvious in the code that these errors are not handled, whereas broken error handling looks like it is until you actually think about it.