Hi guys,
Recently I ran into an occasional IO hang that I can no longer
reproduce.
I analyzed the problem carefully; the critical stack is as follows:
After reading the code in linux-aio.c (see the ioq_submit() function), I found
two situations that could lead us here:
1) no AIOs are in flight (s->io_q.in_flight is 0) and another call to
io_submit() returns -EAGAIN
2) no AIOs are in flight (s->io_q.in_flight is 0) and the number of
s->io_q.pending IOs reaches MAX_EVENTS at once
In both situations, the do {...} while loop exits and s->io_q.blocked is set
to true.
After that, since nothing is in flight, the AIO completion callback will never
be called, and neither will ioq_submit(), so all pending requests hang.
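
To make situation 1 concrete, here is a minimal toy model in C. It is not the
QEMU code: the toy_* names, TOY_MAX_EVENTS and the function-pointer stand-in
for io_submit() are all hypothetical, and the model assumes (as described
above) that a blocked queue is only retried from the completion path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_EVENTS 128  /* hypothetical stand-in for MAX_EVENTS */

/* Toy queue state; field names only loosely mirror s->io_q. */
struct toy_queue {
    int in_queue;   /* requests still waiting to be submitted */
    int in_flight;  /* requests submitted, completion not yet seen */
    bool blocked;   /* submission stopped with requests left over */
};

/* Hypothetical stand-in for io_submit(): number accepted, or -EAGAIN. */
typedef int (*toy_submit_fn)(int nr);

static int toy_submit_eagain(int nr) { (void)nr; return -EAGAIN; }
static int toy_submit_ok(int nr) { return nr; }

/* Simplified submit loop: stop on -EAGAIN and remember leftover requests. */
static void toy_ioq_submit(struct toy_queue *q, toy_submit_fn submit)
{
    assert(q->in_queue >= 0 && q->in_flight >= 0);

    while (q->in_queue > 0 && q->in_flight < TOY_MAX_EVENTS) {
        int room = TOY_MAX_EVENTS - q->in_flight;
        int len = q->in_queue < room ? q->in_queue : room;
        int ret = submit(len);

        if (ret == -EAGAIN) {
            break;          /* refused; requests stay queued */
        }
        if (ret <= 0) {
            break;          /* other errors are not modeled here */
        }
        q->in_flight += ret;
        q->in_queue -= ret;
    }
    q->blocked = (q->in_queue > 0);
}

/*
 * Assumption: only a completion can trigger a retry. With nothing in
 * flight no completion will ever arrive, so a blocked queue stays blocked.
 */
static bool toy_is_stuck(const struct toy_queue *q)
{
    return q->blocked && q->in_flight == 0;
}
```

Calling toy_ioq_submit() with toy_submit_eagain on a queue that has
in_queue = 4 and in_flight = 0 leaves blocked set, and toy_is_stuck() then
returns true. With toy_submit_ok the same queue drains and is not stuck.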
Is there a proper way to fix this without affecting (stalling) the guest?
Hoping for a reply, thanks.
Sochin.