On 19.06.2018 [13:14:51 -0700], Nishanth Aravamudan wrote: > On 19.06.2018 [14:35:33 -0500], Eric Blake wrote: > > On 06/15/2018 12:47 PM, Nishanth Aravamudan via Qemu-devel wrote:
<snip> > > > } else if (s->use_linux_aio) { > > > + int rc; > > > + rc = aio_setup_linux_aio(bdrv_get_aio_context(bs)); > > > + if (rc != 0) { > > > + error_report("Unable to use native AIO, falling back to " > > > + "thread pool."); > > > > In general, error_report() should not output a trailing '.'. > > Will fix. > > > > + s->use_linux_aio = 0; > > > + return rc; > > > > Wait - the message claims we are falling back, but the non-zero return code > > sounds like we are returning an error instead of falling back. (My > > preference - if the user requested something and we can't do it, it's better > > to error than to fall back to something that does not match the user's > > request). > > I think that makes sense, I hadn't tested this specific case (in my > reading of the code, it wasn't clear to me if raw_co_prw() could be > called before raw_aio_plug() had been called, but I think returning the > error code up should be handled correctly. What about the cases where > there is no error handling (the other two changes in the patch)? While looking at doing these changes, I realized that I'm not quite sure what the right approach is here. My original rationale for returning non-zero was that AIO was requested but could not be completed. I haven't fully tracked back the calling paths, but I assumed it would get retried at the top level, and since we indicated to not use AIO on subsequent calls, it will succeed and use threads then (note, that I do now realize this means a mismatch between the qemu command-line and the in-use AIO model). In practice, with my v2 patch, where I do return a non-zero error-code from this function, qemu does not exit (nor is any logging other than that I added emitted on the monitor). If I do not fallback, I imagine we would just continuously see this error message and IO might not actually every occur? Reworking all of the callpath to fail on non-zero returns from raw_co_prw() seems like a fair bit of work, but if that is what is being requested, I can try that (it will just take a while). Alternatively, I can produce a v3 quickly that does not bubble the actual errno all the way up (since it does seem like it is ignored anyways?). <snip> > > > + s->use_linux_aio = 0; > > > > Should s->use_linux_aio be a bool instead of an int? > > It is: > > bool use_linux_aio:1; > > would you prefer I did a preparatory patch that converted users to > true/false? Sorry, I misunderstood this -- only my patch does an assignment, so I'll switch to 'false'. Thanks, Nish