On 11/27/2009 10:43 PM, Chris Webb wrote:
Oops, I didn't know qemu-io existed; added that to the patch! I've updated qemu-img.c too. I also noticed that your bdrv_close_all() function didn't do things like close backing images correctly, so I've changed it to call bdrv_close() (which does do the right thing) and reindented in standard qemu style. Hope that's okay.
Thanks, it looks okay!
We're currently not doing any locking in the read-only case, e.g. a backing image (except as a wrapper around bdrv_commit()). Is there a problem with one process accessing an image read-only while another accesses it read-write? If there is, we probably need to arrange to take an exclusive lock on read-write, and a shared lock on read-only so you can have multiple readers, but readers can't coexist with a writer.
Even read-only access is not allowed while another qemu is doing write access. However, I don't think this is a problem for your patch, because Sheepdog allows us to clone images only from snapshot images. This means a backing image is always read-only. If a user specifies a writable Sheepdog VDI as a backing image, qemu-img returns an error.
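For illustration only, the shared/exclusive scheme described in the question (multiple readers, but no reader alongside a writer) can be modelled in a few lines of Python. This is a sketch under stated assumptions, not code from the patch or from Sheepdog: `VdiLockTable`, `SHARED`, `EXCLUSIVE`, and the owner names are all hypothetical.

```python
# Illustrative model only: these names are hypothetical, not Sheepdog or qemu APIs.
SHARED = "shared"        # would correspond to a read-only open
EXCLUSIVE = "exclusive"  # would correspond to a read-write open


class VdiLockTable:
    """Tracks which owners hold a lock on each image (VDI) name."""

    def __init__(self):
        # vdi name -> (mode, set of owner ids)
        self._locks = {}

    def acquire(self, vdi, owner, mode):
        """Return True if the lock was granted, False if it conflicts."""
        held = self._locks.get(vdi)
        if held is None:
            self._locks[vdi] = (mode, {owner})
            return True
        held_mode, owners = held
        # Only shared + shared is compatible; anything involving an
        # exclusive lock is refused.
        if mode == SHARED and held_mode == SHARED:
            owners.add(owner)
            return True
        return False

    def release(self, vdi, owner):
        """Drop the owner's lock; forget the image once nobody holds it."""
        held = self._locks.get(vdi)
        if held is None:
            return
        _, owners = held
        owners.discard(owner)
        if not owners:
            del self._locks[vdi]
```

With this model, two readers can hold the same image at once, a writer is refused until both have released it, and a reader is refused while the writer holds it.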
More generally, I'm a little bit concerned about stray locks. The claims persist until they are explicitly released, even if the connection from the qemu to the sheepdog cluster is terminated. This means that crashing qemu processes, dying hosts, etc. will always leave stale locks. I'm sure this will lead to a cluster maintenance nightmare, especially as qemu is still so sloppy about doing exit(1) all throughout the code whenever something happens that it doesn't like. I appreciate that the sheepdog design means there isn't a single persistent connection which can be used to bound the lifetime of the lock, as you might have with (say) an NBD server. Maybe some sort of heartbeat contact with the qemu process should be required to keep the lock alive?
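The heartbeat idea above amounts to turning the lock into a lease that lapses unless its holder keeps renewing it. The following Python sketch shows that behaviour only; it is not how Sheepdog implements locking, and `LeaseLock`, `ttl_seconds`, and the injectable `clock` parameter are assumptions made for the example.

```python
import time


class LeaseLock:
    """A lock that lapses unless its holder keeps sending heartbeats.

    Hypothetical illustration; not a Sheepdog or qemu API.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the behaviour can be tested
        self._owner = None
        self._expires_at = 0.0

    def _expired(self):
        return self._owner is not None and self._clock() >= self._expires_at

    def acquire(self, owner):
        """Grant the lock if it is free or the previous lease has lapsed."""
        if self._owner is not None and not self._expired():
            # Still validly held: succeeds only for the current holder,
            # and does not extend the lease.
            return self._owner == owner
        self._owner = owner
        self._expires_at = self._clock() + self._ttl
        return True

    def heartbeat(self, owner):
        """Extend the lease; fails if the caller no longer holds the lock."""
        if self._owner != owner or self._expired():
            return False
        self._expires_at = self._clock() + self._ttl
        return True

    def release(self, owner):
        """Explicit release, as a cleanly exiting process would do."""
        if self._owner == owner:
            self._owner = None
```

A process that crashes simply stops calling `heartbeat()`, so its claim disappears after at most `ttl_seconds` without anyone having to clean it up by hand. The cost is that a holder which is alive but stalled for longer than the TTL can lose its lock, so the TTL has to be chosen with that in mind.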
Yes, we must monitor whether VMs are alive so that locks are released in every case. This is on our TODO list. Currently, I am considering the following approach: the Sheepdog design makes each qemu host machine a member of the Sheepdog cluster, so hosts that die unexpectedly can be detected. Therefore, if a crashed qemu can be detected by a local cluster daemon, we can monitor VMs properly.

Regards,
MORITA Kazutaka
-- 
sheepdog mailing list
[email protected]
http://lists.wpkg.org/mailman/listinfo/sheepdog
