On Tue, Nov 18, 2014 at 03:12:12AM -0500, Francesco Romani wrote: > > > +static int coroutine_fn before_write_notify(NotifierWithReturn *notifier, > > > + void *opaque) > > > +{ > > > + BdrvTrackedRequest *req = opaque; > > > + BlockDriverState *bs = req->bs; > > > + int64_t amount = 0; > > > + > > > + assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0); > > > + assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0); > > > > Does the code still make these assumptions or are the asserts left over > > from previous versions of the patch? > > It's a leftover. > I understood they don't hurt and add a bit of safety, but if they are > confusing > I'll remove them.
Yes, it made me wonder why. Probably best to remove them. > > > +void bdrv_set_usage_threshold(BlockDriverState *bs, int64_t > > > threshold_bytes) > > > +{ > > > + BlockDriverState *target_bs = bs; > > > + if (bs->file) { > > > + target_bs = bs->file; > > > + } > > > > Hmm...I think now I understand why you are trying to use bs->file. This > > is an attempt to make image formats work with the threshold. > > > > Unfortunately the BlockDriverState topology can be more complicated than > > just 1 level. > > I thought so but I can't reproduce yet more complex topologies. > Disclosure: I'm testing against the topology I know to be used on oVirt, > lacking > immediate availability of others: suggestions welcome. > > At risk of being redundant, in oVirt the usecase is QCOW2 over lvm block > device, > and we'd like to be notified about the allocation of the lvm block device, > which (IIUC) > is the last bs->file. > > This is a simple topology (unless I'm missing something), and that's > the reason why I go down just one level. > > Of course I want a general solution for this change, so... There is a block driver for error injection called "blkdebug" (see docs/blkdebug.txt). Here is an example of the following topology: raw_bsd (drive0) -> blkdebug -> raw-posix (test.img) qemu-system-x86_64 -drive if=virtio,format=raw,file.driver=blkdebug,file.image.filename=test.img The blkdebug driver is interposing between the raw_bsd (drive0) root and the raw-posix leaf node. > > If we hardcode a strategy to traverse bs->file then it will work in most > > cases: > > > > while (bs->file) { > > bs = bs->file; > > } > > > > But there are cases like VMDK extent files where a BlockDriverState > > actually has multiple children. > > > > One way to solve this is to require that the management tool tells QEMU > > which exact BlockDriverState node the threshold applies to. Then QEMU > > doesn't need any hardcoded policy. But I'm not sure how realistic that > > it at the moment (whether management tools are uses node names for each > > node yet), so it may be best to hardcode the bs->file traversal that > > I've suggested. > > oVirt relies on libvirt[1], and uses device node (e.g. 'vda'). > > BTW, I haven't found a way to inspect from the outside (e.g. monitor command) > the BlockDriverState > topology, there is a way I'm missing? You can get the BlockDriverState and ->backing_hd chain using the query-block QMP command. I'm not aware of a command that returns the full graph of BlockDriverState nodes. The management tool should not need to inspect the graph because the graph can change at runtime (e.g. imagine I/O throttling is implemented as a BlockDriverState node then it could appear/disapper when the feature is activated/deactivated). Instead the management tool should name the nodes it knows about and then use those node names. > Another issue I don't yet have a proper solution is related to this one. > > The management app will have deal with VM with more than one block device > disk, so we need > a way to map the notification with the corresponding device. > > If we descend the bs->file chain, AFAIU the easiest mapping, being the device > name, > is easily lost because only the outermost BlockDriverState has a device name > attached, so when the > notification trigger > bdrv_get_device_name() will return NULL In the worst case a name string can be passed in along with the threshold values. > I believe we don't necessarily need a device name in the notification, for > example something > like an opaque cookie set together with the threshold and returned back with > the notification > will suffice. Of course the device name is nicer :) Agreed.
pgphSm0Y8l855.pgp
Description: PGP signature