Re: [Qemu-block] [PATCH v3 05/10] block: add transactional callbacks feature
On 23.04.2015 02:04, John Snow wrote: The goal here is to add a new method to transactions that allows developers to specify a callback that will get invoked only once all jobs spawned by a transaction are completed, allowing developers the chance to perform actions conditionally pending complete success, partial failure, or complete failure. In order to register the new callback to be invoked, a user must request a callback pointer and closure by calling new_action_cb_wrapper, which creates a wrapper around an opaque pointer and callback that would have originally been passed to e.g. backup_start(). The function will return a function pointer and a new opaque pointer to be passed instead. The transaction system will effectively intercept the original callbacks and perform book-keeping on the transaction after it has delivered the original enveloped callback. This means that Transaction Action callback methods will be called after all callbacks triggered by all Actions in the Transactional group have been received. This feature has no knowledge of any jobs spawned by Actions that do not inform the system via new_action_cb_wrapper(). For an example of how to use the feature, please skip ahead to: 'block: drive_backup transaction callback support' which serves as an example for how to hook up a post-transaction callback to the Drive Backup action. Note 1: Defining a callback method alone is not sufficient to have the new method invoked. You must call new_action_cb_wrapper() AND ensure the callback it returns is the one used as the callback for the job launched by the action. Note 2: You can use this feature for any system that registers completions of an asynchronous task via a callback of the form (void *opaque, int ret), not just block job callbacks. Signed-off-by: John Snow js...@redhat.com --- blockdev.c | 183 +++-- 1 file changed, 179 insertions(+), 4 deletions(-) Reviewed-by: Max Reitz mre...@redhat.com
Re: [Qemu-block] [PATCH v3 08/10] qmp: Add an implementation wrapper for qmp_drive_backup
On 23.04.2015 02:04, John Snow wrote: We'd like to be able to specify the callback given to backup_start manually in the case of transactions, so split apart qmp_drive_backup into an implementation and a wrapper. Switch drive_backup_prepare to use the new wrapper, but don't overload the callback and closure yet. Signed-off-by: John Snow js...@redhat.com --- blockdev.c | 78 +++--- 1 file changed, 59 insertions(+), 19 deletions(-) Reviewed-by: Max Reitz mre...@redhat.com
Re: [Qemu-block] [PATCH v3 10/10] iotests: 124 - transactional failure test
On 23.04.2015 02:04, John Snow wrote: Use a transaction to request an incremental backup across two drives. Coerce one of the jobs to fail, and then re-run the transaction. Verify that no bitmap data was lost due to the partial transaction failure. Signed-off-by: John Snow js...@redhat.com --- tests/qemu-iotests/124 | 120 - tests/qemu-iotests/124.out | 4 +- 2 files changed, 121 insertions(+), 3 deletions(-) Reviewed-by: Max Reitz mre...@redhat.com
Re: [Qemu-block] [PATCH v3 09/10] block: drive_backup transaction callback support
On 23.04.2015 02:04, John Snow wrote: This patch actually implements the transactional callback system for the drive_backup action. (1) We manually pick up a reference to the bitmap if present to allow its cleanup to be delayed until after all drive_backup jobs launched by the transaction have fully completed. (2) We create a functional closure that envelops the original drive_backup callback, to be able to intercept the completion status and return code for the job. (3) We add the drive_backup_cb method for the drive_backup action, which unpacks the completion information and invokes the final cleanup. (4) backup_transaction_complete will perform the final cleanup on the backup job. (5) In the case of transaction cancellation, drive_backup_cb is still responsible for cleaning up the mess we may have already made. Signed-off-by: John Snow js...@redhat.com --- block/backup.c| 9 blockdev.c| 53 --- include/block/block_int.h | 8 +++ 3 files changed, 67 insertions(+), 3 deletions(-) Reviewed-by: Max Reitz mre...@redhat.com
Re: [Qemu-block] [PATCH v3 02/10] iotests: add transactional incremental backup test
On 23.04.2015 02:04, John Snow wrote: Test simple usage cases for using transactions to create and synchronize incremental backups. Signed-off-by: John Snow js...@redhat.com --- tests/qemu-iotests/124 | 54 ++ tests/qemu-iotests/124.out | 4 ++-- 2 files changed, 56 insertions(+), 2 deletions(-) Reviewed-by: Max Reitz mre...@redhat.com
Re: [Qemu-block] [PATCH v6 00/21] block: transactionless incremental backup series
On 04/23/2015 09:19 AM, Stefan Hajnoczi wrote: On Fri, Apr 17, 2015 at 07:49:48PM -0400, John Snow wrote: === v6: === 01: s/underlaying/underlying/ Removed a reference to 'disabled' bitmaps. Touching up inconsistent list indentation. Added FreeBSD Documentation License, primarily to be difficult Please stick to the currently used set of licenses in the future, unless you have a strong reason. It's not a good use of anyone's time to fuss with licenses when we have enough of them in the codebase already. In my non-lawyer opinion the license you chose seems okay but I'd rather avoid the risk and hassle. Thanks, Stefan I know I said primarily to be difficult but I was just being facetious. I didn't find the GPL2+ to be suitable for documentation, strictly, so I went to read up on the documentation licenses that the fsf support/recommend. There's the GNU documentation license, but I found that unsuitable for a couple reasons -- one of them was that you are forbidden(!) from changing the text of the license, and there are some provisions in there I didn't like, such as requiring the full text of the license to be included with compiled copies of the document. That's not something I care about -- a reference in source, for instance, is sufficient, or a copy of the license being distributed *with* the compiled source is fine, but I have no need for the full license to be copied with the compiled version. The other documentation license the fsf recommends is the FreeBSD one, and that one looked appealing, short, and to the point, so it is the one I chose. It is essentially the FreeBSD license with words altered to clarify what code and source means with respect to a document. Sorry for /actually/ being difficult; but Eric Blake was urging me to select a license instead of relying on the implicit GPL, so I did go out of my way to choose one I found appropriate. I stand by my pick. --js
Re: [Qemu-block] [PATCH v6 20/21] iotests: add incremental backup failure recovery test
On Fri, Apr 17, 2015 at 07:50:08PM -0400, John Snow wrote: Test the failure case for incremental backups. Signed-off-by: John Snow js...@redhat.com Reviewed-by: Max Reitz mre...@redhat.com --- tests/qemu-iotests/124 | 57 ++ tests/qemu-iotests/124.out | 4 ++-- 2 files changed, 59 insertions(+), 2 deletions(-) Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgpEXpOJWS7ep.pgp Description: PGP signature
Re: [Qemu-block] [PATCH v6 21/21] iotests: add incremental backup granularity tests
On Fri, Apr 17, 2015 at 07:50:09PM -0400, John Snow wrote: Test what happens if you fiddle with the granularity. Reviewed-by: Max Reitz mre...@redhat.com Signed-off-by: John Snow js...@redhat.com --- tests/qemu-iotests/124 | 58 +- tests/qemu-iotests/124.out | 4 ++-- 2 files changed, 49 insertions(+), 13 deletions(-) Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgpvKb06jALns.pgp Description: PGP signature
[Qemu-block] [Question] the problem about driver-mirror
I test the function of driver-mirror when run io stress test in vm. The fio's model is 128 iodepth, 4K block-size, 100% write, 100% randon. This is a continuous test, and then I start drive-mirror to copy data to other image file. I find this job can't finish. I analysis the problem and find the dirty bitmap is always set, the job can't converge. Any ideas to solve this problem? Thanks
Re: [Qemu-block] [Qemu-devel] [PATCH v6 00/21] block: transactionless incremental backup series
On 04/23/2015 03:18 PM, Eric Blake wrote: On 04/23/2015 08:41 AM, John Snow wrote: I know I said primarily to be difficult but I was just being facetious. I didn't find the GPL2+ to be suitable for documentation, strictly, so I went to read up on the documentation licenses that the fsf support/recommend. There's the GNU documentation license, but I found that unsuitable for a couple reasons -- one of them was that you are forbidden(!) from changing the text of the license, Note that it is usually only the license text proper that is locked down; the rest of the documentation is not under the same restriction unless you declare specific invariant sections such as a cover page. But I know that the Debian project has typically frowned upon any use of FDL with invariant sections, and the FDL has therefore earned a somewhat questionable reputation outside of FSF projects. Understood; however the GNU FDL specifies within the license where and how the GNU FDL must be displayed. I didn't like these requirements, and might've used the FDL, but you are prohibited from altering the license, so I chose against this license. It's too restrictive for me. and there are some provisions in there I didn't like, such as requiring the full text of the license to be included with compiled copies of the document. That's not something I care about -- a reference in source, for instance, is sufficient, or a copy of the license being distributed *with* the compiled source is fine, but I have no need for the full license to be copied with the compiled version. Yes, I like those benefits of the FreeBSD Documentation License. The other documentation license the fsf recommends is the FreeBSD one, and that one looked appealing, short, and to the point, so it is the one I chose. It is essentially the FreeBSD license with words altered to clarify what code and source means with respect to a document. In particular, according to the FSF, the FreeBSD Documentation License _should be_ acceptable for use with a GPLv2 program: https://www.gnu.org/philosophy/license-list.html#FreeDocumentationLicenses although this is probably not the right list to get a definitive answer from a lawyer familiar with the various copyright licenses and laws. Sorry for /actually/ being difficult; but Eric Blake was urging me to select a license instead of relying on the implicit GPL, so I did go out of my way to choose one I found appropriate. I stand by my pick. I also agree with the pick; I think that GPLv2+ on documentation is a bit questionable - if someone else implements the same interface using just the documentation, is their code required to be under the GPL by virtue of using the documentation? Using a more permissive documentation license feels nicer to me, as it would allow non-GPL implementations if someone is so inclined. Sorry if encouraging the issue has made matters more difficult. It's too late! You've opened Pandora's Box!
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 04/23/2015 05:26 PM, Paolo Bonzini wrote: On 23/04/2015 11:00, Kevin Wolf wrote: Because it may be the right design. If you're really worried about the test matrix, put a check in the filter block driver that its bs-file is qcow2. Of course, such an artificial restriction looks a bit ugly, but using a bad design just in order to get the same restriction is even worse. Stefan originally wanted to put image streaming in the QED driver. I think we'll agree today that it was right to reject that. It's simply not functionality related to the format. Adding replication logic to qcow2 looks similar to me in that respect. Yes, I can't deny it is similar. Still, there is a very important difference: limiting colo's internal workings to qcow2 or NBD doesn't limit what the user can do (while streaming limited the user to image files in QED format). It may also depend on how the patches look like and how much the colo code relies on other internal state. For NBD the answer is almost nothing, and you don't even need a filter driver. You only need to separate sharply the configure and open phases. So it may indeed be possible to generalize the handling of the secondary to non-NBD. It may be the same for the primary; I admit I haven't even tried to read the qcow2 patch, as I couldn't do a meaningful review. For qcow2, we need to read/write from NBD target directly after failover, because the cache image(the format is qcow2) may be put in ramfs to get better performance. The other thing is not changed. For qcow2, if we use a filter driver, the bs-file-drv should support backing file, and make_empty. So it can be the other format. Thanks Wen Congyang Paolo .
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 23/04/2015 11:00, Kevin Wolf wrote: Because it may be the right design. If you're really worried about the test matrix, put a check in the filter block driver that its bs-file is qcow2. Of course, such an artificial restriction looks a bit ugly, but using a bad design just in order to get the same restriction is even worse. Stefan originally wanted to put image streaming in the QED driver. I think we'll agree today that it was right to reject that. It's simply not functionality related to the format. Adding replication logic to qcow2 looks similar to me in that respect. Yes, I can't deny it is similar. Still, there is a very important difference: limiting colo's internal workings to qcow2 or NBD doesn't limit what the user can do (while streaming limited the user to image files in QED format). It may also depend on how the patches look like and how much the colo code relies on other internal state. For NBD the answer is almost nothing, and you don't even need a filter driver. You only need to separate sharply the configure and open phases. So it may indeed be possible to generalize the handling of the secondary to non-NBD. It may be the same for the primary; I admit I haven't even tried to read the qcow2 patch, as I couldn't do a meaningful review. Paolo
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On Wed, Apr 22, 2015 at 05:28:01PM +0800, Wen Congyang wrote: On 04/22/2015 05:18 PM, Stefan Hajnoczi wrote: On Tue, Apr 21, 2015 at 05:28:01PM +0200, Paolo Bonzini wrote: On 21/04/2015 03:25, Wen Congyang wrote: Please do not introduce name+colo block drivers. This approach is invasive and makes block replication specific to only a few block drivers, e.g. NBD or qcow2. NBD is used to connect to secondary qemu, so it must be used. But the primary qemu uses quorum, so the primary disk can be any format. The secondary disk is nbd target, and it can also be any format. The cache disk(active disk/hidden disk) is an empty disk, and it is created before run COLO. The cache disk format is qcow2 now. In theory, it can be ant format which supports backing file. But the driver should be updated to support colo mode. A cleaner approach is a QMP command or -drive options that work for any BlockDriverState. OK, I will add a new drive option to avoid use name+colo. Actually I liked the foo+colo names. These are just internal details of the implementations and the primary/secondary disks actually can be any format. Stefan, what was your worry with the +colo block drivers? Why does NBD need to know about COLO? It should be possible to use iSCSI or other protocols too. Hmm, if you want to use iSCSI or other protocols, you should update the driver to implement block replication's control interface. Currently, we only support nbd now. I took a quick look at the NBD patches in this series, it looks like they are a hacky way to make quorum dynamically reconfigurable. In other words, what you really need is a way to enable/disable a quorum child or even add/remove children at run-time. NBD is not the right place to implement that. Add APIs to quorum so COLO code can use them. Or maybe I'm misinterpreting the patches, I only took a quick look... Stefan pgplmUuucC3yz.pgp Description: PGP signature
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 23/04/2015 12:17, Kevin Wolf wrote: Perhaps quorum is not a great match after all, and it's better to add a new colo driver similar to quorum but simpler and only using the read policy that you need for colo. The new driver would also know how to use BDRV_O_NO_CONNECT. In any case the amount of work needed would not be too big. I thought the same, but haven't looked at the details yet. But if I understand correctly, the plan is to take quorum and add options to turn off the functionality of using a quorum - that's a bit odd. Yes, indeed. Quorum was okay for experimenting, now it's better to cp quorum.c colo.c and clean up the code instead of adding options to quorum. There's not going to be more duplication between quorum.c and colo.c than, say, between colo.c and blkverify.c. What I think is really needed here is essentially an active mirror filter. Yes, an active synchronous mirror. It can be either a filter or a device. Has anyone ever come up with a design for filters? Colo doesn't need much more complexity than a toy blkverify filter. Paolo
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 23/04/2015 12:40, Kevin Wolf wrote: The question that is still open for me is whether it would be a colo.c or an active-mirror.c, i.e. if this would be tied specifically to COLO or if it could be kept generic enough that it could be used for other use cases as well. Understood (now). What I think is really needed here is essentially an active mirror filter. Yes, an active synchronous mirror. It can be either a filter or a device. Has anyone ever come up with a design for filters? Colo doesn't need much more complexity than a toy blkverify filter. I think what we're doing now for quorum/blkverify/blkdebug is okay. The tricky and yet unsolved part is how to add/remove filter BDSes at runtime (dynamic reconfiguration), but IIUC that isn't needed here. Yes, it is. The defer connection to NBD when replication is started is effectively add the COLO filter (with the NBD connection as a children) when replication is started. Similarly close the NBD device when replication is stopped is effectively remove the COLO filter (which brings the NBD connection down with it). Paolo
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 04/23/2015 05:55 PM, Stefan Hajnoczi wrote: On Wed, Apr 22, 2015 at 05:28:01PM +0800, Wen Congyang wrote: On 04/22/2015 05:18 PM, Stefan Hajnoczi wrote: On Tue, Apr 21, 2015 at 05:28:01PM +0200, Paolo Bonzini wrote: On 21/04/2015 03:25, Wen Congyang wrote: Please do not introduce name+colo block drivers. This approach is invasive and makes block replication specific to only a few block drivers, e.g. NBD or qcow2. NBD is used to connect to secondary qemu, so it must be used. But the primary qemu uses quorum, so the primary disk can be any format. The secondary disk is nbd target, and it can also be any format. The cache disk(active disk/hidden disk) is an empty disk, and it is created before run COLO. The cache disk format is qcow2 now. In theory, it can be ant format which supports backing file. But the driver should be updated to support colo mode. A cleaner approach is a QMP command or -drive options that work for any BlockDriverState. OK, I will add a new drive option to avoid use name+colo. Actually I liked the foo+colo names. These are just internal details of the implementations and the primary/secondary disks actually can be any format. Stefan, what was your worry with the +colo block drivers? Why does NBD need to know about COLO? It should be possible to use iSCSI or other protocols too. Hmm, if you want to use iSCSI or other protocols, you should update the driver to implement block replication's control interface. Currently, we only support nbd now. I took a quick look at the NBD patches in this series, it looks like they are a hacky way to make quorum dynamically reconfigurable. In other words, what you really need is a way to enable/disable a quorum child or even add/remove children at run-time. NBD is not the right place to implement that. Add APIs to quorum so COLO code can use them. Or maybe I'm misinterpreting the patches, I only took a quick look... Hmm, if we can enable/disable or add/remove a child at run-time, it is another choice. Thanks Wen Congyang Stefan
Re: [Qemu-block] [PATCH] qcow2: do lazy allocation of the L2 cache
On Wed, Apr 22, 2015 at 04:37:15PM +0200, Alberto Garcia wrote: On Wed 22 Apr 2015 12:26:02 PM CEST, Stefan Hajnoczi wrote: Large disk images need large L2 caches in order to maximize their I/O performance. However setting a correct size for the cache is not necessarily easy since apart from the image size, it also depends on other factors like its usage patterns or whether it's part of a backing chain. In order to be able to set a very large cache size to cover the worst-case scenarios and yet prevent an unnecessary waste of memory, this patch modifies the qcow2 cache algorithm so the memory for each entry is allocated only when it's actually needed. This also improves the scenarios with smaller images: the current default L2 cache size can map a whole 8GB disk, so those smaller than that are allocating cache memory that can never be used. What measurable improvement does this patch make? Buffers allocated upfront are not touched, so they don't actually consume physical memory. For a cache size of 128MB, the PSS is actually ~10MB larger without the patch, which seems to come from posix_memalign(). Do you mean RSS or are you using a tool that reports a PSS number that I don't know about? We should understand what is going on instead of moving the code around to hide/delay the problem. Stefan pgpihllvMsUoj.pgp Description: PGP signature
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 04/23/2015 05:00 PM, Kevin Wolf wrote: Am 22.04.2015 um 12:12 hat Paolo Bonzini geschrieben: On 22/04/2015 11:31, Kevin Wolf wrote: Actually I liked the foo+colo names. These are just internal details of the implementations and the primary/secondary disks actually can be any format. Stefan, what was your worry with the +colo block drivers? I haven't read the patches yet, so I may be misunderstanding, but wouldn't a separate filter driver be more appropriate than modifying qcow2 with logic that has nothing to do with the image format? Possibly; on the other hand, why multiply the size of the test matrix with options that no one will use and that will bitrot? Because it may be the right design. If you're really worried about the test matrix, put a check in the filter block driver that its bs-file is qcow2. Of course, such an artificial restriction looks a bit ugly, but using a bad design just in order to get the same restriction is even worse. The bs-file-driver should support backing file, and use backing reference already. What about the primary side? We should control when to connect to NBD server, not in nbd_open(). Thanks Wen Congyang Stefan originally wanted to put image streaming in the QED driver. I think we'll agree today that it was right to reject that. It's simply not functionality related to the format. Adding replication logic to qcow2 looks similar to me in that respect. Kevin .
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 23/04/2015 11:14, Wen Congyang wrote: The bs-file-driver should support backing file, and use backing reference already. What about the primary side? We should control when to connect to NBD server, not in nbd_open(). My naive suggestion could be to add a BDRV_O_NO_CONNECT option to bdrv_open and a separate bdrv_connect callback. Open would fail if BDRV_O_NO_CONNECT is specified and drv-bdrv_connect is NULL. You would then need a way to have quorum pass BDRV_O_NO_CONNECT. Perhaps quorum is not a great match after all, and it's better to add a new colo driver similar to quorum but simpler and only using the read policy that you need for colo. The new driver would also know how to use BDRV_O_NO_CONNECT. In any case the amount of work needed would not be too big. Paolo
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
* Paolo Bonzini (pbonz...@redhat.com) wrote: On 23/04/2015 14:05, Dr. David Alan Gilbert wrote: As presented at the moment, I don't see there's any dynamic reconfiguration on the primary side at the moment So that means the bdrv_start_replication and bdrv_stop_replication callbacks are more or less redundant, at least on the primary? In fact, who calls them? Certainly nothing in this patch set... :) In the main colo set (I'm looking at the February version) there are calls to them, the 'stop_replication' is called at failover time. Here is I think the later version: http://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg05391.html Dave Paolo - it starts up in the configuration with the quorum(disk, NBD), and that's the way it stays throughout the fault-tolerant setup; the primary doesn't start running until the secondary is connected. Similarly the secondary startups in the configuration and stays that way; the interesting question to me is what happens after a failure. If the secondary fails, then your primary is still quorum(disk, NBD) but the NBD side is dead - so I don't think you need to do anything there immediately. If the primary fails, and the secondary takes over, then a lot of the stuff on the secondary now becomes redundent; does that stay the same and just operate in some form of passthrough - or does it need to change configuration? The hard part to me is how to bring it back into fault-tolerance now; after a primary failure, the secondary now needs to morph into something like a primary, and somehow you need to bring up a new secondary and get that new secondary an image of the primaries current disk. -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 23/04/2015 14:19, Dr. David Alan Gilbert wrote: So that means the bdrv_start_replication and bdrv_stop_replication callbacks are more or less redundant, at least on the primary? In fact, who calls them? Certainly nothing in this patch set... :) In the main colo set (I'm looking at the February version) there are calls to them, the 'stop_replication' is called at failover time. Here is I think the later version: http://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg05391.html I think the primary shouldn't do any I/O after failover (and the secondary should close the NBD server) so it is probably okay to ignore the removal for now. Inserting the filter dynamically is probably needed though. Paolo
Re: [Qemu-block] [PATCH v6 00/21] block: transactionless incremental backup series
On Fri, Apr 17, 2015 at 07:49:48PM -0400, John Snow wrote: === v6: === 01: s/underlaying/underlying/ Removed a reference to 'disabled' bitmaps. Touching up inconsistent list indentation. Added FreeBSD Documentation License, primarily to be difficult Please stick to the currently used set of licenses in the future, unless you have a strong reason. It's not a good use of anyone's time to fuss with licenses when we have enough of them in the codebase already. In my non-lawyer opinion the license you chose seems okay but I'd rather avoid the risk and hassle. Thanks, Stefan pgpPyTMJK68Iw.pgp Description: PGP signature
Re: [Qemu-block] [PATCH v6 06/21] hbitmap: cache array lengths
On Fri, Apr 17, 2015 at 07:49:54PM -0400, John Snow wrote: As a convenience: between incremental backups, bitmap migrations and bitmap persistence we seem to need to recalculate these a lot. Because the lengths are a little bit-twiddly, let's just solidly cache them and be done with it. Reviewed-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com Signed-off-by: John Snow js...@redhat.com --- util/hbitmap.c | 4 1 file changed, 4 insertions(+) Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgpJFTFui6cyd.pgp Description: PGP signature
Re: [Qemu-block] [PATCH v6 15/21] block: Resize bitmaps on bdrv_truncate
On Fri, Apr 17, 2015 at 07:50:03PM -0400, John Snow wrote: Signed-off-by: John Snow js...@redhat.com --- block.c| 18 ++ include/qemu/hbitmap.h | 10 ++ util/hbitmap.c | 48 3 files changed, 76 insertions(+) Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgpqDBe_ud8NM.pgp Description: PGP signature
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 23/04/2015 13:36, Kevin Wolf wrote: Crap. Then we need to figure out dynamic reconfiguration for filters (CCed Markus and Jeff). And this is really part of the fundamental operation mode and not just a way to give users a way to change their mind at runtime? Because if it were, we could go forward without that for the start and add dynamic reconfiguration in a second step. I honestly don't know. Wen, David? Paolo Anyway, even if we move it to a second step, it looks like we need to design something rather soon now.
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
* Paolo Bonzini (pbonz...@redhat.com) wrote: On 23/04/2015 13:36, Kevin Wolf wrote: Crap. Then we need to figure out dynamic reconfiguration for filters (CCed Markus and Jeff). And this is really part of the fundamental operation mode and not just a way to give users a way to change their mind at runtime? Because if it were, we could go forward without that for the start and add dynamic reconfiguration in a second step. I honestly don't know. Wen, David? As presented at the moment, I don't see there's any dynamic reconfiguration on the primary side at the moment - it starts up in the configuration with the quorum(disk, NBD), and that's the way it stays throughout the fault-tolerant setup; the primary doesn't start running until the secondary is connected. Similarly the secondary startups in the configuration and stays that way; the interesting question to me is what happens after a failure. If the secondary fails, then your primary is still quorum(disk, NBD) but the NBD side is dead - so I don't think you need to do anything there immediately. If the primary fails, and the secondary takes over, then a lot of the stuff on the secondary now becomes redundent; does that stay the same and just operate in some form of passthrough - or does it need to change configuration? The hard part to me is how to bring it back into fault-tolerance now; after a primary failure, the secondary now needs to morph into something like a primary, and somehow you need to bring up a new secondary and get that new secondary an image of the primaries current disk. Dave Paolo Anyway, even if we move it to a second step, it looks like we need to design something rather soon now. -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 04/23/2015 06:44 PM, Paolo Bonzini wrote: On 23/04/2015 12:40, Kevin Wolf wrote: The question that is still open for me is whether it would be a colo.c or an active-mirror.c, i.e. if this would be tied specifically to COLO or if it could be kept generic enough that it could be used for other use cases as well. Understood (now). What I think is really needed here is essentially an active mirror filter. Yes, an active synchronous mirror. It can be either a filter or a device. Has anyone ever come up with a design for filters? Colo doesn't need much more complexity than a toy blkverify filter. I think what we're doing now for quorum/blkverify/blkdebug is okay. The tricky and yet unsolved part is how to add/remove filter BDSes at runtime (dynamic reconfiguration), but IIUC that isn't needed here. Yes, it is. The defer connection to NBD when replication is started is effectively add the COLO filter (with the NBD connection as a children) when replication is started. Similarly close the NBD device when replication is stopped is effectively remove the COLO filter (which brings the NBD connection down with it). Hmm, I don't understand it clearly. Do you mean: 1. COLO filter is quorum's child 2. We can add/remove quorum's child at run-time. If I misunderstand something, please correct me. Thanks Wen Congyang Paolo .
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
Am 23.04.2015 um 12:44 hat Paolo Bonzini geschrieben: On 23/04/2015 12:40, Kevin Wolf wrote: The question that is still open for me is whether it would be a colo.c or an active-mirror.c, i.e. if this would be tied specifically to COLO or if it could be kept generic enough that it could be used for other use cases as well. Understood (now). What I think is really needed here is essentially an active mirror filter. Yes, an active synchronous mirror. It can be either a filter or a device. Has anyone ever come up with a design for filters? Colo doesn't need much more complexity than a toy blkverify filter. I think what we're doing now for quorum/blkverify/blkdebug is okay. The tricky and yet unsolved part is how to add/remove filter BDSes at runtime (dynamic reconfiguration), but IIUC that isn't needed here. Yes, it is. The defer connection to NBD when replication is started is effectively add the COLO filter (with the NBD connection as a children) when replication is started. Similarly close the NBD device when replication is stopped is effectively remove the COLO filter (which brings the NBD connection down with it). Crap. Then we need to figure out dynamic reconfiguration for filters (CCed Markus and Jeff). And this is really part of the fundamental operation mode and not just a way to give users a way to change their mind at runtime? Because if it were, we could go forward without that for the start and add dynamic reconfiguration in a second step. Anyway, even if we move it to a second step, it looks like we need to design something rather soon now. Kevin
Re: [Qemu-block] [PATCH for-2.4 V2 0/9] various improvements for the iSCSI driver
On Thu, Apr 16, 2015 at 04:08:24PM +0200, Peter Lieven wrote: v1-v2: - removed the requirement for libiscsi 1.10.0 [Paolo] - reworked to force_next_flush logic [Paolo] Peter Lieven (9): block/iscsi: do not forget to logout from target block/iscsi: change all iscsilun properties from uint8_t to bool block/iscsi: rename iscsi_write_protected and let it return void block/iscsi: store DPOFUA bit from the modesense command block/iscsi: optimize WRITE10/16 if cache.writeback is not set block/iscsi: increase retry count block/iscsi: handle SCSI_STATUS_TASK_SET_FULL block/iscsi: bump year in copyright notice block/iscsi: use the allocationmap also if cache.direct=on block/iscsi.c | 64 --- 1 file changed, 44 insertions(+), 20 deletions(-) -- 1.9.1 Thanks, applied to my block-next tree: https://github.com/stefanha/qemu/commits/block-next Stefan pgpXUlR2Tvxqi.pgp Description: PGP signature
Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description
On 04/24/2015 10:01 AM, Fam Zheng wrote: On Thu, 04/23 14:23, Paolo Bonzini wrote: On 23/04/2015 14:19, Dr. David Alan Gilbert wrote: So that means the bdrv_start_replication and bdrv_stop_replication callbacks are more or less redundant, at least on the primary? In fact, who calls them? Certainly nothing in this patch set... :) In the main colo set (I'm looking at the February version) there are calls to them, the 'stop_replication' is called at failover time. Here is I think the later version: http://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg05391.html I think the primary shouldn't do any I/O after failover (and the secondary should close the NBD server) so it is probably okay to ignore the removal for now. Inserting the filter dynamically is probably needed though. Or maybe just enabling/disabling? Hmm, after failover, the secondary qemu should become primary qemu, but we don't know the nbd server's IP/port when we execute the secondary qemu. So we need to inserting nbd client dynamically after failover. Thanks Wen Congyang Fam .