* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:
> On 2015/3/11 17:06, Dr. David Alan Gilbert wrote:
> >* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:
> >>Hi Dave,
> >>
> >>Sorry for the late reply :)
> >
> >No problem.
> >
> >>On 2015/3/7 2:30, Dr. David Alan Gilbert wrote:
> >>>* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:
> >>>>On 2015/3/5 21:31, Dr. David Alan Gilbert (git) wrote:
> >>>>>From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> >>>>
> >>>>Hi Dave,
> >>>>
> >>>>>Hi,
> >>>>>  I'm getting COLO running on a couple of our machines here
> >>>>>and wanted to see what was actually going on, so I merged
> >>>>>in my recent rolling-stats code:
> >>>>>
> >>>>>http://lists.gnu.org/archive/html/qemu-devel/2015-03/msg00648.html
> >>>>>
> >>>>>with the following patch, and now on the primary side
> >>>>>'info migrate' shows me:
> >>>>>
> >>>>>capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off colo: on
> >>>>>Migration status: colo
> >>>>>total time: 0 milliseconds
> >>>>>colo checkpoint (ms): Min/Max: 0, 10000 Mean: -1.1415868e-13 (Weighted: 4.3136025e-158) Count: 4020
> >>>>>  Values: 0@1425561742237, 0@1425561742300, 0@1425561742363, 0@1425561742426, 0@1425561742489, 0@1425561742555, 0@1425561742618, 0@1425561742681, 0@1425561742743, 0@1425561742824
> >>>>>colo paused time (ms): Min/Max: 55, 2789 Mean: 63.9 (Weighted: 76.243584) Count: 4019
> >>>>>  Values: 62@1425561742237, 62@1425561742300, 62@1425561742363, 62@1425561742426, 61@1425561742489, 65@1425561742555, 62@1425561742618, 62@1425561742681, 61@1425561742743, 80@1425561742824
> >>>>>colo checkpoint size: Min/Max: 18351, 2.1731606e+08 Mean: 150096.4 (Weighted: 127195.56) Count: 4020
> >>>>>  Values: 211246@1425561742238, 186622@1425561742301, 227662@1425561742364, 219454@1425561742428, 268702@1425561742490, 96334@1425561742556, 47086@1425561742619, 42982@1425561742682, 55294@1425561742744, 145582@1425561742825
> >>>>>
> >>>>>which suggests I've got a problem with the packet comparison; but that's
> >>>>>a separate issue I'll look at.
> >>>>
> >>>>There is an obvious mistake we have made in the proxy: the macro
> >>>>'IPS_UNTRACKED_BIT' in colo-patch-for-kernel.patch should be 14,
> >>>>so please fix it before doing the next test. Sorry for this low-grade
> >>>>mistake; we should have run a full test before sending it out. ;)
> >>>
> >>>No, that's OK; we all make them.
> >>>
> >>>However, that didn't cure my problem; but after a bit of experimentation
> >>>I now have COLO working pretty well; thanks for the help!
> >>>
> >>>  1) I had to disable IPv6 in the guest; it doesn't look like the
> >>>     conntrack is coping with IPv6 ICMPV6, and on our test network
> >>>     we're getting a few 10s of those each second, so it's constant
> >>>     miscompares (they seem to be neighbour broadcasts and multicast
> >>>     stuff).
> >>
> >>Hmm, yes, the proxy code on github does not support ICMPV6 packet
> >>comparison. We will add this in the future.
> >>
> >>>  2) It looks like virtio-net is sending ARPs - possibly every time
> >>>     that a snapshot is loaded; it's not the 'qemu' announce-self code
> >>>     (I added some debug there and it's not being called); and ARPs
> >>>     cause a miscompare - so you get a continuous stream of miscompares,
> >>>     because a miscompare triggers a new snapshot, which sends more ARPs.
> >>>     I solved this by switching to e1000.
> >>
> >>I didn't hit this problem; I used tcpdump to capture the net packets and
> >>did not find any ARPs after the VM load on the slave.
> >
> >Interesting.
> >
> >>Maybe I missed something. Are there any net-related servers/commands
> >>running in the VM?
> >
> >I don't think so, and even if there were, I don't think they would go away
> >by switching to an e1000; I see there is a 'VIRTIO_NET_S_ANNOUNCE' feature
> >in virtio-net, and I suspect it's that which is doing it, but maybe it
> >depends on the guest/host kernels having it enabled?
>
> Er, quite possible. My host kernel is 3.14.0, and the guest is suse11sp3...
I'm running 3.18 on both host and guest (Fedora 20 guest, RHEL7 host but
with a custom kernel).

Dave

> >
> >>And what's your tcpdump command line?
> >
> >just tcpdump -i em4 -n -w outputfile
> >
> >>>  3) The other problem with virtio is that it's occasionally triggering a
> >>>     'virtio: error trying to map MMIO memory' from qemu; I'm not sure
> >>>     why - the state COLO sends over should always be consistent.
> >>>
> >>>  4) With the e1000 setup, connections are generally fairly responsive,
> >>>     but sshing into the guest takes *ages* (10s of seconds). I'm not
> >>>     sure why, because a curl to a web server seems OK (less than a
> >>>     second), and once the ssh is open it's pretty responsive.
> >>
> >>Er, have you tried to ssh into the guest when not in COLO mode? Does it
> >>also take a long time?
> >
> >Not yet; I'm going to try and get some logging on it to find out why.
> >
> >>I have encountered a similar situation where the slave VM is fake-dead:
> >>'info status' says 'running', but the VM cannot respond to keyboard input
> >>from VNC. Maybe there is something wrong with the device status; I will
> >>look into it.
> >>
> >>>  5) I've seen one instance of
> >>>     'qemu-system-x86_64: block/raw-posix.c:836: handle_aiocb_rw:
> >>>     Assertion `p - buf == aiocb->aio_nbytes' failed.'
> >>>     on the primary side.
> >>>
> >>>Stats for a mostly idle guest are now showing:
> >>>
> >>>colo checkpoint (ms): Min/Max: 0, 10004 Mean: 1592.1 (Weighted: 1806.214) Count: 227
> >>>  Values: 1650@1425666160229, 1661@1425666161998, 1662@1425666163736, 1687@1425666165524, 811@1425666166438, 788@1425666167298, 1619@1425666168992, 1699@1425666170793, 2711@1425666173602, 1633@1425666175315
> >>>colo paused time (ms): Min/Max: 58, 2975 Mean: 90.3 (Weighted: 94.109752) Count: 227
> >>>  Values: 107@1425666160337, 75@1425666162074, 100@1425666163837, 102@1425666165627, 71@1425666166510, 74@1425666167373, 101@1425666169094, 97@1425666170891, 79@1425666173682, 97@1425666175413
> >>>colo checkpoint size: Min/Max: 212252, 1.9241972e+08 Mean: 5569622.6 (Weighted: 4826386.5) Count: 227
> >>>  Values: 5998892@1425666160230, 4660988@1425666161999, 6002996@1425666163737, 5945540@1425666165525, 4833356@1425666166439, 5510606@1425666167299, 5793692@1425666168993, 5584388@1425666170794, 7016684@1425666173603, 4349084@1425666175316
> >>>
> >>>So, one checkpoint every ~1.5 seconds; that's just with an
> >>>ssh connected and a script doing a 'curl' to its http server
> >>>repeatedly. Running 'top' over the ssh with a fast refresh
> >>>brings the checkpoints much faster; I guess that's because
> >>>the output of top is quite random.
> >>
> >>Yes, it is a known problem; actually, not only the 'top' command - every
> >>command with random output may result in continuous miscompares.
> >>Besides, the data transferred through SSH is encrypted, which makes
> >>things worse.
> >>
> >>One way to solve this problem may be:
> >>if we detect a continuous stream of miscompares, we fall back to
> >>microcheckpointing mode (periodic checkpoints).
> >
> >Yes, I was going to try and implement that fallback - I've got some ideas
> >to try for it.
> >
> >>>>To be honest, the proxy part on github is not integrated; we have cut
> >>>>it down just for easy review and understanding, so there may be some
> >>>>mistakes.
> >>>
> >>>Yes, that's OK; and I've had a few kernel crashes - normally
> >>>when the qemu crashes, the kernel doesn't really like it;
> >>>but that's OK, I'm sure it will get better.
> >>
> >>Hmm, thanks very much for your feedback; we are making every effort to
> >>improve it... ;)
> >
> >Thanks,
> >
> >Dave
> >
> >>>I added the following to make my debugging easier, which is how
> >>>I found the IPv6 problem.
> >>>
> >>>diff --git a/xt_PMYCOLO.c b/xt_PMYCOLO.c
> >>>index 9e50b62..13c0b48 100644
> >>>--- a/xt_PMYCOLO.c
> >>>+++ b/xt_PMYCOLO.c
> >>>@@ -1072,7 +1072,7 @@ resolve_master_ct(struct sk_buff *skb, unsigned int dataoff,
> >>>     h = nf_conntrack_find_get(&init_net, NF_CT_DEFAULT_ZONE, &tuple);
> >>>
> >>>     if (h == NULL) {
> >>>-        pr_dbg("can't find master's ct for slaver packet\n");
> >>>+        pr_dbg("can't find master's ct for slaver packet (pf/l3num=%d protonum=%d)\n", l3num, protonum);
> >>>         return NULL;
> >>>     }
> >>>
> >>>@@ -1092,7 +1092,7 @@ nf_conntrack_slaver_in(u_int8_t pf, unsigned int hooknum,
> >>>     /* rcu_read_lock()ed by nf_hook_slow */
> >>>     l3proto = __nf_ct_l3proto_find(pf);
> >>>     if (l3proto->get_l4proto(skb, skb_network_offset(skb), &dataoff, &protonum) <= 0) {
> >>>-        pr_dbg("slaver: l3proto not prepared to track yet or error occurred\n");
> >>>+        pr_dbg("slaver: l3proto not prepared to track yet or error occurred (pf=%d)\n", pf);
> >>>         NF_CT_STAT_INC_ATOMIC(&init_net, error);
> >>>         NF_CT_STAT_INC_ATOMIC(&init_net, invalid);
> >>>         goto out;
> >>>
> >>>>Thanks,
> >>>>zhanghailiang
> >>>
> >>>Thanks,
> >>>
> >>>Dave
> >>>
> >>>>>Dave
> >>>>>
> >>>>>Dr. David Alan Gilbert (1):
> >>>>>  COLO: Add primary side rolling statistics
> >>>>>
> >>>>> hmp.c                         | 12 ++++++++++++
> >>>>> include/migration/migration.h |  3 +++
> >>>>> migration/colo.c              | 15 +++++++++++++++
> >>>>> migration/migration.c         | 30 ++++++++++++++++++++++++++++++
> >>>>> qapi-schema.json              | 11 ++++++++++-
> >>>>> 5 files changed, 70 insertions(+), 1 deletion(-)
> >>>
> >>>--
> >>>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> >>
> >--
> >Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> >
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK