[dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base
On Mon, May 09, 2016 at 11:22:04AM -0700, Yuanhan Liu wrote: > On Mon, May 09, 2016 at 04:25:38PM +, Xie, Huawei wrote: > > On 5/7/2016 2:36 PM, Yuanhan Liu wrote: > > > However, Michael claims some concerns: he made a good point: a crash > > > is happening means some memory is corrupted, and it could be the virtio > > > memory being corrupted. In such case, nothing will work without the > > > reset. > > > > I don't get this point. What is the scenario? > > It's not a specific scenario, just a hypothetic one. > > > For the crash of virtio frontend driver, i remember we discussed before, > > we have no good recipe but some workaround. The user space frontend > > driver crashes, and its memory is reallocated to other instances, but > > vhost is still writing to that memory. However this has nothing to do > > with vhost reconnect. > > Hmm, yes, seems like another good point to me. This patch seems like > a fix but not a workaround then :) > > --yliu I think it's a good idea to make sure the ring is already setup by this time though. Some future QEMU version might send base idx before it will setup the ring, we would not want to crash. Hopefully it will have fixed the base value sent by then. -- MST
[dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base
On 5/7/2016 2:36 PM, Yuanhan Liu wrote: > When DPDK app crashes (or quits, or gets killed), and when QEMU supports > reconnecting (patches have been sent, not merged yet), a restart of DPDK > app would get stale vring base from QEMU. That would break the kernel > virtio net completely, making it non-work any more, unless a driver reset > is done. s/DPDK app/DPDK vhost/ DPDK app is confusing > > So, instead of getting the stale vring base from QEMU, Huawei suggested > we could get a proper one from used->idx. > > Cc: "Michael S. Tsirkin" > Suggested-by: "Xie, Huawei" > Signed-off-by: Yuanhan Liu Acked-by: Huawei Xie
[dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base
On 5/7/2016 2:36 PM, Yuanhan Liu wrote: > However, Michael claims some concerns: he made a good point: a crash > is happening means some memory is corrupted, and it could be the virtio > memory being corrupted. In such case, nothing will work without the > reset. I don't get this point. What is the scenario? For the crash of virtio frontend driver, i remember we discussed before, we have no good recipe but some workaround. The user space frontend driver crashes, and its memory is reallocated to other instances, but vhost is still writing to that memory. However this has nothing to do with vhost reconnect.
[dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base
On Fri, May 06, 2016 at 11:40:22PM -0700, Yuanhan Liu wrote: > When DPDK app crashes (or quits, or gets killed), and when QEMU supports > reconnecting (patches have been sent, not merged yet), a restart of DPDK > app would get stale vring base from QEMU. That would break the kernel > virtio net completely, making it non-work any more, unless a driver reset > is done. > > So, instead of getting the stale vring base from QEMU, Huawei suggested > we could get a proper one from used->idx. > > Cc: "Michael S. Tsirkin" > Suggested-by: "Xie, Huawei" > Signed-off-by: Yuanhan Liu > --- > > Note that a right way to handle reconnect from abnormal quit would be > let the guest driver to initiate a reset when reconnect is detected > from QEMU. As a reset from the virtio net driver would resets all virtio > related meta datas, and trigger the vhost-user re-initiation again, > therefore, it would make the reconnect work as expected. > > What's unforunate is that driver reset on reconnect, as the term "driver" > implies, need driver hackings, which could be a linux kernel virtio net > driver, DPDK pmd driver, or even, the windows driver. Apparently, it will > not work so far, or even for a long while, until we are adapted to the > new kernel. > > The fix (or more precisely, the workaround) from this patch would make > reconnect work in most case, including the following two hypothesis that > might corrupt virtio memory: > > - vring_used_elem might be in stale state when we are in the middle of > updating used vring. Say, we might have updated the "id" field, but > leaving "len" untouched. > > - vring desc buff might also be in stale state, when we are copying > data into it. > > The reason it still works is that we haven't updated used->idx yet, > which means guest driver will not start processing those buggy used > entries. Therefore, no issues. > > However, Michael claims some concerns: he made a good point: a crash > is happening means some memory is corrupted, and it could be the virtio > memory being corrupted. In such case, nothing will work without the > reset. > > But I would say, that an app in product use would rare crash, and even > if it crashes, the chance that virtio memory being corrupted would be > relatively smaller then. Besides that, it would work, say when the app > is killed by ctrl-c or kill command. So, it works in the most cases. > But still, I would say it's just a workaround so far. > > On the other hand, without this patch, it would be 100% not working > from abnormal quit. But with this patch, it works in most cases, just > don't work in a rare case that when virtio memory is corrupted. Therefore, > I'd say, it is still good to have, until we have a perfect solution. > --- > lib/librte_vhost/virtio-net.c | 8 > 1 file changed, 8 insertions(+) > > diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c > index c88aaa3..df103aa 100644 > --- a/lib/librte_vhost/virtio-net.c > +++ b/lib/librte_vhost/virtio-net.c > @@ -560,6 +560,14 @@ vhost_set_vring_addr(int vid, struct vhost_vring_addr > *addr) > return -1; > } > > + if (vq->last_used_idx != vq->used->idx) { > + RTE_LOG(WARNING, VHOST_CONFIG, > + "last_used_idx (%u) and vq->used->idx (%u) mismatch\n", > + vq->last_used_idx, vq->used->idx); > + vq->last_used_idx = vq->used->idx; > + vq->last_used_idx_res = vq->used->idx; > + } > + > vq->log_guest_addr = addr->log_guest_addr; > > LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address desc: %p\n", Acked-by: Michael S. Tsirkin And I would go further and just ignore the index coming from QEMU. dpdk processes packets in order so it does not need it. > -- > 1.9.0
[dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base
On 5/9/2016 6:45 PM, Victor Kaplansky wrote: >> diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c >> > index c88aaa3..df103aa 100644 >> > --- a/lib/librte_vhost/virtio-net.c >> > +++ b/lib/librte_vhost/virtio-net.c >> > @@ -560,6 +560,14 @@ vhost_set_vring_addr(int vid, struct vhost_vring_addr >> > *addr) >> >return -1; >> >} >> > >> > + if (vq->last_used_idx != vq->used->idx) { >> > + RTE_LOG(WARNING, VHOST_CONFIG, >> > + "last_used_idx (%u) and vq->used->idx (%u) mismatch\n", > I agree with this approach. I just would add to the log message that > last_user_idx > have overrode by used_idx and some packets may be dropped. For TX, the packets are resent. For RX, the packets are dropped. > >> > + vq->last_used_idx, vq->used->idx); >> > + vq->last_used_idx = vq->used->idx; >> > + vq->last_used_idx_res = vq->used->idx; >> > + } >> > + >> >vq->log_guest_addr = addr->log_guest_addr; >> > >> >LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address desc: %p\n", >> > -- >> > 1.9.0 >> >
[dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base
On Mon, May 09, 2016 at 01:39:25PM +, Xie, Huawei wrote: > On 5/9/2016 6:45 PM, Victor Kaplansky wrote: > >> diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c > >> > index c88aaa3..df103aa 100644 > >> > --- a/lib/librte_vhost/virtio-net.c > >> > +++ b/lib/librte_vhost/virtio-net.c > >> > @@ -560,6 +560,14 @@ vhost_set_vring_addr(int vid, struct > >> > vhost_vring_addr > >> > *addr) > >> > return -1; > >> > } > >> > > >> > +if (vq->last_used_idx != vq->used->idx) { > >> > +RTE_LOG(WARNING, VHOST_CONFIG, > >> > +"last_used_idx (%u) and vq->used->idx (%u) > >> > mismatch\n", > > I agree with this approach. I just would add to the log message that > > last_user_idx > > have overrode by used_idx and some packets may be dropped. Good suggestion. Will do in v2. > For TX, the packets are resent. > For RX, the packets are dropped. Yes. Thanks. --yliu
[dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base
On Mon, May 09, 2016 at 04:25:38PM +, Xie, Huawei wrote: > On 5/7/2016 2:36 PM, Yuanhan Liu wrote: > > However, Michael claims some concerns: he made a good point: a crash > > is happening means some memory is corrupted, and it could be the virtio > > memory being corrupted. In such case, nothing will work without the > > reset. > > I don't get this point. What is the scenario? It's not a specific scenario, just a hypothetic one. > For the crash of virtio frontend driver, i remember we discussed before, > we have no good recipe but some workaround. The user space frontend > driver crashes, and its memory is reallocated to other instances, but > vhost is still writing to that memory. However this has nothing to do > with vhost reconnect. Hmm, yes, seems like another good point to me. This patch seems like a fix but not a workaround then :) --yliu
[dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base
- Original Message - > From: "Yuanhan Liu" > To: dev at dpdk.org > Cc: "huawei xie" , "Yuanhan Liu" linux.intel.com>, "Michael S. Tsirkin" > > Sent: Saturday, May 7, 2016 9:40:22 AM > Subject: [dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base > > When DPDK app crashes (or quits, or gets killed), and when QEMU supports > reconnecting (patches have been sent, not merged yet), a restart of DPDK > app would get stale vring base from QEMU. That would break the kernel > virtio net completely, making it non-work any more, unless a driver reset > is done. > > So, instead of getting the stale vring base from QEMU, Huawei suggested > we could get a proper one from used->idx. > > Cc: "Michael S. Tsirkin" > Suggested-by: "Xie, Huawei" > Signed-off-by: Yuanhan Liu > --- > > Note that a right way to handle reconnect from abnormal quit would be > let the guest driver to initiate a reset when reconnect is detected > from QEMU. As a reset from the virtio net driver would resets all virtio > related meta datas, and trigger the vhost-user re-initiation again, > therefore, it would make the reconnect work as expected. > > What's unforunate is that driver reset on reconnect, as the term "driver" > implies, need driver hackings, which could be a linux kernel virtio net > driver, DPDK pmd driver, or even, the windows driver. Apparently, it will > not work so far, or even for a long while, until we are adapted to the > new kernel. > > The fix (or more precisely, the workaround) from this patch would make > reconnect work in most case, including the following two hypothesis that > might corrupt virtio memory: > > - vring_used_elem might be in stale state when we are in the middle of > updating used vring. Say, we might have updated the "id" field, but > leaving "len" untouched. > > - vring desc buff might also be in stale state, when we are copying > data into it. > > The reason it still works is that we haven't updated used->idx yet, > which means guest driver will not start processing those buggy used > entries. Therefore, no issues. > > However, Michael claims some concerns: he made a good point: a crash > is happening means some memory is corrupted, and it could be the virtio > memory being corrupted. In such case, nothing will work without the > reset. > > But I would say, that an app in product use would rare crash, and even > if it crashes, the chance that virtio memory being corrupted would be > relatively smaller then. Besides that, it would work, say when the app > is killed by ctrl-c or kill command. So, it works in the most cases. > But still, I would say it's just a workaround so far. > > On the other hand, without this patch, it would be 100% not working > from abnormal quit. But with this patch, it works in most cases, just > don't work in a rare case that when virtio memory is corrupted. Therefore, > I'd say, it is still good to have, until we have a perfect solution. > --- > lib/librte_vhost/virtio-net.c | 8 > 1 file changed, 8 insertions(+) > > diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c > index c88aaa3..df103aa 100644 > --- a/lib/librte_vhost/virtio-net.c > +++ b/lib/librte_vhost/virtio-net.c > @@ -560,6 +560,14 @@ vhost_set_vring_addr(int vid, struct vhost_vring_addr > *addr) > return -1; > } > > + if (vq->last_used_idx != vq->used->idx) { > + RTE_LOG(WARNING, VHOST_CONFIG, > + "last_used_idx (%u) and vq->used->idx (%u) mismatch\n", I agree with this approach. I just would add to the log message that last_user_idx have overrode by used_idx and some packets may be dropped. > + vq->last_used_idx, vq->used->idx); > + vq->last_used_idx = vq->used->idx; > + vq->last_used_idx_res = vq->used->idx; > + } > + > vq->log_guest_addr = addr->log_guest_addr; > > LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address desc: %p\n", > -- > 1.9.0 > >
[dpdk-dev] [PATCH 4/6] vhost: workaround stale vring base
When DPDK app crashes (or quits, or gets killed), and when QEMU supports reconnecting (patches have been sent, not merged yet), a restart of DPDK app would get stale vring base from QEMU. That would break the kernel virtio net completely, making it non-work any more, unless a driver reset is done. So, instead of getting the stale vring base from QEMU, Huawei suggested we could get a proper one from used->idx. Cc: "Michael S. Tsirkin" Suggested-by: "Xie, Huawei" Signed-off-by: Yuanhan Liu --- Note that a right way to handle reconnect from abnormal quit would be let the guest driver to initiate a reset when reconnect is detected from QEMU. As a reset from the virtio net driver would resets all virtio related meta datas, and trigger the vhost-user re-initiation again, therefore, it would make the reconnect work as expected. What's unforunate is that driver reset on reconnect, as the term "driver" implies, need driver hackings, which could be a linux kernel virtio net driver, DPDK pmd driver, or even, the windows driver. Apparently, it will not work so far, or even for a long while, until we are adapted to the new kernel. The fix (or more precisely, the workaround) from this patch would make reconnect work in most case, including the following two hypothesis that might corrupt virtio memory: - vring_used_elem might be in stale state when we are in the middle of updating used vring. Say, we might have updated the "id" field, but leaving "len" untouched. - vring desc buff might also be in stale state, when we are copying data into it. The reason it still works is that we haven't updated used->idx yet, which means guest driver will not start processing those buggy used entries. Therefore, no issues. However, Michael claims some concerns: he made a good point: a crash is happening means some memory is corrupted, and it could be the virtio memory being corrupted. In such case, nothing will work without the reset. But I would say, that an app in product use would rare crash, and even if it crashes, the chance that virtio memory being corrupted would be relatively smaller then. Besides that, it would work, say when the app is killed by ctrl-c or kill command. So, it works in the most cases. But still, I would say it's just a workaround so far. On the other hand, without this patch, it would be 100% not working from abnormal quit. But with this patch, it works in most cases, just don't work in a rare case that when virtio memory is corrupted. Therefore, I'd say, it is still good to have, until we have a perfect solution. --- lib/librte_vhost/virtio-net.c | 8 1 file changed, 8 insertions(+) diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c index c88aaa3..df103aa 100644 --- a/lib/librte_vhost/virtio-net.c +++ b/lib/librte_vhost/virtio-net.c @@ -560,6 +560,14 @@ vhost_set_vring_addr(int vid, struct vhost_vring_addr *addr) return -1; } + if (vq->last_used_idx != vq->used->idx) { + RTE_LOG(WARNING, VHOST_CONFIG, + "last_used_idx (%u) and vq->used->idx (%u) mismatch\n", + vq->last_used_idx, vq->used->idx); + vq->last_used_idx = vq->used->idx; + vq->last_used_idx_res = vq->used->idx; + } + vq->log_guest_addr = addr->log_guest_addr; LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address desc: %p\n", -- 1.9.0