Hi Tiwei,

On 01/26/2018 09:03 AM, Tiwei Bie wrote:
On Thu, Jan 25, 2018 at 04:06:53PM +0100, Maxime Coquelin wrote:
In the unlikely case the IOTLB memory pool runs out of memory,
an issue may happen if all entries are used by the IOTLB cache
and an IOTLB miss happens. If the IOTLB pending list is empty,
no memory is freed and the allocation fails a second time.

This patch fixes this by randomly evicting an entry from the IOTLB
cache if the IOTLB pending list is empty, ensuring the second
allocation attempt will succeed.

In the same spirit, the opposite is done when inserting an
IOTLB entry in the IOTLB cache fails because the pool is out of
memory. In this case, the IOTLB pending list is flushed if the
IOTLB cache is empty, to ensure the new entry can be inserted.

Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions")
Fixes: f72c2ad63aeb ("vhost: add pending IOTLB miss request list and helpers")

[...]
@@ -95,9 +98,11 @@ vhost_user_iotlb_pending_insert(struct vhost_virtqueue *vq,
        ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
        if (ret) {
-               RTE_LOG(INFO, VHOST_CONFIG,
-                               "IOTLB pool empty, clear pending misses\n");
-               vhost_user_iotlb_pending_remove_all(vq);
+               RTE_LOG(INFO, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");
+               if (!TAILQ_EMPTY(&vq->iotlb_pending_list))
+                       vhost_user_iotlb_pending_remove_all(vq);
+               else
+                       vhost_user_iotlb_cache_random_evict(vq);
                ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
                if (ret) {
                        RTE_LOG(ERR, VHOST_CONFIG, "IOTLB pool still empty, failure\n");
@@ -186,8 +191,11 @@ vhost_user_iotlb_cache_insert(struct vhost_virtqueue *vq, uint64_t iova,
        ret = rte_mempool_get(vq->iotlb_pool, (void **)&new_node);
        if (ret) {
-               RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, evict one entry\n");
-               vhost_user_iotlb_cache_random_evict(vq);
+               RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");

Maybe we should use the same log level for both cases?
Currently, the first one is INFO, and this one is DEBUG.

Right, I will set DEBUG log level for both.


+               if (!TAILQ_EMPTY(&vq->iotlb_list))
+                       vhost_user_iotlb_cache_random_evict(vq);
+               else
+                       vhost_user_iotlb_pending_remove_all(vq);

The IOTLB entry will be invalidated explicitly if it's
unmapped in the frontend. So all the IOTLB entries in the
IOTLB cache are supposed to be valid. So I think maybe
it's better to always prefer to free memory from the IOTLB
pending list. Something like:

        if (TAILQ_EMPTY(&vq->iotlb_list))
                vhost_user_iotlb_pending_remove_all(vq);
        else if (TAILQ_EMPTY(&vq->iotlb_pending_list))
                vhost_user_iotlb_cache_random_evict(vq);
        else
                vhost_user_iotlb_pending_random_evict(vq);

I agree that all the IOTLB entries in the cache are supposed to
be valid. But removing from the pending list first might not be a
good idea: if a pending entry is valid, it means an IOTLB update is
on its way to the backend, so evicting it could cause another MISS
request to be sent for the same address if the PMD thread retries
the translation afterwards.

Besides, in __vhost_iova_to_vva():

uint64_t
__vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
                    uint64_t iova, uint64_t size, uint8_t perm)
{
        uint64_t vva, tmp_size;

        if (unlikely(!size))
                return 0;

        tmp_size = size;

        vva = vhost_user_iotlb_cache_find(vq, iova, &tmp_size, perm);
        if (tmp_size == size)
                return vva;

        if (!vhost_user_iotlb_pending_miss(vq, iova + tmp_size, perm)) {
                /*
                 * iotlb_lock is read-locked for a full burst,
                 * but it only protects the iotlb cache.
                 * In case of IOTLB miss, we might block on the socket,
                 * which could cause a deadlock with QEMU if an IOTLB update
                 * is being handled. We can safely unlock here to avoid it.
                 */
                vhost_user_iotlb_rd_unlock(vq);

                vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
                vhost_user_iotlb_miss(dev, iova + tmp_size, perm);

vhost_user_iotlb_miss() may fail. If it does, the
inserted pending entry should be removed from the
pending list.

Right, thanks for spotting this.
I will fix this in a separate patch.
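
Something like the following (untested sketch) could do it, reusing the
existing vhost_user_iotlb_pending_remove() helper; the error message and
the 1-byte removal range here are just illustrative, not the final patch:

                vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
                if (vhost_user_iotlb_miss(dev, iova + tmp_size, perm)) {
                        /* Miss request could not be sent to the frontend:
                         * drop the pending entry inserted just above, so a
                         * later retry can re-issue the miss request. */
                        RTE_LOG(ERR, VHOST_CONFIG,
                                "IOTLB miss req failed for IOVA 0x%" PRIx64 "\n",
                                iova + tmp_size);
                        vhost_user_iotlb_pending_remove(vq, iova + tmp_size,
                                        1, perm);
                }

That way the pending list only contains entries for which a miss request
was actually sent, so a later retry of the translation triggers a new
miss instead of waiting for an update that will never come.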

Regards,
Maxime
Best regards,
Tiwei Bie
