Re: [ovs-dev] [RFC PATCH v1] dpdk: Support both shared and per port mempools.

Lam, Tiago Sun, 27 May 2018 10:05:48 -0700

Hi Kevin,

On 25/05/2018 18:22, Kevin Traynor wrote:
> Hi Tiago,
>


[snip]

>>> +
>>> +static struct dpdk_mp *
>>> +dpdk_mp_get(struct netdev_dpdk *dev, int mtu, bool per_port_mp)
>>> +{
>>> +    struct dpdk_mp *dmp;
>>> +    bool reuse = false;
>>>  
>>>      ovs_mutex_lock(&dpdk_mp_mutex);
>>> -    if (dpdk_mp_full(mp)) {
>>> -        VLOG_DBG("Freeing mempool \"%s\"", mp->name);
>>> -        rte_mempool_free(mp);
>>> -    } else {
>>> -        struct dpdk_mp *dmp;
>>> +    /* Check if shared mempools are being used, if so check existing 
>>> mempools
>>> +     * to see if reuse is possible. */
>>> +    if (!per_port_mp) {
>>> +        LIST_FOR_EACH (dmp, list_node, &dpdk_mp_list) {
>>> +            if (dmp->socket_id == dev->requested_socket_id
>>> +                && dmp->mtu == mtu) {
>>> +                VLOG_DBG("Reusing mempool \"%s\"", dmp->mp->name);
>>> +                dmp->refcount++;
>>> +                reuse = true;
>>> +                break;
>>> +            }
>>> +        }
>>> +    }
>>> +    /* Sweep mempools after reuse or before create. */
>>> +    dpdk_mp_sweep();
>>
>> *Should dpdk_mp_sweep() be called when destroying an interface as well,
>> in common_destruct()? While testing, I've noticed that if I add one port
>> and delete the same port the mempool will still be allocated until you
>> add another port, since dpdk_mp_sweep() will only be called on the next
>> call to dpdp_mp_get(). This could potentially create problems in the
>> per-port model where one deletes a certain number of ports and can't add
>> any other ports because there's (hanging) mempools holding memory.
>>
> 
> With the shared model, it was deliberate to leave the dpdk_mp_sweep() as
> late as possible to give more time for buffers to be returned to it and
> to allow for the possibility that it might be reused. In order to not
> require any additional memory, it is done just before a new mempool is
> created.
> 

Thanks for giving some context, I also found that to be the case while
testing with both models.

> It's not a problem to call dpdk_mp_sweep() anytime but not sure I follow
> the issue you've raised - sweep would still be called before mempools
> for any new port are created. Is there a case missed?
> 

Maybe I haven't explained myself as clearly as I wanted to. I wouldn't
say it is a missed case as much as a consequence of what you just
mentioned - "leave the dpdk_mp_sweep() as late as possible", but I
thought it would be best to flag it just so we are aware.

Let's say you are using the per-port model and you add port X, which
gets mempool A allocated to it. And let's say you immediately delete
that port X. What I was seeing while testing is that mempool A is not
freed right away, only when a new port, let's say port Y, is added. This
also goes in-line with what you just mentioned.

Now, my point is that this could potentially become an issue in the
per-port model, since it doesn't reuse the mempools. If you add a bunch
of ports and then delete them in a batch (or maybe this is not a
use-case?), if you try to add more ports you may find yourself out of
memory, because the mempools haven't actually been freed when deleting
the ports.

>>> +
>>> +/* Depending on the memory model being used this function tries
>>> + * identify and reuse an existing mempool or tries to allocate new
>>> + * mempool on requested_socket_id with mbuf size corresponding to
>>> + * requested_mtu. On success new configuration will be applied.
>>>   * On error, device will be left unchanged. */
>>>  static int
>>>  netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
>>>      OVS_REQUIRES(dev->mutex)
>>>  {
>>>      uint32_t buf_size = dpdk_buf_size(dev->requested_mtu);
>>> -    struct rte_mempool *mp;
>>> +    struct dpdk_mp *mp;
>>>      int ret = 0;
>>> +    bool per_port_mp = dpdk_per_port_mempool_enabled();
>>>  
>>> -    dpdk_mp_sweep();
>>> +    /* With shared mempools we do not need to configure a mempool if the 
>>> MTU
>>> +     * and socket ID have not changed, the previous configuration is still
>>> +     * valid so return 0 */
>>> +    if (dev->mtu == dev->requested_mtu
>>> +        && dev->socket_id == dev->requested_socket_id
>>> +        && (!per_port_mp)) {
>>
>> I also find that moving the `(!per_port_mp)` condition to the beginning
>> improves readability here. It even goes in hand with your comment just
>> above - "With shared mempools we do not ...".
>>
>>> +        return ret;
>>> +    }
>>>  
>>> -    mp = dpdk_mp_create(dev, FRAME_LEN_TO_MTU(buf_size));
>>> +    mp = dpdk_mp_get(dev, FRAME_LEN_TO_MTU(buf_size), per_port_mp);
>>>      if (!mp) {
>>>          VLOG_ERR("Failed to create memory pool for netdev "
>>>                   "%s, with MTU %d on socket %d: %s\n",
>>>                   dev->up.name, dev->requested_mtu, 
>>> dev->requested_socket_id,
>>> -                 rte_strerror(rte_errno));
>>> -        ret = rte_errno;
>>> +        rte_strerror(rte_errno));
>>
>> Missing indentation here.
>>
>>> +        return rte_errno;
>>>      } else {
>>> -        /* If a new MTU was requested and its rounded value equals the one
>>> -         * that is currently used, then the existing mempool is returned. 
>>> */
>>> -        if (dev->mp != mp) {
>>> -            /* A new mempool was created, release the previous one. */
>>> -            dpdk_mp_release(dev->mp);
>>> -        } else {
>>> -            ret = EEXIST;
>>> +        /* Check for any pre-existing dpdk_mp for the device */
>>> +        if (dev->dpdk_mp != NULL) {
>>> +            /* If a new MTU was requested and its rounded value equals the
>>> +             * one that is currently used, then the existing mempool is
>>> +             * returned. */
>>> +            if (dev->dpdk_mp->mp != mp->mp) {
>>> +                /* A new mempool was created, release the previous one. */
>>> +                dpdk_mp_put(dev->dpdk_mp);
>>> +            } else {
>>> +                /* If the mempool already exists in the current dpdk_mp 
>>> then
>>> +                 * we need to ensure dpdk_mp that was just created is 
>>> freed in
>>> +                 * the next sweep as it will not be used. */
>> The code path around the `else` block here will only be called when
>> `!per_port_mp`. This is because `dpdk_mp_get()` will only return an
>> already existing mempool when using the shared model. Otherwise a new
>> one is always returned, and thus the `if (dev->dpdk_mp->mp != mp->mp)`
>> will be true. Am I reading this right? If so then refactoring this a bit
>> to differentiate on `per_port_mp` might help on readability - this goes
>> in-line with Kevin's comment about making this piece a bit more readable.
>>
>> On the same note, the comment above mislead me to think that the
>> allocated `mp` is being freed, which would result in error since the
>> same `mp` is then assigned below. Instead, what it is doing is
>> decrementing the refcount in struct dpdk_mp, which might end up being
>> freed on the next dpdk_mp_sweep() if `refcount=0`. But that won't happen
>> on the shared model unless no port is using the mempool.
>>
> 
> but this port is using it :-)
> 
> +            if (dev->dpdk_mp->mp != mp->mp) {
> 
> is false, which means they both have the same mempool. The refcount in
> dpdk_mp only counts user of the dpdk_mp, not the actual mp, so it could
> then be freed while still needed.
> 

I'm not sure I understand your point entirely. It is indeed true that
both `mp`s will be equal and thus `dpdk_mp_put()` will be called,
decrementing the `refcount`. But `refcount` shouldn't get to 0 (and thus
the mempool shouldn't be freed during sweep) as the previous call to
`dpdk_mp_get()` is also incrementing `refcount` by 1.

(In fact, the `else` block seems to be here only to decrement the
`refcount` when a port has been reconfigured and ends up reusing the
same mempool it was using before. In that case, the previous call to
`dpdk_mp_get()` is "wrongly" incrementing `refcount` by 1.)

I also tried to crash the patch by exercising this part of the code with
the following:
1. Add one port - `refcount` here was 1;
2. Reconfigure that port (but keep the MTU, so the same mempool gets
reused) - `refcount` here was 1;
3. Add a second port (this should trigger a call to `dpdk_mp_sweep()`) -
`refcount` here was 2;
4. Send some traffic.

And wasn't able to do so. I do agree, though, that this part can be
improved.

Regards,
Tiago.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [RFC PATCH v1] dpdk: Support both shared and per port mempools.

Reply via email to