Thanks, Ian! This will give us time to come up with a proper solution for 2.10. Let's work on that now. /Jan
> -----Original Message----- > From: ovs-dev-boun...@openvswitch.org > [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Stokes, Ian > Sent: Tuesday, 13 February, 2018 16:52 > To: Stokes, Ian <ian.sto...@intel.com>; Kevin Traynor <ktray...@redhat.com>; > d...@openvswitch.org > Cc: Ilya Maximets <i.maxim...@samsung.com>; Antonio Fischetti > <antonio.fische...@gmail.com> > Subject: Re: [ovs-dev] [PATCH v1 branch-2.9] netdev-dpdk: Reintroduce shared > mempools. > > > > > > > On 02/13/2018 10:59 AM, Ian Stokes wrote: > > > > This commit manually reverts the current per port mempool model to > > > > the previous shared mempool model for DPDK ports. > > > > > > > > OVS previously used a shared mempool model for ports with the same > > > > MTU configuration. This was replaced by a per port mempool model to > > > > address issues flagged by users such as: > > > > > > > > https://mail.openvswitch.org/pipermail/ovs-discuss/2016-September/04 > > > > 25 > > > > 60.html > > > > > > > > However the per port model has a number of issues including: > > > > > > > > 1. Requires an increase in memory resource requirements to support > > > > the same number of ports as the shared port model. > > > > 2. Incorrect algorithm for mbuf provisioning for each mempool. > > > > > > > > These are considered blocking factors for current deployments of OVS > > > > when upgrading to OVS 2.9 as a user may have to redimension memory > > > > for the same deployment configuration. This may not be possible for > > > users. > > > > > > > > For clarity, the commits whose changes are removed include the > > > > following: > > > > > > > > netdev-dpdk: Create separate memory pool for each port: d555d9b > > > > netdev-dpdk: fix management of pre-existing mempools: b6b26021d Fix > > > > mempool names to reflect socket id: f06546a > > > > netdev-dpdk: skip init for existing mempools: 837c176 > > > > netdev-dpdk: manage failure in mempool name creation: 65056fd > > > > netdev-dpdk: Reword mp_size as n_mbufs: ad9b5b9 > > > > netdev-dpdk: Rename dpdk_mp_put as dpdk_mp_free: a08a115 > > > > netdev-dpdk: Fix mp_name leak on snprintf failure: ec6edc8 > > > > netdev-dpdk: Fix dpdk_mp leak in case of EEXIST: 173ef76 > > > > netdev-dpdk: Factor out struct dpdk_mp: 24e78f9 > > > > netdev-dpdk: Remove unused MAX_NB_MBUF: bc57ed9 > > > > netdev-dpdk: Fix mempool creation with large MTU: af5b0da > > > > > > > > Due to the number of commits and period of time they were introduced > > > > over, a simple revert was not possible. All code from the commits > > > > above is removed and the shared mempool code reintroduced as it was > > > > before its replacement. > > > > > > > > Code introduced by commit > > > > > > > > netdev-dpdk: Add debug appctl to get mempool information: be48173 > > > > > > > > has been modified to work with the shared mempool model. > > > > > > > > > > There is a couple of small changes for coding stds to this version > > > from the rfc, but one is not correct (see below). Apart from that... > > > > Thanks for flagging this. > > I've pushed this to dpdk_merge_2_9 and its part of the branch 2.9 pull > request. > > https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344413.html > > Note this revert only applies to branch 2.9, I have not applied it to master. > > Thanks > Ian > > > > > > > > > Acked-by: Kevin Traynor <ktray...@redhat.com> > > > Tested-by: Kevin Traynor <ktray...@redhat.com> > > > > > > > ... > > > > > > ovs_mutex_lock(&dpdk_mp_mutex); > > > > - VLOG_DBG("Releasing \"%s\" mempool", mp->name); > > > > - rte_mempool_free(mp); > > > > + ovs_assert(dmp->refcount); > > > > + > > > > + if (! --dmp->refcount) { > > > > > > The space should not be added between ! and -- > > > > > > http://docs.openvswitch.org/en/latest/internals/contributing/coding- > > > style/#expressions > > > > Good catch, I can fix this before committing if there are no other > > comments. > > > > Thanks > > Ian > > > > > > > + ovs_list_remove(&dmp->list_node); > > > > + rte_mempool_free(dmp->mp); > > > > + rte_free(dmp); > > > > + } > > > > ovs_mutex_unlock(&dpdk_mp_mutex); } > > > > > > > > -/* Tries to allocate a new mempool - or re-use an existing one > > > > where > > > > - * appropriate - on requested_socket_id with a size determined by > > > > - * requested_mtu and requested Rx/Tx queues. > > > > - * On success - or when re-using an existing mempool - the new > > > > configuration > > > > - * will be applied. > > > > +/* Tries to allocate new mempool on requested_socket_id with > > > > + * mbuf size corresponding to requested_mtu. > > > > + * On success new configuration will be applied. > > > > * On error, device will be left unchanged. */ static int > > > > netdev_dpdk_mempool_configure(struct netdev_dpdk *dev) > > > > OVS_REQUIRES(dev->mutex) > > > > { > > > > uint32_t buf_size = dpdk_buf_size(dev->requested_mtu); > > > > - struct rte_mempool *mp; > > > > - int ret = 0; > > > > + struct dpdk_mp *mp; > > > > > > > > - mp = dpdk_mp_create(dev, FRAME_LEN_TO_MTU(buf_size)); > > > > + mp = dpdk_mp_get(dev->requested_socket_id, > > > > + FRAME_LEN_TO_MTU(buf_size)); > > > > if (!mp) { > > > > VLOG_ERR("Failed to create memory pool for netdev " > > > > "%s, with MTU %d on socket %d: %s\n", > > > > dev->up.name, dev->requested_mtu, dev- > > > >requested_socket_id, > > > > - rte_strerror(rte_errno)); > > > > - ret = rte_errno; > > > > + rte_strerror(rte_errno)); > > > > + return rte_errno; > > > > } else { > > > > - /* If a new MTU was requested and its rounded value equals > > the > > > one > > > > - * that is currently used, then the existing mempool is > > > returned. */ > > > > - if (dev->mp != mp) { > > > > - /* A new mempool was created, release the previous one. > > */ > > > > - dpdk_mp_free(dev->mp); > > > > - } else { > > > > - ret = EEXIST; > > > > - } > > > > - dev->mp = mp; > > > > + dpdk_mp_put(dev->dpdk_mp); > > > > + dev->dpdk_mp = mp; > > > > dev->mtu = dev->requested_mtu; > > > > dev->socket_id = dev->requested_socket_id; > > > > dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu); > > > > } > > > > > > > > - return ret; > > > > + return 0; > > > > } > > > > > > > > static void > > > > @@ -742,7 +765,8 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk > > > > *dev, int n_rxq, int n_txq) > > > > > > > > for (i = 0; i < n_rxq; i++) { > > > > diag = rte_eth_rx_queue_setup(dev->port_id, i, dev- > > > >rxq_size, > > > > - dev->socket_id, NULL, dev- > > > >mp); > > > > + dev->socket_id, NULL, > > > > + dev->dpdk_mp->mp); > > > > if (diag) { > > > > VLOG_INFO("Interface %s rxq(%d) setup error: %s", > > > > dev->up.name, i, rte_strerror(-diag)); @@ > > > > -826,7 +850,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) > > > > memcpy(dev->hwaddr.ea, eth_addr.addr_bytes, ETH_ADDR_LEN); > > > > rte_eth_link_get_nowait(dev->port_id, &dev->link); > > > > > > > > - mbp_priv = rte_mempool_get_priv(dev->mp); > > > > + mbp_priv = rte_mempool_get_priv(dev->dpdk_mp->mp); > > > > dev->buf_size = mbp_priv->mbuf_data_room_size - > > > > RTE_PKTMBUF_HEADROOM; > > > > > > > > /* Get the Flow control configuration for DPDK-ETH */ @@ > > > > -1079,7 > > > > +1103,7 @@ common_destruct(struct netdev_dpdk *dev) > > > > OVS_EXCLUDED(dev->mutex) > > > > { > > > > rte_free(dev->tx_q); > > > > - dpdk_mp_free(dev->mp); > > > > + dpdk_mp_put(dev->dpdk_mp); > > > > > > > > ovs_list_remove(&dev->list_node); > > > > free(ovsrcu_get_protected(struct ingress_policer *, @@ -1823,7 > > > > +1847,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, > > > > } > > > > > > > > nb_rx = rte_vhost_dequeue_burst(vid, qid * VIRTIO_QNUM + > > > VIRTIO_TXQ, > > > > - dev->mp, > > > > + dev->dpdk_mp->mp, > > > > (struct rte_mbuf **) batch- > > > >packets, > > > > NETDEV_MAX_BURST); > > > > if (!nb_rx) { > > > > @@ -2043,7 +2067,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int > > > >qid, > > > struct dp_packet_batch *batch) > > > > continue; > > > > } > > > > > > > > - pkts[txcnt] = rte_pktmbuf_alloc(dev->mp); > > > > + pkts[txcnt] = rte_pktmbuf_alloc(dev->dpdk_mp->mp); > > > > if (OVS_UNLIKELY(!pkts[txcnt])) { > > > > dropped += cnt - i; > > > > break; > > > > @@ -2919,7 +2943,7 @@ netdev_dpdk_get_mempool_info(struct > > > > unixctl_conn > > > *conn, > > > > ovs_mutex_lock(&dev->mutex); > > > > ovs_mutex_lock(&dpdk_mp_mutex); > > > > > > > > - rte_mempool_dump(stream, dev->mp); > > > > + rte_mempool_dump(stream, dev->dpdk_mp->mp); > > > > > > > > ovs_mutex_unlock(&dpdk_mp_mutex); > > > > ovs_mutex_unlock(&dev->mutex); @@ -3556,9 +3580,12 @@ > > > > netdev_dpdk_reconfigure(struct netdev *netdev) > > > > > > > > rte_eth_dev_stop(dev->port_id); > > > > > > > > - err = netdev_dpdk_mempool_configure(dev); > > > > - if (err && err != EEXIST) { > > > > - goto out; > > > > + if (dev->mtu != dev->requested_mtu > > > > + || dev->socket_id != dev->requested_socket_id) { > > > > + err = netdev_dpdk_mempool_configure(dev); > > > > + if (err) { > > > > + goto out; > > > > + } > > > > } > > > > > > > > netdev->n_txq = dev->requested_n_txq; @@ -3596,13 +3623,16 @@ > > > > dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev) > > > > > > > > netdev_dpdk_remap_txqs(dev); > > > > > > > > - err = netdev_dpdk_mempool_configure(dev); > > > > - if (!err) { > > > > - /* A new mempool was created. */ > > > > - netdev_change_seq_changed(&dev->up); > > > > - } else if (err != EEXIST){ > > > > - return err; > > > > + if (dev->requested_socket_id != dev->socket_id > > > > + || dev->requested_mtu != dev->mtu) { > > > > + err = netdev_dpdk_mempool_configure(dev); > > > > + if (err) { > > > > + return err; > > > > + } else { > > > > + netdev_change_seq_changed(&dev->up); > > > > + } > > > > } > > > > + > > > > if (netdev_dpdk_get_vid(dev) >= 0) { > > > > if (dev->vhost_reconfigured == false) { > > > > dev->vhost_reconfigured = true; > > > > > > > > _______________________________________________ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev