Hi guys,

Sorry for resurrecting this thread, but I think we should reconsider
this patch at this point.

Let me explain: I'm still perfectly happy with the fix we applied,
e4e74c3a2b9a ("dpif-netdev: Purge all ukeys when reconfigure pmd.");
it is necessary to collect stats properly, and it works well when
reconfiguring pmd threads.

I'm revisiting this because I'm no longer comfortable assuming that
packets belonging to the exact same flow have to come in from the
same queue.  Here are a couple of scenarios:

- With vhost-user multiqueue, I'm not sure that anything prevents a
  guest from sending the same flow to different queues.
- We've encountered the same bug in dpif-netdev when adding/removing
  ports.  We fixed it by never moving the queues away from the original
  pmd thread, which makes sense, but that's just one example.
- At some point it would be nice to reassign queues among pmd threads
  without deleting the megaflow cache.

What do you guys think?

Thanks,

Daniele

On 11/08/2015 03:43, "Alex Wang" <al...@nicira.com> wrote:

>
>Just posted a proposed fix here:
>
>http://openvswitch.org/pipermail/dev/2015-August/058760.html
>
>Any comments are welcome.
>
>On Mon, Aug 10, 2015 at 11:12 AM, Alex Wang
><al...@nicira.com> wrote:
>
>Hey Ilya,
>
>Thanks for the reply.  Daniele helped me reproduce the issue offline, and
>we confirmed that this is not a DPDK issue.
>
>However, it is not a ufid issue either...
>
>Let me explain.
>
>First, to clear things up: by "packets from the same flow (or stream)" I
>mean packets with the same src/dst MAC, IP, and TCP/UDP port...  Packets
>with the same header fields must be hashed to the same rx queue of the
>receiving port and handled by the same pmd thread...  Otherwise, we may
>have serious issues with reordering.
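>
>As a rough illustration (a sketch only -- the struct, the toy hash, and
>select_rxq() below are hypothetical, not actual DPDK or OVS code), the
>"same fields -> same rx queue" property looks like this:
>
>    #include <stdint.h>
>
>    /* Hypothetical 5-tuple; in reality the RSS hash is computed by the
>     * NIC or by the DPDK driver. */
>    struct tuple5 {
>        uint32_t ip_src, ip_dst;
>        uint16_t port_src, port_dst;
>        uint8_t  proto;
>    };
>
>    /* Toy hash: only meant to show that identical fields give an
>     * identical hash value. */
>    static uint32_t
>    tuple5_hash(const struct tuple5 *t)
>    {
>        return t->ip_src ^ t->ip_dst
>               ^ ((uint32_t) t->port_src << 16) ^ t->port_dst ^ t->proto;
>    }
>
>    /* Same hash -> same rx queue -> same pmd thread. */
>    static int
>    select_rxq(const struct tuple5 *t, int n_rxq)
>    {
>        return tuple5_hash(t) % n_rxq;
>    }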
>
>
>Then, it turns out this issue can be reproduced whenever we reconfigure the
>pmd threads (i.e. change the number of pmd threads or the cpu pinning) or
>the port receive queues...  When such a reconfiguration happens, two things
>follow:
>
>
>   - All pmd threads will be recreated, so after recreation no flow exists
>     in any pmd thread's classifier or exact match cache.
>
>   - The rx queues of each port will be re-mapped (very likely to a
>     different pmd thread).
>
>
>So, after the queue re-mapping, the pmd thread will fire a new upcall
>handling request to the ofproto-dpif-upcall module.  However, that module
>has no knowledge of the reconfiguration in the dpif-netdev layer and still
>thinks the flow for the same packet is already installed in the dpif-netdev
>module (even though the old pmd thread has been reset).  Thus, the
>ofproto-dpif-upcall module will fail the flow installation and increment
>the handler_duplicate_upcall coverage counter.
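>
>For reference, the ukey hash used for the duplicate check is currently
>derived from the UFID alone (this is the existing get_ufid_hash() quoted
>in Ilya's patch further down in this thread):
>
>    static uint32_t
>    get_ufid_hash(const ovs_u128 *ufid)
>    {
>        return ufid->u32[0];
>    }
>
>Since the same packet yields the same UFID no matter which pmd thread
>handled it, the stale ukey left over from before the reconfiguration
>shadows the new one and the installation fails.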
>
>
>
>Normally this should only be a transient state, since the revalidator
>should clean up the old ukey (with that ufid) once it can no longer be
>dumped from the dpif-netdev module.  But there is a bug that makes the
>revalidators skip doing so.
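>
>The intended cleanup is roughly the following (a minimal sketch only;
>flow_still_in_datapath() and sweep_stale_ukey() are hypothetical names,
>and the real revalidator logic does considerably more than this):
>
>    /* Sketch: a ukey whose flow can no longer be dumped from the
>     * datapath should be evicted, which would clear the stale entry
>     * left behind by the pmd reconfiguration. */
>    static void
>    sweep_stale_ukey(struct udpif *udpif, struct udpif_key *ukey)
>    {
>        if (!flow_still_in_datapath(udpif, ukey)) {
>            ukey_delete__(ukey);    /* drop the stale ukey */
>        }
>    }
>
>The bug is that this sweep is skipped for these ukeys, so the stale entry
>lives on and keeps rejecting the new installations.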
>
>
>I'd like to explain more if things are still unclear.  Meanwhile, we will
>discuss the fix further and post a patch asap...
>
>
>
>Thanks,
>Alex Wang,
>
>
>
>
>On Mon, Aug 10, 2015 at 7:57 AM, Ilya Maximets
><i.maxim...@samsung.com> wrote:
>
>Hello, Alex.
>I think this problem doesn't depend on DPDK.  It is a generic problem
>with ukeys.
>I found it while testing ODP-OVS (from the Linaro Networking Group)
>without DPDK.
>IMHO, there must be a way to install ukeys for the same flow from
>different pmd threads anyway, because they are reading from different
>rx queues.
>
>Sorry for the delay, I was busy fixing coverage so I could show it to you.
>( http://openvswitch.org/pipermail/dev/2015-August/058743.html )
>
>#  ./bin/ovs-appctl coverage/show
>Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=09437d7d:
>bridge_reconfigure             0.0/sec     0.000/sec        0.0003/sec   total: 1
>ofproto_flush                  0.0/sec     0.000/sec        0.0003/sec   total: 1
>ofproto_recv_openflow          0.0/sec     0.000/sec        0.0017/sec   total: 6
>ofproto_update_port            0.0/sec     0.000/sec        0.0011/sec   total: 4
>rev_reconfigure                0.0/sec     0.000/sec        0.0003/sec   total: 1
>rev_flow_table                 0.0/sec     0.000/sec        0.0011/sec   total: 4
>handler_duplicate_upcall  243016.0/sec 268180.317/sec     5728.5375/sec   total: 20622735
>xlate_actions             243016.2/sec 259283.650/sec     5580.2606/sec   total: 20088938
>cmap_expand                    0.0/sec     0.000/sec        0.0019/sec   total: 7
>cmap_shrink                    0.0/sec     0.000/sec        0.0008/sec   total: 3
>dpif_port_add                  0.0/sec     0.000/sec        0.0008/sec   total: 3
>dpif_flow_flush                0.0/sec     0.000/sec        0.0006/sec   total: 2
>dpif_flow_get                  0.0/sec     0.000/sec        0.0014/sec   total: 5
>dpif_flow_put                  0.0/sec     0.000/sec        0.0014/sec   total: 5
>dpif_flow_del                  0.0/sec     0.000/sec        0.0014/sec   total: 5
>dpif_execute                   0.0/sec     0.000/sec        0.0006/sec   total: 2
>flow_extract                   0.0/sec     0.000/sec        0.0003/sec   total: 1
>hmap_pathological              0.0/sec     0.000/sec        0.0050/sec   total: 18
>hmap_expand                   13.4/sec    19.467/sec        0.8011/sec   total: 2884
>netdev_received            71024.6/sec 77401.583/sec     1678.5072/sec   total: 6042626
>netdev_sent               306446.4/sec 328582.933/sec     7084.3894/sec   total: 25503802
>netdev_get_stats               0.6/sec     0.600/sec        0.0200/sec   total: 72
>txn_unchanged                  0.2/sec     0.200/sec        0.0067/sec   total: 24
>txn_incomplete                 0.0/sec     0.000/sec        0.0006/sec   total: 2
>txn_success                    0.0/sec     0.000/sec        0.0006/sec   total: 2
>poll_create_node              63.8/sec    89.917/sec        3.0958/sec   total: 11145
>poll_zero_timeout              0.0/sec     0.000/sec        0.0017/sec   total: 6
>rconn_queued                   0.0/sec     0.000/sec        0.0008/sec   total: 3
>rconn_sent                     0.0/sec     0.000/sec        0.0008/sec   total: 3
>pstream_open                   0.0/sec     0.000/sec        0.0008/sec   total: 3
>stream_open                    0.0/sec     0.000/sec        0.0003/sec   total: 1
>unixctl_received               0.0/sec     0.383/sec        0.0131/sec   total: 47
>unixctl_replied                0.0/sec     0.383/sec        0.0131/sec   total: 47
>util_xalloc               729151.8/sec 778037.800/sec    16750.5969/sec   total: 60302149
>vconn_received                 0.0/sec     0.000/sec        0.0025/sec   total: 9
>vconn_sent                     0.0/sec     0.000/sec        0.0017/sec   total: 6
>netdev_set_policing            0.0/sec     0.000/sec        0.0003/sec   total: 1
>netdev_get_ifindex             0.0/sec     0.000/sec        0.0003/sec   total: 1
>netdev_get_hwaddr              0.0/sec     0.000/sec        0.0003/sec   total: 1
>netdev_set_hwaddr              0.0/sec     0.000/sec        0.0003/sec   total: 1
>netdev_get_ethtool             0.0/sec     0.000/sec        0.0008/sec   total: 3
>netlink_received               0.2/sec     0.200/sec        0.0128/sec   total: 46
>netlink_recv_jumbo             0.2/sec     0.200/sec        0.0067/sec   total: 24
>netlink_sent                   0.2/sec     0.200/sec        0.0097/sec   total: 35
>nln_changed                    0.0/sec     0.000/sec        0.0014/sec   total: 5
>54 events never hit
>
>But also, I think that "packets from the same flow (or stream) should only
>be handled by one pmd thread" is very bad behavior.  I don't know exactly
>how DPDK works, but it shouldn't work that way.
>
>Best regards, Ilya Maximets.
>
>On 08.08.2015 01:28, Alex Wang wrote:
>> Thanks for reporting this.  I worry this may not be an OVS issue, but
>> rather something to do with the DPDK driver...  Here is my analysis;
>> please correct me if I'm wrong.
>>
>> When pmd threads are spawned, each of them will take ownership of port
>> queues in a round robin fashion...  In other words, each port queue will
>> be read by only one pmd thread.
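>>
>> As a sketch of that idea (not the actual dpif-netdev code; the structs
>> and names below are made up for illustration), the assignment is
>> essentially:
>>
>>     /* Sketch: round-robin assignment of rx queues to pmd threads, so
>>      * every rx queue ends up polled by exactly one pmd thread. */
>>     static void
>>     assign_rxqs_round_robin(struct pmd_thread *pmds, int n_pmds,
>>                             struct rxq **rxqs, int n_rxq)
>>     {
>>         for (int i = 0; i < n_rxq; i++) {
>>             struct pmd_thread *pmd = &pmds[i % n_pmds];
>>             pmd->rxqs[pmd->n_rxq++] = rxqs[i];
>>         }
>>     }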
>>
>> Then, based on my understanding, the DPDK module will hash packets to one
>> particular queue on the receiving port.  Thus, packets from the same flow
>> (or stream) should only be handled by one pmd thread.
>>
>> Next, the issue you reported should only happen when packets from the
>> same flow (or stream) are handled by multiple pmd threads.  [In that
>> case, only one pmd thread will have the flow installed in its classifier
>> (and exact match cache)...  and all other pmd threads fail to install
>> the flow due to the duplicate ufid.]
>>
>> So, based on this analysis, this really should not happen, and if it
>> does, it indicates a packet queueing issue in DPDK.
>>
>> Ilya, could you reproduce this issue and provide the `ovs-appctl
>> coverage/show` output?  I want to confirm that the
>> 'handler_duplicate_upcall' counter is increasing when this issue happens.
>>
>> Daniele, could you also share your thoughts on this?  I'm not sure
>> whether the OVS DPDK code changes (since I last touched it a long time
>> ago) affect the issue here.
>>
>>
>> Thanks,
>> Alex Wang,
>>
>>
>>
>>
>> On Fri, Aug 7, 2015 at 2:35 AM, Ilya Maximets <i.maxim...@samsung.com> wrote:
>>
>>     In multiqueue mode, several pmd threads may process one
>>     port, but different queues.  A flow doesn't depend on the queue.
>>
>>     So, during miss upcall processing, all threads (except the first
>>     one for that port) will receive error = ENOSPC due to
>>     ukey_install failure.  Therefore they will not add the flow to
>>     the flow_table and will not insert it into the exact match cache.
>>
>>     As a result, all threads (except the first one for that port) will
>>     always execute a miss.
>>
>>     Fix that by mixing pmd_id with bits from the ufid for the
>>     ukey->hash calculation.
>>
>>     Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
>>     ---
>>      ofproto/ofproto-dpif-upcall.c | 22 ++++++++++++----------
>>      1 file changed, 12 insertions(+), 10 deletions(-)
>>
>>     diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>>     index 0f2e186..57a7f34 100644
>>     --- a/ofproto/ofproto-dpif-upcall.c
>>     +++ b/ofproto/ofproto-dpif-upcall.c
>>     @@ -291,7 +291,8 @@ static bool ukey_install_start(struct udpif *, struct udpif_key *ukey);
>>      static bool ukey_install_finish(struct udpif_key *ukey, int error);
>>      static bool ukey_install(struct udpif *udpif, struct udpif_key *ukey);
>>      static struct udpif_key *ukey_lookup(struct udpif *udpif,
>>     -                                     const ovs_u128 *ufid);
>>     +                                     const ovs_u128 *ufid,
>>     +                                     const unsigned pmd_id);
>>      static int ukey_acquire(struct udpif *, const struct dpif_flow *,
>>                              struct udpif_key **result, int *error);
>>      static void ukey_delete__(struct udpif_key *);
>>     @@ -1143,7 +1144,8 @@ process_upcall(struct udpif *udpif, struct upcall *upcall,
>>                  }
>>                  if (actions_len == 0) {
>>                      /* Lookup actions in userspace cache. */
>>     -                struct udpif_key *ukey = ukey_lookup(udpif, upcall->ufid);
>>     +                struct udpif_key *ukey = ukey_lookup(udpif, upcall->ufid,
>>     +                                                     upcall->pmd_id);
>>                      if (ukey) {
>>                          actions = ukey->actions->data;
>>                          actions_len = ukey->actions->size;
>>     @@ -1320,19 +1322,19 @@ handle_upcalls(struct udpif *udpif, struct upcall *upcalls,
>>      }
>>
>>      static uint32_t
>>     -get_ufid_hash(const ovs_u128 *ufid)
>>     +get_ukey_hash(const ovs_u128 *ufid, const unsigned pmd_id)
>>      {
>>     -    return ufid->u32[0];
>>     +    return ufid->u32[0] + pmd_id;
>>      }
>>
>>      static struct udpif_key *
>>     -ukey_lookup(struct udpif *udpif, const ovs_u128 *ufid)
>>     +ukey_lookup(struct udpif *udpif, const ovs_u128 *ufid, const unsigned pmd_id)
>>      {
>>          struct udpif_key *ukey;
>>     -    int idx = get_ufid_hash(ufid) % N_UMAPS;
>>     +    int idx = get_ukey_hash(ufid, pmd_id) % N_UMAPS;
>>          struct cmap *cmap = &udpif->ukeys[idx].cmap;
>>
>>     -    CMAP_FOR_EACH_WITH_HASH (ukey, cmap_node, get_ufid_hash(ufid), cmap) {
>>     +    CMAP_FOR_EACH_WITH_HASH (ukey, cmap_node, get_ukey_hash(ufid, pmd_id), cmap) {
>>              if (ovs_u128_equals(&ukey->ufid, ufid)) {
>>                  return ukey;
>>              }
>>     @@ -1362,7 +1364,7 @@ ukey_create__(const struct nlattr *key, size_t key_len,
>>          ukey->ufid_present = ufid_present;
>>          ukey->ufid = *ufid;
>>          ukey->pmd_id = pmd_id;
>>     -    ukey->hash = get_ufid_hash(&ukey->ufid);
>>     +    ukey->hash = get_ukey_hash(&ukey->ufid, pmd_id);
>>          ukey->actions = ofpbuf_clone(actions);
>>
>>          ovs_mutex_init(&ukey->mutex);
>>     @@ -1490,7 +1492,7 @@ ukey_install_start(struct udpif *udpif, struct udpif_key *new_ukey)
>>          idx = new_ukey->hash % N_UMAPS;
>>          umap = &udpif->ukeys[idx];
>>          ovs_mutex_lock(&umap->mutex);
>>     -    old_ukey = ukey_lookup(udpif, &new_ukey->ufid);
>>     +    old_ukey = ukey_lookup(udpif, &new_ukey->ufid, new_ukey->pmd_id);
>>          if (old_ukey) {
>>              /* Uncommon case: A ukey is already installed with the same UFID. */
>>              if (old_ukey->key_len == new_ukey->key_len
>>     @@ -1572,7 +1574,7 @@ ukey_acquire(struct udpif *udpif, const struct dpif_flow *flow,
>>          struct udpif_key *ukey;
>>          int retval;
>>
>>     -    ukey = ukey_lookup(udpif, &flow->ufid);
>>     +    ukey = ukey_lookup(udpif, &flow->ufid, flow->pmd_id);
>>          if (ukey) {
>>              retval = ovs_mutex_trylock(&ukey->mutex);
>>          } else {
>>     --
>>     2.1.4

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
