Metering is implemented as a datapath action, so I think you can share that
mechanism instead of inventing a new one.

On Thu, Feb 15, 2018 at 05:57:58PM +0000, Manohar Krishnappa Chidambaraswamy wrote:
> [Adding to my previous reply]
> 
> Ben,
> 
> “[manu] Do you mean using meters in dp-netdev instead of a token bucket? I
> don’t know much about meters. Will check.”
> 
> I checked about meters. They can’t be used to limit the flows hitting the
> slow-path: since a meter is an action, packets of new flows would still hit
> the slow-path before the meter action could be executed.
> 
> Thanx
> Manu
> 
> On 25/01/18, 11:52 AM, "[email protected] on behalf of Manohar Krishnappa Chidambaraswamy" <[email protected] on behalf of [email protected]> wrote:
> 
>     Ben,
>     
>     On 23/01/18, 2:01 AM, "Ben Pfaff" <[email protected]> wrote:
>     
>         I guess I'm probably the wrong one to ultimately review this patch,
>         since it's really a DPDK patch.
>         
>         It's difficult to say for sure that this is really an improvement, since
>         it also causes traffic to be dropped (that is, the traffic that needs to
>         be slow-pathed).
>         
>     [manu] Thanx for your response. It’s true that this also results in
>     packet drops, but those packets belong to new flows, which can consume
>     far more PMD cycles. Already-established flows will *not* see any drops
>     due to this rate limiting. If implemented in DPDK, we could not
>     distinguish packets of established flows from those of new flows, so
>     this has to be done inside OVS.
>     
>     Sorry I didn’t include the description in my v2 patch. Here is the main
>     intent behind this patch (from v1). Do you think this is reasonable?
>     
>     “In OVS-DPDK, both fast-path and slow-path execute in the context of a
>     common thread (i.e., the PMD thread), without any partitioning of CPU
>     cycles between the two. When there is a burst of new flows coming into
>     the data-path, packets are punted to the slow-path in the order they are
>     received and the PMD is busy for the duration of the upcall. Slow-path
>     processing of a packet consumes 100-200 times the cycles of fast-path
>     handling. As a result, the forwarding performance of a PMD degrades
>     significantly during an upcall burst. If the PMD was already highly
>     loaded, it becomes temporarily overloaded and its rx queues start
>     filling up. If the upcall burst is long enough, packets will be dropped
>     when the rx queues are full. This happens even if the new flows are
>     unexpected and the slow-path decides to drop the packets.
>                 
>     It is likely that most of the packets dropped due to rx queue overflow
>     belong to established flows that should have been processed by the
>     fast-path. Hence, the current OVS-DPDK architecture favors the handling
>     of new flows over the forwarding of established flows. This is generally
>     a sub-optimal approach.
>                 
>     Without a limit on the rate of upcalls, OVS-DPDK is vulnerable to DoS
>     attacks, and even sporadic bursts of e.g. unexpected multicast packets
>     have been shown to cause such packet drops.”
>                 
>         I think that this could be handled in a datapath independent way using
>         meters.
>     
>     [manu] Do you mean using meters in dp-netdev instead of a token bucket?
>     I don’t know much about meters. Will check.
>     
>     Thanx
>     Manu  
>         
>         On Mon, Jan 15, 2018 at 08:33:31AM +0000, Manohar Krishnappa Chidambaraswamy wrote:
>         > Rebased to master.
>         > 
>         > Could you please review this patch?
>         > 
>         > v1 of this patch has been sitting idle for quite some time. I believe
>         > this feature is important. Without it, an attacker sending packets
>         > belonging to many different (unestablished) flows can leave the PMDs
>         > running only upcalls, reducing throughput drastically. The resulting
>         > traffic loss amounts to a DoS attack.
>         > 
>         > Please let me know your thoughts/comments. This is my first patch
>         > series in OVS. Sorry if naming this patch v2 is not appropriate.
>         > 
>         > Thanx
>         > Manu
>         > 
>         > v1 of this patch: https://patchwork.ozlabs.org/patch/836737/
>         > 
>         > Signed-off-by: Manohar K C <[email protected]>
>         > cc: Jan Scheurich <[email protected]>
>         > cc: Ben Pfaff <[email protected]>
>         > 
>         > ---
>         >  Documentation/howto/dpdk.rst      | 21 ++++++++++
>         >  debian/libopenvswitch-dev.install |  1 -
>         >  debian/libopenvswitch.install     |  1 -
>         >  debian/rules                      |  4 +-
>         >  lib/dpif-netdev.c                 | 81 +++++++++++++++++++++++++++++++++++++--
>         >  vswitchd/vswitch.xml              | 47 +++++++++++++++++++++++
>         >  6 files changed, 148 insertions(+), 7 deletions(-)
>         > 
>         > diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
>         > index 587aaed..1db1562 100644
>         > --- a/Documentation/howto/dpdk.rst
>         > +++ b/Documentation/howto/dpdk.rst
>         > @@ -716,3 +716,24 @@ devices to bridge ``br0``. Once complete, follow the below steps:
>         >     Check traffic on multiple queues::
>         >  
>         >         $ cat /proc/interrupts | grep virtio
>         > +
>         > +Upcall rate limiting
>         > +--------------------
>         > +ovs-vsctl can be used to enable and configure upcall rate limit parameters.
>         > +There are two configurable values, ``upcall-rate`` and ``upcall-burst``, which
>         > +take effect when the global enable knob ``upcall-rl`` is set to true.
>         > +
>         > +The upcall rate should be set using ``upcall-rate``, in packets per second. For
>         > +example::
>         > +
>         > +    $ ovs-vsctl set Open_vSwitch . other_config:upcall-rate=2000
>         > +
>         > +The upcall burst should be set using ``upcall-burst``, in packets. For
>         > +example::
>         > +
>         > +    $ ovs-vsctl set Open_vSwitch . other_config:upcall-burst=2000
>         > +
>         > +The upcall rate limit feature should be globally enabled using ``upcall-rl``. For
>         > +example::
>         > +
>         > +    $ ovs-vsctl set Open_vSwitch . other_config:upcall-rl=true
>         > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>         > index c7d157a..e2fd92c 100644
>         > --- a/lib/dpif-netdev.c
>         > +++ b/lib/dpif-netdev.c
>         > @@ -101,6 +101,16 @@ static struct shash dp_netdevs OVS_GUARDED_BY(dp_netdev_mutex)
>         >  
>         >  static struct vlog_rate_limit upcall_rl = VLOG_RATE_LIMIT_INIT(600, 600);
>         >  
>         > +/* Upcall rate-limit parameters */
>         > +static bool upcall_ratelimit;
>         > +static unsigned int upcall_rate;
>         > +static unsigned int upcall_burst;
>         > +
>         > +/* TODO: Tune these default values based on upcall perf test */
>         > +#define UPCALL_RATELIMIT_DEFAULT false /* Disabled by default */
>         > +#define UPCALL_RATE_DEFAULT      1000  /* pps */
>         > +#define UPCALL_BURST_DEFAULT     1000  /* packets */
>         > +
>         >  #define DP_NETDEV_CS_SUPPORTED_MASK (CS_NEW | CS_ESTABLISHED | CS_RELATED \
>         >                                       | CS_INVALID | CS_REPLY_DIR | CS_TRACKED \
>         >                                       | CS_SRC_NAT | CS_DST_NAT)
>         > @@ -340,6 +350,7 @@ enum dp_stat_type {
>         >                                     hits */
>         >      DP_STAT_SENT_PKTS,          /* Packets that has been sent. */
>         >      DP_STAT_SENT_BATCHES,       /* Number of batches sent. */
>         > +    DP_STAT_RATELIMIT_DROP,     /* Packets dropped due to upcall policer */
>         >      DP_N_STATS
>         >  };
>         >  
>         > @@ -645,6 +656,9 @@ struct dp_netdev_pmd_thread {
>         >      unsigned long long stats_zero[DP_N_STATS];
>         >      uint64_t cycles_zero[PMD_N_CYCLES];
>         >  
>         > +    /* Policer to rate limit slow-path */
>         > +    struct token_bucket upcall_tb;
>         > +
>         >      /* Set to true if the pmd thread needs to be reloaded. */
>         >      bool need_reload;
>         >  };
>         > @@ -894,10 +908,11 @@ pmd_info_show_stats(struct ds *reply,
>         >      ds_put_format(reply,
>         >                    "\temc hits:%llu\n\tmegaflow hits:%llu\n"
>         >                    "\tavg. subtable lookups per hit:%.2f\n"
>         > -                  "\tmiss:%llu\n\tlost:%llu\n"
>         > +                  "\tmiss:%llu\n\tlost:%llu\n\tratelimit drops:%llu\n"
>         >                    "\tavg. packets per output batch: %.2f\n",
>         >                    stats[DP_STAT_EXACT_HIT], stats[DP_STAT_MASKED_HIT],
>         >                    lookups_per_hit, stats[DP_STAT_MISS], stats[DP_STAT_LOST],
>         > +                  stats[DP_STAT_RATELIMIT_DROP],
>         >                    packets_per_batch);
>         >  
>         >      if (total_cycles == 0) {
>         > @@ -3031,6 +3046,8 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
>         >          smap_get_ullong(other_config, "emc-insert-inv-prob",
>         >                          DEFAULT_EM_FLOW_INSERT_INV_PROB);
>         >      uint32_t insert_min, cur_min;
>         > +    unsigned int rate, burst;
>         > +    bool ratelimit;
>         >  
>         >      if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) {
>         >          free(dp->pmd_cmask);
>         > @@ -3056,6 +3073,36 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
>         >          }
>         >      }
>         >  
>         > +    /* Handle upcall policer params */
>         > +    ratelimit = smap_get_bool(other_config, "upcall-rl",
>         > +                              UPCALL_RATELIMIT_DEFAULT);
>         > +    rate = smap_get_int(other_config, "upcall-rate",
>         > +                        UPCALL_RATE_DEFAULT);
>         > +    burst = smap_get_int(other_config, "upcall-burst",
>         > +                         UPCALL_BURST_DEFAULT);
>         > +
>         > +    if ((rate != upcall_rate) || (burst != upcall_burst)) {
>         > +        VLOG_INFO("Upcall ratelimit params changed: old rate=%d burst=%d, "
>         > +                  "new rate=%d burst=%d", upcall_rate, upcall_burst,
>         > +                  rate, burst);
>         > +
>         > +        upcall_rate = rate;
>         > +        upcall_burst = burst;
>         > +
>         > +        /*
>         > +         * TODO: See if there is a way to reconfig only the policer
>         > +         * in each PMD.
>         > +         */
>         > +        dp_netdev_request_reconfigure(dp);
>         > +    }
>         > +
>         > +    if (ratelimit != upcall_ratelimit) {
>         > +        upcall_ratelimit = ratelimit;
>         > +
>         > +        VLOG_INFO("Upcall ratelimit changed to %s",
>         > +                  (upcall_ratelimit ? "Enabled" : "Disabled"));
>         > +    }
>         > +
>         >      return 0;
>         >  }
>         >  
>         > @@ -3958,6 +4005,12 @@ dpif_netdev_run(struct dpif *dpif)
>         >      ovs_mutex_lock(&dp->port_mutex);
>         >      non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID);
>         >      if (non_pmd) {
>         > +        /* Reconfig the upcall policer if params have changed */
>         > +        if ((upcall_rate != non_pmd->upcall_tb.rate) ||
>         > +            (upcall_burst != non_pmd->upcall_tb.burst)) {
>         > +            token_bucket_init(&non_pmd->upcall_tb, upcall_rate, upcall_burst);
>         > +        }
>         > +
>         >          ovs_mutex_lock(&dp->non_pmd_mutex);
>         >          cycles_count_start(non_pmd);
>         >          HMAP_FOR_EACH (port, node, &dp->ports) {
>         > @@ -4154,6 +4207,9 @@ reload:
>         >          lc = UINT_MAX;
>         >      }
>         >  
>         > +    /* Initialize upcall policer token bucket with configured params */
>         > +    token_bucket_init(&pmd->upcall_tb, upcall_rate, upcall_burst);
>         > +
>         >      cycles_count_start(pmd);
>         >      for (;;) {
>         >          for (i = 0; i < poll_cnt; i++) {
>         > @@ -4638,6 +4694,10 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
>         >          emc_cache_init(&pmd->flow_cache);
>         >          pmd_alloc_static_tx_qid(pmd);
>         >      }
>         > +
>         > +    /* Initialize upcall policer token bucket with configured params */
>         > +    token_bucket_init(&pmd->upcall_tb, upcall_rate, upcall_burst);
>         > +
>         >      cmap_insert(&dp->poll_threads, CONST_CAST(struct cmap_node *, &pmd->node),
>         >                  hash_int(core_id, 0));
>         >  }
>         > @@ -5076,7 +5136,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
>         >                       struct dp_packet *packet,
>         >                       const struct netdev_flow_key *key,
>         >                       struct ofpbuf *actions, struct ofpbuf *put_actions,
>         > -                     int *lost_cnt)
>         > +                     int *lost_cnt, int *rl_drop_cnt)
>         >  {
>         >      struct ofpbuf *add_actions;
>         >      struct dp_packet_batch b;
>         > @@ -5084,6 +5144,18 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
>         >      ovs_u128 ufid;
>         >      int error;
>         >  
>         > +    /*
>         > +     * Grab a token from the upcall policer to enter slowpath. If token
>         > +     * is not available, drop and account the packet. This is to
>         > +     * rate-limit packets getting into slowpath.
>         > +     */
>         > +    if (upcall_ratelimit && !token_bucket_withdraw(&pmd->upcall_tb, 1)) {
>         > +        dp_packet_delete(packet);
>         > +        (*rl_drop_cnt)++;
>         > +
>         > +        return;
>         > +    }
>         > +
>         >      match.tun_md.valid = false;
>         >      miniflow_expand(&key->mf, &match.flow);
>         >  
>         > @@ -5158,7 +5230,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>         >      struct dpcls *cls;
>         >      struct dpcls_rule *rules[PKT_ARRAY_SIZE];
>         >      struct dp_netdev *dp = pmd->dp;
>         > -    int miss_cnt = 0, lost_cnt = 0;
>         > +    int miss_cnt = 0, lost_cnt = 0, rl_drop_cnt = 0;
>         >      int lookup_cnt = 0, add_lookup_cnt;
>         >      bool any_miss;
>         >      size_t i;
>         > @@ -5202,7 +5274,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>         >  
>         >              miss_cnt++;
>         >              handle_packet_upcall(pmd, packet, &keys[i], &actions,
>         > -                                 &put_actions, &lost_cnt);
>         > +                                 &put_actions, &lost_cnt, &rl_drop_cnt);
>         >          }
>         >  
>         >          ofpbuf_uninit(&actions);
>         > @@ -5235,6 +5307,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>         >      dp_netdev_count_packet(pmd, DP_STAT_LOOKUP_HIT, lookup_cnt);
>         >      dp_netdev_count_packet(pmd, DP_STAT_MISS, miss_cnt);
>         >      dp_netdev_count_packet(pmd, DP_STAT_LOST, lost_cnt);
>         > +    dp_netdev_count_packet(pmd, DP_STAT_RATELIMIT_DROP, rl_drop_cnt);
>         >  }
>         >  
>         >  /* Packets enter the datapath from a port (or from recirculation) here.
>         > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>         > index 58c0ebd..bc77c35 100644
>         > --- a/vswitchd/vswitch.xml
>         > +++ b/vswitchd/vswitch.xml
>         > @@ -412,6 +412,53 @@
>         >          </p>
>         >        </column>
>         >  
>         > +      <column name="other_config" key="upcall-rl"
>         > +              type='{"type": "boolean"}'>
>         > +        <p>
>         > +          Set this value to <code>true</code> to enable upcall rate-limiting.
>         > +          The upcall parameters like rate and burst are ignored if this is
>         > +          not set.
>         > +        </p>
>         > +        <p>
>         > +          The default value is <code>false</code>, i.e. upcall rate-limiting
>         > +          is disabled.
>         > +        </p>
>         > +      </column>
>         > +
>         > +      <column name="other_config" key="upcall-rate"
>         > +        type='{"type": "integer", "minInteger": 0, "maxInteger": 4294967295}'>
>         > +        <p>
>         > +          Specifies the maximum rate of upcalls, in packets per second. For
>         > +          example, if the value is 10000, then up to 10000 upcalls (one per
>         > +          packet) are allowed per second in each packet polling thread (PMD
>         > +          or non-PMD).
>         > +        </p>
>         > +        <p>
>         > +          A value of <code>0</code> means no upcalls are allowed, i.e.
>         > +          upcalls are disabled. This is mainly for debugging.
>         > +        </p>
>         > +        <p>
>         > +          The default value is 1000.
>         > +        </p>
>         > +      </column>
>         > +
>         > +      <column name="other_config" key="upcall-burst"
>         > +        type='{"type": "integer", "minInteger": 0, "maxInteger": 4294967295}'>
>         > +        <p>
>         > +          Specifies the maximum burst of upcalls, in packets. For example,
>         > +          if the value is 15000, then a burst of up to 15000 upcalls (one
>         > +          per packet) is allowed in each packet polling thread (PMD or
>         > +          non-PMD).
>         > +        </p>
>         > +        <p>
>         > +          A value of <code>0</code> means no upcalls are allowed, i.e.
>         > +          upcalls are disabled. This is mainly for debugging.
>         > +        </p>
>         > +        <p>
>         > +          The default value is 1000.
>         > +        </p>
>         > +      </column>
>         > +
>         >        <column name="other_config" key="vlan-limit"
>         >                type='{"type": "integer", "minInteger": 0}'>
>         >          <p>
>         > -- 
>         > 1.9.1
>         > 
>         > 
>         > 
>         
>     
>     _______________________________________________
>     dev mailing list
>     [email protected]
>     https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>     
> 