Ben,

Or did you mean reusing just the meter mechanism and not as an action? I.e.,
“struct dp_meter” in dp_netdev and something like dp_netdev_run_meter()?

I looked at it and felt it's too complex/heavy for this use case, compared to the
token-bucket implementation in openvswitch/token-bucket.h. Also, this token
bucket is not a new one; it's already used by vlog_rate_limit(). Let us know
your thoughts.
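
For reference, here is a minimal sketch of how that API is used
(token_bucket_init() and token_bucket_withdraw() are the real helpers from
lib/token-bucket.c; the wrapper functions and the rate/burst values below are
only illustrative, not part of the patch):

    #include <openvswitch/token-bucket.h>
    #include <stdbool.h>

    /* One policer per thread: the bucket holds at most 'burst' tokens and
     * is refilled at 'rate' (see openvswitch/token-bucket.h for the exact
     * refill units). */
    static struct token_bucket upcall_tb;

    static void
    policer_init(void)
    {
        token_bucket_init(&upcall_tb, 1000 /* rate */, 1000 /* burst */);
    }

    /* Returns true if a packet may enter the slow path: withdraws one
     * token per upcall and fails when the bucket is empty. */
    static bool
    upcall_admitted(void)
    {
        return token_bucket_withdraw(&upcall_tb, 1);
    }

This is essentially the same pattern vlog_rate_limit() already follows, just
with one bucket per polling thread instead of one per log rate-limit.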

Thanx
Manu

On 16/02/18, 4:12 AM, "Jan Scheurich" <[email protected]> wrote:

    Hi Ben,
    
    This patch is about rate-limiting miss upcalls. Could you elaborate how you
    suggest to use the metering datapath action to realize that?
    
    I also think this is an issue specific to the DPDK datapath, as only there
    are the miss upcalls executed in-line in the PMD thread, so that miss
    upcalls directly impact the datapath performance. The kernel datapath (not
    sure about Windows) doesn't have the same issue, as upcalls are transferred
    to user-space through netlink.
    
    Thanks, Jan
    
    > -----Original Message-----
    > From: [email protected] [mailto:[email protected]] On Behalf Of Ben Pfaff
    > Sent: Thursday, 15 February, 2018 19:13
    > To: Manohar Krishnappa Chidambaraswamy <[email protected]>
    > Cc: [email protected]
    > Subject: Re: [ovs-dev] [PATCH v2] Upcall/Slowpath rate limiter for OVS
    > 
    > Metering is implemented as a datapath action so I think you can share
    > the mechanism instead of inventing a new one.
    > 
    > On Thu, Feb 15, 2018 at 05:57:58PM +0000, Manohar Krishnappa Chidambaraswamy wrote:
    > > [Adding to my previous reply]
    > >
    > > Ben,
    > >
    > > “[manu] Do you mean using meters in dp-netdev instead of token-bucket? I don’t know much about meters. Will check.”
    > >
    > > I checked about meters. For limiting flows hitting the slowpath, meters can’t be used: since a meter is an action, new flows would still hit the slow-path before the meter action could be executed.
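    > >
    > > To make that ordering concrete, here is a stand-alone sketch (*not* the
    > > actual OVS code; every type and function below is a simplified stand-in
    > > for the real dpif-netdev logic):
    > >
    > >     #include <stddef.h>
    > >
    > >     struct pkt { int dummy; };
    > >     struct flow { int dummy; };
    > >
    > >     /* Stand-ins for the EMC/dpcls lookup, the upcall, and action
    > >      * execution. */
    > >     static struct flow *flow_cache_lookup(struct pkt *p) { (void) p; return NULL; }
    > >     static struct flow *slow_path_upcall(struct pkt *p) { static struct flow f; (void) p; return &f; }
    > >     static void execute_actions(struct pkt *p, struct flow *f) { (void) p; (void) f; }
    > >
    > >     static void
    > >     process_packet(struct pkt *p)
    > >     {
    > >         struct flow *f = flow_cache_lookup(p);
    > >         if (!f) {
    > >             /* On a cache miss the expensive slow path runs *here*,
    > >              * before any of the flow's actions are known.  A meter
    > >              * action only runs later, in execute_actions(), so it
    > >              * cannot gate entry into the slow path. */
    > >             f = slow_path_upcall(p);
    > >         }
    > >         execute_actions(p, f);  /* meter, output, etc. happen here */
    > >     }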
    > >
    > > Thanx
    > > Manu
    > >
    > > On 25/01/18, 11:52 AM, "[email protected] on behalf of Manohar Krishnappa Chidambaraswamy" <[email protected] on behalf of [email protected]> wrote:
    > >
    > >     Ben,
    > >
    > >     On 23/01/18, 2:01 AM, "Ben Pfaff" <[email protected]> wrote:
    > >
    > >         I guess I'm probably the wrong one to ultimately review this patch,
    > >         since it's really a DPDK patch.
    > >
    > >         It's difficult to say for sure that this is really an improvement, since
    > >         it also causes traffic to be dropped (that is, the traffic that needs to
    > >         be slow-pathed).
    > >
    > >     [manu] Thanx for your response. It's true that this also results in
    > >     packet-drops, but these packets belong to new flows that can potentially
    > >     consume a lot more PMD cycles. Already established flows will *not* see
    > >     any drops due to this rate-limiting. If implemented in DPDK, we couldn't
    > >     differentiate packets belonging to already-established vs. new flows; to
    > >     do this, we need something in OVS itself.
    > >
    > >     Sorry, I didn’t include the description in my v2 patch. Here is the main
    > >     intent behind this patch (from v1). Do you think this is reasonable?
    > >
    > >     “In OVS-DPDK, both fast-path and slow-path execute in the context of a
    > >     common thread (i.e., the PMD thread), without any partitioning of CPU
    > >     cycles between the two. When there is a burst of new flows coming into
    > >     the data-path, packets are punted to the slow-path in the order they are
    > >     received and the PMD is busy for the duration of the upcall. Slow-path
    > >     processing of a packet consumes 100-200 times the cycles of fast-path
    > >     handling. As a result, the forwarding performance of a PMD degrades
    > >     significantly during an upcall burst. If the PMD was highly loaded
    > >     already, it becomes temporarily overloaded and its rx queues start
    > >     filling up. If the upcall burst is long enough, packets will be dropped
    > >     when rx queues are full. This happens even if the new flows are
    > >     unexpected and the slow-path decides to drop the packets.
    > >
    > >     It is likely that most of the packets dropped due to rx queue overflow
    > >     belong to established flows that should have been processed by the
    > >     fast-path. Hence, the current OVS-DPDK architecture favors the handling
    > >     of new flows over the forwarding of established flows. This is generally
    > >     a sub-optimal approach.
    > >
    > >     Without a limit to the rate of upcalls, OVS-DPDK is vulnerable to DoS
    > >     attacks. But even sporadic bursts of e.g. unexpected multicast packets
    > >     have been shown to cause such packet drops.”
    > >
    > >         I think that this could be handled in a datapath-independent way
    > >         using meters.
    > >
    > >     [manu] Do you mean using meters in dp-netdev instead of token-bucket? I
    > >     don’t know much about meters. Will check.
    > >
    > >     Thanx
    > >     Manu
    > >
    > >         On Mon, Jan 15, 2018 at 08:33:31AM +0000, Manohar Krishnappa Chidambaraswamy wrote:
    > >         > Rebased to master.
    > >         >
    > >         > Could you please review this patch?
    > >         >
    > >         > v1 of this patch has been sitting idle for quite some time. I believe
    > >         > this feature is important. Without it, an attacker sending packets
    > >         > belonging to many different (unestablished) flows can keep the PMDs
    > >         > busy running only upcalls, reducing the throughput drastically. The
    > >         > huge traffic loss this causes is effectively a DoS attack.
    > >         >
    > >         > Please let me know your thoughts/comments. This is my first patch
    > >         > series in OVS. Sorry if naming this patch v2 is not appropriate.
    > >         >
    > >         > Thanx
    > >         > Manu
    > >         >
    > >         > v1 of this patch: https://patchwork.ozlabs.org/patch/836737/
    > >         >
    > >         > Signed-off-by: Manohar K C <[email protected]>
    > >         > cc: Jan Scheurich <[email protected]>
    > >         > cc: Ben Pfaff <[email protected]>
    > >         >
    > >         > ---
    > >         >  Documentation/howto/dpdk.rst      | 21 ++++++++++
    > >         >  debian/libopenvswitch-dev.install |  1 -
    > >         >  debian/libopenvswitch.install     |  1 -
    > >         >  debian/rules                      |  4 +-
    > >         >  lib/dpif-netdev.c                 | 81 +++++++++++++++++++++++++++++++++++++--
    > >         >  vswitchd/vswitch.xml              | 47 +++++++++++++++++++++++
    > >         >  6 files changed, 148 insertions(+), 7 deletions(-)
    > >         >
    > >         > diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
    > >         > index 587aaed..1db1562 100644
    > >         > --- a/Documentation/howto/dpdk.rst
    > >         > +++ b/Documentation/howto/dpdk.rst
    > >         > @@ -716,3 +716,24 @@ devices to bridge ``br0``. Once complete, follow the below steps:
    > >         >     Check traffic on multiple queues::
    > >         >
    > >         >         $ cat /proc/interrupts | grep virtio
    > >         > +
    > >         > +Upcall rate limiting
    > >         > +--------------------
    > >         > +ovs-vsctl can be used to enable and configure upcall rate limit parameters.
    > >         > +There are two configurable values, ``upcall-rate`` and ``upcall-burst``, which
    > >         > +take effect when the global enable knob ``upcall-rl`` is set to true.
    > >         > +
    > >         > +The upcall rate should be set using ``upcall-rate`` in packets-per-sec. For
    > >         > +example::
    > >         > +
    > >         > +    $ ovs-vsctl set Open_vSwitch . other_config:upcall-rate=2000
    > >         > +
    > >         > +The upcall burst should be set using ``upcall-burst`` in packets. For
    > >         > +example::
    > >         > +
    > >         > +    $ ovs-vsctl set Open_vSwitch . other_config:upcall-burst=2000
    > >         > +
    > >         > +The upcall rate-limit feature should be globally enabled using ``upcall-rl``. For
    > >         > +example::
    > >         > +
    > >         > +    $ ovs-vsctl set Open_vSwitch . other_config:upcall-rl=true
    > >         > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
    > >         > index c7d157a..e2fd92c 100644
    > >         > --- a/lib/dpif-netdev.c
    > >         > +++ b/lib/dpif-netdev.c
    > >         > @@ -101,6 +101,16 @@ static struct shash dp_netdevs OVS_GUARDED_BY(dp_netdev_mutex)
    > >         >
    > >         >  static struct vlog_rate_limit upcall_rl = VLOG_RATE_LIMIT_INIT(600, 600);
    > >         >
    > >         > +/* Upcall rate-limit parameters */
    > >         > +static bool upcall_ratelimit;
    > >         > +static unsigned int upcall_rate;
    > >         > +static unsigned int upcall_burst;
    > >         > +
    > >         > +/* TODO: Tune these default values based on upcall perf tests. */
    > >         > +#define UPCALL_RATELIMIT_DEFAULT false /* Disabled by default */
    > >         > +#define UPCALL_RATE_DEFAULT      1000  /* pps */
    > >         > +#define UPCALL_BURST_DEFAULT     1000  /* packets */
    > >         > +
    > >         >  #define DP_NETDEV_CS_SUPPORTED_MASK (CS_NEW | CS_ESTABLISHED | CS_RELATED \
    > >         >                                       | CS_INVALID | CS_REPLY_DIR | CS_TRACKED \
    > >         >                                       | CS_SRC_NAT | CS_DST_NAT)
    > >         > @@ -340,6 +350,7 @@ enum dp_stat_type {
    > >         >                                     hits */
    > >         >      DP_STAT_SENT_PKTS,          /* Packets that has been sent. */
    > >         >      DP_STAT_SENT_BATCHES,       /* Number of batches sent. */
    > >         > +    DP_STAT_RATELIMIT_DROP,     /* Packets dropped due to upcall policer */
    > >         >      DP_N_STATS
    > >         >  };
    > >         >
    > >         > @@ -645,6 +656,9 @@ struct dp_netdev_pmd_thread {
    > >         >      unsigned long long stats_zero[DP_N_STATS];
    > >         >      uint64_t cycles_zero[PMD_N_CYCLES];
    > >         >
    > >         > +    /* Policer to rate limit slow-path */
    > >         > +    struct token_bucket upcall_tb;
    > >         > +
    > >         >      /* Set to true if the pmd thread needs to be reloaded. */
    > >         >      bool need_reload;
    > >         >  };
    > >         > @@ -894,10 +908,11 @@ pmd_info_show_stats(struct ds *reply,
    > >         >      ds_put_format(reply,
    > >         >                    "\temc hits:%llu\n\tmegaflow hits:%llu\n"
    > >         >                    "\tavg. subtable lookups per hit:%.2f\n"
    > >         > -                  "\tmiss:%llu\n\tlost:%llu\n"
    > >         > +                  "\tmiss:%llu\n\tlost:%llu\n\tratelimit 
drops:%llu\n"
    > >         >                    "\tavg. packets per output batch: %.2f\n",
    > >         >                    stats[DP_STAT_EXACT_HIT], stats[DP_STAT_MASKED_HIT],
    > >         >                    lookups_per_hit, stats[DP_STAT_MISS], stats[DP_STAT_LOST],
    > >         > +                  stats[DP_STAT_RATELIMIT_DROP],
    > >         >                    packets_per_batch);
    > >         >
    > >         >      if (total_cycles == 0) {
    > >         > @@ -3031,6 +3046,8 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
    > >         >          smap_get_ullong(other_config, "emc-insert-inv-prob",
    > >         >                          DEFAULT_EM_FLOW_INSERT_INV_PROB);
    > >         >      uint32_t insert_min, cur_min;
    > >         > +    unsigned int rate, burst;
    > >         > +    bool ratelimit;
    > >         >
    > >         >      if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) {
    > >         >          free(dp->pmd_cmask);
    > >         > @@ -3056,6 +3073,36 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
    > >         >          }
    > >         >      }
    > >         >
    > >         > +    /* Handle upcall policer params */
    > >         > +    ratelimit = smap_get_bool(other_config, "upcall-rl",
    > >         > +                              UPCALL_RATELIMIT_DEFAULT);
    > >         > +    rate = smap_get_int(other_config, "upcall-rate",
    > >         > +                        UPCALL_RATE_DEFAULT);
    > >         > +    burst = smap_get_int(other_config, "upcall-burst",
    > >         > +                         UPCALL_BURST_DEFAULT);
    > >         > +
    > >         > +    if ((rate != upcall_rate) || (burst != upcall_burst)) {
    > >         > +        VLOG_INFO("Upcall ratelimit params changed : Old - 
rate=%d burst=%d "
    > >         > +                  ": New - rate=%d burst=%d\n", upcall_rate, 
upcall_burst,
    > >         > +                  rate, burst);
    > >         > +
    > >         > +        upcall_rate = rate;
    > >         > +        upcall_burst = burst;
    > >         > +
    > >         > +        /*
    > >         > +         * TODO: See if there is a way to reconfig only the policer
    > >         > +         * in each PMD.
    > >         > +         */
    > >         > +        dp_netdev_request_reconfigure(dp);
    > >         > +    }
    > >         > +
    > >         > +    if (ratelimit != upcall_ratelimit) {
    > >         > +        upcall_ratelimit = ratelimit;
    > >         > +
    > >         > +        VLOG_INFO("Upcall ratelimit changed to %s\n",
    > >         > +                  (upcall_ratelimit ? "Enabled" : "Disabled"));
    > >         > +    }
    > >         > +
    > >         >      return 0;
    > >         >  }
    > >         >
    > >         > @@ -3958,6 +4005,12 @@ dpif_netdev_run(struct dpif *dpif)
    > >         >      ovs_mutex_lock(&dp->port_mutex);
    > >         >      non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID);
    > >         >      if (non_pmd) {
    > >         > +        /* Reconfig the upcall policer if params have changed */
    > >         > +        if ((upcall_rate != non_pmd->upcall_tb.rate) ||
    > >         > +            (upcall_burst != non_pmd->upcall_tb.burst)) {
    > >         > +            token_bucket_init(&non_pmd->upcall_tb, upcall_rate, upcall_burst);
    > >         > +        }
    > >         > +
    > >         >          ovs_mutex_lock(&dp->non_pmd_mutex);
    > >         >          cycles_count_start(non_pmd);
    > >         >          HMAP_FOR_EACH (port, node, &dp->ports) {
    > >         > @@ -4154,6 +4207,9 @@ reload:
    > >         >          lc = UINT_MAX;
    > >         >      }
    > >         >
    > >         > +    /* Initialize upcall policer token bucket with configured params */
    > >         > +    token_bucket_init(&pmd->upcall_tb, upcall_rate, upcall_burst);
    > >         > +
    > >         >      cycles_count_start(pmd);
    > >         >      for (;;) {
    > >         >          for (i = 0; i < poll_cnt; i++) {
    > >         > @@ -4638,6 +4694,10 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
    > >         >          emc_cache_init(&pmd->flow_cache);
    > >         >          pmd_alloc_static_tx_qid(pmd);
    > >         >      }
    > >         > +
    > >         > +    /* Initialize upcall policer token bucket with configured params */
    > >         > +    token_bucket_init(&pmd->upcall_tb, upcall_rate, upcall_burst);
    > >         > +
    > >         >      cmap_insert(&dp->poll_threads, CONST_CAST(struct cmap_node *, &pmd->node),
    > >         >                  hash_int(core_id, 0));
    > >         >  }
    > >         > @@ -5076,7 +5136,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
    > >         >                       struct dp_packet *packet,
    > >         >                       const struct netdev_flow_key *key,
    > >         >                       struct ofpbuf *actions, struct ofpbuf *put_actions,
    > >         > -                     int *lost_cnt)
    > >         > +                     int *lost_cnt, int *rl_drop_cnt)
    > >         >  {
    > >         >      struct ofpbuf *add_actions;
    > >         >      struct dp_packet_batch b;
    > >         > @@ -5084,6 +5144,18 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
    > >         >      ovs_u128 ufid;
    > >         >      int error;
    > >         >
    > >         > +    /*
    > >         > +     * Grab a token from the upcall policer to enter slowpath. If token
    > >         > +     * is not available, drop and account the packet. This is to
    > >         > +     * rate-limit packets getting into slowpath.
    > >         > +     */
    > >         > +    if (upcall_ratelimit && !token_bucket_withdraw(&pmd->upcall_tb, 1)) {
    > >         > +        dp_packet_delete(packet);
    > >         > +        (*rl_drop_cnt)++;
    > >         > +
    > >         > +        return;
    > >         > +    }
    > >         > +
    > >         >      match.tun_md.valid = false;
    > >         >      miniflow_expand(&key->mf, &match.flow);
    > >         >
    > >         > @@ -5158,7 +5230,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
    > >         >      struct dpcls *cls;
    > >         >      struct dpcls_rule *rules[PKT_ARRAY_SIZE];
    > >         >      struct dp_netdev *dp = pmd->dp;
    > >         > -    int miss_cnt = 0, lost_cnt = 0;
    > >         > +    int miss_cnt = 0, lost_cnt = 0, rl_drop_cnt = 0;
    > >         >      int lookup_cnt = 0, add_lookup_cnt;
    > >         >      bool any_miss;
    > >         >      size_t i;
    > >         > @@ -5202,7 +5274,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
    > >         >
    > >         >              miss_cnt++;
    > >         >              handle_packet_upcall(pmd, packet, &keys[i], &actions,
    > >         > -                                 &put_actions, &lost_cnt);
    > >         > +                                 &put_actions, &lost_cnt, &rl_drop_cnt);
    > >         >          }
    > >         >
    > >         >          ofpbuf_uninit(&actions);
    > >         > @@ -5235,6 +5307,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
    > >         >      dp_netdev_count_packet(pmd, DP_STAT_LOOKUP_HIT, lookup_cnt);
    > >         >      dp_netdev_count_packet(pmd, DP_STAT_MISS, miss_cnt);
    > >         >      dp_netdev_count_packet(pmd, DP_STAT_LOST, lost_cnt);
    > >         > +    dp_netdev_count_packet(pmd, DP_STAT_RATELIMIT_DROP, rl_drop_cnt);
    > >         >  }
    > >         >
    > >         >  /* Packets enter the datapath from a port (or from recirculation) here.
    > >         > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
    > >         > index 58c0ebd..bc77c35 100644
    > >         > --- a/vswitchd/vswitch.xml
    > >         > +++ b/vswitchd/vswitch.xml
    > >         > @@ -412,6 +412,53 @@
    > >         >          </p>
    > >         >        </column>
    > >         >
    > >         > +      <column name="other_config" key="upcall-rl"
    > >         > +              type='{"type": "boolean"}'>
    > >         > +        <p>
    > >         > +          Set this value to <code>true</code> to enable upcall
    > >         > +          rate-limiting. The upcall parameters rate and burst are
    > >         > +          ignored if this is not set.
    > >         > +        </p>
    > >         > +        <p>
    > >         > +          The default value is <code>false</code>, i.e. upcall
    > >         > +          rate-limiting is disabled.
    > >         > +        </p>
    > >         > +      </column>
    > >         > +
    > >         > +      <column name="other_config" key="upcall-rate"
    > >         > +        type='{"type": "integer", "minInteger": 0, 
"maxInteger": 4294967295}'>
    > >         > +        <p>
    > >         > +          Specifies the rate of upcalls in 
packets-per-second that is to be
    > >         > +          allowed. For example, if the value is 10000, then 
those many upcalls
    > >         > +          (for packets) are allowed per second in each of 
the packet polling
    > >         > +          thread (PMD or non-PMD).
    > >         > +        </p>
    > >         > +        <p>
    > >         > +          A value of <code>0</code> means, no upcalls would 
be allowed i.e,
    > >         > +          upcall will be disabled. This is mainly for 
debugging.
    > >         > +        </p>
    > >         > +        <p>
    > >         > +          The default value is 1000.
    > >         > +        </p>
    > >         > +      </column>
    > >         > +
    > >         > +      <column name="other_config" key="upcall-burst"
    > >         > +        type='{"type": "integer", "minInteger": 0, 
"maxInteger": 4294967295}'>
    > >         > +        <p>
    > >         > +          Specifies the maximum burst of upcalls in 
packets-per-second that is
    > >         > +          to be allowed. For example, if the value is 15000, 
then a maximum
    > >         > +          burst of 15000 upcalls (for packets) are allowed 
per second in each
    > >         > +          of the packet polling thread (PMD or non-PMD).
    > >         > +        </p>
    > >         > +        <p>
    > >         > +          A value of <code>0</code> means, no upcalls would 
be allowed i.e,
    > >         > +          upcall will be disabled. This is mainly for 
debugging.
    > >         > +        </p>
    > >         > +        <p>
    > >         > +          The default value is 1000.
    > >         > +        </p>
    > >         > +      </column>
    > >         > +
    > >         >        <column name="other_config" key="vlan-limit"
    > >         >                type='{"type": "integer", "minInteger": 0}'>
    > >         >          <p>
    > >         > --
    > >         > 1.9.1
    > >         >
    > >         >
    > >         >
    > >
    > >
    
