Re: [PATCH net-next 05/14] nfp: abm: add simple RED offload

2018-05-29 Thread Nogah Frankel

On 29-May-18 2:05 AM, Jakub Kicinski wrote:

Hi Jakub,


Hi Nogah!

On Mon, 28 May 2018 18:49:51 +0300, Nogah Frankel wrote:

+static int
+nfp_abm_red_replace(struct net_device *netdev, struct nfp_abm_link *alink,
+   struct tc_red_qopt_offload *opt)
+{
+   struct nfp_port *port = nfp_port_from_netdev(netdev);
+   int err;
+
+   if (opt->set.min != opt->set.max || !opt->set.is_ecn) {


I am a bit worried about the min == max.
sch_red doesn't really support it. It will calculate incorrect delta
value. (And that only if tc_red_eval_P in iproute2 won't reject it).
You might maybe use max = min+1,  because in real life it will probably
act the same but without this problem.


I remember having a long think about this when I wrote the code.
My conclusion was that the two would operate almost the same, and
setting min == max may be most obvious to the user.


I agree.



If min + 1 == max sch_red would act probabilistically for qavg == min,
which is not what the card would do.

Userspace now does this:

tc_red_eval_P() {
int i = qmax - qmin;
  
	if (!i)

return 0;
if (i < 0)
return -1;
...
}

And you've fixed delta to be treated as 1 to avoid division by 0 in
commit 5c472203421a ("net_sched: red: Avoid devision by zero"):

red_set_parms() {
int delta = qth_max - qth_min;
u32 max_p_delta;

p->qth_min   = qth_min << Wlog;
p->qth_max   = qth_max << Wlog;
p->Wlog  = Wlog;
p->Plog  = Plog;
if (delta <= 0)
delta = 1;
p->qth_delta = delta;
...
}


I changes it to avoid division by 0, but I wasn't sure that the delta 
value of 1 will be good, just that it is better then 0.




So we should be safe.  Targets will match.  Probability adjustment for
adaptive should work correctly.  Which doesn't matter anyway, since we
will never use the probabilistic action...


That makes sense, and it is a better way to set this setup (DCTCP, I 
guess?) than before.


Thanks

Nogah





Nogah Frankel
(from a new mail address)


Noted :)



Re: [PATCH net-next 05/14] nfp: abm: add simple RED offload

2018-05-28 Thread Nogah Frankel

On 26-May-18 7:53 AM, Jakub Kicinski wrote:

Offload simple RED configurations.  For now support only DCTCP
like scenarios where min and max are the same.

Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com>
---
  drivers/net/ethernet/netronome/nfp/abm/main.c | 82 +++
  drivers/net/ethernet/netronome/nfp/abm/main.h | 10 +++
  2 files changed, 92 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/abm/main.c 
b/drivers/net/ethernet/netronome/nfp/abm/main.c
index 28a18ac62040..22251d88c958 100644
--- a/drivers/net/ethernet/netronome/nfp/abm/main.c
+++ b/drivers/net/ethernet/netronome/nfp/abm/main.c
@@ -38,6 +38,8 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  
  #include "../nfpcore/nfp.h"

  #include "../nfpcore/nfp_cpp.h"
@@ -55,6 +57,84 @@ static u32 nfp_abm_portid(enum nfp_repr_type rtype, unsigned 
int id)
   FIELD_PREP(NFP_ABM_PORTID_ID, id);
  }
  
+static void

+nfp_abm_red_destroy(struct net_device *netdev, struct nfp_abm_link *alink,
+   u32 handle)
+{
+   struct nfp_port *port = nfp_port_from_netdev(netdev);
+
+   if (handle != alink->qdiscs[0].handle)
+   return;
+
+   alink->qdiscs[0].handle = TC_H_UNSPEC;
+   port->tc_offload_cnt = 0;
+   nfp_abm_ctrl_set_all_q_lvls(alink, ~0);
+}
+
+static int
+nfp_abm_red_replace(struct net_device *netdev, struct nfp_abm_link *alink,
+   struct tc_red_qopt_offload *opt)
+{
+   struct nfp_port *port = nfp_port_from_netdev(netdev);
+   int err;
+
+   if (opt->set.min != opt->set.max || !opt->set.is_ecn) {


I am a bit worried about the min == max.
sch_red doesn't really support it. It will calculate incorrect delta 
value. (And that only if tc_red_eval_P in iproute2 won't reject it).
You might maybe use max = min+1,  because in real life it will probably 
act the same but without this problem.


Nogah Frankel
(from a new mail address)


+   nfp_warn(alink->abm->app->cpp,
+"RED offload failed - unsupported parameters\n");
+   err = -EINVAL;
+   goto err_destroy;
+   }
+   err = nfp_abm_ctrl_set_all_q_lvls(alink, opt->set.min);
+   if (err)
+   goto err_destroy;
+
+   alink->qdiscs[0].handle = opt->handle;
+   port->tc_offload_cnt = 1;
+
+   return 0;
+err_destroy:
+   if (alink->qdiscs[0].handle != TC_H_UNSPEC)
+   nfp_abm_red_destroy(netdev, alink, alink->qdiscs[0].handle);
+   return err;
+}
+
+static int
+nfp_abm_setup_tc_red(struct net_device *netdev, struct nfp_abm_link *alink,
+struct tc_red_qopt_offload *opt)
+{
+   if (opt->parent != TC_H_ROOT)
+   return -EOPNOTSUPP;
+
+   switch (opt->command) {
+   case TC_RED_REPLACE:
+   return nfp_abm_red_replace(netdev, alink, opt);
+   case TC_RED_DESTROY:
+   nfp_abm_red_destroy(netdev, alink, opt->handle);
+   return 0;
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
+static int
+nfp_abm_setup_tc(struct nfp_app *app, struct net_device *netdev,
+enum tc_setup_type type, void *type_data)
+{
+   struct nfp_repr *repr = netdev_priv(netdev);
+   struct nfp_port *port;
+
+   port = nfp_port_from_netdev(netdev);
+   if (!port || port->type != NFP_PORT_PF_PORT)
+   return -EOPNOTSUPP;
+
+   switch (type) {
+   case TC_SETUP_QDISC_RED:
+   return nfp_abm_setup_tc_red(netdev, repr->app_priv, type_data);
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
  static struct net_device *nfp_abm_repr_get(struct nfp_app *app, u32 port_id)
  {
enum nfp_repr_type rtype;
@@ -403,6 +483,8 @@ const struct nfp_app_type app_abm = {
.vnic_alloc = nfp_abm_vnic_alloc,
.vnic_free  = nfp_abm_vnic_free,
  
+	.setup_tc	= nfp_abm_setup_tc,

+
.eswitch_mode_get   = nfp_abm_eswitch_mode_get,
.eswitch_mode_set   = nfp_abm_eswitch_mode_set,
  
diff --git a/drivers/net/ethernet/netronome/nfp/abm/main.h b/drivers/net/ethernet/netronome/nfp/abm/main.h

index 1ac651cdc140..979f98fb808b 100644
--- a/drivers/net/ethernet/netronome/nfp/abm/main.h
+++ b/drivers/net/ethernet/netronome/nfp/abm/main.h
@@ -58,18 +58,28 @@ struct nfp_abm {
const struct nfp_rtsym *q_lvls;
  };
  
+/**

+ * struct nfp_red_qdisc - representation of single RED Qdisc
+ * @handle:handle of currently offloaded RED Qdisc
+ */
+struct nfp_red_qdisc {
+   u32 handle;
+};
+
  /**
   * struct nfp_abm_link - port tuple of a ABM NIC
   * @abm:  back pointer to nfp_abm
   * @vnic: data vNIC
   * @id:   id of the data vNIC
   * @queue_base:   id of base to host queue within PCIe (not QC idx)
+ * @qdiscs:array of qdiscs
   */
  struct nfp_abm_link {

RE: [PATCH net-next] net: sch: prio: Set bands to default on delete instead of noop

2018-04-29 Thread Nogah Frankel
> > When a band is created, it is set to the default qdisc, which is
> > "invisible" pfifo.
> > However, if a band is set to a qdisc that is later being deleted, it will
> > be set to noop qdisc. This can cause a packet loss, while there is no clear
> > user indication for it. ("invisible" qdisc are not being shown by default).
> > This patch sets a band to the default qdisc, rather then the noop qdisc, on
> > delete operation.
> >
> > Signed-off-by: Nogah Frankel <nog...@mellanox.com>
> 
> Like Cong, I'm worried this will break something.  The code has
> behaved this way for 2 decades or longer.
> 
> If you want to put another qdisc there, and thus not drop any traffic,
> modify the qdisc to a new one instead of performing a delete operation.

I believe that this historic behavior is quite bug like, especially because
TCA_DUMP_INVISIBLE is not set by default.
Yet I accept your judgment that changing it now might break things.
Thank you.
 
Nogah



[PATCH net-next] net: sch: prio: Set bands to default on delete instead of noop

2018-04-26 Thread Nogah Frankel
When a band is created, it is set to the default qdisc, which is
"invisible" pfifo.
However, if a band is set to a qdisc that is later being deleted, it will
be set to noop qdisc. This can cause a packet loss, while there is no clear
user indication for it. ("invisible" qdisc are not being shown by default).
This patch sets a band to the default qdisc, rather then the noop qdisc, on
delete operation.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
---
 net/sched/sch_prio.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 222e53d..6862d23 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -314,8 +314,15 @@ static int prio_graft(struct Qdisc *sch, unsigned long 
arg, struct Qdisc *new,
bool any_qdisc_is_offloaded;
int err;
 
-   if (new == NULL)
-   new = _qdisc;
+   if (!new) {
+   new = qdisc_create_dflt(sch->dev_queue, _qdisc_ops,
+   TC_H_MAKE(sch->handle, band + 1),
+   extack);
+   if (!new)
+   new = _qdisc;
+   else
+   qdisc_hash_add(new, true);
+   }
 
*old = qdisc_replace(sch, new, >queues[band]);
 
@@ -332,7 +339,7 @@ static int prio_graft(struct Qdisc *sch, unsigned long arg, 
struct Qdisc *new,
_offload);
 
/* Don't report error if the graft is part of destroy operation. */
-   if (err && new != _qdisc) {
+   if (err && new->handle) {
/* Don't report error if the parent, the old child and the new
 * one are not offloaded.
 */
-- 
2.4.11



RE: [PATCH net-next v2] net: sched: red: don't reset the backlog on every stat dump

2018-01-16 Thread Nogah Frankel
> -Original Message-
> From: Jakub Kicinski [mailto:jakub.kicin...@netronome.com]
> Sent: Monday, January 15, 2018 6:01 AM
> To: da...@davemloft.net; j...@resnulli.us; Nogah Frankel
> <nog...@mellanox.com>
> Cc: netdev@vger.kernel.org; oss-driv...@netronome.com;
> xiyou.wangc...@gmail.com; eduma...@google.com; Yuval Mintz
> <yuv...@mellanox.com>; Jakub Kicinski <jakub.kicin...@netronome.com>
> Subject: [PATCH net-next v2] net: sched: red: don't reset the backlog on every
> stat dump
> 
> Commit 0dfb33a0d7e2 ("sch_red: report backlog information") copied
> child's backlog into RED's backlog.  Back then RED did not maintain
> its own backlog counts.  This has changed after commit 2f5fb43f
> ("net_sched: update hierarchical backlog too") and commit d7f4f332f082
> ("sch_red: update backlog as well").  Copying is no longer necessary.
> 
> Tested:
> 
> $ tc -s qdisc show dev veth0
> qdisc red 1: root refcnt 2 limit 40b min 3b max 3b ecn
>  Sent 20942 bytes 221 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 1260b 14p requeues 14
>   marked 0 early 0 pdrop 0 other 0
> qdisc tbf 2: parent 1: rate 1Kbit burst 15000b lat 3585.0s
>  Sent 20942 bytes 221 pkt (dropped 0, overlimits 138 requeues 0)
>  backlog 1260b 14p requeues 14
> 
> Recently RED offload was added.  We need to make sure drivers don't
> depend on resetting the stats.  This means backlog should be treated
> like any other statistic:
> 
>   total_stat = new_hw_stat - prev_hw_stat;
> 
> Adjust mlxsw.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com>

Acked-by: Nogah Frankel <nog...@mellanox.com>

Thanks
Nogah

> ---
> v2:
>  - reuse the mlxsw infra added for prio;
>  - align the way qstats are passed with prio.
> 
>  .../net/ethernet/mellanox/mlxsw/spectrum_qdisc.c   | 26
> +++---
>  include/net/pkt_cls.h  |  1 +
>  net/sched/sch_red.c|  2 +-
>  3 files changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> index e11a0abfc663..8cac5202b913 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> @@ -247,6 +247,8 @@ mlxsw_sp_setup_tc_qdisc_red_clean_stats(struct
> mlxsw_sp_port *mlxsw_sp_port,
> 
>   stats_base->overlimits = red_base->prob_drop + red_base-
> >prob_mark;
>   stats_base->drops = red_base->prob_drop + red_base->pdrop;
> +
> + stats_base->backlog = 0;
>  }
> 
>  static int
> @@ -306,6 +308,19 @@ mlxsw_sp_qdisc_red_replace(struct mlxsw_sp_port
> *mlxsw_sp_port,
>max, prob, p->is_ecn);
>  }
> 
> +static void
> +mlxsw_sp_qdisc_red_unoffload(struct mlxsw_sp_port *mlxsw_sp_port,
> +  struct mlxsw_sp_qdisc *mlxsw_sp_qdisc,
> +  void *params)
> +{
> + struct tc_red_qopt_offload_params *p = params;
> + u64 backlog;
> +
> + backlog = mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp,
> +mlxsw_sp_qdisc->stats_base.backlog);
> + p->qstats->backlog -= backlog;
> +}
> +
>  static int
>  mlxsw_sp_qdisc_get_red_xstats(struct mlxsw_sp_port *mlxsw_sp_port,
> struct mlxsw_sp_qdisc *mlxsw_sp_qdisc,
> @@ -338,7 +353,7 @@ mlxsw_sp_qdisc_get_red_stats(struct mlxsw_sp_port
> *mlxsw_sp_port,
>struct mlxsw_sp_qdisc *mlxsw_sp_qdisc,
>struct tc_qopt_offload_stats *stats_ptr)
>  {
> - u64 tx_bytes, tx_packets, overlimits, drops;
> + u64 tx_bytes, tx_packets, overlimits, drops, backlog;
>   u8 tclass_num = mlxsw_sp_qdisc->tclass_num;
>   struct mlxsw_sp_qdisc_stats *stats_base;
>   struct mlxsw_sp_port_xstats *xstats;
> @@ -354,14 +369,18 @@ mlxsw_sp_qdisc_get_red_stats(struct
> mlxsw_sp_port *mlxsw_sp_port,
>stats_base->overlimits;
>   drops = xstats->wred_drop[tclass_num] + xstats-
> >tail_drop[tclass_num] -
>   stats_base->drops;
> + backlog = xstats->backlog[tclass_num];
> 
>   _bstats_update(stats_ptr->bstats, tx_bytes, tx_packets);
>   stats_ptr->qstats->overlimits += overlimits;
>   stats_ptr->qstats->drops += drops;
>   stats_ptr->qstats->backlog +=
> - mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp,
> -  xstats->backlog[tclass_num])

RE: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc

2018-01-12 Thread Nogah Frankel
> > > > > > Hm.  You you need this just because you didn't add the backlog
> > > > > > pointer to destroy?  AFAIK on destroy we are free to reset
> > > > > > stats as well, thus simplifying your driver...  Let me know
> > > > > > if I misunderstand.
> >
> > The problem of doing it in destroy is when one qdisc is replacing
> > another. I want to be able to destroy the old qdisc to "make room"
> > for the new one before I get the destroy command for the old qdisc
> > (that will come just after the replace command for the new qdisc).
> > If I am saying that the destroy changes the stats, I need to save
> > some data about the old qdisc till I get the destroy command for it.
> 
> Agreed, maintaining a coherent destroy behavior would be problematic
> when successful replace with a new qdisc (e.g. different handle) is
> involved :(
> 
> Besides the momentary stats seem to be reset before destroy so not
> touching them may be in fact more correct.  I need to look into the
> propagation done in qdisc_tree_reduce_backlog(), it worries me.  If
> we start stacking the qdiscs (e.g. red on top of prio) it could mess
> with the root's backlog...

I think it can be solved in the driver, or by using the OFFLOAD flag to
avoid this backlog reduce for offloaded qdiscs.

It might make sense in some point to separate the HW statistics from the
SW, at least for the momentary ones.



RE: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc

2018-01-12 Thread Nogah Frankel


> -Original Message-
> From: Yuval Mintz
> Sent: Friday, January 12, 2018 10:47 AM
> To: Jakub Kicinski <kubak...@wp.pl>
> Cc: Jiri Pirko <j...@resnulli.us>; netdev@vger.kernel.org; Nogah Frankel
> <nog...@mellanox.com>; da...@davemloft.net; Ido Schimmel
> <ido...@mellanox.com>; mlxsw <ml...@mellanox.com>; j...@mojatatu.com;
> xiyou.wangc...@gmail.com
> Subject: RE: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for 
> PRIO qdisc
> 
> > > > Hm.  You you need this just because you didn't add the backlog
> > > > pointer to destroy?  AFAIK on destroy we are free to reset stats as
> > > > well, thus simplifying your driver...  Let me know if I
> > > > misunderstand.

The problem of doing it in destroy is when one qdisc is replacing another.
I want to be able to destroy the old qdisc to "make room" for the new one
before I get the destroy command for the old qdisc (that will come just
after the replace command for the new qdisc).
If I am saying that the destroy changes the stats, I need to save some data
about the old qdisc till I get the destroy command for it.

> > >
> > > This is meant exactly for the scenario where qdisc didn't get
> > > destroyed yet is no longer offloaded; E.g., if number of bands
> > > increased beyond What we can offload. So we can't reset the
> > > statistics in this case. [Although I might be the one to
> > > misunderstand you, as the 'not destroyed' was explicitly mentioned
> > > twice above]
> >
> > I was trying to take some liberty with handling of destroy but your
> > approach may actually end up being simpler.  I will withdraw my series
> > for now and reuse your new callback once this series lands.
> >
> > Do you have any objections to changing RED to behave more like prio
> > (and other qdiscs) in principle?

On the contrary, I think it is great.

> 
> From statistics' perspective? None.


RE: [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc

2018-01-12 Thread Nogah Frankel


> -Original Message-
> From: Yuval Mintz
> Sent: Friday, January 12, 2018 2:05 AM
> To: Jakub Kicinski <kubak...@wp.pl>
> Cc: Jiri Pirko <j...@resnulli.us>; netdev@vger.kernel.org; Nogah Frankel
> <nog...@mellanox.com>; da...@davemloft.net; Ido Schimmel
> <ido...@mellanox.com>; mlxsw <ml...@mellanox.com>; j...@mojatatu.com;
> xiyou.wangc...@gmail.com
> Subject: RE: [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO 
> qdisc
> 
> > > > > +struct tc_prio_qopt_offload_params {
> > > > > + int bands;
> > > > > + u8 priomap[TC_PRIO_MAX + 1];
> > > > > + /* In case that a prio qdisc is offloaded and now is changed to 
> > > > > a
> > > > > +  * non-offloadedable config, it needs to update the backlog 
> > > > > value
> > > > > +  * to negate the HW backlog value.
> > > > > +  */
> > > > > + u32 *backlog;
> > > > > +};
> > > >
> > > > Could we please pass the full qstats on replace and destroy.  This
> > > > simplifies the driver code and allows handling the qlen as well as
> > > > backlog.  Please see the 2 patch series I sent earlier yesterday.

On replace - no problem.
On destroy, I think it is redundant because the qdisc is going to be
deleted anyway.
(To make sure that we are on the same page, if replace in the HW fails, we
"rollback" only the backlog and qlen. The rest of the counters like TX
count should reflect the number of the packets that passed in the HW while
this qdisc was offloaded)

> > >
> > > That might give the false impression that offloading driver is expected
> > > to correct all the qstats fields during destruction, whereas for most of
> > > them it doesn't seem appropriate.
> >
> > The driver is supposed to return the momentary stats to their
> > original/SW-only value.  And the driver knows exactly which stats
> > those are, just look at your patch 5, you handle backlog completely
> > differently than other stats already.
> 
> *we* surely understand that now. I'm just mentioning it might
> confuse future offloaders; No strong objection here.
> And I agree the alternative [passing pointers to each momentary stat]
> is quite ugly.


RE: [RFC -next 2/2] net: sched: red: don't reset the backlog on every stat dump

2018-01-08 Thread Nogah Frankel
> -Original Message-
> From: Jakub Kicinski [mailto:jakub.kicin...@netronome.com]
> Sent: Thursday, January 04, 2018 10:19 PM
> To: john.fastab...@gmail.com; j...@resnulli.us; xiyou.wangc...@gmail.com;
> Nogah Frankel <nog...@mellanox.com>; Yuval Mintz
> <yuv...@mellanox.com>
> Cc: netdev@vger.kernel.org; oss-driv...@netronome.com;
> eduma...@google.com; Jakub Kicinski <jakub.kicin...@netronome.com>
> Subject: [RFC -next 2/2] net: sched: red: don't reset the backlog on every 
> stat
> dump
> 
> Commit 0dfb33a0d7e2 ("sch_red: report backlog information") copied
> child's backlog into RED's backlog.  Back then RED did not maintain
> its own backlog counts.  This has changed after commit 2f5fb43f
> ("net_sched: update hierarchical backlog too") and commit d7f4f332f082
> ("sch_red: update backlog as well").  Copying is no longer necessary.
> 
> Tested:
> 
> $ tc -s qdisc show dev veth0
> qdisc red 1: root refcnt 2 limit 40b min 3b max 3b ecn
>  Sent 20942 bytes 221 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 1260b 14p requeues 14
>   marked 0 early 0 pdrop 0 other 0
> qdisc tbf 2: parent 1: rate 1Kbit burst 15000b lat 3585.0s
>  Sent 20942 bytes 221 pkt (dropped 0, overlimits 138 requeues 0)
>  backlog 1260b 14p requeues 14
> 
> Recently RED offload was added.  We need to make sure drivers don't
> depend on resetting the stats.  This means backlog should be treated
> like any other statistic:
> 
>   total_stat = new_hw_stat - prev_hw_stat;
> 
> Unlike for other statistics new_hw_stat < prev_hw_stat can be true.
> Adjust mlxsw.

There is one problem with this patch, and that is that we can fail in 
changing RED that is offloaded. In this case, we delete RED from the driver
but the backlog will still include the hardware backlog.
The solution is to send in the offload-replace command a pointer to the
backlog, so failure in updating the hardware can be follow by backlog update,
if needed.

Thanks

Nogah

> 
> Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com>
> ---
>  drivers/net/ethernet/mellanox/mlxsw/spectrum.h   | 1 +
>  drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c | 9 +++--
>  net/sched/sch_red.c  | 1 -
>  3 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
> b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
> index ff8d32bc852c..6755050e4ee0 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
> @@ -237,6 +237,7 @@ struct mlxsw_sp_qdisc {
>   u64 tx_packets;
>   u64 drops;
>   u64 overlimits;
> + u64 backlog;
>  };
> 
>  /* No need an internal lock; At worse - miss a single periodic iteration */
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> index c33beac5def0..d5091740bd40 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> @@ -212,6 +212,7 @@ mlxsw_sp_qdisc_get_red_stats(struct mlxsw_sp_port
> *mlxsw_sp_port, u32 handle,
>   u64 tx_bytes, tx_packets, overlimits, drops;
>   struct mlxsw_sp_port_xstats *xstats;
>   struct rtnl_link_stats64 *stats;
> + s64 backlog;
> 
>   if (mlxsw_sp_qdisc->handle != handle ||
>   mlxsw_sp_qdisc->type != MLXSW_SP_QDISC_RED)
> @@ -226,17 +227,21 @@ mlxsw_sp_qdisc_get_red_stats(struct
> mlxsw_sp_port *mlxsw_sp_port, u32 handle,
>mlxsw_sp_qdisc->overlimits;
>   drops = xstats->wred_drop[tclass_num] + xstats-
> >tail_drop[tclass_num] -
>   mlxsw_sp_qdisc->drops;
> + backlog = mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp,
> +xstats->backlog[tclass_num]) -
> + mlxsw_sp_qdisc->backlog;
> 
>   _bstats_update(res->bstats, tx_bytes, tx_packets);
>   res->qstats->overlimits += overlimits;
>   res->qstats->drops += drops;
> - res->qstats->backlog += mlxsw_sp_cells_bytes(mlxsw_sp_port-
> >mlxsw_sp,
> - xstats->backlog[tclass_num]);
> + res->qstats->backlog += backlog;
> 
>   mlxsw_sp_qdisc->drops +=  drops;
>   mlxsw_sp_qdisc->overlimits += overlimits;
>   mlxsw_sp_qdisc->tx_bytes += tx_bytes;
>   mlxsw_sp_qdisc->tx_packets += tx_packets;
> + mlxsw_sp_qdisc->backlog += backlog;
> +
>   return 0;
>  }
> 
> diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c
> index a392eaa4a0b4..caebb7e37551 100644
> --- a/net/sched/sch_red.c
> +++ b/net/sched/sch_red.c
> @@ -322,7 +322,6 @@ static int red_dump(struct Qdisc *sch, struct sk_buff
> *skb)
>   };
>   int err;
> 
> - sch->qstats.backlog = q->qdisc->qstats.backlog;
>   err = red_dump_offload_stats(sch, );
>   if (err)
>   goto nla_put_failure;
> --
> 2.15.1



[patch net-next 0/2] net: sched: Fix RED qdisc offload flag

2017-12-25 Thread Nogah Frankel
Replacing the RED offload flag (TC_RED_OFFLOADED) with the generic one
(TCQ_F_OFFLOADED) deleted some of the logic behind it. This patchset fixes
this problem.

Nogah Frankel (2):
  net_sch: red: Fix the new offload indication
  net: sched: Move offload check till after dump call

 net/sched/sch_api.c |  5 ++---
 net/sched/sch_red.c | 26 ++
 2 files changed, 16 insertions(+), 15 deletions(-)

-- 
2.4.3



[patch net-next 2/2] net: sched: Move offload check till after dump call

2017-12-25 Thread Nogah Frankel
Move the check of the offload state to after the qdisc dump action was
called, so the qdisc could update it if it was changed.

Fixes: 7a4fa29106d9 ("net: sched: Add TCA_HW_OFFLOAD")
Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Yuval Mintz <yuv...@mellanox.com>
---
 net/sched/sch_api.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 3a3a1da..758f132 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -807,11 +807,10 @@ static int tc_fill_qdisc(struct sk_buff *skb, struct 
Qdisc *q, u32 clid,
tcm->tcm_info = refcount_read(>refcnt);
if (nla_put_string(skb, TCA_KIND, q->ops->id))
goto nla_put_failure;
-   if (nla_put_u8(skb, TCA_HW_OFFLOAD, !!(q->flags & TCQ_F_OFFLOADED)))
-   goto nla_put_failure;
if (q->ops->dump && q->ops->dump(q, skb) < 0)
goto nla_put_failure;
-
+   if (nla_put_u8(skb, TCA_HW_OFFLOAD, !!(q->flags & TCQ_F_OFFLOADED)))
+   goto nla_put_failure;
qlen = qdisc_qlen_sum(q);
 
stab = rtnl_dereference(q->stab);
-- 
2.4.3



[patch net-next 1/2] net_sch: red: Fix the new offload indication

2017-12-25 Thread Nogah Frankel
Update the offload flag, TCQ_F_OFFLOADED, in each dump call (and ignore
the offloading function return value in relation to this flag).
This is done because a qdisc is being initialized, and therefore offloaded
before being grafted. Since the ability of the driver to offload the qdisc
depends on its location, a qdisc can be offloaded and un-offloaded by graft
calls, that doesn't effect the qdisc itself.

Fixes: 428a68af3a7c ("net: sched: Move to new offload indication in RED"
Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Yuval Mintz <yuv...@mellanox.com>
---
 net/sched/sch_red.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c
index ec0bd36..a392eaa 100644
--- a/net/sched/sch_red.c
+++ b/net/sched/sch_red.c
@@ -157,7 +157,6 @@ static int red_offload(struct Qdisc *sch, bool enable)
.handle = sch->handle,
.parent = sch->parent,
};
-   int err;
 
if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
return -EOPNOTSUPP;
@@ -172,14 +171,7 @@ static int red_offload(struct Qdisc *sch, bool enable)
opt.command = TC_RED_DESTROY;
}
 
-   err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_RED, );
-
-   if (!err && enable)
-   sch->flags |= TCQ_F_OFFLOADED;
-   else
-   sch->flags &= ~TCQ_F_OFFLOADED;
-
-   return err;
+   return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_RED, );
 }
 
 static void red_destroy(struct Qdisc *sch)
@@ -297,12 +289,22 @@ static int red_dump_offload_stats(struct Qdisc *sch, 
struct tc_red_qopt *opt)
.stats.qstats = >qstats,
},
};
+   int err;
+
+   sch->flags &= ~TCQ_F_OFFLOADED;
 
-   if (!(sch->flags & TCQ_F_OFFLOADED))
+   if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
+   return 0;
+
+   err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_RED,
+   _stats);
+   if (err == -EOPNOTSUPP)
return 0;
 
-   return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_RED,
-_stats);
+   if (!err)
+   sch->flags |= TCQ_F_OFFLOADED;
+
+   return err;
 }
 
 static int red_dump(struct Qdisc *sch, struct sk_buff *skb)
-- 
2.4.3



[patch net 0/2] RED qdisc fixes

2017-12-04 Thread Nogah Frankel
Add some input validation checks to RED qdisc.

Nogah Frankel (2):
  net_sched: red: Avoid devision by zero
  net_sched: red: Avoid illegal values

 include/net/red.h | 13 -
 net/sched/sch_choke.c |  3 +++
 net/sched/sch_gred.c  |  3 +++
 net/sched/sch_red.c   |  2 ++
 net/sched/sch_sfq.c   |  3 +++
 5 files changed, 23 insertions(+), 1 deletion(-)

-- 
2.4.3



[patch net 2/2] net_sched: red: Avoid illegal values

2017-12-04 Thread Nogah Frankel
Check the qmin & qmax values doesn't overflow for the given Wlog value.
Check that qmin <= qmax.

Fixes: a783474591f2 ("[PKT_SCHED]: Generic RED layer")
Signed-off-by: Nogah Frankel <nog...@mellanox.com>
---
 include/net/red.h | 11 +++
 net/sched/sch_choke.c |  3 +++
 net/sched/sch_gred.c  |  3 +++
 net/sched/sch_red.c   |  2 ++
 net/sched/sch_sfq.c   |  3 +++
 5 files changed, 22 insertions(+)

diff --git a/include/net/red.h b/include/net/red.h
index 5918f78..9665582 100644
--- a/include/net/red.h
+++ b/include/net/red.h
@@ -168,6 +168,17 @@ static inline void red_set_vars(struct red_vars *v)
v->qcount   = -1;
 }
 
+static inline bool red_check_params(u32 qth_min, u32 qth_max, u8 Wlog)
+{
+   if (fls(qth_min) + Wlog > 32)
+   return false;
+   if (fls(qth_max) + Wlog > 32)
+   return false;
+   if (qth_max < qth_min)
+   return false;
+   return true;
+}
+
 static inline void red_set_parms(struct red_parms *p,
 u32 qth_min, u32 qth_max, u8 Wlog, u8 Plog,
 u8 Scell_log, u8 *stab, u32 max_P)
diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c
index b30a2c7..531250f 100644
--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -369,6 +369,9 @@ static int choke_change(struct Qdisc *sch, struct nlattr 
*opt)
 
ctl = nla_data(tb[TCA_CHOKE_PARMS]);
 
+   if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog))
+   return -EINVAL;
+
if (ctl->limit > CHOKE_MAX_QUEUE)
return -EINVAL;
 
diff --git a/net/sched/sch_gred.c b/net/sched/sch_gred.c
index 17c7130..bc30f91 100644
--- a/net/sched/sch_gred.c
+++ b/net/sched/sch_gred.c
@@ -356,6 +356,9 @@ static inline int gred_change_vq(struct Qdisc *sch, int dp,
struct gred_sched *table = qdisc_priv(sch);
struct gred_sched_data *q = table->tab[dp];
 
+   if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog))
+   return -EINVAL;
+
if (!q) {
table->tab[dp] = q = *prealloc;
*prealloc = NULL;
diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c
index 7f8ea9e..9d874e6 100644
--- a/net/sched/sch_red.c
+++ b/net/sched/sch_red.c
@@ -212,6 +212,8 @@ static int red_change(struct Qdisc *sch, struct nlattr *opt)
max_P = tb[TCA_RED_MAX_P] ? nla_get_u32(tb[TCA_RED_MAX_P]) : 0;
 
ctl = nla_data(tb[TCA_RED_PARMS]);
+   if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog))
+   return -EINVAL;
 
if (ctl->limit > 0) {
child = fifo_create_dflt(sch, _qdisc_ops, ctl->limit);
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 09c1203..930e5bd 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -639,6 +639,9 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt)
if (ctl->divisor &&
(!is_power_of_2(ctl->divisor) || ctl->divisor > 65536))
return -EINVAL;
+   if (ctl_v1 && !red_check_params(ctl_v1->qth_min, ctl_v1->qth_max,
+   ctl_v1->Wlog))
+   return -EINVAL;
if (ctl_v1 && ctl_v1->qth_min) {
p = kmalloc(sizeof(*p), GFP_KERNEL);
if (!p)
-- 
2.4.3



[patch net 1/2] net_sched: red: Avoid devision by zero

2017-12-04 Thread Nogah Frankel
Do not allow delta value to be zero since it is used as a divisor.

Fixes: 8af2a218de38 ("sch_red: Adaptative RED AQM")
Signed-off-by: Nogah Frankel <nog...@mellanox.com>
---
 include/net/red.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/red.h b/include/net/red.h
index 9a93477..5918f78 100644
--- a/include/net/red.h
+++ b/include/net/red.h
@@ -179,7 +179,7 @@ static inline void red_set_parms(struct red_parms *p,
p->qth_max  = qth_max << Wlog;
p->Wlog = Wlog;
p->Plog = Plog;
-   if (delta < 0)
+   if (delta <= 0)
delta = 1;
p->qth_delta= delta;
if (!max_P) {
-- 
2.4.3



RE: [patch net-next RFC 1/9] net_sch: red: Add offload ability to RED qdisc

2017-10-30 Thread Nogah Frankel


> -Original Message-
> From: Roman Mashak [mailto:m...@mojatatu.com]
> Sent: Monday, October 30, 2017 2:47 PM
> To: Jiri Pirko <j...@resnulli.us>
> Cc: netdev@vger.kernel.org; da...@davemloft.net; Nogah Frankel 
> <nog...@mellanox.com>;
> j...@mojatatu.com; xiyou.wangc...@gmail.com; mlxsw <ml...@mellanox.com>;
> and...@lunn.ch; vivien.dide...@savoirfairelinux.com; f.faine...@gmail.com;
> michael.c...@broadcom.com; ganes...@chelsio.com; Saeed Mahameed
> <sae...@mellanox.com>; Matan Barak <mat...@mellanox.com>; Leon Romanovsky
> <leo...@mellanox.com>; Ido Schimmel <ido...@mellanox.com>;
> jakub.kicin...@netronome.com; simon.hor...@netronome.com;
> pieter.jansenvanvuu...@netronome.com; john.hur...@netronome.com;
> alexander.h.du...@intel.com; Or Gerlitz <ogerl...@mellanox.com>;
> john.fastab...@gmail.com
> Subject: Re: [patch net-next RFC 1/9] net_sch: red: Add offload ability to 
> RED qdisc
> 
> Jiri Pirko <j...@resnulli.us> writes:
> 
> 
> [...]
> 
> > +static void red_unoffload(struct Qdisc *sch)
> > +{
> > +   struct net_device *dev = qdisc_dev(sch);
> > +   struct tc_red_qopt_offload opt = {
> > +   .handle = sch->handle,
> > +   .command = TC_RED_DESTROY,
> > +   .parent = sch->parent,
> > +   };
> > +
> > +   if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
> > +   return;
> > +
> > +   dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_RED,  );
> > +}
> > +
> 
> > @@ -162,6 +179,28 @@ static const struct nla_policy red_policy[TCA_RED_MAX 
> > + 1] = {
> > [TCA_RED_MAX_P] = { .type = NLA_U32 },
> >  };
> >
> > +static int red_offload(struct Qdisc *sch)
> > +{
> > +   struct red_sched_data *q = qdisc_priv(sch);
> > +   struct net_device *dev = qdisc_dev(sch);
> > +   struct tc_red_qopt_offload opt = {
> > +   .handle = sch->handle,
> > +   .command = TC_RED_REPLACE,
> > +   .parent = sch->parent,
> > +   .set = {
> > +   .min = q->parms.qth_min >> q->parms.Wlog,
> > +   .max = q->parms.qth_max >> q->parms.Wlog,
> > +   .probability = q->parms.max_P,
> > +   .is_ecn = red_use_ecn(q),
> > +   },
> > +   };
> > +
> > +   if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
> > +   return -EOPNOTSUPP;
> > +
> > +   return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_RED, );
> > +}
> > +
> 
> [...]
> 
> Can't red_unoffload() and red_offload() be unified in a single API? For
> example, red_offload(struct Qdisc *sch, bool enable) ?

I will unify them.
Thanks.



RE: [patch net-next RFC 1/9] net_sch: red: Add offload ability to RED qdisc

2017-10-30 Thread Nogah Frankel


> -Original Message-
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Monday, October 30, 2017 2:20 PM
> To: j...@resnulli.us
> Cc: netdev@vger.kernel.org; Nogah Frankel <nog...@mellanox.com>; 
> j...@mojatatu.com;
> xiyou.wangc...@gmail.com; mlxsw <ml...@mellanox.com>; and...@lunn.ch;
> vivien.dide...@savoirfairelinux.com; f.faine...@gmail.com; 
> michael.c...@broadcom.com;
> ganes...@chelsio.com; Saeed Mahameed <sae...@mellanox.com>; Matan Barak
> <mat...@mellanox.com>; Leon Romanovsky <leo...@mellanox.com>; Ido Schimmel
> <ido...@mellanox.com>; jakub.kicin...@netronome.com; 
> simon.hor...@netronome.com;
> pieter.jansenvanvuu...@netronome.com; john.hur...@netronome.com;
> alexander.h.du...@intel.com; Or Gerlitz <ogerl...@mellanox.com>;
> john.fastab...@gmail.com
> Subject: Re: [patch net-next RFC 1/9] net_sch: red: Add offload ability to 
> RED qdisc
> 
> From: Jiri Pirko <j...@resnulli.us>
> Date: Mon, 30 Oct 2017 09:56:05 +0100
> 
> > diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
> > index 0e88cc2..743c42a 100644
> > --- a/include/uapi/linux/pkt_sched.h
> > +++ b/include/uapi/linux/pkt_sched.h
> > @@ -255,6 +255,7 @@ struct tc_red_qopt {
> >  #define TC_RED_ECN 1
> >  #define TC_RED_HARDDROP2
> >  #define TC_RED_ADAPTATIVE  4
> > +#define TC_RED_OFFLOADED   8
> >  };
> >
> >  struct tc_red_xstats {
> 
> What keeps a user from setting this flag in the tc_red_qopt it
> passes into the a change operation?

Nothing keeps the user from doing it, but it has no effect.
The decision to offload is the driver's only.
It is basically a read-only flag.



RE: [PATCH iproute2 v5 2/4] ifstat: Add extended statistics to ifstat

2017-02-07 Thread Nogah Frankel

> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Friday, February 03, 2017 8:07 PM
> To: Nogah Frankel <nog...@mellanox.com>
> Cc: netdev@vger.kernel.org; ro...@cumulusnetworks.com; roszenr...@gmail.com;
> jb...@redhat.com; sergei.shtyl...@cogentembedded.com; Jiri Pirko
> <j...@mellanox.com>; Elad Raz <el...@mellanox.com>; Ido Schimmel
> <ido...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or Gerlitz
> <ogerl...@mellanox.com>
> Subject: Re: [PATCH iproute2 v5 2/4] ifstat: Add extended statistics to ifstat
> 
> On Thu, 26 Jan 2017 14:44:39 +0200
> Nogah Frankel <nog...@mellanox.com> wrote:
> 
> > Extended stats are part of the RTM_GETSTATS method. This patch adds them
> > to ifstat.
> > While extended stats can come in many forms, we support only the
> > rtnl_link_stats64 struct for them (which is the 64 bits version of struct
> > rtnl_link_stats).
> > We support stats in the main nesting level, or one lower.
> > The extension can be called by its name or any shorten of it. If there is
> > more than one matched, the first one will be picked.
> >
> > To get the extended stats the flag -x  is used.
> >
> > Signed-off-by: Nogah Frankel <nog...@mellanox.com>
> > Reviewed-by: Jiri Pirko <j...@mellanox.com>
> 
> Sorry I was confused because RTM_GETSTATS contains multiple statistics.
> Your patch is about getting LINK_XSTATS and after looking in more detail, you 
> are
> correct this should be an option. Although it would make sense to show this 
> as addition
> to the basic statistics. And when I tested it no output happens which seems 
> confusing.
> 
> $ ./misc/ifstat -p -x cpu_hits
> #kernel
> InterfaceRX Pkts/RateTX Pkts/RateRX Data/RateTX Data/Rate
>  RX Errs/DropTX Errs/DropRX Over/RateTX Coll/Rate
> 

Not all devices support this xstat.
Do you prefer another print in this case?

About printing both the xstat and the default stats together, it may be 
problematic since 
ifstat is about diffs. I think it is better that one ifstat call for a specific 
stats, won't change
the other stats data. (And since we are talking about diffs, reading data 
meaning changing
it).

> What I was intending in earlier discussion was using IFLA_STATS_LINK_64 which 
> would
> allow supporting 64 bit statistics on 32 bit platforms.
> 


RE: [PATCH iproute2 v5 2/4] ifstat: Add extended statistics to ifstat

2017-02-02 Thread Nogah Frankel


> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Monday, January 30, 2017 6:34 AM
> To: Nogah Frankel <nog...@mellanox.com>
> Cc: netdev@vger.kernel.org; ro...@cumulusnetworks.com; roszenr...@gmail.com;
> jb...@redhat.com; sergei.shtyl...@cogentembedded.com; Jiri Pirko
> <j...@mellanox.com>; Elad Raz <el...@mellanox.com>; Ido Schimmel
> <ido...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or Gerlitz
> <ogerl...@mellanox.com>
> Subject: Re: [PATCH iproute2 v5 2/4] ifstat: Add extended statistics to ifstat
> 
> On Thu, 26 Jan 2017 14:44:39 +0200
> Nogah Frankel <nog...@mellanox.com> wrote:
> 
> > Extended stats are part of the RTM_GETSTATS method. This patch adds them
> > to ifstat.
> > While extended stats can come in many forms, we support only the
> > rtnl_link_stats64 struct for them (which is the 64 bits version of struct
> > rtnl_link_stats).
> > We support stats in the main nesting level, or one lower.
> > The extension can be called by its name or any shorten of it. If there is
> > more than one matched, the first one will be picked.
> >
> > To get the extended stats the flag -x  is used.
> 
> It would be better if this command used 64 bit statistics transparently
> like other tools in iproute2. Transparency is always better user experience.

From user point of view, all the statistics in ifstat are 64 bits and are saved 
that way.
We currently get the stats in 32 bits structs and translate them to 64 bits. It 
keeps track
of  int overflows (in the 32 bits).
The problem with it is  that if between one ifstat call to another passes too 
much time,
one might lose one of the overflows. 
But to change it is not a trivial matter, and I believe it is not related to 
the xstat feature I
want to add.
I don't think I understood what is the change you want.


[PATCH iproute2 v5 3/4] ifstat: Add "sw only" extended statistics to ifstat

2017-01-26 Thread Nogah Frankel
Add support for extended statistics of SW only type, for counting only the
packets that went via the cpu. (useful for systems with forward
offloading). It reads it from filter type IFLA_STATS_LINK_OFFLOAD_XSTATS
and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT.

It is under the name 'cpu_hits'
(or any shorten of it as 'cpu' or simply 'c')

For example:
ifstat -x c

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 9467119..a853ee6 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -729,7 +729,8 @@ static int verify_forging(int fd)
 static void xstat_usage(void)
 {
fprintf(stderr,
-"Usage: ifstat supported xstats:\n");
+"Usage: ifstat supported xstats:\n"
+"   cpu_hits   Counts only packets that went via the CPU.\n");
 }
 
 struct extended_stats_options_t {
@@ -743,6 +744,7 @@ struct extended_stats_options_t {
  * Name length must be under 64 chars.
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
+   {"cpu_hits",  IFLA_STATS_LINK_OFFLOAD_XSTATS, 
IFLA_OFFLOAD_XSTATS_CPU_HIT},
 };
 
 static const char *get_filter_type(const char *name)
-- 
2.4.3



[PATCH iproute2 v5 2/4] ifstat: Add extended statistics to ifstat

2017-01-26 Thread Nogah Frankel
Extended stats are part of the RTM_GETSTATS method. This patch adds them
to ifstat.
While extended stats can come in many forms, we support only the
rtnl_link_stats64 struct for them (which is the 64 bits version of struct
rtnl_link_stats).
We support stats in the main nesting level, or one lower.
The extension can be called by its name or any shorten of it. If there is
more than one matched, the first one will be picked.

To get the extended stats the flag -x  is used.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 160 --
 1 file changed, 145 insertions(+), 15 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 5bcbcc8..9467119 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -34,6 +34,7 @@
 #include "libnetlink.h"
 #include "json_writer.h"
 #include "SNAPSHOT.h"
+#include "utils.h"
 
 int dump_zeros;
 int reset_history;
@@ -48,17 +49,21 @@ int pretty;
 double W;
 char **patterns;
 int npatterns;
+bool is_extended;
+int filter_type;
+int sub_type;
 
 char info_source[128];
 int source_mismatch;
 
 #define MAXS (sizeof(struct rtnl_link_stats)/sizeof(__u32))
+#define NO_SUB_TYPE 0x
 
 struct ifstat_ent {
struct ifstat_ent   *next;
char*name;
int ifindex;
-   unsigned long long  val[MAXS];
+   __u64   val[MAXS];
double  rate[MAXS];
__u32   ival[MAXS];
 };
@@ -106,6 +111,48 @@ static int match(const char *id)
return 0;
 }
 
+static int get_nlmsg_extended(const struct sockaddr_nl *who,
+ struct nlmsghdr *m, void *arg)
+{
+   struct if_stats_msg *ifsm = NLMSG_DATA(m);
+   struct rtattr *tb[IFLA_STATS_MAX+1];
+   int len = m->nlmsg_len;
+   struct ifstat_ent *n;
+
+   if (m->nlmsg_type != RTM_NEWSTATS)
+   return 0;
+
+   len -= NLMSG_LENGTH(sizeof(*ifsm));
+   if (len < 0)
+   return -1;
+
+   parse_rtattr(tb, IFLA_STATS_MAX, IFLA_STATS_RTA(ifsm), len);
+   if (tb[filter_type] == NULL)
+   return 0;
+
+   n = malloc(sizeof(*n));
+   if (!n)
+   abort();
+
+   n->ifindex = ifsm->ifindex;
+   n->name = strdup(ll_index_to_name(ifsm->ifindex));
+
+   if (sub_type == NO_SUB_TYPE) {
+   memcpy(>val, RTA_DATA(tb[filter_type]), sizeof(n->val));
+   } else {
+   struct rtattr *attr;
+
+   attr = parse_rtattr_one_nested(sub_type, tb[filter_type]);
+   if (attr == NULL)
+   return 0;
+   memcpy(>val, RTA_DATA(attr), sizeof(n->val));
+   }
+   memset(>rate, 0, sizeof(n->rate));
+   n->next = kern_db;
+   kern_db = n;
+   return 0;
+}
+
 static int get_nlmsg(const struct sockaddr_nl *who,
 struct nlmsghdr *m, void *arg)
 {
@@ -147,18 +194,34 @@ static void load_info(void)
 {
struct ifstat_ent *db, *n;
struct rtnl_handle rth;
+   __u32 filter_mask;
 
if (rtnl_open(, 0) < 0)
exit(1);
 
-   if (rtnl_wilddump_request(, AF_INET, RTM_GETLINK) < 0) {
-   perror("Cannot send dump request");
-   exit(1);
-   }
+   if (is_extended) {
+   ll_init_map();
+   filter_mask = IFLA_STATS_FILTER_BIT(filter_type);
+   if (rtnl_wilddump_stats_req_filter(, AF_UNSPEC, 
RTM_GETSTATS,
+  filter_mask) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
 
-   if (rtnl_dump_filter(, get_nlmsg, NULL) < 0) {
-   fprintf(stderr, "Dump terminated\n");
-   exit(1);
+   if (rtnl_dump_filter(, get_nlmsg_extended, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
+   } else {
+   if (rtnl_wilddump_request(, AF_INET, RTM_GETLINK) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
+
+   if (rtnl_dump_filter(, get_nlmsg, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
}
 
rtnl_close();
@@ -553,10 +616,17 @@ static void update_db(int interval)
}
for (i = 0; i < MAXS; i++) {
double sample;
-   unsigned long incr = h1->ival[i] - 
n->ival[i];
+

[PATCH iproute2 v5 4/4] ifstat: Add xstat to ifstat man page

2017-01-26 Thread Nogah Frankel
Add documentation about the extended statistics to the ifstat man page.
Add ifstat man age to the man8 Makefile

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 man/man8/Makefile |  3 ++-
 man/man8/ifstat.8 | 12 +++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/man/man8/Makefile b/man/man8/Makefile
index 77d347c..bc2fc81 100644
--- a/man/man8/Makefile
+++ b/man/man8/Makefile
@@ -18,7 +18,8 @@ MAN8PAGES = $(TARGETS) ip.8 arpd.8 lnstat.8 routel.8 rtacct.8 
rtmon.8 rtpr.8 ss.
tc-connmark.8 tc-csum.8 tc-mirred.8 tc-nat.8 tc-pedit.8 tc-police.8 \
tc-simple.8 tc-skbedit.8 tc-vlan.8 tc-xt.8  tc-ife.8 \
tc-tunnel_key.8 \
-   devlink.8 devlink-dev.8 devlink-monitor.8 devlink-port.8 devlink-sb.8
+   devlink.8 devlink-dev.8 devlink-monitor.8 devlink-port.8 devlink-sb.8 \
+   ifstat.8
 
 all: $(TARGETS)
 
diff --git a/man/man8/ifstat.8 b/man/man8/ifstat.8
index e49d868..3ba0088 100644
--- a/man/man8/ifstat.8
+++ b/man/man8/ifstat.8
@@ -14,7 +14,8 @@ ifstat \- handy utility to read network interface statistics
 The utility keeps records of the previous data displayed in history files and
 by default only shows difference between the last and the current call.
 Location of the history files defaults to /tmp/.ifstat.u$UID but may be
-overridden with the IFSTAT_HISTORY environment variable.
+overridden with the IFSTAT_HISTORY environment variable. Similarly, the default
+location for xstat (extended stats) is /tmp/._ifstat.u$UID.
 .SH OPTIONS
 .TP
 .B \-h, \-\-help
@@ -46,6 +47,15 @@ Report average over the last SECS seconds.
 .TP
 .B \-z, \-\-zeros
 Show entries with zero activity.
+.TP
+.B \-x, \-\-extended=TYPE
+Show extended stats of TYPE. Supported types are:
+
+.in +8
+.B cpu_hits
+- Counts only packets that went via the CPU.
+.in -8
+
 .SH ENVIRONMENT
 .TP
 .B IFSTAT_HISTORY
-- 
2.4.3



[PATCH iproute2 v5 1/4] ifstat: Includes reorder

2017-01-26 Thread Nogah Frankel
Reorder the includes in misc/ifstat.c to match convention.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 92d67b0..5bcbcc8 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -28,12 +28,12 @@
 #include 
 #include 
 
-#include 
-#include 
 #include 
 #include 
 
-#include 
+#include "libnetlink.h"
+#include "json_writer.h"
+#include "SNAPSHOT.h"
 
 int dump_zeros;
 int reset_history;
-- 
2.4.3



[PATCH iproute2 v5 0/4] update ifstat for new stats

2017-01-26 Thread Nogah Frankel
Previously stats were gotten by RTM_GETLINK which returns 32 bits based
statistics. It supports only one type of stats.
Lately, a new method to get stats was added - RTM_GETSTATS. It supports
ability to choose stats type. The basic stats were changed from 32 bits
based to 64 bits based.

This patchset adds ifstat the ability to get extended stats (xstats) by
this method. Its adds xstat of cpu_hits - for packets that hit cpu.

---
v4->v5:
 - remove patch 3/4 that added 64 based "normal" stats
 - add a new patch 4/4 that updates ifstat man page

v3->v4:
- patch 2/4:
 - change xstat name read to avoid redundant copy.
 - delete extra line
- patch 4/4:
 - change xstat name.

v2->v3:
- patch 1/4:
 - add a new patch to reorder includes in misc/ifstat.c
- patch 2/4: (previously 1/3)
 - fix typos.
 - change error print to use fprintf.

v1->v2:
 - change from using RTM_GETSTATS always to using it only for extended
   stats.
 - Add 64bits extended stats type.

Nogah Frankel (4):
  ifstat: Includes reorder
  ifstat: Add extended statistics to ifstat
  ifstat: Add "sw only" extended statistics to ifstat
  ifstat: Add xstat to ifstat man page

 man/man8/Makefile |   3 +-
 man/man8/ifstat.8 |  12 +++-
 misc/ifstat.c | 168 --
 3 files changed, 163 insertions(+), 20 deletions(-)

-- 
2.4.3



RE: [PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to extended statistics

2017-01-22 Thread Nogah Frankel

> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Thursday, January 19, 2017 9:11 PM
> To: Roopa Prabhu <ro...@cumulusnetworks.com>
> Cc: Nogah Frankel <nog...@mellanox.com>; netdev@vger.kernel.org;
> roszenr...@gmail.com; Jiri Pirko <j...@mellanox.com>; Ido Schimmel
> <ido...@mellanox.com>; Elad Raz <el...@mellanox.com>; Yotam Gigi
> <yot...@mellanox.com>; Or Gerlitz <ogerl...@mellanox.com>
> Subject: Re: [PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to 
> extended statistics
> 
> On Thu, 19 Jan 2017 08:06:21 -0800
> Roopa Prabhu <ro...@cumulusnetworks.com> wrote:
> 
> > On 1/19/17, 7:21 AM, Nogah Frankel wrote:
> > >> -Original Message-
> > >> From: Nogah Frankel
> > >> Sent: Sunday, January 15, 2017 3:55 PM
> > >> To: 'Stephen Hemminger' <step...@networkplumber.org>
> > >> Cc: netdev@vger.kernel.org; roszenr...@gmail.com;
> ro...@cumulusnetworks.com; Jiri
> > >> Pirko <j...@mellanox.com>; Ido Schimmel <ido...@mellanox.com>; Elad Raz
> > >> <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or Gerlitz
> > >> <ogerl...@mellanox.com>
> > >> Subject: RE: [PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to 
> > >> extended
> statistics
> > >>
> > >>
> > >>
> > >>> -Original Message-
> > >>> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> > >>> Sent: Friday, January 13, 2017 3:44 AM
> > >>> To: Nogah Frankel <nog...@mellanox.com>
> > >>> Cc: netdev@vger.kernel.org; roszenr...@gmail.com;
> ro...@cumulusnetworks.com;
> > >> Jiri
> > >>> Pirko <j...@mellanox.com>; Ido Schimmel <ido...@mellanox.com>; Elad Raz
> > >>> <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or Gerlitz
> > >>> <ogerl...@mellanox.com>
> > >>> Subject: Re: [PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to 
> > >>> extended
> statistics
> > >>>
> > >>> On Thu, 12 Jan 2017 15:49:50 +0200
> > >>> Nogah Frankel <nog...@mellanox.com> wrote:
> > >>>
> > >>>> The default stats for ifstat are 32 bits based.
> > >>>> The kernel supports 64 bits based stats. (They are returned in struct
> > >>>> rtnl_link_stats64 which is an exact copy of struct rtnl_link_stats, in
> > >>>> which the "normal" stats are returned, but with fields of u64 instead 
> > >>>> of
> > >>>> u32). This patch adds them as an extended stats.
> > >>>>
> > >>>> It is read with filter type IFLA_STATS_LINK_64 and no sub type.
> > >>>>
> > >>>> It is under the name 64bits
> > >>>> (or any shorten of it as "64")
> > >>>>
> > >>>> For example:
> > >>>> ifstat -x 64bit
> > >>>>
> > >>>> Signed-off-by: Nogah Frankel <nog...@mellanox.com>
> > >>>> Reviewed-by: Jiri Pirko <j...@mellanox.com>
> > >>> Other commands (like ip link) always use the 64 bit statistics if 
> > >>> available
> > >>> from the device. I see no reason that ifstat needs to be different.
> > >>>
> > >> Do you mean to change the default ifstat results to be 64 bits based?
> > >> I tried it in the first version, but Roopa commented that it was not a 
> > >> good idea.
> > >> She said they tried it in the past and it caused backward 
> > >> compatibilities problems.
> > >> (Or maybe I didn't understand correctly)
> > > So, can I leave the default ifstat results to be 32 bits based, for the 
> > > time being?
> > >
> > From past discussions: Moving the default to 64bit has compat issues with 
> > the old
> history file.
> > There is a way to make it work by using a new file header (to indicate that 
> > it is 64 bit) in
> > a freshly created history file and also check this header before dumping 
> > stats into the
> history file.
> > ie maintain backward compat without introducing a new option. It is doable.
> >
> > One approach is, you can drop the 64bit option from this series and
> > try updating the default to 64 bit (with compat handling code) in a later 
> > series.

I think I will take your suggestion to drop the 64 bits from this series.
Hopefully, I'll return to it in some later series in the future.
Thanks

> 
> The ifstat code could do conversion based on file size.
> 
>   if (history_file_is_32bit()) {
>   printf("converting to 64 bit format\n");
> ...
>   }
> 
> 



RE: [PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to extended statistics

2017-01-19 Thread Nogah Frankel
> -Original Message-
> From: Nogah Frankel
> Sent: Sunday, January 15, 2017 3:55 PM
> To: 'Stephen Hemminger' <step...@networkplumber.org>
> Cc: netdev@vger.kernel.org; roszenr...@gmail.com; ro...@cumulusnetworks.com; 
> Jiri
> Pirko <j...@mellanox.com>; Ido Schimmel <ido...@mellanox.com>; Elad Raz
> <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or Gerlitz
> <ogerl...@mellanox.com>
> Subject: RE: [PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to 
> extended statistics
> 
> 
> 
> > -Original Message-
> > From: Stephen Hemminger [mailto:step...@networkplumber.org]
> > Sent: Friday, January 13, 2017 3:44 AM
> > To: Nogah Frankel <nog...@mellanox.com>
> > Cc: netdev@vger.kernel.org; roszenr...@gmail.com; ro...@cumulusnetworks.com;
> Jiri
> > Pirko <j...@mellanox.com>; Ido Schimmel <ido...@mellanox.com>; Elad Raz
> > <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or Gerlitz
> > <ogerl...@mellanox.com>
> > Subject: Re: [PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to 
> > extended statistics
> >
> > On Thu, 12 Jan 2017 15:49:50 +0200
> > Nogah Frankel <nog...@mellanox.com> wrote:
> >
> > > The default stats for ifstat are 32 bits based.
> > > The kernel supports 64 bits based stats. (They are returned in struct
> > > rtnl_link_stats64 which is an exact copy of struct rtnl_link_stats, in
> > > which the "normal" stats are returned, but with fields of u64 instead of
> > > u32). This patch adds them as an extended stats.
> > >
> > > It is read with filter type IFLA_STATS_LINK_64 and no sub type.
> > >
> > > It is under the name 64bits
> > > (or any shorten of it as "64")
> > >
> > > For example:
> > > ifstat -x 64bit
> > >
> > > Signed-off-by: Nogah Frankel <nog...@mellanox.com>
> > > Reviewed-by: Jiri Pirko <j...@mellanox.com>
> >
> > Other commands (like ip link) always use the 64 bit statistics if available
> > from the device. I see no reason that ifstat needs to be different.
> >
> 
> Do you mean to change the default ifstat results to be 64 bits based?
> I tried it in the first version, but Roopa commented that it was not a good 
> idea.
> She said they tried it in the past and it caused backward compatibilities 
> problems.
> (Or maybe I didn't understand correctly)

So, can I leave the default ifstat results to be 32 bits based, for the time 
being?



RE: [PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to extended statistics

2017-01-15 Thread Nogah Frankel


> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Friday, January 13, 2017 3:44 AM
> To: Nogah Frankel <nog...@mellanox.com>
> Cc: netdev@vger.kernel.org; roszenr...@gmail.com; ro...@cumulusnetworks.com; 
> Jiri
> Pirko <j...@mellanox.com>; Ido Schimmel <ido...@mellanox.com>; Elad Raz
> <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or Gerlitz
> <ogerl...@mellanox.com>
> Subject: Re: [PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to 
> extended statistics
> 
> On Thu, 12 Jan 2017 15:49:50 +0200
> Nogah Frankel <nog...@mellanox.com> wrote:
> 
> > The default stats for ifstat are 32 bits based.
> > The kernel supports 64 bits based stats. (They are returned in struct
> > rtnl_link_stats64 which is an exact copy of struct rtnl_link_stats, in
> > which the "normal" stats are returned, but with fields of u64 instead of
> > u32). This patch adds them as an extended stats.
> >
> > It is read with filter type IFLA_STATS_LINK_64 and no sub type.
> >
> > It is under the name 64bits
> > (or any shorten of it as "64")
> >
> > For example:
> > ifstat -x 64bit
> >
> > Signed-off-by: Nogah Frankel <nog...@mellanox.com>
> > Reviewed-by: Jiri Pirko <j...@mellanox.com>
> 
> Other commands (like ip link) always use the 64 bit statistics if available
> from the device. I see no reason that ifstat needs to be different.
> 

Do you mean to change the default ifstat results to be 64 bits based?
I tried it in the first version, but Roopa commented that it was not a good 
idea.
She said they tried it in the past and it caused backward compatibilities 
problems.
(Or maybe I didn't understand correctly)



RE: [PATCH iproute2 v4 1/4] ifstat: Includes reorder

2017-01-15 Thread Nogah Frankel


> -Original Message-
> From: Sergei Shtylyov [mailto:sergei.shtyl...@cogentembedded.com]
> Sent: Friday, January 13, 2017 12:11 PM
> To: Nogah Frankel <nog...@mellanox.com>; netdev@vger.kernel.org
> Cc: step...@networkplumber.org; roszenr...@gmail.com;
> ro...@cumulusnetworks.com; Jiri Pirko <j...@mellanox.com>; Ido Schimmel
> <ido...@mellanox.com>; Elad Raz <el...@mellanox.com>; Yotam Gigi
> <yot...@mellanox.com>; Or Gerlitz <ogerl...@mellanox.com>
> Subject: Re: [PATCH iproute2 v4 1/4] ifstat: Includes reorder
> 
> On 1/12/2017 4:49 PM, Nogah Frankel wrote:
> 
> > Reorder the includes order in misc/ifstat.c to match convention.
> 
> "Reorder the ... order" sounds a bit tautological. :-)
> 
> MBR, Sergei

I'll find a better description.

Thanks :)


[PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to extended statistics

2017-01-12 Thread Nogah Frankel
The default stats for ifstat are 32 bits based.
The kernel supports 64 bits based stats. (They are returned in struct
rtnl_link_stats64 which is an exact copy of struct rtnl_link_stats, in
which the "normal" stats are returned, but with fields of u64 instead of
u32). This patch adds them as an extended stats.

It is read with filter type IFLA_STATS_LINK_64 and no sub type.

It is under the name 64bits
(or any shorten of it as "64")

For example:
ifstat -x 64bit

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 9467119..3478f0a 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -729,7 +729,8 @@ static int verify_forging(int fd)
 static void xstat_usage(void)
 {
fprintf(stderr,
-"Usage: ifstat supported xstats:\n");
+"Usage: ifstat supported xstats:\n"
+"   64bits default stats, with 64 bits support\n");
 }
 
 struct extended_stats_options_t {
@@ -743,6 +744,7 @@ struct extended_stats_options_t {
  * Name length must be under 64 chars.
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
+   {"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
 };
 
 static const char *get_filter_type(const char *name)
-- 
2.4.3



[PATCH iproute2 v4 2/4] ifstat: Add extended statistics to ifstat

2017-01-12 Thread Nogah Frankel
Extended stats are part of the RTM_GETSTATS method. This patch adds them
to ifstat.
While extended stats can come in many forms, we support only the
rtnl_link_stats64 struct for them (which is the 64 bits version of struct
rtnl_link_stats).
We support stats in the main nesting level, or one lower.
The extension can be called by its name or any shorten of it. If there is
more than one matched, the first one will be picked.

To get the extended stats the flag -x  is used.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 160 --
 1 file changed, 145 insertions(+), 15 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 5bcbcc8..9467119 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -34,6 +34,7 @@
 #include "libnetlink.h"
 #include "json_writer.h"
 #include "SNAPSHOT.h"
+#include "utils.h"
 
 int dump_zeros;
 int reset_history;
@@ -48,17 +49,21 @@ int pretty;
 double W;
 char **patterns;
 int npatterns;
+bool is_extended;
+int filter_type;
+int sub_type;
 
 char info_source[128];
 int source_mismatch;
 
 #define MAXS (sizeof(struct rtnl_link_stats)/sizeof(__u32))
+#define NO_SUB_TYPE 0x
 
 struct ifstat_ent {
struct ifstat_ent   *next;
char*name;
int ifindex;
-   unsigned long long  val[MAXS];
+   __u64   val[MAXS];
double  rate[MAXS];
__u32   ival[MAXS];
 };
@@ -106,6 +111,48 @@ static int match(const char *id)
return 0;
 }
 
+static int get_nlmsg_extended(const struct sockaddr_nl *who,
+ struct nlmsghdr *m, void *arg)
+{
+   struct if_stats_msg *ifsm = NLMSG_DATA(m);
+   struct rtattr *tb[IFLA_STATS_MAX+1];
+   int len = m->nlmsg_len;
+   struct ifstat_ent *n;
+
+   if (m->nlmsg_type != RTM_NEWSTATS)
+   return 0;
+
+   len -= NLMSG_LENGTH(sizeof(*ifsm));
+   if (len < 0)
+   return -1;
+
+   parse_rtattr(tb, IFLA_STATS_MAX, IFLA_STATS_RTA(ifsm), len);
+   if (tb[filter_type] == NULL)
+   return 0;
+
+   n = malloc(sizeof(*n));
+   if (!n)
+   abort();
+
+   n->ifindex = ifsm->ifindex;
+   n->name = strdup(ll_index_to_name(ifsm->ifindex));
+
+   if (sub_type == NO_SUB_TYPE) {
+   memcpy(>val, RTA_DATA(tb[filter_type]), sizeof(n->val));
+   } else {
+   struct rtattr *attr;
+
+   attr = parse_rtattr_one_nested(sub_type, tb[filter_type]);
+   if (attr == NULL)
+   return 0;
+   memcpy(>val, RTA_DATA(attr), sizeof(n->val));
+   }
+   memset(>rate, 0, sizeof(n->rate));
+   n->next = kern_db;
+   kern_db = n;
+   return 0;
+}
+
 static int get_nlmsg(const struct sockaddr_nl *who,
 struct nlmsghdr *m, void *arg)
 {
@@ -147,18 +194,34 @@ static void load_info(void)
 {
struct ifstat_ent *db, *n;
struct rtnl_handle rth;
+   __u32 filter_mask;
 
if (rtnl_open(, 0) < 0)
exit(1);
 
-   if (rtnl_wilddump_request(, AF_INET, RTM_GETLINK) < 0) {
-   perror("Cannot send dump request");
-   exit(1);
-   }
+   if (is_extended) {
+   ll_init_map();
+   filter_mask = IFLA_STATS_FILTER_BIT(filter_type);
+   if (rtnl_wilddump_stats_req_filter(, AF_UNSPEC, 
RTM_GETSTATS,
+  filter_mask) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
 
-   if (rtnl_dump_filter(, get_nlmsg, NULL) < 0) {
-   fprintf(stderr, "Dump terminated\n");
-   exit(1);
+   if (rtnl_dump_filter(, get_nlmsg_extended, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
+   } else {
+   if (rtnl_wilddump_request(, AF_INET, RTM_GETLINK) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
+
+   if (rtnl_dump_filter(, get_nlmsg, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
}
 
rtnl_close();
@@ -553,10 +616,17 @@ static void update_db(int interval)
}
for (i = 0; i < MAXS; i++) {
double sample;
-   unsigned long incr = h1->ival[i] - 
n->ival[i];
+

[PATCH iproute2 v4 4/4] ifstat: Add "sw only" extended statistics to ifstat

2017-01-12 Thread Nogah Frankel
Add support for extended statistics of SW only type, for counting only the
packets that went via the cpu. (useful for systems with forward
offloading). It reads it from filter type IFLA_STATS_LINK_OFFLOAD_XSTATS
and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT.

It is under the name 'cpu_hits'
(or any shorten of it as 'cpu' or simply 'c')

For example:
ifstat -x c

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 3478f0a..5b6a36b 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -730,7 +730,8 @@ static void xstat_usage(void)
 {
fprintf(stderr,
 "Usage: ifstat supported xstats:\n"
-"   64bits default stats, with 64 bits support\n");
+"   64bits default stats, with 64 bits support\n"
+"   cpu_hits   Counts only packets that went via the CPU.\n");
 }
 
 struct extended_stats_options_t {
@@ -745,6 +746,7 @@ struct extended_stats_options_t {
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
{"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
+   {"cpu_hits",  IFLA_STATS_LINK_OFFLOAD_XSTATS, 
IFLA_OFFLOAD_XSTATS_CPU_HIT},
 };
 
 static const char *get_filter_type(const char *name)
-- 
2.4.3



[PATCH iproute2 v4 0/4] update ifstat for new stats

2017-01-12 Thread Nogah Frankel
Previously stats were gotten by RTM_GETLINK which returns 32 bits based
statistics. It supports only one type of stats.
Lately, a new method to get stats was added - RTM_GETSTATS. It supports
ability to choose stats type. The basic stats were changed from 32 bits
based to 64 bits based.

This patchset adds ifstat the ability to get extended stats by this
method. Its adds two types of extended stats:
64bits - the same as the "normal" stats but get the stats from the cpu
in 64 bits based struct.
cpu_hits - for packets that hit cpu.

---
v3->v4:
- patch 2/4:
 - change xstat name read to avoid redundant copy.
 - delete extra line
- patch 4/4:
 - change xstat name.

v2->v3:
- patch 1/4:
 - add a new patch to reorder includes in misc/ifstat.c
- patch 2/4: (previously 1/3)
 - fix typos.
 - change error print to use fprintf.

v1->v2:
 - change from using RTM_GETSTATS always to using it only for extended
   stats.
 - Add 64bits extended stats type.

Nogah Frankel (4):
  ifstat: Includes reorder
  ifstat: Add extended statistics to ifstat
  ifstat: Add 64 bits based stats to extended statistics
  ifstat: Add "sw only" extended statistics to ifstat

 misc/ifstat.c | 170 +++---
 1 file changed, 152 insertions(+), 18 deletions(-)

-- 
2.4.3



[PATCH iproute2 v4 1/4] ifstat: Includes reorder

2017-01-12 Thread Nogah Frankel
Reorder the includes order in misc/ifstat.c to match convention.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 92d67b0..5bcbcc8 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -28,12 +28,12 @@
 #include 
 #include 
 
-#include 
-#include 
 #include 
 #include 
 
-#include 
+#include "libnetlink.h"
+#include "json_writer.h"
+#include "SNAPSHOT.h"
 
 int dump_zeros;
 int reset_history;
-- 
2.4.3



RE: [PATCH iproute2 v3 4/4] ifstat: Add "sw only" extended statistics to ifstat

2016-12-25 Thread Nogah Frankel


> -Original Message-
> From: Roopa Prabhu [mailto:ro...@cumulusnetworks.com]
> Sent: Thursday, December 22, 2016 11:10 PM
> To: Nogah Frankel <nog...@mellanox.com>
> Cc: netdev@vger.kernel.org; step...@networkplumber.org; roszenr...@gmail.com; 
> Or
> Gerlitz <ogerl...@mellanox.com>; Jiri Pirko <j...@mellanox.com>; Elad Raz
> <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Ido Schimmel
> <ido...@mellanox.com>
> Subject: Re: [PATCH iproute2 v3 4/4] ifstat: Add "sw only" extended 
> statistics to ifstat
> 
> On 12/22/16, 8:23 AM, Nogah Frankel wrote:
> > Add support for extended statistics of SW only type, for counting only the
> > packets that went via the cpu. (useful for systems with forward
> > offloading). It reads it from filter type IFLA_STATS_LINK_OFFLOAD_XSTATS
> > and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT.
> >
> > It is under the name 'software'
> > (or any shorten of it as 'soft' or simply 's')
> >
> > For example:
> > ifstat -x s
> >
> >
> Nogah, can we keep the option names closer to the attribute names ?
> That would avoid some confusion and help with the follow-up stats.
> 
> ifstat -x offload cpu
> or
> ifstat -x cpu
> 
> for others it would be:
> 
> ifstat -x link [vlan|igmp]
> ifstat -x vlan
> ifstat -x igmp
> ifstat -x lacp
> 
> and so on...
> 
> thanks!

Sure, I will change it.


RE: [PATCH iproute2 v3 2/4] ifstat: Add extended statistics to ifstat

2016-12-25 Thread Nogah Frankel
> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Thursday, December 22, 2016 8:59 PM
> To: Nogah Frankel <nog...@mellanox.com>
> Cc: netdev@vger.kernel.org; ro...@cumulusnetworks.com; roszenr...@gmail.com; 
> Or
> Gerlitz <ogerl...@mellanox.com>; Jiri Pirko <j...@mellanox.com>; Elad Raz
> <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Ido Schimmel
> <ido...@mellanox.com>
> Subject: Re: [PATCH iproute2 v3 2/4] ifstat: Add extended statistics to ifstat
> 
> On Thu, 22 Dec 2016 18:23:13 +0200
> Nogah Frankel <nog...@mellanox.com> wrote:
> On Thu, 22 Dec 2016 18:23:13 +0200
> Nogah Frankel <nog...@mellanox.com> wrote:
> 
> >  }
> > @@ -691,18 +804,22 @@ static const struct option longopts[] = {
> > { "interval", 1, 0, 't' },
> > { "version", 0, 0, 'V' },
> > { "zeros", 0, 0, 'z' },
> > +   { "extended", 1, 0, 'x'},
> > { 0 }
> >  };
> >
> > +
> >  int main(int argc, char *argv[])
> 
> You let extra whitespace changes creep in.
> 
> 
> > +   case 'x':
> > +   is_extended = true;
> > +   memset(stats_type, 0, 64);
> > +   strncpy(stats_type, optarg, 63);
> > +   break;
> 
> This seems like doing this either the paranoid or hard way.
> Why not:
>   const char *stats_type = NULL;
> ...
> 
>   case 'x':
>   stats_type = optarg;
>   break;
> ...
>   if (stats_type)
>   snprintf(hist_name, sizeof(hist_name),
>"%s/.%s_ifstat.u%d", P_tmpdir, stats_type,
>getuid());
>   else
>   snprintf(hist_name, sizeof(hist_name),
>"%s/.ifstat.u%d", P_tmpdir, getuid());
> 
> 
> Since:
>   1) optarg points to area in argv that is persistent (avoid copy)
>   2) don't need is_extended flag value then
> 
> Please cleanup and resubmit.
> 
> 

I will.
Thank you.




[PATCH iproute2 v3 2/4] ifstat: Add extended statistics to ifstat

2016-12-22 Thread Nogah Frankel
Extended stats are part of the RTM_GETSTATS method. This patch adds them
to ifstat.
While extended stats can come in many forms, we support only the
rtnl_link_stats64 struct for them (which is the 64 bits version of struct
rtnl_link_stats).
We support stats in the main nesting level, or one lower.
The extension can be called by its name or any shorten of it. If there is
more than one matched, the first one will be picked.

To get the extended stats the flag -x  is used.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 161 --
 1 file changed, 146 insertions(+), 15 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 5bcbcc8..ce666b3 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -34,6 +34,7 @@
 #include "libnetlink.h"
 #include "json_writer.h"
 #include "SNAPSHOT.h"
+#include "utils.h"
 
 int dump_zeros;
 int reset_history;
@@ -48,17 +49,21 @@ int pretty;
 double W;
 char **patterns;
 int npatterns;
+bool is_extended;
+int filter_type;
+int sub_type;
 
 char info_source[128];
 int source_mismatch;
 
 #define MAXS (sizeof(struct rtnl_link_stats)/sizeof(__u32))
+#define NO_SUB_TYPE 0x
 
 struct ifstat_ent {
struct ifstat_ent   *next;
char*name;
int ifindex;
-   unsigned long long  val[MAXS];
+   __u64   val[MAXS];
double  rate[MAXS];
__u32   ival[MAXS];
 };
@@ -106,6 +111,48 @@ static int match(const char *id)
return 0;
 }
 
+static int get_nlmsg_extended(const struct sockaddr_nl *who,
+ struct nlmsghdr *m, void *arg)
+{
+   struct if_stats_msg *ifsm = NLMSG_DATA(m);
+   struct rtattr *tb[IFLA_STATS_MAX+1];
+   int len = m->nlmsg_len;
+   struct ifstat_ent *n;
+
+   if (m->nlmsg_type != RTM_NEWSTATS)
+   return 0;
+
+   len -= NLMSG_LENGTH(sizeof(*ifsm));
+   if (len < 0)
+   return -1;
+
+   parse_rtattr(tb, IFLA_STATS_MAX, IFLA_STATS_RTA(ifsm), len);
+   if (tb[filter_type] == NULL)
+   return 0;
+
+   n = malloc(sizeof(*n));
+   if (!n)
+   abort();
+
+   n->ifindex = ifsm->ifindex;
+   n->name = strdup(ll_index_to_name(ifsm->ifindex));
+
+   if (sub_type == NO_SUB_TYPE) {
+   memcpy(>val, RTA_DATA(tb[filter_type]), sizeof(n->val));
+   } else {
+   struct rtattr *attr;
+
+   attr = parse_rtattr_one_nested(sub_type, tb[filter_type]);
+   if (attr == NULL)
+   return 0;
+   memcpy(>val, RTA_DATA(attr), sizeof(n->val));
+   }
+   memset(>rate, 0, sizeof(n->rate));
+   n->next = kern_db;
+   kern_db = n;
+   return 0;
+}
+
 static int get_nlmsg(const struct sockaddr_nl *who,
 struct nlmsghdr *m, void *arg)
 {
@@ -147,18 +194,34 @@ static void load_info(void)
 {
struct ifstat_ent *db, *n;
struct rtnl_handle rth;
+   __u32 filter_mask;
 
if (rtnl_open(, 0) < 0)
exit(1);
 
-   if (rtnl_wilddump_request(, AF_INET, RTM_GETLINK) < 0) {
-   perror("Cannot send dump request");
-   exit(1);
-   }
+   if (is_extended) {
+   ll_init_map();
+   filter_mask = IFLA_STATS_FILTER_BIT(filter_type);
+   if (rtnl_wilddump_stats_req_filter(, AF_UNSPEC, 
RTM_GETSTATS,
+  filter_mask) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
 
-   if (rtnl_dump_filter(, get_nlmsg, NULL) < 0) {
-   fprintf(stderr, "Dump terminated\n");
-   exit(1);
+   if (rtnl_dump_filter(, get_nlmsg_extended, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
+   } else {
+   if (rtnl_wilddump_request(, AF_INET, RTM_GETLINK) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
+
+   if (rtnl_dump_filter(, get_nlmsg, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
}
 
rtnl_close();
@@ -553,10 +616,17 @@ static void update_db(int interval)
}
for (i = 0; i < MAXS; i++) {
double sample;
-   unsigned long incr = h1->ival[i] - 
n->ival[i];
+

[PATCH iproute2 v3 1/4] ifstat: Includes reorder

2016-12-22 Thread Nogah Frankel
Reorder the includes order in misc/ifstat.c to match convention.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 92d67b0..5bcbcc8 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -28,12 +28,12 @@
 #include 
 #include 
 
-#include 
-#include 
 #include 
 #include 
 
-#include 
+#include "libnetlink.h"
+#include "json_writer.h"
+#include "SNAPSHOT.h"
 
 int dump_zeros;
 int reset_history;
-- 
2.4.3



[PATCH iproute2 v3 3/4] ifstat: Add 64 bits based stats to extended statistics

2016-12-22 Thread Nogah Frankel
The default stats for ifstat are 32 bits based.
The kernel supports 64 bits based stats. (They are returned in struct
rtnl_link_stats64 which is an exact copy of struct rtnl_link_stats, in
which the "normal" stats are returned, but with fields of u64 instead of
u32). This patch adds them as an extended stats.

It is read with filter type IFLA_STATS_LINK_64 and no sub type.

It is under the name 64bits
(or any shorten of it as "64")

For example:
ifstat -x 64bit

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index ce666b3..8325ac7 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -729,7 +729,8 @@ static int verify_forging(int fd)
 static void xstat_usage(void)
 {
fprintf(stderr,
-"Usage: ifstat supported xstats:\n");
+"Usage: ifstat supported xstats:\n"
+"   64bits default stats, with 64 bits support\n");
 }
 
 struct extended_stats_options_t {
@@ -743,6 +744,7 @@ struct extended_stats_options_t {
  * Name length must be under 64 chars.
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
+   {"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
 };
 
 static bool get_filter_type(char *name)
-- 
2.4.3



[PATCH iproute2 v3 4/4] ifstat: Add "sw only" extended statistics to ifstat

2016-12-22 Thread Nogah Frankel
Add support for extended statistics of SW only type, for counting only the
packets that went via the cpu. (useful for systems with forward
offloading). It reads it from filter type IFLA_STATS_LINK_OFFLOAD_XSTATS
and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT.

It is under the name 'software'
(or any shorten of it as 'soft' or simply 's')

For example:
ifstat -x s

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 8325ac7..79c0dba 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -730,7 +730,8 @@ static void xstat_usage(void)
 {
fprintf(stderr,
 "Usage: ifstat supported xstats:\n"
-"   64bits default stats, with 64 bits support\n");
+"   64bits default stats, with 64 bits support\n"
+"   softwareSW stats. Counts only packets that went via the 
CPU\n");
 }
 
 struct extended_stats_options_t {
@@ -745,6 +746,7 @@ struct extended_stats_options_t {
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
{"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
+   {"software",  IFLA_STATS_LINK_OFFLOAD_XSTATS, 
IFLA_OFFLOAD_XSTATS_CPU_HIT},
 };
 
 static bool get_filter_type(char *name)
-- 
2.4.3



[PATCH iproute2 v3 0/4] update ifstat for new stats

2016-12-22 Thread Nogah Frankel
Previously stats were gotten by RTM_GETLINK which returns 32 bits based
statistics. It supports only one type of stats.
Lately, a new method to get stats was added - RTM_GETSTATS. It supports
ability to choose stats type. The basic stats were changed from 32 bits
based to 64 bits based.

This patchset adds ifstat the ability to get extended stats by this
method. Its adds two types of extended stats:
64bits - the same as the "normal" stats but get the stats from the cpu
in 64 bits based struct.
SW - for packets that hit cpu.

---
v2->v3:
- patch 1/4:
 - add a new patch to reorder includes in misc/ifstat.c
- patch 2/4: (previously 1/3)
 - fix typos.
 - change error print to use fprintf.

v1->v2:
 - change from using RTM_GETSTATS always to using it only for extended
   stats.
 - Add 64bits extended stats type.

Nogah Frankel (4):
  ifstat: Includes reorder
  ifstat: Add extended statistics to ifstat
  ifstat: Add 64 bits based stats to extended statistics
  ifstat: Add "sw only" extended statistics to ifstat

 misc/ifstat.c | 171 +++---
 1 file changed, 153 insertions(+), 18 deletions(-)

-- 
2.4.3



RE: [PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat

2016-12-18 Thread Nogah Frankel
> -Original Message-
> From: Rami Rosen [mailto:roszenr...@gmail.com]
> Sent: Friday, December 16, 2016 5:21 PM
> To: Nogah Frankel <nog...@mellanox.com>
> Cc: Stephen Hemminger <step...@networkplumber.org>; netdev@vger.kernel.org;
> ro...@cumulusnetworks.com; Jiri Pirko <j...@mellanox.com>; Elad Raz
> <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Ido Schimmel
> <ido...@mellanox.com>; Or Gerlitz <ogerl...@mellanox.com>
> Subject: Re: [PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat
> 
> Hi,
> 
> >Thanks, I'll fix it.
> Another minor nit, on this occasion:
> 
> bool is_extanded should be: bool is_extended
> 
> Regards,
> Rami Rosen

Thank you, I will fix it.

Nogah


RE: [PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat

2016-12-16 Thread Nogah Frankel

> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Thursday, December 15, 2016 7:33 PM
> To: Nogah Frankel <nog...@mellanox.com>
> Cc: netdev@vger.kernel.org; ro...@cumulusnetworks.com; Jiri Pirko
> <j...@mellanox.com>; Elad Raz <el...@mellanox.com>; Yotam Gigi
> <yot...@mellanox.com>; Ido Schimmel <ido...@mellanox.com>; Or Gerlitz
> <ogerl...@mellanox.com>
> Subject: Re: [PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat
> 
> On Thu, 15 Dec 2016 15:00:43 +0200
> Nogah Frankel <nog...@mellanox.com> wrote:
> 
> > Extended stats are part of the RTM_GETSTATS method. This patch adds them
> > to ifstat.
> > While extended stats can come in many forms, we support only the
> > rtnl_link_stats64 struct for them (which is the 64 bits version of struct
> > rtnl_link_stats).
> > We support stats in the main nesting level, or one lower.
> > The extension can be called by its name or any shorten of it. If there is
> > more than one matched, the first one will be picked.
> >
> > To get the extended stats the flag -x  is used.
> >
> > Signed-off-by: Nogah Frankel <nog...@mellanox.com>
> > Reviewed-by: Jiri Pirko <j...@mellanox.com>
> > ---
> >  misc/ifstat.c | 161
> --
> >  1 file changed, 146 insertions(+), 15 deletions(-)
> >
> > diff --git a/misc/ifstat.c b/misc/ifstat.c
> > index 92d67b0..d17ae21 100644
> > --- a/misc/ifstat.c
> > +++ b/misc/ifstat.c
> > @@ -35,6 +35,7 @@
> >
> >  #include 
> >
> > +#include "utils.h"
> >  int dump_zeros;
> >  int reset_history;
> >  int ignore_history;
> 
> Minor nit, please cleanup include order here (original code was wrong).
> 
> Standard practice is:
>  #include system headers (like stdio.h etc)
>  #include "xxx.h" local headers.
> 
> Should be:
> #include 
> 
> #include 
> #include 
> 
> #include "json_writer.h"
> #include "libnetlink.h"
> #include "utils.h"
> #include "SNAPSHOT.h"

Thanks, I'll fix it.


[PATCH iproute2 v2 0/3] update ifstat for new stats

2016-12-15 Thread Nogah Frankel
Previously stats were gotten by RTM_GETLINK which returns 32 bits based
statistics. It supports only one type of stats.
Lately, a new method to get stats was added - RTM_GETSTATS. It supports
ability to choose stats type. The basic stats were changed from 32 bits
based to 64 bits based.

This patchset adds ifstat the ability to get extended stats by this
method. Its adds two types of extended stats:
64bits - the same as the "normal" stats but get the stats from the cpu
in 64 bits based struct.
SW - for packets that hit cpu.

---
v1->v2:
 - change from using RTM_GETSTATS always to using it only for extended
   stats.
 - Add 64bits extended stats type.

Nogah Frankel (3):
  ifstat: Add extended statistics to ifstat
  ifstat: Add 64 bits based stats to extended statistics
  ifstat: Add "sw only" extended statistics to ifstat

 misc/ifstat.c | 165 --
 1 file changed, 150 insertions(+), 15 deletions(-)

-- 
2.4.3



[PATCH iproute2 v2 2/3] ifstat: Add 64 bits based stats to extended statistics

2016-12-15 Thread Nogah Frankel
The default stats for ifstat are 32 bits based.
The kernel supports 64 bits based stats. (They are returned in struct
rtnl_link_stats64 which is an exact copy of struct rtnl_link_stats, in
which the "normal" stats are returned, but with fields of u64 instead of
u32). This patch adds them as an extended stats.

It is read with filter type IFLA_STATS_LINK_64 and no sub type.

It is under the name 64bits
(or any shorten of it as "64")

For example:
ifstat -x 64bit

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index d17ae21..ac99d04 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -729,7 +729,8 @@ static int verify_forging(int fd)
 static void xstat_usage(void)
 {
fprintf(stderr,
-"Usage: ifstat supported xstats:\n");
+"Usage: ifstat supported xstats:\n"
+"   64bits default stats, with 64 bits support\n");
 }
 
 struct extended_stats_options_t {
@@ -743,6 +744,7 @@ struct extended_stats_options_t {
  * Name length must be under 64 chars.
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
+   {"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
 };
 
 static bool get_filter_type(char *name)
-- 
2.4.3



[PATCH iproute2 v2 3/3] ifstat: Add "sw only" extended statistics to ifstat

2016-12-15 Thread Nogah Frankel
Add support for extended statistics of SW only type, for counting only the
packets that went via the cpu. (useful for systems with forward
offloading). It reads it from filter type IFLA_STATS_LINK_OFFLOAD_XSTATS
and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT.

It is under the name 'software'
(or any shorten of it as 'soft' or simply 's')

For example:
ifstat -x s

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index ac99d04..62f1f2b 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -730,7 +730,8 @@ static void xstat_usage(void)
 {
fprintf(stderr,
 "Usage: ifstat supported xstats:\n"
-"   64bits default stats, with 64 bits support\n");
+"   64bits default stats, with 64 bits support\n"
+"   softwareSW stats. Counts only packets that went via the 
CPU\n");
 }
 
 struct extended_stats_options_t {
@@ -745,6 +746,7 @@ struct extended_stats_options_t {
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
{"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
+   {"software",  IFLA_STATS_LINK_OFFLOAD_XSTATS, 
IFLA_OFFLOAD_XSTATS_CPU_HIT},
 };
 
 static bool get_filter_type(char *name)
-- 
2.4.3



[PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat

2016-12-15 Thread Nogah Frankel
Extended stats are part of the RTM_GETSTATS method. This patch adds them
to ifstat.
While extended stats can come in many forms, we support only the
rtnl_link_stats64 struct for them (which is the 64 bits version of struct
rtnl_link_stats).
We support stats in the main nesting level, or one lower.
The extension can be called by its name or any shorten of it. If there is
more than one matched, the first one will be picked.

To get the extended stats the flag -x  is used.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 161 --
 1 file changed, 146 insertions(+), 15 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 92d67b0..d17ae21 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -35,6 +35,7 @@
 
 #include 
 
+#include "utils.h"
 int dump_zeros;
 int reset_history;
 int ignore_history;
@@ -48,17 +49,21 @@ int pretty;
 double W;
 char **patterns;
 int npatterns;
+bool is_extanded;
+int filter_type;
+int sub_type;
 
 char info_source[128];
 int source_mismatch;
 
 #define MAXS (sizeof(struct rtnl_link_stats)/sizeof(__u32))
+#define NO_SUB_TYPE 0x
 
 struct ifstat_ent {
struct ifstat_ent   *next;
char*name;
int ifindex;
-   unsigned long long  val[MAXS];
+   __u64   val[MAXS];
double  rate[MAXS];
__u32   ival[MAXS];
 };
@@ -106,6 +111,48 @@ static int match(const char *id)
return 0;
 }
 
+static int get_nlmsg_extanded(const struct sockaddr_nl *who,
+ struct nlmsghdr *m, void *arg)
+{
+   struct if_stats_msg *ifsm = NLMSG_DATA(m);
+   struct rtattr *tb[IFLA_STATS_MAX+1];
+   int len = m->nlmsg_len;
+   struct ifstat_ent *n;
+
+   if (m->nlmsg_type != RTM_NEWSTATS)
+   return 0;
+
+   len -= NLMSG_LENGTH(sizeof(*ifsm));
+   if (len < 0)
+   return -1;
+
+   parse_rtattr(tb, IFLA_STATS_MAX, IFLA_STATS_RTA(ifsm), len);
+   if (tb[filter_type] == NULL)
+   return 0;
+
+   n = malloc(sizeof(*n));
+   if (!n)
+   abort();
+
+   n->ifindex = ifsm->ifindex;
+   n->name = strdup(ll_index_to_name(ifsm->ifindex));
+
+   if (sub_type == NO_SUB_TYPE) {
+   memcpy(>val, RTA_DATA(tb[filter_type]), sizeof(n->val));
+   } else {
+   struct rtattr *attr;
+
+   attr = parse_rtattr_one_nested(sub_type, tb[filter_type]);
+   if (attr == NULL)
+   return 0;
+   memcpy(>val, RTA_DATA(attr), sizeof(n->val));
+   }
+   memset(>rate, 0, sizeof(n->rate));
+   n->next = kern_db;
+   kern_db = n;
+   return 0;
+}
+
 static int get_nlmsg(const struct sockaddr_nl *who,
 struct nlmsghdr *m, void *arg)
 {
@@ -147,18 +194,34 @@ static void load_info(void)
 {
struct ifstat_ent *db, *n;
struct rtnl_handle rth;
+   __u32 filter_mask;
 
if (rtnl_open(, 0) < 0)
exit(1);
 
-   if (rtnl_wilddump_request(, AF_INET, RTM_GETLINK) < 0) {
-   perror("Cannot send dump request");
-   exit(1);
-   }
+   if (is_extanded) {
+   ll_init_map();
+   filter_mask = IFLA_STATS_FILTER_BIT(filter_type);
+   if (rtnl_wilddump_stats_req_filter(, AF_UNSPEC, 
RTM_GETSTATS,
+  filter_mask) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
 
-   if (rtnl_dump_filter(, get_nlmsg, NULL) < 0) {
-   fprintf(stderr, "Dump terminated\n");
-   exit(1);
+   if (rtnl_dump_filter(, get_nlmsg_extanded, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
+   } else {
+   if (rtnl_wilddump_request(, AF_INET, RTM_GETLINK) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
+
+   if (rtnl_dump_filter(, get_nlmsg, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
}
 
rtnl_close();
@@ -553,10 +616,17 @@ static void update_db(int interval)
}
for (i = 0; i < MAXS; i++) {
double sample;
-   unsigned long incr = h1->ival[i] - 
n->ival[i];
+   __u64 incr;
+
+   if (is_extand

[PATCH iproute2 1/3] ifstat: Change interface to get stats

2016-11-24 Thread Nogah Frankel
ifstat used to get it data from the kernel with RTM_GETLINK.
Change the interface to get this data to RTM_GETSTATS that supports more
stats type beside the default one. It also change the default stats to be
64 bits based.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 41 ++---
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index d551973..25a8fc1 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -35,6 +35,7 @@
 
 #include 
 
+#include "utils.h"
 int dump_zeros;
 int reset_history;
 int ignore_history;
@@ -52,15 +53,15 @@ int npatterns;
 char info_source[128];
 int source_mismatch;
 
-#define MAXS (sizeof(struct rtnl_link_stats)/sizeof(__u32))
+#define MAXS (sizeof(struct rtnl_link_stats64)/sizeof(__u64))
 
 struct ifstat_ent {
struct ifstat_ent   *next;
char*name;
int ifindex;
-   unsigned long long  val[MAXS];
+   __u64   val[MAXS];
double  rate[MAXS];
-   __u32   ival[MAXS];
+   __u64   ival[MAXS];
 };
 
 static const char *stats[MAXS] = {
@@ -109,32 +110,30 @@ static int match(const char *id)
 static int get_nlmsg(const struct sockaddr_nl *who,
 struct nlmsghdr *m, void *arg)
 {
-   struct ifinfomsg *ifi = NLMSG_DATA(m);
-   struct rtattr *tb[IFLA_MAX+1];
+   struct if_stats_msg *ifsm = NLMSG_DATA(m);
+   struct rtattr *tb[IFLA_STATS_MAX+1];
int len = m->nlmsg_len;
struct ifstat_ent *n;
int i;
 
-   if (m->nlmsg_type != RTM_NEWLINK)
+   if (m->nlmsg_type != RTM_NEWSTATS)
return 0;
 
-   len -= NLMSG_LENGTH(sizeof(*ifi));
+   len -= NLMSG_LENGTH(sizeof(*ifsm));
if (len < 0)
return -1;
 
-   if (!(ifi->ifi_flags_UP))
-   return 0;
-
-   parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), len);
-   if (tb[IFLA_IFNAME] == NULL || tb[IFLA_STATS] == NULL)
+   parse_rtattr(tb, IFLA_STATS_MAX, IFLA_STATS_RTA(ifsm), len);
+   if (tb[IFLA_STATS_LINK_64] == NULL)
return 0;
 
n = malloc(sizeof(*n));
if (!n)
abort();
-   n->ifindex = ifi->ifi_index;
-   n->name = strdup(RTA_DATA(tb[IFLA_IFNAME]));
-   memcpy(>ival, RTA_DATA(tb[IFLA_STATS]), sizeof(n->ival));
+
+   n->ifindex = ifsm->ifindex;
+   n->name = strdup(ll_index_to_name(ifsm->ifindex));
+   memcpy(>ival, RTA_DATA(tb[IFLA_STATS_LINK_64]), sizeof(n->ival));
memset(>rate, 0, sizeof(n->rate));
for (i = 0; i < MAXS; i++)
n->val[i] = n->ival[i];
@@ -147,11 +146,15 @@ static void load_info(void)
 {
struct ifstat_ent *db, *n;
struct rtnl_handle rth;
+   __u32 filt_mask;
 
if (rtnl_open(, 0) < 0)
exit(1);
 
-   if (rtnl_wilddump_request(, AF_INET, RTM_GETLINK) < 0) {
+   ll_init_map();
+   filt_mask = IFLA_STATS_FILTER_BIT(IFLA_STATS_LINK_64);
+   if (rtnl_wilddump_stats_req_filter(, AF_UNSPEC, RTM_GETSTATS,
+  filt_mask) < 0) {
perror("Cannot send dump request");
exit(1);
}
@@ -216,7 +219,7 @@ static void load_raw_table(FILE *fp)
*next++ = 0;
if (sscanf(p, "%llu", n->val+i) != 1)
abort();
-   n->ival[i] = (__u32)n->val[i];
+   n->ival[i] = (__u64)n->val[i];
p = next;
if (!(next = strchr(p, ' ')))
abort();
@@ -546,14 +549,14 @@ static void update_db(int interval)
int i;
 
for (i = 0; i < MAXS; i++) {
-   if ((long)(h1->ival[i] - n->ival[i]) < 
0) {
+   if ((long long)(h1->ival[i] - 
n->ival[i]) < 0) {
memset(n->ival, 0, 
sizeof(n->ival));
break;
}
}
for (i = 0; i < MAXS; i++) {
double sample;
-   unsigned long incr = h1->ival[i] - 
n->ival[i];
+   unsigned long long incr = h1->ival[i] - 
n->ival[i];
 
n->val[i] += incr;
n->ival[i] = h1->ival[i];
-- 
2.4.3



[PATCH iproute2 2/3] ifstat: Add extended statistics to ifstat

2016-11-24 Thread Nogah Frankel
Add extended stats option for ifstat. It supports stats that are in the
nesting level as the "normal" stats or one lower, as long as they are in
the same struct type as the "normal" stats.
Every extension is unaware of data from other extension and is being
presented by itself.
The extension can be called by its name or any shorten of it. If there is
more then one matched, the first one will be picked.

To get the extended stats the flag -x  is used.

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 88 ++-
 1 file changed, 81 insertions(+), 7 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 25a8fc1..90aeeaa 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -49,11 +49,14 @@ int pretty;
 double W;
 char **patterns;
 int npatterns;
+int filter_type;
+int sub_type;
 
 char info_source[128];
 int source_mismatch;
 
 #define MAXS (sizeof(struct rtnl_link_stats64)/sizeof(__u64))
+#define NO_SUB_TYPE 0x
 
 struct ifstat_ent {
struct ifstat_ent   *next;
@@ -124,7 +127,7 @@ static int get_nlmsg(const struct sockaddr_nl *who,
return -1;
 
parse_rtattr(tb, IFLA_STATS_MAX, IFLA_STATS_RTA(ifsm), len);
-   if (tb[IFLA_STATS_LINK_64] == NULL)
+   if (tb[filter_type] == NULL)
return 0;
 
n = malloc(sizeof(*n));
@@ -133,7 +136,17 @@ static int get_nlmsg(const struct sockaddr_nl *who,
 
n->ifindex = ifsm->ifindex;
n->name = strdup(ll_index_to_name(ifsm->ifindex));
-   memcpy(>ival, RTA_DATA(tb[IFLA_STATS_LINK_64]), sizeof(n->ival));
+
+   if (sub_type == NO_SUB_TYPE) {
+   memcpy(>ival, RTA_DATA(tb[filter_type]), sizeof(n->ival));
+   } else {
+   struct rtattr *attr;
+
+   attr = parse_rtattr_one_nested(sub_type, tb[filter_type]);
+   if (attr == NULL)
+   return 0;
+   memcpy(>ival, RTA_DATA(attr), sizeof(n->ival));
+   }
memset(>rate, 0, sizeof(n->rate));
for (i = 0; i < MAXS; i++)
n->val[i] = n->ival[i];
@@ -152,7 +165,7 @@ static void load_info(void)
exit(1);
 
ll_init_map();
-   filt_mask = IFLA_STATS_FILTER_BIT(IFLA_STATS_LINK_64);
+   filt_mask = IFLA_STATS_FILTER_BIT(filter_type);
if (rtnl_wilddump_stats_req_filter(, AF_UNSPEC, RTM_GETSTATS,
   filt_mask) < 0) {
perror("Cannot send dump request");
@@ -659,6 +672,50 @@ static int verify_forging(int fd)
return -1;
 }
 
+static void xstat_usage(void)
+{
+   fprintf(stderr,
+"Usage: ifstat supported xstats:\n");
+
+}
+
+struct extended_stats_options_t {
+   char *name;
+   int id;
+   int sub_type;
+};
+
+/* Note: if one xstat name in subset of another, it should be before it in this
+ * list. Therefore the default "" option must always be first.
+ * Name length must be under 64 chars.
+ */
+static const struct extended_stats_options_t extended_stats_options[] = {
+   {"", IFLA_STATS_LINK_64, NO_SUB_TYPE},
+};
+
+static bool get_filter_type(char *name)
+{
+   int name_len;
+   int i;
+
+   name_len = strlen(name);
+   for (i = 0; i < ARRAY_SIZE(extended_stats_options); i++) {
+   const struct extended_stats_options_t *xstat;
+
+   xstat = _stats_options[i];
+   if (strncmp(name, xstat->name, name_len) == 0) {
+   filter_type = xstat->id;
+   sub_type = xstat->sub_type;
+   strcpy(name, xstat->name);
+   return true;
+   }
+   }
+
+   printf("invalid ifstat extension %s\n", name);
+   xstat_usage();
+   return false;
+}
+
 static void usage(void) __attribute__((noreturn));
 
 static void usage(void)
@@ -676,7 +733,8 @@ static void usage(void)
 "   -s, --noupdate don\'t update history\n"
 "   -t, --interval=SECSreport average over the last SECS\n"
 "   -V, --version  output version information\n"
-"   -z, --zerosshow entries with zero activity\n");
+"   -z, --zerosshow entries with zero activity\n"
+"   -x, --extended=TYPEshow extended stats of TYPE\n");
 
exit(-1);
 }
@@ -694,18 +752,22 @@ static const struct option longopts[] = {
{ "interval", 1, 0, 't' },
{ "version", 0, 0, 'V' },
{ "zeros", 0, 0, 'z' },
+   { "extended", 1, 0, 'x'},
{ 0 }
 };
 
+
 int main(int argc, char *argv[])
 {
char hist_name[128];
struct sockaddr_un sun;
FILE *hist_fp = NULL;
+   char stats_type[64];
  

[PATCH iproute2 0/3] update ifstat for new stats

2016-11-24 Thread Nogah Frankel
Previously stats were gotten by RTM_GETLINK which return 32 bits based
statistics. It support only one type of stats.
Lately, a new method to get stats was added - RTM_GETSTATS. It supports
ability to choose stats type. The basic stats were changed from 32 bits
based to 64 bits based.

This patchset change ifstat to the new method, add it the ability to
choose an extended type of statistic, and add the extended type of SW
stats for packets that hit cpu.

Nogah Frankel (3):
  ifstat: Change interface to get stats
  ifstat: Add extended statistics to ifstat
  ifstat: Add "sw only" extended statistics to ifstat

 misc/ifstat.c | 125 +++---
 1 file changed, 102 insertions(+), 23 deletions(-)

-- 
2.4.3



[PATCH iproute2 3/3] ifstat: Add "sw only" extended statistics to ifstat

2016-11-24 Thread Nogah Frankel
Add support for extended statistics of SW only type, for counting only the
packets that went via the cpu. (useful for systems with forward
offloading). It reads it from filter type IFLA_STATS_LINK_OFFLOAD_XSTATS
and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT.

It is under the name 'software'
(or any shorten of it as 'soft' or simply 's').

For example:
ifstat -x s

Signed-off-by: Nogah Frankel <nog...@mellanox.com>
Reviewed-by: Jiri Pirko <j...@mellanox.com>
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 90aeeaa..7825a3a 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -675,7 +675,8 @@ static int verify_forging(int fd)
 static void xstat_usage(void)
 {
fprintf(stderr,
-"Usage: ifstat supported xstats:\n");
+"Usage: ifstat supported xstats:\n"
+"  softwareSW stats. Counts only packets that went via the CPU\n");
 
 }
 
@@ -691,6 +692,7 @@ struct extended_stats_options_t {
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
{"", IFLA_STATS_LINK_64, NO_SUB_TYPE},
+   {"software",  IFLA_STATS_LINK_OFFLOAD_XSTATS, 
IFLA_OFFLOAD_XSTATS_CPU_HIT},
 };
 
 static bool get_filter_type(char *name)
-- 
2.4.3



RE: [patch net-next v8 2/3] net: core: Add offload stats to if_stats_msg

2016-09-12 Thread Nogah Frankel
> -Original Message-
> From: Nikolay Aleksandrov [mailto:niko...@cumulusnetworks.com]
> Sent: Friday, September 09, 2016 4:39 PM
> To: Nikolay Aleksandrov <niko...@cumulusnetworks.com>
> Cc: Jiri Pirko <j...@resnulli.us>; Linux Kernel Network Developers 
> <netdev@vger.kernel.org>; David S. Miller
> <da...@davemloft.net>; Nogah Frankel <nog...@mellanox.com>; Ido Schimmel 
> <ido...@mellanox.com>; Elad Raz
> <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or Gerlitz 
> <ogerl...@mellanox.com>; ro...@cumulusnetworks.com;
> linvi...@tuxdriver.com; tg...@suug.ch; go...@cumulusnetworks.com; 
> sfel...@gmail.com; s...@queasysnail.net; Eran Ben Elisha
> <era...@mellanox.com>; a...@plumgrid.com; eduma...@google.com; 
> han...@stressinduktion.org; f.faine...@gmail.com;
> d...@cumulusnetworks.com
> Subject: Re: [patch net-next v8 2/3] net: core: Add offload stats to 
> if_stats_msg
> 
> 
> > On Sep 9, 2016, at 4:36 PM, Nikolay Aleksandrov 
> > <niko...@cumulusnetworks.com> wrote:
> >
> >>
> >> On Sep 8, 2016, at 9:19 AM, Jiri Pirko <j...@resnulli.us> wrote:
> >>
> >> From: Nogah Frankel <nog...@mellanox.com>
> >>
> >> Add a nested attribute of offload stats to if_stats_msg
> >> named IFLA_STATS_LINK_OFFLOAD_XSTATS.
> >> Under it, add SW stats, meaning stats only per packets that went via
> >> slowpath to the cpu, named IFLA_OFFLOAD_XSTATS_CPU_HIT.
> >>
> >> Signed-off-by: Nogah Frankel <nog...@mellanox.com>
> >> Signed-off-by: Jiri Pirko <j...@mellanox.com>
> >> ---
> >> include/uapi/linux/if_link.h |  9 
> >> net/core/rtnetlink.c | 97 
> >> ++--
> >> 2 files changed, 102 insertions(+), 4 deletions(-)
> >>
> > [snip]
> >> @@ -3655,6 +3725,22 @@ static int rtnl_fill_statsinfo(struct sk_buff *skb, 
> >> struct net_device *dev,
> >>}
> >>}
> >>
> >> +  if (stats_attr_valid(filter_mask, IFLA_STATS_LINK_OFFLOAD_XSTATS,
> >> +   *idxattr)) {
> >> +  attr = nla_nest_start(skb, IFLA_STATS_LINK_OFFLOAD_XSTATS);
> >> +  if (!attr)
> >> +  goto nla_put_failure;
> >> +
> >> +  err = rtnl_get_offload_stats(skb, dev);
> >> +  if (err == -ENODATA)
> >> +  nla_nest_cancel(skb, attr);
> >> +  else
> >> +  nla_nest_end(skb, attr);
> >> +
> >> +  if ((err) && (err != -ENODATA))
> >> +  goto nla_put_failure;
> >> +  }
> >> +
> >
> > Hmm, actually on a second read I think there’s a potential problem here. 
> > Since you don’t set *idxattr and if the space
> > isn’t enough the dump will get restarted and this will lead to an infinite 
> > loop.
> 
> Err, poor choice of words. I meant the loop will be for userspace since the 
> dump will err out at the same spot all the time so it
> won’t be able to ever finish. :-)
> 
> > I.e. if the previous attributes filled the skb and there’s not enough room 
> > for this one, it will return an error
> > but *idxattr will be == 0 from the previous attribute thus the whole dump 
> > will be restarted (this is in case someone
> > requests this attribute with some of the others of course).
> >
> > Cheers,
> > Nik
> >
> >

I'll fix it. Thanks.

> >>nlmsg_end(skb, nlh);
> >>
> >>return 0;
> >> @@ -3712,6 +3798,9 @@ static size_t if_nlmsg_stats_size(const struct 
> >> net_device *dev,
> >>}
> >>}
> >>
> >> +  if (stats_attr_valid(filter_mask, IFLA_STATS_LINK_OFFLOAD_XSTATS, 0))
> >> +  size += rtnl_get_offload_stats_size(dev);
> >> +
> >>return size;
> >> }
> >>
> >> --
> >> 2.5.5



RE: [patch net-next v7 2/3] net: core: Add offload stats to if_stats_msg

2016-09-07 Thread Nogah Frankel
> From: Roopa Prabhu [mailto:ro...@cumulusnetworks.com]
> Sent: Tuesday, September 06, 2016 6:20 PM
> To: Jiri Pirko <j...@resnulli.us>
> Cc: netdev@vger.kernel.org; da...@davemloft.net; Nogah Frankel
> <nog...@mellanox.com>; Ido Schimmel <ido...@mellanox.com>; Elad Raz
> <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or Gerlitz
> <ogerl...@mellanox.com>; niko...@cumulusnetworks.com;
> linvi...@tuxdriver.com; tg...@suug.ch; go...@cumulusnetworks.com;
> sfel...@gmail.com; s...@queasysnail.net; Eran Ben Elisha
> <era...@mellanox.com>; a...@plumgrid.com; eduma...@google.com;
> han...@stressinduktion.org; f.faine...@gmail.com;
> d...@cumulusnetworks.com
> Subject: Re: [patch net-next v7 2/3] net: core: Add offload stats to 
> if_stats_msg
> 
> On 9/5/16, 10:18 AM, Jiri Pirko wrote:
> > From: Nogah Frankel <nog...@mellanox.com>
> >
> > Add a nested attribute of offload stats to if_stats_msg
> > named IFLA_STATS_LINK_OFFLOAD_XSTATS.
> > Under it, add SW stats, meaning stats only per packets that went via
> > slowpath to the cpu, named IFLA_OFFLOAD_XSTATS_CPU_HIT.
> >
> > Signed-off-by: Nogah Frankel <nog...@mellanox.com>
> > Signed-off-by: Jiri Pirko <j...@mellanox.com>
> > ---
> >  include/uapi/linux/if_link.h | 10 +
> >  net/core/rtnetlink.c | 88
> ++--
> >  2 files changed, 94 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> > index 9bf3aec..4aaa2a1 100644
> > --- a/include/uapi/linux/if_link.h
> > +++ b/include/uapi/linux/if_link.h
> > @@ -826,6 +826,7 @@ enum {
> > IFLA_STATS_LINK_64,
> > IFLA_STATS_LINK_XSTATS,
> > IFLA_STATS_LINK_XSTATS_SLAVE,
> > +   IFLA_STATS_LINK_OFFLOAD_XSTATS,
> > __IFLA_STATS_MAX,
> >  };
> >
> > @@ -845,6 +846,15 @@ enum {
> >  };
> >  #define LINK_XSTATS_TYPE_MAX (__LINK_XSTATS_TYPE_MAX - 1)
> >
> > +/* These are stats embedded into IFLA_STATS_LINK_OFFLOAD_XSTATS */
> > +enum {
> > +   IFLA_OFFLOAD_XSTATS_UNSPEC,
> > +   IFLA_OFFLOAD_XSTATS_CPU_HIT, /* struct rtnl_link_stats64 */
> > +   __IFLA_OFFLOAD_XSTATS_MAX
> > +};
> > +#define IFLA_OFFLOAD_XSTATS_MAX (__IFLA_OFFLOAD_XSTATS_MAX - 1)
> > +#define IFLA_OFFLOAD_XSTATS_FIRST (IFLA_OFFLOAD_XSTATS_UNSPEC +
> 1)
> 
> not sure why we need a first in the uapi.

I'll move it to rtnetlink.

> > +
> >  /* XDP section */
> >
> >  enum {
> > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> > index 1dfca1c..95eb131 100644
> > --- a/net/core/rtnetlink.c
> > +++ b/net/core/rtnetlink.c
> > @@ -3577,6 +3577,74 @@ static bool stats_attr_valid(unsigned int mask,
> int attrid, int idxattr)
> >(!idxattr || idxattr == attrid);
> >  }
> >
> > +static int rtnl_get_offload_stats_attr_size(int attr_id)
> > +{
> > +   switch (attr_id) {
> > +   case IFLA_OFFLOAD_XSTATS_CPU_HIT:
> > +   return sizeof(struct rtnl_link_stats64);
> > +   }
> > +
> > +   return 0;
> > +}
> > +
> > +static int rtnl_get_offload_stats(struct sk_buff *skb, struct net_device
> *dev)
> > +{
> > +   struct nlattr *attr;
> > +   int attr_id, size;
> > +   void *attr_data;
> > +   int err;
> > +
> > +   if (!(dev->netdev_ops && dev->netdev_ops->ndo_has_offload_stats
> &&
> > + dev->netdev_ops->ndo_get_offload_stats))
> > +   return 0;
> > +
> > +   for (attr_id = IFLA_OFFLOAD_XSTATS_FIRST;
> > +attr_id <= IFLA_OFFLOAD_XSTATS_MAX; attr_id++) {
> > +   size = rtnl_get_offload_stats_attr_size(attr_id);
> > +   if (!size)
> > +   continue;
> > +
> > +   if (!dev->netdev_ops->ndo_has_offload_stats(attr_id))
> > +   continue;
> > +
> > +   attr = nla_reserve_64bit(skb, attr_id, size,
> > +IFLA_OFFLOAD_XSTATS_UNSPEC);
> > +   if (!attr)
> > +   return -EMSGSIZE;
> 
> you need this 64bit alignment padding only for the specific nested attribute
> your driver is adding ?. The code seems to add it unconditionally for every
> attribute.
> But, I am guessing future attribute authors will need to adjust this somehow.
> 

I think it's best to wait till we have more attributes to see how to implement 
it.

> > +
> > +   attr_data = nla_data(attr);
> > +  

RE: [patch net-next v6 2/3] net: core: add SW stats to if_stats_msg

2016-08-16 Thread Nogah Frankel
> -Original Message-
> From: Roopa Prabhu [mailto:ro...@cumulusnetworks.com]
> Sent: Wednesday, August 10, 2016 9:09 AM
> To: Jiri Pirko <j...@resnulli.us>
> Cc: netdev@vger.kernel.org; da...@davemloft.net; Nogah Frankel
> <nog...@mellanox.com>; Ido Schimmel <ido...@mellanox.com>; Elad
> Raz <el...@mellanox.com>; Yotam Gigi <yot...@mellanox.com>; Or
> Gerlitz <ogerl...@mellanox.com>; niko...@cumulusnetworks.com;
> linvi...@tuxdriver.com; tg...@suug.ch; go...@cumulusnetworks.com;
> sfel...@gmail.com; s...@queasysnail.net; Eran Ben Elisha
> <era...@mellanox.com>; a...@plumgrid.com; eduma...@google.com;
> han...@stressinduktion.org; f.faine...@gmail.com;
> d...@cumulusnetworks.com
> Subject: Re: [patch net-next v6 2/3] net: core: add SW stats to if_stats_msg
> 
> On 8/9/16, 2:25 PM, Jiri Pirko wrote:
> > From: Nogah Frankel <nog...@mellanox.com>
> >
> > Add a nested attribute of SW stats to if_stats_msg
> > under IFLA_STATS_LINK_SW_64.
> >
> > Signed-off-by: Nogah Frankel <nog...@mellanox.com>
> > Reviewed-by: Ido Schimmel <ido...@mellanox.com>
> > Signed-off-by: Jiri Pirko <j...@mellanox.com>
> > ---
> >  include/uapi/linux/if_link.h |  1 +
> >  net/core/rtnetlink.c | 21 +
> >  2 files changed, 22 insertions(+)
> >
> > diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> > index a1b5202..1c9b808 100644
> > --- a/include/uapi/linux/if_link.h
> > +++ b/include/uapi/linux/if_link.h
> > @@ -825,6 +825,7 @@ enum {
> > IFLA_STATS_LINK_64,
> > IFLA_STATS_LINK_XSTATS,
> > IFLA_STATS_LINK_XSTATS_SLAVE,
> > +   IFLA_STATS_LINK_SW_64,
> 
> sorry, don't mean to drag this discussion forever...but...
> 
> IIRC on the switchdev call, I thought we had agreed to just put these kind of
> breakdown
> stats in one more layer of nesting.. which can be extended for everybody.
> 
> IFLA_STATS_LINK_DRIVER_XSTATS   (or some other name) [
> IFLA_STATS_LINK_SW_64  (struct rtnl_link_stats64)< 
> you get
> the well defined struct here as you wanted.
> IFLA_STATS_LINK_HW_ACL_DROPS  /* u64 */<- we can
> keep extending this for other stats people want.
> 
> /* just keeps it extensible */
> ]
> 
> because in addition to all other reasons I have mentioned before,
> technically IFLA_STATS_LINK_64 (top-level aggregate stats)
> are also software only stats for say most logical devices etc.
> 
> 
> and instead of adding one ndo for your software only stats now and then
> again for other ethtool
> like hw stats from all drivers, it could also be like the below:
> 
> new ndo_get_hw_stats
> 
> IFLA_STATS_LINK_HW_XSTATS   (or some other name) [
> IFLA_STATS_LINK_CPU_64  (struct rtnl_link_stats64)< 
> you
> get the well defined struct here as you wanted.
> IFLA_STATS_LINK_HW_ACL_DROPS  /* u64 */  <- 
> we can
> keep extending this for other stats people want.
> /* just keeps it extensible */
> ]
> 
> It gets you what you want and keeps the api extensible IMO. your call.
> 

I don't think extra nesting is a good idea because when we are called for 
IFLA_STATS_LINK_DRIVER_XSTATS  we can't distinguish between subtypes of it.
So for each driver stats we are going to need different XSTATS types anyway.

We shouldn't return all the stats a driver can provide when we are asked for 
one.
HW_ACL_DROPS shouldn't be queried when user want SW stats nor the  other
way around.
 (And I think that stats exposed here should be well defined so they could 
be exposed to ifstat. The place for not well define stats is in ethtool)

> > __IFLA_STATS_MAX,
> >  };
> >
> > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> > index 189cc78..910f802 100644
> > --- a/net/core/rtnetlink.c
> > +++ b/net/core/rtnetlink.c
> > @@ -3583,6 +3583,21 @@ static int rtnl_fill_statsinfo(struct sk_buff
> *skb, struct net_device *dev,
> > dev_get_stats(dev, sp);
> > }
> >
> > +   if (stats_attr_valid(filter_mask, IFLA_STATS_LINK_SW_64,
> *idxattr)) {
> > +   if (dev_have_sw_stats(dev)) {
> > +   struct rtnl_link_stats64 *sp;
> > +
> > +   attr = nla_reserve_64bit(skb,
> IFLA_STATS_LINK_SW_64,
> > +sizeof(struct
> rtnl_link_stats64),
> > +IFLA_STATS_UNSPEC);
> > +   if (!attr)
> > +   go