RE: [EXT] Re: [PATCH net-next v2] net: mvpp2: add ethtool GOP statistics

2017-11-07 Thread Stefan Chulski


> -Original Message-
> From: Miquel RAYNAL [mailto:miquel.ray...@free-electrons.com]
> Sent: Tuesday, November 07, 2017 12:45 AM
> To: Stefan Chulski <stef...@marvell.com>
> Cc: Thomas Petazzoni <thomas.petazz...@free-electrons.com>; Antoine Tenart
> <antoine.ten...@free-electrons.com>; David S. Miller
> <da...@davemloft.net>; netdev@vger.kernel.org
> Subject: [EXT] Re: [PATCH net-next v2] net: mvpp2: add ethtool GOP statistics
> 
> External Email
> 
> --
> Hi Stefan,
> 
> +David Miller/Net ML
> 
> > > @@ -6844,6 +7023,10 @@ static int mvpp2_open(struct net_device
> > > *dev)
> > >
> > >   mvpp2_start_dev(port);
> > >
> > > + /* Start hardware statistics gathering */
> > > + queue_delayed_work(priv->stats_queue, >stats_work,
> > > +MVPP2_MIB_COUNTERS_STATS_DELAY);
> > > +
> > >   return 0;
> > >
> > >  err_free_link_irq:
> > > @@ -6888,6 +7071,9 @@ static int mvpp2_stop(struct net_device *dev)
> > >   mvpp2_cleanup_rxqs(port);
> > >   mvpp2_cleanup_txqs(port);
> > >
> > > + cancel_delayed_work_sync(>stats_work);
> > > + flush_workqueue(priv->stats_queue);
> > > +
> >
> > Hi Miquel,
> >
> > I think there are bug here.
> > priv is common for all ports on same CPN and they have same
> > priv->stats_work.
> >
> > For example on A7K board with 3 Ports. queue_delayed_work and
> > cancel_delayed_work_sync called for each port stop and start
> > procedure. For example:
> > If Port0 and Port1 were started, then if only Port0 stopped, delayed
> > work would be canceled for both ports.
> 
> 
> Thanks for spotting it, you are right this is a bug since I moved 
> starting/stopping
> the queues in the opening and close procedure of the ports (to avoid using CPU
> time while no interface is actually up).
> 
> Maybe I should have a work per port, it would be easier to handle.
> 
> Thank you,
> Miquèl

I agree with you.
If delayed_work enable/disabled upon port start/stop procedure, it should be
per port and gather MIB statistics for single port.

Stefan.





Re: [PATCH net-next v2] net: mvpp2: add ethtool GOP statistics

2017-11-06 Thread Miquel RAYNAL
Hi Stefan,

+David Miller/Net ML

> > @@ -6844,6 +7023,10 @@ static int mvpp2_open(struct net_device
> > *dev)
> > 
> > mvpp2_start_dev(port);
> > 
> > +   /* Start hardware statistics gathering */
> > +   queue_delayed_work(priv->stats_queue, >stats_work,
> > +  MVPP2_MIB_COUNTERS_STATS_DELAY);
> > +
> > return 0;
> > 
> >  err_free_link_irq:
> > @@ -6888,6 +7071,9 @@ static int mvpp2_stop(struct net_device *dev)
> > mvpp2_cleanup_rxqs(port);
> > mvpp2_cleanup_txqs(port);
> > 
> > +   cancel_delayed_work_sync(>stats_work);
> > +   flush_workqueue(priv->stats_queue);
> > +  
> 
> Hi Miquel,
> 
> I think there are bug here.
> priv is common for all ports on same CPN and they have same
> priv->stats_work.
> 
> For example on A7K board with 3 Ports. queue_delayed_work and
> cancel_delayed_work_sync called for each port stop and start
> procedure. For example:
> If Port0 and Port1 were started, then if only Port0 stopped, delayed
> work would be canceled for both ports.


Thanks for spotting it, you are right this is a bug since I moved
starting/stopping the queues in the opening and close procedure of the
ports (to avoid using CPU time while no interface is actually up).

Maybe I should have a work per port, it would be easier to
handle.

Thank you,
Miquèl


Re: [PATCH net-next v2] net: mvpp2: add ethtool GOP statistics

2017-11-06 Thread Miquel RAYNAL
Hi Andrew,

> > Here I do have a problem: I choose the IDA solution because it was
> > quite straightforward but I agree it would be better to use an
> > unique name. Unfortunately, on the Armada-8040-DB that instantiate
> > this driver twice, the node name is not unique. There are two CP
> > (master and slave) and both names are "ethernet@0". Otherwise, if I
> > use the full path, I get something like
> > "/cp110-master/config-space@f200/ethernet@0" and
> > "/cp110-slave/config-space@f400/ethernet@0" but the problem is
> > that workqueue names are truncated to 24 characters and only 15
> > appears in ps, so it would not solve the issue and choosing the
> > "parent parent node name" would work here but does not scale very
> > well. Do you have any idea to get this right?  
> 
> Hi Miquel
> 
> You could move the starting of the thread into mvpp2_port_probe(). If
> you do it after register_netdev(dev), you can use netdev_name(dev).
> 
> Andrew

There is one workqueue per instance of the driver, not per port.
Hence, I choose to use the netdev_name() (short) with a '+' after it
if there are other ports involved, ie. "stats-wq-eth0+" and
"stats-wq-eth2+".


Thanks for your help,
Miquèl


Re: [PATCH net-next v2] net: mvpp2: add ethtool GOP statistics

2017-11-06 Thread Andrew Lunn
> Here I do have a problem: I choose the IDA solution because it was
> quite straightforward but I agree it would be better to use an unique
> name. Unfortunately, on the Armada-8040-DB that instantiate this driver
> twice, the node name is not unique. There are two CP (master and
> slave) and both names are "ethernet@0". Otherwise, if I use the full
> path, I get something like
> "/cp110-master/config-space@f200/ethernet@0" and
> "/cp110-slave/config-space@f400/ethernet@0" but the problem is that
> workqueue names are truncated to 24 characters and only 15 appears in
> ps, so it would not solve the issue and choosing the "parent
> parent node name" would work here but does not scale very well. Do you
> have any idea to get this right?

Hi Miquel

You could move the starting of the thread into mvpp2_port_probe(). If
you do it after register_netdev(dev), you can use netdev_name(dev).

Andrew


Re: [PATCH net-next v2] net: mvpp2: add ethtool GOP statistics

2017-11-06 Thread Miquel RAYNAL
Hi Andrew,

On Fri, 3 Nov 2017 16:19:06 +0100
Andrew Lunn  wrote:

> > @@ -817,6 +856,12 @@ struct mvpp2 {
> >  
> > /* Maximum number of RXQs per port */
> > unsigned int max_port_rxqs;
> > +
> > +   /* Workqueue to gather hardware statistics with its lock */
> > +   struct mutex gather_stats_lock;
> > +   struct delayed_work stats_work;
> > +   char queue_name[20];
> > +   struct workqueue_struct *stats_queue;
> >  };  
>   
> > +static u64 mvpp2_read_count(struct mvpp2_port *port,
> > +   const struct mvpp2_ethtool_counter
> > *counter) +{
> > +   void __iomem *base;
> > +   u64 val;
> > +
> > +   if (port->priv->hw_version == MVPP21)
> > +   base = port->priv->lms_base +
> > MVPP21_MIB_COUNTERS_OFFSET +
> > +  port->gop_id * MVPP21_MIB_COUNTERS_PORT_SZ;
> > +   else
> > +   base = port->priv->iface_base +
> > MVPP22_MIB_COUNTERS_OFFSET +
> > +  port->gop_id *
> > MVPP22_MIB_COUNTERS_PORT_SZ;  
> 
> This seems like something which could be calculated once and then
> stored away, e.g. next to stats_queue.

Sure, it is even more important now that there is a workqueue calling
quite often this function. I actually had to move the new field into
the "mvpp2_port" structure otherwise the offset still should have been
calculated each time.

> 
> > +static void mvpp2_ethtool_get_stats(struct net_device *dev,
> > +   struct ethtool_stats *stats,
> > u64 *data) +{
> > +   struct mvpp2_port *port = netdev_priv(dev);
> > +
> > +   /* Update statistics for all ports, copy only those
> > actually needed */
> > +   mvpp2_gather_hw_statistics(>priv->stats_work.work);
> > +
> > +   memcpy(data, port->ethtool_stats,
> > +  sizeof(u64) * ARRAY_SIZE(mvpp2_ethtool_regs));  
> 
> Thanks for adding the mutex in mvpp2_gather_hw_statistics(). However,
> should we be holding the mutex while performing this copy? There is no
> snapshot support, so the statistics are not guaranteed to be
> consistent. However, since a statistics is a u64, it is possible the
> copy will get the new lower 32 bits and the old 32 bits if the copy
> happens while mvpp2_gather_hw_statistics() is running.

I thought the concurrency problem was only when reading from the
hardware registers so I did not thought about the situation you
describe here, I updated (see the coming v3) by surrounding the memcpy
call by acquiring/releasing the lock.

> 
> > +   port->ethtool_stats = devm_kcalloc(>dev,
> > +
> > ARRAY_SIZE(mvpp2_ethtool_regs),
> > +  sizeof(u64),
> > GFP_KERNEL);
> > +   if (!port->ethtool_stats) {
> > +   err = -ENOMEM;
> > +   goto err_free_stats;
> > +   }
> > +
> > mvpp2_port_copy_mac_addr(dev, priv, port_node, _from);
> >  
> > port->tx_ring_size = MVPP2_MAX_TXD;
> > @@ -7707,6 +7904,7 @@ static void mvpp2_port_remove(struct
> > mvpp2_port *port) of_node_put(port->phy_node);
> > free_percpu(port->pcpu);
> > free_percpu(port->stats);
> > +   kfree(port->ethtool_stats);  
> 
> You allocate the memory using devm_. You should not use plain kfree()
> on it. You might want to spend some time reading about devm_

Rha, right, was too hurry to resend, I obviously forgot to update the
*_remove() function, sorry about that. 

> 
> > +   mutex_init(>gather_stats_lock);
> > +   index = ida_simple_get(_index_ida, 0, 0,
> > GFP_KERNEL);
> > +   if (index < 0)
> > +   goto err_mg_clk;
> > +
> > +   snprintf(priv->queue_name, sizeof(priv->queue_name),
> > +"mvpp2_stats_%d", index);  
> 
> I know Florian asked for unique names, which IDA will give you. But
> could you derive the name from device tree? It then becomes a name you
> can actually map back to the hardware, rather than being semi-random.

Here I do have a problem: I choose the IDA solution because it was
quite straightforward but I agree it would be better to use an unique
name. Unfortunately, on the Armada-8040-DB that instantiate this driver
twice, the node name is not unique. There are two CP (master and
slave) and both names are "ethernet@0". Otherwise, if I use the full
path, I get something like
"/cp110-master/config-space@f200/ethernet@0" and
"/cp110-slave/config-space@f400/ethernet@0" but the problem is that
workqueue names are truncated to 24 characters and only 15 appears in
ps, so it would not solve the issue and choosing the "parent
parent node name" would work here but does not scale very well. Do you
have any idea to get this right?

Thanks for the review,
Miquèl


Re: [PATCH net-next v2] net: mvpp2: add ethtool GOP statistics

2017-11-03 Thread Andrew Lunn
> @@ -817,6 +856,12 @@ struct mvpp2 {
>  
>   /* Maximum number of RXQs per port */
>   unsigned int max_port_rxqs;
> +
> + /* Workqueue to gather hardware statistics with its lock */
> + struct mutex gather_stats_lock;
> + struct delayed_work stats_work;
> + char queue_name[20];
> + struct workqueue_struct *stats_queue;
>  };
  
> +static u64 mvpp2_read_count(struct mvpp2_port *port,
> + const struct mvpp2_ethtool_counter *counter)
> +{
> + void __iomem *base;
> + u64 val;
> +
> + if (port->priv->hw_version == MVPP21)
> + base = port->priv->lms_base + MVPP21_MIB_COUNTERS_OFFSET +
> +port->gop_id * MVPP21_MIB_COUNTERS_PORT_SZ;
> + else
> + base = port->priv->iface_base + MVPP22_MIB_COUNTERS_OFFSET +
> +port->gop_id * MVPP22_MIB_COUNTERS_PORT_SZ;

This seems like something which could be calculated once and then
stored away, e.g. next to stats_queue.

> +static void mvpp2_ethtool_get_stats(struct net_device *dev,
> + struct ethtool_stats *stats, u64 *data)
> +{
> + struct mvpp2_port *port = netdev_priv(dev);
> +
> + /* Update statistics for all ports, copy only those actually needed */
> + mvpp2_gather_hw_statistics(>priv->stats_work.work);
> +
> + memcpy(data, port->ethtool_stats,
> +sizeof(u64) * ARRAY_SIZE(mvpp2_ethtool_regs));

Thanks for adding the mutex in mvpp2_gather_hw_statistics(). However,
should we be holding the mutex while performing this copy? There is no
snapshot support, so the statistics are not guaranteed to be
consistent. However, since a statistics is a u64, it is possible the
copy will get the new lower 32 bits and the old 32 bits if the copy
happens while mvpp2_gather_hw_statistics() is running.

> + port->ethtool_stats = devm_kcalloc(>dev,
> +ARRAY_SIZE(mvpp2_ethtool_regs),
> +sizeof(u64), GFP_KERNEL);
> + if (!port->ethtool_stats) {
> + err = -ENOMEM;
> + goto err_free_stats;
> + }
> +
>   mvpp2_port_copy_mac_addr(dev, priv, port_node, _from);
>  
>   port->tx_ring_size = MVPP2_MAX_TXD;
> @@ -7707,6 +7904,7 @@ static void mvpp2_port_remove(struct mvpp2_port *port)
>   of_node_put(port->phy_node);
>   free_percpu(port->pcpu);
>   free_percpu(port->stats);
> + kfree(port->ethtool_stats);

You allocate the memory using devm_. You should not use plain kfree()
on it. You might want to spend some time reading about devm_

> + mutex_init(>gather_stats_lock);
> + index = ida_simple_get(_index_ida, 0, 0, GFP_KERNEL);
> + if (index < 0)
> + goto err_mg_clk;
> +
> + snprintf(priv->queue_name, sizeof(priv->queue_name),
> +  "mvpp2_stats_%d", index);

I know Florian asked for unique names, which IDA will give you. But
could you derive the name from device tree? It then becomes a name you
can actually map back to the hardware, rather than being semi-random.

Thanks
Andrew


Re: [PATCH net-next v2] net: mvpp2: add ethtool GOP statistics

2017-11-03 Thread Miquel RAYNAL
On Fri,  3 Nov 2017 12:04:25 +0100
Miquel Raynal  wrote:

> Add ethtool statistics support by reading the GOP statistics from the
> hardware counters. Also implement a workqueue to gather the statistics
> every second or some 32-bit counters could overflow.
> 
> Suggested-by: Stefan Chulski 
> Signed-off-by: Miquel Raynal 
> ---

Changes since v1:
- Constified the ethtool array (with strings and offset) and added a
  third parameter to the repeated structure: if the register is 64-bit
  wide.
- Added locking inside the mvpp2_gather_stats() to avoid starting
  reading the registers while a read is already pending.
- Used devm_* allocators.
- Switched to cancel_delayed_work_sync().
- Fixed the size allocated to the statistics array.
- Added an index to the workqueue name, derived with IDA because the
  name shown in 'ps' is only 15 chars wide (so must be short) and there
  was no existing way to retrieve a unique number per engine instance.
- Start gathering the statistics only in ndo_start(), stop in
  ndo_stop().

Best regards,
Miquèl


[PATCH net-next v2] net: mvpp2: add ethtool GOP statistics

2017-11-03 Thread Miquel Raynal
Add ethtool statistics support by reading the GOP statistics from the
hardware counters. Also implement a workqueue to gather the statistics
every second or some 32-bit counters could overflow.

Suggested-by: Stefan Chulski 
Signed-off-by: Miquel Raynal 
---
 drivers/net/ethernet/marvell/mvpp2.c | 237 ++-
 1 file changed, 231 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c 
b/drivers/net/ethernet/marvell/mvpp2.c
index 97efe4733661..ca46eca27981 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -38,6 +38,8 @@
 #include 
 #include 
 
+static DEFINE_IDA(engine_index_ida);
+
 /* RX Fifo Registers */
 #define MVPP2_RX_DATA_FIFO_SIZE_REG(port)  (0x00 + 4 * (port))
 #define MVPP2_RX_ATTR_FIFO_SIZE_REG(port)  (0x20 + 4 * (port))
@@ -769,6 +771,42 @@ enum mvpp2_bm_type {
MVPP2_BM_SWF_SHORT
 };
 
+/* GMAC MIB Counters register definitions */
+#define MVPP21_MIB_COUNTERS_OFFSET 0x1000
+#define MVPP21_MIB_COUNTERS_PORT_SZ0x400
+#define MVPP22_MIB_COUNTERS_OFFSET 0x0
+#define MVPP22_MIB_COUNTERS_PORT_SZ0x100
+
+#define MVPP2_MIB_GOOD_OCTETS_RCVD 0x0
+#define MVPP2_MIB_BAD_OCTETS_RCVD  0x8
+#define MVPP2_MIB_CRC_ERRORS_SENT  0xc
+#define MVPP2_MIB_UNICAST_FRAMES_RCVD  0x10
+#define MVPP2_MIB_BROADCAST_FRAMES_RCVD0x18
+#define MVPP2_MIB_MULTICAST_FRAMES_RCVD0x1c
+#define MVPP2_MIB_FRAMES_64_OCTETS 0x20
+#define MVPP2_MIB_FRAMES_65_TO_127_OCTETS  0x24
+#define MVPP2_MIB_FRAMES_128_TO_255_OCTETS 0x28
+#define MVPP2_MIB_FRAMES_256_TO_511_OCTETS 0x2c
+#define MVPP2_MIB_FRAMES_512_TO_1023_OCTETS0x30
+#define MVPP2_MIB_FRAMES_1024_TO_MAX_OCTETS0x34
+#define MVPP2_MIB_GOOD_OCTETS_SENT 0x38
+#define MVPP2_MIB_UNICAST_FRAMES_SENT  0x40
+#define MVPP2_MIB_MULTICAST_FRAMES_SENT0x48
+#define MVPP2_MIB_BROADCAST_FRAMES_SENT0x4c
+#define MVPP2_MIB_FC_SENT  0x54
+#define MVPP2_MIB_FC_RCVD  0x58
+#define MVPP2_MIB_RX_FIFO_OVERRUN  0x5c
+#define MVPP2_MIB_UNDERSIZE_RCVD   0x60
+#define MVPP2_MIB_FRAGMENTS_RCVD   0x64
+#define MVPP2_MIB_OVERSIZE_RCVD0x68
+#define MVPP2_MIB_JABBER_RCVD  0x6c
+#define MVPP2_MIB_MAC_RCV_ERROR0x70
+#define MVPP2_MIB_BAD_CRC_EVENT0x74
+#define MVPP2_MIB_COLLISION0x78
+#define MVPP2_MIB_LATE_COLLISION   0x7c
+
+#define MVPP2_MIB_COUNTERS_STATS_DELAY (1 * HZ)
+
 /* Definitions */
 
 /* Shared Packet Processor resources */
@@ -796,6 +834,7 @@ struct mvpp2 {
struct clk *axi_clk;
 
/* List of pointers to port structures */
+   int port_count;
struct mvpp2_port **port_list;
 
/* Aggregated TXQs */
@@ -817,6 +856,12 @@ struct mvpp2 {
 
/* Maximum number of RXQs per port */
unsigned int max_port_rxqs;
+
+   /* Workqueue to gather hardware statistics with its lock */
+   struct mutex gather_stats_lock;
+   struct delayed_work stats_work;
+   char queue_name[20];
+   struct workqueue_struct *stats_queue;
 };
 
 struct mvpp2_pcpu_stats {
@@ -879,6 +924,7 @@ struct mvpp2_port {
u16 tx_ring_size;
u16 rx_ring_size;
struct mvpp2_pcpu_stats __percpu *stats;
+   u64 *ethtool_stats;
 
phy_interface_t phy_interface;
struct device_node *phy_node;
@@ -4743,9 +4789,142 @@ static void mvpp2_port_loopback_set(struct mvpp2_port 
*port)
writel(val, port->base + MVPP2_GMAC_CTRL_1_REG);
 }
 
+struct mvpp2_ethtool_counter {
+   unsigned int offset;
+   const char string[ETH_GSTRING_LEN];
+   bool reg_is_64b;
+};
+
+static u64 mvpp2_read_count(struct mvpp2_port *port,
+   const struct mvpp2_ethtool_counter *counter)
+{
+   void __iomem *base;
+   u64 val;
+
+   if (port->priv->hw_version == MVPP21)
+   base = port->priv->lms_base + MVPP21_MIB_COUNTERS_OFFSET +
+  port->gop_id * MVPP21_MIB_COUNTERS_PORT_SZ;
+   else
+   base = port->priv->iface_base + MVPP22_MIB_COUNTERS_OFFSET +
+  port->gop_id * MVPP22_MIB_COUNTERS_PORT_SZ;
+
+   val = readl(base + counter->offset);
+   if (counter->reg_is_64b)
+   val += (u64)readl(base + counter->offset + 4) << 32;
+
+   return val;
+}
+
+/* Due to the fact that software statistics and hardware statistics are, by
+ * design, incremented at different moments in the chain of packet processing,
+ * it is very likely that incoming packets could have been dropped after being
+ * counted by hardware but before reaching software statistics (most probably
+ * multicast