Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-28 Thread Ben Hutchings
On Fri, 2015-04-24 at 21:53 +0300, Sergei Shtylyov wrote:
 On 04/23/2015 02:22 AM, Florian Fainelli wrote:
 
 [...]
 
  +if (ecmd-duplex == DUPLEX_FULL)
  +priv-duplex = 1;
  +else
  +priv-duplex = 0;
 
  Why not use what priv-phydev-duplex has cached for you?
 
  Because we compare 'priv-duplex' with 'priv-phydev-duplex' in
  ravb_adjust_link(). Or what did you mean?
 
  Oh I see how you are using this now, but it does not look like it is
  necessary, since you use phy_ethtool_sset() using phydev-duplex
 
It only writes to it, doesn't use it AFAICS...
 
  directly ought to be enough anywhere in your driver?
 
 'priv-phydev' is NULL when the device is closed, so I just can't call 
 phy_ethtool_sset().
 
  Unless there is a
  mode where you are running PHY-less, and not using a fixed PHY to
  emulate a PHY...
 
 No such mode.
 
  [...]
 
  +static int ravb_nway_reset(struct net_device *ndev)
  +{
  +struct ravb_private *priv = netdev_priv(ndev);
  +int error = -ENODEV;
  +unsigned long flags;
  +
  +if (priv-phydev) {
 
  Is checking against priv-phydev really necessary, it does not look like
  the driver will work or accept an invalid PHY device at all anyway?
 
 This check was copied from sh_eth that was fixed by Ben ot to crash due to
 'ethtool' being called on closed device, see:
 
 https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/renesas/sh_eth.c?id=4f9dce230b32eec45cec8c28cae61efdfa2f7d57
 
 That commit refers to a dangling pointer, not sure what does this mean...
 The PHy device doesn't seem to be freed by phy_disconnect(). Ben?
[...]

In practice the phy_device is unlikely to be freed immediately.  Bt it
is certainly not valid for a net driver to pass a phy_device pointer to
phylib functions after calling phy_disconnect() on it.

Ben.


--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3] Renesas Ethernet AVB driver

2015-04-27 Thread David Laight
From: Sergei Shtylyov 
 Sent: 24 April 2015 19:27
...
  If you have ethernet hardware that requires tx or rx buffers to be on
  4n boundaries you should send it back as 'not fit for purpose'.
 
 The RX buffers can be adjusted with skb_resrerve(), it's only the TX
 buffers that need to be copied...

If the processor can't perform misaligned reads (don't know what is on
your SoC, but I suspect it can't - crossing page boundaries is hard)
then the rx buffer will have to be re-aligned in software.
Even the 'userdata' part will typically end up with an expensive
misaligned buffer copy.

Even on x86 the misaligned transfers are probably measurable.

 I'm afraid we can't. :-)
 However, my colleague has suggested a scheme minimizing the copying:
 only up to 3 first bytes need to be copied to the driver's internal buffers,
 the rest can be sent from an skb itself. That would require substantial
 changes to the driver though...

There might be a restriction on the length of buffer fragments.

You might be able to alternate 14 and 1500+ byte receive buffers.
The frame following a slightly overlong one would be 'wrong'.

David
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3] Renesas Ethernet AVB driver

2015-04-24 Thread David Laight
From: Sergei Shtylyov
 Sent: 22 April 2015 22:39
 On 04/22/2015 11:42 PM, David Miller wrote:
 
  Hmm, I've been digging in the net core, and was unable to see where TX
  skb's get their NET_IP_ALIGN bytes reserved. Have I missed something?
  Probably need to print out skb's fields...
 
  NET_IP_ALIGN is for receive, not transmit.
 
 But when I print 'skb-data' from the ndo_start_xmit() method (in the
 'sh_eth' driver), all addresses end with 2, so it looks like NET_IP_ALIGN gets
 added somewhere...

For a locally generated message:
The TCP userdata is likely to be 4 byte aligned.
The TCP and IP headers are multiples of 4 bytes.
The MAC header is 14 bytes.
So you end up with a buffer that starts on a 4n+2 boundary or an initial
short fragment that is 4n+2 bytes long.

If a message is being forwarded the alignment probably depends on where
it came from.

If you have ethernet hardware that requires tx or rx buffers to be on
4n boundaries you should send it back as 'not fit for purpose'.

David

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-24 Thread Sergei Shtylyov

On 04/24/2015 12:03 PM, David Laight wrote:


Sent: 22 April 2015 22:39
On 04/22/2015 11:42 PM, David Miller wrote:



 Hmm, I've been digging in the net core, and was unable to see where TX
 skb's get their NET_IP_ALIGN bytes reserved. Have I missed something?
 Probably need to print out skb's fields...



NET_IP_ALIGN is for receive, not transmit.



 But when I print 'skb-data' from the ndo_start_xmit() method (in the
'sh_eth' driver), all addresses end with 2, so it looks like NET_IP_ALIGN gets
added somewhere...



For a locally generated message:
The TCP userdata is likely to be 4 byte aligned.
The TCP and IP headers are multiples of 4 bytes.
The MAC header is 14 bytes.
So you end up with a buffer that starts on a 4n+2 boundary or an initial
short fragment that is 4n+2 bytes long.



If a message is being forwarded the alignment probably depends on where
it came from.


   Thanks for the detailed reply, though it came a bit late. :-)


If you have ethernet hardware that requires tx or rx buffers to be on


   The RX buffers can be adjusted with skb_resrerve(), it's only the TX 
buffers that need to be copied...



4n boundaries you should send it back as 'not fit for purpose'.


   I'm afraid we can't. :-)
   However, my colleague has suggested a scheme minimizing the copying:
only up to 3 first bytes need to be copied to the driver's internal buffers, 
the rest can be sent from an skb itself. That would require substantial 
changes to the driver though...



David


WBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-24 Thread Sergei Shtylyov

On 04/23/2015 02:22 AM, Florian Fainelli wrote:

[...]


+if (ecmd-duplex == DUPLEX_FULL)
+priv-duplex = 1;
+else
+priv-duplex = 0;



Why not use what priv-phydev-duplex has cached for you?



Because we compare 'priv-duplex' with 'priv-phydev-duplex' in
ravb_adjust_link(). Or what did you mean?



Oh I see how you are using this now, but it does not look like it is
necessary, since you use phy_ethtool_sset() using phydev-duplex


  It only writes to it, doesn't use it AFAICS...


directly ought to be enough anywhere in your driver?


   'priv-phydev' is NULL when the device is closed, so I just can't call 
phy_ethtool_sset().



Unless there is a
mode where you are running PHY-less, and not using a fixed PHY to
emulate a PHY...


   No such mode.


[...]



+static int ravb_nway_reset(struct net_device *ndev)
+{
+struct ravb_private *priv = netdev_priv(ndev);
+int error = -ENODEV;
+unsigned long flags;
+
+if (priv-phydev) {



Is checking against priv-phydev really necessary, it does not look like
the driver will work or accept an invalid PHY device at all anyway?


   This check was copied from sh_eth that was fixed by Ben ot to crash due to
'ethtool' being called on closed device, see:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/renesas/sh_eth.c?id=4f9dce230b32eec45cec8c28cae61efdfa2f7d57

   That commit refers to a dangling pointer, not sure what does this mean...
The PHy device doesn't seem to be freed by phy_disconnect(). Ben?


You still can run 'ethtool' on a closed network device.



Sure, but that does not mean that priv-phydev becomes NULL, even if you


   It does with 'sh_eth' and hence with 'ravb' too.


have called phy_disconnect() in your ndo_close() function, you should
still have a correct priv-phydev reference to the PHY device, no?


   PHY device is returned by of_phy_connect() each time the device is opened, 
see ravb_phy_init().
   We could indeed remove NULLifying 'priv-phydev' from ravb_close() though, 
needs testing...


[...]


+static int ravb_start_xmit(struct sk_buff *skb, struct net_device
*ndev)
+{
+struct ravb_private *priv = netdev_priv(ndev);
+struct ravb_tstamp_skb *ts_skb = NULL;
+struct ravb_tx_desc *desc;
+unsigned long flags;
+void *buffer;
+u32 entry;
+u32 tccr;
+int q;
+
+/* If skb needs TX timestamp, it is handled in network control
queue */
+q = (skb_shinfo(skb)-tx_flags  SKBTX_HW_TSTAMP) ? RAVB_NC :
RAVB_BE;
+
+spin_lock_irqsave(priv-lock, flags);
+if (priv-cur_tx[q] - priv-dirty_tx[q] = priv-num_tx_ring[q]
- 4) {



What's so special about 4 here, you don't seem to be using 4 descriptors



Not sure, this was clearly copied from sh_eth.c. Perhaps it's just a
threshold for calling ravb_tx_free()...



Then 1 inclusive or 0 exclusive is probably what you should be comparing
to, otherwise you may just stop the tx queue earlier than needed.


   Will look into this...

[...]


+desc-ds = skb-len;
+desc-dptr = dma_map_single(ndev-dev, buffer, skb-len,
+DMA_TO_DEVICE);
+if (dma_mapping_error(ndev-dev, desc-dptr)) {
+dev_kfree_skb_any(skb);
+priv-tx_skb[q][entry] = NULL;



Don't you need to make sure this NULL is properly seen by ravb_tx_free()?



You mean doing this before releasing the spinlock? Or what?



Yes, the locking your transmit function seems to open windows during
which it is possible for the interrupt handler running on another CPU to
mess up with the data you are using here.


   Will look into that too...


--
Florian


WBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread David Miller
From: MITSUHIRO KIMURA mitsuhiro.kimura...@renesas.com
Date: Wed, 22 Apr 2015 05:04:13 +

 Hello Sergei.
 
  (2015/04/15 6:37:28), Sergei Shtylyov wrote:
  + if (!ravb_tx_free(ndev, q)) {
  + netif_warn(priv, tx_queued, ndev, TX FD exhausted.\n);
  + netif_stop_queue(ndev);
  + spin_unlock_irqrestore(priv-lock, flags);
  + return NETDEV_TX_BUSY;
  + }
  + }
  + entry = priv-cur_tx[q] % priv-num_tx_ring[q];
  + priv-cur_tx[q]++;
  + spin_unlock_irqrestore(priv-lock, flags);
  +
  + if (skb_put_padto(skb, ETH_ZLEN))
  + return NETDEV_TX_OK;
  +
  + priv-tx_skb[q][entry] = skb;
  + buffer = PTR_ALIGN(priv-tx_buffers[q][entry], RAVB_ALIGN);
  + memcpy(buffer, skb-data, skb-len);
 
  ~1500 bytes memcpy(), not good...
 
 I'm looking in the manual and not finding the hard requirement to have 
 the
 buffer address aligned to 128 bytes (RAVB_ALIGN), sigh... Kimura-san?
 
 There are the hardware requirement that the frame data must be aligned with
 a 32-bit boundary in the URAM, see section 45A.3.3.1 Data Representation
 in the manual.
 I think that the original skb-data is almost aligned with 2 bytes boundary
 by NET_IP_ALING, so we copied original skb-data to the local aligned buffer.
 
 In addition, see section 45A.3.3.12 Tips for Optimizing Performance in 
 Handling
 Descriptors, it mentioned that frame data is accessed in blocks up to 128 
 bytes
 and the number of 128 byte borders (addresses H'xxx00 and H'xxx80) and frame 
 data
 inside should be minimized.
 So we set RAVB_ALIGN to 128 bytes.

There is no way that copying is going to be faster than finding an adequate way
to transmit directly out of the SKB memory.

In this day and age there is simply no excuse for something like this, you will
have to find a way.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread David Miller
From: Sergei Shtylyov sergei.shtyl...@cogentembedded.com
Date: Thu, 23 Apr 2015 00:38:56 +0300

 On 04/22/2015 11:42 PM, David Miller wrote:
 
 Hmm, I've been digging in the net core, and was unable to see where TX
 skb's get their NET_IP_ALIGN bytes reserved. Have I missed something?
 Probably need to print out skb's fields...
 
 NET_IP_ALIGN is for receive, not transmit.
 
But when I print 'skb-data' from the ndo_start_xmit() method (in the
'sh_eth' driver), all addresses end with 2, so it looks like
NET_IP_ALIGN gets added somewhere...

It's the IPV4 header which is 4 byte aligned, then the ethernet header
is pushed which is 14 bytes.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread David Miller
From: Sergei Shtylyov sergei.shtyl...@cogentembedded.com
Date: Thu, 23 Apr 2015 01:34:32 +0300

Sigh... I'm seeing no way out of that then, only copying. :-(

What exactly is the device's restriction?

Any reasonable modern chip allows one of two things.

Either it allows arbitrary alignment of the start of the TX
frame when DMA'ing.

_or_

It allows a variable number of pad bytes to be inserted by the
driver before giving it to the card, which do not go onto the
wire, in order to meet the device's DMA restrictions.

For example, if the packet is only 2 byte aligned, you set the ignore
offset to 2 and push two zero bytes in front of the ethernet frame
before giving it to the card.

If a chip made in this day and era cannot do one of those two things,
this is beyond disappointing and is a massive engineering failure.
Whoever designed this chip made no investigation into how their
hardware is going to be actually used.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread Sergei Shtylyov

Hello.

On 04/22/2015 11:42 PM, David Miller wrote:


Hmm, I've been digging in the net core, and was unable to see where TX
skb's get their NET_IP_ALIGN bytes reserved. Have I missed something?
Probably need to print out skb's fields...



NET_IP_ALIGN is for receive, not transmit.


   Hm, then 'skb-data' should be aligned to 4 byte boundary on TX, right?
If so, there should be no problem with removing the copy.

WBR, Segei

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread David Miller
From: Sergei Shtylyov sergei.shtyl...@cogentembedded.com
Date: Wed, 22 Apr 2015 23:46:52 +0300

 Hello.
 
 On 04/22/2015 11:42 PM, David Miller wrote:
 
 Hmm, I've been digging in the net core, and was unable to see where TX
 skb's get their NET_IP_ALIGN bytes reserved. Have I missed something?
 Probably need to print out skb's fields...
 
 NET_IP_ALIGN is for receive, not transmit.
 
Hm, then 'skb-data' should be aligned to 4 byte boundary on TX,
right?

Yes, TCP even take great pains to ensure this.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread Sergei Shtylyov

On 04/23/2015 01:41 AM, David Miller wrote:



Sigh... I'm seeing no way out of that then, only copying. :-(



What exactly is the device's restriction?


   The frame data must be aligned on 32-bit boundary.


Any reasonable modern chip allows one of two things.



Either it allows arbitrary alignment of the start of the TX
frame when DMA'ing.



_or_



It allows a variable number of pad bytes to be inserted by the
driver before giving it to the card, which do not go onto the
wire, in order to meet the device's DMA restrictions.



For example, if the packet is only 2 byte aligned, you set the ignore
offset to 2 and push two zero bytes in front of the ethernet frame
before giving it to the card.


   I'm not seeing any padding logic on the TX path, only on the RX path (but 
it counts in 4-byte words, so seems quite useless).



If a chip made in this day and era cannot do one of those two things,
this is beyond disappointing and is a massive engineering failure.
Whoever designed this chip made no investigation into how their
hardware is going to be actually used.


   Too nad the Renesas SoC designers are not reading that. :-)

WBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread David Miller
From: Sergei Shtylyov sergei.shtyl...@cogentembedded.com
Date: Wed, 22 Apr 2015 23:30:02 +0300

Hmm, I've been digging in the net core, and was unable to see where TX
skb's get their NET_IP_ALIGN bytes reserved. Have I missed something?
Probably need to print out skb's fields...

NET_IP_ALIGN is for receive, not transmit.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread Sergei Shtylyov

On 04/22/2015 11:42 PM, David Miller wrote:


Hmm, I've been digging in the net core, and was unable to see where TX
skb's get their NET_IP_ALIGN bytes reserved. Have I missed something?
Probably need to print out skb's fields...



NET_IP_ALIGN is for receive, not transmit.


   But when I print 'skb-data' from the ndo_start_xmit() method (in the 
'sh_eth' driver), all addresses end with 2, so it looks like NET_IP_ALIGN gets 
added somewhere...


WBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread Sergei Shtylyov

On 04/23/2015 01:18 AM, David Miller wrote:


 Hmm, I've been digging in the net core, and was unable to see where TX
 skb's get their NET_IP_ALIGN bytes reserved. Have I missed something?
 Probably need to print out skb's fields...



NET_IP_ALIGN is for receive, not transmit.



But when I print 'skb-data' from the ndo_start_xmit() method (in the
'sh_eth' driver), all addresses end with 2, so it looks like
NET_IP_ALIGN gets added somewhere...



It's the IPV4 header which is 4 byte aligned, then the ethernet header
is pushed which is 14 bytes.


   Sigh... I'm seeing no way out of that then, only copying. :-(

WBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread Florian Fainelli
On 14/04/15 14:37, Sergei Shtylyov wrote:
 
 +/* Wait for stopping the hardware TX process */
 +ravb_wait(ndev, TCCR, TCCR_TSRQ0 | TCCR_TSRQ1 | TCCR_TSRQ2 |
 TCCR_TSRQ3,
 +  0);
 +
 +ravb_wait(ndev, CSR, CSR_TPO0 | CSR_TPO1 | CSR_TPO2 | CSR_TPO3, 0);
 +
 +/* Stop the E-MAC's RX processes. */
 +ravb_write(ndev, ravb_read(ndev, ECMR)  ~ECMR_RE, ECMR);
 
 [snip]
 
 +/* Transmited network control queue */
 +if (tis  TIS_FTF1) {
 +ravb_tx_free(ndev, RAVB_NC);
 +netif_wake_queue(ndev);
 
 This would be better moved to the NAPI handler.
 
Maybe, not sure...
 
 +result = IRQ_HANDLED;
 +}
 
 [snip]
 
 +if (ecmd-duplex == DUPLEX_FULL)
 +priv-duplex = 1;
 +else
 +priv-duplex = 0;
 
 Why not use what priv-phydev-duplex has cached for you?
 
Because we compare 'priv-duplex' with 'priv-phydev-duplex' in
 ravb_adjust_link(). Or what did you mean?

Oh I see how you are using this now, but it does not look like it is
necessary, since you use phy_ethtool_sset() using phydev-duplex
directly ought to be enough anywhere in your driver? Unless there is a
mode where you are running PHY-less, and not using a fixed PHY to
emulate a PHY...

 
 [...]
 
 +static int ravb_nway_reset(struct net_device *ndev)
 +{
 +struct ravb_private *priv = netdev_priv(ndev);
 +int error = -ENODEV;
 +unsigned long flags;
 +
 +if (priv-phydev) {
 
 Is checking against priv-phydev really necessary, it does not look like
 the driver will work or accept an invalid PHY device at all anyway?
 
You still can run 'ethtool' on a closed network device.

Sure, but that does not mean that priv-phydev becomes NULL, even if you
have called phy_disconnect() in your ndo_close() function, you should
still have a correct priv-phydev reference to the PHY device, no?

 
 [...]
 
 +/* Network device open function for Ethernet AVB */
 +static int ravb_open(struct net_device *ndev)
 +{
 +struct ravb_private *priv = netdev_priv(ndev);
 +int error;
 +
 +napi_enable(priv-napi);
 +
 +error = request_irq(ndev-irq, ravb_interrupt, IRQF_SHARED,
 ndev-name,
 +ndev);
 +if (error) {
 +netdev_err(ndev, cannot request IRQ\n);
 +goto out_napi_off;
 +}
 +
 +/* Descriptor set */
 +/* +26 gets the maximum ethernet encapsulation, +7  ~7 because the
 + * card needs room to do 8 byte alignment, +2 so we can reserve
 + * the first 2 bytes, and +16 gets room for the status word from
 the
 + * card.
 + */
 +priv-rx_buffer_size = (ndev-mtu = 1492 ? PKT_BUF_SZ :
 +(((ndev-mtu + 26 + 7)  ~7) + 2 + 16));
 
 Is not that something that should be moved to a local ndo_change_mtu()
 
That was copied from sh_eth.c verbatim, I even doubt that the formula
 is correct for EtherAVB...
 
 function? What happens if I change the MTU of an interface running, does
 not that completely break this RX buffer estimation?
 
Well, not completely, I think. eth_change_mtu() doesn't allow MTU 
 1500 bytes, so it looks like we just need to change 1492 to 1500 here.
 
 [...]
 
 +static int ravb_start_xmit(struct sk_buff *skb, struct net_device
 *ndev)
 +{
 +struct ravb_private *priv = netdev_priv(ndev);
 +struct ravb_tstamp_skb *ts_skb = NULL;
 +struct ravb_tx_desc *desc;
 +unsigned long flags;
 +void *buffer;
 +u32 entry;
 +u32 tccr;
 +int q;
 +
 +/* If skb needs TX timestamp, it is handled in network control
 queue */
 +q = (skb_shinfo(skb)-tx_flags  SKBTX_HW_TSTAMP) ? RAVB_NC :
 RAVB_BE;
 +
 +spin_lock_irqsave(priv-lock, flags);
 +if (priv-cur_tx[q] - priv-dirty_tx[q] = priv-num_tx_ring[q]
 - 4) {
 
 What's so special about 4 here, you don't seem to be using 4 descriptors
 
Not sure, this was clearly copied from sh_eth.c. Perhaps it's just a
 threshold for calling ravb_tx_free()...

Then 1 inclusive or 0 exclusive is probably what you should be comparing
to, otherwise you may just stop the tx queue earlier than needed.

 
 +if (!ravb_tx_free(ndev, q)) {
 +netif_warn(priv, tx_queued, ndev, TX FD exhausted.\n);
 +netif_stop_queue(ndev);
 +spin_unlock_irqrestore(priv-lock, flags);
 +return NETDEV_TX_BUSY;
 +}
 +}
 +entry = priv-cur_tx[q] % priv-num_tx_ring[q];
 +priv-cur_tx[q]++;
 +spin_unlock_irqrestore(priv-lock, flags);
 +
 +if (skb_put_padto(skb, ETH_ZLEN))
 +return NETDEV_TX_OK;
 +
 +priv-tx_skb[q][entry] = skb;
 +buffer = PTR_ALIGN(priv-tx_buffers[q][entry], RAVB_ALIGN);
 +memcpy(buffer, skb-data, skb-len);
 
 ~1500 bytes memcpy(), not good...
 
I'm looking in the manual and not finding the hard requirement to
 have the buffer address aligned to 128 bytes (RAVB_ALIGN), sigh...
 Kimura-san?
 
 +desc = priv-tx_ring[q][entry];
 
 Since we have released the spinlock few lines above, is there something
 protecting 

Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-22 Thread Sergei Shtylyov

Hello.

On 04/22/2015 06:36 PM, David Miller wrote:


+   if (!ravb_tx_free(ndev, q)) {
+   netif_warn(priv, tx_queued, ndev, TX FD exhausted.\n);
+   netif_stop_queue(ndev);
+   spin_unlock_irqrestore(priv-lock, flags);
+   return NETDEV_TX_BUSY;
+   }
+   }
+   entry = priv-cur_tx[q] % priv-num_tx_ring[q];
+   priv-cur_tx[q]++;
+   spin_unlock_irqrestore(priv-lock, flags);
+
+   if (skb_put_padto(skb, ETH_ZLEN))
+   return NETDEV_TX_OK;
+
+   priv-tx_skb[q][entry] = skb;
+   buffer = PTR_ALIGN(priv-tx_buffers[q][entry], RAVB_ALIGN);
+   memcpy(buffer, skb-data, skb-len);



~1500 bytes memcpy(), not good...



 I'm looking in the manual and not finding the hard requirement to have the
buffer address aligned to 128 bytes (RAVB_ALIGN), sigh... Kimura-san?



There are the hardware requirement that the frame data must be aligned with
a 32-bit boundary in the URAM, see section 45A.3.3.1 Data Representation
in the manual.
I think that the original skb-data is almost aligned with 2 bytes boundary
by NET_IP_ALING, so we copied original skb-data to the local aligned buffer.



In addition, see section 45A.3.3.12 Tips for Optimizing Performance in Handling
Descriptors, it mentioned that frame data is accessed in blocks up to 128 bytes
and the number of 128 byte borders (addresses H'xxx00 and H'xxx80) and frame 
data
inside should be minimized.
So we set RAVB_ALIGN to 128 bytes.



There is no way that copying is going to be faster than finding an adequate way
to transmit directly out of the SKB memory.



In this day and age there is simply no excuse for something like this, you will
have to find a way.


   Hmm, I've been digging in the net core, and was unable to see where TX 
skb's get their NET_IP_ALIGN bytes reserved. Have I missed something? Probably 
need to print out skb's fields...


WBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-21 Thread MITSUHIRO KIMURA
Hello Sergei.

 (2015/04/15 6:37:28), Sergei Shtylyov wrote:
  +  if (!ravb_tx_free(ndev, q)) {
  +  netif_warn(priv, tx_queued, ndev, TX FD exhausted.\n);
  +  netif_stop_queue(ndev);
  +  spin_unlock_irqrestore(priv-lock, flags);
  +  return NETDEV_TX_BUSY;
  +  }
  +  }
  +  entry = priv-cur_tx[q] % priv-num_tx_ring[q];
  +  priv-cur_tx[q]++;
  +  spin_unlock_irqrestore(priv-lock, flags);
  +
  +  if (skb_put_padto(skb, ETH_ZLEN))
  +  return NETDEV_TX_OK;
  +
  +  priv-tx_skb[q][entry] = skb;
  +  buffer = PTR_ALIGN(priv-tx_buffers[q][entry], RAVB_ALIGN);
  +  memcpy(buffer, skb-data, skb-len);
 
  ~1500 bytes memcpy(), not good...
 
 I'm looking in the manual and not finding the hard requirement to have the
 buffer address aligned to 128 bytes (RAVB_ALIGN), sigh... Kimura-san?

There are the hardware requirement that the frame data must be aligned with
a 32-bit boundary in the URAM, see section 45A.3.3.1 Data Representation
in the manual.
I think that the original skb-data is almost aligned with 2 bytes boundary
by NET_IP_ALING, so we copied original skb-data to the local aligned buffer.

In addition, see section 45A.3.3.12 Tips for Optimizing Performance in Handling
Descriptors, it mentioned that frame data is accessed in blocks up to 128 bytes
and the number of 128 byte borders (addresses H'xxx00 and H'xxx80) and frame 
data
inside should be minimized.
So we set RAVB_ALIGN to 128 bytes.

Best Regards,
Mitsuhiro Kimura



Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-19 Thread Richard Cochran
On Tue, Apr 14, 2015 at 01:07:38AM +0300, Sergei Shtylyov wrote:

 +static int ravb_wait(struct net_device *ndev, u16 reg, u32 mask, u32 value)
 +{
 + int i;
 +
 + for (i = 0; i  1; i++) {
 + if ((ravb_read(ndev, reg)  mask) == value)
 + return 0;
 + udelay(10);
 + }
 + return -ETIMEDOUT;
 +}

This function performs a busy wait of up to 100 milliseconds.

It also has a return value.

 +/* function for waiting dma process finished */
 +static void ravb_wait_stop_dma(struct net_device *ndev)
 +{
 + /* Wait for stopping the hardware TX process */
 + ravb_wait(ndev, TCCR, TCCR_TSRQ0 | TCCR_TSRQ1 | TCCR_TSRQ2 | TCCR_TSRQ3,
 +   0);
 +
 + ravb_wait(ndev, CSR, CSR_TPO0 | CSR_TPO1 | CSR_TPO2 | CSR_TPO3, 0);

Ignores return value.

 + /* Stop the E-MAC's RX processes. */
 + ravb_write(ndev, ravb_read(ndev, ECMR)  ~ECMR_RE, ECMR);
 +
 + /* Wait for stopping the RX DMA process */
 + ravb_wait(ndev, CSR, CSR_RPO, 0);
 +}
 +
 +/* Caller must hold the lock */
 +static void ravb_ptp_update_compare(struct ravb_private *priv, u32 ns)
 +{
 + struct net_device *ndev = priv-ndev;
 + /* When the comparison value (GPTC.PTCV) is in range of
 +  * [x-1 to x+1] (x is the configured increment value in
 +  * GTI.TIV), it may happen that a comparison match is
 +  * not detected when the timer wraps around.
 +  */
 + u32 gti_ns_plus_1 = (priv-ptp.current_addend  20) + 1;
 +
 + if (ns  gti_ns_plus_1)
 + ns = gti_ns_plus_1;
 + else if (ns  0 - gti_ns_plus_1)
 + ns = 0 - gti_ns_plus_1;
 +
 + ravb_write(ndev, ns, GPTC);
 + ravb_write(ndev, ravb_read(ndev, GCCR) | GCCR_LPTC, GCCR);
 + if (ravb_read(ndev, CSR)  CSR_OPS_OPERATION)
 + ravb_wait(ndev, GCCR, GCCR_LPTC, 0);

Ignores return value.

 +}

 +static void ravb_ptp_tcr_request(struct ravb_private *priv, int request)
 +{
 + struct net_device *ndev = priv-ndev;
 +
 + if (ravb_read(ndev, CSR)  CSR_OPS_OPERATION) {
 + ravb_wait(ndev, GCCR, GCCR_TCR, GCCR_TCR_NOREQ);
 + ravb_write(ndev, ravb_read(ndev, GCCR) | request, GCCR);
 + ravb_wait(ndev, GCCR, GCCR_TCR, GCCR_TCR_NOREQ);

Ignores return value.

 + }
 +}

 +/* Caller must hold lock */
 +static void ravb_ptp_time_write(struct ravb_private *priv,
 + const struct timespec64 *ts)
 +{
 + struct net_device *ndev = priv-ndev;
 +
 + ravb_ptp_tcr_request(priv, GCCR_TCR_RESET);
 +
 + ravb_write(ndev, ts-tv_nsec, GTO0);
 + ravb_write(ndev, ts-tv_sec,  GTO1);
 + ravb_write(ndev, (ts-tv_sec  32)  0x, GTO2);
 + ravb_write(ndev, ravb_read(ndev, GCCR) | GCCR_LTO, GCCR);
 + if (ravb_read(ndev, CSR)  CSR_OPS_OPERATION)
 + ravb_wait(ndev, GCCR, GCCR_LTO, 0);

Ignores return value.

 +}
 +
 +/* Caller must hold lock */
 +static u64 ravb_ptp_cnt_read(struct ravb_private *priv)
 +{
 + struct timespec64 ts;
 + ktime_t kt;
 +
 + ravb_ptp_time_read(priv, ts);
 + kt = timespec64_to_ktime(ts);
 +
 + return ktime_to_ns(kt);
 +}
 +
 +/* Caller must hold lock */
 +static void ravb_ptp_cnt_write(struct ravb_private *priv, u64 ns)
 +{
 + struct timespec64 ts = ns_to_timespec64(ns);
 +
 + ravb_ptp_time_write(priv, ts);
 +}
 +

 +/* Caller must hold lock */
 +static void ravb_ptp_select_counter(struct ravb_private *priv, u16 sel)
 +{
 + struct net_device *ndev = priv-ndev;
 + u32 val;
 +
 + ravb_wait(ndev, GCCR, GCCR_TCR, GCCR_TCR_NOREQ);

Ignores return value.

 + val = ravb_read(ndev, GCCR)  ~GCCR_TCSS;
 + ravb_write(ndev, val | (sel  8), GCCR);
 +}
 +
 +/* Caller must hold lock */
 +static void ravb_ptp_update_addend(struct ravb_private *priv, u32 addend)
 +{
 + struct net_device *ndev = priv-ndev;
 +
 + priv-ptp.current_addend = addend;
 +
 + ravb_write(ndev, addend  GTI_TIV, GTI);
 + ravb_write(ndev, ravb_read(ndev, GCCR) | GCCR_LTI, GCCR);
 + if (ravb_read(ndev, CSR)  CSR_OPS_OPERATION)
 + ravb_wait(ndev, GCCR, GCCR_LTI, 0);

Ignores return value.

 +}
 +
 +/* PTP clock operations */
 +static int ravb_ptp_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
 +{
 + struct ravb_private *priv = container_of(ptp, struct ravb_private,
 +  ptp.info);
 + unsigned long flags;
 + u32 diff, addend;
 + int neg_adj = 0;
 + u64 adj;
 +
 + if (ppb  0) {
 + neg_adj = 1;
 + ppb = -ppb;
 + }
 + addend = priv-ptp.default_addend;
 + adj = addend;
 + adj *= ppb;
 + diff = div_u64(adj, NSEC_PER_SEC);
 +
 + addend = neg_adj ? addend - diff : addend + diff;
 +
 + spin_lock_irqsave(priv-lock, flags);
 + ravb_ptp_update_addend(priv, addend);

This is one example of many where you make a call to ravb_wait() while:

- holding a spinlock with interrupts disabled (for up to 100 milliseconds)
- 

Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-19 Thread Sergei Shtylyov

Hello.

On 04/14/2015 03:49 AM, Lino Sanfilippo wrote:


+struct ravb_desc {
+#ifdef __LITTLE_ENDIAN
+   u32 ds: 12; /* Descriptor size */
+   u32 cc: 12; /* Content control */
+   u32 die: 4; /* Descriptor interrupt enable */
+   /* 0: disable, other: enable */
+   u32 dt: 4;  /* Descriptor type */
+#else
+   u32 dt: 4;  /* Descriptor type */
+   u32 die: 4; /* Descriptor interrupt enable */
+   /* 0: disable, other: enable */
+   u32 cc: 12; /* Content control */
+   u32 ds: 12; /* Descriptor size */
+#endif
+   u32 dptr;   /* Descriptor pointer */
+};
+
+struct ravb_rx_desc {
+#ifdef __LITTLE_ENDIAN
+   u32 ds: 12; /* Descriptor size */
+   u32 ei: 1;  /* Error indication */
+   u32 ps: 2;  /* Padding selection */
+   u32 tr: 1;  /* Truncation indication */
+   u32 msc: 8; /* MAC status code */
+   u32 die: 4; /* Descriptor interrupt enable */
+   /* 0: disable, other: enable */
+   u32 dt: 4;  /* Descriptor type */
+#else
+   u32 dt: 4;  /* Descriptor type */
+   u32 die: 4; /* Descriptor interrupt enable */
+   /* 0: disable, other: enable */
+   u32 msc: 8; /* MAC status code */
+   u32 ps: 2;  /* Padding selection */
+   u32 ei: 1;  /* Error indication */
+   u32 tr: 1;  /* Truncation indication */
+   u32 ds: 12; /* Descriptor size */
+#endif
+   u32 dptr;   /* Descpriptor pointer */
+};
+
+struct ravb_ex_rx_desc {
+#ifdef __LITTLE_ENDIAN
+   u32 ds: 12; /* Descriptor size */
+   u32 ei: 1;  /* Error indication */
+   u32 ps: 2;  /* Padding selection */
+   u32 tr: 1;  /* Truncation indication */
+   u32 msc: 8; /* MAC status code */
+   u32 die: 4; /* Descriptor interrupt enable */
+   /* 0: disable, other: enable */
+   u32 dt: 4;  /* Descriptor type */
+#else
+   u32 dt: 4;  /* Descriptor type */
+   u32 die: 4; /* Descriptor interrupt enable */
+   /* 0: disable, other: enable */
+   u32 msc: 8; /* MAC status code */
+   u32 ps: 2;  /* Padding selection */
+   u32 ei: 1;  /* Error indication */
+   u32 tr: 1;  /* Truncation indication */
+   u32 ds: 12; /* Descriptor size */
+#endif
+   u32 dptr;   /* Descpriptor pointer */
+   u32 ts_n;   /* Timestampe nsec */
+   u32 ts_sl;  /* Timestamp low */
+#ifdef __LITTLE_ENDIAN
+   u32 res: 16;/* Reserved bits */
+   u32 ts_sh: 16;  /* Timestamp high */
+#else
+   u32 ts_sh: 16;  /* Timestamp high */
+   u32 res: 16;/* Reserved bits */
+#endif
+};



I recall a thread in which the use of bitfields for structs that are
shared with the hardware was considered a bad idea (because the compiler
is free to reorder the fields). Shift operations are probably a better
choice here.


   Well, it looks as the compiler is not free to reorder bit fields, and the 
order is determined by the ABI. Will look into getting rid of them anyway...


[...]

+/* Network device open function for Ethernet AVB */
+static int ravb_open(struct net_device *ndev)
+{
+   struct ravb_private *priv = netdev_priv(ndev);
+   int error;
+
+   napi_enable(priv-napi);
+
+   error = request_irq(ndev-irq, ravb_interrupt, IRQF_SHARED, ndev-name,
+   ndev);
+   if (error) {
+   netdev_err(ndev, cannot request IRQ\n);
+   goto out_napi_off;
+   }
+
+   /* Descriptor set */
+   /* +26 gets the maximum ethernet encapsulation, +7  ~7 because the
+* card needs room to do 8 byte alignment, +2 so we can reserve
+* the first 2 bytes, and +16 gets room for the status word from the
+* card.
+*/
+   priv-rx_buffer_size = (ndev-mtu = 1492 ? PKT_BUF_SZ :
+   (((ndev-mtu + 26 + 7)  ~7) + 2 + 16));
+
+   error = ravb_ring_init(ndev, RAVB_BE);
+   if (error)
+   goto out_free_irq;
+   error = ravb_ring_init(ndev, RAVB_NC);
+   if (error)
+   goto out_free_irq;
+
+   /* Device init */
+   error = ravb_dmac_init(ndev);
+   if (error)
+   goto out_free_irq;
+   ravb_emac_init(ndev);
+
+   netif_start_queue(ndev);
+
+   /* PHY control start */
+   error = ravb_phy_start(ndev);
+   if (error)
+   goto out_free_irq;
+
+   return 0;
+
+out_free_irq:
+   free_irq(ndev-irq, ndev);



freeing all the memory allocated in the former avb_ring_init calls is
missing.


   OK, fixed. The same bug as in sh_eth.c which also needs fixing

[...]

+/* Timeout function for Ethernet AVB */
+static void ravb_tx_timeout(struct net_device *ndev)
+{
+   struct ravb_private *priv = netdev_priv(ndev);
+   int i, q;
+
+ 

Re: [PATCH v3] Renesas Ethernet AVB driver

2015-04-19 Thread Lino Sanfilippo
Hi,

On 20.04.2015 00:10, Sergei Shtylyov wrote:
 
 I recall a thread in which the use of bitfields for structs that are
 shared with the hardware was considered a bad idea (because the compiler
 is free to reorder the fields). Shift operations are probably a better
 choice here.
 
 Well, it looks as the compiler is not free to reorder bit fields, and the 
 order is determined by the ABI. Will look into getting rid of them anyway...

I think that thread I was referring to was this one:
http://thread.gmane.org/gmane.linux.kernel/182862/focus=182986
(See the first comment from Benjamin Herrenschmidt).

 +/* Packet transmit function for Ethernet AVB */
 +static int ravb_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 +{
 +   struct ravb_private *priv = netdev_priv(ndev);
 +   struct ravb_tstamp_skb *ts_skb = NULL;
 +   struct ravb_tx_desc *desc;
 +   unsigned long flags;
 +   void *buffer;
 +   u32 entry;
 +   u32 tccr;
 +   int q;
 +
 +   /* If skb needs TX timestamp, it is handled in network control queue */
 +   q = (skb_shinfo(skb)-tx_flags  SKBTX_HW_TSTAMP) ? RAVB_NC : RAVB_BE;
 +
 +   spin_lock_irqsave(priv-lock, flags);
 +   if (priv-cur_tx[q] - priv-dirty_tx[q] = priv-num_tx_ring[q] - 4) {
 +   if (!ravb_tx_free(ndev, q)) {
 +   netif_warn(priv, tx_queued, ndev, TX FD exhausted.\n);
 +   netif_stop_queue(ndev);
 +   spin_unlock_irqrestore(priv-lock, flags);
 +   return NETDEV_TX_BUSY;
 +   }
 +   }
 +   entry = priv-cur_tx[q] % priv-num_tx_ring[q];
 +   priv-cur_tx[q]++;
 +   spin_unlock_irqrestore(priv-lock, flags);
 +
 +   if (skb_put_padto(skb, ETH_ZLEN))
 +   return NETDEV_TX_OK;
 +
 +   priv-tx_skb[q][entry] = skb;
 +   buffer = PTR_ALIGN(priv-tx_buffers[q][entry], RAVB_ALIGN);
 +   memcpy(buffer, skb-data, skb-len);
 +   desc = priv-tx_ring[q][entry];
 +   desc-ds = skb-len;
 +   desc-dptr = dma_map_single(ndev-dev, buffer, skb-len,
 +   DMA_TO_DEVICE);
 +   if (dma_mapping_error(ndev-dev, desc-dptr)) {
 +   dev_kfree_skb_any(skb);
 +   priv-tx_skb[q][entry] = NULL;
 +   return NETDEV_TX_OK;
 +   }
 +
 +   /* TX timestamp required */
 +   if (q == RAVB_NC) {
 +   ts_skb = kmalloc(sizeof(*ts_skb), GFP_ATOMIC);
 +   if (!ts_skb)
 +   return -ENOMEM;
 
 Dma mapping has to be undone.
 
 OK, fixed. Not sure what we should return in this case: error code or
 NETDEV_TX_OK...

NETDEV_TX_OK is the correct return value even in error case. The only
exception is NETDEV_TX_BUSY when the tx queue has been stopped. However
returning NETDEV_TX_OK also means that the skb has to be consumed (so
beside unmapping dma also the skb has to be freed in case that kmalloc
fails in ravb_start_xmit).

 example all ptp related code could be put into its own file.
 
 OK, will try to split the driver back... Perhaps I should also split the 
 patch accordingly?

Yes, sounds like a good idea.

Regards,
Lino

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html