On 02/27/2013 02:04 PM, Tijs Van Buggenhout wrote: > On Tuesday 26 February 2013 21:45:00 Tijs Van Buggenhout wrote: >> This is a proposal to add QoS support to bgmac driver, allowing to use >> (possibly) all of the tx queues supported by the hardware. >> >> The bgmac device is allocated as an ethernet multiqueue network device, >> reporting the (maximum) tx queues supported by the hardware to the kernel. >> >> The default queue selection mechenism is replaced with the >> bgmac_select_queue function, mapping the priorities of the skbs to a >> specific tx queue number.
I would not replace the default algorithm. >> Later this number is used in the bgmax_start_xmit function to select the >> right tx queue, on which the new skb is added. >> The hardware will generate an interrupt when the skb is handled, in which it >> reports on what hardware queue (number/id) the event occured, allowing to >> call bgmac_dma_tx_free on the right tx queue/ring. >> >> Mapping is according to 802.1d priority to traffic class mapping, as found >> in et_linux.c : > > Sorry, it resides in et/sys/etc.c, my mistake... > >> TX_Q1, /* 0 BE TC_BE Best Effort */ >> TX_Q0, /* 1 BK TC_BK Background */ >> TX_Q0, /* 2 -- TC_BK Background */ >> TX_Q1, /* 3 EE TC_BE Best Effort */ >> TX_Q2, /* 4 CL TC_CL Controlled Load */ >> TX_Q2, /* 5 VI TC_CL Controlled Load */ >> TX_Q3, /* 6 VO TC_VO Voice */ >> TX_Q3 /* 7 NC TC_VO Voice */ >> Additionally any other value is mapped on TX_Q0.. >> >> Impact on the source code is rather limited, some extra local variables for >> netdev_queue, queue_id (u16), and all netif_.*_queue calls converted to >> appropriate netdif_tx_.*_queue calls. >> >> I did not test extensively yet, but proof of concept works and is easily >> tested by adding some debug code to the driver (in bgmac_dma_tx_add, >> bgmac_poll and bgmac_select_queue functions) and using the ping command >> on command line and the dropbear ssh (which uses a non-default TOS value >> on packets: Minimize Delay). Console output shows the following: >> >> root@OpenWrt:/# ping 192.168.1.113 >> PING 192.168.1.113 (192.168.1.113): 56 data bytes >> [ 58.304000] bgmac bcma0:1: bgmac_select_queue based on priority 0x0 >> [ 58.316000] bgmac bcma0:1: Select tx queue 1 for packet >> [ 58.320000] bgmac bcma0:1: Received interrupt for tx queue 1 >> [ 58.328000] bgmac bcma0:1: bgmac_select_queue based on priority 0x0 >> [ 58.332000] bgmac bcma0:1: Select tx queue 1 for packet >> [ 58.340000] bgmac bcma0:1: Received interrupt for tx queue 1 >> 64 bytes from 192.168.1.113: seq=0 ttl=64 time=40.513 ms >> [ 59.344000] bgmac bcma0:1: bgmac_select_queue based on priority 0x0 >> [ 59.352000] bgmac bcma0:1: Select tx queue 1 for packet >> [ 59.356000] bgmac bcma0:1: Received interrupt for tx queue 1 >> 64 bytes from 192.168.1.113: seq=1 ttl=64 time=18.595 ms >> [ 60.364000] bgmac bcma0:1: bgmac_select_queue based on priority 0x0 >> [ 60.372000] bgmac bcma0:1: Select tx queue 1 for packet >> [ 60.376000] bgmac bcma0:1: Received interrupt for tx queue 1 >> 64 bytes from 192.168.1.113: seq=2 ttl=64 time=18.550 ms >> ^C >> --- 192.168.1.113 ping statistics --- >> 3 packets transmitted, 3 packets received, 0% packet loss >> root@OpenWrt:/# ssh [email protected] >> [ 105.992000] bgmac bcma0:1: bgmac_select_queue based on priority 0x0 >> [ 106.000000] bgmac bcma0:1: Select tx queue 1 for packet >> [ 106.004000] bgmac bcma0:1: Received interrupt for tx queue 1 >> [ 106.012000] bgmac bcma0:1: bgmac_select_queue based on priority 0x0 >> [ 106.016000] bgmac bcma0:1: Select tx queue 1 for packet >> [ 106.024000] bgmac bcma0:1: Received interrupt for tx queue 1 >> [ 106.032000] bgmac bcma0:1: bgmac_select_queue based on priority 0x6 >> [ 106.036000] bgmac bcma0:1: Select tx queue 3 for packet >> [ 106.044000] bgmac bcma0:1: Received interrupt for tx queue 3 >> [ 106.048000] bgmac bcma0:1: bgmac_select_queue based on priority 0x6 >> [ 106.056000] bgmac bcma0:1: Select tx queue 3 for packet >> [ 106.060000] bgmac bcma0:1: Received interrupt for tx queue 3 >> [ 106.072000] bgmac bcma0:1: bgmac_select_queue based on priority 0x6 >> ... >> Host '192.168.1.113' is not in the trusted hosts file. >> (fingerprint md5 b7:12:75:ca:65:cb:c7:60:59:35:6a:5d:28:6d:27:83) >> Do you want to continue connecting? (y/n) >> [ 106.400000] bgmac bcma0:1: bgmac_select_queue based on priority 0x6 >> [ 106.404000] bgmac bcma0:1: Select tx queue 3 for packet >> [ 106.408000] bgmac bcma0:1: Received interrupt for tx queue 3 >> y >> [ 111.776000] bgmac bcma0:1: bgmac_select_queue based on priority 0x6 >> ... >> >> This patch is generated from git trunk.. Any comments/suggestions/ideas very >> welcome... This patch is whitespace damaged by your mail client. All the tabs are replaced by spaces and so on this prevents this patch from applying. Please read http://kerneltrap.org/Linux/Email_Clients_and_Patches or use "git send-email". Could you also send this for linux mainline inclusion to the netdev mailing list. >> Signed-off-by: Tijs Van Buggenhout <[email protected]> >> --- >> >> diff --git a/drivers/net/ethernet/broadcom/bgmac.c >> b/drivers/net/ethernet/broadcom/bgmac.c index 3fd3288..58b1d09 100644 >> --- a/drivers/net/ethernet/broadcom/bgmac.c >> +++ b/drivers/net/ethernet/broadcom/bgmac.c >> @@ -15,6 +15,8 @@ >> #include <linux/mii.h> >> #include <linux/interrupt.h> >> #include <linux/dma-mapping.h> >> +#include <linux/pkt_sched.h> /* TC defines */ >> +#include <linux/skbuff.h> /* skb_get_queue_mapping */ ^^^^^^^^^^^^^^^^^^^^^^^^^^^ These comments are not needed. >> #include <asm/mach-bcm47xx/nvram.h> >> >> static const struct bcma_device_id bgmac_bcma_tbl[] = { >> @@ -102,15 +104,22 @@ static void bgmac_dma_tx_enable(struct bgmac *bgmac, >> >> static netdev_tx_t bgmac_dma_tx_add(struct bgmac *bgmac, >> struct bgmac_dma_ring *ring, >> - struct sk_buff *skb) >> + struct sk_buff *skb, >> + u16 queue_id) > > We could determine queue_id from (ring - &bgmac->tx_ring[0]) ofcourse, > avoiding the extra function argument... I like the current way more. >> { >> struct device *dma_dev = bgmac->core->dma_dev; >> struct net_device *net_dev = bgmac->net_dev; >> + struct netdev_queue *tx_queue; >> struct bgmac_dma_desc *dma_desc; >> struct bgmac_slot_info *slot; >> u32 ctl0, ctl1; >> int free_slots; >> >> + if ((tx_queue = netdev_get_tx_queue(net_dev, queue_id)) == NULL) { >> + bgmac_err(bgmac, "Error mapping skb to TX queue (id %u)\n", >> queue_id); + goto err_drop; >> + } >> + >> if (skb->len > BGMAC_DESC_CTL1_LEN) { >> bgmac_err(bgmac, "Too long skb (%d)\n", skb->len); >> goto err_stop_drop; >> @@ -122,7 +131,7 @@ static netdev_tx_t bgmac_dma_tx_add(struct bgmac *bgmac, >> free_slots = ring->start - ring->end; >> if (free_slots == 1) { >> bgmac_err(bgmac, "TX ring is full, queue should be >> stopped!\n"); - netif_stop_queue(net_dev); >> + netif_tx_stop_queue(tx_queue); This looks like an other fix. >> return NETDEV_TX_BUSY; >> } >> >> @@ -160,20 +169,22 @@ static netdev_tx_t bgmac_dma_tx_add(struct bgmac >> *bgmac, >> >> /* Always keep one slot free to allow detecting bugged calls. */ >> if (--free_slots == 1) >> - netif_stop_queue(net_dev); >> + netif_tx_stop_queue(tx_queue); >> >> return NETDEV_TX_OK; >> >> err_stop_drop: >> - netif_stop_queue(net_dev); >> + netif_tx_stop_queue(tx_queue); >> +err_drop: >> dev_kfree_skb(skb); >> return NETDEV_TX_OK; >> } >> >> /* Free transmitted packets */ >> -static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring >> *ring) +static void bgmac_dma_tx_free(struct bgmac *bgmac, struct >> bgmac_dma_ring *ring, u16 queue_id) { > > Also here the extra queue_id argument could be avoided.. > >> struct device *dma_dev = bgmac->core->dma_dev; >> + struct netdev_queue *tx_queue = netdev_get_tx_queue(bgmac->net_dev, >> queue_id); int empty_slot; >> bool freed = false; >> >> @@ -204,8 +215,8 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, >> struct bgmac_dma_ring *ring) freed = true; >> } >> >> - if (freed && netif_queue_stopped(bgmac->net_dev)) >> - netif_wake_queue(bgmac->net_dev); >> + if (freed && netif_tx_queue_stopped(tx_queue)) >> + netif_tx_wake_queue(tx_queue); >> } >> >> static void bgmac_dma_rx_reset(struct bgmac *bgmac, struct bgmac_dma_ring >> *ring) @@ -1096,14 +1107,23 @@ static irqreturn_t bgmac_interrupt(int irq, >> void *dev_id) >> >> static int bgmac_poll(struct napi_struct *napi, int weight) >> { >> + static const u32 tx_ring[BGMAC_MAX_TX_RINGS] = { >> + BGMAC_IS_TX0, BGMAC_IS_TX1, >> + BGMAC_IS_TX2, BGMAC_IS_TX3, >> + }; >> struct bgmac *bgmac = container_of(napi, struct bgmac, napi); >> struct bgmac_dma_ring *ring; >> - int handled = 0; >> - >> - if (bgmac->int_status & BGMAC_IS_TX0) { >> - ring = &bgmac->tx_ring[0]; >> - bgmac_dma_tx_free(bgmac, ring); >> - bgmac->int_status &= ~BGMAC_IS_TX0; >> + int handled = 0, i; >> + >> + if (bgmac->int_status & BGMAC_IS_TX_MASK) { >> + for (i = 0 ; (bgmac->int_status & BGMAC_IS_TX_MASK) && >> + i < BGMAC_MAX_TX_RINGS ; i++) { >> + if (bgmac->int_status & tx_ring[i]) { >> + ring = &bgmac->tx_ring[i]; >> + bgmac_dma_tx_free(bgmac, ring, (u16)i); >> + bgmac->int_status &= ~tx_ring[i]; >> + } >> + } >> } > > The if clause is redundant, as int_status is also checked in the condition of > the for-loop, .. could be left out. I would remove the (bgmac->int_status & BGMAC_IS_TX_MASK) from the for loop. This is a hot path and if there is no tx ack we do not have to do the for loop at all. > >> if (bgmac->int_status & BGMAC_IS_RX) { >> @@ -1172,10 +1192,11 @@ static netdev_tx_t bgmac_start_xmit(struct sk_buff >> *skb, { >> struct bgmac *bgmac = netdev_priv(net_dev); >> struct bgmac_dma_ring *ring; >> + u16 queue_id = 0; >> >> - /* No QOS support yet */ >> - ring = &bgmac->tx_ring[0]; >> - return bgmac_dma_tx_add(bgmac, ring, skb); >> + queue_id = skb_get_queue_mapping(skb); >> + ring = &bgmac->tx_ring[queue_id]; >> + return bgmac_dma_tx_add(bgmac, ring, skb, queue_id); >> } >> >> static int bgmac_set_mac_address(struct net_device *net_dev, void *addr) >> @@ -1217,6 +1238,32 @@ static int bgmac_ioctl(struct net_device *net_dev, >> struct ifreq *ifr, int cmd) } >> } >> >> +static u16 bgmac_select_queue(struct net_device *net_dev, struct sk_buff >> *skb) +{ >> + u32 priority = skb->priority; >> + >> + /* 802.1d priority to traffic class mapping for 4 queues */ >> + switch (priority) { >> + case TC_PRIO_BESTEFFORT: /* Best effort */ >> + case 3: /* Excellent effort */ >> + /* Best effort queue TX_Q1 */ >> + return 1; >> + case TC_PRIO_INTERACTIVE_BULK: /* Controlled load */ >> + case 5: /* Video */ >> + /* Controlled Load queue TX_Q2 */ >> + return 2; >> + case TC_PRIO_INTERACTIVE: /* Voice */ >> + case TC_PRIO_CONTROL: /* Network control */ >> + /* Interactive queue TX_Q3 */ >> + return 3; >> + case TC_PRIO_FILLER: >> + case TC_PRIO_BULK: >> + default: >> + /* Background queue TX_Q0 */ >> + return 0; >> + } >> +} You should not use priority, no other ethernet driver does so, use skb_get_queue_mapping() and see select_ring_by_priority() in drivers/net/wireless/b43/dma.c >> + >> static const struct net_device_ops bgmac_netdev_ops = { >> .ndo_open = bgmac_open, >> .ndo_stop = bgmac_stop, >> @@ -1225,6 +1272,7 @@ static const struct net_device_ops bgmac_netdev_ops = >> { .ndo_set_mac_address = bgmac_set_mac_address, >> .ndo_validate_addr = eth_validate_addr, >> .ndo_do_ioctl = bgmac_ioctl, >> + .ndo_select_queue = bgmac_select_queue, >> }; >> >> /************************************************** >> @@ -1330,7 +1378,7 @@ static int bgmac_probe(struct bcma_device *core) >> } >> >> /* Allocation and references */ >> - net_dev = alloc_etherdev(sizeof(*bgmac)); >> + net_dev = alloc_etherdev_mq(sizeof(*bgmac), BGMAC_MAX_TX_RINGS); >> if (!net_dev) >> return -ENOMEM; >> net_dev->netdev_ops = &bgmac_netdev_ops; >> _______________________________________________ >> openwrt-devel mailing list >> [email protected] >> https://lists.openwrt.org/mailman/listinfo/openwrt-devel > > We could keep some counters on the individual rings, to print statistics later > e.g. with ethtool. This driver currently does not generate many statics, but this should be improved in general. > We could have a qos boolean to en/disable qos, which can be checked in > bgmac_select_queue to e.g always return TX_Q0 when disabled, or one of four > tx queues, as shown above. when you use skb_get_queue_mapping() it should be configurable from user space. Hauke _______________________________________________ openwrt-devel mailing list [email protected] https://lists.openwrt.org/mailman/listinfo/openwrt-devel
