Re: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
On 12/27/2016 04:40 PM, maowenan wrote:
> > -Original Message-
> > From: tndave [mailto:tushar.n.d...@oracle.com]
> > Sent: Wednesday, December 28, 2016 6:28 AM
> > To: maowenan; jeffrey.t.kirs...@intel.com; intel-wired-...@lists.osuosl.org
> > Cc: netdev@vger.kernel.org; weiyongjun (A); Dingtianhong
> > Subject: Re: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
> >
> > On 12/26/2016 03:39 AM, maowenan wrote:
> > > > -Original Message-
> > > > From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
> > > > On Behalf Of Tushar Dave
> > > > Sent: Tuesday, December 06, 2016 1:07 AM
> > > > To: jeffrey.t.kirs...@intel.com; intel-wired-...@lists.osuosl.org
> > > > Cc: netdev@vger.kernel.org
> > > > Subject: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
> > > >
> > > > Unlike previous generation NIC (e.g. ixgbe) i40e doesn't seem to have
> > > > standard CSR where PCIe relaxed ordering can be set. Without PCIe
> > > > relax ordering enabled, i40e performance is significantly low on SPARC.
> > >
> > > [Mao Wenan] Hi Tushar, you have referred to i40e doesn't seem to have
> > > standard CSR to set PCIe relaxed ordering, this CSR like TX DCA Control
> > > Register in 82599, right?
> >
> > Yes.
> > i40e datasheet mentions some CSR that can be used to enable/disable PCIe
> > relaxed ordering in device; however I don't see the exact definition of
> > those registers in datasheet.
> > (https://www.mail-archive.com/netdev@vger.kernel.org/msg117219.html).
> >
> > > Is DMA_ATTR_WEAK_ORDERING the same as TX control register in 82599?
> >
> > No.
> > DMA_ATTR_WEAK_ORDERING applies to the PCIe root complex of the system.
> >
> > -Tushar
>
> I understand that the PCIe Root Complex is the Host Bridge in the CPU that
> connects the CPU and memory to the PCIe architecture. So this attribute
> DMA_ATTR_WEAK_ORDERING is only applied on CPU side (the SPARC in your
> system), it can't apply on i40e, is it right?

Yes.

> And it is not the same as 82599 DCA control register's relax ordering bits.

It is not the same as 82599 DCA control register's relax ordering bits.

-Tushar

> -Mao Wenan
>
> > > And to enable relax ordering mode in 82599 for SPARC using the code below:
> > >
> > > s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
> > > {
> > > 	u32 i;
> > >
> > > 	/* Clear the rate limiters */
> > > 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
> > > 		IXGBE_WRITE_REG(hw, IXGBE_RTTDQSEL, i);
> > > 		IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRC, 0);
> > > 	}
> > > 	IXGBE_WRITE_FLUSH(hw);
> > >
> > > #ifndef CONFIG_SPARC
> > > 	/* Disable relaxed ordering */
> > > 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
> > > 		u32 regval;
> > >
> > > 		regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL_82599(i));
> > > 		regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
> > > 		IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL_82599(i), regval);
> > > 	}
> > >
> > > 	for (i = 0; i < hw->mac.max_rx_queues; i++) {
> > > 		u32 regval;
> > >
> > > 		regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
> > > 		regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
> > > 			    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
> > > 		IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
> > > 	}
> > > #endif
> > > 	return 0;
> > > }
> > >
> > > > This patch sets PCIe relax ordering for SPARC arch by setting dma attr
> > > > DMA_ATTR_WEAK_ORDERING for every tx and rx DMA map/unmap.
> > > > This has shown 10x increase in performance numbers.
> > > >
> > > > e.g.
> > > > iperf TCP test with 10 threads on SPARC S7
> > > >
> > > > Test 1: Without this patch
> > > >
> > > > [root@brm-snt1-03 net]# iperf -s
> > > >
> > > > Server listening on TCP port 5001
> > > > TCP window size: 85.3 KByte (default)
> > > >
> > > > [  4] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40926
> > > > [  5] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40934
> > > > [  6] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40930
> > > > [  7] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40928
> > > > [  8] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40922
> > > > [  9] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40932
> > > > [ 10] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40920
> > > > [ 11] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40924
> > > > [ 14] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40982
> > > > [ 12] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40980
> > > > [ ID] Interval       Transfer     Bandwidth
> > > > [  4]  0.0-20.0 sec   566 MBytes   237 Mbits/sec
> > > > [  5]  0.0-20.0 sec   532 MBytes   223 Mbits/sec
> > > > [  6]  0.0-20.0 sec   537 MBytes   225 Mbits/sec
> > > > [  8]  0.0-20.0 sec   546 MBytes   229 Mbits/sec
> > > > [ 11]  0.0-20.0 sec   592 MBytes   248 Mbits/sec
> > > > [  7]  0.0-20.0 sec   539 MBytes   226 Mbits/sec
> > > > [  9]  0.0-20.0 sec   572 MBytes   240 Mbits/sec
> > > > [ 10]  0.0-20.0 sec   604 MBytes   253 Mbits/sec
> > > > [ 14]  0.0-20.0 sec   567 MBytes   238 Mbits/sec
> > > > [ 12]  0.0-20.0 sec   511 MBytes   214 Mbits/sec
> > > > [SUM]  0.0-20.0 sec  5.44 GBytes  2.33 Gbits/sec
> > > >
> > > > Test 2: with this patch:
> > > >
> > > > [root@brm-snt1-03 net]# iperf -s
> > > >
> > > > Server listening on TCP port 5001
> > > > TCP window
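As a cross-check on the Test 1 figures quoted in this thread, the ten per-stream rates should add up to the [SUM] line. Below is a small sketch in C; the array and function names are made up for illustration, and the numbers are copied from the iperf listing.

```c
#include <assert.h>
#include <stddef.h>

/* Per-stream bandwidth in Mbits/sec, copied from the Test 1 listing
 * (without the patch). The identifier is illustrative only. */
const unsigned int test1_mbits[10] = {
	237, 223, 225, 229, 248, 226, 240, 253, 238, 214
};

/* Add up n per-stream rates and return the aggregate in Mbits/sec. */
unsigned int sum_mbits(const unsigned int *rates, size_t n)
{
	unsigned int total = 0;
	size_t i;

	assert(rates != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += rates[i];
	return total;
}
```

Calling sum_mbits(test1_mbits, 10) gives 2333 Mbits/sec, which agrees with the 2.33 Gbits/sec aggregate reported on the [SUM] line.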
RE: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
> -Original Message-
> From: tndave [mailto:tushar.n.d...@oracle.com]
> Sent: Wednesday, December 28, 2016 6:28 AM
> To: maowenan; jeffrey.t.kirs...@intel.com; intel-wired-...@lists.osuosl.org
> Cc: netdev@vger.kernel.org; weiyongjun (A); Dingtianhong
> Subject: Re: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
>
> On 12/26/2016 03:39 AM, maowenan wrote:
> >> -Original Message-
> >> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
> >> On Behalf Of Tushar Dave
> >> Sent: Tuesday, December 06, 2016 1:07 AM
> >> To: jeffrey.t.kirs...@intel.com; intel-wired-...@lists.osuosl.org
> >> Cc: netdev@vger.kernel.org
> >> Subject: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
> >>
> >> Unlike previous generation NIC (e.g. ixgbe) i40e doesn't seem to have
> >> standard CSR where PCIe relaxed ordering can be set. Without PCIe
> >> relax ordering enabled, i40e performance is significantly low on SPARC.
> >>
> > [Mao Wenan] Hi Tushar, you have referred to i40e doesn't seem to have
> > standard CSR to set PCIe relaxed ordering, this CSR like TX DCA Control
> > Register in 82599, right?
> Yes.
> i40e datasheet mentions some CSR that can be used to enable/disable PCIe
> relaxed ordering in device; however I don't see the exact definition of
> those registers in datasheet.
> (https://www.mail-archive.com/netdev@vger.kernel.org/msg117219.html).
>
> > Is DMA_ATTR_WEAK_ORDERING the same as TX control register in 82599?
> No.
> DMA_ATTR_WEAK_ORDERING applies to the PCIe root complex of the system.
>
> -Tushar

I understand that the PCIe Root Complex is the Host Bridge in the CPU that
connects the CPU and memory to the PCIe architecture. So this attribute
DMA_ATTR_WEAK_ORDERING is only applied on CPU side (the SPARC in your
system), it can't apply on i40e, is it right?
And it is not the same as 82599 DCA control register's relax ordering bits.

-Mao Wenan

> > And to enable relax ordering mode in 82599 for SPARC using the code below:
> >
> > s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
> > {
> > 	u32 i;
> >
> > 	/* Clear the rate limiters */
> > 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
> > 		IXGBE_WRITE_REG(hw, IXGBE_RTTDQSEL, i);
> > 		IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRC, 0);
> > 	}
> > 	IXGBE_WRITE_FLUSH(hw);
> >
> > #ifndef CONFIG_SPARC
> > 	/* Disable relaxed ordering */
> > 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
> > 		u32 regval;
> >
> > 		regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL_82599(i));
> > 		regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
> > 		IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL_82599(i), regval);
> > 	}
> >
> > 	for (i = 0; i < hw->mac.max_rx_queues; i++) {
> > 		u32 regval;
> >
> > 		regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
> > 		regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
> > 			    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
> > 		IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
> > 	}
> > #endif
> > 	return 0;
> > }
> >
> >> This patch sets PCIe relax ordering for SPARC arch by setting dma
> >> attr DMA_ATTR_WEAK_ORDERING for every tx and rx DMA map/unmap.
> >> This has shown 10x increase in performance numbers.
> >>
> >> e.g.
> >> iperf TCP test with 10 threads on SPARC S7
> >>
> >> Test 1: Without this patch
> >>
> >> [root@brm-snt1-03 net]# iperf -s
> >>
> >> Server listening on TCP port 5001
> >> TCP window size: 85.3 KByte (default)
> >>
> >> [  4] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40926
> >> [  5] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40934
> >> [  6] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40930
> >> [  7] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40928
> >> [  8] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40922
> >> [  9] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40932
> >> [ 10] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40920
> >> [ 11] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40924
> >> [ 14] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40982
> >> [ 12] local
Re: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
On 12/26/2016 03:39 AM, maowenan wrote:
> > -Original Message-
> > From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
> > On Behalf Of Tushar Dave
> > Sent: Tuesday, December 06, 2016 1:07 AM
> > To: jeffrey.t.kirs...@intel.com; intel-wired-...@lists.osuosl.org
> > Cc: netdev@vger.kernel.org
> > Subject: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
> >
> > Unlike previous generation NIC (e.g. ixgbe) i40e doesn't seem to have
> > standard CSR where PCIe relaxed ordering can be set. Without PCIe
> > relax ordering enabled, i40e performance is significantly low on SPARC.
>
> [Mao Wenan] Hi Tushar, you have referred to i40e doesn't seem to have
> standard CSR to set PCIe relaxed ordering, this CSR like TX DCA Control
> Register in 82599, right?

Yes.
i40e datasheet mentions some CSR that can be used to enable/disable PCIe
relaxed ordering in device; however I don't see the exact definition of
those registers in datasheet.
(https://www.mail-archive.com/netdev@vger.kernel.org/msg117219.html).

> Is DMA_ATTR_WEAK_ORDERING the same as TX control register in 82599?

No.
DMA_ATTR_WEAK_ORDERING applies to the PCIe root complex of the system.

-Tushar

> And to enable relax ordering mode in 82599 for SPARC using the code below:
>
> s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
> {
> 	u32 i;
>
> 	/* Clear the rate limiters */
> 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
> 		IXGBE_WRITE_REG(hw, IXGBE_RTTDQSEL, i);
> 		IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRC, 0);
> 	}
> 	IXGBE_WRITE_FLUSH(hw);
>
> #ifndef CONFIG_SPARC
> 	/* Disable relaxed ordering */
> 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
> 		u32 regval;
>
> 		regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL_82599(i));
> 		regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
> 		IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL_82599(i), regval);
> 	}
>
> 	for (i = 0; i < hw->mac.max_rx_queues; i++) {
> 		u32 regval;
>
> 		regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
> 		regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
> 			    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
> 		IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
> 	}
> #endif
> 	return 0;
> }
>
> > This patch sets PCIe relax ordering for SPARC arch by setting dma attr
> > DMA_ATTR_WEAK_ORDERING for every tx and rx DMA map/unmap.
> > This has shown 10x increase in performance numbers.
> >
> > e.g.
> > iperf TCP test with 10 threads on SPARC S7
> >
> > Test 1: Without this patch
> >
> > [root@brm-snt1-03 net]# iperf -s
> >
> > Server listening on TCP port 5001
> > TCP window size: 85.3 KByte (default)
> >
> > [  4] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40926
> > [  5] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40934
> > [  6] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40930
> > [  7] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40928
> > [  8] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40922
> > [  9] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40932
> > [ 10] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40920
> > [ 11] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40924
> > [ 14] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40982
> > [ 12] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40980
> > [ ID] Interval       Transfer     Bandwidth
> > [  4]  0.0-20.0 sec   566 MBytes   237 Mbits/sec
> > [  5]  0.0-20.0 sec   532 MBytes   223 Mbits/sec
> > [  6]  0.0-20.0 sec   537 MBytes   225 Mbits/sec
> > [  8]  0.0-20.0 sec   546 MBytes   229 Mbits/sec
> > [ 11]  0.0-20.0 sec   592 MBytes   248 Mbits/sec
> > [  7]  0.0-20.0 sec   539 MBytes   226 Mbits/sec
> > [  9]  0.0-20.0 sec   572 MBytes   240 Mbits/sec
> > [ 10]  0.0-20.0 sec   604 MBytes   253 Mbits/sec
> > [ 14]  0.0-20.0 sec   567 MBytes   238 Mbits/sec
> > [ 12]  0.0-20.0 sec   511 MBytes   214 Mbits/sec
> > [SUM]  0.0-20.0 sec  5.44 GBytes  2.33 Gbits/sec
> >
> > Test 2: with this patch:
> >
> > [root@brm-snt1-03 net]# iperf -s
> >
> > Server listening on TCP port 5001
> > TCP window size: 85.3 KByte (default)
> >
> > TCP: request_sock_TCP: Possible SYN flooding on port 5001. Sending cookies.
> > Check SNMP counters.
> > [  4] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46876
> > [  5] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46874
> > [  6] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46872
> > [  7] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46880
> > [  8] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46878
> > [  9] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46884
> > [ 10] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46886
> > [ 11] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46890
> > [ 12] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46888
> > [ 13] local 16.0.0.7 port 5001 connected with 16.0.0.1
RE: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
> -Original Message-
> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
> On Behalf Of Tushar Dave
> Sent: Tuesday, December 06, 2016 1:07 AM
> To: jeffrey.t.kirs...@intel.com; intel-wired-...@lists.osuosl.org
> Cc: netdev@vger.kernel.org
> Subject: [RFC PATCH] i40e: enable PCIe relax ordering for SPARC
>
> Unlike previous generation NIC (e.g. ixgbe) i40e doesn't seem to have standard
> CSR where PCIe relaxed ordering can be set. Without PCIe relax ordering
> enabled, i40e performance is significantly low on SPARC.
>
[Mao Wenan] Hi Tushar, you have referred to i40e doesn't seem to have standard
CSR to set PCIe relaxed ordering, this CSR like TX DCA Control Register in
82599, right?
Is DMA_ATTR_WEAK_ORDERING the same as TX control register in 82599?

And to enable relax ordering mode in 82599 for SPARC using the code below:

s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
{
	u32 i;

	/* Clear the rate limiters */
	for (i = 0; i < hw->mac.max_tx_queues; i++) {
		IXGBE_WRITE_REG(hw, IXGBE_RTTDQSEL, i);
		IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRC, 0);
	}
	IXGBE_WRITE_FLUSH(hw);

#ifndef CONFIG_SPARC
	/* Disable relaxed ordering */
	for (i = 0; i < hw->mac.max_tx_queues; i++) {
		u32 regval;

		regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL_82599(i));
		regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
		IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL_82599(i), regval);
	}

	for (i = 0; i < hw->mac.max_rx_queues; i++) {
		u32 regval;

		regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
		regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
			    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
		IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
	}
#endif
	return 0;
}

> This patch sets PCIe relax ordering for SPARC arch by setting dma attr
> DMA_ATTR_WEAK_ORDERING for every tx and rx DMA map/unmap.
> This has shown 10x increase in performance numbers.
>
> e.g.
> iperf TCP test with 10 threads on SPARC S7
>
> Test 1: Without this patch
>
> [root@brm-snt1-03 net]# iperf -s
>
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
>
> [  4] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40926
> [  5] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40934
> [  6] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40930
> [  7] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40928
> [  8] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40922
> [  9] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40932
> [ 10] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40920
> [ 11] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40924
> [ 14] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40982
> [ 12] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 40980
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-20.0 sec   566 MBytes   237 Mbits/sec
> [  5]  0.0-20.0 sec   532 MBytes   223 Mbits/sec
> [  6]  0.0-20.0 sec   537 MBytes   225 Mbits/sec
> [  8]  0.0-20.0 sec   546 MBytes   229 Mbits/sec
> [ 11]  0.0-20.0 sec   592 MBytes   248 Mbits/sec
> [  7]  0.0-20.0 sec   539 MBytes   226 Mbits/sec
> [  9]  0.0-20.0 sec   572 MBytes   240 Mbits/sec
> [ 10]  0.0-20.0 sec   604 MBytes   253 Mbits/sec
> [ 14]  0.0-20.0 sec   567 MBytes   238 Mbits/sec
> [ 12]  0.0-20.0 sec   511 MBytes   214 Mbits/sec
> [SUM]  0.0-20.0 sec  5.44 GBytes  2.33 Gbits/sec
>
> Test 2: with this patch:
>
> [root@brm-snt1-03 net]# iperf -s
>
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
>
> TCP: request_sock_TCP: Possible SYN flooding on port 5001. Sending cookies.
> Check SNMP counters.
> [  4] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46876
> [  5] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46874
> [  6] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46872
> [  7] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46880
> [  8] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46878
> [  9] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46884
> [ 10] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46886
> [ 11] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46890
> [ 12] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46888
> [ 13] local 16.0.0.7 port 5001 connected with 16.0.0.1 port 46882
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-20.0 sec  7.45 GBytes  3.19 Gbits/sec
> [  5]  0.0-20.0 sec  7.48 GBytes  3.21 Gbits/sec
> [  7]  0.0-20.0 sec  7.34 GBytes  3.15 Gbits/sec
> [  8]  0.0-20.0 sec  7.42
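The ixgbe snippet quoted in this thread does a per-queue read-modify-write that clears the relaxed-ordering enable bits, and compiles that step out on SPARC. The following is a minimal user-space sketch of that pattern only, not driver code. The DEMO_* bit positions, the function names, and the is_sparc flag (standing in for the CONFIG_SPARC compile-time check) are all assumptions made for illustration; the real definitions live in the ixgbe headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define DEMO_TXCTRL_DESC_WRO_EN	(UINT32_C(1) << 11)
#define DEMO_RXCTRL_DATA_WRO_EN	(UINT32_C(1) << 13)
#define DEMO_RXCTRL_HEAD_WRO_EN	(UINT32_C(1) << 15)

/* Clear the bits in 'mask' in each per-queue control word, leaving
 * every other bit untouched (read-modify-write). */
void demo_clear_bits(uint32_t *regs, size_t nqueues, uint32_t mask)
{
	size_t i;

	assert(regs != NULL || nqueues == 0);
	for (i = 0; i < nqueues; i++)
		regs[i] &= ~mask;
}

/* Mirror of the quoted logic: leave relaxed ordering enabled on SPARC,
 * disable it everywhere else. The arrays model the per-queue TX and RX
 * control registers. */
void demo_start_hw(uint32_t *tx, size_t ntx,
		   uint32_t *rx, size_t nrx, int is_sparc)
{
	if (is_sparc)
		return;

	demo_clear_bits(tx, ntx, DEMO_TXCTRL_DESC_WRO_EN);
	demo_clear_bits(rx, nrx,
			DEMO_RXCTRL_DATA_WRO_EN | DEMO_RXCTRL_HEAD_WRO_EN);
}
```

With is_sparc set to 0 the write-relaxed-ordering bits are cleared in every queue's word; with is_sparc non-zero the words are left as they were. In both cases this only models the device-side ixgbe bits, which the thread distinguishes from DMA_ATTR_WEAK_ORDERING on the root complex side.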