Re: RFC on writel and writel_relaxed
On 2018-03-28 02:14, Linus Torvalds wrote: On Tue, Mar 27, 2018 at 5:24 PM, Sinan Kayawrote: Basically changing it to dma_buffer->foo = 1;/* WB */ wmb() writel_relaxed(KICK, DMA_KICK_REGISTER);/* UC */ mmiowb() Why? Why not just remove the wmb(), and keep the barrier in the writel()? Yes, we want to get there indeed. It is because of some arch not implementing writel properly. Maintainers want to play safe. That is why I asked if IA64 and other well known archs follow the strongly ordered rule at this moment like PPC and ARM. Or should we go and inform every arch about this before yanking wmb()? Maintainers are afraid of introducing a regression. The above code makes no sense, and just looks stupid to me. It also generates pointlessly bad code on x86, so it's bad there too. Linus
Re: [PATCH v4 17/17] net: ena: Eliminate duplicate barriers on weakly-ordered archs
On 2018-03-25 08:06, Belgazal, Netanel wrote: I think you should either add a parameter to ena_com_write_sq_doorbell() or add ena_com_write_sq_doorbell_rel(). Right now, you have unused function. That is true. I got rid of ena_com_write_sq_doorbell_rel. On 3/20/18, 4:43 AM, "Sinan Kaya"wrote: Code includes barrier() followed by writel(). writel() already has a barrier on some architectures like arm64. This ends up CPU observing two barriers back to back before executing the register write. Create a new wrapper function with relaxed write operator. Use the new wrapper when a write is following a barrier(). Since code already has an explicit barrier call, changing writel() to writel_relaxed(). Signed-off-by: Sinan Kaya --- drivers/net/ethernet/amazon/ena/ena_com.c | 6 -- drivers/net/ethernet/amazon/ena/ena_eth_com.h | 22 -- drivers/net/ethernet/amazon/ena/ena_netdev.c | 4 ++-- 3 files changed, 26 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c index bf2de52..b6e628f 100644 --- a/drivers/net/ethernet/amazon/ena/ena_com.c +++ b/drivers/net/ethernet/amazon/ena/ena_com.c @@ -631,7 +631,8 @@ static u32 ena_com_reg_bar_read32(struct ena_com_dev *ena_dev, u16 offset) */ wmb(); - writel(mmio_read_reg, ena_dev->reg_bar + ENA_REGS_MMIO_REG_READ_OFF); + writel_relaxed(mmio_read_reg, + ena_dev->reg_bar + ENA_REGS_MMIO_REG_READ_OFF); for (i = 0; i < timeout; i++) { if (read_resp->req_id == mmio_read->seq_num) @@ -1826,7 +1827,8 @@ void ena_com_aenq_intr_handler(struct ena_com_dev *dev, void *data) /* write the aenq doorbell after all AENQ descriptors were read */ mb(); - writel((u32)aenq->head, dev->reg_bar + ENA_REGS_AENQ_HEAD_DB_OFF); + writel_relaxed((u32)aenq->head, + dev->reg_bar + ENA_REGS_AENQ_HEAD_DB_OFF); } int ena_com_dev_reset(struct ena_com_dev *ena_dev, diff --git a/drivers/net/ethernet/amazon/ena/ena_eth_com.h b/drivers/net/ethernet/amazon/ena/ena_eth_com.h index 2f76572..09ef7cd 100644 --- a/drivers/net/ethernet/amazon/ena/ena_eth_com.h +++ b/drivers/net/ethernet/amazon/ena/ena_eth_com.h @@ -107,7 +107,8 @@ static inline int ena_com_sq_empty_space(struct ena_com_io_sq *io_sq) return io_sq->q_depth - 1 - cnt; } -static inline int ena_com_write_sq_doorbell(struct ena_com_io_sq *io_sq) +static inline int ena_com_write_sq_doorbell(struct ena_com_io_sq *io_sq, + bool relaxed) { u16 tail; @@ -116,7 +117,24 @@ static inline int ena_com_write_sq_doorbell(struct ena_com_io_sq *io_sq) pr_debug("write submission queue doorbell for queue: %d tail: %d\n", io_sq->qid, tail); - writel(tail, io_sq->db_addr); + if (relaxed) + writel_relaxed(tail, io_sq->db_addr); + else + writel(tail, io_sq->db_addr); + + return 0; +} + +static inline int ena_com_write_sq_doorbell_rel(struct ena_com_io_sq *io_sq) +{ + u16 tail; + + tail = io_sq->tail; + + pr_debug("write submission queue doorbell for queue: %d tail: %d\n", +io_sq->qid, tail); + + writel_relaxed(tail, io_sq->db_addr); return 0; } diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 6975150..0530201 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -556,7 +556,7 @@ static int ena_refill_rx_bufs(struct ena_ring *rx_ring, u32 num) * issue a doorbell */ wmb(); - ena_com_write_sq_doorbell(rx_ring->ena_com_io_sq); + ena_com_write_sq_doorbell(rx_ring->ena_com_io_sq, true); } rx_ring->next_to_use = next_to_use; @@ -2151,7 +2151,7 @@ static netdev_tx_t ena_start_xmit(struct sk_buff *skb, struct net_device *dev) if (netif_xmit_stopped(txq) || !skb->xmit_more) { /* trigger the dma engine */ - ena_com_write_sq_doorbell(tx_ring->ena_com_io_sq); + ena_com_write_sq_doorbell(tx_ring->ena_com_io_sq, false); u64_stats_update_begin(_ring->syncp); tx_ring->tx_stats.doorbells++; u64_stats_update_end(_ring->syncp); -- 2.7.4
Re: [PATCH v7 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
On 2018-03-23 19:58, Jeff Kirsher wrote: On Fri, 2018-03-23 at 14:53 -0700, Alexander Duyck wrote: On Fri, Mar 23, 2018 at 11:52 AM, Sinan Kayawrote: > Code includes wmb() followed by writel() in multiple places. writel() > already has a barrier on some architectures like arm64. > > This ends up CPU observing two barriers back to back before executing > the > register write. > > Since code already has an explicit barrier call, changing writel() to > writel_relaxed(). > > I did a regex search for wmb() followed by writel() in each drivers > directory. > I scrubbed the ones I care about in this series. > > I considered "ease of change", "popular usage" and "performance > critical > path" as the determining criteria for my filtering. > > We used relaxed API heavily on ARM for a long time but > it did not exist on other architectures. For this reason, relaxed > architectures have been paying double penalty in order to use the > common > drivers. > > Now that relaxed API is present on all architectures, we can go and > scrub > all drivers to see what needs to change and what can remain. > > We start with mostly used ones and hope to increase the coverage over > time. > It will take a while to cover all drivers. > > Feel free to apply patches individually. I looked over the set and they seem good. Reviewed-by: Alexander Duyck Grrr, patch 1 does not apply cleanly to my next-queue tree (dev-queue branch). I will deal with this series in a day or two, after I have dealt with my driver pull requests. Sorry, you will have to replace the ones you took from me. > > Changes since v6: > clean up between 2..6 and then make your Alex's changes on 1 and 7 > The mmiowb shouldn't be needed for Rx. Only one CPU will be running > NAPI for the queue and we will synchronize this with a full writel > anyway when we re-enable the interrupts. > > Sinan Kaya (7): > i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs > ixgbe: eliminate duplicate barriers on weakly-ordered archs > igbvf: eliminate duplicate barriers on weakly-ordered archs > igb: eliminate duplicate barriers on weakly-ordered archs > fm10k: Eliminate duplicate barriers on weakly-ordered archs > ixgbevf: keep writel() closer to wmb() > ixgbevf: eliminate duplicate barriers on weakly-ordered archs > > drivers/net/ethernet/intel/fm10k/fm10k_main.c | 4 ++-- > drivers/net/ethernet/intel/i40e/i40e_txrx.c | 14 ++ > drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 4 ++-- > drivers/net/ethernet/intel/igb/igb_main.c | 4 ++-- > drivers/net/ethernet/intel/igbvf/netdev.c | 4 ++-- > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 8 > drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 - > drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 11 --- > 8 files changed, 30 insertions(+), 24 deletions(-) > > -- > 2.7.4 >
Re: [PATCH v4 12/17] net: cxgb4/cxgb4vf: Eliminate duplicate barriers on weakly-ordered archs
On 2018-03-21 19:03, Casey Leedom wrote: [[ Appologies for the DUPLICATE email. I forgot to tell my Mail Agent to use Plain Text. -- Casey ]] I feel very uncomfortable with these proposed changes. Our team is right in the middle of trying to tease our way through the various platform implementations of writel(), writel_relaxed(), __raw_writel(), etc. in order to support x86, PowerPC, ARM, etc. with a single code base. This is complicated by the somewhat ... "fuzzily defined" semantics and varying platform implementations of all of these APIs. (And note that I'm just picking writel() as an example.) Additionally, many of the changes aren't even in fast paths and are thus unneeded for performance. Please don't make these changes. We're trying to get this all sussed out. I was also given the feedback to look at performance critical path only. I am in the process of revisiting the patches. If you can point me to the ones that are important, I can try to limit the changes to those only. If your team wants to do it, I can drop this patch as well. I think the semantics of write API is clear. What was actually implemented is another story. I can share a few of my findings. A portable driver needs to do this. descriptor update in mem wmb () writel_relaxed () mmiowb () Using __raw_write() is wrong as it can get reordered. Using wmb()+writel() is also wrong for performance reasons. If something is unclear, please ask.
Re: [PATCH REPOST v4 5/7] ixgbevf: keep writel() closer to wmb()
On 2018-03-21 17:54, David Miller wrote: From: Jeff KirsherDate: Wed, 21 Mar 2018 14:48:08 -0700 On Wed, 2018-03-21 at 14:56 -0400, Sinan Kaya wrote: Remove ixgbevf_write_tail() in favor of moving writel() close to wmb(). Signed-off-by: Sinan Kaya Reviewed-by: Alexander Duyck --- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 - drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++-- 2 files changed, 2 insertions(+), 7 deletions(-) This patch fails to compile because there is a call to ixgbevf_write_tail() which you missed cleaning up. For a change with delicate side effects, it doesn't create much confidence if the code does not even compile. Sinan, please put more care into the changes you are making. I think the issue is the tree that code is getting tested has undelivered code as Alex mentioned. I was using linux-next 4.16 rc4 for testing. I will rebase to Jeff's tree. Thank you.
Re: [PATCH REPOST v4 5/7] ixgbevf: keep writel() closer to wmb()
On 2018-03-21 17:48, Jeff Kirsher wrote: On Wed, 2018-03-21 at 14:56 -0400, Sinan Kaya wrote: Remove ixgbevf_write_tail() in favor of moving writel() close to wmb(). Signed-off-by: Sinan KayaReviewed-by: Alexander Duyck --- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 - drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++-- 2 files changed, 2 insertions(+), 7 deletions(-) This patch fails to compile because there is a call to ixgbevf_write_tail() which you missed cleaning up. Hah, I did a compile test but maybe I missed something. I will get v6 of this patch only and leave the rest of the series as it is.
Re: [PATCH 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs
On 2018-03-14 01:08, Timur Tabi wrote: On 3/13/18 10:20 PM, Sinan Kaya wrote: +/* Assumes caller has executed a write barrier to order memory and device + * requests. + */ static inline void ixgbevf_write_tail(struct ixgbevf_ring *ring, u32 value) { - writel(value, ring->tail); + writel_relaxed(value, ring->tail); } Why not put the wmb() in this function, or just get rid of the wmb() in the rest of the file and keep this as writel? That way, you can avoid the comment and the risk that comes with it. Sure, both solutions will work. I want to see what the maintainer prefers. I can repost accordingly.
Re: [PATCH 3/7] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs
On 2018-03-14 00:12, Jason Gunthorpe wrote: On Tue, Mar 13, 2018 at 11:20:24PM -0400, Sinan Kaya wrote: Code includes wmb() followed by writel() in multiple places. writel() already has a barrier on some architectures like arm64. This ends up CPU observing two barriers back to back before executing the register write. Since code already has an explicit barrier call, changing writel() to writel_relaxed(). Signed-off-by: Sinan Kayadrivers/infiniband/hw/qedr/verbs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) Sure matches my understanding of writel_relaxed This is part of a series, should we take just this patch through the rdma tree? If not: Acked-by: Jason Gunthorpe Feel free to take pieces. Thanks, Jason
Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
On 2016-04-18 08:12, Christoph Hellwig wrote: On Sat, Apr 16, 2016 at 06:23:32PM -0400, Sinan Kaya wrote: Current code is assuming that the address returned by dma_alloc_coherent is a logical address. This is not true on ARM/ARM64 systems. Can you explain what you mean with a 'logical address' and what actual issue you're trying to solve? Vmap call is failing on arm64 systems because dma alloc api already returns an address mapped with vmap. Please see arch/arm64/mm directory.
Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
On 2016-04-18 00:00, David Miller wrote: From: Sinan KayaDate: Sat, 16 Apr 2016 18:23:32 -0400 Current code is assuming that the address returned by dma_alloc_coherent is a logical address. This is not true on ARM/ARM64 systems. This patch replaces dma_alloc_coherent with dma_map_page API. The address returned can later by virtually mapped from the CPU side with vmap API. Signed-off-by: Sinan Kaya You can't do this. The DMA map page API gives non-coherent mappings, and thus requires proper flushing. So a straight conversion like this is never legitimate. I would agree on proper dma api usage. However, the code is already assuming coherent architecture by mapping the cpu pages as page_kernel. Dma_map_page returns cached buffers and you don't need cache flushes on coherent architecture to make the data visible.