Re: [PATCH v3 0/5] powerpc: apm82181: adding customer devices

2021-07-23 Thread Christian Lamparter

Hi Andy!

On 23/07/2021 21:19, Andy Shevchenko wrote:

On Sun, Sep 06, 2020 at 12:06:10AM +0200, Christian Lamparter wrote:

I've been holding on to these devices' dts files for a while now.
But ever since the recent purge of the PPC405, I'm feeling
the urge to move forward.

The devices in question have been running with OpenWrt since
around 2016/2017. Back then it was Linux v4.4 and required
many out-of-tree patches (for WIFI, SATA, CRYPTO...) that
have since been integrated. So there's nothing else in the
way, I think.

A patch that adds the Meraki vendor-prefix has been sent
separately, as there's also the Meraki MR32 that I'm working
on as well. Here's the link to the patch:


Now, I've looked around in the arch/powerpc for recent .dts
and device submissions to get an understanding of what is
required.
From the looks of it, it seems like every device gets a
skeleton defconfig and a CONFIG_$DEVICE symbol (Like:
CONFIG_MERAKI_MR24, CONFIG_WD_MYBOOKLIVE).

Will this be the case? Or would it make sense to further
unite the Bluestone, MR24 and MBL under a common CONFIG_APM82181
and integrate the BLUESTONE device's defconfig into it as well?
(I've stumbled across the special machine compatible
handling of ppc in the Documentation/devicetree/usage-model.rst
already.)


I haven't found any trace of this having been applied. What is the status of this
patch series? And what is the general state of affairs for the PPC44x?



My best guess is: It's complicated. There was a recent big
UPSET EVENT regarding the My Book Live (MBL) that affected "hundreds"
and "thousands": "An unpleasant surprise for My Book Live owners".
Sadly, this wasn't getting any traction.

I can tell that the mentioned Cisco Meraki MR32 (Broadcom ARM SoC)
got merged. So this is off the plate.

But APM821xx sadly went nowhere. One reason is that I haven't
yet posted a v4, v5 and so on...

In theory, for v4 I would have liked to know how to handle the
Kconfig aspect of the series: Would it be "OK" to have a
single CONFIG_APM82181/CONFIG_APM821XX symbol, or should there
be a CONFIG_MBL and a CONFIG_MR24 (plus CONFIG_WNDR4700 and
CONFIG_MX60W in the future)?

As for the MBL: Well, if you (or anyone else) are interested in
having a more up-to-date Debian, then I have something:

A while back, I made a "build.sh". It builds an
"out-of-the-box" Debian unstable/SID powerpc system image.
This includes sensible NAS defaults + programs as well as
a Cockpit Web-GUI. It also makes it easy to do
DTB development on the latest vanilla kernel (5.14-rc2 at
the time of writing) for the
MyBook Live Single and Duo:



I can't really make one for the MR24 though. Its 32MiB NAND
makes it difficult to install anything other than OpenWrt
(and get some use out of the device).

So, how to proceed?

Cheers,
Christian

PS.: As for PPC44x health regarding APM82181: It works!

This is with a My Book Live (MBL) and the 5.14.0-rc2(+) kernel.

[0.00] printk: bootconsole [udbg0] enabled
[0.00] Activating Kernel Userspace Execution Prevention
[0.00] Linux version 5.14.0-rc2+ (root@debian64) (powerpc-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 Fri Jul 23 22:59:56 CEST 2021
[0.00] Found initrd at 0xcf00:0xcfe73b70
[0.00] Using PowerPC 44x Platform machine description
[0.00] -
[0.00] phys_mem_size = 0x1000
[0.00] dcache_bsize  = 0x20
[0.00] icache_bsize  = 0x20
[0.00] cpu_features  = 0x0100
[0.00]   possible= 0x4100
[0.00]   always  = 0x0100
[0.00] cpu_user_features = 0x8c008000 0x
[0.00] mmu_features  = 0x0008
[0.00] -
[0.00] Top of RAM: 0x1000, Total RAM: 0x1000
[0.00] Memory hole size: 0MB
[0.00] Zone ranges:
[0.00]   Normal   [mem 0x-0x0fff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x-0x0fff]
[0.00] Initmem setup node 0 [mem 0x-0x0fff]
[0.00] MMU: Allocated 1088 bytes of context maps for 255 contexts
[0.00] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[0.00] pcpu-alloc: [0] 0
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 16352
[0.00] Kernel command line: root=UUID=ef4e8942-768b-4d2e-ba57-486397c97081 console=ttyS0,115200
[0.00] Dentry cache hash table entries: 32768 (order: 3, 131072 bytes, linear)
[0.00] Inode-cache hash table entries: 16384 (order: 2, 

[PATCH v4 02/10] net/ps3_gelic: Use local dev variable

2021-07-23 Thread Geoff Levand
In an effort to make the PS3 gelic driver easier to maintain, add a
local variable dev to those routines that use the device structure, so
that the device structure is used more consistently.

Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 340 +++
 1 file changed, 191 insertions(+), 149 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index cb45571573d7..ba008a98928a 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -48,13 +48,15 @@ MODULE_LICENSE("GPL");
 /* set irq_mask */
 int gelic_card_set_irq_mask(struct gelic_card *card, u64 mask)
 {
+   struct device *dev = ctodev(card);
int status;
 
status = lv1_net_set_interrupt_mask(bus_id(card), dev_id(card),
mask, 0);
-   if (status)
-   dev_info(ctodev(card),
-"%s failed %d\n", __func__, status);
+   if (status) {
+   dev_err(dev, "%s:%d failed: %d\n", __func__, __LINE__, status);
+   }
+
return status;
 }
 
@@ -103,6 +105,7 @@ gelic_descr_get_status(struct gelic_descr *descr)
 
 static int gelic_card_set_link_mode(struct gelic_card *card, int mode)
 {
+   struct device *dev = ctodev(card);
int status;
u64 v1, v2;
 
@@ -110,8 +113,8 @@ static int gelic_card_set_link_mode(struct gelic_card *card, int mode)
 GELIC_LV1_SET_NEGOTIATION_MODE,
 GELIC_LV1_PHY_ETHERNET_0, mode, 0, &v1, &v2);
if (status) {
-   pr_info("%s: failed setting negotiation mode %d\n", __func__,
-   status);
+   dev_err(dev, "%s:%d: Failed setting negotiation mode: %d\n",
+   __func__, __LINE__, status);
return -EBUSY;
}
 
@@ -128,13 +131,15 @@ static int gelic_card_set_link_mode(struct gelic_card *card, int mode)
  */
 static void gelic_card_disable_txdmac(struct gelic_card *card)
 {
+   struct device *dev = ctodev(card);
int status;
 
/* this hvc blocks until the DMA in progress really stopped */
status = lv1_net_stop_tx_dma(bus_id(card), dev_id(card));
-   if (status)
-   dev_err(ctodev(card),
-   "lv1_net_stop_tx_dma failed, status=%d\n", status);
+
+   if (status) {
+   dev_err(dev, "lv1_net_stop_tx_dma failed, status=%d\n", status);
+   }
 }
 
 /**
@@ -146,6 +151,7 @@ static void gelic_card_disable_txdmac(struct gelic_card *card)
  */
 static void gelic_card_enable_rxdmac(struct gelic_card *card)
 {
+   struct device *dev = ctodev(card);
int status;
 
 #ifdef DEBUG
@@ -161,9 +167,10 @@ static void gelic_card_enable_rxdmac(struct gelic_card *card)
 #endif
status = lv1_net_start_rx_dma(bus_id(card), dev_id(card),
card->rx_chain.head->link.cpu_addr, 0);
-   if (status)
-   dev_info(ctodev(card),
-"lv1_net_start_rx_dma failed, status=%d\n", status);
+   if (status) {
+   dev_err(dev, "lv1_net_start_rx_dma failed, status=%d\n",
+   status);
+   }
 }
 
 /**
@@ -175,13 +182,15 @@ static void gelic_card_enable_rxdmac(struct gelic_card *card)
  */
 static void gelic_card_disable_rxdmac(struct gelic_card *card)
 {
+   struct device *dev = ctodev(card);
int status;
 
/* this hvc blocks until the DMA in progress really stopped */
status = lv1_net_stop_rx_dma(bus_id(card), dev_id(card));
-   if (status)
-   dev_err(ctodev(card),
-   "lv1_net_stop_rx_dma failed, %d\n", status);
+
+   if (status) {
+   dev_err(dev, "lv1_net_stop_rx_dma failed, %d\n", status);
+   }
 }
 
 /**
@@ -235,10 +244,11 @@ static void gelic_card_reset_chain(struct gelic_card *card,
 
 void gelic_card_up(struct gelic_card *card)
 {
-   pr_debug("%s: called\n", __func__);
+   struct device *dev = ctodev(card);
+
mutex_lock(&card->updown_lock);
if (atomic_inc_return(&card->users) == 1) {
-   pr_debug("%s: real do\n", __func__);
+   dev_dbg(dev, "%s:%d: Starting...\n", __func__, __LINE__);
/* enable irq */
gelic_card_set_irq_mask(card, card->irq_mask);
/* start rx */
@@ -247,16 +257,16 @@ void gelic_card_up(struct gelic_card *card)
napi_enable(&card->napi);
}
mutex_unlock(&card->updown_lock);
-   pr_debug("%s: done\n", __func__);
 }
 
 void gelic_card_down(struct gelic_card *card)
 {
+   struct device *dev = ctodev(card);
u64 mask;
-   pr_debug("%s: called\n", __func__);
+
mutex_lock(&card->updown_lock);
if (atomic_dec_if_positive(&card->users) == 0) {
-   pr_debug("%s: real do\n", __func__);
+   

[PATCH v4 09/10] net/ps3_gelic: Add new routine gelic_work_to_card

2021-07-23 Thread Geoff Levand
Add new helper routine gelic_work_to_card that converts a work_struct
to a gelic_card.

Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index 60fcca5d20dd..42f4de9ad5fe 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -1420,6 +1420,11 @@ static const struct ethtool_ops gelic_ether_ethtool_ops = {
.set_link_ksettings = gelic_ether_set_link_ksettings,
 };
 
+static struct gelic_card *gelic_work_to_card(struct work_struct *work)
+{
+   return container_of(work, struct gelic_card, tx_timeout_task);
+}
+
 /**
  * gelic_net_tx_timeout_task - task scheduled by the watchdog timeout
  * function (to be called not under interrupt status)
@@ -1429,8 +1434,7 @@ static const struct ethtool_ops gelic_ether_ethtool_ops = {
  */
 static void gelic_net_tx_timeout_task(struct work_struct *work)
 {
-   struct gelic_card *card =
-   container_of(work, struct gelic_card, tx_timeout_task);
+   struct gelic_card *card = gelic_work_to_card(work);
struct net_device *netdev = card->netdev[GELIC_PORT_ETHERNET_0];
struct device *dev = ctodev(card);
 
-- 
2.25.1
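
Editorial sketch (not part of the patch): the helper above relies on
container_of to step from an embedded member back to the structure that
contains it. The userspace example below illustrates that idea under
stated assumptions: container_of_sketch, struct work_item and struct card
are hypothetical stand-ins invented for this note, and the macro omits the
type checking the kernel version performs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-in for the kernel's container_of(): subtract the
 * member's offset from the member's address to reach the enclosing
 * object. Hypothetical name; no type checking is done here.
 */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature types, not the driver's definitions. */
struct work_item {
	int pending;
};

struct card {
	int id;
	struct work_item tx_timeout_task;
};

/* Same shape as gelic_work_to_card(): embedded member -> owner. */
struct card *work_to_card(struct work_item *work)
{
	assert(work != NULL);
	return container_of_sketch(work, struct card, tx_timeout_task);
}
```

Given a struct card c, passing &c.tx_timeout_task to work_to_card()
returns &c. Wrapping the conversion in a named helper keeps the member
name in one place, which seems to be the motivation of the patch.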




[PATCH v4 08/10] net/ps3_gelic: Rename no to descr_count

2021-07-23 Thread Geoff Levand
In an effort to make the PS3 gelic driver easier to maintain, rename
the gelic_card_init_chain parameter 'no' to 'descr_count'.

Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index e55aa9fecfeb..60fcca5d20dd 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -325,7 +325,7 @@ static void gelic_card_free_chain(struct gelic_card *card,
  * @card: card structure
  * @chain: address of chain
  * @start_descr: address of descriptor array
- * @no: number of descriptors
+ * @descr_count: number of descriptors
  *
  * we manage a circular list that mirrors the hardware structure,
  * except that the hardware uses bus addresses.
@@ -334,16 +334,16 @@ static void gelic_card_free_chain(struct gelic_card *card,
  */
 static int gelic_card_init_chain(struct gelic_card *card,
struct gelic_descr_chain *chain, struct gelic_descr *start_descr,
-   int no)
+   int descr_count)
 {
int i;
struct gelic_descr *descr;
struct device *dev = ctodev(card);
 
descr = start_descr;
-   memset(descr, 0, sizeof(*descr) * no);
+   memset(descr, 0, sizeof(*descr) *descr_count);
 
-   for (i = 0; i < no; i++, descr++) {
+   for (i = 0; i < descr_count; i++, descr++) {
descr->link.size = sizeof(struct gelic_hw_regs);
gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
descr->link.cpu_addr =
@@ -361,7 +361,7 @@ static int gelic_card_init_chain(struct gelic_card *card,
start_descr->prev = (descr - 1);
 
descr = start_descr;
-   for (i = 0; i < no; i++, descr++) {
+   for (i = 0; i < descr_count; i++, descr++) {
descr->hw_regs.next_descr_addr =
cpu_to_be32(descr->next->link.cpu_addr);
}
-- 
2.25.1




[PATCH v4 07/10] net/ps3_gelic: Add new routine gelic_unmap_link

2021-07-23 Thread Geoff Levand
Put the common code for unmapping a link into its own routine,
gelic_unmap_link, and add some debugging checks.

Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 23 +++-
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index 85fc1915c8be..e55aa9fecfeb 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -288,6 +288,21 @@ void gelic_card_down(struct gelic_card *card)
mutex_unlock(&card->updown_lock);
 }
 
+static void gelic_unmap_link(struct device *dev, struct gelic_descr *descr)
+{
+   BUG_ON_DEBUG(descr->hw_regs.payload.dev_addr);
+   BUG_ON_DEBUG(descr->hw_regs.payload.size);
+
+   BUG_ON_DEBUG(!descr->link.cpu_addr);
+   BUG_ON_DEBUG(!descr->link.size);
+
+   dma_unmap_single(dev, descr->link.cpu_addr, descr->link.size,
+   DMA_BIDIRECTIONAL);
+
+   descr->link.cpu_addr = 0;
+   descr->link.size = 0;
+}
+
 /**
  * gelic_card_free_chain - free descriptor chain
  * @card: card structure
@@ -301,9 +316,7 @@ static void gelic_card_free_chain(struct gelic_card *card,
 
for (descr = descr_in; descr && descr->link.cpu_addr;
descr = descr->next) {
-   dma_unmap_single(dev, descr->link.cpu_addr, descr->link.size,
-   DMA_BIDIRECTIONAL);
-   descr->link.cpu_addr = 0;
+   gelic_unmap_link(dev, descr);
}
 }
 
@@ -364,9 +377,7 @@ static int gelic_card_init_chain(struct gelic_card *card,
 iommu_error:
for (i--, descr--; 0 <= i; i--, descr--)
if (descr->link.cpu_addr)
-   dma_unmap_single(dev, descr->link.cpu_addr,
-descr->link.size,
-DMA_BIDIRECTIONAL);
+   gelic_unmap_link(dev, descr);
return -ENOMEM;
 }
 
-- 
2.25.1




[PATCH v4 03/10] net/ps3_gelic: Format cleanups

2021-07-23 Thread Geoff Levand
In an effort to make the PS3 gelic driver easier to maintain, clean up the
driver source file formatting to be more consistent.

Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 379 ++-
 1 file changed, 193 insertions(+), 186 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index ba008a98928a..ded467d81f36 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -44,8 +44,6 @@ MODULE_AUTHOR("SCE Inc.");
 MODULE_DESCRIPTION("Gelic Network driver");
 MODULE_LICENSE("GPL");
 
-
-/* set irq_mask */
 int gelic_card_set_irq_mask(struct gelic_card *card, u64 mask)
 {
struct device *dev = ctodev(card);
@@ -65,6 +63,7 @@ static void gelic_card_rx_irq_on(struct gelic_card *card)
card->irq_mask |= GELIC_CARD_RXINT;
gelic_card_set_irq_mask(card, card->irq_mask);
 }
+
 static void gelic_card_rx_irq_off(struct gelic_card *card)
 {
card->irq_mask &= ~GELIC_CARD_RXINT;
@@ -72,15 +71,14 @@ static void gelic_card_rx_irq_off(struct gelic_card *card)
 }
 
 static void gelic_card_get_ether_port_status(struct gelic_card *card,
-int inform)
+   int inform)
 {
u64 v2;
struct net_device *ether_netdev;
 
lv1_net_control(bus_id(card), dev_id(card),
-   GELIC_LV1_GET_ETH_PORT_STATUS,
-   GELIC_LV1_VLAN_TX_ETHERNET_0, 0, 0,
-   &card->ether_port_status, &v2);
+   GELIC_LV1_GET_ETH_PORT_STATUS, GELIC_LV1_VLAN_TX_ETHERNET_0, 0,
+   0, &card->ether_port_status, &v2);
 
if (inform) {
ether_netdev = card->netdev[GELIC_PORT_ETHERNET_0];
@@ -100,7 +98,8 @@ static void gelic_card_get_ether_port_status(struct gelic_card *card,
 static enum gelic_descr_dma_status
 gelic_descr_get_status(struct gelic_descr *descr)
 {
-   return be32_to_cpu(descr->hw_regs.dmac_cmd_status) & GELIC_DESCR_DMA_STAT_MASK;
+   return be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
+   GELIC_DESCR_DMA_STAT_MASK;
 }
 
 static int gelic_card_set_link_mode(struct gelic_card *card, int mode)
@@ -110,8 +109,9 @@ static int gelic_card_set_link_mode(struct gelic_card *card, int mode)
u64 v1, v2;
 
status = lv1_net_control(bus_id(card), dev_id(card),
-GELIC_LV1_SET_NEGOTIATION_MODE,
-GELIC_LV1_PHY_ETHERNET_0, mode, 0, &v1, &v2);
+   GELIC_LV1_SET_NEGOTIATION_MODE, GELIC_LV1_PHY_ETHERNET_0, mode,
+   0, &v1, &v2);
+
if (status) {
dev_err(dev, "%s:%d: Failed setting negotiation mode: %d\n",
__func__, __LINE__, status);
@@ -138,7 +138,8 @@ static void gelic_card_disable_txdmac(struct gelic_card *card)
status = lv1_net_stop_tx_dma(bus_id(card), dev_id(card));
 
if (status) {
-   dev_err(dev, "lv1_net_stop_tx_dma failed, status=%d\n", status);
+   dev_err(dev, "%s:%d: lv1_net_stop_tx_dma failed: %d\n",
+   __func__, __LINE__, status);
}
 }
 
@@ -166,10 +167,11 @@ static void gelic_card_enable_rxdmac(struct gelic_card *card)
}
 #endif
status = lv1_net_start_rx_dma(bus_id(card), dev_id(card),
-   card->rx_chain.head->link.cpu_addr, 0);
+   card->rx_chain.head->link.cpu_addr, 0);
+
if (status) {
-   dev_err(dev, "lv1_net_start_rx_dma failed, status=%d\n",
-   status);
+   dev_err(dev, "%s:%d: lv1_net_start_rx_dma failed: %d\n",
+   __func__, __LINE__, status);
}
 }
 
@@ -189,7 +191,8 @@ static void gelic_card_disable_rxdmac(struct gelic_card *card)
status = lv1_net_stop_rx_dma(bus_id(card), dev_id(card));
 
if (status) {
-   dev_err(dev, "lv1_net_stop_rx_dma failed, %d\n", status);
+   dev_err(dev, "%s:%d: lv1_net_stop_rx_dma failed: %d\n",
+   __func__, __LINE__, status);
}
 }
 
@@ -202,11 +205,11 @@ static void gelic_card_disable_rxdmac(struct gelic_card *card)
  * in the status
  */
 static void gelic_descr_set_status(struct gelic_descr *descr,
-  enum gelic_descr_dma_status status)
+   enum gelic_descr_dma_status status)
 {
descr->hw_regs.dmac_cmd_status = cpu_to_be32(status |
-   (be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
-~GELIC_DESCR_DMA_STAT_MASK));
+   (be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
+   ~GELIC_DESCR_DMA_STAT_MASK));
/*
 * dma_cmd_status field is used to indicate whether the descriptor
 * is valid or not.
@@ -226,14 +229,14 @@ static void gelic_descr_set_status(struct gelic_descr *descr,
  * and re-initialize the hardware chain 

[PATCH v4 10/10] net/ps3_gelic: Fix DMA mapping problems

2021-07-23 Thread Geoff Levand
Fixes several DMA mapping problems with the PS3's gelic network driver:

 * Change from checking the return value of dma_map_single to using the
   dma_mapping_error routine.
 * Use the correct buffer length when mapping the RX skb.
 * Improved error checking and debug logging.

Fixes runtime errors like these, and also other randomly occurring errors:

  IP-Config: Complete:
  DMA-API: ps3_gelic_driver sb_05: device driver failed to check map error
  WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:1027 .check_unmap+0x888/0x8dc

Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 183 +++
 1 file changed, 108 insertions(+), 75 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index 42f4de9ad5fe..11ddeacb1159 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -336,22 +336,31 @@ static int gelic_card_init_chain(struct gelic_card *card,
struct gelic_descr_chain *chain, struct gelic_descr *start_descr,
int descr_count)
 {
-   int i;
-   struct gelic_descr *descr;
+   struct gelic_descr *descr = start_descr;
struct device *dev = ctodev(card);
+   unsigned int index;
 
-   descr = start_descr;
-   memset(descr, 0, sizeof(*descr) *descr_count);
+   memset(start_descr, 0, descr_count * sizeof(*start_descr));
 
-   for (i = 0; i < descr_count; i++, descr++) {
-   descr->link.size = sizeof(struct gelic_hw_regs);
+   for (index = 0, descr = start_descr; index < descr_count;
+   index++, descr++) {
gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
-   descr->link.cpu_addr =
-   dma_map_single(dev, descr, descr->link.size,
-   DMA_BIDIRECTIONAL);
 
-   if (!descr->link.cpu_addr)
-   goto iommu_error;
+   descr->link.size = sizeof(struct gelic_hw_regs);
+   descr->link.cpu_addr = dma_map_single(dev, descr,
+   descr->link.size, DMA_BIDIRECTIONAL);
+
+   if (unlikely(dma_mapping_error(dev, descr->link.cpu_addr))) {
+   dev_err(dev, "%s:%d: dma_mapping_error\n", __func__,
+   __LINE__);
+
+   for (index--, descr--; index > 0; index--, descr--) {
+   if (descr->link.cpu_addr) {
+   gelic_unmap_link(dev, descr);
+   }
+   }
+   return -ENOMEM;
+   }
 
descr->next = descr + 1;
descr->prev = descr - 1;
@@ -360,8 +369,9 @@ static int gelic_card_init_chain(struct gelic_card *card,
(descr - 1)->next = start_descr;
start_descr->prev = (descr - 1);
 
-   descr = start_descr;
-   for (i = 0; i < descr_count; i++, descr++) {
+   /* chain bus addr of hw descriptor */
+   for (index = 0, descr = start_descr; index < descr_count;
+   index++, descr++) {
descr->hw_regs.next_descr_addr =
cpu_to_be32(descr->next->link.cpu_addr);
}
@@ -373,12 +383,6 @@ static int gelic_card_init_chain(struct gelic_card *card,
(descr - 1)->hw_regs.next_descr_addr = 0;
 
return 0;
-
-iommu_error:
-   for (i--, descr--; 0 <= i; i--, descr--)
-   if (descr->link.cpu_addr)
-   gelic_unmap_link(dev, descr);
-   return -ENOMEM;
 }
 
 /**
@@ -395,49 +399,63 @@ static int gelic_descr_prepare_rx(struct gelic_card *card,
struct gelic_descr *descr)
 {
struct device *dev = ctodev(card);
-   int offset;
-   unsigned int bufsize;
+   struct aligned_buff {
+   unsigned int total_bytes;
+   unsigned int offset;
+   };
+   struct aligned_buff a_buf;
+   dma_addr_t cpu_addr;
 
if (gelic_descr_get_status(descr) !=  GELIC_DESCR_DMA_NOT_IN_USE) {
dev_err(dev, "%s:%d: ERROR status\n", __func__, __LINE__);
}
 
-   /* we need to round up the buffer size to a multiple of 128 */
-   bufsize = ALIGN(GELIC_NET_MAX_MTU, GELIC_NET_RXBUF_ALIGN);
+   a_buf.total_bytes = ALIGN(GELIC_NET_MAX_MTU, GELIC_NET_RXBUF_ALIGN)
+   + GELIC_NET_RXBUF_ALIGN;
+
+   descr->skb = dev_alloc_skb(a_buf.total_bytes);
 
-   /* and we need to have it 128 byte aligned, therefore we allocate a
-* bit more */
-   descr->skb = dev_alloc_skb(bufsize + GELIC_NET_RXBUF_ALIGN - 1);
if (!descr->skb) {
-   descr->hw_regs.payload.dev_addr = 0; /* tell DMAC don't touch memory */
+   descr->hw_regs.payload.dev_addr = 0;
+   descr->hw_regs.payload.size = 0;
return -ENOMEM;
}
-   descr->hw_regs.payload.size = 
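
Editorial sketch (not part of the patch): the commit message above moves
from testing the mapped address for zero to calling dma_mapping_error, and
the init-chain hunk unwinds earlier mappings when one fails. The userspace
example below illustrates both ideas under stated assumptions. Every name
in it (addr_sketch_t, MAP_ERROR_SKETCH, struct slot, map_slot, map_all) is
hypothetical, and the all-ones sentinel is an assumption chosen for the
illustration. It is one possible shape, not the driver's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t addr_sketch_t;

/* Hypothetical "mapping failed" sentinel (assumed all-ones value). */
#define MAP_ERROR_SKETCH (~(addr_sketch_t)0)

/* Hypothetical stand-in for a descriptor. */
struct slot {
	addr_sketch_t addr;
	int mapped;
};

/*
 * Hypothetical mapper: slot i gets address i * 0x100, so slot 0 maps
 * to address 0, which a "!addr" test would misread as a failure.
 */
static addr_sketch_t map_slot(unsigned int index, unsigned int fail_at)
{
	if (index == fail_at)
		return MAP_ERROR_SKETCH;
	return (addr_sketch_t)index * 0x100;
}

/* Failure is detected by comparing against the sentinel, not zero. */
static int map_failed(addr_sketch_t addr)
{
	return addr == MAP_ERROR_SKETCH;
}

/* Map count slots; on failure undo slots [0, index) and return -1. */
int map_all(struct slot *slots, unsigned int count, unsigned int fail_at)
{
	unsigned int index;

	assert(slots != NULL);

	for (index = 0; index < count; index++) {
		addr_sketch_t addr = map_slot(index, fail_at);

		if (map_failed(addr)) {
			/*
			 * The test uses the value before the decrement,
			 * so the body sees index-1 down to 0 and the loop
			 * ends once that value is 0, even though index is
			 * unsigned.
			 */
			while (index--) {
				slots[index].addr = 0;
				slots[index].mapped = 0;
			}
			return -1;
		}
		slots[index].addr = addr;
		slots[index].mapped = 1;
	}
	return 0;
}
```

With an unsigned counter, the "while (index--)" form visits every slot that
was already mapped, including slot 0, and does nothing at all when the
very first mapping fails.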

[PATCH v4 01/10] net/ps3_gelic: Add gelic_descr structures

2021-07-23 Thread Geoff Levand
In an effort to make the PS3 gelic driver easier to maintain, create two
new structures, struct gelic_hw_regs and struct gelic_chain_link, and
replace the corresponding members of struct gelic_descr with the new
structures.

struct gelic_hw_regs holds the register variables used by the gelic
hardware device.  struct gelic_chain_link holds variables used to manage
the driver's linked list of gelic descr structures.

Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 133 ++-
 drivers/net/ethernet/toshiba/ps3_gelic_net.h |  24 ++--
 2 files changed, 82 insertions(+), 75 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index 55e652624bd7..cb45571573d7 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -98,7 +98,7 @@ static void gelic_card_get_ether_port_status(struct gelic_card *card,
 static enum gelic_descr_dma_status
 gelic_descr_get_status(struct gelic_descr *descr)
 {
-   return be32_to_cpu(descr->dmac_cmd_status) & GELIC_DESCR_DMA_STAT_MASK;
+   return be32_to_cpu(descr->hw_regs.dmac_cmd_status) & GELIC_DESCR_DMA_STAT_MASK;
 }
 
 static int gelic_card_set_link_mode(struct gelic_card *card, int mode)
@@ -154,13 +154,13 @@ static void gelic_card_enable_rxdmac(struct gelic_card *card)
printk(KERN_ERR "%s: status=%x\n", __func__,
   be32_to_cpu(card->rx_chain.head->dmac_cmd_status));
printk(KERN_ERR "%s: nextphy=%x\n", __func__,
-  be32_to_cpu(card->rx_chain.head->next_descr_addr));
+  be32_to_cpu(card->rx_chain.head->hw_regs.next_descr_addr));
printk(KERN_ERR "%s: head=%p\n", __func__,
   card->rx_chain.head);
}
 #endif
status = lv1_net_start_rx_dma(bus_id(card), dev_id(card),
-   card->rx_chain.head->bus_addr, 0);
+   card->rx_chain.head->link.cpu_addr, 0);
if (status)
dev_info(ctodev(card),
 "lv1_net_start_rx_dma failed, status=%d\n", status);
@@ -195,8 +195,8 @@ static void gelic_card_disable_rxdmac(struct gelic_card *card)
 static void gelic_descr_set_status(struct gelic_descr *descr,
   enum gelic_descr_dma_status status)
 {
-   descr->dmac_cmd_status = cpu_to_be32(status |
-   (be32_to_cpu(descr->dmac_cmd_status) &
+   descr->hw_regs.dmac_cmd_status = cpu_to_be32(status |
+   (be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
 ~GELIC_DESCR_DMA_STAT_MASK));
/*
 * dma_cmd_status field is used to indicate whether the descriptor
@@ -224,13 +224,13 @@ static void gelic_card_reset_chain(struct gelic_card *card,
 
for (descr = start_descr; start_descr != descr->next; descr++) {
gelic_descr_set_status(descr, GELIC_DESCR_DMA_CARDOWNED);
-   descr->next_descr_addr = cpu_to_be32(descr->next->bus_addr);
+   descr->hw_regs.next_descr_addr = cpu_to_be32(descr->next->link.cpu_addr);
}
 
chain->head = start_descr;
chain->tail = (descr - 1);
 
-   (descr - 1)->next_descr_addr = 0;
+   (descr - 1)->hw_regs.next_descr_addr = 0;
 }
 
 void gelic_card_up(struct gelic_card *card)
@@ -286,10 +286,10 @@ static void gelic_card_free_chain(struct gelic_card *card,
 {
struct gelic_descr *descr;
 
-   for (descr = descr_in; descr && descr->bus_addr; descr = descr->next) {
-   dma_unmap_single(ctodev(card), descr->bus_addr,
-GELIC_DESCR_SIZE, DMA_BIDIRECTIONAL);
-   descr->bus_addr = 0;
+   for (descr = descr_in; descr && descr->link.cpu_addr; descr = descr->next) {
+   dma_unmap_single(ctodev(card), descr->link.cpu_addr,
+descr->link.size, DMA_BIDIRECTIONAL);
+   descr->link.cpu_addr = 0;
}
 }
 
@@ -317,13 +317,14 @@ static int gelic_card_init_chain(struct gelic_card *card,
 
/* set up the hardware pointers in each descriptor */
for (i = 0; i < no; i++, descr++) {
+   descr->link.size = sizeof(struct gelic_hw_regs);
gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
-   descr->bus_addr =
+   descr->link.cpu_addr =
dma_map_single(ctodev(card), descr,
-  GELIC_DESCR_SIZE,
+  descr->link.size,
   DMA_BIDIRECTIONAL);
 
-   if (!descr->bus_addr)
+   if (!descr->link.cpu_addr)
goto iommu_error;
 
descr->next = descr + 1;
@@ -336,22 +337,22 @@ static int gelic_card_init_chain(struct gelic_card *card,

[PATCH v4 04/10] net/ps3_gelic: Add new macro BUG_ON_DEBUG

2021-07-23 Thread Geoff Levand
Add a new preprocessor macro BUG_ON_DEBUG, that expands to BUG_ON when
the preprocessor macro DEBUG is defined, or to WARN_ON when DEBUG is not
defined.  Also, replace all occurrences of BUG_ON with BUG_ON_DEBUG.

Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index ded467d81f36..946e9bfa071b 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -44,6 +44,13 @@ MODULE_AUTHOR("SCE Inc.");
 MODULE_DESCRIPTION("Gelic Network driver");
 MODULE_LICENSE("GPL");
 
+#define BUG_ON_DEBUG(_cond) do { \
+   if (__is_defined(DEBUG)) \
+   BUG_ON(_cond); \
+   else \
+   WARN_ON(_cond); \
+} while (0)
+
 int gelic_card_set_irq_mask(struct gelic_card *card, u64 mask)
 {
struct device *dev = ctodev(card);
@@ -505,7 +512,7 @@ static void gelic_descr_release_tx(struct gelic_card *card,
struct sk_buff *skb = descr->skb;
struct device *dev = ctodev(card);
 
-   BUG_ON(!(be32_to_cpu(descr->hw_regs.data_status) &
+   BUG_ON_DEBUG(!(be32_to_cpu(descr->hw_regs.data_status) &
GELIC_DESCR_TX_TAIL));
 
dma_unmap_single(dev, be32_to_cpu(descr->hw_regs.payload.dev_addr),
@@ -1667,7 +1674,7 @@ static void gelic_card_get_vlan_info(struct gelic_card *card)
}
 
if (card->vlan[GELIC_PORT_ETHERNET_0].tx) {
-   BUG_ON(!card->vlan[GELIC_PORT_WIRELESS].tx);
+   BUG_ON_DEBUG(!card->vlan[GELIC_PORT_WIRELESS].tx);
card->vlan_required = 1;
} else
card->vlan_required = 0;
@@ -1709,7 +1716,7 @@ static int ps3_gelic_driver_probe(struct ps3_system_bus_device *sb_dev)
if (result) {
dev_err(dev, "%s:%d: ps3_dma_region_create failed: %d\n",
__func__, __LINE__, result);
-   BUG_ON("check region type");
+   BUG_ON_DEBUG("check region type");
goto fail_dma_region;
}
 
-- 
2.25.1




[PATCH v4 00/10] DMA fixes for PS3 gelic network driver

2021-07-23 Thread Geoff Levand
Hi Dave, Jakub,

This set of patches fixes various DMA-related problems in the PS3 gelic
network driver and adds better error checking and improved message logging.

Please consider.

Changes from v3:
  Rebase to latest net-next.
  Split 2 patches into 10 patches.
  Fix checkpatch error.

Changes from v2:
  Rebase to latest net-next.

Changes from v1:
  Split the v1 series into two, one series with powerpc changes, and one series
  with gelic network driver changes.
  
-Geoff

The following changes since commit 94a994d2b2b74420c6fff5100220c2b636317242:

  net: phy: Remove unused including  (2021-07-23 17:54:53 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/geoff/ps3-linux.git for-merge-dma-net-v4

for you to fetch changes up to 7aa1d9b1b4ffadcbdc6f88e4f8d4a323da307595:

  net/ps3_gelic: Fix DMA mapping problems (2021-07-24 13:02:14 -0700)


Geoff Levand (10):
  net/ps3_gelic: Add gelic_descr structures
  net/ps3_gelic: Use local dev variable
  net/ps3_gelic: Format cleanups
  net/ps3_gelic: Add new macro BUG_ON_DEBUG
  net/ps3_gelic: Add vlan_id structure
  net/ps3_gelic: Cleanup debug code
  net/ps3_gelic: Add new routine gelic_unmap_link
  net/ps3_gelic: Rename no to descr_count
  net/ps3_gelic: Add new routine gelic_work_to_card
  net/ps3_gelic: Fix DMA mapping problems

 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 983 +++
 drivers/net/ethernet/toshiba/ps3_gelic_net.h |  24 +-
 2 files changed, 559 insertions(+), 448 deletions(-)

-- 
2.25.1



[PATCH v4 06/10] net/ps3_gelic: Cleanup debug code

2021-07-23 Thread Geoff Levand
In an effort to make the PS3 gelic driver easier to maintain, change the
gelic_card_enable_rxdmac routine to use the optimizer to remove
debug code.

Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index 54e50ad9e629..85fc1915c8be 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -162,17 +162,16 @@ static void gelic_card_enable_rxdmac(struct gelic_card *card)
struct device *dev = ctodev(card);
int status;
 
-#ifdef DEBUG
-   if (gelic_descr_get_status(card->rx_chain.head) !=
-   GELIC_DESCR_DMA_CARDOWNED) {
-   printk(KERN_ERR "%s: status=%x\n", __func__,
-  be32_to_cpu(card->rx_chain.head->dmac_cmd_status));
-   printk(KERN_ERR "%s: nextphy=%x\n", __func__,
-  be32_to_cpu(card->rx_chain.head->hw_regs.next_descr_addr));
-   printk(KERN_ERR "%s: head=%p\n", __func__,
-  card->rx_chain.head);
+   if (__is_defined(DEBUG) && (gelic_descr_get_status(card->rx_chain.head)
+   != GELIC_DESCR_DMA_CARDOWNED)) {
+   dev_err(dev, "%s:%d: status=%x\n", __func__, __LINE__,
+   be32_to_cpu(card->rx_chain.head->hw_regs.dmac_cmd_status));
+   dev_err(dev, "%s:%d: nextphy=%x\n", __func__, __LINE__,
+   be32_to_cpu(card->rx_chain.head->hw_regs.next_descr_addr));
+   dev_err(dev, "%s:%d: head=%px\n", __func__, __LINE__,
+   card->rx_chain.head);
}
-#endif
+
status = lv1_net_start_rx_dma(bus_id(card), dev_id(card),
card->rx_chain.head->link.cpu_addr, 0);
 
-- 
2.25.1
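
Editorial sketch (not part of the patch): the change above replaces an
#ifdef block with an ordinary "if" whose condition is a compile-time
constant, so the compiler still parses and type-checks the debug branch
and the optimizer can drop it when the constant is 0. The userspace
example below reproduces the general preprocessor trick under stated
assumptions. The MY_* macros, the TRACE_* symbols and check_status are
hypothetical names invented for this note, and the macro is a simplified
variant written for illustration, not a copy of the kernel header.

```c
#include <assert.h>
#include <stdio.h>

/*
 * MY_IS_DEFINED(x) expands to 1 when x is a macro defined to 1 and to
 * 0 otherwise. Pasting the value onto MY_ARG_PLACEHOLDER_ yields "0,"
 * only for the value 1, which shifts the literal 1 into the second
 * argument position.
 */
#define MY_ARG_PLACEHOLDER_1 0,
#define MY_TAKE_SECOND(ignored, val, ...) val
#define MY_IS_DEFINED(x) MY_IS_DEFINED_(x)
#define MY_IS_DEFINED_(val) MY_IS_DEFINED__(MY_ARG_PLACEHOLDER_##val)
#define MY_IS_DEFINED__(arg1_or_junk) MY_TAKE_SECOND(arg1_or_junk 1, 0, 0)

#define TRACE_ON 1	/* what -DTRACE_ON on the command line produces */
#define TRACE_EMPTY	/* defined, but with an empty value */
/* TRACE_OFF is deliberately left undefined. */

/* Returns 1 if the debug branch ran, 0 otherwise. */
int check_status(int status)
{
	if (MY_IS_DEFINED(TRACE_ON) && status != 0) {
		fprintf(stderr, "unexpected status=%x\n",
			(unsigned int)status);
		return 1;
	}
	return 0;
}

/* Usage example: the three cases the macro distinguishes. */
void is_defined_selfcheck(void)
{
	assert(MY_IS_DEFINED(TRACE_ON) == 1);
	assert(MY_IS_DEFINED(TRACE_OFF) == 0);
	assert(MY_IS_DEFINED(TRACE_EMPTY) == 0);
}
```

In this sketch a symbol defined with an empty value reads as "not
defined", because only the value 1 matches the placeholder. Whether that
matters for a given build depends on how the symbol is defined there.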




[PATCH v4 05/10] net/ps3_gelic: Add vlan_id structure

2021-07-23 Thread Geoff Levand
In an effort to make the PS3 gelic driver easier to maintain, add
a definition for the vlan_id structure.

Signed-off-by: Geoff Levand 
---
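As a rough illustration of what a named structure buys here, the sketch
below uses a named struct vlan_id with a designated-initializer table
and a lookup helper that returns a pointer to an entry, which an
anonymous struct type cannot express. The port enum, the table values
and demo_vlan_lookup() are hypothetical and only mirror the shape of the
code in the patch.

```c
#include <assert.h>

/* Hypothetical port indices and parameter values, for illustration only. */
enum demo_port {
	DEMO_PORT_ETHERNET_0 = 0,
	DEMO_PORT_WIRELESS = 1,
	DEMO_PORT_MAX
};

struct vlan_id {
	int tx;
	int rx;
};

static const struct vlan_id demo_vlan_id_ix[DEMO_PORT_MAX] = {
	[DEMO_PORT_ETHERNET_0] = { .tx = 2, .rx = 3 },
	[DEMO_PORT_WIRELESS]   = { .tx = 4, .rx = 5 },
};

/* With a named type, helpers can take or return pointers to an entry. */
const struct vlan_id *demo_vlan_lookup(enum demo_port port)
{
	assert(port < DEMO_PORT_MAX);
	return &demo_vlan_id_ix[port];
}
```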
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index 946e9bfa071b..54e50ad9e629 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -1614,13 +1614,14 @@ static struct gelic_card *gelic_alloc_card_net(struct 
net_device **netdev)
 static void gelic_card_get_vlan_info(struct gelic_card *card)
 {
struct device *dev = ctodev(card);
+   unsigned int i;
u64 v1, v2;
int status;
-   unsigned int i;
-   struct {
+   struct vlan_id {
int tx;
int rx;
-   } vlan_id_ix[2] = {
+   };
+   struct vlan_id vlan_id_ix[2] = {
[GELIC_PORT_ETHERNET_0] = {
.tx = GELIC_LV1_VLAN_TX_ETHERNET_0,
.rx = GELIC_LV1_VLAN_RX_ETHERNET_0
-- 
2.25.1




[Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5

2021-07-23 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213837

--- Comment #1 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 298019
  --> https://bugzilla.kernel.org/attachment.cgi?id=298019&action=edit
kernel .config (5.13.4, PowerMac G5 11,2)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching someone on the CC list of the bug.

[Bug 213837] New: "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5

2021-07-23 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213837

Bug ID: 213837
   Summary: "Kernel panic - not syncing: corrupted stack end
detected inside scheduler" at building via distcc on a
G5
   Product: Memory Management
   Version: 2.5
Kernel Version: 5.13.4
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: a...@linux-foundation.org
  Reporter: erhar...@mailbox.org
CC: platform_ppc...@kernel-bugs.osdl.org
Regression: No

Created attachment 298017
  --> https://bugzilla.kernel.org/attachment.cgi?id=298017&action=edit
dmesg (5.13.4, PowerMac G5 11,2)

Happens when building larger projects on my G5 via distcc. Time to failure is
about 3-10 minutes. Kernel 5.10.x does not show this problem. Probably
connected to bug #213079.

[..]
Call Trace:
Kernel panic - not syncing: corrupted stack end detected inside scheduler
CPU: 1 PID: 11467 Comm: powerpc64-unkno Tainted: GW
5.13.4-PowerMacG5+ #2
[c0003e79ea80] [c0541c90] .dump_stack+0xe0/0x13c (unreliable)
[c0003e79eb20] [c006813c] .panic+0x168/0x430
[c0003e79ebd0] [c080a4b0] .__schedule+0x80/0x840
[c0003e79ecb0] [c080adbc] .preempt_schedule_common+0x28/0x48
[c0003e79ed30] [c080ae0c] .__cond_resched+0x30/0x4c
[c0003e79edb0] [c01c6d80] .mempool_alloc+0x38/0x198
[c0003e79ee90] [c049a444] .bio_alloc_bioset+0x94/0x174
[c0003e79ef40] [c049a544] .bio_clone_fast+0x20/0x7c
[c0003e79efd0] [c049a60c] .bio_split+0x6c/0xc4
[c0003e79f060] [c04a7018] .__blk_queue_split+0x120/0x474
[c0003e79f160] [c04adc30] .blk_mq_submit_bio+0x88/0x524
[c0003e79f250] [c04a0e30] .submit_bio_noacct+0xc4/0x26c
[c0003e79f340] [c0355bec] .ext4_io_submit+0x5c/0x70
[c0003e79f3c0] [c0355f08] .ext4_bio_write_page+0x2f4/0x480
[c0003e79f480] [c0334b84] .mpage_submit_page+0x70/0xa0
[c0003e79f500] [c033b09c] .ext4_writepages+0xcc4/0xe5c
[c0003e79f7b0] [c01cf214] .do_writepages+0x54/0xa0
[c0003e79f830] [c01c3ab8] .__filemap_fdatawrite_range+0xc0/0xfc
[c0003e79f930] [c0337f34] .ext4_alloc_da_blocks+0xf4/0x100
[c0003e79f9b0] [c0328594] .ext4_release_file+0x24/0xd8
[c0003e79fa40] [c026ea5c] .__fput+0x12c/0x270
[c0003e79fae0] [c008eb40] .task_work_run+0xa0/0xc0
[c0003e79fb70] [c006e284] .do_exit+0x55c/0xa6c
[c0003e79fc60] [c006e824] .do_group_exit+0x50/0xb0
[c0003e79fcf0] [c006e898] .__wake_up_parent+0x0/0x34
[c0003e79fd60] [c0021540] .system_call_exception+0x1b4/0x1ec
[c0003e79fe10] [c000b9c4] system_call_common+0xe4/0x214
--- interrupt: c00 at 0x3fffadc46aa8
NIP:  3fffadc46aa8 LR: 3fffadba6d04 CTR: 
REGS: c0003e79fe80 TRAP: 0c00   Tainted: GW 
(5.13.4-PowerMacG5+)
MSR:  9200f032   CR: 22000482  XER:

IRQMASK: 0 
GPR00: 00ea 33f1ae50 3fffadd65300  
GPR04:     
GPR08:     
GPR12:  3fffadeddc30 00014af075a0 00012379b0f0 
GPR16: 00012947ec38 3fffd7d95cd8 00012947eb28 002f 
GPR20:  3fffadd5fff8 0001 3fffadd5ea58 
GPR24:   0003 0001 
GPR28:  3fffaded6c50 f000  
NIP [3fffadc46aa8] 0x3fffadc46aa8
LR [3fffadba6d04] 0x3fffadba6d04
--- interrupt: c00
Rebooting in 40 seconds..


Re: [PATCH v3 0/5] powerpc: apm82181: adding customer devices

2021-07-23 Thread Andy Shevchenko
On Sun, Sep 06, 2020 at 12:06:10AM +0200, Christian Lamparter wrote:
> Hello,
> 
> I've been holding on to these devices dts' for a while now.
> But ever since the recent purge of the PPC405, I'm feeling
> the urge to move forward.
> 
> The devices in question have been running with OpenWrt since
> around 2016/2017. Back then it was linux v4.4 and required
> many out-of-tree patches (for WIFI, SATA, CRYPTO...), that
> since have been integrated. So, there's nothing else in the
> way I think.
> 
> A patch that adds the Meraki vendor-prefix has been sent
> separately, as there's also the Meraki MR32 that I'm working
> on as well. Here's the link to the patch:
> 
> 
> Now, I've looked around in the arch/powerpc for recent .dts
> and device submissions to get an understanding of what is
> required.
> From the looks of it, it seems like every device gets a
> skeleton defconfig and a CONFIG_$DEVICE symbol (Like:
> CONFIG_MERAKI_MR24, CONFIG_WD_MYBOOKLIVE).
> 
> Will this be the case? Or would it make sense to further
> unite the Bluestone, MR24 and MBL under a common CONFIG_APM82181
> and integrate the BLUESTONE device's defconfig into it as well?
> (I've stumbled across the special machine compatible
> handling of ppc in the Documentation/devicetree/usage-model.rst
> already.)

I haven't found any traces of this to be applied. What is the status of this
patch series? And what is the general state of affairs for the PPC44x?


-- 
With Best Regards,
Andy Shevchenko




[PATCH v2 05/21] alpha: return error code from alpha_pci_map_sg()

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.

pci_map_single_1() can fail for different reasons, but since the only
supported type of error return is DMA_MAPPING_ERROR, we coalesce those
errors into EIO.

ENOMEM is returned when no page tables can be allocated.

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
---
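The return convention described above can be modelled in a few lines of
userspace C. This is only a sketch under stated assumptions: struct
demo_sg, DEMO_MAPPING_ERROR and both functions are hypothetical
stand-ins, not the alpha code. It shows the single-entry fast path
returning the number of mapped entries on success and a negative errno,
never zero, on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAPPING_ERROR UINT64_MAX	/* stand-in for DMA_MAPPING_ERROR */

struct demo_sg {
	uint64_t dma_address;
	unsigned int length;
};

/* Hypothetical single-entry mapper: fails for zero-length entries. */
static uint64_t demo_map_single(const struct demo_sg *sg, uint64_t base)
{
	return sg->length ? base : DEMO_MAPPING_ERROR;
}

/*
 * Fast path for a one-entry list: return the number of mapped entries
 * on success, or a negative errno on failure (never 0).
 */
int demo_map_sg_one(struct demo_sg *sg, uint64_t base)
{
	assert(sg != NULL);
	sg->dma_address = demo_map_single(sg, base);
	if (sg->dma_address == DEMO_MAPPING_ERROR)
		return -EIO;	/* the underlying cause is not visible here */
	return 1;
}
```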
 arch/alpha/kernel/pci_iommu.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 35d7b3096d6e..21f9ac101324 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -649,7 +649,9 @@ static int alpha_pci_map_sg(struct device *dev, struct 
scatterlist *sg,
sg->dma_address
  = pci_map_single_1(pdev, SG_ENT_VIRT_ADDRESS(sg),
 sg->length, dac_allowed);
-   return sg->dma_address != DMA_MAPPING_ERROR;
+   if (sg->dma_address == DMA_MAPPING_ERROR)
+   return -EIO;
+   return 1;
}
 
start = sg;
@@ -685,8 +687,10 @@ static int alpha_pci_map_sg(struct device *dev, struct 
scatterlist *sg,
if (out < end)
out->dma_length = 0;
 
-   if (out - start == 0)
+   if (out - start == 0) {
printk(KERN_WARNING "pci_map_sg failed: no entries?\n");
+   return -ENOMEM;
+   }
DBGA("pci_map_sg: %ld entries\n", out - start);
 
return out - start;
@@ -699,7 +703,7 @@ static int alpha_pci_map_sg(struct device *dev, struct 
scatterlist *sg,
   entries.  Unmap them now.  */
if (out > start)
pci_unmap_sg(pdev, start, out - start, dir);
-   return 0;
+   return -ENOMEM;
 }
 
 /* Unmap a set of streaming mode DMA translations.  Again, cpu read
-- 
2.20.1



[PATCH v2 07/21] ARM/dma-mapping: don't set failed sg dma_address to DMA_MAPPING_ERROR

2021-07-23 Thread Logan Gunthorpe
Setting the ->dma_address to DMA_MAPPING_ERROR is not part of the
->map_sg calling convention, so remove it.

Link: https://lore.kernel.org/linux-mips/20210716063241.gc13...@lst.de/
Suggested-by: Christoph Hellwig 
Signed-off-by: Logan Gunthorpe 
Cc: Russell King 
Cc: Thomas Bogendoerfer 
---
 arch/arm/mm/dma-mapping.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 113b9cb3701b..4b61541853ea 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1632,7 +1632,6 @@ static int __iommu_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
for (i = 1; i < nents; i++) {
s = sg_next(s);
 
-   s->dma_address = DMA_MAPPING_ERROR;
s->dma_length = 0;
 
if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) 
{
-- 
2.20.1



[PATCH v2 06/21] ARM/dma-mapping: return error code from .map_sg() ops

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.
In the case of a DMA_MAPPING_ERROR, -EIO is returned. Otherwise,
-ENOMEM or -EINVAL is returned depending on the error from
__map_sg_chunk().

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Cc: Russell King 
Cc: Thomas Bogendoerfer 
---
 arch/arm/mm/dma-mapping.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index c4b8df2ad328..113b9cb3701b 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -980,7 +980,7 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist 
*sg, int nents,
 {
const struct dma_map_ops *ops = get_dma_ops(dev);
struct scatterlist *s;
-   int i, j;
+   int i, j, ret;
 
for_each_sg(sg, s, nents, i) {
 #ifdef CONFIG_NEED_SG_DMA_LENGTH
@@ -988,15 +988,17 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist 
*sg, int nents,
 #endif
s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
s->length, dir, attrs);
-   if (dma_mapping_error(dev, s->dma_address))
+   if (dma_mapping_error(dev, s->dma_address)) {
+   ret = -EIO;
goto bad_mapping;
+   }
}
return nents;
 
  bad_mapping:
for_each_sg(sg, s, i, j)
ops->unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, 
attrs);
-   return 0;
+   return ret;
 }
 
 /**
@@ -1622,7 +1624,7 @@ static int __iommu_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
 bool is_coherent)
 {
struct scatterlist *s = sg, *dma = sg, *start = sg;
-   int i, count = 0;
+   int i, count = 0, ret;
unsigned int offset = s->offset;
unsigned int size = s->offset + s->length;
unsigned int max = dma_get_max_seg_size(dev);
@@ -1634,8 +1636,10 @@ static int __iommu_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
s->dma_length = 0;
 
if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) 
{
-   if (__map_sg_chunk(dev, start, size, &dma->dma_address,
-   dir, attrs, is_coherent) < 0)
+   ret = __map_sg_chunk(dev, start, size,
+&dma->dma_address, dir, attrs,
+is_coherent);
+   if (ret < 0)
goto bad_mapping;
 
dma->dma_address += offset;
@@ -1648,8 +1652,9 @@ static int __iommu_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
}
size += s->length;
}
-   if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir, attrs,
-   is_coherent) < 0)
+   ret = __map_sg_chunk(dev, start, size, &dma->dma_address, dir, attrs,
+is_coherent);
+   if (ret < 0)
goto bad_mapping;
 
dma->dma_address += offset;
@@ -1660,7 +1665,9 @@ static int __iommu_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
 bad_mapping:
for_each_sg(sg, s, count, i)
__iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
-   return 0;
+   if (ret == -ENOMEM)
+   return ret;
+   return -EINVAL;
 }
 
 /**
-- 
2.20.1



[PATCH v2 08/21] ia64/sba_iommu: return error code from sba_map_sg_attrs()

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.

In the case of a dma_mapping_error(), return -EIO, as the actual cause
is opaque here.

sba_coalesce_chunks() may only presently fail if sba_alloc_range()
fails, which in turn only fails if the iommu is out of mapping
resources, hence a -ENOMEM is used in that case.

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Cc: Michael Ellerman 
Cc: Niklas Schnelle 
Cc: Thomas Bogendoerfer 
---
 arch/ia64/hp/common/sba_iommu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index 9148ddbf02e5..430c166b68cd 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -1458,8 +1458,8 @@ static int sba_map_sg_attrs(struct device *dev, struct 
scatterlist *sglist,
sglist->dma_length = sglist->length;
sglist->dma_address = sba_map_page(dev, sg_page(sglist),
sglist->offset, sglist->length, dir, attrs);
-   if (dma_mapping_error(dev, sglist->dma_address))
-   return 0;
+   if (dma_mapping_error(dev, sglist->dma_address))
+   return -EIO;
return 1;
}
 
@@ -1486,7 +1486,7 @@ static int sba_map_sg_attrs(struct device *dev, struct 
scatterlist *sglist,
coalesced = sba_coalesce_chunks(ioc, dev, sglist, nents);
if (coalesced < 0) {
sba_unmap_sg_attrs(dev, sglist, nents, dir, attrs);
-   return 0;
+   return -ENOMEM;
}
 
/*
-- 
2.20.1



[PATCH v2 11/21] powerpc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR

2021-07-23 Thread Logan Gunthorpe
Setting the ->dma_address to DMA_MAPPING_ERROR is not part of
the ->map_sg calling convention, so remove it.

Link: https://lore.kernel.org/linux-mips/20210716063241.gc13...@lst.de/
Suggested-by: Christoph Hellwig 
Signed-off-by: Logan Gunthorpe 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Geoff Levand 
---
 arch/powerpc/kernel/iommu.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index a8ec4fe42817..30b7736f0896 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -575,7 +575,6 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table 
*tbl,
 */
if (outcount < incount) {
outs = sg_next(outs);
-   outs->dma_address = DMA_MAPPING_ERROR;
outs->dma_length = 0;
}
 
@@ -593,7 +592,6 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table 
*tbl,
npages = iommu_num_pages(s->dma_address, s->dma_length,
 IOMMU_PAGE_SIZE(tbl));
__iommu_free(tbl, vaddr, npages);
-   s->dma_address = DMA_MAPPING_ERROR;
s->dma_length = 0;
}
if (s == outs)
-- 
2.20.1



[PATCH v2 10/21] powerpc/iommu: return error code from .map_sg() ops

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.

Propagate the error up if vio_dma_iommu_map_sg() fails.

ppc_iommu_map_sg() may fail either because of iommu_range_alloc() or
because of tbl->it_ops->set(). The former only supports returning an
error with DMA_MAPPING_ERROR and an examination of the latter indicates
that it may return arch-specific errors (for example,
tce_buildmulti_pSeriesLP()). Hence, coalesce all of those errors into
-EIO, per the documentation on dma_map_sgtable().

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Geoff Levand 
---
 arch/powerpc/kernel/iommu.c | 4 ++--
 arch/powerpc/platforms/ps3/system-bus.c | 2 +-
 arch/powerpc/platforms/pseries/vio.c| 5 +++--
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 2af89a5e379f..a8ec4fe42817 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -473,7 +473,7 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table 
*tbl,
BUG_ON(direction == DMA_NONE);
 
if ((nelems == 0) || !tbl)
-   return 0;
+   return -EINVAL;
 
outs = s = segstart = &sglist[0];
outcount = 1;
@@ -599,7 +599,7 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table 
*tbl,
if (s == outs)
break;
}
-   return 0;
+   return -EIO;
 }
 
 
diff --git a/arch/powerpc/platforms/ps3/system-bus.c b/arch/powerpc/platforms/ps3/system-bus.c
index 1a5665875165..c54eb46f0cfb 100644
--- a/arch/powerpc/platforms/ps3/system-bus.c
+++ b/arch/powerpc/platforms/ps3/system-bus.c
@@ -663,7 +663,7 @@ static int ps3_ioc0_map_sg(struct device *_dev, struct 
scatterlist *sg,
   unsigned long attrs)
 {
BUG();
-   return 0;
+   return -EINVAL;
 }
 
 static void ps3_sb_unmap_sg(struct device *_dev, struct scatterlist *sg,
diff --git a/arch/powerpc/platforms/pseries/vio.c b/arch/powerpc/platforms/pseries/vio.c
index e00f3725ec96..e31e59c54f30 100644
--- a/arch/powerpc/platforms/pseries/vio.c
+++ b/arch/powerpc/platforms/pseries/vio.c
@@ -560,7 +560,8 @@ static int vio_dma_iommu_map_sg(struct device *dev, struct 
scatterlist *sglist,
for_each_sg(sglist, sgl, nelems, count)
alloc_size += roundup(sgl->length, IOMMU_PAGE_SIZE(tbl));
 
-   if (vio_cmo_alloc(viodev, alloc_size))
+   ret = vio_cmo_alloc(viodev, alloc_size);
+   if (ret)
goto out_fail;
ret = ppc_iommu_map_sg(dev, tbl, sglist, nelems, dma_get_mask(dev),
direction, attrs);
@@ -577,7 +578,7 @@ static int vio_dma_iommu_map_sg(struct device *dev, struct 
scatterlist *sglist,
vio_cmo_dealloc(viodev, alloc_size);
 out_fail:
atomic_inc(&viodev->cmo.allocs_failed);
-   return 0;
+   return ret;
 }
 
 static void vio_dma_iommu_unmap_sg(struct device *dev,
-- 
2.20.1



[PATCH v2 09/21] MIPS/jazzdma: return error code from jazz_dma_map_sg()

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.

vdma_alloc() may fail for different reasons, but since it only supports
indicating an error via a return of DMA_MAPPING_ERROR, we coalesce the
different reasons into -EIO as is documented on dma_map_sgtable().

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Cc: Thomas Bogendoerfer 
---
 arch/mips/jazz/jazzdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/jazz/jazzdma.c b/arch/mips/jazz/jazzdma.c
index 461457b28982..eabddb89d221 100644
--- a/arch/mips/jazz/jazzdma.c
+++ b/arch/mips/jazz/jazzdma.c
@@ -552,7 +552,7 @@ static int jazz_dma_map_sg(struct device *dev, struct 
scatterlist *sglist,
dir);
sg->dma_address = vdma_alloc(sg_phys(sg), sg->length);
if (sg->dma_address == DMA_MAPPING_ERROR)
-   return 0;
+   return -EIO;
sg_dma_len(sg) = sg->length;
}
 
-- 
2.20.1



[PATCH v2 03/21] iommu: Return full error code from iommu_map_sg[_atomic]()

2021-07-23 Thread Logan Gunthorpe
Convert to ssize_t return code so the return code from __iommu_map()
can be returned all the way down through dma_iommu_map_sg().

Signed-off-by: Logan Gunthorpe 
Cc: Joerg Roedel 
Cc: Will Deacon 
---
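A small userspace sketch of why the signed return type matters: a
size_t cannot carry a negative errno, while an ssize_t can, but the
caller then has to test for an error before comparing against an
unsigned length. The function names and the "zero-length segment is
invalid" rule are assumptions made for the example; ssize_t comes from
the POSIX <sys/types.h> header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t (POSIX) */

/*
 * Hypothetical mapper: sums the segment lengths and returns the total,
 * or a negative errno if a zero-length segment is found.
 */
ssize_t demo_map_segments(const size_t *lens, size_t n)
{
	size_t mapped = 0;
	size_t i;

	assert(lens != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (lens[i] == 0)
			return -EINVAL;	/* representable only in a signed type */
		mapped += lens[i];
	}
	return (ssize_t)mapped;
}

/*
 * Caller side: check the sign first. Comparing a negative ssize_t
 * directly with a size_t would convert it to a huge unsigned value.
 */
int demo_mapped_all(ssize_t ret, size_t expected)
{
	return ret >= 0 && (size_t)ret >= expected;
}
```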
 drivers/iommu/iommu.c | 15 +++
 include/linux/iommu.h | 22 +++---
 2 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 5419c4b9f27a..bf971b4e34aa 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2567,9 +2567,9 @@ size_t iommu_unmap_fast(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_unmap_fast);
 
-static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
-struct scatterlist *sg, unsigned int nents, int 
prot,
-gfp_t gfp)
+static ssize_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
+   struct scatterlist *sg, unsigned int nents, int prot,
+   gfp_t gfp)
 {
const struct iommu_ops *ops = domain->ops;
size_t len = 0, mapped = 0;
@@ -2610,19 +2610,18 @@ static size_t __iommu_map_sg(struct iommu_domain 
*domain, unsigned long iova,
/* undo mappings already done */
iommu_unmap(domain, iova, mapped);
 
-   return 0;
-
+   return ret;
 }
 
-size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
-   struct scatterlist *sg, unsigned int nents, int prot)
+ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
+struct scatterlist *sg, unsigned int nents, int prot)
 {
might_sleep();
return __iommu_map_sg(domain, iova, sg, nents, prot, GFP_KERNEL);
 }
 EXPORT_SYMBOL_GPL(iommu_map_sg);
 
-size_t iommu_map_sg_atomic(struct iommu_domain *domain, unsigned long iova,
+ssize_t iommu_map_sg_atomic(struct iommu_domain *domain, unsigned long iova,
struct scatterlist *sg, unsigned int nents, int prot)
 {
return __iommu_map_sg(domain, iova, sg, nents, prot, GFP_ATOMIC);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 32d448050bf7..9369458ba1bd 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -414,11 +414,11 @@ extern size_t iommu_unmap(struct iommu_domain *domain, 
unsigned long iova,
 extern size_t iommu_unmap_fast(struct iommu_domain *domain,
   unsigned long iova, size_t size,
   struct iommu_iotlb_gather *iotlb_gather);
-extern size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
-  struct scatterlist *sg,unsigned int nents, int prot);
-extern size_t iommu_map_sg_atomic(struct iommu_domain *domain,
- unsigned long iova, struct scatterlist *sg,
- unsigned int nents, int prot);
+extern ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
+   struct scatterlist *sg, unsigned int nents, int prot);
+extern ssize_t iommu_map_sg_atomic(struct iommu_domain *domain,
+  unsigned long iova, struct scatterlist *sg,
+  unsigned int nents, int prot);
 extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t 
iova);
 extern void iommu_set_fault_handler(struct iommu_domain *domain,
iommu_fault_handler_t handler, void *token);
@@ -679,18 +679,18 @@ static inline size_t iommu_unmap_fast(struct iommu_domain 
*domain,
return 0;
 }
 
-static inline size_t iommu_map_sg(struct iommu_domain *domain,
- unsigned long iova, struct scatterlist *sg,
- unsigned int nents, int prot)
+static inline ssize_t iommu_map_sg(struct iommu_domain *domain,
+  unsigned long iova, struct scatterlist *sg,
+  unsigned int nents, int prot)
 {
-   return 0;
+   return -ENODEV;
 }
 
-static inline size_t iommu_map_sg_atomic(struct iommu_domain *domain,
+static inline ssize_t iommu_map_sg_atomic(struct iommu_domain *domain,
  unsigned long iova, struct scatterlist *sg,
  unsigned int nents, int prot)
 {
-   return 0;
+   return -ENODEV;
 }
 
 static inline void iommu_flush_iotlb_all(struct iommu_domain *domain)
-- 
2.20.1



[PATCH v2 04/21] dma-iommu: Return error code from iommu_dma_map_sg()

2021-07-23 Thread Logan Gunthorpe
Return appropriate error codes, -EINVAL or -ENOMEM, from
iommu_dma_map_sg(). If lower-level code returns -ENOMEM, return it;
all other errors are coalesced into -EINVAL.

iommu_dma_map_sg_swiotlb() returns -EIO, as it's an unknown error
from a call that returns DMA_MAPPING_ERROR.

Signed-off-by: Logan Gunthorpe 
Cc: Joerg Roedel 
Cc: Will Deacon 
---
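The coalescing rule from the commit message, written out as a
standalone helper for illustration. demo_coalesce_map_error() is a
hypothetical name and this is a sketch of the stated policy only, not
the function body in the patch: -ENOMEM passes through unchanged and
every other failure is reported as -EINVAL.

```c
#include <assert.h>
#include <errno.h>

/*
 * Reduce an arbitrary negative errno from lower-level code to the two
 * values named in the commit message.
 */
int demo_coalesce_map_error(int ret)
{
	assert(ret < 0);	/* only meaningful on the failure path */
	if (ret == -ENOMEM)
		return ret;
	return -EINVAL;
}
```

Keeping the rule in one place means a success value can never reach it
by accident, which is what the precondition is meant to document.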
 drivers/iommu/dma-iommu.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 98ba927aee1a..d9aaed080e68 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -972,7 +972,7 @@ static int iommu_dma_map_sg_swiotlb(struct device *dev, 
struct scatterlist *sg,
 
 out_unmap:
iommu_dma_unmap_sg_swiotlb(dev, sg, i, dir, attrs | 
DMA_ATTR_SKIP_CPU_SYNC);
-   return 0;
+   return -EIO;
 }
 
 /*
@@ -993,11 +993,13 @@ static int iommu_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
dma_addr_t iova;
size_t iova_len = 0;
unsigned long mask = dma_get_seg_boundary(dev);
+   ssize_t ret;
int i;
 
-   if (static_branch_unlikely(&iommu_deferred_attach_enabled) &&
-   iommu_deferred_attach(dev, domain))
-   return 0;
+   if (static_branch_unlikely(&iommu_deferred_attach_enabled)) {
+   ret = iommu_deferred_attach(dev, domain);
+   goto out;
+   }
 
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
iommu_dma_sync_sg_for_device(dev, sg, nents, dir);
@@ -1045,14 +1047,17 @@ static int iommu_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
}
 
iova = iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
-   if (!iova)
+   if (!iova) {
+   ret = -ENOMEM;
goto out_restore_sg;
+   }
 
/*
 * We'll leave any physical concatenation to the IOMMU driver's
 * implementation - it knows better than we do.
 */
-   if (iommu_map_sg_atomic(domain, iova, sg, nents, prot) < iova_len)
+   ret = iommu_map_sg_atomic(domain, iova, sg, nents, prot);
+   if (ret < iova_len)
goto out_free_iova;
 
return __finalise_sg(dev, sg, nents, iova);
@@ -1061,7 +1066,11 @@ static int iommu_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
iommu_dma_free_iova(cookie, iova, iova_len, NULL);
 out_restore_sg:
__invalidate_sg(sg, nents);
-   return 0;
+out:
+   if (ret == -ENOMEM)
+   return ret;
+   else
+   return -EINVAL;
 }
 
 static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
-- 
2.20.1



[PATCH v2 12/21] s390/pci: return error code from s390_dma_map_sg()

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.

So propagate the error from __s390_dma_map_sg() up. __s390_dma_map_sg()
returns either -ENOMEM on allocation failure or -EINVAL which is
the same as what's expected by dma_map_sgtable().

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Acked-by: Niklas Schnelle 
Cc: Gerald Schaefer 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
---
 arch/s390/pci/pci_dma.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index ebc9a49523aa..c78b02012764 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -487,7 +487,7 @@ static int s390_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
unsigned int max = dma_get_max_seg_size(dev);
unsigned int size = s->offset + s->length;
unsigned int offset = s->offset;
-   int count = 0, i;
+   int count = 0, i, ret;
 
for (i = 1; i < nr_elements; i++) {
s = sg_next(s);
@@ -497,8 +497,9 @@ static int s390_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
 
if (s->offset || (size & ~PAGE_MASK) ||
size + s->length > max) {
-   if (__s390_dma_map_sg(dev, start, size,
-     &dma->dma_address, dir))
+   ret = __s390_dma_map_sg(dev, start, size,
+   &dma->dma_address, dir);
+   if (ret)
goto unmap;
 
dma->dma_address += offset;
@@ -511,7 +512,8 @@ static int s390_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
}
size += s->length;
}
-   if (__s390_dma_map_sg(dev, start, size, &dma->dma_address, dir))
+   ret = __s390_dma_map_sg(dev, start, size, &dma->dma_address, dir);
+   if (ret)
goto unmap;
 
dma->dma_address += offset;
@@ -523,7 +525,7 @@ static int s390_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
s390_dma_unmap_pages(dev, sg_dma_address(s), sg_dma_len(s),
 dir, attrs);
 
-   return 0;
+   return ret;
 }
 
 static void s390_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
-- 
2.20.1



[PATCH v2 13/21] s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR

2021-07-23 Thread Logan Gunthorpe
Setting the ->dma_address to DMA_MAPPING_ERROR is not part of
the ->map_sg calling convention, so remove it.

Link: https://lore.kernel.org/linux-mips/20210716063241.gc13...@lst.de/
Suggested-by: Christoph Hellwig 
Signed-off-by: Logan Gunthorpe 
Cc: Niklas Schnelle 
Cc: Gerald Schaefer 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
---
 arch/s390/pci/pci_dma.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index c78b02012764..be48e5b5bfcf 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -492,7 +492,6 @@ static int s390_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
for (i = 1; i < nr_elements; i++) {
s = sg_next(s);
 
-   s->dma_address = DMA_MAPPING_ERROR;
s->dma_length = 0;
 
if (s->offset || (size & ~PAGE_MASK) ||
-- 
2.20.1



[PATCH v2 15/21] sparc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR

2021-07-23 Thread Logan Gunthorpe
Setting the ->dma_address to DMA_MAPPING_ERROR is not part of
the ->map_sg calling convention, so remove it.

Link: https://lore.kernel.org/linux-mips/20210716063241.gc13...@lst.de/
Suggested-by: Christoph Hellwig 
Signed-off-by: Logan Gunthorpe 
Cc: "David S. Miller" 
Cc: Niklas Schnelle 
Cc: Michael Ellerman 
---
 arch/sparc/kernel/iommu.c | 2 --
 arch/sparc/kernel/pci_sun4v.c | 2 --
 2 files changed, 4 deletions(-)

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 0589acd34201..da0363692528 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -546,7 +546,6 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
 
if (outcount < incount) {
outs = sg_next(outs);
-   outs->dma_address = DMA_MAPPING_ERROR;
outs->dma_length = 0;
}
 
@@ -572,7 +571,6 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
iommu_tbl_range_free(&iommu->tbl, vaddr, npages,
 IOMMU_ERROR_CODE);
 
-   s->dma_address = DMA_MAPPING_ERROR;
s->dma_length = 0;
}
if (s == outs)
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index d90e80fa5705..384480971805 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -594,7 +594,6 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
 
if (outcount < incount) {
outs = sg_next(outs);
-   outs->dma_address = DMA_MAPPING_ERROR;
outs->dma_length = 0;
}
 
@@ -611,7 +610,6 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
iommu_tbl_range_free(tbl, vaddr, npages,
 IOMMU_ERROR_CODE);
/* XXX demap? XXX */
-   s->dma_address = DMA_MAPPING_ERROR;
s->dma_length = 0;
}
if (s == outs)
-- 
2.20.1



[PATCH v2 14/21] sparc/iommu: return error codes from .map_sg() ops

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.

Returning an errno from __sbus_iommu_map_sg() results in
sbus_iommu_map_sg_gflush() and sbus_iommu_map_sg_pflush() returning an
errno, as those functions are wrappers around __sbus_iommu_map_sg().

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Cc: "David S. Miller" 
Cc: Niklas Schnelle 
Cc: Michael Ellerman 
---
 arch/sparc/kernel/iommu.c | 4 ++--
 arch/sparc/kernel/pci_sun4v.c | 4 ++--
 arch/sparc/mm/iommu.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index a034f571d869..0589acd34201 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -448,7 +448,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
iommu = dev->archdata.iommu;
strbuf = dev->archdata.stc;
if (nelems == 0 || !iommu)
-   return 0;
+   return -EINVAL;
 
spin_lock_irqsave(&iommu->lock, flags);
 
@@ -580,7 +580,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
}
spin_unlock_irqrestore(&iommu->lock, flags);
 
-   return 0;
+   return -EINVAL;
 }
 
 /* If contexts are being used, they are the same in all of the mappings
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 9de57e88f7a1..d90e80fa5705 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -486,7 +486,7 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
 
iommu = dev->archdata.iommu;
if (nelems == 0 || !iommu)
-   return 0;
+   return -EINVAL;
atu = iommu->atu;
 
prot = HV_PCI_MAP_ATTR_READ;
@@ -619,7 +619,7 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
}
local_irq_restore(flags);
 
-   return 0;
+   return -EINVAL;
 }
 
 static void dma_4v_unmap_sg(struct device *dev, struct scatterlist *sglist,
diff --git a/arch/sparc/mm/iommu.c b/arch/sparc/mm/iommu.c
index 0c0342e5b10d..9e3f6933ca13 100644
--- a/arch/sparc/mm/iommu.c
+++ b/arch/sparc/mm/iommu.c
@@ -256,7 +256,7 @@ static int __sbus_iommu_map_sg(struct device *dev, struct 
scatterlist *sgl,
sg->dma_address =__sbus_iommu_map_page(dev, sg_page(sg),
sg->offset, sg->length, per_page_flush);
if (sg->dma_address == DMA_MAPPING_ERROR)
-   return 0;
+   return -EIO;
sg->dma_length = sg->length;
}
 
-- 
2.20.1



[PATCH v2 16/21] parisc: return error code from .map_sg() ops

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.
Return -EINVAL if the ioc cannot be obtained.

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
---
 drivers/parisc/ccio-dma.c  | 2 +-
 drivers/parisc/sba_iommu.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
index b5f9ee81a46c..452e72b7bd01 100644
--- a/drivers/parisc/ccio-dma.c
+++ b/drivers/parisc/ccio-dma.c
@@ -918,7 +918,7 @@ ccio_map_sg(struct device *dev, struct scatterlist *sglist, 
int nents,
BUG_ON(!dev);
ioc = GET_IOC(dev);
if (!ioc)
-   return 0;
+   return -EINVAL;

DBG_RUN_SG("%s() START %d entries\n", __func__, nents);
 
diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c
index dce4cdf786cd..e60690d38d67 100644
--- a/drivers/parisc/sba_iommu.c
+++ b/drivers/parisc/sba_iommu.c
@@ -947,7 +947,7 @@ sba_map_sg(struct device *dev, struct scatterlist *sglist, 
int nents,
 
ioc = GET_IOC(dev);
if (!ioc)
-   return 0;
+   return -EINVAL;
 
/* Fast path single entry scatterlists. */
if (nents == 1) {
-- 
2.20.1



[PATCH v2 18/21] x86/amd_gart: return error code from gart_map_sg()

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.

So make __dma_map_cont() return a valid errno (which is then propagated
to gart_map_sg() via dma_map_cont()) and return it in case of failure.

Also, return -EINVAL in case of invalid nents.

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Niklas Schnelle 
Cc: Thomas Bogendoerfer 
Cc: Michael Ellerman 
---
 arch/x86/kernel/amd_gart_64.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index 9ac696487b13..46aea9a4f26b 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -331,7 +331,7 @@ static int __dma_map_cont(struct device *dev, struct 
scatterlist *start,
int i;
 
if (iommu_start == -1)
-   return -1;
+   return -ENOMEM;
 
for_each_sg(start, s, nelems, i) {
unsigned long pages, addr;
@@ -380,13 +380,13 @@ static int gart_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
   enum dma_data_direction dir, unsigned long attrs)
 {
struct scatterlist *s, *ps, *start_sg, *sgmap;
-   int need = 0, nextneed, i, out, start;
+   int need = 0, nextneed, i, out, start, ret;
unsigned long pages = 0;
unsigned int seg_size;
unsigned int max_seg_size;
 
if (nents == 0)
-   return 0;
+   return -EINVAL;
 
out = 0;
start   = 0;
@@ -414,8 +414,9 @@ static int gart_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
if (!iommu_merge || !nextneed || !need || s->offset ||
(s->length + seg_size > max_seg_size) ||
(ps->offset + ps->length) % PAGE_SIZE) {
-   if (dma_map_cont(dev, start_sg, i - start,
-sgmap, pages, need) < 0)
+   ret = dma_map_cont(dev, start_sg, i - start,
+  sgmap, pages, need);
+   if (ret < 0)
goto error;
out++;
 
@@ -432,7 +433,8 @@ static int gart_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
pages += iommu_num_pages(s->offset, s->length, PAGE_SIZE);
ps = s;
}
-   if (dma_map_cont(dev, start_sg, i - start, sgmap, pages, need) < 0)
+   ret = dma_map_cont(dev, start_sg, i - start, sgmap, pages, need);
+   if (ret < 0)
goto error;
out++;
flush_gart();
@@ -458,7 +460,7 @@ static int gart_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
iommu_full(dev, pages << PAGE_SHIFT, dir);
for_each_sg(sg, s, nents, i)
s->dma_address = DMA_MAPPING_ERROR;
-   return 0;
+   return ret;
 }
 
 /* allocate and map a coherent mapping */
-- 
2.20.1



[PATCH v2 19/21] x86/amd_gart: don't set failed sg dma_address to DMA_MAPPING_ERROR

2021-07-23 Thread Logan Gunthorpe
Setting the ->dma_address to DMA_MAPPING_ERROR is not part of
the ->map_sg calling convention, so remove it.

Link: https://lore.kernel.org/linux-mips/20210716063241.gc13...@lst.de/
Suggested-by: Christoph Hellwig 
Signed-off-by: Logan Gunthorpe 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Niklas Schnelle 
Cc: Thomas Bogendoerfer 
Cc: Michael Ellerman 
---
 arch/x86/kernel/amd_gart_64.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index 46aea9a4f26b..ed837383de5c 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -458,8 +458,6 @@ static int gart_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
panic("dma_map_sg: overflow on %lu pages\n", pages);
 
iommu_full(dev, pages << PAGE_SHIFT, dir);
-   for_each_sg(sg, s, nents, i)
-   s->dma_address = DMA_MAPPING_ERROR;
return ret;
 }
 
-- 
2.20.1



[PATCH v2 17/21] xen: swiotlb: return error code from xen_swiotlb_map_sg()

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.

xen_swiotlb_map_sg() may only fail if xen_swiotlb_map_page() fails, but
xen_swiotlb_map_page() only supports returning errors as
DMA_MAPPING_ERROR. So coalesce all errors into EIO per the documentation
for dma_map_sgtable().
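
The coalescing described above can be sketched in plain userspace C. This
is only an illustration, not the Xen code: struct seg, map_one(),
unmap_segs() and map_segs() are hypothetical names, and MAPPING_ERROR
stands in for DMA_MAPPING_ERROR. The point is that the per-page mapper can
only signal failure through a sentinel value, so the scatterlist mapper
unwinds and reports every such failure as -EIO.

```c
#include <assert.h>	/* only needed by callers that check the results */
#include <errno.h>
#include <stdint.h>

#define MAPPING_ERROR UINT64_MAX	/* stand-in for DMA_MAPPING_ERROR */

/* Hypothetical, simplified scatterlist entry. */
struct seg {
	uint64_t phys;
	uint64_t dma;
	unsigned int len;
	unsigned int dma_len;
};

/* Hypothetical page mapper: it can only report failure via the sentinel. */
static uint64_t map_one(uint64_t phys)
{
	return phys ? phys + 0x1000 : MAPPING_ERROR;
}

/* Undo the first n mappings. */
static void unmap_segs(struct seg *segs, int n)
{
	for (int i = 0; i < n; i++) {
		segs[i].dma = 0;
		segs[i].dma_len = 0;
	}
}

/* Returns the number of mapped entries, or -EIO if any page failed. */
static int map_segs(struct seg *segs, int nelems)
{
	int i;

	for (i = 0; i < nelems; i++) {
		segs[i].dma = map_one(segs[i].phys);
		if (segs[i].dma == MAPPING_ERROR)
			goto out_unmap;
		segs[i].dma_len = segs[i].len;
	}
	return nelems;

out_unmap:
	unmap_segs(segs, i);	/* entries 0..i-1 were mapped */
	segs[0].dma_len = 0;
	return -EIO;		/* the cause is unknown, so use the generic code */
}
```

A caller that previously tested for a zero return would now test for a
negative one.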

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
Reviewed-by: Boris Ostrovsky 
Cc: Konrad Rzeszutek Wilk 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
---
 drivers/xen/swiotlb-xen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 24d11861ac7d..85d58b720a24 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -509,7 +509,7 @@ xen_swiotlb_map_sg(struct device *dev, struct scatterlist 
*sgl, int nelems,
 out_unmap:
xen_swiotlb_unmap_sg(dev, sgl, i, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC);
sg_dma_len(sgl) = 0;
-   return 0;
+   return -EIO;
 }
 
 static void
-- 
2.20.1



[PATCH v2 01/21] dma-mapping: Allow map_sg() ops to return negative error codes

2021-07-23 Thread Logan Gunthorpe
Allow dma_map_sgtable() to pass errors from the map_sg() ops. This
will be required for returning appropriate error codes when mapping
P2PDMA memory.

Introduce __dma_map_sg_attrs() which will return the raw error code
from the map_sg operation (whether it be negative or zero). Then add a
dma_map_sg_attrs() wrapper to convert any negative errors to zero to
satisfy the existing calling convention.

dma_map_sgtable() defines three error codes that .map_sg implementations
are allowed to return: -EINVAL, -ENOMEM and -EIO. The latter of which
is a generic return for cases that are passing DMA_MAPPING_ERROR
through.

dma_map_sgtable() will convert a zero error return for old map_sg() ops
into a -EIO return and return any negative errors as reported.

This allows map_sg implementations to start returning multiple
negative error codes. Legacy map_sg implementations can continue
to return zero until they are all converted.
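
The conversion rules described above can be modelled in plain userspace C.
This is only a sketch under stated assumptions: raw_map_sg(),
map_sg_compat() and map_sgtable_compat() are hypothetical stand-ins for
__dma_map_sg_attrs(), dma_map_sg_attrs() and dma_map_sgtable(), and the
map_sg_fn type replaces the real op, which takes a device, a scatterlist,
a direction and attributes.

```c
#include <assert.h>	/* only needed by callers that check the results */
#include <errno.h>

/*
 * Hypothetical stand-in for a .map_sg() op: returns the number of mapped
 * entries (> 0), zero (legacy failure) or a negative errno (new failure).
 */
typedef int (*map_sg_fn)(int nents);

/* Models __dma_map_sg_attrs(): hand back whatever the op returned. */
static int raw_map_sg(map_sg_fn op, int nents)
{
	return op(nents);
}

/* Models dma_map_sg_attrs(): callers still only ever see zero on failure. */
static int map_sg_compat(map_sg_fn op, int nents)
{
	int ret = raw_map_sg(op, nents);

	return ret < 0 ? 0 : ret;
}

/* Models dma_map_sgtable(): 0 on success, a negative errno on failure. */
static int map_sgtable_compat(map_sg_fn op, int orig_nents, int *mapped)
{
	int ret = raw_map_sg(op, orig_nents);

	if (ret == 0)
		return -EIO;	/* legacy op that still returns zero */
	if (ret < 0) {
		/* Only -EINVAL, -ENOMEM and -EIO are passed through. */
		if (ret != -EINVAL && ret != -ENOMEM && ret != -EIO)
			return -EIO;
		return ret;
	}
	*mapped = ret;
	return 0;
}

/* Example ops for exercising the helpers. */
static int op_ok(int nents)     { return nents; }
static int op_legacy(int nents) { (void)nents; return 0; }
static int op_nomem(int nents)  { (void)nents; return -ENOMEM; }
static int op_bogus(int nents)  { (void)nents; return -ENODEV; }
```

With these helpers, op_nomem yields 0 through map_sg_compat() but -ENOMEM
through map_sgtable_compat(), which is the distinction the P2PDMA work
needs.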

Signed-off-by: Logan Gunthorpe 
---
 include/linux/dma-map-ops.h |  5 ++-
 include/linux/dma-mapping.h | 35 +++
 kernel/dma/mapping.c| 85 +
 3 files changed, 87 insertions(+), 38 deletions(-)

diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 0d53a96a3d64..2f842498c448 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -41,8 +41,9 @@ struct dma_map_ops {
size_t size, enum dma_data_direction dir,
unsigned long attrs);
/*
-* map_sg returns 0 on error and a value > 0 on success.
-* It should never return a value < 0.
+* map_sg should return a negative error code on error. See
+* dma_map_sgtable() for a list of appropriate error codes
+* and their meanings.
 */
int (*map_sg)(struct device *dev, struct scatterlist *sg, int nents,
enum dma_data_direction dir, unsigned long attrs);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 183e7103a66d..daa1e360f0ee 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -110,6 +110,8 @@ int dma_map_sg_attrs(struct device *dev, struct scatterlist 
*sg, int nents,
 void dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg,
  int nents, enum dma_data_direction dir,
  unsigned long attrs);
+int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
+   enum dma_data_direction dir, unsigned long attrs);
 dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
size_t size, enum dma_data_direction dir, unsigned long attrs);
 void dma_unmap_resource(struct device *dev, dma_addr_t addr, size_t size,
@@ -174,6 +176,11 @@ static inline void dma_unmap_sg_attrs(struct device *dev,
unsigned long attrs)
 {
 }
+static inline int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
+   enum dma_data_direction dir, unsigned long attrs)
+{
+   return -EOPNOTSUPP;
+}
 static inline dma_addr_t dma_map_resource(struct device *dev,
phys_addr_t phys_addr, size_t size, enum dma_data_direction dir,
unsigned long attrs)
@@ -343,34 +350,6 @@ static inline void dma_sync_single_range_for_device(struct 
device *dev,
return dma_sync_single_for_device(dev, addr + offset, size, dir);
 }
 
-/**
- * dma_map_sgtable - Map the given buffer for DMA
- * @dev:   The device for which to perform the DMA operation
- * @sgt:   The sg_table object describing the buffer
- * @dir:   DMA direction
- * @attrs: Optional DMA attributes for the map operation
- *
- * Maps a buffer described by a scatterlist stored in the given sg_table
- * object for the @dir DMA operation by the @dev device. After success the
- * ownership for the buffer is transferred to the DMA domain.  One has to
- * call dma_sync_sgtable_for_cpu() or dma_unmap_sgtable() to move the
- * ownership of the buffer back to the CPU domain before touching the
- * buffer by the CPU.
- *
- * Returns 0 on success or -EINVAL on error during mapping the buffer.
- */
-static inline int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
-   enum dma_data_direction dir, unsigned long attrs)
-{
-   int nents;
-
-   nents = dma_map_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
-   if (nents <= 0)
-   return -EINVAL;
-   sgt->nents = nents;
-   return 0;
-}
-
 /**
  * dma_unmap_sgtable - Unmap the given buffer for DMA
  * @dev:   The device for which to perform the DMA operation
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 2b06a809d0b9..b8dc8b1cb402 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -177,12 +177,8 @@ void dma_unmap_page_attrs(struct device *dev, dma_addr_t 
addr, size_t size,
 }
 EXPORT_SYMBOL(dma_unmap_page_attrs);
 
-/*
- * dma_maps_sg_attrs 

[PATCH v2 21/21] dma-mapping: Disallow .map_sg operations from returning zero on error

2021-07-23 Thread Logan Gunthorpe
Now that all the .map_sg operations have been converted to returning
proper error codes, drop the code to handle a zero return value,
add a warning if a zero is returned and update the comment for the
map_sg operation.

Signed-off-by: Logan Gunthorpe 
---
 kernel/dma/mapping.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index b8dc8b1cb402..86a8a421344a 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -194,6 +194,9 @@ static int __dma_map_sg_attrs(struct device *dev, struct 
scatterlist *sg,
else
ents = ops->map_sg(dev, sg, nents, dir, attrs);
 
+   if (WARN_ON_ONCE(ents == 0))
+   return -EIO;
+
if (ents > 0)
debug_dma_map_sg(dev, sg, nents, ents, dir);
 
@@ -259,9 +262,7 @@ int dma_map_sgtable(struct device *dev, struct sg_table 
*sgt,
int nents;
 
nents = __dma_map_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
-   if (nents == 0)
-   return -EIO;
-   else if (nents < 0) {
+   if (nents < 0) {
if (WARN_ON_ONCE(nents != -EINVAL && nents != -ENOMEM &&
 nents != -EIO))
return -EIO;
-- 
2.20.1



[PATCH v2 02/21] dma-direct: Return appropriate error code from dma_direct_map_sg()

2021-07-23 Thread Logan Gunthorpe
Now that the map_sg() op expects error codes instead of returning zero on
error, convert dma_direct_map_sg() to return an error code. Per the
documentation for dma_map_sgtable(), -EIO is returned due to a
DMA_MAPPING_ERROR with unknown cause.

Signed-off-by: Logan Gunthorpe 
---
 kernel/dma/direct.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index f737e3347059..f33ceb68aef2 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -411,7 +411,7 @@ int dma_direct_map_sg(struct device *dev, struct 
scatterlist *sgl, int nents,
 
 out_unmap:
dma_direct_unmap_sg(dev, sgl, i, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC);
-   return 0;
+   return -EIO;
 }
 
 dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
-- 
2.20.1



[PATCH v2 00/21] .map_sg() error cleanup

2021-07-23 Thread Logan Gunthorpe
Hi,

This v2 of the series is spun out and expanded from my work to add
P2PDMA support to DMA map operations[1]. v1 is at [2]. The main changes
in v2 are to more carefully define the meaning of the error codes for
dma_map_sgtable().

The P2PDMA work requires distinguishing different error conditions in
a map_sg operation. dma_map_sgtable() already allows for returning an
error code (whereas dma_map_sg() is only allowed to return zero);
however, it currently only returns -EINVAL when a .map_sg() call returns
zero.

This series cleans up all .map_sg() implementations to return appropriate
error codes. After the cleanup, dma_map_sg() will still return zero,
however dma_map_sgtable() will pass the error code from the .map_sg()
call. Thanks go to Martin Oliveira for doing a lot of the cleanup of the
obscure implementations.

The patch set is based off of v5.14-rc2 and a git repo can be found
here:

  https://github.com/sbates130272/linux-p2pmem map_sg_err_cleanup_v2

Thanks,

Logan

[1] 
https://lore.kernel.org/linux-block/20210513223203.5542-1-log...@deltatee.com/
[2] 
https://lore.kernel.org/linux-mips/20210715164544.6827-1-log...@deltatee.com/

--

Changes in v2:
  - Attempt to define the meanings of the errors returned by
dma_map_sgtable() and restrict the valid return codes of
.map_sg implementations. (Per Christoph)
  - Change dma_map_sgtable() to EXPORT_SYMBOL_GPL() (Per Christoph)
  - Add patches to remove the erroneous setting of sg->dma_address
to DMA_MAPPING_ERROR in a few .map_sg() implementations. (Per
Christoph).

--

Logan Gunthorpe (10):
  dma-mapping: Allow map_sg() ops to return negative error codes
  dma-direct: Return appropriate error code from dma_direct_map_sg()
  iommu: Return full error code from iommu_map_sg[_atomic]()
  dma-iommu: Return error code from iommu_dma_map_sg()
  ARM/dma-mapping: don't set failed sg dma_address to DMA_MAPPING_ERROR
  powerpc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
  s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR
  sparc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
  x86/amd_gart: don't set failed sg dma_address to DMA_MAPPING_ERROR
  dma-mapping: Disallow .map_sg operations from returning zero on error

Martin Oliveira (11):
  alpha: return error code from alpha_pci_map_sg()
  ARM/dma-mapping: return error code from .map_sg() ops
  ia64/sba_iommu: return error code from sba_map_sg_attrs()
  MIPS/jazzdma: return error code from jazz_dma_map_sg()
  powerpc/iommu: return error code from .map_sg() ops
  s390/pci: return error code from s390_dma_map_sg()
  sparc/iommu: return error codes from .map_sg() ops
  parisc: return error code from .map_sg() ops
  xen: swiotlb: return error code from xen_swiotlb_map_sg()
  x86/amd_gart: return error code from gart_map_sg()
  dma-mapping: return error code from dma_dummy_map_sg()

 arch/alpha/kernel/pci_iommu.c   | 10 ++-
 arch/arm/mm/dma-mapping.c   | 26 +---
 arch/ia64/hp/common/sba_iommu.c |  6 +-
 arch/mips/jazz/jazzdma.c|  2 +-
 arch/powerpc/kernel/iommu.c |  6 +-
 arch/powerpc/platforms/ps3/system-bus.c |  2 +-
 arch/powerpc/platforms/pseries/vio.c|  5 +-
 arch/s390/pci/pci_dma.c | 13 ++--
 arch/sparc/kernel/iommu.c   |  6 +-
 arch/sparc/kernel/pci_sun4v.c   |  6 +-
 arch/sparc/mm/iommu.c   |  2 +-
 arch/x86/kernel/amd_gart_64.c   | 18 +++---
 drivers/iommu/dma-iommu.c   | 23 +--
 drivers/iommu/iommu.c   | 15 ++---
 drivers/parisc/ccio-dma.c   |  2 +-
 drivers/parisc/sba_iommu.c  |  2 +-
 drivers/xen/swiotlb-xen.c   |  2 +-
 include/linux/dma-map-ops.h |  5 +-
 include/linux/dma-mapping.h | 35 ++
 include/linux/iommu.h   | 22 +++
 kernel/dma/direct.c |  2 +-
 kernel/dma/dummy.c  |  2 +-
 kernel/dma/mapping.c| 86 ++---
 23 files changed, 181 insertions(+), 117 deletions(-)


base-commit: 2734d6c1b1a089fb593ef6a23d4b70903526fe0c
--
2.20.1


[PATCH v2 20/21] dma-mapping: return error code from dma_dummy_map_sg()

2021-07-23 Thread Logan Gunthorpe
From: Martin Oliveira 

The .map_sg() op now expects an error code instead of zero on failure.

The only errno to return is -EINVAL in the case when DMA is not
supported.

Signed-off-by: Martin Oliveira 
Signed-off-by: Logan Gunthorpe 
---
 kernel/dma/dummy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/dummy.c b/kernel/dma/dummy.c
index eacd4c5b10bf..b492d59ac77e 100644
--- a/kernel/dma/dummy.c
+++ b/kernel/dma/dummy.c
@@ -22,7 +22,7 @@ static int dma_dummy_map_sg(struct device *dev, struct 
scatterlist *sgl,
int nelems, enum dma_data_direction dir,
unsigned long attrs)
 {
-   return 0;
+   return -EINVAL;
 }
 
 static int dma_dummy_supported(struct device *hwdev, u64 mask)
-- 
2.20.1



Re: [PATCH v2 1/2] sched/topology: Skip updating masks for non-online nodes

2021-07-23 Thread Srikar Dronamraju
* Valentin Schneider  [2021-07-13 17:32:14]:

> On 12/07/21 18:18, Srikar Dronamraju wrote:
> > Hi Valentin,
> >
> >> On 01/07/21 09:45, Srikar Dronamraju wrote:
> >> > @@ -1891,12 +1894,30 @@ void sched_init_numa(void)
> >> >  void sched_domains_numa_masks_set(unsigned int cpu)
> >> >  {
> >
> > Unfortunately this is not helping.
> > I tried this patch alone and also with 2/2 patch of this series where
> > we update/fill fake topology numbers. However both cases are still failing.
> >
> 
> Thanks for testing it.
> 
> 
> Now, let's take examples from your cover letter:
> 
>   node distances:
>   node   0   1   2   3   4   5   6   7
> 0:  10  20  40  40  40  40  40  40
> 1:  20  10  40  40  40  40  40  40
> 2:  40  40  10  20  40  40  40  40
> 3:  40  40  20  10  40  40  40  40
> 4:  40  40  40  40  10  20  40  40
> 5:  40  40  40  40  20  10  40  40
> 6:  40  40  40  40  40  40  10  20
> 7:  40  40  40  40  40  40  20  10
> 
> But the system boots with just nodes 0 and 1, thus only this distance
> matrix is valid:
> 
>   node   0   1
> 0:  10  20
> 1:  20  10
> 
> topology_span_sane() is going to use tl->mask(cpu), and as you reported the
> NODE topology level should cause issues. Let's assume all offline nodes say
> they're 10 distance away from everyone else, and that we have one CPU per
> node. This would give us:
> 

No,
All offline nodes would be at a distance of 10 from node 0 only.
So here node distance of all offline nodes from node 1 would be 20.

>   NODE->mask(0) == 0,2-7
>   NODE->mask(1) == 1-7

so 

NODE->mask(0) == 0,2-7
NODE->mask(1) should be 1
and NODE->mask(2-7) == 0,2-7

> 
> The intersection is 2-7, we'll trigger the WARN_ON().
> Now, with the above snippet, we'll check if that intersection covers any
> online CPU. For sched_init_domains(), cpu_map is cpu_active_mask, so we'd
> end up with an empty intersection and we shouldn't warn - that's the theory
> at least.

Now let's say we onlined CPU 3 and node 3, which was at an actual distance
of 20 from node 0.

(If we only consider online CPUs, and since scheduler masks like
sched_domains_numa_masks aren't updated with offline CPUs,)
then

NODE->mask(0) == 0
NODE->mask(1) == 1
NODE->mask(3) == 0,3

cpumask_and(intersect, tl->mask(cpu), tl->mask(i));
if (!cpumask_equal(tl->mask(cpu), tl->mask(i)) && cpumask_intersects(intersect, 
cpu_map))

cpu_map is 0,1,3
intersect is 0

From above NODE->mask(0) is !equal to NODE->mask(1) and
cpumask_intersects(intersect, cpu_map) is also true.

I picked Node 3 since if Node 1 is online, we would have faked distance
for Node 2 to be at distance of 40.

Any node from 3 to 7, we would have faced the same problem.
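
The failing check can be reproduced with ordinary bitmasks. This is a
minimal sketch, assuming one CPU per node as in the example above; the
cpumask_bits type, the CPU() macro and spans_conflict() are hypothetical
names, not scheduler API. Pairing CPU 0 with CPU 3 is what produces the
intersect of 0 quoted above.

```c
#include <assert.h>	/* only needed by callers that check the results */
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t cpumask_bits;	/* bit n set == CPU n is in the mask */
#define CPU(n) (1u << (n))

/*
 * Mirrors the quoted condition: the masks of two CPUs at one topology
 * level conflict when they differ but still overlap on a CPU in cpu_map.
 */
static bool spans_conflict(cpumask_bits a, cpumask_bits b,
			   cpumask_bits cpu_map)
{
	cpumask_bits intersect = a & b;

	return a != b && (intersect & cpu_map) != 0;
}
```

With mask(0) = CPU(0), mask(1) = CPU(1), mask(3) = CPU(0) | CPU(3) and
cpu_map = CPU(0) | CPU(1) | CPU(3), the pair (0, 1) does not conflict
while the pair (0, 3) does, so the WARN_ON() still fires.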

> 
> Looking at sd_numa_mask(), I think there's a bug with topology_span_sane():
> it doesn't run in the right place wrt where sched_domains_curr_level is
> updated. Could you try the below on top of the previous snippet?
> 
> If that doesn't help, could you share the node distances / topology masks
> that lead to the WARN_ON()? Thanks.
> 
> ---
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index b77ad49dc14f..cda69dfa4065 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1516,13 +1516,6 @@ sd_init(struct sched_domain_topology_level *tl,
>   int sd_id, sd_weight, sd_flags = 0;
>   struct cpumask *sd_span;
> 
> -#ifdef CONFIG_NUMA
> - /*
> -  * Ugly hack to pass state to sd_numa_mask()...
> -  */
> - sched_domains_curr_level = tl->numa_level;
> -#endif
> -
>   sd_weight = cpumask_weight(tl->mask(cpu));
> 
>   if (tl->sd_flags)
> @@ -2131,7 +2124,12 @@ build_sched_domains(const struct cpumask *cpu_map, 
> struct sched_domain_attr *att
> 
>   sd = NULL;
>   for_each_sd_topology(tl) {
> -
> +#ifdef CONFIG_NUMA
> + /*
> +  * Ugly hack to pass state to sd_numa_mask()...
> +  */
> + sched_domains_curr_level = tl->numa_level;
> +#endif
>   if (WARN_ON(!topology_span_sane(tl, cpu_map, i)))
>   goto error;
> 
> 

I tested with the above patch too. However, it is still not helping.

Here is the log from my testing.

At Boot.

(Do remember to arrive at sched_max_numa_levels we faked the
numa_distance of node 1 to be at 20 from node 0. All other offline
nodes are at a distance of 10 from node 0.)

numactl -H
available: 2 nodes (0,5)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 0 MB
node 0 free: 0 MB
node 5 cpus:
node 5 size: 32038 MB
node 5 free: 29367 MB
node distances:
node   0   5
  0:  10  40
  5:  40  10
--
grep -r . /sys/kernel/debug/sched/domains/cpu0/domain{0,1,2,3,4}/{name,flags}
/sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
/sys/kernel/debug/sched/domains/cpu0/domain0/flags:SD_BALANCE_NEWIDLE 
SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SHARE_CPUCAPACITY 

Re: [PATCH v7 1/1] powerpc/pseries: Interface to represent PAPR firmware attributes

2021-07-23 Thread Fabiano Rosas
"Pratik R. Sampat"  writes:

> Adds a generic interface to represent the energy and frequency related
> PAPR attributes on the system using the new H_CALL
> "H_GET_ENERGY_SCALE_INFO".
>
> H_GET_EM_PARMS H_CALL was previously responsible for exporting this
> information in the lparcfg, however the H_GET_EM_PARMS H_CALL
> will be deprecated P10 onwards.
>
> The H_GET_ENERGY_SCALE_INFO H_CALL is of the following call format:
> hcall(
>   uint64 H_GET_ENERGY_SCALE_INFO,  // Get energy scale info
>   uint64 flags,   // Per the flag request
>   uint64 firstAttributeId,// The attribute id
>   uint64 bufferAddress,   // Guest physical address of the output buffer
>   uint64 bufferSize   // The size in bytes of the output buffer
> );
>
> This H_CALL can query either all the attributes at once with
> firstAttributeId = 0, flags = 0 as well as query only one attribute
> at a time with firstAttributeId = id, flags = 1.
>
> The output buffer consists of the following
> 1. number of attributes  - 8 bytes
> 2. array offset to the data location - 8 bytes
> 3. version info  - 1 byte
> 4. A data array of size num attributes, which contains the following:
>   a. attribute ID  - 8 bytes
>   b. attribute value in number - 8 bytes
>   c. attribute name in string  - 64 bytes
>   d. attribute value in string - 64 bytes
>
> The new H_CALL exports information in direct string value format, hence
> a new interface has been introduced in
> /sys/firmware/papr/energy_scale_info to export this information to
> userspace in an extensible pass-through format.
>
> The H_CALL returns the name, numeric value and string value (if exists)
>
> The format of exposing the sysfs information is as follows:
> /sys/firmware/papr/energy_scale_info/
>|-- /
>  |-- desc
>  |-- value
>  |-- value_desc (if exists)
>|-- /
>  |-- desc
>  |-- value
>  |-- value_desc (if exists)
> ...
>
> The energy information that is exported is useful for userspace tools
> such as powerpc-utils. Currently these tools infer the
> "power_mode_data" value in the lparcfg, which in turn is obtained from
> the to be deprecated H_GET_EM_PARMS H_CALL.
> On future platforms, such userspace utilities will have to look at the
> data returned from the new H_CALL being populated in this new sysfs
> interface and report this information directly without the need of
> interpretation.
>
> Signed-off-by: Pratik R. Sampat 
> Reviewed-by: Gautham R. Shenoy 

Reviewed-by: Fabiano Rosas 
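
For readers following the buffer layout quoted above, here is a small
userspace sketch of how such a buffer could be walked. It is not the
kernel implementation. Assumptions: multi-byte fields are big-endian, the
array offset is counted from the start of the buffer, and the names
esi_attr, be64_at() and esi_get_attr() are hypothetical. The version byte
is not interpreted.

```c
#include <assert.h>	/* only needed by callers that check the results */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ESI_HDR_SIZE	17u	/* 8 + 8 + 1 bytes, as listed above */
#define ESI_ATTR_SIZE	144u	/* 8 + 8 + 64 + 64 bytes */
#define ESI_STR_LEN	64u

struct esi_attr {
	uint64_t id;
	uint64_t value;
	char name[ESI_STR_LEN + 1];		/* always NUL-terminated */
	char value_desc[ESI_STR_LEN + 1];	/* always NUL-terminated */
};

/* Assumption: multi-byte fields are stored big-endian. */
static uint64_t be64_at(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Copy attribute idx out of buf. Returns 0 on success, -1 if out of range. */
static int esi_get_attr(const uint8_t *buf, size_t len, uint64_t idx,
			struct esi_attr *out)
{
	uint64_t num, off;
	const uint8_t *p;

	if (len < ESI_HDR_SIZE)
		return -1;
	num = be64_at(buf);		/* number of attributes */
	off = be64_at(buf + 8);		/* array offset to the data */
	if (idx >= num)
		return -1;
	/* The whole entry must lie inside the buffer. */
	if (off > len || (len - off) / ESI_ATTR_SIZE <= idx)
		return -1;

	p = buf + off + idx * ESI_ATTR_SIZE;
	out->id = be64_at(p);
	out->value = be64_at(p + 8);
	memcpy(out->name, p + 16, ESI_STR_LEN);
	out->name[ESI_STR_LEN] = '\0';
	memcpy(out->value_desc, p + 16 + ESI_STR_LEN, ESI_STR_LEN);
	out->value_desc[ESI_STR_LEN] = '\0';
	return 0;
}
```

Reading the fields byte by byte avoids depending on struct packing or on
the host's endianness.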



[Bug 213079] [bisected] IRQ problems and crashes on a PowerMac G5 with 5.12.3

2021-07-23 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213079

--- Comment #15 from Erhard F. (erhar...@mailbox.org) ---
(In reply to Oliver O'Halloran from comment #13)
> In the meanwhile, can you try the patch above? That seems to fix bug which
> is causing MSIs to be unusable. I'm not 100% sure why that would matter,
> but it's possible the crashes are due to some other bug which doesn't appear
> when MSIs are in use.
Now I had time to test your patch on top of kernel 5.13-rc6 and 5.13.4. Can't
test it on top of 5.14-rc2 due to bug #213803.

Your patch seems to work fine and I don't get these "irq 63: nobody cared"
messages and crashes any longer! However now when building stuff the G5 sooner
or later crashes with:

[...]
Kernel panic - not syncing: corrupted stack end detected inside scheduler
Call Trace:
CPU: 1 PID: 2968 Comm: powerpc64-unkno Tainted: GW
5.13.0-rc6-PowerMacG5+ #2
[c000717178c0] [c05412d0] .dump_stack+0xe0/0x13c (unreliable)
[c00071717960] [c00681a0] .panic+0x168/0x430
[c00071717a10] [c0809ca0] .__schedule+0x80/0x840
[c00071717af0] [c00a0ea8] .do_task_dead+0x54/0x58
[c00071717b70] [c006e7b4] .do_exit+0xa14/0xa6c
[c00071717c60] [c006e89c] .do_group_exit+0x50/0xb0
[c00071717cf0] [c006e910] .__wake_up_parent+0x0/0x34
[c00071717d60] [c0021530] .system_call_exception+0x1b4/0x1ec
[c00071717e10] [c000b9c4] system_call_common+0xe4/0x214
--- interrupt: c00 at 0x3fffa8092aa8
NIP:  3fffa8092aa8 LR: 3fffa7ff2d04 CTR: 
REGS: c00071717e80 TRAP: 0c00   Tainted: GW 
(5.13.0-rc6-PowerMacG5+)
MSR:  9200f032   CR: 22000482  XER:

IRQMASK: 0 
GPR00: 00ea 3fffd04ef2a0 3fffa81b1300  
GPR04:     
GPR08:     
GPR12:  3fffa8318c30 00012e5ff800 0001136b53b0 
GPR16: 0001200cec38 3fffddea1c68 0001200ceb28 002f 
GPR20:  3fffa81abff8 0001 3fffa81aaa58 
GPR24:   0003 0001 
GPR28:  3fffa8311c50 f000  
NIP [3fffa8092aa8] 0x3fffa8092aa8
LR [3fffa7ff2d04] 0x3fffa7ff2d04
--- interrupt: c00
Rebooting in 120 seconds..


Don't know whether this is related. I'll throw more debugging stuff in, file
this as a separate issue and link it here just in case.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v2] Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"

2021-07-23 Thread Michael Ellerman
Will Deacon  writes:
> On Wed, 21 Jul 2021 17:02:13 +1000, Michael Ellerman wrote:
>> This reverts commit c742199a014de23ee92055c2473d91fe5561ffdf.
>> 
>> c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
>> breaks arm64 in at least two ways for configurations where PUD or PMD
>> folding occur:
>> 
>>   1. We no longer install huge-vmap mappings and silently fall back to
>>  page-granular entries, despite being able to install block entries
>>  at what is effectively the PGD level.
>> 
>> [...]
>
> Thank you Michael! I owe you a beer next time I see you, if we don't go
> extinct before then.

No worries, thanks to Christophe for identifying the solution while on
vacation!

Beers seem a long way off, but hopefully one day :)

cheers


[Bug 205303] Compilation for PPC64 fails on warning in watchdog.o

2021-07-23 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=205303

Michael Ellerman (mich...@ellerman.id.au) changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||mich...@ellerman.id.au
 Resolution|--- |CODE_FIX

--- Comment #3 from Michael Ellerman (mich...@ellerman.id.au) ---
This was fixed by:

4fe529449d85 ("powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration")


[PATCH 2/8] hvsi: don't panic on tty_register_driver failure

2021-07-23 Thread Jiri Slaby
The alloc_tty_driver failure is handled gracefully in hvsi_init. But
tty_register_driver is not. panic is called if that one fails.

So handle the failure of tty_register_driver gracefully too. This will
keep at least the console functional, as it was enabled earlier by
console_initcall in hvsi_console_init, instead of shooting down the
whole system.

This means, we disable interrupts and restore hvsi_wait back to
poll_for_state().
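
The unwinding described above follows the usual goto-based cleanup
pattern. Here is a minimal userspace sketch of it, with mocks in place of
the tty and IRQ calls; irq_held, polling, register_result,
mock_register_driver() and init_console() are all hypothetical names used
for illustration only.

```c
#include <assert.h>	/* only needed by callers that check the results */
#include <errno.h>

#define NPORTS 2

static int irq_held[NPORTS];	/* 1 while the mock IRQ is requested */
static int polling = 1;		/* 1: poll for state, 0: wait on IRQs */
static int register_result;	/* what the mock registration returns */

/* Mock for the registration step that may fail. */
static int mock_register_driver(void)
{
	return register_result;
}

static int init_console(void)
{
	int i, ret;

	for (i = 0; i < NPORTS; i++)
		irq_held[i] = 1;
	polling = 0;		/* irqs active now */

	ret = mock_register_driver();
	if (ret)
		goto err_free_irq;

	return 0;

err_free_irq:
	polling = 1;		/* back to polling, the console keeps working */
	for (i = 0; i < NPORTS; i++)
		irq_held[i] = 0;
	return ret;
}
```

On failure the function leaves the state exactly as it was before the
interrupts were requested and hands the error code back to the caller.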

Signed-off-by: Jiri Slaby 
Cc: linuxppc-dev@lists.ozlabs.org
---
 drivers/tty/hvc/hvsi.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/hvc/hvsi.c b/drivers/tty/hvc/hvsi.c
index bfc15279d5bc..f0bc8e780051 100644
--- a/drivers/tty/hvc/hvsi.c
+++ b/drivers/tty/hvc/hvsi.c
@@ -1038,7 +1038,7 @@ static const struct tty_operations hvsi_ops = {
 
 static int __init hvsi_init(void)
 {
-   int i;
+   int i, ret;
 
hvsi_driver = alloc_tty_driver(hvsi_count);
if (!hvsi_driver)
@@ -1069,12 +1069,25 @@ static int __init hvsi_init(void)
}
hvsi_wait = wait_for_state; /* irqs active now */
 
-   if (tty_register_driver(hvsi_driver))
-   panic("Couldn't register hvsi console driver\n");
+   ret = tty_register_driver(hvsi_driver);
+   if (ret) {
+   pr_err("Couldn't register hvsi console driver\n");
+   goto err_free_irq;
+   }
 
printk(KERN_DEBUG "HVSI: registered %i devices\n", hvsi_count);
 
return 0;
+err_free_irq:
+   hvsi_wait = poll_for_state;
+   for (i = 0; i < hvsi_count; i++) {
+   struct hvsi_struct *hp = &hvsi_ports[i];
+
+   free_irq(hp->virq, hp);
+   }
+   tty_driver_kref_put(hvsi_driver);
+
+   return ret;
 }
 device_initcall(hvsi_init);
 
-- 
2.32.0



Re: [PATCH 1/2] PCI/AER: Disable AER interrupt during suspend

2021-07-23 Thread Kai-Heng Feng
On Fri, Jul 23, 2021 at 1:24 PM Christoph Hellwig  wrote:
>
> On Thu, Jul 22, 2021 at 05:23:51PM -0500, Bjorn Helgaas wrote:
> > Marking both of these as "not applicable" for now because I don't
> > think we really understand what's going on.
> >
> > Apparently a DMA occurs during suspend or resume and triggers an ACS
> > violation.  I don't think such a DMA should occur in the first
> > place.
> >
> > Or maybe, since you say the problem happens right after ACS is enabled
> > during resume, we're doing the ACS enable incorrectly?  Although I
> > would think we should not be doing DMA at the same time we're enabling
> > ACS, either.
> >
> > If this really is a system firmware issue, both HP and Dell should
> > have the knowledge and equipment to figure out what's going on.
>
> DMA on resume sounds really odd.  OTOH the below mentioned case of
> a DMA during suspend seems very likely in some setups.  NVMe has the
> concept of a host memory buffer (HMB) that allows the PCIe device
> to use arbitrary host memory for internal purposes.  Combine this
> with the "Storage D3" misfeature in modern x86 platforms that force
> a slot into d3cold without consulting the driver first and you'd see
> symptoms like this.  Another case would be the NVMe equivalent of the
> AER which could lead to a completion without host activity.

The issue can also be observed on non-HMB NVMe.

>
> We now have quirks in the ACPI layer and NVMe to fully shut down the
> NVMe controllers on these messed up systems with the "Storage D3"
> misfeature which should avoid such "spurious" DMAs at the cost of
> wearing out the device much faster.

Since the issue is on S3, I think the NVMe always fully shuts down.

Kai-Heng


Re: [PATCH v7 1/1] powerpc/pseries: Interface to represent PAPR firmware attributes

2021-07-23 Thread kajoljain



On 7/23/21 11:16 AM, Pratik R. Sampat wrote:
> Adds a generic interface to represent the energy and frequency related
> PAPR attributes on the system using the new H_CALL
> "H_GET_ENERGY_SCALE_INFO".
> 
> H_GET_EM_PARMS H_CALL was previously responsible for exporting this
> information in the lparcfg, however the H_GET_EM_PARMS H_CALL
> will be deprecated from P10 onwards.
> 
> The H_GET_ENERGY_SCALE_INFO H_CALL is of the following call format:
> hcall(
>   uint64 H_GET_ENERGY_SCALE_INFO,  // Get energy scale info
>   uint64 flags,   // Per the flag request
>   uint64 firstAttributeId,// The attribute id
>   uint64 bufferAddress,   // Guest physical address of the output buffer
>   uint64 bufferSize   // The size in bytes of the output buffer
> );
> 
> This H_CALL can either query all the attributes at once with
> firstAttributeId = 0, flags = 0, or query only one attribute
> at a time with firstAttributeId = id, flags = 1.
> 
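
To make the two query modes concrete, here is a minimal sketch of how a
caller might pick the arguments. It is not taken from the patch: the struct
and the helper and macro names are hypothetical, and the flag values are the
literal 0 and 1 from the description above (how that 1 maps to a bit
position in the register is not stated here).

```c
#include <stdint.h>

/* Values as given in the description; the names are hypothetical. */
#define ESI_QUERY_FLAGS_ALL    0
#define ESI_QUERY_FLAGS_SINGLE 1

/* The two arguments that differ between the query modes. */
struct esi_query {
    uint64_t flags;
    uint64_t first_attr_id;
};

/* Query every attribute in one call. */
static struct esi_query esi_query_all(void)
{
    struct esi_query q = {
        .flags = ESI_QUERY_FLAGS_ALL,
        .first_attr_id = 0,
    };
    return q;
}

/* Query only the attribute with the given id. */
static struct esi_query esi_query_one(uint64_t id)
{
    struct esi_query q = {
        .flags = ESI_QUERY_FLAGS_SINGLE,
        .first_attr_id = id,
    };
    return q;
}
```
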
> The output buffer consists of the following:
> 1. number of attributes  - 8 bytes
> 2. array offset to the data location - 8 bytes
> 3. version info  - 1 byte
> 4. A data array of size num attributes, which contains the following:
>   a. attribute ID  - 8 bytes
>   b. attribute value in number - 8 bytes
>   c. attribute name in string  - 64 bytes
>   d. attribute value in string - 64 bytes
> 
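
For readers following along, the buffer layout listed above could be walked
roughly like this. This is a sketch under stated assumptions, not the code
from the patch: all names are hypothetical, the fields are assumed to be
packed back to back in big-endian order, and "array offset" is assumed to be
a byte offset from the start of the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ESI_STR_LEN   64                        /* name / string value */
#define ESI_HDR_SIZE  (8 + 8 + 1)               /* count, offset, version */
#define ESI_ATTR_SIZE (8 + 8 + 2 * ESI_STR_LEN) /* one array entry */

/* Decoded copy of one entry; strings get an extra byte for the NUL. */
struct esi_attr {
    uint64_t id;
    uint64_t value;
    char desc[ESI_STR_LEN + 1];
    char value_desc[ESI_STR_LEN + 1];
};

/* Read a 64-bit big-endian value (byte order is an assumption here). */
static uint64_t read_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Decode entry idx into out; returns 0 on success, -1 if out of range. */
static int esi_get_attr(const uint8_t *buf, size_t len, uint64_t idx,
                        struct esi_attr *out)
{
    uint64_t num, off;
    const uint8_t *p;

    if (len < ESI_HDR_SIZE)
        return -1;

    num = read_be64(buf);       /* 1. number of attributes */
    off = read_be64(buf + 8);   /* 2. array offset */

    /* Reject indexes past the count or past the end of the buffer. */
    if (idx >= num || off > len || idx >= (len - off) / ESI_ATTR_SIZE)
        return -1;

    p = buf + off + idx * ESI_ATTR_SIZE;
    out->id = read_be64(p);
    out->value = read_be64(p + 8);
    memcpy(out->desc, p + 16, ESI_STR_LEN);
    out->desc[ESI_STR_LEN] = '\0';
    memcpy(out->value_desc, p + 16 + ESI_STR_LEN, ESI_STR_LEN);
    out->value_desc[ESI_STR_LEN] = '\0';
    return 0;
}
```

Going through the array offset rather than assuming the array starts right
after the version byte keeps the walk valid if the header ever grows.
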
> The new H_CALL exports information in direct string value format, hence
> a new interface has been introduced in
> /sys/firmware/papr/energy_scale_info to export this information to
> userspace in an extensible pass-through format.
> 
> The H_CALL returns the name, numeric value and string value (if it exists).
> 
> The format of exposing the sysfs information is as follows:
> /sys/firmware/papr/energy_scale_info/
>|-- /
>  |-- desc
>  |-- value
>  |-- value_desc (if exists)
>|-- /
>  |-- desc
>  |-- value
>  |-- value_desc (if exists)
> ...
> 
> The energy information that is exported is useful for userspace tools
> such as powerpc-utils. Currently these tools infer the
> "power_mode_data" value in the lparcfg, which in turn is obtained from
> the to-be-deprecated H_GET_EM_PARMS H_CALL.
> On future platforms, such userspace utilities will have to look at the
> data returned from the new H_CALL being populated in this new sysfs
> interface and report this information directly without the need of
> interpretation.
> 

Patch looks good to me.
Reviewed-by: Kajol Jain 

Thanks,
Kajol Jain

> Signed-off-by: Pratik R. Sampat 
> Reviewed-by: Gautham R. Shenoy 
> ---
>  .../sysfs-firmware-papr-energy-scale-info |  26 ++
>  arch/powerpc/include/asm/hvcall.h |  24 +-
>  arch/powerpc/kvm/trace_hv.h   |   1 +
>  arch/powerpc/platforms/pseries/Makefile   |   3 +-
>  .../pseries/papr_platform_attributes.c| 312 ++
>  5 files changed, 364 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-firmware-papr-energy-scale-info
>  create mode 100644 arch/powerpc/platforms/pseries/papr_platform_attributes.c
> 
> diff --git a/Documentation/ABI/testing/sysfs-firmware-papr-energy-scale-info b/Documentation/ABI/testing/sysfs-firmware-papr-energy-scale-info
> new file mode 100644
> index ..139a576c7c9d
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-firmware-papr-energy-scale-info
> @@ -0,0 +1,26 @@
> +What:/sys/firmware/papr/energy_scale_info
> +Date:June 2021
> +Contact: Linux for PowerPC mailing list 
> +Description: Directory hosting a set of platform attributes like
> + energy/frequency on Linux running as a PAPR guest.
> +
> + Each file in a directory contains a platform
> + attribute hierarchy pertaining to performance/
> + energy-savings mode and processor frequency.
> +
> +What:/sys/firmware/papr/energy_scale_info/
> + /sys/firmware/papr/energy_scale_info//desc
> + /sys/firmware/papr/energy_scale_info//value
> + /sys/firmware/papr/energy_scale_info//value_desc
> +Date:June 2021
> +Contact: Linux for PowerPC mailing list 
> +Description: Energy, frequency attributes directory for POWERVM servers
> +
> + This directory provides energy, frequency, folding information. It
> + contains below sysfs attributes:
> +
> + - desc: String description of the attribute 
> +
> + - value: Numeric value of attribute 
> +
> + - value_desc: String value of attribute 
> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
> index e3b29eda8074..c91714ea6719 100644
> --- a/arch/powerpc/include/asm/hvcall.h
> +++ b/arch/powerpc/include/asm/hvcall.h
> @@ -316,7 +316,8 @@
>  #define H_SCM_PERFORMANCE_STATS 0x418
>  #define H_RPT_INVALIDATE 0x448
>  #define H_SCM_FLUSH