Re: [PATCH v4 1/1] PCI/ERR: don't clobber status after reset_link()

2021-02-08 Thread Hedi Berriche

On Mon, Jan 25, 2021 at 09:34 Kuppuswamy, Sathyanarayanan wrote:



On 1/8/21 2:30 PM, Bjorn Helgaas wrote:

Can we push this forward now?  There are several pending patches in
this area from Keith and Sathyanarayanan; I haven't gotten to them
yet, so not sure whether they help address any of this.


Following two patches should also address the same issue.

My patch:

https://patchwork.kernel.org/project/linux-pci/patch/6f63321637fef86b6cf0beebf98b987062f9e811.1610153755.git.sathyanarayanan.kuppusw...@linux.intel.com/


This series does *not* fix the problem for me.


Keith's patch:

https://patchwork.kernel.org/project/linux-pci/patch/20210104230300.1277180-4-kbu...@kernel.org/


Keith's series *does* fix the problem for me:

Acked-by: Hedi Berriche 
Tested-by: Hedi Berriche 

Cheers,
Hedi.




--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


--
Be careful of reading health books, you might die of a misprint.
-- Mark Twain


Re: [PATCH v4 1/1] PCI/ERR: don't clobber status after reset_link()

2020-11-12 Thread Hedi Berriche

On Mon, Nov 02, 2020 at 15:10 Hedi Berriche wrote:

Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
broke pcie_do_recovery(): updating status after reset_link() has the ill
side effect of causing recovery to fail if the error status is
PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following
code will *never* run in the case of a successful reset_link()

  177 if (status == PCI_ERS_RESULT_CAN_RECOVER) {
  ...
  181 }

  183 if (status == PCI_ERS_RESULT_NEED_RESET) {
  ...
  192 }

For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not
calling ->slot_reset() (because we skip report_slot_reset()) thus
breaking driver (re)initialisation.

Don't clobber status with the return value of reset_link(); set status
to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and
only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER.

Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche 

Reviewed-by: Sinan Kaya 
Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+
---
drivers/pci/pcie/err.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..2730826cfd8a 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -165,10 +165,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, );
-   status = reset_link(dev);
-   if (status != PCI_ERS_RESULT_RECOVERED) {
+   if (reset_link(dev) != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "link reset failed\n");
goto failed;
+   } else {
+   if (status == PCI_ERS_RESULT_DISCONNECT ||
+   status == PCI_ERS_RESULT_NO_AER_DRIVER)
+   status = PCI_ERS_RESULT_RECOVERED;
}
} else {
pci_walk_bus(bus, report_normal_detected, );
--
2.28.0


Bjorn,

Sorry to bug you, but could you please cast your eyes on this patch and let me 
know whether you have
any concerns that might be barring it from inclusion.

Cheers,
Hedi.
--
Be careful of reading health books, you might die of a misprint.
-- Mark Twain


[PATCH v4 0/1] PCI/ERR: fix regression introduced by 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")

2020-11-02 Thread Hedi Berriche
This is essentially a resend of v3 as it failed to get enough traction;
no code change, I only added Sinan Kaya's Reviewed-by.

- Changes since v3:
* added Sinan Kaya  Reviewed-by

- Changes since v2:
 * set status to PCI_ERS_RESULT_RECOVERED, in case of successful link
   reset, if and only if the initial value of error status is
   PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER.

- Changes since v1:
 * changed the commit message to clarify what broke post commit 6d2c89441571
 * dropped the misnomer post_reset_status variable in favour of a more natural
   approach that relies on a boolean to keep track of the outcome of 
reset_link()

After commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
pcie_do_recovery() no longer calls ->slot_reset() in the case of a successful
reset which breaks error recovery by breaking driver (re)initialisation.

Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 
Cc: Sinan Kaya 

Cc: sta...@kernel.org # v5.7+

---
Hedi Berriche (1):
  PCI/ERR: don't clobber status after reset_link()

 drivers/pci/pcie/err.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

-- 
2.28.0



[PATCH v4 1/1] PCI/ERR: don't clobber status after reset_link()

2020-11-02 Thread Hedi Berriche
Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
broke pcie_do_recovery(): updating status after reset_link() has the ill
side effect of causing recovery to fail if the error status is
PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following
code will *never* run in the case of a successful reset_link()

   177 if (status == PCI_ERS_RESULT_CAN_RECOVER) {
   ...
   181 }

   183 if (status == PCI_ERS_RESULT_NEED_RESET) {
   ...
   192 }

For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not
calling ->slot_reset() (because we skip report_slot_reset()) thus
breaking driver (re)initialisation.

Don't clobber status with the return value of reset_link(); set status
to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and
only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER.

Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche 

Reviewed-by: Sinan Kaya 
Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+
---
 drivers/pci/pcie/err.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..2730826cfd8a 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -165,10 +165,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, );
-   status = reset_link(dev);
-   if (status != PCI_ERS_RESULT_RECOVERED) {
+   if (reset_link(dev) != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "link reset failed\n");
goto failed;
+   } else {
+   if (status == PCI_ERS_RESULT_DISCONNECT ||
+   status == PCI_ERS_RESULT_NO_AER_DRIVER)
+   status = PCI_ERS_RESULT_RECOVERED;
}
} else {
pci_walk_bus(bus, report_normal_detected, );
-- 
2.28.0



Re: [RESEND PATCH v3 1/1] PCI/ERR: don't clobber status after reset_link()

2020-10-15 Thread Hedi Berriche

On Sun, Oct 11, 2020 at 18:56 Sinan Kaya wrote:

On 10/10/2020 6:16 PM, Hedi Berriche wrote:

Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
broke pcie_do_recovery(): updating status after reset_link() has the ill
side effect of causing recovery to fail if the error status is
PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following
code will *never* run in the case of a successful reset_link()

   177 if (status == PCI_ERS_RESULT_CAN_RECOVER) {
   ...
   181 }

   183 if (status == PCI_ERS_RESULT_NEED_RESET) {
   ...
   192 }

For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not
calling ->slot_reset() (because we skip report_slot_reset()) thus
breaking driver (re)initialisation.

Don't clobber status with the return value of reset_link(); set status
to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and
only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER.

Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche 
Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+
---
 drivers/pci/pcie/err.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..2730826cfd8a 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -165,10 +165,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, );
-   status = reset_link(dev);
-   if (status != PCI_ERS_RESULT_RECOVERED) {
+   if (reset_link(dev) != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "link reset failed\n");
goto failed;
+   } else {
+   if (status == PCI_ERS_RESULT_DISCONNECT ||
+   status == PCI_ERS_RESULT_NO_AER_DRIVER)
+   status = PCI_ERS_RESULT_RECOVERED;
}
} else {
pci_walk_bus(bus, report_normal_detected, );



Reviewed-by: Sinan Kaya 


Thanks for the Reviewed-by, Sinan.

Folks,

Any further comments or is this ripe for acceptance?

Cheers,
Hedi.
--
Be careful of reading health books, you might die of a misprint.
-- Mark Twain


[RESEND PATCH v3 0/1] PCI/ERR: fix regression introduced by 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")

2020-10-10 Thread Hedi Berriche
This is a resend of v3 as the the original, sent over 6 hours ago, is yet
to make it to LKML.

- Changes since v2:

 * set status to PCI_ERS_RESULT_RECOVERED, in case of successful link
   reset, if and only if the initial value of error status is
   PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER.

- Changes since v1:

 * changed the commit message to clarify what broke post commit 6d2c89441571
 * dropped the misnomer post_reset_status variable in favour of a more natural
   approach that relies on a boolean to keep track of the outcome of 
reset_link()

After commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
pcie_do_recovery() no longer calls ->slot_reset() in the case of a successful
reset which breaks error recovery by breaking driver (re)initialisation.

Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+

---
Hedi Berriche (1):
  PCI/ERR: don't clobber status after reset_link()

 drivers/pci/pcie/err.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

-- 
2.28.0



[RESEND PATCH v3 1/1] PCI/ERR: don't clobber status after reset_link()

2020-10-10 Thread Hedi Berriche
Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
broke pcie_do_recovery(): updating status after reset_link() has the ill
side effect of causing recovery to fail if the error status is
PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following
code will *never* run in the case of a successful reset_link()

   177 if (status == PCI_ERS_RESULT_CAN_RECOVER) {
   ...
   181 }

   183 if (status == PCI_ERS_RESULT_NEED_RESET) {
   ...
   192 }

For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not
calling ->slot_reset() (because we skip report_slot_reset()) thus
breaking driver (re)initialisation.

Don't clobber status with the return value of reset_link(); set status
to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and
only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER.

Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche 
Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+
---
 drivers/pci/pcie/err.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..2730826cfd8a 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -165,10 +165,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, );
-   status = reset_link(dev);
-   if (status != PCI_ERS_RESULT_RECOVERED) {
+   if (reset_link(dev) != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "link reset failed\n");
goto failed;
+   } else {
+   if (status == PCI_ERS_RESULT_DISCONNECT ||
+   status == PCI_ERS_RESULT_NO_AER_DRIVER)
+   status = PCI_ERS_RESULT_RECOVERED;
}
} else {
pci_walk_bus(bus, report_normal_detected, );
-- 
2.28.0



[PATCH v3 0/1] PCI/ERR: fix regression introduced by 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")

2020-10-10 Thread Hedi Berriche
- Changes since v2:

 * set status to PCI_ERS_RESULT_RECOVERED, in case of successful link
   reset, if and only if the initial value of error status is
   PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER.

- Changes since v1:

 * changed the commit message to clarify what broke post commit 6d2c89441571
 * dropped the misnomer post_reset_status variable in favour of a more natural
   approach that relies on a boolean to keep track of the outcome of 
reset_link()

After commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
pcie_do_recovery() no longer calls ->slot_reset() in the case of a successful
reset which breaks error recovery by breaking driver (re)initialisation.

Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+

---
Hedi Berriche (1):
  PCI/ERR: don't clobber status after reset_link()

 drivers/pci/pcie/err.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

-- 
2.28.0



[PATCH v3 1/1] PCI/ERR: don't clobber status after reset_link()

2020-10-10 Thread Hedi Berriche
Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
broke pcie_do_recovery(): updating status after reset_link() has the ill
side effect of causing recovery to fail if the error status is
PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following
code will *never* run in the case of a successful reset_link()

   177 if (status == PCI_ERS_RESULT_CAN_RECOVER) {
   ...
   181 }

   183 if (status == PCI_ERS_RESULT_NEED_RESET) {
   ...
   192 }

For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not
calling ->slot_reset() (because we skip report_slot_reset()) thus
breaking driver (re)initialisation.

Don't clobber status with the return value of reset_link(); set status
to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and
only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER.

Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche 
Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+
---
 drivers/pci/pcie/err.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..2730826cfd8a 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -165,10 +165,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, );
-   status = reset_link(dev);
-   if (status != PCI_ERS_RESULT_RECOVERED) {
+   if (reset_link(dev) != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "link reset failed\n");
goto failed;
+   } else {
+   if (status == PCI_ERS_RESULT_DISCONNECT ||
+   status == PCI_ERS_RESULT_NO_AER_DRIVER)
+   status = PCI_ERS_RESULT_RECOVERED;
}
} else {
pci_walk_bus(bus, report_normal_detected, );
-- 
2.28.0



[PATCH v2 1/1] PCI/ERR: don't clobber status after reset_link()

2020-10-09 Thread Hedi Berriche
Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
broke pcie_do_recovery(): updating status after reset_link() has the ill
side effect of causing recovery to fail if the error status is
PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following
code will *never* run in the case of a successful reset_link()

   177 if (status == PCI_ERS_RESULT_CAN_RECOVER) {
   ...
   181 }

   183 if (status == PCI_ERS_RESULT_NEED_RESET) {
   ...
   192 }

For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not
calling ->slot_reset() (because we skip report_slot_reset()) thus
breaking driver (re)initialisation.

Don't clobber status after reset_link(), use a boolean instead to track
the outcome of reset_link().

Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche 
Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+
---
 drivers/pci/pcie/err.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..b4bfa87fc49d 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -152,6 +152,7 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 {
pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
struct pci_bus *bus;
+   bool reset_failed = false;
 
/*
 * Error recovery runs on all subordinates of the first downstream port.
@@ -165,8 +166,8 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, );
-   status = reset_link(dev);
-   if (status != PCI_ERS_RESULT_RECOVERED) {
+   reset_failed = (reset_link(dev) != PCI_ERS_RESULT_RECOVERED);
+   if (reset_failed) {
pci_warn(dev, "link reset failed\n");
goto failed;
}
@@ -174,6 +175,12 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_walk_bus(bus, report_normal_detected, );
}
 
+   if ((status == PCI_ERS_RESULT_DISCONNECT ||
+status == PCI_ERS_RESULT_NO_AER_DRIVER) &&
+!reset_failed) {
+   status = PCI_ERS_RESULT_RECOVERED;
+   }
+
if (status == PCI_ERS_RESULT_CAN_RECOVER) {
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast mmio_enabled message\n");
-- 
2.28.0



[PATCH v2 0/1] PCI/ERR: fix regression introduced by 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")

2020-10-09 Thread Hedi Berriche
- Changes since v1:

 * changed the commit message to clarify what broke post commit 6d2c89441571
 * dropped the misnomer post_reset_status variable in favour of a more natural
   approach that relies on a boolean to keep track of the outcome of 
reset_link()

After commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
pcie_do_recovery() no longer calls ->slot_reset() in the case of a successful
reset which breaks error recovery by breaking driver (re)initialisation.

Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+

---
Hedi Berriche (1):
  PCI/ERR: don't clobber status after reset_link()

 drivers/pci/pcie/err.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

-- 
2.28.0



Re: [PATCH v1 1/1] PCI/ERR: don't clobber status after reset_link()

2020-10-08 Thread Hedi Berriche

On Fri, Oct 09, 2020 at 05:09 Hedi Berriche wrote:

On Fri, Oct 09, 2020 at 04:46 Raj, Ashok wrote:

Hi Ashok,

Thanks for looking into this.


On Fri, Oct 09, 2020 at 03:52:51AM +0100, Hedi Berriche wrote:

Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
changed pcie_do_recovery() so that status is updated with the return
value from reset_link(); this was to fix the problem where we would
wrongly report recovery failure, despite a successful reset_link(),
whenever the initial error status is PCI_ERS_RESULT_DISCONNECT or
PCI_ERS_RESULT_NO_AER_DRIVER.

Unfortunately this breaks the flow of pcie_do_recovery() as it prevents


What is the reference to "this breaks" above?


The code change introduced by commit 6d2c89441571; would

   "this code change" instead of "this breaks"

work better? If not, I can also rephrase the whole paragraph along the 
following lines:

Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()") breaks 
the flow
of pcie_do_recovery() as it prevents the actions needed when the initial error 
is
PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET from taking place which 
causes
error recovery to fail.

... and do away with the first paragraph.


the actions needed when the initial error is PCI_ERS_RESULT_CAN_RECOVER
or PCI_ERS_RESULT_NEED_RESET from taking place which causes error
recovery to fail.

Don't clobber status after reset_link() to restore the intended flow in
pcie_do_recovery().

Fix the original problem by saving the return value from reset_link()
and use it later on to decide whether error recovery should be deemed
successful in the scenarios where the initial error status is
PCI_ERS_RESULT_{DISCONNECT,NO_AER_DRIVER}.


I would rather rephrase the above to make it clear what is being proposed.
Since the description seems to talk about the old problem and new solution
all mixed up.


OK; will do that to clarify that what's being proposed here is:

   1. fix the regression introduced by commit 6d2c89441571
   2. address the problem that commit 6d2c89441571 aimed to fix


Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche 
Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Keith Busch 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+
---
drivers/pci/pcie/err.c | 13 ++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..dbd0b56bd6c1 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -150,7 +150,7 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_channel_state_t state,
pci_ers_result_t (*reset_link)(struct pci_dev *pdev))
{
-   pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
+   pci_ers_result_t post_reset_status, status = PCI_ERS_RESULT_CAN_RECOVER;


why call it post_reset_status?


Perhaps post_reset_status is not a great choice; would reset_result or 
reset_link_result be better?


... or just do this with a boolean instead as I had it in an earlier iteration 
of the patch before I
eventually opted to use an pci_ers_result_t.

Cheers,
Hedi.


Cheers,
Hedi.




struct pci_bus *bus;

/*
@@ -165,8 +165,8 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, );
-   status = reset_link(dev);
-   if (status != PCI_ERS_RESULT_RECOVERED) {
+   post_reset_status = reset_link(dev);
+   if (post_reset_status != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "link reset failed\n");
goto failed;
}
@@ -174,6 +174,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_walk_bus(bus, report_normal_detected, );
}

+   if ((status == PCI_ERS_RESULT_DISCONNECT ||
+status == PCI_ERS_RESULT_NO_AER_DRIVER) &&
+post_reset_status == PCI_ERS_RESULT_RECOVERED) {
+   /* error recovery succeeded thanks to reset_link() */
+   status = PCI_ERS_RESULT_RECOVERED;
+   }
+
if (status == PCI_ERS_RESULT_CAN_RECOVER) {
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast mmio_enabled message\n");
--
2.28.0



--
Be careful of reading health books, you might die of a misprint.
-- Mark Twain


--
Be careful of reading health books, you might die of a misprint.
-- Mark Twain


Re: [PATCH v1 1/1] PCI/ERR: don't clobber status after reset_link()

2020-10-08 Thread Hedi Berriche

On Fri, Oct 09, 2020 at 04:46 Raj, Ashok wrote:

Hi Ashok,

Thanks for looking into this.


On Fri, Oct 09, 2020 at 03:52:51AM +0100, Hedi Berriche wrote:

Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
changed pcie_do_recovery() so that status is updated with the return
value from reset_link(); this was to fix the problem where we would
wrongly report recovery failure, despite a successful reset_link(),
whenever the initial error status is PCI_ERS_RESULT_DISCONNECT or
PCI_ERS_RESULT_NO_AER_DRIVER.

Unfortunately this breaks the flow of pcie_do_recovery() as it prevents


What is the reference to "this breaks" above?


The code change introduced by commit 6d2c89441571; would

"this code change" instead of "this breaks"

work better? If not, I can also rephrase the whole paragraph along the 
following lines:

Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()") breaks 
the flow
of pcie_do_recovery() as it prevents the actions needed when the initial error 
is
PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET from taking place which 
causes
error recovery to fail.

... and do away with the first paragraph.


the actions needed when the initial error is PCI_ERS_RESULT_CAN_RECOVER
or PCI_ERS_RESULT_NEED_RESET from taking place which causes error
recovery to fail.

Don't clobber status after reset_link() to restore the intended flow in
pcie_do_recovery().

Fix the original problem by saving the return value from reset_link()
and use it later on to decide whether error recovery should be deemed
successful in the scenarios where the initial error status is
PCI_ERS_RESULT_{DISCONNECT,NO_AER_DRIVER}.


I would rather rephrase the above to make it clear what is being proposed.
Since the description seems to talk about the old problem and new solution
all mixed up.


OK; will do that to clarify that what's being proposed here is:

1. fix the regression introduced by commit 6d2c89441571
2. address the problem that commit 6d2c89441571 aimed to fix


Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche 
Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Keith Busch 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+
---
 drivers/pci/pcie/err.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..dbd0b56bd6c1 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -150,7 +150,7 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_channel_state_t state,
pci_ers_result_t (*reset_link)(struct pci_dev *pdev))
 {
-   pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
+   pci_ers_result_t post_reset_status, status = PCI_ERS_RESULT_CAN_RECOVER;


why call it post_reset_status?


Perhaps post_reset_status is not a great choice; would reset_result or 
reset_link_result be better?

Cheers,
Hedi.




struct pci_bus *bus;

/*
@@ -165,8 +165,8 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, );
-   status = reset_link(dev);
-   if (status != PCI_ERS_RESULT_RECOVERED) {
+   post_reset_status = reset_link(dev);
+   if (post_reset_status != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "link reset failed\n");
goto failed;
}
@@ -174,6 +174,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_walk_bus(bus, report_normal_detected, );
}

+   if ((status == PCI_ERS_RESULT_DISCONNECT ||
+status == PCI_ERS_RESULT_NO_AER_DRIVER) &&
+post_reset_status == PCI_ERS_RESULT_RECOVERED) {
+   /* error recovery succeeded thanks to reset_link() */
+   status = PCI_ERS_RESULT_RECOVERED;
+   }
+
if (status == PCI_ERS_RESULT_CAN_RECOVER) {
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast mmio_enabled message\n");
--
2.28.0



--
Be careful of reading health books, you might die of a misprint.
-- Mark Twain


[PATCH v1 1/1] PCI/ERR: don't clobber status after reset_link()

2020-10-08 Thread Hedi Berriche
Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
changed pcie_do_recovery() so that status is updated with the return
value from reset_link(); this was to fix the problem where we would
wrongly report recovery failure, despite a successful reset_link(),
whenever the initial error status is PCI_ERS_RESULT_DISCONNECT or
PCI_ERS_RESULT_NO_AER_DRIVER.

Unfortunately this breaks the flow of pcie_do_recovery() as it prevents
the actions needed when the initial error is PCI_ERS_RESULT_CAN_RECOVER
or PCI_ERS_RESULT_NEED_RESET from taking place which causes error
recovery to fail.

Don't clobber status after reset_link() to restore the intended flow in
pcie_do_recovery().

Fix the original problem by saving the return value from reset_link()
and use it later on to decide whether error recovery should be deemed
successful in the scenarios where the initial error status is
PCI_ERS_RESULT_{DISCONNECT,NO_AER_DRIVER}.

Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche 
Cc: Russ Anderson 
Cc: Kuppuswamy Sathyanarayanan 
Cc: Bjorn Helgaas 
Cc: Ashok Raj 
Cc: Keith Busch 
Cc: Joerg Roedel 

Cc: sta...@kernel.org # v5.7+
---
 drivers/pci/pcie/err.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..dbd0b56bd6c1 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -150,7 +150,7 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_channel_state_t state,
pci_ers_result_t (*reset_link)(struct pci_dev *pdev))
 {
-   pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
+   pci_ers_result_t post_reset_status, status = PCI_ERS_RESULT_CAN_RECOVER;
struct pci_bus *bus;
 
/*
@@ -165,8 +165,8 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, );
-   status = reset_link(dev);
-   if (status != PCI_ERS_RESULT_RECOVERED) {
+   post_reset_status = reset_link(dev);
+   if (post_reset_status != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "link reset failed\n");
goto failed;
}
@@ -174,6 +174,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_walk_bus(bus, report_normal_detected, );
}
 
+   if ((status == PCI_ERS_RESULT_DISCONNECT ||
+status == PCI_ERS_RESULT_NO_AER_DRIVER) &&
+post_reset_status == PCI_ERS_RESULT_RECOVERED) {
+   /* error recovery succeeded thanks to reset_link() */
+   status = PCI_ERS_RESULT_RECOVERED;
+   }
+
if (status == PCI_ERS_RESULT_CAN_RECOVER) {
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast mmio_enabled message\n");
-- 
2.28.0



[Patch] ACPICA: Increase AE_OWNER_ID_LIMIT to 2047

2019-05-30 Thread Hedi Berriche
32 sockets systems with 192 NVDIMMs run into the ACPI_OWNER_ID_MAX limit
which is currently set to 255, and nfit kernel module initialisation fails
with the following representative error messages:

ACPI Error: Could not allocate new OwnerId (255 max), AE_OWNER_ID_LIMIT 
(20170303/utownerid-149
ACPI Error: Method parse/execution failed [\_SB.NVDR.N031.PCDR] (Node 
9e2fffd8e280), AE_OWNER_ID_LIMIT (20170303/psparse-543)
ACPI Error: Method parse/execution failed [\_SB.NVDR.N031.CR05] (Node 
9547ffd91bb8), AE_OWNER_ID_LIMIT (20170303/psparse-543)
ACPI Error: Method parse/execution failed [\_SB.NVDR.N031.CRLD] (Node 
8e99ffd92550), AE_OWNER_ID_LIMIT (20170303/psparse-543)
ACPI Error: Method parse/execution failed [\_SB.NVDR.N031._DSM] (Node 
adc5ffd90e88), AE_OWNER_ID_LIMIT (20170303/psparse-543)
ACPI: \_SB_.NVDR.N031: failed to evaluate _DSM (0x1b)

Further debugging shows that, on such a system, we end up using 1020 owner IDs,
hence I'm suggesting that we bump ACPI_OWNER_ID_MAX up to 2047.

Signed-off-by: Hedi Berriche 
Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Frank Ramsay 
Cc: Robert Moore 
Cc: Erik Schmauss 
Cc: Rafael J. Wysocki 
---
 drivers/acpi/acpica/utownerid.c | 6 +++---
 include/acpi/acconfig.h | 4 ++--
 include/acpi/actypes.h  | 4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/acpi/acpica/utownerid.c b/drivers/acpi/acpica/utownerid.c
index 5eb8b76ce9d8..c015a2c147d9 100644
--- a/drivers/acpi/acpica/utownerid.c
+++ b/drivers/acpi/acpica/utownerid.c
@@ -88,7 +88,7 @@ acpi_status acpi_ut_allocate_owner_id(acpi_owner_id *owner_id)
/*
 * Construct encoded ID from the index and bit 
position
 *
-* Note: Last [j].k (bit 255) is never used and 
is marked
+* Note: Last [j].k (bit 2047) is never used 
and is marked
 * permanently allocated (prevents +1 overflow)
 */
*owner_id =
@@ -116,7 +116,7 @@ acpi_status acpi_ut_allocate_owner_id(acpi_owner_id 
*owner_id)
 */
status = AE_OWNER_ID_LIMIT;
ACPI_ERROR((AE_INFO,
-   "Could not allocate new OwnerId (255 max), 
AE_OWNER_ID_LIMIT"));
+   "Could not allocate new OwnerId (2047 max), 
AE_OWNER_ID_LIMIT"));
 
 exit:
(void)acpi_ut_release_mutex(ACPI_MTX_CACHES);
@@ -133,7 +133,7 @@ acpi_status acpi_ut_allocate_owner_id(acpi_owner_id 
*owner_id)
  *  control method or unloading a table. Either way, we would
  *  ignore any error anyway.
  *
- * DESCRIPTION: Release a table or method owner ID. Valid IDs are 1 - 255
+ * DESCRIPTION: Release a table or method owner ID. Valid IDs are 1 - 2047
  *
  
**/
 
diff --git a/include/acpi/acconfig.h b/include/acpi/acconfig.h
index 16a83959e616..536fe9a81cb7 100644
--- a/include/acpi/acconfig.h
+++ b/include/acpi/acconfig.h
@@ -95,9 +95,9 @@
 
 #define ACPI_DEFAULT_PAGE_SIZE  4096   /* Must be power of 2 */
 
-/* owner_id tracking. 8 entries allows for 255 owner_ids */
+/* owner_id tracking. 64 entries allow for 2047 owner_ids */
 
-#define ACPI_NUM_OWNERID_MASKS  8
+#define ACPI_NUM_OWNERID_MASKS  64
 
 /* Size of the root table array is increased by this increment */
 
diff --git a/include/acpi/actypes.h b/include/acpi/actypes.h
index ad6892a24015..f32a4d49ea13 100644
--- a/include/acpi/actypes.h
+++ b/include/acpi/actypes.h
@@ -442,8 +442,8 @@ typedef void *acpi_handle;  /* Actually a ptr to a NS Node 
*/
 
 /* Owner IDs are used to track namespace nodes for selective deletion */
 
-typedef u8 acpi_owner_id;
-#define ACPI_OWNER_ID_MAX   0xFF
+typedef u16 acpi_owner_id;
+#define ACPI_OWNER_ID_MAX   0x7FF
 
 #define ACPI_INTEGER_BIT_SIZE   64
 #define ACPI_MAX_DECIMAL_DIGITS 20 /* 2^64 = 
18,446,744,073,709,551,616 */
-- 
2.20.1



[tip:x86/uv] x86/platform/UV: Use efi_enabled() instead of test_bit()

2019-02-15 Thread tip-bot for Hedi Berriche
Commit-ID:  8945d96f7b3ead56e053ac79b8f7b0de98a30bfe
Gitweb: https://git.kernel.org/tip/8945d96f7b3ead56e053ac79b8f7b0de98a30bfe
Author: Hedi Berriche 
AuthorDate: Wed, 13 Feb 2019 19:34:12 +
Committer:  Borislav Petkov 
CommitDate: Fri, 15 Feb 2019 15:15:18 +0100

x86/platform/UV: Use efi_enabled() instead of test_bit()

Use ad hoc efi_enabled() instead of fiddling with test_bit().

Cleanup, no functional changes.

Signed-off-by: Hedi Berriche 
Signed-off-by: Borislav Petkov 
Reviewed-by: Ard Biesheuvel 
Reviewed-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Mike Travis 
Cc: Andy Shevchenko 
Cc: Bhupesh Sharma 
Cc: Darren Hart 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-efi 
Cc: platform-driver-...@vger.kernel.org
Cc: Steve Wahl 
Cc: Thomas Gleixner 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190213193413.25560-4-hedi.berri...@hpe.com
---
 arch/x86/platform/uv/bios_uv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 91e3d5285836..38a2e3431fc6 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -44,7 +44,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 * If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
 * callback method, which uses efi_call() directly, with the kernel 
page tables:
 */
-   if (unlikely(test_bit(EFI_OLD_MEMMAP, )))
+   if (unlikely(efi_enabled(EFI_OLD_MEMMAP)))
ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
else
ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);


[tip:x86/uv] x86/platform/UV: Remove unnecessary #ifdef CONFIG_EFI

2019-02-15 Thread tip-bot for Hedi Berriche
Commit-ID:  30ad3e031d2feae075cd5fd2c443baa2d86c0195
Gitweb: https://git.kernel.org/tip/30ad3e031d2feae075cd5fd2c443baa2d86c0195
Author: Hedi Berriche 
AuthorDate: Wed, 13 Feb 2019 19:34:10 +
Committer:  Borislav Petkov 
CommitDate: Fri, 15 Feb 2019 15:05:15 +0100

x86/platform/UV: Remove unnecessary #ifdef CONFIG_EFI

CONFIG_EFI is implied by CONFIG_X86_UV and x86/platform/uv/bios_uv.c
requires the latter, get rid of the redundant #ifdef CONFIG_EFI
directives.

Cleanup, no functional changes.

Signed-off-by: Hedi Berriche 
Signed-off-by: Borislav Petkov 
Reviewed-by: Ard Biesheuvel 
Reviewed-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Mike Travis 
Cc: Andy Shevchenko 
Cc: Bhupesh Sharma 
Cc: Darren Hart 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-efi 
Cc: platform-driver-...@vger.kernel.org
Cc: Steve Wahl 
Cc: Thomas Gleixner 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190213193413.25560-2-hedi.berri...@hpe.com
---
 arch/x86/include/asm/uv/bios.h | 4 
 arch/x86/platform/uv/bios_uv.c | 2 --
 2 files changed, 6 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index e652a7cc6186..00d862cfbcbe 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -151,11 +151,7 @@ extern s64 uv_bios_change_memprotect(u64, u64, enum 
uv_memprotect);
 extern s64 uv_bios_reserved_page_pa(u64, u64 *, u64 *, u64 *);
 extern int uv_bios_set_legacy_vga_target(bool decode, int domain, int bus);
 
-#ifdef CONFIG_EFI
 extern void uv_bios_init(void);
-#else
-void uv_bios_init(void) { }
-#endif
 
 extern unsigned long sn_rtc_cycles_per_second;
 extern int uv_type;
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 4a6a5a26c582..4a61ed2a7bb8 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -188,7 +188,6 @@ int uv_bios_set_legacy_vga_target(bool decode, int domain, 
int bus)
 }
 EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
 
-#ifdef CONFIG_EFI
 void uv_bios_init(void)
 {
uv_systab = NULL;
@@ -218,4 +217,3 @@ void uv_bios_init(void)
}
pr_info("UV: UVsystab: Revision:%x\n", uv_systab->revision);
 }
-#endif


[tip:x86/uv] x86/platform/UV: Remove uv_bios_call_reentrant()

2019-02-15 Thread tip-bot for Hedi Berriche
Commit-ID:  f816525d615fff0336d0c881e5d960bbec5ea016
Gitweb: https://git.kernel.org/tip/f816525d615fff0336d0c881e5d960bbec5ea016
Author: Hedi Berriche 
AuthorDate: Wed, 13 Feb 2019 19:34:11 +
Committer:  Borislav Petkov 
CommitDate: Fri, 15 Feb 2019 15:13:48 +0100

x86/platform/UV: Remove uv_bios_call_reentrant()

uv_bios_call_reentrant() has no callers nor is it exported, remove it.

Cleanup, no functional changes.

Signed-off-by: Hedi Berriche 
Signed-off-by: Borislav Petkov 
Reviewed-by: Ard Biesheuvel 
Reviewed-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Mike Travis 
Cc: Andy Shevchenko 
Cc: Bhupesh Sharma 
Cc: Darren Hart 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-efi 
Cc: platform-driver-...@vger.kernel.org
Cc: Steve Wahl 
Cc: Thomas Gleixner 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190213193413.25560-3-hedi.berri...@hpe.com
---
 arch/x86/include/asm/uv/bios.h |  1 -
 arch/x86/platform/uv/bios_uv.c | 12 
 2 files changed, 13 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index 00d862cfbcbe..8c6ac271b5b3 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -140,7 +140,6 @@ enum uv_memprotect {
  */
 extern s64 uv_bios_call(enum uv_bios_cmd, u64, u64, u64, u64, u64);
 extern s64 uv_bios_call_irqsave(enum uv_bios_cmd, u64, u64, u64, u64, u64);
-extern s64 uv_bios_call_reentrant(enum uv_bios_cmd, u64, u64, u64, u64, u64);
 
 extern s64 uv_bios_get_sn_info(int, int *, long *, long *, long *, long *);
 extern s64 uv_bios_freq_base(u64, u64 *);
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 4a61ed2a7bb8..91e3d5285836 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -66,18 +66,6 @@ s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, u64 
a2, u64 a3,
return ret;
 }
 
-s64 uv_bios_call_reentrant(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
-   u64 a4, u64 a5)
-{
-   s64 ret;
-
-   preempt_disable();
-   ret = uv_bios_call(which, a1, a2, a3, a4, a5);
-   preempt_enable();
-
-   return ret;
-}
-
 
 long sn_partition_id;
 EXPORT_SYMBOL_GPL(sn_partition_id);


[tip:x86/urgent] x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls

2019-02-15 Thread tip-bot for Hedi Berriche
Commit-ID:  f331e766c4be33f4338574f3c9f7f77e98ab4571
Gitweb: https://git.kernel.org/tip/f331e766c4be33f4338574f3c9f7f77e98ab4571
Author: Hedi Berriche 
AuthorDate: Wed, 13 Feb 2019 19:34:13 +
Committer:  Borislav Petkov 
CommitDate: Fri, 15 Feb 2019 15:19:56 +0100

x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls

Calls into UV firmware must be protected against concurrency, expose the
efi_runtime_lock to the UV platform, and use it to serialise UV BIOS
calls.

Signed-off-by: Hedi Berriche 
Signed-off-by: Borislav Petkov 
Reviewed-by: Ard Biesheuvel 
Reviewed-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Mike Travis 
Cc: Andy Shevchenko 
Cc: Bhupesh Sharma 
Cc: Darren Hart 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-efi 
Cc: platform-driver-...@vger.kernel.org
Cc: sta...@vger.kernel.org # v4.9+
Cc: Steve Wahl 
Cc: Thomas Gleixner 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190213193413.25560-5-hedi.berri...@hpe.com
---
 arch/x86/include/asm/uv/bios.h  |  8 +++-
 arch/x86/platform/uv/bios_uv.c  | 23 +--
 drivers/firmware/efi/runtime-wrappers.c |  7 +++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index e652a7cc6186..3f697a9e3f59 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -48,7 +48,8 @@ enum {
BIOS_STATUS_SUCCESS =  0,
BIOS_STATUS_UNIMPLEMENTED   = -ENOSYS,
BIOS_STATUS_EINVAL  = -EINVAL,
-   BIOS_STATUS_UNAVAIL = -EBUSY
+   BIOS_STATUS_UNAVAIL = -EBUSY,
+   BIOS_STATUS_ABORT   = -EINTR,
 };
 
 /* Address map parameters */
@@ -167,4 +168,9 @@ extern long system_serial_number;
 
 extern struct kobject *sgi_uv_kobj;/* /sys/firmware/sgi_uv */
 
+/*
+ * EFI runtime lock; cf. firmware/efi/runtime-wrappers.c for details
+ */
+extern struct semaphore __efi_uv_runtime_lock;
+
 #endif /* _ASM_X86_UV_BIOS_H */
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 4a6a5a26c582..eb33432f2f24 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -29,7 +29,8 @@
 
 struct uv_systab *uv_systab;
 
-s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+static s64 __uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
+   u64 a4, u64 a5)
 {
struct uv_systab *tab = uv_systab;
s64 ret;
@@ -51,6 +52,19 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 
return ret;
 }
+
+s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+{
+   s64 ret;
+
+   if (down_interruptible(&__efi_uv_runtime_lock))
+   return BIOS_STATUS_ABORT;
+
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
+   up(&__efi_uv_runtime_lock);
+
+   return ret;
+}
 EXPORT_SYMBOL_GPL(uv_bios_call);
 
 s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
@@ -59,10 +73,15 @@ s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, 
u64 a2, u64 a3,
unsigned long bios_flags;
s64 ret;
 
+   if (down_interruptible(&__efi_uv_runtime_lock))
+   return BIOS_STATUS_ABORT;
+
local_irq_save(bios_flags);
-   ret = uv_bios_call(which, a1, a2, a3, a4, a5);
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
local_irq_restore(bios_flags);
 
+   up(&__efi_uv_runtime_lock);
+
return ret;
 }
 
diff --git a/drivers/firmware/efi/runtime-wrappers.c 
b/drivers/firmware/efi/runtime-wrappers.c
index 8903b9ccfc2b..e2abfdb5cee6 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -146,6 +146,13 @@ void efi_call_virt_check_flags(unsigned long flags, const 
char *call)
  */
 static DEFINE_SEMAPHORE(efi_runtime_lock);
 
+/*
+ * Expose the EFI runtime lock to the UV platform
+ */
+#ifdef CONFIG_X86_UV
+extern struct semaphore __efi_uv_runtime_lock __alias(efi_runtime_lock);
+#endif
+
 /*
  * Calls the appropriate efi_runtime_service() with the appropriate
  * arguments.


[Patch v3 4/4] x86/platform/UV: use efi_runtime_lock to serialise BIOS calls

2019-02-13 Thread Hedi Berriche
Calls into UV firmware must be protected against concurrency, expose the
efi_runtime_lock to the UV platform, and use it to serialise UV BIOS calls.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org # v4.9+
Signed-off-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/bios.h  |  8 +++-
 arch/x86/platform/uv/bios_uv.c  | 23 +--
 drivers/firmware/efi/runtime-wrappers.c |  7 +++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index 8c6ac271b5b3..8cfccc3cbbf4 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -48,7 +48,8 @@ enum {
BIOS_STATUS_SUCCESS =  0,
BIOS_STATUS_UNIMPLEMENTED   = -ENOSYS,
BIOS_STATUS_EINVAL  = -EINVAL,
-   BIOS_STATUS_UNAVAIL = -EBUSY
+   BIOS_STATUS_UNAVAIL = -EBUSY,
+   BIOS_STATUS_ABORT   = -EINTR,
 };
 
 /* Address map parameters */
@@ -162,4 +163,9 @@ extern long system_serial_number;
 
 extern struct kobject *sgi_uv_kobj;/* /sys/firmware/sgi_uv */
 
+/*
+ * EFI runtime lock; cf. firmware/efi/runtime-wrappers.c for details
+ */
+extern struct semaphore __efi_uv_runtime_lock;
+
 #endif /* _ASM_X86_UV_BIOS_H */
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 38a2e3431fc6..ef60d789c76e 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -29,7 +29,8 @@
 
 struct uv_systab *uv_systab;
 
-s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+static s64 __uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
+   u64 a4, u64 a5)
 {
struct uv_systab *tab = uv_systab;
s64 ret;
@@ -51,6 +52,19 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 
return ret;
 }
+
+s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+{
+   s64 ret;
+
+   if (down_interruptible(&__efi_uv_runtime_lock))
+   return BIOS_STATUS_ABORT;
+
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
+   up(&__efi_uv_runtime_lock);
+
+   return ret;
+}
 EXPORT_SYMBOL_GPL(uv_bios_call);
 
 s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
@@ -59,10 +73,15 @@ s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, 
u64 a2, u64 a3,
unsigned long bios_flags;
s64 ret;
 
+   if (down_interruptible(&__efi_uv_runtime_lock))
+   return BIOS_STATUS_ABORT;
+
local_irq_save(bios_flags);
-   ret = uv_bios_call(which, a1, a2, a3, a4, a5);
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
local_irq_restore(bios_flags);
 
+   up(&__efi_uv_runtime_lock);
+
return ret;
 }
 
diff --git a/drivers/firmware/efi/runtime-wrappers.c 
b/drivers/firmware/efi/runtime-wrappers.c
index 8903b9ccfc2b..e2abfdb5cee6 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -146,6 +146,13 @@ void efi_call_virt_check_flags(unsigned long flags, const 
char *call)
  */
 static DEFINE_SEMAPHORE(efi_runtime_lock);
 
+/*
+ * Expose the EFI runtime lock to the UV platform
+ */
+#ifdef CONFIG_X86_UV
+extern struct semaphore __efi_uv_runtime_lock __alias(efi_runtime_lock);
+#endif
+
 /*
  * Calls the appropriate efi_runtime_service() with the appropriate
  * arguments.
-- 
2.20.1



[Patch v3 2/4] x86/platform/UV: kill uv_bios_call_reentrant()

2019-02-13 Thread Hedi Berriche
uv_bios_call_reentrant() has no callers nor is it exported, kill it.

Cleanup, no functional changes.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org # v4.9+
Signed-off-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/bios.h |  1 -
 arch/x86/platform/uv/bios_uv.c | 12 
 2 files changed, 13 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index 00d862cfbcbe..8c6ac271b5b3 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -140,7 +140,6 @@ enum uv_memprotect {
  */
 extern s64 uv_bios_call(enum uv_bios_cmd, u64, u64, u64, u64, u64);
 extern s64 uv_bios_call_irqsave(enum uv_bios_cmd, u64, u64, u64, u64, u64);
-extern s64 uv_bios_call_reentrant(enum uv_bios_cmd, u64, u64, u64, u64, u64);
 
 extern s64 uv_bios_get_sn_info(int, int *, long *, long *, long *, long *);
 extern s64 uv_bios_freq_base(u64, u64 *);
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 4a61ed2a7bb8..91e3d5285836 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -66,18 +66,6 @@ s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, u64 
a2, u64 a3,
return ret;
 }
 
-s64 uv_bios_call_reentrant(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
-   u64 a4, u64 a5)
-{
-   s64 ret;
-
-   preempt_disable();
-   ret = uv_bios_call(which, a1, a2, a3, a4, a5);
-   preempt_enable();
-
-   return ret;
-}
-
 
 long sn_partition_id;
 EXPORT_SYMBOL_GPL(sn_partition_id);
-- 
2.20.1



[Patch v3 1/4] x86/platform/UV: remove unnecessary #ifdef CONFIG_EFI

2019-02-13 Thread Hedi Berriche
CONFIG_EFI is implied by CONFIG_X86_UV and x86/platform/uv/bios_uv.c
requires the latter, get rid of the redundant #ifdef CONFIG_EFI directives.

Cleanup, no functional changes.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org # v4.9+
Signed-off-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/bios.h | 4 
 arch/x86/platform/uv/bios_uv.c | 2 --
 2 files changed, 6 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index e652a7cc6186..00d862cfbcbe 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -151,11 +151,7 @@ extern s64 uv_bios_change_memprotect(u64, u64, enum 
uv_memprotect);
 extern s64 uv_bios_reserved_page_pa(u64, u64 *, u64 *, u64 *);
 extern int uv_bios_set_legacy_vga_target(bool decode, int domain, int bus);
 
-#ifdef CONFIG_EFI
 extern void uv_bios_init(void);
-#else
-void uv_bios_init(void) { }
-#endif
 
 extern unsigned long sn_rtc_cycles_per_second;
 extern int uv_type;
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 4a6a5a26c582..4a61ed2a7bb8 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -188,7 +188,6 @@ int uv_bios_set_legacy_vga_target(bool decode, int domain, 
int bus)
 }
 EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
 
-#ifdef CONFIG_EFI
 void uv_bios_init(void)
 {
uv_systab = NULL;
@@ -218,4 +217,3 @@ void uv_bios_init(void)
}
pr_info("UV: UVsystab: Revision:%x\n", uv_systab->revision);
 }
-#endif
-- 
2.20.1



[Patch v3 0/4] Protect against concurrent calls into UV BIOS

2019-02-13 Thread Hedi Berriche
- Changes since v2
  Addressed comments from Ard Biesheuvel:
 * expose efi_runtime_lock to UV platform only instead of globally
 * remove unnecessary #ifdef CONFIG_EFI from bios_uv.c

- Changes since v1:
  Addressed comments from Bhupesh Sharma, Thomas Gleixner, and Ard Biesheuvel:
 * made __uv_bios_call() static
 * moved the efi_enabled() cleanup to its own patchlet
 * explained the reason for renaming the efi_runtime_lock semaphore
 * dropped the reviewed-bys as they should be given on the mailing list
 * Cc'ng sta...@vger.kernel.org given the nature of the problem addressed by 
the patches

---

Calls into UV BIOS were not being serialised which is wrong as it violates EFI
runtime rules, and bad as it does result in all sorts of potentially hard to
track down hangs and panics when efi_scratch.prev_mm gets clobbered whenever
efi_switch_mm() gets called concurrently from two different CPUs.

Patch #1 removes an unnecessary #ifdef CONFIG_EFI guard from bios_uv.c.

Patch #2 removes uv_bios_call_reentrant() because it's dead code.

Patch #3 is a cleanup that drops test_bit() in favour of the ad hoc 
efi_enabled().

Patch #4 makes uv_bios_call() variants use the efi_runtime_lock semaphore to
protect against concurrency.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org # v4.9+

Hedi Berriche (4):
  x86/platform/UV: remove unnecessary #ifdef CONFIG_EFI
  x86/platform/UV: kill uv_bios_call_reentrant() as it has no callers
  x86/platform/UV: use efi_enabled() instead of test_bit()
  x86/platform/UV: use efi_runtime_lock to serialise BIOS calls

 arch/x86/include/asm/uv/bios.h  | 13 -
 arch/x86/platform/uv/bios_uv.c  | 35 ++---
 drivers/firmware/efi/runtime-wrappers.c |  7 +
 3 files changed, 34 insertions(+), 21 deletions(-)

-- 
2.20.1



[Patch v3 3/4] x86/platform/UV: use efi_enabled() instead of test_bit()

2019-02-13 Thread Hedi Berriche
Use ad hoc efi_enabled() instead of fiddling with test_bit().

Cleanup, no functional changes.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org # v4.9+
Signed-off-by: Hedi Berriche 
---
 arch/x86/platform/uv/bios_uv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 91e3d5285836..38a2e3431fc6 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -44,7 +44,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 * If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
 * callback method, which uses efi_call() directly, with the kernel 
page tables:
 */
-   if (unlikely(test_bit(EFI_OLD_MEMMAP, )))
+   if (unlikely(efi_enabled(EFI_OLD_MEMMAP)))
ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
else
ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
-- 
2.20.1



Re: [Patch v2 1/4] efi/x86: turn EFI runtime semaphore into a global lock

2019-02-12 Thread Hedi Berriche

On Tue, Feb 12, 2019 at 17:25 Hedi Berriche wrote:

On Thu, Feb 07, 2019 at 17:38 Hedi Berriche wrote:

On Thu, Feb 07, 2019 at 16:05 Ard Biesheuvel wrote:

On Thu, 7 Feb 2019 at 05:23, Hedi Berriche  wrote:


Make efi_runtime_lock semaphore global so that it can be used by EFI
runtime callers that may be defined outside efi/runtime-wrappers.c.

Also now that efi_runtime_lock semaphore is no longer static, rename it
to efi_runtime_sem so it doesn't clash with the efi_runtime_lock spinlock
defined in arch/x86/platform/efi/efi_64.c when CONFIG_EFI_MIXED is
enabled.

The immediate motivation of this change is to serialise UV platform BIOS
calls using the efi_runtime_sem semaphore.

No functional changes.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org
Signed-off-by: Hedi Berriche 


Hello Hedi,


Hi Ard,


Same feedback as on v1: please don't rename the lock.


With the efi_runtime_lock semaphore being no longer static, not renaming it
will cause a compile failure as it will clash with the declaration of the
efi_runtime_lock *spinlock* in arch/x86/platform/efi/efi_64.c in the
CONFIG_EFI_MIXED case.


Ard,

Any comments on whether resolving the conflict between

{efi_runtime_lock, drivers/firmware/efi/runtime-wrappers.c}

and
{efi_runtime_lock, arch/x86/platform/efi/efi_64.c}

provides the required justification for renaming the efi_runtime_lock semaphore?


Ard,

Apologies for sending this email which must have come across as absurd given
the *second* email you sent on 2019-02-07 19:38:42.

The trouble is that I *never* received that email (nor the one you sent today
2019-02-12 17:26:06 as reply to this one) because for some reason my email
address was dropped from the distribution list.

It's only about 30 minutes ago that a colleague brought it up to my attention
and forwarded both emails:

Thu,  7 Feb 2019 20:38:42 +0100
Tue, 12 Feb 2019 18:26:06 +0100

No idea how my email address got dropped but I wanted to make sure to explain
the seemingly absurd email I sent today as well as the lack of comment on your
earlier email.

Cheers,
Hedi.

---
drivers/firmware/efi/runtime-wrappers.c | 60 -
include/linux/efi.h |  3 ++
2 files changed, 33 insertions(+), 30 deletions(-)

diff --git a/drivers/firmware/efi/runtime-wrappers.c 
b/drivers/firmware/efi/runtime-wrappers.c
index 8903b9ccfc2b..ec60d6227925 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -53,7 +53,7 @@ struct efi_runtime_work efi_rts_work;
* @rts_arg<1-5>:  efi_runtime_service() function arguments
*
* Accesses to efi_runtime_services() are serialized by a binary
- * semaphore (efi_runtime_lock) and caller waits until the work is
+ * semaphore (efi_runtime_sem) and caller waits until the work is
* finished, hence _only_ one work is queued at a time and the caller
* thread waits for completion.
*/
@@ -144,7 +144,7 @@ void efi_call_virt_check_flags(unsigned long flags, const 
char *call)
* none of the remaining functions are actually ever called at runtime.
* So let's just use a single lock to serialize all Runtime Services calls.
*/
-static DEFINE_SEMAPHORE(efi_runtime_lock);
+DEFINE_SEMAPHORE(efi_runtime_sem);

/*
* Calls the appropriate efi_runtime_service() with the appropriate
@@ -233,10 +233,10 @@ static efi_status_t virt_efi_get_time(efi_time_t *tm, 
efi_time_cap_t *tc)
{
  efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
  return EFI_ABORTED;
  status = efi_queue_work(GET_TIME, tm, tc, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
  return status;
}

@@ -244,10 +244,10 @@ static efi_status_t virt_efi_set_time(efi_time_t *tm)
{
  efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
  return EFI_ABORTED;
  status = efi_queue_work(SET_TIME, tm, NULL, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
  return status;
}

@@ -257,11 +257,11 @@ static efi_status_t virt_efi_get_wakeup_time(efi_bool_t 
*enabled,
{
  efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
  return EFI_ABORTED;
  status = efi_queue_work(GET_WAKEUP_TIME, enabled, pending, tm, NULL,
  NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
  return status;
}

@@ -269,11 +269,11 @@ static efi_status_t virt_efi_set_wakeup_time(efi_bool_t 
enabled, efi_time_t *tm)
{
  efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
  return EFI_ABORTED;
  status = efi_queue_work(SET_WAKEUP_TIME, , tm, NULL, NULL,
  NULL);
-   up(_runtime_lock);
+   

Re: [Patch v2 1/4] efi/x86: turn EFI runtime semaphore into a global lock

2019-02-12 Thread Hedi Berriche

On Thu, Feb 07, 2019 at 17:38 Hedi Berriche wrote:

On Thu, Feb 07, 2019 at 16:05 Ard Biesheuvel wrote:

On Thu, 7 Feb 2019 at 05:23, Hedi Berriche  wrote:


Make efi_runtime_lock semaphore global so that it can be used by EFI
runtime callers that may be defined outside efi/runtime-wrappers.c.

Also now that efi_runtime_lock semaphore is no longer static, rename it
to efi_runtime_sem so it doesn't clash with the efi_runtime_lock spinlock
defined in arch/x86/platform/efi/efi_64.c when CONFIG_EFI_MIXED is
enabled.

The immediate motivation of this change is to serialise UV platform BIOS
calls using the efi_runtime_sem semaphore.

No functional changes.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org
Signed-off-by: Hedi Berriche 


Hello Hedi,


Hi Ard,


Same feedback as on v1: please don't rename the lock.


With the efi_runtime_lock semaphore being no longer static, not renaming it
will cause a compile failure as it will clash with the declaration of the
efi_runtime_lock *spinlock* in arch/x86/platform/efi/efi_64.c in the
CONFIG_EFI_MIXED case.


Ard,

Any comments on whether resolving the conflict between

{efi_runtime_lock, drivers/firmware/efi/runtime-wrappers.c}

and
{efi_runtime_lock, arch/x86/platform/efi/efi_64.c}

provides the required justification for renaming the efi_runtime_lock semaphore?

Cheers,
Hedi.


---
drivers/firmware/efi/runtime-wrappers.c | 60 -
include/linux/efi.h |  3 ++
2 files changed, 33 insertions(+), 30 deletions(-)

diff --git a/drivers/firmware/efi/runtime-wrappers.c 
b/drivers/firmware/efi/runtime-wrappers.c
index 8903b9ccfc2b..ec60d6227925 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -53,7 +53,7 @@ struct efi_runtime_work efi_rts_work;
 * @rts_arg<1-5>:  efi_runtime_service() function arguments
 *
 * Accesses to efi_runtime_services() are serialized by a binary
- * semaphore (efi_runtime_lock) and caller waits until the work is
+ * semaphore (efi_runtime_sem) and caller waits until the work is
 * finished, hence _only_ one work is queued at a time and the caller
 * thread waits for completion.
 */
@@ -144,7 +144,7 @@ void efi_call_virt_check_flags(unsigned long flags, const 
char *call)
 * none of the remaining functions are actually ever called at runtime.
 * So let's just use a single lock to serialize all Runtime Services calls.
 */
-static DEFINE_SEMAPHORE(efi_runtime_lock);
+DEFINE_SEMAPHORE(efi_runtime_sem);

/*
 * Calls the appropriate efi_runtime_service() with the appropriate
@@ -233,10 +233,10 @@ static efi_status_t virt_efi_get_time(efi_time_t *tm, 
efi_time_cap_t *tc)
{
   efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
   return EFI_ABORTED;
   status = efi_queue_work(GET_TIME, tm, tc, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
   return status;
}

@@ -244,10 +244,10 @@ static efi_status_t virt_efi_set_time(efi_time_t *tm)
{
   efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
   return EFI_ABORTED;
   status = efi_queue_work(SET_TIME, tm, NULL, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
   return status;
}

@@ -257,11 +257,11 @@ static efi_status_t virt_efi_get_wakeup_time(efi_bool_t 
*enabled,
{
   efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
   return EFI_ABORTED;
   status = efi_queue_work(GET_WAKEUP_TIME, enabled, pending, tm, NULL,
   NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
   return status;
}

@@ -269,11 +269,11 @@ static efi_status_t virt_efi_set_wakeup_time(efi_bool_t 
enabled, efi_time_t *tm)
{
   efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
   return EFI_ABORTED;
   status = efi_queue_work(SET_WAKEUP_TIME, , tm, NULL, NULL,
   NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
   return status;
}

@@ -285,11 +285,11 @@ static efi_status_t virt_efi_get_variable(efi_char16_t 
*name,
{
   efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
   return EFI_ABORTED;
   status = efi_queue_work(GET_VARIABLE, name, vendor, attr, data_size,
   data);
-   up(_runtime_lock);
+   up(_runtime_sem);
   return status;
}

@@ -299,11 +299,11 @@ static efi_status_t virt_efi_get_next_variable(unsigned 
long *name_size,
{
   efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
   return EFI_A

Re: [Patch v2 1/4] efi/x86: turn EFI runtime semaphore into a global lock

2019-02-07 Thread Hedi Berriche

On Thu, Feb 07, 2019 at 16:05 Ard Biesheuvel wrote:

On Thu, 7 Feb 2019 at 05:23, Hedi Berriche  wrote:


Make efi_runtime_lock semaphore global so that it can be used by EFI
runtime callers that may be defined outside efi/runtime-wrappers.c.

Also now that efi_runtime_lock semaphore is no longer static, rename it
to efi_runtime_sem so it doesn't clash with the efi_runtime_lock spinlock
defined in arch/x86/platform/efi/efi_64.c when CONFIG_EFI_MIXED is
enabled.

The immediate motivation of this change is to serialise UV platform BIOS
calls using the efi_runtime_sem semaphore.

No functional changes.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org
Signed-off-by: Hedi Berriche 


Hello Hedi,


Hi Ard,


Same feedback as on v1: please don't rename the lock.


With the efi_runtime_lock semaphore being no longer static, not renaming it
will cause a compile failure as it will clash with the declaration of the
efi_runtime_lock *spinlock* in arch/x86/platform/efi/efi_64.c in the
CONFIG_EFI_MIXED case.


---
 drivers/firmware/efi/runtime-wrappers.c | 60 -
 include/linux/efi.h |  3 ++
 2 files changed, 33 insertions(+), 30 deletions(-)

diff --git a/drivers/firmware/efi/runtime-wrappers.c 
b/drivers/firmware/efi/runtime-wrappers.c
index 8903b9ccfc2b..ec60d6227925 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -53,7 +53,7 @@ struct efi_runtime_work efi_rts_work;
  * @rts_arg<1-5>:  efi_runtime_service() function arguments
  *
  * Accesses to efi_runtime_services() are serialized by a binary
- * semaphore (efi_runtime_lock) and caller waits until the work is
+ * semaphore (efi_runtime_sem) and caller waits until the work is
  * finished, hence _only_ one work is queued at a time and the caller
  * thread waits for completion.
  */
@@ -144,7 +144,7 @@ void efi_call_virt_check_flags(unsigned long flags, const 
char *call)
  * none of the remaining functions are actually ever called at runtime.
  * So let's just use a single lock to serialize all Runtime Services calls.
  */
-static DEFINE_SEMAPHORE(efi_runtime_lock);
+DEFINE_SEMAPHORE(efi_runtime_sem);

 /*
  * Calls the appropriate efi_runtime_service() with the appropriate
@@ -233,10 +233,10 @@ static efi_status_t virt_efi_get_time(efi_time_t *tm, 
efi_time_cap_t *tc)
 {
efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_TIME, tm, tc, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }

@@ -244,10 +244,10 @@ static efi_status_t virt_efi_set_time(efi_time_t *tm)
 {
efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(SET_TIME, tm, NULL, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }

@@ -257,11 +257,11 @@ static efi_status_t virt_efi_get_wakeup_time(efi_bool_t 
*enabled,
 {
efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_WAKEUP_TIME, enabled, pending, tm, NULL,
NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }

@@ -269,11 +269,11 @@ static efi_status_t virt_efi_set_wakeup_time(efi_bool_t 
enabled, efi_time_t *tm)
 {
efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(SET_WAKEUP_TIME, , tm, NULL, NULL,
NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }

@@ -285,11 +285,11 @@ static efi_status_t virt_efi_get_variable(efi_char16_t 
*name,
 {
efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_VARIABLE, name, vendor, attr, data_size,
data);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }

@@ -299,11 +299,11 @@ static efi_status_t virt_efi_get_next_variable(unsigned 
long *name_size,
 {
efi_status_t status;

-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_NEXT_VARIABLE, name_size, name, vendor,
NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }

@@ -315,11 +315,11 @@ static efi_status_t virt_efi_set_variable(efi_char16_t 

[Patch v2 3/4] x86/platform/UV: use efi_enabled() instead of test_bit()

2019-02-07 Thread Hedi Berriche
Use ad hoc efi_enabled() instead of fiddling with test_bit().

Cleanup, no functional changes.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org
Signed-off-by: Hedi Berriche 
---
 arch/x86/platform/uv/bios_uv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index cd05af157763..8bace0ca9e57 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -44,7 +44,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 * If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
 * callback method, which uses efi_call() directly, with the kernel 
page tables:
 */
-   if (unlikely(test_bit(EFI_OLD_MEMMAP, )))
+   if (unlikely(efi_enabled(EFI_OLD_MEMMAP)))
ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
else
ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
-- 
2.20.1



[Patch v2 2/4] x86/platform/UV: kill uv_bios_call_reentrant() as it has no callers

2019-02-06 Thread Hedi Berriche
uv_bios_call_reentrant() has no callers nor is it exported, kill it.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org
Signed-off-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/bios.h |  1 -
 arch/x86/platform/uv/bios_uv.c | 12 
 2 files changed, 13 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index e652a7cc6186..4eee646544b2 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -140,7 +140,6 @@ enum uv_memprotect {
  */
 extern s64 uv_bios_call(enum uv_bios_cmd, u64, u64, u64, u64, u64);
 extern s64 uv_bios_call_irqsave(enum uv_bios_cmd, u64, u64, u64, u64, u64);
-extern s64 uv_bios_call_reentrant(enum uv_bios_cmd, u64, u64, u64, u64, u64);
 
 extern s64 uv_bios_get_sn_info(int, int *, long *, long *, long *, long *);
 extern s64 uv_bios_freq_base(u64, u64 *);
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 4a6a5a26c582..cd05af157763 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -66,18 +66,6 @@ s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, u64 
a2, u64 a3,
return ret;
 }
 
-s64 uv_bios_call_reentrant(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
-   u64 a4, u64 a5)
-{
-   s64 ret;
-
-   preempt_disable();
-   ret = uv_bios_call(which, a1, a2, a3, a4, a5);
-   preempt_enable();
-
-   return ret;
-}
-
 
 long sn_partition_id;
 EXPORT_SYMBOL_GPL(sn_partition_id);
-- 
2.20.1



[Patch v2 0/4] Protect against concurrent calls into UV BIOS

2019-02-06 Thread Hedi Berriche
Changes since v1:

Addressed comments from Bhupesh Sharma, Thomas Gleixner, and Ard Biesheuvel:

 * made __uv_bios_call() static
 * moved the efi_enabled() cleanup to its own patchlet
 * explained the reason for renaming the efi_runtime_lock semaphore
 * dropped the reviewed-bys as they should be given on the mailing list
 * Cc'ng sta...@vger.kernel.org given the nature of the problem addressed by 
the patches


---

Calls into UV BIOS were not being serialised which is wrong as it violates EFI
runtime rules, and bad as it does result in all sorts of potentially hard to
track down hangs and panics when efi_scratch.prev_mm gets clobbered whenever
efi_switch_mm() gets called concurrently from two different CPUs.

Patch #1 makes the efi_runtime_lock semaphore visible to EFI runtime callers
defined outside drivers/firmware/efi/runtime-wrappers.c in preparation for
using it to serialise calls into UV BIOS; the lock is also renamed to
efi_runtime_sem so that it can coexist with the efi_runtime_lock spinlock
defined in arch/x86/platform/efi/efi_64.c when CONFIG_EFI_MIXED is enabled.

Patch #2 removes uv_bios_call_reentrant() because it's dead code.

Patch #3 is a cleanup that drops test_bit() in favour of the ad hoc 
efi_enabled().

Patch #4 makes uv_bios_call() variants use efi_runtime_sem to protect against
concurrency.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org

Hedi Berriche (4):
  efi/x86: turn EFI runtime semaphore into a global lock
  x86/platform/UV: kill uv_bios_call_reentrant() as it has no callers
  x86/platform/UV: use efi_enabled() instead of test_bit()
  x86/platform/UV: use efi_runtime_sem to serialise BIOS calls

 arch/x86/include/asm/uv/bios.h  |  4 +-
 arch/x86/platform/uv/bios_uv.c  | 33 --
 drivers/firmware/efi/runtime-wrappers.c | 60 -
 include/linux/efi.h |  3 ++
 4 files changed, 55 insertions(+), 45 deletions(-)

-- 
2.20.1



[Patch v2 4/4] x86/platform/UV: use efi_runtime_sem to serialise BIOS calls

2019-02-06 Thread Hedi Berriche
Calls into UV firmware must be protected against concurrency, use the
now visible efi_runtime_sem lock to serialise them.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org
Signed-off-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/bios.h |  3 ++-
 arch/x86/platform/uv/bios_uv.c | 23 +--
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index 4eee646544b2..581e2978a16c 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -48,7 +48,8 @@ enum {
BIOS_STATUS_SUCCESS =  0,
BIOS_STATUS_UNIMPLEMENTED   = -ENOSYS,
BIOS_STATUS_EINVAL  = -EINVAL,
-   BIOS_STATUS_UNAVAIL = -EBUSY
+   BIOS_STATUS_UNAVAIL = -EBUSY,
+   BIOS_STATUS_ABORT   = -EINTR,
 };
 
 /* Address map parameters */
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 8bace0ca9e57..e779b2a128ea 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -29,7 +29,8 @@
 
 struct uv_systab *uv_systab;
 
-s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+static s64 __uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
+   u64 a4, u64 a5)
 {
struct uv_systab *tab = uv_systab;
s64 ret;
@@ -51,6 +52,19 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 
return ret;
 }
+
+s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+{
+   s64 ret;
+
+   if (down_interruptible(_runtime_sem))
+   return BIOS_STATUS_ABORT;
+
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
+   up(_runtime_sem);
+
+   return ret;
+}
 EXPORT_SYMBOL_GPL(uv_bios_call);
 
 s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
@@ -59,10 +73,15 @@ s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, 
u64 a2, u64 a3,
unsigned long bios_flags;
s64 ret;
 
+   if (down_interruptible(_runtime_sem))
+   return BIOS_STATUS_ABORT;
+
local_irq_save(bios_flags);
-   ret = uv_bios_call(which, a1, a2, a3, a4, a5);
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
local_irq_restore(bios_flags);
 
+   up(_runtime_sem);
+
return ret;
 }
 
-- 
2.20.1



[Patch v2 1/4] efi/x86: turn EFI runtime semaphore into a global lock

2019-02-06 Thread Hedi Berriche
Make efi_runtime_lock semaphore global so that it can be used by EFI
runtime callers that may be defined outside efi/runtime-wrappers.c.

Also now that efi_runtime_lock semaphore is no longer static, rename it
to efi_runtime_sem so it doesn't clash with the efi_runtime_lock spinlock
defined in arch/x86/platform/efi/efi_64.c when CONFIG_EFI_MIXED is
enabled.

The immediate motivation of this change is to serialise UV platform BIOS
calls using the efi_runtime_sem semaphore.

No functional changes.

Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Dimitri Sivanich 
Cc: Steve Wahl 
Cc: sta...@vger.kernel.org
Signed-off-by: Hedi Berriche 
---
 drivers/firmware/efi/runtime-wrappers.c | 60 -
 include/linux/efi.h |  3 ++
 2 files changed, 33 insertions(+), 30 deletions(-)

diff --git a/drivers/firmware/efi/runtime-wrappers.c 
b/drivers/firmware/efi/runtime-wrappers.c
index 8903b9ccfc2b..ec60d6227925 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -53,7 +53,7 @@ struct efi_runtime_work efi_rts_work;
  * @rts_arg<1-5>:  efi_runtime_service() function arguments
  *
  * Accesses to efi_runtime_services() are serialized by a binary
- * semaphore (efi_runtime_lock) and caller waits until the work is
+ * semaphore (efi_runtime_sem) and caller waits until the work is
  * finished, hence _only_ one work is queued at a time and the caller
  * thread waits for completion.
  */
@@ -144,7 +144,7 @@ void efi_call_virt_check_flags(unsigned long flags, const 
char *call)
  * none of the remaining functions are actually ever called at runtime.
  * So let's just use a single lock to serialize all Runtime Services calls.
  */
-static DEFINE_SEMAPHORE(efi_runtime_lock);
+DEFINE_SEMAPHORE(efi_runtime_sem);
 
 /*
  * Calls the appropriate efi_runtime_service() with the appropriate
@@ -233,10 +233,10 @@ static efi_status_t virt_efi_get_time(efi_time_t *tm, 
efi_time_cap_t *tc)
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_TIME, tm, tc, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -244,10 +244,10 @@ static efi_status_t virt_efi_set_time(efi_time_t *tm)
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(SET_TIME, tm, NULL, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -257,11 +257,11 @@ static efi_status_t virt_efi_get_wakeup_time(efi_bool_t 
*enabled,
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_WAKEUP_TIME, enabled, pending, tm, NULL,
NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -269,11 +269,11 @@ static efi_status_t virt_efi_set_wakeup_time(efi_bool_t 
enabled, efi_time_t *tm)
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(SET_WAKEUP_TIME, , tm, NULL, NULL,
NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -285,11 +285,11 @@ static efi_status_t virt_efi_get_variable(efi_char16_t 
*name,
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_VARIABLE, name, vendor, attr, data_size,
data);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -299,11 +299,11 @@ static efi_status_t virt_efi_get_next_variable(unsigned 
long *name_size,
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_NEXT_VARIABLE, name_size, name, vendor,
NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -315,11 +315,11 @@ static efi_status_t virt_efi_set_variable(efi_char16_t 
*name,
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(SET_VARIABLE, name, vendor, , _size,
data);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -330,12 +330,12 @@ virt_efi_set_variable_nonblocking(efi_char16_t

Re: [PATCH 1/3] efi/x86: turn EFI runtime semaphore into a global lock

2019-01-15 Thread Hedi Berriche

On Tue, Jan 15, 2019 at 18:55 Thomas Gleixner wrote:

On Wed, 9 Jan 2019, Hedi Berriche wrote:


Make efi_runtime_lock semaphore global so that it can be used by EFI
runtime callers that may be defined outside efi/runtime-wrappers.c.


The changelog should mention why the lock is renamed.


OK; will add to V2 that it was renamed so that it doesn't clash with the
efi_runtime_lock spinlock defined by CONFIG_EFI_MIXED.


I have no strong opinion, but to apply that I need an Ack from Ard.


OK; thanks for the feedback.

Cheers,
Hedi.
--
Be careful of reading health books, you might die of a misprint.
-- Mark Twain


Re: [PATCH 3/3] x86/platform/UV: use efi_runtime_sem to serialise BIOS calls

2019-01-14 Thread Hedi Berriche

On Thu, Jan 10, 2019 at 07:14 Bhupesh Sharma wrote:

Hi Hedi,


Hi Bhupesh,


Thanks for the patchset.


Thanks for looking at it.

I will give this a go on my sgi-uv300 machine and come back with more 
detailed inputs,


and for testing it.

but I wanted to ask about the hang/panic you 
mentioned in the cover letter when efi_scratch gets clobbered. Can you 
describe the same (for e.g. how to reproduce this).


When efi_switch_mm() gets called concurrently from two different CPUs
--via arch_efi_call_virt_setup()-- due to lack of serialisation in
uv_bios_call(), efi_scratch.prev_mm is overwritten and that's how all
hell breaks loose, and that's when you see either a hang (the more
frequent failure mode) or a panic.

In order to reproduce the problem you'd need, for example, a kernel
module that makes use of uv_bios_call(), in which case a test case
would be a loop with:

- 2 concurrent tasks both invoking uv_bios_call()

or
- 2 concurrent tasks
- one invoking uv_bios_call()
- one, for example, accessing an EFI vars via efivars


Nitpicks below:


On 01/09/2019 04:15 PM, Hedi Berriche wrote:

Calls into UV firmware must be protected against concurrency, use the
now visible efi_runtime_sem lock to serialise them.

Signed-off-by: Hedi Berriche 
Reviewed-by: Russ Anderson 
Reviewed-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Steve Wahl 
---
 arch/x86/include/asm/uv/bios.h |  3 ++-
 arch/x86/platform/uv/bios_uv.c | 25 ++---
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index 4eee646544b2..33e94aa0b1ff 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -48,7 +48,8 @@ enum {
BIOS_STATUS_SUCCESS =  0,
BIOS_STATUS_UNIMPLEMENTED   = -ENOSYS,
BIOS_STATUS_EINVAL  = -EINVAL,
-   BIOS_STATUS_UNAVAIL = -EBUSY
+   BIOS_STATUS_UNAVAIL = -EBUSY,
+   BIOS_STATUS_ABORT   = -EINTR
 };
 /* Address map parameters */
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index cd05af157763..92f960798e20 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -29,7 +29,8 @@
 struct uv_systab *uv_systab;
-s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+s64 __uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
+   u64 a4, u64 a5)


Can we make this static?


Will do.


 {
struct uv_systab *tab = uv_systab;
s64 ret;
@@ -44,13 +45,26 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, 
u64 a3, u64 a4, u64 a5)
 * If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
 * callback method, which uses efi_call() directly, with the kernel 
page tables:
 */
-   if (unlikely(test_bit(EFI_OLD_MEMMAP, )))
+   if (unlikely(efi_enabled(EFI_OLD_MEMMAP)))
ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
else
ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
return ret;
 }
+
+s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+{
+   s64 ret;
+
+   if (down_interruptible(_runtime_sem))
+   return BIOS_STATUS_ABORT;
+
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
+   up(_runtime_sem);
+
+   return ret;
+}
 EXPORT_SYMBOL_GPL(uv_bios_call);
 s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
@@ -59,10 +73,15 @@ s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, 
u64 a2, u64 a3,
unsigned long bios_flags;
s64 ret;
+   if (down_interruptible(_runtime_sem))
+   return BIOS_STATUS_ABORT;
+
local_irq_save(bios_flags);
-   ret = uv_bios_call(which, a1, a2, a3, a4, a5);
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
local_irq_restore(bios_flags);
+   up(_runtime_sem);
+
return ret;
 }


Thanks,
Bhupesh


Cheers,
Hedi.
--
Be careful of reading health books, you might die of a misprint.
-- Mark Twain


[PATCH 3/3] x86/platform/UV: use efi_runtime_sem to serialise BIOS calls

2019-01-09 Thread Hedi Berriche
Calls into UV firmware must be protected against concurrency, use the
now visible efi_runtime_sem lock to serialise them.

Signed-off-by: Hedi Berriche 
Reviewed-by: Russ Anderson 
Reviewed-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Steve Wahl 
---
 arch/x86/include/asm/uv/bios.h |  3 ++-
 arch/x86/platform/uv/bios_uv.c | 25 ++---
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index 4eee646544b2..33e94aa0b1ff 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -48,7 +48,8 @@ enum {
BIOS_STATUS_SUCCESS =  0,
BIOS_STATUS_UNIMPLEMENTED   = -ENOSYS,
BIOS_STATUS_EINVAL  = -EINVAL,
-   BIOS_STATUS_UNAVAIL = -EBUSY
+   BIOS_STATUS_UNAVAIL = -EBUSY,
+   BIOS_STATUS_ABORT   = -EINTR
 };
 
 /* Address map parameters */
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index cd05af157763..92f960798e20 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -29,7 +29,8 @@
 
 struct uv_systab *uv_systab;
 
-s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+s64 __uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
+   u64 a4, u64 a5)
 {
struct uv_systab *tab = uv_systab;
s64 ret;
@@ -44,13 +45,26 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, 
u64 a3, u64 a4, u64 a5)
 * If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
 * callback method, which uses efi_call() directly, with the kernel 
page tables:
 */
-   if (unlikely(test_bit(EFI_OLD_MEMMAP, )))
+   if (unlikely(efi_enabled(EFI_OLD_MEMMAP)))
ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
else
ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
 
return ret;
 }
+
+s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3, u64 a4, u64 
a5)
+{
+   s64 ret;
+
+   if (down_interruptible(_runtime_sem))
+   return BIOS_STATUS_ABORT;
+
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
+   up(_runtime_sem);
+
+   return ret;
+}
 EXPORT_SYMBOL_GPL(uv_bios_call);
 
 s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
@@ -59,10 +73,15 @@ s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, 
u64 a2, u64 a3,
unsigned long bios_flags;
s64 ret;
 
+   if (down_interruptible(_runtime_sem))
+   return BIOS_STATUS_ABORT;
+
local_irq_save(bios_flags);
-   ret = uv_bios_call(which, a1, a2, a3, a4, a5);
+   ret = __uv_bios_call(which, a1, a2, a3, a4, a5);
local_irq_restore(bios_flags);
 
+   up(_runtime_sem);
+
return ret;
 }
 
-- 
2.20.0



[PATCH 1/3] efi/x86: turn EFI runtime semaphore into a global lock

2019-01-09 Thread Hedi Berriche
Make efi_runtime_lock semaphore global so that it can be used by EFI
runtime callers that may be defined outside efi/runtime-wrappers.c.

The immediate motivation is to piggy-back it to serialise UV platform BIOS
calls.

No functional changes.

Signed-off-by: Hedi Berriche 
Reviewed-by: Russ Anderson 
Reviewed-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Steve Wahl 
---
 drivers/firmware/efi/runtime-wrappers.c |   60 
 include/linux/efi.h |3 +
 2 files changed, 33 insertions(+), 30 deletions(-)

diff --git a/drivers/firmware/efi/runtime-wrappers.c 
b/drivers/firmware/efi/runtime-wrappers.c
index 8903b9ccfc2b..ec60d6227925 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -53,7 +53,7 @@ struct efi_runtime_work efi_rts_work;
  * @rts_arg<1-5>:  efi_runtime_service() function arguments
  *
  * Accesses to efi_runtime_services() are serialized by a binary
- * semaphore (efi_runtime_lock) and caller waits until the work is
+ * semaphore (efi_runtime_sem) and caller waits until the work is
  * finished, hence _only_ one work is queued at a time and the caller
  * thread waits for completion.
  */
@@ -144,7 +144,7 @@ void efi_call_virt_check_flags(unsigned long flags, const 
char *call)
  * none of the remaining functions are actually ever called at runtime.
  * So let's just use a single lock to serialize all Runtime Services calls.
  */
-static DEFINE_SEMAPHORE(efi_runtime_lock);
+DEFINE_SEMAPHORE(efi_runtime_sem);
 
 /*
  * Calls the appropriate efi_runtime_service() with the appropriate
@@ -233,10 +233,10 @@ static efi_status_t virt_efi_get_time(efi_time_t *tm, 
efi_time_cap_t *tc)
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_TIME, tm, tc, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -244,10 +244,10 @@ static efi_status_t virt_efi_set_time(efi_time_t *tm)
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(SET_TIME, tm, NULL, NULL, NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -257,11 +257,11 @@ static efi_status_t virt_efi_get_wakeup_time(efi_bool_t 
*enabled,
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_WAKEUP_TIME, enabled, pending, tm, NULL,
NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -269,11 +269,11 @@ static efi_status_t virt_efi_set_wakeup_time(efi_bool_t 
enabled, efi_time_t *tm)
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(SET_WAKEUP_TIME, , tm, NULL, NULL,
NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -285,11 +285,11 @@ static efi_status_t virt_efi_get_variable(efi_char16_t 
*name,
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_VARIABLE, name, vendor, attr, data_size,
data);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -299,11 +299,11 @@ static efi_status_t virt_efi_get_next_variable(unsigned 
long *name_size,
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(GET_NEXT_VARIABLE, name_size, name, vendor,
NULL, NULL);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -315,11 +315,11 @@ static efi_status_t virt_efi_set_variable(efi_char16_t 
*name,
 {
efi_status_t status;
 
-   if (down_interruptible(_runtime_lock))
+   if (down_interruptible(_runtime_sem))
return EFI_ABORTED;
status = efi_queue_work(SET_VARIABLE, name, vendor, , _size,
data);
-   up(_runtime_lock);
+   up(_runtime_sem);
return status;
 }
 
@@ -330,12 +330,12 @@ virt_efi_set_variable_nonblocking(efi_char16_t *name, 
efi_guid_t *vendor,
 {
efi_status_t status;
 
-   if (down_trylock(_runtime_lock))
+   if (down_trylock(_runtime_sem))
return EFI_NOT_READY;
 
status = efi_call_virt(set_variable, name, 

[PATCH 0/3] Protect against concurrent calls into UV BIOS

2019-01-09 Thread Hedi Berriche
Calls into UV BIOS were not being serialised which is wrong as it violates
EFI runtime rules, and bad as it does result in all sorts of potentially
hard to track down hangs and panics when efi_scratch gets clobbered.

Patch #1 makes the efi_runtime_lock semaphore visible to EFI runtime
callers defined outside drivers/firmware/efi/runtime-wrappers.c in
preparation for using it to serialise calls into UV BIOS.

Patch #2 removes uv_bios_call_reentrant() because it's dead code.

Patch #3 makes uv_bios_call() variants use efi_runtime_sem to protect
against concurrency.

Hedi Berriche (3):
  efi/x86: turn EFI runtime semaphore into a global lock
  x86/platform/UV: kill uv_bios_call_reentrant() as it has no callers
  x86/platform/UV: use efi_runtime_sem to serialise BIOS calls

 arch/x86/include/asm/uv/bios.h  |4 +-
 arch/x86/platform/uv/bios_uv.c  |   37 +++
 drivers/firmware/efi/runtime-wrappers.c |   60 
 include/linux/efi.h |3 +
 4 files changed, 57 insertions(+), 47 deletions(-)

-- 
2.20.0



[PATCH 2/3] x86/platform/UV: kill uv_bios_call_reentrant() as it has no callers

2019-01-09 Thread Hedi Berriche
uv_bios_call_reentrant() has no callers nor is it exported, kill it.

Signed-off-by: Hedi Berriche 
Reviewed-by: Russ Anderson 
Reviewed-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Steve Wahl 
---
 arch/x86/include/asm/uv/bios.h |  1 -
 arch/x86/platform/uv/bios_uv.c | 12 
 2 files changed, 13 deletions(-)

diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h
index e652a7cc6186..4eee646544b2 100644
--- a/arch/x86/include/asm/uv/bios.h
+++ b/arch/x86/include/asm/uv/bios.h
@@ -140,7 +140,6 @@ enum uv_memprotect {
  */
 extern s64 uv_bios_call(enum uv_bios_cmd, u64, u64, u64, u64, u64);
 extern s64 uv_bios_call_irqsave(enum uv_bios_cmd, u64, u64, u64, u64, u64);
-extern s64 uv_bios_call_reentrant(enum uv_bios_cmd, u64, u64, u64, u64, u64);
 
 extern s64 uv_bios_get_sn_info(int, int *, long *, long *, long *, long *);
 extern s64 uv_bios_freq_base(u64, u64 *);
diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 4a6a5a26c582..cd05af157763 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -66,18 +66,6 @@ s64 uv_bios_call_irqsave(enum uv_bios_cmd which, u64 a1, u64 
a2, u64 a3,
return ret;
 }
 
-s64 uv_bios_call_reentrant(enum uv_bios_cmd which, u64 a1, u64 a2, u64 a3,
-   u64 a4, u64 a5)
-{
-   s64 ret;
-
-   preempt_disable();
-   ret = uv_bios_call(which, a1, a2, a3, a4, a5);
-   preempt_enable();
-
-   return ret;
-}
-
 
 long sn_partition_id;
 EXPORT_SYMBOL_GPL(sn_partition_id);
-- 
2.20.0



Re: [PATCH] kernel, add panic_on_warn

2014-10-30 Thread Hedi Berriche
On Thu, Oct 30, 2014 at 17:06 Prarit Bhargava wrote:
| There have been several times where I have had to rebuild a kernel to
| cause a panic when hitting a WARN() in the code in order to get a crash
| dump from a system.  Sometimes this is easy to do, other times (such as
| in the case of a remote admin) it is not trivial to send new images to the
| user.
| 
| A much easier method would be a switch to change the WARN() over to a
| panic.  This makes debugging easier in that I can now test the actual
| image the WARN() was seen on and I do not have to engage in remote
| debugging.

Do we want to leave it to usersspace[1] to ensure panic_on_warn is out
of the way in when the kdump kernel boots? or would a self-contained
approach be more preferable i.e. test whether we're a kdump kernel
before bothering with panic_on_warn?

Cheers,
Hedi.

[1] kexec-tools in the case of the boot param by filtering it out of the
kdump kernel cmdline. In the case of sysctl.conf, it would depend on
whether there are distros out there that  include it in the kdump
initrd.

-- 
Hedi Berriche
Linux Kernel Engineer
http://www.sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] kernel, add panic_on_warn

2014-10-30 Thread Hedi Berriche
On Thu, Oct 30, 2014 at 17:06 Prarit Bhargava wrote:
| There have been several times where I have had to rebuild a kernel to
| cause a panic when hitting a WARN() in the code in order to get a crash
| dump from a system.  Sometimes this is easy to do, other times (such as
| in the case of a remote admin) it is not trivial to send new images to the
| user.
| 
| A much easier method would be a switch to change the WARN() over to a
| panic.  This makes debugging easier in that I can now test the actual
| image the WARN() was seen on and I do not have to engage in remote
| debugging.

Do we want to leave it to usersspace[1] to ensure panic_on_warn is out
of the way in when the kdump kernel boots? or would a self-contained
approach be more preferable i.e. test whether we're a kdump kernel
before bothering with panic_on_warn?

Cheers,
Hedi.

[1] kexec-tools in the case of the boot param by filtering it out of the
kdump kernel cmdline. In the case of sysctl.conf, it would depend on
whether there are distros out there that  include it in the kdump
initrd.

-- 
Hedi Berriche
Linux Kernel Engineer
http://www.sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Regression] sched: division by zero in find_busiest_group()

2013-12-17 Thread Hedi Berriche
On Mon, Dec 09, 2013 at 18:10 Hedi Berriche wrote:
| Folks,
| 
| The following panic occurs *early* at boot time on high *enough* CPU count
| machines:
| 
| divide error:  [#1] SMP 
| Modules linked in:
| CPU: 22 PID: 1146 Comm: kworker/22:0 Not tainted 3.13.0-rc2-00122-gdea4f48 #8
| Hardware name: Intel Corp. Stoutland Platform, BIOS 2.20 UEFI2.10 PI1.0 X64 
2013-09-20
| task: 8827d49f31c0 ti: 8827d4a18000 task.ti: 8827d4a18000
| RIP: 0010:[]  [] 
find_busiest_group+0x26b/0x890
| RSP: :8827d4a19b68  EFLAGS: 00010006
| RAX: 7fff RBX: 8000 RCX: 0200
| RDX:  RSI: 8000 RDI: 0020
| RBP: 8827d4a19cc0 R08:  R09: 
| R10:  R11:  R12: 
| R13: 8827d4a19d28 R14: 8827d4a19b98 R15: 
| FS:  () GS:8827dfd8() knlGS:
| CS:  0010 DS:  ES:  CR0: 8005003b
| CR2: 00b8 CR3: 018da000 CR4: 07e0
| Stack:
| 8827d4b35800  00014600 00014600
|  8827d4b35818  
|   8000 
| Call Trace:
| [] load_balance+0x166/0x7f0
| [] idle_balance+0x10e/0x1b0
| [] __schedule+0x723/0x780
| [] schedule+0x29/0x70
| [] worker_thread+0x1c9/0x400
| [] ? rescuer_thread+0x3e0/0x3e0
| [] kthread+0xd2/0xf0
| [] ? kthread_create_on_node+0x180/0x180
| [] ret_from_fork+0x7c/0xb0
| [] ? kthread_create_on_node+0x180/0x180

Hmm...had time to dig into this a bit deeper and looking at
build_overlap_sched_groups(), specifically this bit of code:

kernel/sched/core.c:

5066 static int
5067 build_overlap_sched_groups(struct sched_domain *sd, int cpu)
5068 {
...
5109 /*
5110  * Initialize sgp->power such that even if we mess up the
5111  * domains and no possible iteration will get us here, we 
won't
5112  * die on a /0 trap.
5113  */
5114 sg->sgp->power = SCHED_POWER_SCALE * 
cpumask_weight(sg_span);

I'm wondering whether the same precaution should be used when it comes to 
sg->sgp->power_orig.

Cheers,
Hedi.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Regression] sched: division by zero in find_busiest_group()

2013-12-17 Thread Hedi Berriche
On Mon, Dec 09, 2013 at 18:10 Hedi Berriche wrote:
| Folks,
| 
| The following panic occurs *early* at boot time on high *enough* CPU count
| machines:
| 
| divide error:  [#1] SMP 
| Modules linked in:
| CPU: 22 PID: 1146 Comm: kworker/22:0 Not tainted 3.13.0-rc2-00122-gdea4f48 #8
| Hardware name: Intel Corp. Stoutland Platform, BIOS 2.20 UEFI2.10 PI1.0 X64 
2013-09-20
| task: 8827d49f31c0 ti: 8827d4a18000 task.ti: 8827d4a18000
| RIP: 0010:[810a345b]  [810a345b] 
find_busiest_group+0x26b/0x890
| RSP: :8827d4a19b68  EFLAGS: 00010006
| RAX: 7fff RBX: 8000 RCX: 0200
| RDX:  RSI: 8000 RDI: 0020
| RBP: 8827d4a19cc0 R08:  R09: 
| R10:  R11:  R12: 
| R13: 8827d4a19d28 R14: 8827d4a19b98 R15: 
| FS:  () GS:8827dfd8() knlGS:
| CS:  0010 DS:  ES:  CR0: 8005003b
| CR2: 00b8 CR3: 018da000 CR4: 07e0
| Stack:
| 8827d4b35800  00014600 00014600
|  8827d4b35818  
|   8000 
| Call Trace:
| [810a3be6] load_balance+0x166/0x7f0
| [810a477e] idle_balance+0x10e/0x1b0
| [815d83d3] __schedule+0x723/0x780
| [815d8459] schedule+0x29/0x70
| [810818b9] worker_thread+0x1c9/0x400
| [810816f0] ? rescuer_thread+0x3e0/0x3e0
| [81088562] kthread+0xd2/0xf0
| [81088490] ? kthread_create_on_node+0x180/0x180
| [815e437c] ret_from_fork+0x7c/0xb0
| [81088490] ? kthread_create_on_node+0x180/0x180

Hmm...had time to dig into this a bit deeper and looking at
build_overlap_sched_groups(), specifically this bit of code:

kernel/sched/core.c:

5066 static int
5067 build_overlap_sched_groups(struct sched_domain *sd, int cpu)
5068 {
...
5109 /*
5110  * Initialize sgp-power such that even if we mess up the
5111  * domains and no possible iteration will get us here, we 
won't
5112  * die on a /0 trap.
5113  */
5114 sg-sgp-power = SCHED_POWER_SCALE * 
cpumask_weight(sg_span);

I'm wondering whether the same precaution should be used when it comes to 
sg-sgp-power_orig.

Cheers,
Hedi.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Regression] sched: division by zero in find_busiest_group()

2013-12-09 Thread Hedi Berriche
Folks,

The following panic occurs *early* at boot time on high *enough* CPU count
machines:

divide error:  [#1] SMP 
Modules linked in:
CPU: 22 PID: 1146 Comm: kworker/22:0 Not tainted 3.13.0-rc2-00122-gdea4f48 #8
Hardware name: Intel Corp. Stoutland Platform, BIOS 2.20 UEFI2.10 PI1.0 X64 
2013-09-20
task: 8827d49f31c0 ti: 8827d4a18000 task.ti: 8827d4a18000
RIP: 0010:[]  [] 
find_busiest_group+0x26b/0x890
RSP: :8827d4a19b68  EFLAGS: 00010006
RAX: 7fff RBX: 8000 RCX: 0200
RDX:  RSI: 8000 RDI: 0020
RBP: 8827d4a19cc0 R08:  R09: 
R10:  R11:  R12: 
R13: 8827d4a19d28 R14: 8827d4a19b98 R15: 
FS:  () GS:8827dfd8() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 00b8 CR3: 018da000 CR4: 07e0
Stack:
8827d4b35800  00014600 00014600
 8827d4b35818  
  8000 
Call Trace:
[] load_balance+0x166/0x7f0
[] idle_balance+0x10e/0x1b0
[] __schedule+0x723/0x780
[] schedule+0x29/0x70
[] worker_thread+0x1c9/0x400
[] ? rescuer_thread+0x3e0/0x3e0
[] kthread+0xd2/0xf0
[] ? kthread_create_on_node+0x180/0x180
[] ret_from_fork+0x7c/0xb0
[] ? kthread_create_on_node+0x180/0x180

Bisection points to

9abf24d sched: Check sched_domain before computing group power

but without it (as clearly indicated in the changelog) the kernel panics thus:

BUG: unable to handle kernel NULL pointer dereference at 0010
IP: [] update_group_power+0xa2/0x250
PGD 0 
Oops:  [#1] SMP 
Modules linked in: 
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.12.0-00122-gdea4f48 #10 
Hardware name: Intel Corp. Stoutland Platform, BIOS 2.20 UEFI2.10 PI1.0 X64 
2013-09-20
task: 881054528000 ti: 88105453 task.ti: 88105453
RIP: 0010:[]  [] 
update_group_power+0xa2/0x250
RSP: :881054531d48  EFLAGS: 00010287
RAX:  RBX:  RCX: 
RDX: 00f0 RSI: 0100 RDI: 00c0
RBP: 881054531d70 R08: 89e7d4ae6018 R09: 0004
R10: 89e7d4ae6818 R11: 81098d5d R12: 
R13: 000146c0 R14: 89e7d4ae6000 R15: 89e7d4ae6018
FS:  () GS:88105fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0010 CR3: 018c2000 CR4: 07f0
Stack:
89e7d4ae6000 89e7d4ac 00f0 00f0
898fd3a68c00 881054531e30 8109933f 0100
00ff 00010448 00ff 01000100
Call Trace:
[] build_sched_domains+0xbff/0xc80
[] sched_init_smp+0x3ad/0x469
[] kernel_init_freeable+0xfa/0x207
[] ? rest_init+0x80/0x80
[] kernel_init+0xe/0x120
[] ret_from_fork+0x7c/0xb0
[] ? rest_init+0x80/0x80

and this is because of:

863bffc sched/fair: Fix group power_orig computation

IOW, 9abf24d can't be blamed for it all, and this is not a case of a
straightforward revert of a single commit.

Back to the division by zero itself, it's taking place in the inlined 
sg_capacity():

find_busiest_group
  update_sd_lb_stats
update_sg_lb_stats
  sg_capacity

5492 static inline int sg_capacity(struct lb_env *env, struct sched_group 
*group)
5493 {
5494 unsigned int capacity, smt, cpus;
5495 unsigned int power, power_orig;
5496 
5497 power = group->sgp->power;
5498 power_orig = group->sgp->power_orig;
5499 cpus = group->group_weight;
5500 
5501 /* smt := ceil(cpus / power), assumes: 1 < smt_power < 2 */
5502 smt = DIV_ROUND_UP(SCHED_POWER_SCALE * cpus, power_orig);  <-- 
HERE

so we're arriving here with group->sgp->power_orig == 0.

Cheers,
Hedi.

P.S. 

The following *works* around the panic:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e85cda2..48c8d0b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5735,6 +5735,9 @@ static int __sdt_alloc(const struct cpumask *cpu_map)
if (!sgp)
return -ENOMEM;
 
+   /* WAR: avoid a divison by zero in sg_capacity() */
+   sgp->power_orig = 1;
+
*per_cpu_ptr(sdd->sgp, j) = sgp;
}
}

and I wonder whether the following --on its own-- would make sense as a fix:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fd773ad..57578b3 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5495,7 +5495,7 @@ static inline int sg_capacity(struct lb_env *env, struct 
sched_group *group)
unsigned int power, power_orig;
 
power = group->sgp->power;
-   

[Regression] sched: division by zero in find_busiest_group()

2013-12-09 Thread Hedi Berriche
Folks,

The following panic occurs *early* at boot time on high *enough* CPU count
machines:

divide error:  [#1] SMP 
Modules linked in:
CPU: 22 PID: 1146 Comm: kworker/22:0 Not tainted 3.13.0-rc2-00122-gdea4f48 #8
Hardware name: Intel Corp. Stoutland Platform, BIOS 2.20 UEFI2.10 PI1.0 X64 
2013-09-20
task: 8827d49f31c0 ti: 8827d4a18000 task.ti: 8827d4a18000
RIP: 0010:[810a345b]  [810a345b] 
find_busiest_group+0x26b/0x890
RSP: :8827d4a19b68  EFLAGS: 00010006
RAX: 7fff RBX: 8000 RCX: 0200
RDX:  RSI: 8000 RDI: 0020
RBP: 8827d4a19cc0 R08:  R09: 
R10:  R11:  R12: 
R13: 8827d4a19d28 R14: 8827d4a19b98 R15: 
FS:  () GS:8827dfd8() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 00b8 CR3: 018da000 CR4: 07e0
Stack:
8827d4b35800  00014600 00014600
 8827d4b35818  
  8000 
Call Trace:
[810a3be6] load_balance+0x166/0x7f0
[810a477e] idle_balance+0x10e/0x1b0
[815d83d3] __schedule+0x723/0x780
[815d8459] schedule+0x29/0x70
[810818b9] worker_thread+0x1c9/0x400
[810816f0] ? rescuer_thread+0x3e0/0x3e0
[81088562] kthread+0xd2/0xf0
[81088490] ? kthread_create_on_node+0x180/0x180
[815e437c] ret_from_fork+0x7c/0xb0
[81088490] ? kthread_create_on_node+0x180/0x180

Bisection points to

9abf24d sched: Check sched_domain before computing group power

but without it (as clearly indicated in the changelog) the kernel panics thus:

BUG: unable to handle kernel NULL pointer dereference at 0010
IP: [810a3542] update_group_power+0xa2/0x250
PGD 0 
Oops:  [#1] SMP 
Modules linked in: 
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.12.0-00122-gdea4f48 #10 
Hardware name: Intel Corp. Stoutland Platform, BIOS 2.20 UEFI2.10 PI1.0 X64 
2013-09-20
task: 881054528000 ti: 88105453 task.ti: 88105453
RIP: 0010:[810a3542]  [810a3542] 
update_group_power+0xa2/0x250
RSP: :881054531d48  EFLAGS: 00010287
RAX:  RBX:  RCX: 
RDX: 00f0 RSI: 0100 RDI: 00c0
RBP: 881054531d70 R08: 89e7d4ae6018 R09: 0004
R10: 89e7d4ae6818 R11: 81098d5d R12: 
R13: 000146c0 R14: 89e7d4ae6000 R15: 89e7d4ae6018
FS:  () GS:88105fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0010 CR3: 018c2000 CR4: 07f0
Stack:
89e7d4ae6000 89e7d4ac 00f0 00f0
898fd3a68c00 881054531e30 8109933f 0100
00ff 00010448 00ff 01000100
Call Trace:
[8109933f] build_sched_domains+0xbff/0xc80
[81a3c89e] sched_init_smp+0x3ad/0x469
[81a1c00c] kernel_init_freeable+0xfa/0x207
[815b3ea0] ? rest_init+0x80/0x80
[815b3eae] kernel_init+0xe/0x120
[815d547c] ret_from_fork+0x7c/0xb0
[815b3ea0] ? rest_init+0x80/0x80

and this is because of:

863bffc sched/fair: Fix group power_orig computation

IOW, 9abf24d can't be blamed for it all, and this is not a case of a
straightforward revert of a single commit.

Back to the division by zero itself, it's taking place in the inlined 
sg_capacity():

find_busiest_group
  update_sd_lb_stats
update_sg_lb_stats
  sg_capacity

5492 static inline int sg_capacity(struct lb_env *env, struct sched_group 
*group)
5493 {
5494 unsigned int capacity, smt, cpus;
5495 unsigned int power, power_orig;
5496 
5497 power = group-sgp-power;
5498 power_orig = group-sgp-power_orig;
5499 cpus = group-group_weight;
5500 
5501 /* smt := ceil(cpus / power), assumes: 1  smt_power  2 */
5502 smt = DIV_ROUND_UP(SCHED_POWER_SCALE * cpus, power_orig);  -- 
HERE

so we're arriving here with group-sgp-power_orig == 0.

Cheers,
Hedi.

P.S. 

The following *works* around the panic:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e85cda2..48c8d0b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5735,6 +5735,9 @@ static int __sdt_alloc(const struct cpumask *cpu_map)
if (!sgp)
return -ENOMEM;
 
+   /* WAR: avoid a divison by zero in sg_capacity() */
+   sgp-power_orig = 1;
+
*per_cpu_ptr(sdd-sgp, j) = sgp;
}
}

and I wonder whether the following --on its own-- would make 

Commit 07f9b61 breaks systems that don't implement a _CBA method

2013-10-03 Thread Hedi Berriche
Chaps,

The following failure was encountered on hardware that does *not*
implement a _CBA method which is AFAICT (and confirmed to me by BIOS
chaps) optional.

[1.230647] PCI: MMCONFIG for domain  [bus 00-0c] at [mem 
0x8000-0x80cf] (base 0x8000)
[1.241046] PCI: MMCONFIG for domain 0001 [bus 00-02] at [mem 
0xff8400-0xff842f] (base 0xff8400)
[1.252025] PCI: MMCONFIG for domain 1000 [bus 3f-3f] at [mem 
0xff83f0-0xff83ff] (base 0xff8000)
[1.263006] PCI: MMCONFIG for domain 1001 [bus 3f-3f] at [mem 
0xff87f0-0xff87ff] (base 0xff8400)
[1.273984] PCI: MMCONFIG at [mem 0x8000-0x80cf] reserved in E820
[1.281564] PCI: MMCONFIG at [mem 0xff8400-0xff842f] reserved in E820
[1.289535] PCI: MMCONFIG at [mem 0xff83f0-0xff83ff] reserved in E820
[1.297505] PCI: MMCONFIG at [mem 0xff87f0-0xff87ff] reserved in E820



[1.427926] ACPI: PCI Root Bridge [I001] (domain 0001 [bus 00-3d])
[1.434968] acpi PNP0A08:00: fail to add MMCONFIG information, can't access 
PCI configuration space under this host bridge.
[1.447405] acpi PNP0A08:00: Bus 0001:00 not present in PCI namespace

...

[1.454608] ACPI: PCI Root Bridge [S000] (domain 1000 [bus 3f])
[1.461300] acpi PNP0A08:01: fail to add MMCONFIG information, can't access 
PCI configuration space under this host bridge.
[1.473735] acpi PNP0A08:01: Bus 1000:3f not present in PCI namespace

...

[1.480935] ACPI: PCI Root Bridge [S001] (domain 1001 [bus 3f])
[1.487630] acpi PNP0A08:02: fail to add MMCONFIG information, can't access 
PCI configuration space under this host bridge.
[1.500066] acpi PNP0A08:02: Bus 1001:3f not present in PCI namespace

Bisection points to this commit (included in full given its brevity):

commit 07f9b61c3915e8eb156cb4461b3946736356ad02
Author: ethan.zhao 
Date:   Fri Jul 26 11:21:24 2013 -0600

x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero

We can check for addr being zero earlier and thus avoid the mutex_unlock()
cleanup path.

[bhelgaas: drop warning printk]
Signed-off-by: ethan.zhao 
Signed-off-by: Bjorn Helgaas 
Acked-by: Yinghai Lu 

diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 082e881..5596c7b 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -700,7 +700,7 @@ int pci_mmconfig_insert(struct device *dev, u16 seg, u8 
start, u8 end,
if (!(pci_probe & PCI_PROBE_MMCONF) || pci_mmcfg_arch_init_failed)
return -ENODEV;

-   if (start > end)
+   if (start > end || !addr)
return -EINVAL;

mutex_lock(_mmcfg_lock);
@@ -716,11 +716,6 @@ int pci_mmconfig_insert(struct device *dev, u16 seg, u8 
start, u8 end,
return -EEXIST;
}

-   if (!addr) {
-   mutex_unlock(_mmcfg_lock);
-   return -EINVAL;
-   }
-
rc = -EBUSY;
cfg = pci_mmconfig_alloc(seg, start, end, addr);
if (cfg == NULL) {


With this change in place, pci_mmconfig_insert(), when called with addr==0
bails out (too) early with -EINVAL and no longer calls into 
pci_mmconfig_lookup():

693 int pci_mmconfig_insert(struct device *dev, u16 seg, u8 start, u8 end,
694 phys_addr_t addr)
695 {
696 int rc;
697 struct resource *tmp = NULL;
698 struct pci_mmcfg_region *cfg;
699 
700 if (!(pci_probe & PCI_PROBE_MMCONF) || pci_mmcfg_arch_init_failed)
701 return -ENODEV;
702 
703 if (start > end || !addr)
704 return -EINVAL;
705 
706 mutex_lock(_mmcfg_lock);
707 cfg = pci_mmconfig_lookup(seg, start);
708 if (cfg) {
709 if (cfg->end_bus < end)
710 dev_info(dev, FW_INFO
711  "MMCONFIG for "
712  "domain %04x [bus %02x-%02x] "
713  "only partially covers this bridge\n",
714   cfg->segment, cfg->start_bus, 
cfg->end_bus);
715 mutex_unlock(_mmcfg_lock);
716 return -EEXIST;
717 }

And given the code path reads:

pci_acpi_scan_root()
 setup_mcfg_map()
  pci_mmconfig_insert()

this causes setup_mcfg_map() to fail and go down the check_segment() error
path:

171 static int setup_mcfg_map(struct pci_root_info *info, u16 seg, u8 start,
172   u8 end, phys_addr_t addr)
173 {
...
188 result = pci_mmconfig_insert(dev, seg, start, end, addr);
189 if (result == 0) {
...
194 } else if (result != -EEXIST)
195 return check_segment(seg, dev,
196  "fail to add MMCONFIG information,");
197
198 return 0;
199 }

and this finally causes pci_acpi_scan_root() to skip calling 
pci_create_root_bus()
and all is doom and gloom:

473 

Commit 07f9b61 breaks systems that don't implement a _CBA method

2013-10-03 Thread Hedi Berriche
Chaps,

The following failure was encountered on hardware that does *not*
implement a _CBA method which is AFAICT (and confirmed to me by BIOS
chaps) optional.

[1.230647] PCI: MMCONFIG for domain  [bus 00-0c] at [mem 
0x8000-0x80cf] (base 0x8000)
[1.241046] PCI: MMCONFIG for domain 0001 [bus 00-02] at [mem 
0xff8400-0xff842f] (base 0xff8400)
[1.252025] PCI: MMCONFIG for domain 1000 [bus 3f-3f] at [mem 
0xff83f0-0xff83ff] (base 0xff8000)
[1.263006] PCI: MMCONFIG for domain 1001 [bus 3f-3f] at [mem 
0xff87f0-0xff87ff] (base 0xff8400)
[1.273984] PCI: MMCONFIG at [mem 0x8000-0x80cf] reserved in E820
[1.281564] PCI: MMCONFIG at [mem 0xff8400-0xff842f] reserved in E820
[1.289535] PCI: MMCONFIG at [mem 0xff83f0-0xff83ff] reserved in E820
[1.297505] PCI: MMCONFIG at [mem 0xff87f0-0xff87ff] reserved in E820

snip

[1.427926] ACPI: PCI Root Bridge [I001] (domain 0001 [bus 00-3d])
[1.434968] acpi PNP0A08:00: fail to add MMCONFIG information, can't access 
PCI configuration space under this host bridge.
[1.447405] acpi PNP0A08:00: Bus 0001:00 not present in PCI namespace

...

[1.454608] ACPI: PCI Root Bridge [S000] (domain 1000 [bus 3f])
[1.461300] acpi PNP0A08:01: fail to add MMCONFIG information, can't access 
PCI configuration space under this host bridge.
[1.473735] acpi PNP0A08:01: Bus 1000:3f not present in PCI namespace

...

[1.480935] ACPI: PCI Root Bridge [S001] (domain 1001 [bus 3f])
[1.487630] acpi PNP0A08:02: fail to add MMCONFIG information, can't access 
PCI configuration space under this host bridge.
[1.500066] acpi PNP0A08:02: Bus 1001:3f not present in PCI namespace

Bisection points to this commit (included in full given its brevity):

commit 07f9b61c3915e8eb156cb4461b3946736356ad02
Author: ethan.zhao ethan.z...@oracle.com
Date:   Fri Jul 26 11:21:24 2013 -0600

x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero

We can check for addr being zero earlier and thus avoid the mutex_unlock()
cleanup path.

[bhelgaas: drop warning printk]
Signed-off-by: ethan.zhao ethan.z...@oracle.com
Signed-off-by: Bjorn Helgaas bhelg...@google.com
Acked-by: Yinghai Lu ying...@kernel.org

diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 082e881..5596c7b 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -700,7 +700,7 @@ int pci_mmconfig_insert(struct device *dev, u16 seg, u8 
start, u8 end,
if (!(pci_probe  PCI_PROBE_MMCONF) || pci_mmcfg_arch_init_failed)
return -ENODEV;

-   if (start  end)
+   if (start  end || !addr)
return -EINVAL;

mutex_lock(pci_mmcfg_lock);
@@ -716,11 +716,6 @@ int pci_mmconfig_insert(struct device *dev, u16 seg, u8 
start, u8 end,
return -EEXIST;
}

-   if (!addr) {
-   mutex_unlock(pci_mmcfg_lock);
-   return -EINVAL;
-   }
-
rc = -EBUSY;
cfg = pci_mmconfig_alloc(seg, start, end, addr);
if (cfg == NULL) {


With this change in place, pci_mmconfig_insert(), when called with addr==0
bails out (too) early with -EINVAL and no longer calls into 
pci_mmconfig_lookup():

693 int pci_mmconfig_insert(struct device *dev, u16 seg, u8 start, u8 end,
694 phys_addr_t addr)
695 {
696 int rc;
697 struct resource *tmp = NULL;
698 struct pci_mmcfg_region *cfg;
699 
700 if (!(pci_probe  PCI_PROBE_MMCONF) || pci_mmcfg_arch_init_failed)
701 return -ENODEV;
702 
703 if (start  end || !addr)
704 return -EINVAL;
705 
706 mutex_lock(pci_mmcfg_lock);
707 cfg = pci_mmconfig_lookup(seg, start);
708 if (cfg) {
709 if (cfg-end_bus  end)
710 dev_info(dev, FW_INFO
711  MMCONFIG for 
712  domain %04x [bus %02x-%02x] 
713  only partially covers this bridge\n,
714   cfg-segment, cfg-start_bus, 
cfg-end_bus);
715 mutex_unlock(pci_mmcfg_lock);
716 return -EEXIST;
717 }

And given the code path reads:

pci_acpi_scan_root()
 setup_mcfg_map()
  pci_mmconfig_insert()

this causes setup_mcfg_map() to fail and go down the check_segment() error
path:

171 static int setup_mcfg_map(struct pci_root_info *info, u16 seg, u8 start,
172   u8 end, phys_addr_t addr)
173 {
...
188 result = pci_mmconfig_insert(dev, seg, start, end, addr);
189 if (result == 0) {
...
194 } else if (result != -EEXIST)
195 return check_segment(seg, dev,
196  fail to add MMCONFIG information,);
197
198 return 0;
199 }

and this finally causes 

Re: [PATCH TRIVIAL] kexec: Typo s/the/then/

2013-09-15 Thread Hedi Berriche
On Sun, Sep 15, 2013 at 10:36 Geert Uytterhoeven wrote:
| Signed-off-by: Geert Uytterhoeven 
| ---
|  kernel/kexec.c |2 +-
|  1 file changed, 1 insertion(+), 1 deletion(-)
| 
| diff --git a/kernel/kexec.c b/kernel/kexec.c
| index 2a74f30..490afc0 100644
| --- a/kernel/kexec.c
| +++ b/kernel/kexec.c
| @@ -921,7 +921,7 @@ static int kimage_load_segment(struct kimage *image,
|   *   reinitialize them.
|   *
|   * - A machine specific part that includes the syscall number
| - *   and the copies the image to it's final destination.  And
| + *   and then copies the image to it's final destination.  And
|   *   jumps into the image at entry.

Well, if you're fixing typos, might as well get rid of the wrong "it's"
and turn into "its".

Cheers,
Hedi.
-- 
Be careful of reading health books, you might die of a misprint.
-- Mark Twain
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH TRIVIAL] kexec: Typo s/the/then/

2013-09-15 Thread Hedi Berriche
On Sun, Sep 15, 2013 at 10:36 Geert Uytterhoeven wrote:
| Signed-off-by: Geert Uytterhoeven ge...@linux-m68k.org
| ---
|  kernel/kexec.c |2 +-
|  1 file changed, 1 insertion(+), 1 deletion(-)
| 
| diff --git a/kernel/kexec.c b/kernel/kexec.c
| index 2a74f30..490afc0 100644
| --- a/kernel/kexec.c
| +++ b/kernel/kexec.c
| @@ -921,7 +921,7 @@ static int kimage_load_segment(struct kimage *image,
|   *   reinitialize them.
|   *
|   * - A machine specific part that includes the syscall number
| - *   and the copies the image to it's final destination.  And
| + *   and then copies the image to it's final destination.  And
|   *   jumps into the image at entry.

Well, if you're fixing typos, might as well get rid of the wrong it's
and turn into its.

Cheers,
Hedi.
-- 
Be careful of reading health books, you might die of a misprint.
-- Mark Twain
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 9/9] x86/UV: Add ability to disable UV NMI handler

2013-09-12 Thread Hedi Berriche
On Thu, Sep 12, 2013 at 19:59 Mike Travis wrote:
| On 9/12/2013 10:27 AM, Paul E. McKenney wrote:
|
| > But what is it that you are looking for?  If you want to silence it
| > completely, the rcu_cpu_stall_suppress boot/sysfs parameter is what
| > you want to use.
| 
| We have by default rcutree.rcu_cpu_stall_suppress=1 on the kernel
| cmdline.  I'll double check if it was set during my testing.

FWIW, for recent enough kernels the correct boot parameter is
rcupdate.rcu_cpu_stall_suppress.

It used to be rcutree.rcu_cpu_stall_suppress, but that has changed after
commit 6bfc09e.

Cheers,
Hedi.
-- 
Be careful of reading health books, you might die of a misprint.
-- Mark Twain
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 9/9] x86/UV: Add ability to disable UV NMI handler

2013-09-12 Thread Hedi Berriche
On Thu, Sep 12, 2013 at 19:59 Mike Travis wrote:
| On 9/12/2013 10:27 AM, Paul E. McKenney wrote:
|
|  But what is it that you are looking for?  If you want to silence it
|  completely, the rcu_cpu_stall_suppress boot/sysfs parameter is what
|  you want to use.
| 
| We have by default rcutree.rcu_cpu_stall_suppress=1 on the kernel
| cmdline.  I'll double check if it was set during my testing.

FWIW, for recent enough kernels the correct boot parameter is
rcupdate.rcu_cpu_stall_suppress.

It used to be rcutree.rcu_cpu_stall_suppress, but that has changed after
commit 6bfc09e.

Cheers,
Hedi.
-- 
Be careful of reading health books, you might die of a misprint.
-- Mark Twain
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/