Re: [PATCH v2] powerpc/numa: Correct kernel message severity

2018-05-30 Thread Vipin K Parashar

Hi,


Any progress or update on this patch?

Please let me know if anything more is needed here.


Regards,

Vipin


On Wednesday 14 March 2018 01:22 PM, Vipin K Parashar wrote:

printk() in unmap_cpu_from_node() uses KERN_ERR message severity,
for a WARNING message. Change it to pr_warn().

Signed-off-by: Vipin K Parashar 
---
  arch/powerpc/mm/numa.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index edd8d0b..1632f4b 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -163,8 +163,7 @@ static void unmap_cpu_from_node(unsigned long cpu)
if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
cpumask_clear_cpu(cpu, node_to_cpumask_map[node]);
} else {
-   printk(KERN_ERR "WARNING: cpu %lu not found in node %d\n",
-  cpu, node);
+   pr_warn("WARNING: cpu %lu not found in node %d\n", cpu, node);
}
  }
  #endif /* CONFIG_HOTPLUG_CPU || CONFIG_PPC_SPLPAR */




Re: [PATCH] powerpc/numa: Correct kernel message severity

2018-03-14 Thread Vipin K Parashar


On Tuesday 13 March 2018 03:58 PM, Christophe LEROY wrote:



On 13/03/2018 at 11:11, Vipin K Parashar wrote:

printk in unmap_cpu_from_node() uses KERN_ERR message severity
for a WARNING message. Correct message severity to KERN_WARNING.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
  arch/powerpc/mm/numa.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index edd8d0b..79c94cc 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -163,7 +163,7 @@ static void unmap_cpu_from_node(unsigned long cpu)
  if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
  cpumask_clear_cpu(cpu, node_to_cpumask_map[node]);
  } else {
-    printk(KERN_ERR "WARNING: cpu %lu not found in node %d\n",
+    printk(KERN_WARNING "WARNING: cpu %lu not found in node %d\n",
          cpu, node);

Why not take the opportunity to use pr_warn() instead, which also lets
you put the cpu and node vars back on the same line?


Christophe



Yes, thanks!

Sent v2.



  }
  }







[PATCH v2] powerpc/numa: Correct kernel message severity

2018-03-14 Thread Vipin K Parashar
printk() in unmap_cpu_from_node() uses KERN_ERR message severity,
for a WARNING message. Change it to pr_warn().

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/mm/numa.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index edd8d0b..1632f4b 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -163,8 +163,7 @@ static void unmap_cpu_from_node(unsigned long cpu)
if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
cpumask_clear_cpu(cpu, node_to_cpumask_map[node]);
} else {
-   printk(KERN_ERR "WARNING: cpu %lu not found in node %d\n",
-  cpu, node);
+   pr_warn("WARNING: cpu %lu not found in node %d\n", cpu, node);
}
 }
 #endif /* CONFIG_HOTPLUG_CPU || CONFIG_PPC_SPLPAR */
-- 
2.7.4
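
The severity change in this patch can be illustrated outside the kernel.
What follows is a minimal userspace sketch, not kernel code: mock_printk()
and mock_pr_warn() are hypothetical stand-ins, and the real pr_warn() also
applies pr_fmt(), which is left out here. It only shows that the level
travels as a short prefix on the format string, so switching from
printk(KERN_ERR ...) to pr_warn(...) changes the recorded level from 3 to 4
without changing the message text.

```c
#include <stdarg.h>
#include <stdio.h>

/* Level prefixes in the kernel's style: an SOH byte followed by a digit. */
#define KERN_SOH     "\001"
#define KERN_ERR     KERN_SOH "3"
#define KERN_WARNING KERN_SOH "4"

/* Level seen by the most recent mock_printk() call, -1 if none was given. */
static int last_level = -1;

/* Hypothetical mock: strip the level prefix, record it, print the rest. */
static int mock_printk(const char *fmt, ...)
{
	va_list args;
	int n;

	last_level = -1;
	if (fmt[0] == '\001' && fmt[1] >= '0' && fmt[1] <= '7') {
		last_level = fmt[1] - '0';
		fmt += 2;
	}

	va_start(args, fmt);
	n = vprintf(fmt, args);
	va_end(args);
	return n;
}

/* Simplified pr_warn(): prepends the warning level to a literal format. */
#define mock_pr_warn(fmt, ...) mock_printk(KERN_WARNING fmt, ##__VA_ARGS__)
```

With this mock, the old call records level 3 and the new one records level 4,
which matches the "WARNING:" text already present in the message.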



[PATCH] powerpc/numa: Correct kernel message severity

2018-03-13 Thread Vipin K Parashar
printk in unmap_cpu_from_node() uses KERN_ERR message severity
for a WARNING message. Correct message severity to KERN_WARNING.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/mm/numa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index edd8d0b..79c94cc 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -163,7 +163,7 @@ static void unmap_cpu_from_node(unsigned long cpu)
if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
cpumask_clear_cpu(cpu, node_to_cpumask_map[node]);
} else {
-   printk(KERN_ERR "WARNING: cpu %lu not found in node %d\n",
+   printk(KERN_WARNING "WARNING: cpu %lu not found in node %d\n",
   cpu, node);
}
 }
-- 
2.7.4



Re: [PATCH v4] powernv/sensor: Handle OPAL_WRONG_STATE error return

2017-03-30 Thread Vipin K Parashar

Hi Michael,

Any feedback on this patch?

Regards,
Vipin


On Friday 10 March 2017 05:27 PM, Vipin K Parashar wrote:

OPAL returns OPAL_WRONG_STATE upon failing to provide
sensor data due to core sleeping/offline. Added check
in opal_get_sensor_data() for sensor read failure with
OPAL_WRONG_STATE return code and returned -EIO.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
Changes in v4:
  - Removed sleeping core log message with KERN_NOTICE priority.

Changes in v3:
  - Added a new case for OPAL_WRONG_STATE in sensor read
along with a log message indicating sleeping/offline core
causing read fail.

  arch/powerpc/platforms/powernv/opal-sensor.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal-sensor.c 
b/arch/powerpc/platforms/powernv/opal-sensor.c
index 308efd1..aa267f1 100644
--- a/arch/powerpc/platforms/powernv/opal-sensor.c
+++ b/arch/powerpc/platforms/powernv/opal-sensor.c
@@ -64,6 +64,10 @@ int opal_get_sensor_data(u32 sensor_hndl, u32 *sensor_data)
*sensor_data = be32_to_cpu(data);
break;

+   case OPAL_WRONG_STATE:
+   ret = -EIO;
+   break;
+
default:
ret = opal_error_code(ret);
break;




[PATCH v4] powernv/sensor: Handle OPAL_WRONG_STATE error return

2017-03-10 Thread Vipin K Parashar
OPAL returns OPAL_WRONG_STATE upon failing to provide
sensor data due to core sleeping/offline. Added check
in opal_get_sensor_data() for sensor read failure with
OPAL_WRONG_STATE return code and returned -EIO.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
Changes in v4:
 - Removed sleeping core log message with KERN_NOTICE priority.

Changes in v3:
 - Added a new case for OPAL_WRONG_STATE in sensor read
   along with a log message indicating sleeping/offline core
   causing read fail.

 arch/powerpc/platforms/powernv/opal-sensor.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal-sensor.c 
b/arch/powerpc/platforms/powernv/opal-sensor.c
index 308efd1..aa267f1 100644
--- a/arch/powerpc/platforms/powernv/opal-sensor.c
+++ b/arch/powerpc/platforms/powernv/opal-sensor.c
@@ -64,6 +64,10 @@ int opal_get_sensor_data(u32 sensor_hndl, u32 *sensor_data)
*sensor_data = be32_to_cpu(data);
break;
 
+   case OPAL_WRONG_STATE:
+   ret = -EIO;
+   break;
+
default:
ret = opal_error_code(ret);
break;
-- 
2.7.4
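
The effect of the new case can be sketched in userspace. This is an
illustration under stated assumptions, not the kernel function: the
mock_* names are hypothetical, the numeric OPAL values are stand-ins rather
than values copied from opal-api.h, and the async-completion handling of
the real opal_get_sensor_data() is omitted. The point is that
OPAL_WRONG_STATE is caught before the generic translation, so the
"unexpected OPAL error" path is never reached for it.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in values for illustration only. */
enum {
	OPAL_SUCCESS     = 0,
	OPAL_HARDWARE    = -6,
	OPAL_WRONG_STATE = -14,
};

/* Counts how often the generic "unexpected error" path was taken. */
static int unexpected_count;

/* Hypothetical stand-in for opal_error_code(). */
static int mock_opal_error_code(int rc)
{
	switch (rc) {
	case OPAL_SUCCESS:	return 0;
	case OPAL_HARDWARE:	return -EIO;
	default:
		unexpected_count++;	/* the kernel would pr_err() here */
		return -EIO;
	}
}

/* Mirrors the shape of the patched switch; opal_rc plays the OPAL result. */
static int mock_get_sensor_data(int opal_rc, uint32_t raw, uint32_t *sensor_data)
{
	int ret = opal_rc;

	switch (ret) {
	case OPAL_SUCCESS:
		*sensor_data = raw;
		break;

	case OPAL_WRONG_STATE:	/* core sleeping/offline: quiet -EIO */
		ret = -EIO;
		break;

	default:
		ret = mock_opal_error_code(ret);
		break;
	}

	return ret;
}
```

A caller still sees -EIO for a sleeping core, but the counter that models
the console message stays at zero.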



Re: [PATCH v3] powernv/sensor: Handle OPAL_WRONG_STATE error return

2017-03-06 Thread Vipin K Parashar



On Monday 06 March 2017 03:03 PM, Michael Ellerman wrote:

Vipin K Parashar <vi...@linux.vnet.ibm.com> writes:


OPAL returns OPAL_WRONG_STATE upon failing to provide
sensor data due to core sleeping/offline. Added check
for OPAL_WRONG_STATE return code with sensor read failure.
Also added a log message indicating sensor data being
queried for sleeping/offline core.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
Changes in v3:
  - Added a new case for OPAL_WRONG_STATE in sensor read
along with a log message indicating sleeping/offline core
causing read fail.

  arch/powerpc/platforms/powernv/opal-sensor.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal-sensor.c 
b/arch/powerpc/platforms/powernv/opal-sensor.c
index 308efd1..fb6d6bb 100644
--- a/arch/powerpc/platforms/powernv/opal-sensor.c
+++ b/arch/powerpc/platforms/powernv/opal-sensor.c
@@ -64,6 +64,12 @@ int opal_get_sensor_data(u32 sensor_hndl, u32 *sensor_data)
*sensor_data = be32_to_cpu(data);
break;
  
+	case OPAL_WRONG_STATE:

+   pr_notice("%s: Sensor data read failure due to "
+   "core sleeping/offline\n", __func__);

I don't think it should print.

It's not the user's fault, or anything they can prevent. It's a
mis-feature (aka. bug) in the driver that it queries sensors for offline
CPUs. At least it should be ratelimited.

I thought the entire motivation for the patch in the first place was
that we were spamming the console with messages?


Yes, correct.
To avoid flooding the console, I logged the message using pr_notice.
I think it was suggested in previous reviews to log a message before
returning -EIO.
Shall I return -EIO directly, without any log message?


cheers





[PATCH v3] powernv/sensor: Handle OPAL_WRONG_STATE error return

2017-03-05 Thread Vipin K Parashar
OPAL returns OPAL_WRONG_STATE upon failing to provide
sensor data due to core sleeping/offline. Added check
for OPAL_WRONG_STATE return code with sensor read failure.
Also added a log message indicating sensor data being
queried for sleeping/offline core.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
Changes in v3:
 - Added a new case for OPAL_WRONG_STATE in sensor read
   along with a log message indicating sleeping/offline core
   causing read fail.

 arch/powerpc/platforms/powernv/opal-sensor.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal-sensor.c 
b/arch/powerpc/platforms/powernv/opal-sensor.c
index 308efd1..fb6d6bb 100644
--- a/arch/powerpc/platforms/powernv/opal-sensor.c
+++ b/arch/powerpc/platforms/powernv/opal-sensor.c
@@ -64,6 +64,12 @@ int opal_get_sensor_data(u32 sensor_hndl, u32 *sensor_data)
*sensor_data = be32_to_cpu(data);
break;
 
+   case OPAL_WRONG_STATE:
+   pr_notice("%s: Sensor data read failure due to "
+   "core sleeping/offline\n", __func__);
+   ret = -EIO;
+   break;
+
default:
ret = opal_error_code(ret);
break;
-- 
2.7.4



Re: [PATCH v2] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails

2017-03-05 Thread Vipin K Parashar



On Thursday 02 March 2017 06:00 PM, Vipin K Parashar wrote:

Hi Stewart/Michael,

Thanks for the review.

Responses are below:


On Wednesday 01 March 2017 02:38 AM, Stewart Smith wrote:

Vipin K Parashar <vi...@linux.vnet.ibm.com> writes:

Added check for OPAL_WRONG_STATE error code returned from OPAL.
Currently Linux flashes "unexpected error" over console for this
error. This will avoid throwing such message and return I/O error
for such OPAL failures.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
Changes in v2:
  - Added log message indicating sleeping/offline core
for OPAL_WRONG_STATE

  arch/powerpc/platforms/powernv/opal.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/opal.c 
b/arch/powerpc/platforms/powernv/opal.c

index 86d9fde..8af230e 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -869,8 +869,11 @@ int opal_error_code(int rc)
  case OPAL_UNSUPPORTED:return -EIO;
  case OPAL_HARDWARE:return -EIO;
  case OPAL_INTERNAL_ERROR:return -EIO;
+case OPAL_WRONG_STATE:
+pr_notice("%s: Core sleeping/offline\n", __func__);
+return -EIO;

Since this is part of opal_error_code() though, this will be printed for
any OPAL call that returns that.


opal_error_code is used by functions to translate OPAL error codes
into Linux error codes. Apart from opal_get_sensor_data() in
opal-sensor.c, opal_error_code is also invoked from
opal_get_sys_param() in opal-sysparam.c.

Handling OPAL_WRONG_STATE in opal_error_code itself seems modular
and avoids extra checks for OPAL_WRONG_STATE around every
opal_error_code call in multiple functions.

opal_error_code already logs a message on an OPAL_WRONG_STATE
return, so it already leaves a trace about a sleeping core causing the
XSCOM failure. By returning OPAL_WRONG_STATE from opal_error_code, are we
planning some action such as bringing the sleeping or offlined core back
online?



Why not have the sensor code do this:

rc = opal_sensor_read(foo);
if (rc == OPAL_WRONG_STATE)
	return -EIO;
else
	return opal_error_code(rc);

?



Avoided adding OPAL_WRONG_STATE check in opal_error_code
and instead added a new case for OPAL_WRONG_STATE in
opal_get_sensor_data itself. Sending out v3 with the changes.


  default:
-pr_err("%s: unexpected OPAL error %d\n", __func__, rc);
+pr_err("%s: Unexpected OPAL error %d\n", __func__, rc);

Do we need this?



This print alerts us to OPAL return codes that aren't supported in the
running Linux version. It helps catch OPAL return codes that lack a
detection check in Linux.
Shall we consider reducing the message severity from pr_err to pr_warn?





Re: [PATCH v2] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails

2017-03-02 Thread Vipin K Parashar

Hi Stewart/Michael,

Thanks for the review.

Responses are below:


On Wednesday 01 March 2017 02:38 AM, Stewart Smith wrote:

Vipin K Parashar <vi...@linux.vnet.ibm.com> writes:

Added check for OPAL_WRONG_STATE error code returned from OPAL.
Currently Linux flashes "unexpected error" over console for this
error. This will avoid throwing such message and return I/O error
for such OPAL failures.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
Changes in v2:
  - Added log message indicating sleeping/offline core
for OPAL_WRONG_STATE

  arch/powerpc/platforms/powernv/opal.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/opal.c 
b/arch/powerpc/platforms/powernv/opal.c
index 86d9fde..8af230e 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -869,8 +869,11 @@ int opal_error_code(int rc)
case OPAL_UNSUPPORTED:  return -EIO;
case OPAL_HARDWARE: return -EIO;
case OPAL_INTERNAL_ERROR:   return -EIO;
+   case OPAL_WRONG_STATE:
+   pr_notice("%s: Core sleeping/offline\n", __func__);
+   return -EIO;

Since this is part of opal_error_code() though, this will be printed for
any OPAL call that returns that.


opal_error_code is used by functions to translate OPAL error codes
into Linux error codes. Apart from opal_get_sensor_data() in
opal-sensor.c, opal_error_code is also invoked from
opal_get_sys_param() in opal-sysparam.c.

Handling OPAL_WRONG_STATE in opal_error_code itself seems modular
and avoids extra checks for OPAL_WRONG_STATE around every
opal_error_code call in multiple functions.

opal_error_code already logs a message on an OPAL_WRONG_STATE
return, so it already leaves a trace about a sleeping core causing the
XSCOM failure. By returning OPAL_WRONG_STATE from opal_error_code, are we
planning some action such as bringing the sleeping or offlined core back
online?



Why not have the sensor code do this:

rc = opal_sensor_read(foo);
if (rc == OPAL_WRONG_STATE)
	return -EIO;
else
	return opal_error_code(rc);

?


default:
-   pr_err("%s: unexpected OPAL error %d\n", __func__, rc);
+   pr_err("%s: Unexpected OPAL error %d\n", __func__, rc);

Do we need this?



This print alerts us to OPAL return codes that aren't supported in the
running Linux version. It helps catch OPAL return codes that lack a
detection check in Linux.
Shall we consider reducing the message severity from pr_err to pr_warn?



[PATCH v2] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails

2017-02-28 Thread Vipin K Parashar
Added check for OPAL_WRONG_STATE error code returned from OPAL.
Currently Linux flashes "unexpected error" over console for this
error. This will avoid throwing such message and return I/O error
for such OPAL failures.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
Changes in v2:
 - Added log message indicating sleeping/offline core
   for OPAL_WRONG_STATE

 arch/powerpc/platforms/powernv/opal.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/opal.c 
b/arch/powerpc/platforms/powernv/opal.c
index 86d9fde..8af230e 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -869,8 +869,11 @@ int opal_error_code(int rc)
case OPAL_UNSUPPORTED:  return -EIO;
case OPAL_HARDWARE: return -EIO;
case OPAL_INTERNAL_ERROR:   return -EIO;
+   case OPAL_WRONG_STATE:
+   pr_notice("%s: Core sleeping/offline\n", __func__);
+   return -EIO;
default:
-   pr_err("%s: unexpected OPAL error %d\n", __func__, rc);
+   pr_err("%s: Unexpected OPAL error %d\n", __func__, rc);
return -EIO;
}
 }
-- 
2.7.4



Re: [PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails

2017-02-28 Thread Vipin K Parashar

Thanks for the review.

Sending out v2 with the suggested changes.


On Thursday 23 February 2017 09:22 AM, Stewart Smith wrote:

Michael Ellerman <m...@ellerman.id.au> writes:


Stewart Smith <stew...@linux.vnet.ibm.com> writes:


Vipin K Parashar <vi...@linux.vnet.ibm.com> writes:

On Monday 13 February 2017 06:13 AM, Michael Ellerman wrote:

Vipin K Parashar <vi...@linux.vnet.ibm.com> writes:


OPAL returns OPAL_WRONG_STATE for XSCOM operations
done to read the FIR of any core that is sleeping or offline.

OK.

Do we know why Linux is causing that to happen?

This issue was originally seen while running STAF (Software Test
Automation Framework) stress tests and off-lining some cores
with the stress tests running.

It can also be re-created by off-lining a few cores and then using
one of the methods below.
1. Executing the Linux "sensors" command
2. Reading the contents of the file /sys/class/hwmon/hwmon0/tempX_input,
   where X is an offline CPU.

It is the "opal_get_sensor_data" Linux API that triggers the
OPAL call "opal_sensor_read", which performs the XSCOM ops here.
If the core is found sleeping/offline, Linux prints the
"opal_error_code: Unexpected OPAL error" error on the console.

Currently Linux isn't aware of the OPAL_WRONG_STATE return code
from OPAL. Thus it prints the "Unexpected OPAL error" message, the same
as it would log for any unknown OPAL return code.

Seeing this error on the console has been a concern for Test and
would puzzle a real user as well. This patch makes Linux aware of the
OPAL_WRONG_STATE return code from OPAL and stops printing the
"Unexpected OPAL error" message on the console for OPAL failures
with OPAL_WRONG_STATE.

Ahh... so this is a DTS sensor, which indeed is just XSCOMs and we
return the xscom_read return code in event of error.

I would argue that converting to EIO in that instance is probably
correct... or EAGAIN? EAGAIN may be more correct in the situation where
the core is just sleeping.

What kind of offlining are you doing?

Arguably, the correct behaviour would be to remove said sensors when the
core is offline.

Right, that would be ideal. There appear to be at least two other hwmon
drivers that are CPU hotplug aware (coretemp and via-cputemp).

But perhaps it's not possible to work out which sensors are attached to
which CPU etc., I haven't looked in detail.

Each core-temp@ sensor has a ibm,pir property, so linking back to what
core shouldn't be too hard. For mem-temp@ sensors, we have the chip-id.


In that case changing just opal_get_sensor_data() to handle
OPAL_WRONG_STATE would be OK, with a comment explaining that we might be
asked to read a sensor on an offline CPU and we aren't able to detect
that.

Agree.





Re: [PATCH v2] KVM: PPC: Book3S: Ratelimit copy data failure error messages

2017-02-23 Thread Vipin K Parashar

This patch uses "printk_ratelimited" in place of
"printk_ratelimit" used in v1


On Thursday 16 February 2017 10:40 PM, Vipin K Parashar wrote:

kvm_ppc_mmu_book3s_32/64 xlat() logs "KVM can't copy data" error
upon failing to copy user data to kernel space. This floods kernel
log once such fails occur in short time period. Ratelimit this
error to avoid flooding kernel logs upon copy data failures.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
  arch/powerpc/kvm/book3s_32_mmu.c | 3 ++-
  arch/powerpc/kvm/book3s_64_mmu.c | 3 ++-
  2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index a2eb6d3..1992676 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -224,7 +224,8 @@ static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu 
*vcpu, gva_t eaddr,
ptem = kvmppc_mmu_book3s_32_get_ptem(sre, eaddr, primary);

if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-   printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", ptegp);
+   printk_ratelimited(KERN_ERR
+   "KVM: Can't copy data from 0x%lx!\n", ptegp);
goto no_page_found;
}

diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index b9131aa..7015357 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -265,7 +265,8 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
goto no_page_found;

if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-   printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", ptegp);
+   printk_ratelimited(KERN_ERR
+   "KVM: Can't copy data from 0x%lx!\n", ptegp);
goto no_page_found;
}





Re: [PATCH] KVM: PPC: Book3S: Ratelimit copy data failure error messages

2017-02-23 Thread Vipin K Parashar

v2 of this patch, with 'printk_ratelimit' replaced by
'printk_ratelimited', is available on the mailing list.


https://patchwork.ozlabs.org/patch/728831/



On Tuesday 14 February 2017 11:50 AM, Vipin K Parashar wrote:

Forwarded the same patch to k...@vger.kernel.org
and kvm-...@vger.kernel.org too.


On Tuesday 14 February 2017 12:26 AM, Vipin K Parashar wrote:

kvm_ppc_mmu_book3s_32/64 xlat() log "KVM can't copy data" error
upon failing to copy user data to kernel space. This floods kernel
log once such fails occur in short time period. Ratelimit this
error to avoid flooding kernel logs upon copy data failures.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
  arch/powerpc/kvm/book3s_32_mmu.c | 3 ++-
  arch/powerpc/kvm/book3s_64_mmu.c | 3 ++-
  2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c 
b/arch/powerpc/kvm/book3s_32_mmu.c

index a2eb6d3..ca8f960 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -224,7 +224,8 @@ static int kvmppc_mmu_book3s_32_xlate_pte(struct 
kvm_vcpu *vcpu, gva_t eaddr,

  ptem = kvmppc_mmu_book3s_32_get_ptem(sre, eaddr, primary);

  if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", ptegp);
+if (printk_ratelimit())
+printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", 
ptegp);

  goto no_page_found;
  }

diff --git a/arch/powerpc/kvm/book3s_64_mmu.c 
b/arch/powerpc/kvm/book3s_64_mmu.c

index b9131aa..b420aca 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -265,7 +265,8 @@ static int kvmppc_mmu_book3s_64_xlate(struct 
kvm_vcpu *vcpu, gva_t eaddr,

  goto no_page_found;

  if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", ptegp);
+if (printk_ratelimit())
+printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", 
ptegp);

  goto no_page_found;
  }







[PATCH v2] KVM: PPC: Book3S: Ratelimit copy data failure error messages

2017-02-16 Thread Vipin K Parashar
kvm_ppc_mmu_book3s_32/64 xlat() logs "KVM can't copy data" error
upon failing to copy user data to kernel space. This floods kernel
log once such fails occur in short time period. Ratelimit this
error to avoid flooding kernel logs upon copy data failures.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/kvm/book3s_32_mmu.c | 3 ++-
 arch/powerpc/kvm/book3s_64_mmu.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index a2eb6d3..1992676 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -224,7 +224,8 @@ static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu 
*vcpu, gva_t eaddr,
ptem = kvmppc_mmu_book3s_32_get_ptem(sre, eaddr, primary);
 
if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-   printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", ptegp);
+   printk_ratelimited(KERN_ERR
+   "KVM: Can't copy data from 0x%lx!\n", ptegp);
goto no_page_found;
}
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index b9131aa..7015357 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -265,7 +265,8 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
goto no_page_found;
 
if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-   printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", ptegp);
+   printk_ratelimited(KERN_ERR
+   "KVM: Can't copy data from 0x%lx!\n", ptegp);
goto no_page_found;
}
 
-- 
2.7.4
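
The difference between the two helpers is where the ratelimit state lives:
printk_ratelimit() consults one state shared by all of its callers, while
printk_ratelimited() keeps a separate static state at each call site. The
following is a minimal userspace model of such a state, assuming a simple
"burst messages per interval" policy. struct rl_state and rl_allow() are
hypothetical names, time is passed in explicitly as abstract ticks, and the
kernel implementation differs in detail.

```c
#include <stdbool.h>

/* Hypothetical model of a ratelimit state: burst messages per interval. */
struct rl_state {
	long interval;	/* window length, in abstract ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
	bool started;	/* false until the first call */
};

/* Returns true if a message may be emitted at time now. */
static bool rl_allow(struct rl_state *rs, long now)
{
	/* Open a new window on first use or once the interval has elapsed. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}
```

Giving each call site its own struct rl_state models the per-site behaviour:
a noisy message in the 32-bit path cannot use up the budget of the message
in the 64-bit path.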



Re: [PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails

2017-02-15 Thread Vipin K Parashar

Hi Michael,

Thanks for the review.

Answers to your questions are below:


On Monday 13 February 2017 06:13 AM, Michael Ellerman wrote:

Vipin K Parashar <vi...@linux.vnet.ibm.com> writes:


OPAL returns OPAL_WRONG_STATE for XSCOM operations
done to read the FIR of any core that is sleeping or offline.

OK.

Do we know why Linux is causing that to happen?


This issue was originally seen while running STAF (Software Test
Automation Framework) stress tests and off-lining some cores
with the stress tests running.

It can also be re-created by off-lining a few cores and then using
one of the methods below.
1. Executing the Linux "sensors" command
2. Reading the contents of the file /sys/class/hwmon/hwmon0/tempX_input,
   where X is an offline CPU.

It is the "opal_get_sensor_data" Linux API that triggers the
OPAL call "opal_sensor_read", which performs the XSCOM ops here.
If the core is found sleeping/offline, Linux prints the
"opal_error_code: Unexpected OPAL error" error on the console.

Currently Linux isn't aware of the OPAL_WRONG_STATE return code
from OPAL. Thus it prints the "Unexpected OPAL error" message, the same
as it would log for any unknown OPAL return code.

Seeing this error on the console has been a concern for Test and
would puzzle a real user as well. This patch makes Linux aware of the
OPAL_WRONG_STATE return code from OPAL and stops printing the
"Unexpected OPAL error" message on the console for OPAL failures
with OPAL_WRONG_STATE.
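
The second reproduction method amounts to an ordinary file read from
userspace. The sketch below is only an assumption about how one might check
the result of such a read: read_sensor_file() is a hypothetical helper, and
the path passed to it would be something like the tempX_input file named
above with X filled in. A failing read surfaces as a negative errno value,
which is how the -EIO discussed in this thread would reach the reader.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one integer from a sysfs-style file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int read_sensor_file(const char *path, long *value)
{
	FILE *f = fopen(path, "r");
	int err = 0;

	if (!f)
		return -errno;

	errno = 0;
	if (fscanf(f, "%ld", value) != 1)
		err = errno ? -errno : -EINVAL;	/* read error or no number */

	fclose(f);
	return err;
}
```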



It's also returned from many of the XIVE routines if we're in the wrong
xive mode, all of which would indicate a fairly bad Linux bug.

Also the skiboot patch which added WRONG_STATE for XSCOM ops did so
explicitly so we could differentiate from other errors:

 commit 9c2d82394fd2303847cac4a665dee62556ca528a
 Author: Russell Currey <rus...@russell.cc>
 AuthorDate: Mon Mar 21 12:00:00 2016 +1100

 xscom: Return OPAL_WRONG_STATE on XSCOM ops if CPU is asleep

 xscom_read and xscom_write return OPAL_SUCCESS if they worked, and
 OPAL_HARDWARE if they didn't.  This doesn't provide information about why
 the operation failed, such as if the CPU happens to be asleep.

 This is specifically useful in error scanning, so if every CPU is being
 scanned for errors, sleeping CPUs likely aren't the cause of failures.

 So, return OPAL_WRONG_STATE in xscom_read and xscom_write if the CPU is
 sleeping.

 Signed-off-by: Russell Currey <rus...@russell.cc>
 Reviewed-by: Alistair Popple <alist...@popple.id.au>
 Signed-off-by: Stewart Smith <stew...@linux.vnet.ibm.com>



So I'm still not convinced that quietly swallowing this error and
mapping it to -EIO along with several of the other error codes is the
right thing to do.


How about returning -ENXIO upon receiving OPAL_WRONG_STATE,
while -EIO is still returned for OPAL_HARDWARE?

I can send out a new patch that uses pr_notice for failures with supported
OPAL return codes and pr_err for any unexpected OPAL return code. That way
every OPAL call failure is logged in the Linux log, and only
unexpected OPAL error codes are flashed on the console.
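
The proposal above can be sketched as a small mapping function. This is a
userspace illustration of the idea only, and not what the later revisions in
this thread ended up doing (v4 returns -EIO from opal_get_sensor_data()
without logging). proposed_error_code() and enum log_level are hypothetical
names, and the OPAL values are stand-ins, not copied from opal-api.h.

```c
#include <errno.h>

/* Stand-in values for illustration only. */
enum {
	OPAL_SUCCESS     = 0,
	OPAL_HARDWARE    = -6,
	OPAL_WRONG_STATE = -14,
};

/* Which log call the proposal would make for a given return code. */
enum log_level { LOG_NONE, LOG_NOTICE, LOG_ERR };

/* Known failures log at notice level, unknown codes at err level. */
static int proposed_error_code(int rc, enum log_level *level)
{
	switch (rc) {
	case OPAL_SUCCESS:
		*level = LOG_NONE;
		return 0;
	case OPAL_HARDWARE:
		*level = LOG_NOTICE;
		return -EIO;
	case OPAL_WRONG_STATE:
		*level = LOG_NOTICE;
		return -ENXIO;	/* distinguishable from a hardware error */
	default:
		*level = LOG_ERR;	/* only this path reaches the console */
		return -EIO;
	}
}
```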



cheers





Re: [PATCH] KVM: PPC: Book3S: Ratelimit copy data failure error messages

2017-02-13 Thread Vipin K Parashar

Forwarded the same patch to k...@vger.kernel.org
and kvm-...@vger.kernel.org too.


On Tuesday 14 February 2017 12:26 AM, Vipin K Parashar wrote:

kvm_ppc_mmu_book3s_32/64 xlat() log "KVM can't copy data" error
upon failing to copy user data to kernel space. This floods kernel
log once such fails occur in short time period. Ratelimit this
error to avoid flooding kernel logs upon copy data failures.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
  arch/powerpc/kvm/book3s_32_mmu.c | 3 ++-
  arch/powerpc/kvm/book3s_64_mmu.c | 3 ++-
  2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index a2eb6d3..ca8f960 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -224,7 +224,8 @@ static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu 
*vcpu, gva_t eaddr,
ptem = kvmppc_mmu_book3s_32_get_ptem(sre, eaddr, primary);

if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-   printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", ptegp);
+   if (printk_ratelimit())
+   printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", 
ptegp);
goto no_page_found;
}

diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index b9131aa..b420aca 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -265,7 +265,8 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
goto no_page_found;

if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-   printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", ptegp);
+   if (printk_ratelimit())
+   printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", 
ptegp);
goto no_page_found;
}





[PATCH] KVM: PPC: Book3S: Ratelimit copy data failure error messages

2017-02-13 Thread Vipin K Parashar
kvm_ppc_mmu_book3s_32/64 xlat() log "KVM can't copy data" error
upon failing to copy user data to kernel space. This floods kernel
log once such fails occur in short time period. Ratelimit this
error to avoid flooding kernel logs upon copy data failures.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/kvm/book3s_32_mmu.c | 3 ++-
 arch/powerpc/kvm/book3s_64_mmu.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index a2eb6d3..ca8f960 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -224,7 +224,8 @@ static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu 
*vcpu, gva_t eaddr,
ptem = kvmppc_mmu_book3s_32_get_ptem(sre, eaddr, primary);
 
if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-   printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", ptegp);
+   if (printk_ratelimit())
+   printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", 
ptegp);
goto no_page_found;
}
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index b9131aa..b420aca 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -265,7 +265,8 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
goto no_page_found;
 
if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-   printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", ptegp);
+   if (printk_ratelimit())
+   printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", 
ptegp);
goto no_page_found;
}
 
-- 
2.7.4



Re: [PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails

2017-01-26 Thread Vipin K Parashar

OPAL returns OPAL_WRONG_STATE for XSCOM operations
done to read the FIR of any core that is sleeping or offline.


On Friday 27 January 2017 05:47 AM, Michael Ellerman wrote:

Vipin K Parashar <vi...@linux.vnet.ibm.com> writes:


Added check for OPAL_WRONG_STATE error code returned from OPAL.
Currently Linux flashes "unexpected error" over console for this
error. This will avoid throwing such message and return I/O error
for such OPAL failures.

Why do we expect to get OPAL_WRONG_STATE ?

cheers





[PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails

2016-12-20 Thread Vipin K Parashar
Added check for OPAL_WRONG_STATE error code returned from OPAL.
Currently Linux flashes "unexpected error" over console for this
error. This will avoid throwing such message and return I/O error
for such OPAL failures.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/opal.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index 2822935..ab91d53 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -866,9 +866,10 @@ int opal_error_code(int rc)
case OPAL_NO_MEM:   return -ENOMEM;
case OPAL_PERMISSION:   return -EPERM;
 
-   case OPAL_UNSUPPORTED:  return -EIO;
-   case OPAL_HARDWARE: return -EIO;
-   case OPAL_INTERNAL_ERROR:   return -EIO;
+   case OPAL_UNSUPPORTED:
+   case OPAL_HARDWARE:
+   case OPAL_INTERNAL_ERROR:
+   case OPAL_WRONG_STATE:  return -EIO;
default:
pr_err("%s: unexpected OPAL error %d\n", __func__, rc);
return -EIO;
-- 
2.7.4

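
The patch above relies on switch case fall-through so that several known firmware failures share a single `return -EIO`, while only codes that reach `default` trigger the "unexpected error" message. A minimal sketch of that pattern follows. The `DEMO_FW_*` names and their values are made up for illustration and are not the real OPAL return codes, and the `unexpected` out-parameter stands in for the kernel's pr_err() call.

```c
#include <errno.h>

/* Hypothetical firmware status codes; the values are invented for this sketch. */
enum demo_fw_status {
	DEMO_FW_SUCCESS = 0,
	DEMO_FW_NO_MEM = -1,
	DEMO_FW_UNSUPPORTED = -2,
	DEMO_FW_HARDWARE = -3,
	DEMO_FW_INTERNAL_ERROR = -4,
	DEMO_FW_WRONG_STATE = -5,
};

/*
 * Map a firmware status to a negative errno. '*unexpected' is set to 1 only
 * when the code reaches the default branch, i.e. where a message would be
 * logged.
 */
static int demo_fw_error_code(int rc, int *unexpected)
{
	*unexpected = 0;
	switch (rc) {
	case DEMO_FW_SUCCESS:		return 0;
	case DEMO_FW_NO_MEM:		return -ENOMEM;

	/* Known failures share one return through case fall-through. */
	case DEMO_FW_UNSUPPORTED:
	case DEMO_FW_HARDWARE:
	case DEMO_FW_INTERNAL_ERROR:
	case DEMO_FW_WRONG_STATE:	return -EIO;
	default:
		*unexpected = 1;
		return -EIO;
	}
}
```

Both a known failure and an unknown code yield -EIO here; the only difference is whether the "unexpected" path is taken, which is the behaviour change the commit message describes.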


Re: [PATCH 0215/1529] Fix typo

2016-05-24 Thread Vipin K Parashar


On Saturday 21 May 2016 05:34 PM, Andrea Gelmini wrote:

Signed-off-by: Andrea Gelmini <andrea.gelm...@gelma.net>


Reviewed-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>


---
  arch/powerpc/include/asm/opal-api.h | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 9bb8ddf..70b5cbc 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -802,7 +802,7 @@ struct opal_sg_entry {
  };

  /*
- * Candiate image SG list.
+ * Candidate image SG list.
   *
   * length = VER | length
   */
@@ -852,7 +852,7 @@ struct opal_i2c_request {
   * with individual elements being 16 bits wide to fetch the system
   * wide EPOW status. Each element in the buffer will contain the
   * EPOW status in it's bit representation for a particular EPOW sub
- * class as defiend here. So multiple detailed EPOW status bits
+ * class as defined here. So multiple detailed EPOW status bits
   * specific for any sub class can be represented in a single buffer
   * element as it's bit representation.
   */


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v5] powerpc/pseries: Limit EPOW reset event warnings

2015-12-01 Thread Vipin K Parashar



On Tuesday 01 December 2015 09:16 AM, Michael Ellerman wrote:

On Mon, 2015-11-30 at 17:31 +0530, Vipin K Parashar wrote:

On Thursday 26 November 2015 02:50 PM, Vasant Hegde wrote:

On 11/18/2015 02:12 PM, Vipin K Parashar wrote:

Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts. At times
below EPOW reset event warning is seen to be flooding kernel log
over a period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared


@Michael,
   I think above log is raising some concern. We have been asked by multiple
people on this. Hence I think we should avoid these duplicate messages.

Hi Michael,
   Please do let me know if you have any suggestions for this patch.


It seems OK to me. I actually had it in my testing tree and was about to merge
it when Vasant replied with comments.

So if you two can agree on a final patch I'll merge it.


Thanks Michael, I will send out the next revision of this patch addressing
Vasant's comment.




cheers




[PATCH v6] powerpc/pseries: Limit EPOW reset event warnings

2015-12-01 Thread Vipin K Parashar
Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts. At times
below EPOW reset event warning is seen to be flooding kernel log
over a period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared

These EPOW reset events are spurious in nature and are triggered by
firmware without an actual EPOW event being reset. This patch avoids these
multiple EPOW reset warnings by using a counter variable. This variable
is incremented every time an EPOW event is reported. Upon receiving a EPOW
reset event the same variable is checked to filter out spurious events and
decremented accordingly.

This patch also improves log messages to better describe EPOW event being
reported. Merged adjacent log messages into single one to reduce number of
lines printed per event.

Signed-off-by: Kamalesh Babulal <kamal...@linux.vnet.ibm.com>
Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
v6 changes:
   - Added single increment for epow events counter variable outside 
 epow events switch-case scenario.
   - Corrected typos in commit log.

v5 changes:
   - Used num_epow_events counter variable to count number of epow_events
   - Improved log messages to better describe epow event.
   - Merged adjacent warnings into single lines.

v4 changes:
   - Changed the approach to depth counter to match the EPOW events and
 EPOW reset.
   - Converted pr_err() to pr_info() for non-critical errors.
   - Merged adjacent warnings into single line across the file.
   - Fixed grammar in the warnings to make them short.

v3 changes:
   - Limit warning printed by EPOW RESET event, by guarding it with bool flag.
 Instead of rate limiting all the EPOW events.

v2 changes:
   - Merged multiple adjacent pr_err/pr_emerg into single line to reduce
 multi-line warnings, based on Michael's comments.

 arch/powerpc/platforms/pseries/ras.c | 55 
 1 file changed, 31 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 3b6647e..9a3e27b 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -40,6 +40,9 @@ static int ras_check_exception_token;
 #define EPOW_SENSOR_TOKEN  9
 #define EPOW_SENSOR_INDEX  0
 
+/* EPOW events counter variable */
+static int num_epow_events;
+
 static irqreturn_t ras_epow_interrupt(int irq, void *dev_id);
 static irqreturn_t ras_error_interrupt(int irq, void *dev_id);
 
@@ -82,32 +85,30 @@ static void handle_system_shutdown(char event_modifier)
 {
switch (event_modifier) {
case EPOW_SHUTDOWN_NORMAL:
-   pr_emerg("Firmware initiated power off");
+   pr_emerg("Power off requested\n");
orderly_poweroff(true);
break;
 
case EPOW_SHUTDOWN_ON_UPS:
-   pr_emerg("Loss of power reported by firmware, system is "
-   "running on UPS/battery");
-   pr_emerg("Check RTAS error log for details");
+   pr_emerg("Loss of system power detected. System is running on"
+" UPS/battery. Check RTAS error log for details\n");
orderly_poweroff(true);
break;
 
case EPOW_SHUTDOWN_LOSS_OF_CRITICAL_FUNCTIONS:
-   pr_emerg("Loss of system critical functions reported by "
-   "firmware");
-   pr_emerg("Check RTAS error log for details");
+   pr_emerg("Loss of system critical functions detected. Check"
+" RTAS error log for details\n");
orderly_poweroff(true);
break;
 
case EPOW_SHUTDOWN_AMBIENT_TEMPERATURE_TOO_HIGH:
-   pr_emerg("Ambient temperature too high reported by firmware");
-   pr_emerg("Check RTAS error log for details");
+   pr_emerg("High ambient temperature detected. Check RTAS"
+" error log for details\n");
orderly_poweroff(true);
break;
 
default:
-   pr_err("Unknown power/cooling shutdown event (modifier %d)",
+   pr_err("Unknown power/cooling shutdown event (modifier = %d)\n",
event_modifier);
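
The counter scheme described in the v6 commit message above can be modeled in a few lines. The sketch below assumes the behaviour stated there: every non-reset event increments the counter, and a reset is only reported (and the counter decremented) when something is pending. The `demo_*` names and the action-code values are hypothetical, and the function returns a decision instead of calling pr_info().

```c
#include <stdbool.h>

/* Hypothetical action codes; only the reset/non-reset split matters here. */
enum demo_epow_action {
	DEMO_EPOW_RESET = 0,
	DEMO_EPOW_WARN_COOLING = 1,
	DEMO_EPOW_WARN_POWER = 2,
};

/* Number of reported EPOW events not yet matched by a reset. */
static int demo_num_epow_events;

/* Returns true if this event should produce a log line. */
static bool demo_epow_should_log(enum demo_epow_action action)
{
	if (action == DEMO_EPOW_RESET) {
		if (demo_num_epow_events == 0)
			return false; /* spurious reset: nothing was pending */
		demo_num_epow_events--;
		return true;
	}
	/* Every real event is logged and counted. */
	demo_num_epow_events++;
	return true;
}
```

With this model, a stream of reset notifications that arrives with no preceding event produces no output at all, which is the flood the patch is trying to avoid.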
   

Re: [PATCH v5] powerpc/pseries: Limit EPOW reset event warnings

2015-11-30 Thread Vipin K Parashar



On Thursday 26 November 2015 02:50 PM, Vasant Hegde wrote:

On 11/18/2015 02:12 PM, Vipin K Parashar wrote:

Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts. At times
below EPOW reset event warning is seen to be flooding kernel log
over a period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared


@Michael,
  I think above log is raising some concern. We have been asked by multiple
people on this. Hence I think we should avoid these duplicate messages.


Hi Michael,
 Please do let me know if you have any suggestions for this patch.




These EPOW reset events are spurious in nature and are triggered by
firmware witout an actual EPOW event being reset. This patch avoids these

s/witout/without/


sure, will edit.




multiple EPOW reset warnings by using a counter variable. This variable
is incremented every time an EPOW event is reported. Upon receiving a EPOW
reset event the same variable is checked to filter out spurious events and
decremented accordingly.

This patch also improves log messages to better describe EPOW event being
reported. Merged adjacent log messages into single one to reduce number of
lines printed per event.

Signed-off-by: Kamalesh Babulal <kamal...@linux.vnet.ibm.com>
Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
v5 changes:
- Used num_epow_events counter variable to count number of epow_events
- Improved log messages to better describe epow event.
- Merged adjacent warnings into single lines.

v4 changes:
- Changed the approach to depth counter to match the EPOW events and
  EPOW reset.
- Converted pr_err() to pr_info() for non-critical errors.
- Merged adjacent warnings into single line across the file.
- Fixed grammar in the warnings to make them short.

v3 changes:
- Limit warning printed by EPOW RESET event, by guarding it with bool flag.
  Instead of rate limiting all the EPOW events.

v2 changes:
- Merged multiple adjacent pr_err/pr_emerg into single line to reduce 
multi-line
  warnings, based on Michael's comments.

  arch/powerpc/platforms/pseries/ras.c | 54 
  1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 3b6647e..bbe2856 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -40,6 +40,8 @@ static int ras_check_exception_token;
  #define EPOW_SENSOR_TOKEN 9
  #define EPOW_SENSOR_INDEX 0
  
+static int num_epow_events;

+
  static irqreturn_t ras_epow_interrupt(int irq, void *dev_id);
  static irqreturn_t ras_error_interrupt(int irq, void *dev_id);
  
@@ -82,32 +84,30 @@ static void handle_system_shutdown(char event_modifier)

  {
switch (event_modifier) {
case EPOW_SHUTDOWN_NORMAL:
-   pr_emerg("Firmware initiated power off");
+   pr_emerg("Power off requested\n");

Why are you changing this  message? These are FW initiated Power off and helps
us to identify who initiated shutdown request.


The EPOW_SHUTDOWN_NORMAL event maps to the DPO event in hardware, which is
received when a system admin requests an LPAR poweroff. I felt that the
phrase "FW initiated poweroff" doesn't convey that the poweroff was
requested by a user, so I changed it.
   Please do suggest if you have something better to convey the message.


orderly_poweroff(true);
break;
  
  	case EPOW_SHUTDOWN_ON_UPS:

-   pr_emerg("Loss of power reported by firmware, system is "
-   "running on UPS/battery");
-   pr_emerg("Check RTAS error log for details");
+   pr_emerg("Loss of system power detected. System is running on"
+" UPS/battery. Check RTAS error log for details\n");
orderly_poweroff(true);
break;
  
  	case EPOW_SHUTDOWN_LOSS_OF_CRITICAL_FUNCTIONS:

-   pr_emerg("Loss of system critical functions reported by "
-   "firmware");
-   pr_emerg("Check RTAS error log for details");
+   pr_emerg("Loss of system critical functions detected. Check"
+" RTAS error log for details\n");
orderly_poweroff(true);
break;

[PATCH v5] powerpc/pseries: Limit EPOW reset event warnings

2015-11-18 Thread Vipin K Parashar
Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts. At times
below EPOW reset event warning is seen to be flooding kernel log
over a period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared

These EPOW reset events are spurious in nature and are triggered by
firmware witout an actual EPOW event being reset. This patch avoids these
multiple EPOW reset warnings by using a counter variable. This variable
is incremented every time an EPOW event is reported. Upon receiving a EPOW
reset event the same variable is checked to filter out spurious events and
decremented accordingly.

This patch also improves log messages to better describe EPOW event being
reported. Merged adjacent log messages into single one to reduce number of
lines printed per event.

Signed-off-by: Kamalesh Babulal <kamal...@linux.vnet.ibm.com>
Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
v5 changes:
   - Used num_epow_events counter variable to count number of epow_events
   - Improved log messages to better describe epow event.
   - Merged adjacent warnings into single lines.

v4 changes:
   - Changed the approach to depth counter to match the EPOW events and
 EPOW reset.
   - Converted pr_err() to pr_info() for non-critical errors.
   - Merged adjacent warnings into single line across the file.
   - Fixed grammar in the warnings to make them short.

v3 changes:
   - Limit warning printed by EPOW RESET event, by guarding it with bool flag.
 Instead of rate limiting all the EPOW events.

v2 changes:
   - Merged multiple adjacent pr_err/pr_emerg into single line to reduce
 multi-line warnings, based on Michael's comments.

 arch/powerpc/platforms/pseries/ras.c | 54 
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 3b6647e..bbe2856 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -40,6 +40,8 @@ static int ras_check_exception_token;
 #define EPOW_SENSOR_TOKEN  9
 #define EPOW_SENSOR_INDEX  0
 
+static int num_epow_events;
+
 static irqreturn_t ras_epow_interrupt(int irq, void *dev_id);
 static irqreturn_t ras_error_interrupt(int irq, void *dev_id);
 
@@ -82,32 +84,30 @@ static void handle_system_shutdown(char event_modifier)
 {
switch (event_modifier) {
case EPOW_SHUTDOWN_NORMAL:
-   pr_emerg("Firmware initiated power off");
+   pr_emerg("Power off requested\n");
orderly_poweroff(true);
break;
 
case EPOW_SHUTDOWN_ON_UPS:
-   pr_emerg("Loss of power reported by firmware, system is "
-   "running on UPS/battery");
-   pr_emerg("Check RTAS error log for details");
+   pr_emerg("Loss of system power detected. System is running on"
+" UPS/battery. Check RTAS error log for details\n");
orderly_poweroff(true);
break;
 
case EPOW_SHUTDOWN_LOSS_OF_CRITICAL_FUNCTIONS:
-   pr_emerg("Loss of system critical functions reported by "
-   "firmware");
-   pr_emerg("Check RTAS error log for details");
+   pr_emerg("Loss of system critical functions detected. Check"
+" RTAS error log for details\n");
orderly_poweroff(true);
break;
 
case EPOW_SHUTDOWN_AMBIENT_TEMPERATURE_TOO_HIGH:
-   pr_emerg("Ambient temperature too high reported by firmware");
-   pr_emerg("Check RTAS error log for details");
+   pr_emerg("High ambient temperature detected. Check RTAS"
+" error log for details\n");
orderly_poweroff(true);
break;
 
default:
-   pr_err("Unknown power/cooling shutdown event (modifier %d)",
+   pr_err("Unknown power/cooling shutdown event (modifier = %d)\n",
event_modifier);
}
 }
@@ -145,40 +145,47 @@ static void rtas_parse_epow_errlog(struct rtas_error_log *log)
 
switch (action_code) {
case EPOW_RESET:
-   pr_err("Non cri

Re: [RESEND,v3] powerpc/pseries: Limit EPOW reset event warnings

2015-09-10 Thread Vipin K Parashar


On 07/17/2015 01:47 PM, Kamalesh Babulal wrote:

* Michael Ellerman <m...@ellerman.id.au> [2015-07-16 14:05:52]:

[..]

Why are we even getting these reset events when nothing has happened?


Based on info received from the PFW guys and on how EPOW works in HW:

FW only acts as a passthrough here, passing along EPOW info obtained from
the underlying PHYP/HW. On FSP based POWER systems, EPOW information is
sent via Panel status notification alerts, which also contain SPCN info
along with EPOW details. As a result, even when no EPOW condition is
present, these notifications are still sent by HW to notify any SPCN
related changes. Thus FW ends up sending multiple EPOW_RESET
notifications with no actual EPOW event being active. So multiple
EPOW_RESET notifications are expected as per design, and Linux would need
to add a fix to avoid logging them multiple times.



Thanks for the review. It was seen only on one machine, couldn't
get hold of the machine any more. I am guessing here, that it might be
the firmware.


Also, merged adjacent pr_err/pr_emerg into single one to reduce
the number of lines printed per warning.

[..]
  
+/* Flag to limit EPOW RESET warning. */

+static bool epow_state;

This name is terrible, it doesn't give me any hint to what it means.

But really it should be a counter, not a boolean.

We could have multiple EPOW events come in and then later get the reset events
for them, couldn't we?


Below is the EPOW_RESET description from PAPR:

EPOW_RESET/MESSAGE (0) - No EPOW event is pending. This action code is
the lowest priority.


PFW sends EPOW_RESET only when no EPOW condition is present in the system.
With two outstanding EPOW conditions, HW also doesn't provide any means to
know that one of them has been reset. It only tells phyp/PFW the highest
priority EPOW condition and reports the reset once all such conditions
have gone away.
With this, would a boolean flag be more appropriate here?
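
If firmware really reports a reset only once all conditions have cleared, as described above, a single latch is enough to pair that reset with the events that preceded it. A minimal sketch of that alternative follows. The names are hypothetical and this is not code from any posted revision of the patch.

```c
#include <stdbool.h>

/* True while at least one EPOW condition has been reported and not reset. */
static bool demo_epow_pending;

/* Call for every non-reset EPOW event. */
static void demo_epow_event(void)
{
	demo_epow_pending = true;
}

/* Call for a reset; returns true if the reset should be logged. */
static bool demo_epow_reset_should_log(void)
{
	bool log = demo_epow_pending;

	demo_epow_pending = false; /* later resets are spurious until a new event */
	return log;
}
```

Unlike a depth counter, two events followed by two resets log only the first reset here, which matches the "reset means nothing is pending" reading of PAPR.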




So what about:

static unsigned epow_event_depth;


--->8

 From 0d27916fd09a9f0912a217432a41e2b579dc2952 Mon Sep 17 00:00:00 2001
From: Kamalesh Babulal <kamal...@linux.vnet.ibm.com>
Date: Fri, 17 Jul 2015 13:19:31 +0530
Subject: [PATCH v4] powerpc/pseries: Limit EPOW reset event warnings

Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts. At times
EPOW reset event warning, such as below could flood kernel log,
over a period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared
May 25 04:04:24 alp kernel: Non critical power or cooling issue cleared
May 25 04:07:18 alp kernel: Non critical power or cooling issue cleared
May 25 04:13:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:26 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:36 alp kernel: Non critical power or cooling issue cleared

This patch avoids these multiple EPOW reset warnings by using epow_depth
counter. Which is incremented every time EPOW event is reported and
decremented on EPOW_RESET event. With this approach number EPOW RESET
warning matches the number of EPOW events.

Also, merged adjacent pr_info/pr_err/pr_emerg into single one to reduce
the number of lines printed per warning across the file and converted
non-critical errors to pr_info from pr_error, including grammar
correction in the warnings printed.

Suggested-by: Michael Ellerman <m...@ellerman.id.au>
Cc: Anshuman Khandual <khand...@linux.vnet.ibm.com>
Cc: Anton Blanchard <an...@samba.org>
Cc: Vipin K Parashar <vi...@linux.vnet.ibm.com>
Signed-off-by: Kamalesh Babulal <kamal...@linux.vnet.ibm.com>
---
V4: Changes:
- Changed the approach to depth counter to match the EPOW events and
  EPOW reset.
- Converted pr_err() to pr_info() for non-critical errors.
- Merged adjacent warnings into single line across the file.
- Fixed grammar in the warnings to make them short.
v3 Changes:
- Limit warning printed by EPOW RESET event, by guarding it with bool flag.
  Instead of rate limiting all the EPOW events.

v2 Changes:
- Merged multiple adjacent pr_err/pr_emerg into single line to reduce 
multi-line
  warnings, based on Michael's comments.

  arch/powerpc/platforms/pseries/ras.c | 53 
  1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 02e4a17..995cab8 1006

Re: [RESEND,v3] powerpc/pseries: Limit EPOW reset event warnings

2015-09-07 Thread Vipin K Parashar


On 07/17/2015 01:47 PM, Kamalesh Babulal wrote:

* Michael Ellerman <m...@ellerman.id.au> [2015-07-16 14:05:52]:

[..]

Why are we even getting these reset events when nothing has happened?

Thanks for the review. It was seen only on one machine, couldn't
get hold of the machine any more. I am guessing here, that it might be
the firmware.


I am checking with the PFW guys as to the circumstances under which one
would see so many reset events being reported. Will post the findings as
soon as I hear back from them.


Also, merged adjacent pr_err/pr_emerg into single one to reduce
the number of lines printed per warning.

[..]
  
+/* Flag to limit EPOW RESET warning. */

+static bool epow_state;

This name is terrible, it doesn't give me any hint to what it means.

But really it should be a counter, not a boolean.

We could have multiple EPOW events come in and then later get the reset events
for them, couldn't we?


So what about:

static unsigned epow_event_depth;


--->8

 From 0d27916fd09a9f0912a217432a41e2b579dc2952 Mon Sep 17 00:00:00 2001
From: Kamalesh Babulal <kamal...@linux.vnet.ibm.com>
Date: Fri, 17 Jul 2015 13:19:31 +0530
Subject: [PATCH v4] powerpc/pseries: Limit EPOW reset event warnings

Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts. At times
EPOW reset event warning, such as below could flood kernel log,
over a period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared
May 25 04:04:24 alp kernel: Non critical power or cooling issue cleared
May 25 04:07:18 alp kernel: Non critical power or cooling issue cleared
May 25 04:13:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:26 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:36 alp kernel: Non critical power or cooling issue cleared

This patch avoids these multiple EPOW reset warnings by using epow_depth
counter. Which is incremented every time EPOW event is reported and
decremented on EPOW_RESET event. With this approach number EPOW RESET
warning matches the number of EPOW events.

Also, merged adjacent pr_info/pr_err/pr_emerg into single one to reduce
the number of lines printed per warning across the file and converted
non-critical errors to pr_info from pr_error, including grammar
correction in the warnings printed.

Suggested-by: Michael Ellerman <m...@ellerman.id.au>
Cc: Anshuman Khandual <khand...@linux.vnet.ibm.com>
Cc: Anton Blanchard <an...@samba.org>
Cc: Vipin K Parashar <vi...@linux.vnet.ibm.com>
Signed-off-by: Kamalesh Babulal <kamal...@linux.vnet.ibm.com>
---
V4: Changes:
- Changed the approach to depth counter to match the EPOW events and
  EPOW reset.
- Converted pr_err() to pr_info() for non-critical errors.
- Merged adjacent warnings into single line across the file.
- Fixed grammar in the warnings to make them short.

v3 Changes:
- Limit warning printed by EPOW RESET event, by guarding it with bool flag.
  Instead of rate limiting all the EPOW events.

v2 Changes:
- Merged multiple adjacent pr_err/pr_emerg into single line to reduce 
multi-line
  warnings, based on Michael's comments.

  arch/powerpc/platforms/pseries/ras.c | 53 
  1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 02e4a17..995cab8 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -40,6 +40,8 @@ static int ras_check_exception_token;
  #define EPOW_SENSOR_TOKEN 9
  #define EPOW_SENSOR_INDEX 0

+static unsigned epow_event_depth;
+
  static irqreturn_t ras_epow_interrupt(int irq, void *dev_id);
  static irqreturn_t ras_error_interrupt(int irq, void *dev_id);

@@ -82,32 +84,30 @@ static void handle_system_shutdown(char event_modifier)
  {
switch (event_modifier) {
case EPOW_SHUTDOWN_NORMAL:
-   pr_emerg("Firmware initiated power off");
+   pr_emerg("Firmware initiated power off\n");
orderly_poweroff(true);
break;

case EPOW_SHUTDOWN_ON_UPS:
-   pr_emerg("Loss of power reported by firmware, system is "
-   "running on UPS/battery");
-   pr_emerg("Check RTAS er

[PATCH] asm/opal-api: Assign numbers to OPAL_MSG macros of enum opal_msg_type

2015-08-31 Thread Vipin K Parashar
This patch assigns numbers to OPAL_MSG macros of enum opal_msg_type
to prevent accidental insertion of any new value in between and thus
break OPAL API. This is also helpful while backporting mainline kernel
changes to distros which run downlevel kernel and thus don't have all
OPAL messages defined, avoiding unnecessary bugs due to enum values
order mismatch.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index e9e4c52..b53f9b3 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -352,15 +352,15 @@ enum OpalLPCAddressType {
 };
 
 enum opal_msg_type {
-   OPAL_MSG_ASYNC_COMP = 0,/* params[0] = token, params[1] = rc,
+   OPAL_MSG_ASYNC_COMP = 0,/* params[0] = token, params[1] = rc,
 * additional params function-specific
 */
-   OPAL_MSG_MEM_ERR,
-   OPAL_MSG_EPOW,
-   OPAL_MSG_SHUTDOWN,  /* params[0] = 1 reboot, 0 shutdown */
-   OPAL_MSG_HMI_EVT,
-   OPAL_MSG_DPO,
-   OPAL_MSG_PRD,
+   OPAL_MSG_MEM_ERR= 1,
+   OPAL_MSG_EPOW   = 2,
+   OPAL_MSG_SHUTDOWN   = 3,/* params[0] = 1 reboot, 0 shutdown */
+   OPAL_MSG_HMI_EVT= 4,
+   OPAL_MSG_DPO= 5,
+   OPAL_MSG_PRD= 6,
OPAL_MSG_TYPE_MAX,
 };
 
-- 
1.9.3
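
The reasoning in the commit message above (explicit values keep an enum stable when entries are added or when a tree lacks some of them) can be illustrated with a small sketch. The `DEMO_MSG_*` names are hypothetical stand-ins and are not the OPAL definitions. The C11 `_Static_assert` lines are an extra guard added for illustration, not something the patch does.

```c
/* Hypothetical message types with explicit values, in the style of the patch. */
enum demo_msg_type {
	DEMO_MSG_ASYNC_COMP	= 0,
	DEMO_MSG_MEM_ERR	= 1,
	DEMO_MSG_EPOW		= 2,
	DEMO_MSG_TYPE_MAX,	/* implicit: one past the last explicit value */
};

/* Compile-time checks: a careless insertion or renumbering fails the build. */
_Static_assert(DEMO_MSG_EPOW == 2, "DEMO_MSG_EPOW value must not change");
_Static_assert(DEMO_MSG_TYPE_MAX == 3, "update users of DEMO_MSG_TYPE_MAX");

/* Returns 1 if 'type' is a message type this build knows about. */
static int demo_msg_type_known(int type)
{
	return type >= 0 && type < DEMO_MSG_TYPE_MAX;
}
```

Because each entry carries its own number, inserting a new entry in the middle, or omitting one in a backport, cannot silently shift the values of the entries after it.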


Re: [RESEND,v3] powerpc/pseries: Limit EPOW reset event warnings

2015-07-17 Thread Vipin K Parashar


On 07/16/2015 09:35 AM, Michael Ellerman wrote:

On Wed, 2015-15-07 at 04:22:06 UTC, Kamalesh Babulal wrote:

Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts. Prompting
user to take action depending upon the severity of the event.

At times EPOW reset event warning, such as below could flood
kernel log, over a period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared
May 25 04:04:24 alp kernel: Non critical power or cooling issue cleared
May 25 04:07:18 alp kernel: Non critical power or cooling issue cleared
May 25 04:13:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:26 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:36 alp kernel: Non critical power or cooling issue cleared

This patch avoids these multiple EPOW reset warnings by using a boolean
flag. This flag is initialized to false and is set to true upon arrival
of EPOW event. This same flag is checked and reset during EPOW_RESET
scenario to filter out valid EPOW reset events and avoid multiple warning
logs.

Why are we even getting these reset events when nothing has happened?


Also, merged adjacent pr_err/pr_emerg into single one to reduce
the number of lines printed per warning.

Suggested-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
[Vipin: edited the changelog]
Cc: Anshuman Khandual <khand...@linux.vnet.ibm.com>
Cc: Anton Blanchard <an...@samba.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Kamalesh Babulal <kamal...@linux.vnet.ibm.com>
---
v3 Changes:
- Limit warning printed by EPOW RESET event, by guarding it with bool flag.
  Instead of rate limiting all the EPOW events.

v2 Changes:
- Merged multiple adjacent pr_err/pr_emerg into single line to reduce 
multi-line
  warnings, based on Michael's comments.

  arch/powerpc/platforms/pseries/ras.c | 25 +
  1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 02e4a17..b30396a 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -40,6 +40,9 @@ static int ras_check_exception_token;
  #define EPOW_SENSOR_TOKEN 9
  #define EPOW_SENSOR_INDEX 0
  
+/* Flag to limit EPOW RESET warning. */

+static bool epow_state;

This name is terrible, it doesn't give me any hint to what it means.

But really it should be a counter, not a boolean.

We could have multiple EPOW events come in and then later get the reset events
for them, couldn't we?


As per PAPR, I see the below description of EPOW_RESET:

EPOW_RESET / MESSAGE (0)  - No EPOW event is pending.

So we probably need to understand whether it is sent only after all EPOW
events have reset or after just the last EPOW event. From the PAPR
description it seems to be the first case.





So what about:

static unsigned epow_event_depth;

And then below:


@@ -145,21 +148,27 @@ static void rtas_parse_epow_errlog(struct rtas_error_log *log)
  

epow_event_depth++;

switch (action_code) {
case EPOW_RESET:
if (epow_event_depth)
epow_event_depth--;

if (epow_event_depth)

+   pr_err(Non critical power or cooling issue cleared);
break;


And that's all you need.
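
For readers following the counter idea, here is a minimal user-space sketch of it. It is a model under stated assumptions, not the code in ras.c: the enum values and the helper handle_epow_event() are hypothetical, only non-reset events are counted as pending, and it assumes firmware sends one reset per pending event (the thread leaves open whether PAPR guarantees that).

```c
#include <stdbool.h>

/* Hypothetical action codes for this sketch; not the values used in ras.c. */
enum epow_action { EPOW_RESET, EPOW_WARN_COOLING, EPOW_WARN_POWER };

/* Number of EPOW events seen that have not yet been cleared by a reset. */
static unsigned int epow_event_depth;

/*
 * Returns true if a message should be logged for this event.
 * A reset is reported only when it clears a pending event.
 */
static bool handle_epow_event(enum epow_action action)
{
	if (action == EPOW_RESET) {
		if (epow_event_depth == 0)
			return false;	/* spurious reset, stay quiet */
		epow_event_depth--;
		return true;
	}

	epow_event_depth++;
	return true;
}
```

With this model, a reset that arrives when nothing is pending is silently dropped, while two warnings followed by two resets produce four log lines in total.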



	case EPOW_WARN_COOLING:
-		pr_err("Non critical cooling issue reported by firmware");
-		pr_err("Check RTAS error log for details");
+		pr_err("Non critical cooling issue reported by firmware, "
+		       "Check RTAS error log for details");

This should be:

	pr_err("Non-critical cooling issue reported by firmware, check RTAS error log for details.\n");

But that's too long, so how about:

	pr_err("Non-critical cooling issue reported, check RTAS error log for details.\n");

And if it's non-critical it shouldn't be pr_err(), it should be pr_info().

Similarly for all the other messages.


cheers



___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RESEND PATCH v3] powerpc/pseries: Limit EPOW reset event warnings

2015-07-15 Thread Vipin K Parashar


On 07/15/2015 09:52 AM, Kamalesh Babulal wrote:

Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts. Prompting
user to take action depending upon the severity of the event.


The second line probably isn't needed. Also, the line below can be merged
with the first one, as both describe the problem in the same context.



At times EPOW reset event warning, such as below could flood
kernel log, over a period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared
May 25 04:04:24 alp kernel: Non critical power or cooling issue cleared
May 25 04:07:18 alp kernel: Non critical power or cooling issue cleared
May 25 04:13:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:26 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:36 alp kernel: Non critical power or cooling issue cleared

This patch avoids these multiple EPOW reset warnings by using a boolean
flag. This flag is initialized to false and is set to true upon arrival
of EPOW event. This same flag is checked and reset during EPOW_RESET
scenario to filter out valid EPOW reset events and avoid multiple warning
logs.

Also, merged adjacent pr_err/pr_emerg into single one to reduce
the number of lines printed per warning.

Suggested-by: Vipin K Parashar vi...@linux.vnet.ibm.com
[Vipin: edited the changelog]


This probably should go to change summary below.


Cc: Anshuman Khandual khand...@linux.vnet.ibm.com
Cc: Anton Blanchard an...@samba.org
Cc: Michael Ellerman m...@ellerman.id.au
Signed-off-by: Kamalesh Babulal kamal...@linux.vnet.ibm.com
---
v3 Changes:
- Limit warning printed by EPOW RESET event, by guarding it with bool flag.
  Instead of rate limiting all the EPOW events.

v2 Changes:
- Merged multiple adjacent pr_err/pr_emerg into single line to reduce multi-line
  warnings, based on Michael's comments.

  arch/powerpc/platforms/pseries/ras.c | 25 +
  1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 02e4a17..b30396a 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -40,6 +40,9 @@ static int ras_check_exception_token;
  #define EPOW_SENSOR_TOKEN 9
  #define EPOW_SENSOR_INDEX 0

+/* Flag to limit EPOW RESET warning. */
+static bool epow_state;
+
  static irqreturn_t ras_epow_interrupt(int irq, void *dev_id);
  static irqreturn_t ras_error_interrupt(int irq, void *dev_id);

@@ -145,21 +148,27 @@ static void rtas_parse_epow_errlog(struct rtas_error_log *log)

  	switch (action_code) {
  	case EPOW_RESET:
-		pr_err("Non critical power or cooling issue cleared");
+		if (epow_state) {
+			pr_err("Non critical power or cooling issue cleared");
+			epow_state = false;
+		}
  		break;

  	case EPOW_WARN_COOLING:
-		pr_err("Non critical cooling issue reported by firmware");
-		pr_err("Check RTAS error log for details");
+		pr_err("Non critical cooling issue reported by firmware, "
+		       "Check RTAS error log for details");
+		epow_state = true;
  		break;

  	case EPOW_WARN_POWER:
-		pr_err("Non critical power issue reported by firmware");
-		pr_err("Check RTAS error log for details");
+		pr_err("Non critical power issue reported by firmware, "
+		       "Check RTAS error log for details");
+		epow_state = true;
  		break;

  	case EPOW_SYSTEM_SHUTDOWN:
  		handle_system_shutdown(epow_log->event_modifier);
+		epow_state = true;
  		break;

  	case EPOW_SYSTEM_HALT:
@@ -169,9 +178,8 @@ static void rtas_parse_epow_errlog(struct rtas_error_log *log)

  	case EPOW_MAIN_ENCLOSURE:
  	case EPOW_POWER_OFF:
-		pr_emerg("Critical power/cooling issue reported by firmware");
-		pr_emerg("Check RTAS error log for details");
-		pr_emerg("Immediate power off");
+		pr_emerg("Critical power/cooling issue reported by firmware, "
+			 "Check RTAS error log for details. Immediate power off.");
  		emergency_sync();
  		kernel_power_off();
  		break;
@@ -179,6 +187,7 @@ static void

Re: [PATCH v2] powerpc/pseries: Ratelimit EPOW event warnings

2015-07-14 Thread Vipin K Parashar

Patch looks good to me. A small nit pick below.

On 07/14/2015 01:21 PM, Kamalesh Babulal wrote:

* Vipin K Parashar vi...@linux.vnet.ibm.com [2015-06-25 00:48:20]:


On 06/02/2015 10:48 AM, Kamalesh Babulal wrote:

We print the respective warning after parsing EPOW interrupts,
prompting user to take action depending upon the severity of the
event.

Sometimes the same EPOW event warning, such as the one below, could flood
the kernel log over a period of time. So limit the warnings by using the
ratelimited variant of pr_err. Also, merge adjacent pr_err/pr_emerg into a
single one to reduce the number of lines printed per warning.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared
May 25 04:04:24 alp kernel: Non critical power or cooling issue cleared
May 25 04:07:18 alp kernel: Non critical power or cooling issue cleared
May 25 04:13:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:26 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:36 alp kernel: Non critical power or cooling issue cleared

These messages are tens of seconds to minutes apart, so rate limiting won't help.
One solution could be to use a flag-based approach. Set a flag once an
EPOW condition is detected and check that flag upon receiving EPOW_RESET.
The EPOW condition clear message should be logged only if an EPOW was
previously detected, i.e. the flag is found set.

Thanks for reviewing it. Sorry for late response.

The patch below uses a bool flag, epow_state, which is initialized to false,
set to true when any EPOW event gets reported, and cleared again once the
event is acknowledged by a reset. As seen in the example, the flooding occurs
only with reset events, so the reset action is guarded by the flag (set only
if an event was reported previously) and multiple resets without a real EPOW
event are ignored.

I have only compile-tested the patch. If this approach sounds good,
I will resend a formal patch.


diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 02e4a17..4819b1d 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -40,6 +40,8 @@ static int ras_check_exception_token;
  #define EPOW_SENSOR_TOKEN 9
  #define EPOW_SENSOR_INDEX 0

+static bool epow_state = false;
+


Explicit initialization isn't needed; a static bool defaults to false already.
A one-line comment about the flag's usage would be good.
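
The initialization point can be checked with a tiny standalone C sketch. C zero-initializes objects with static storage duration, so the flag starts out false without an explicit initializer; epow_state_initial_value() is a hypothetical helper used only for this illustration.

```c
#include <stdbool.h>

/* Flag to limit EPOW RESET warning; zero-initialized, i.e. false. */
static bool epow_state;

/* Hypothetical helper: reports the flag's value for inspection. */
static bool epow_state_initial_value(void)
{
	return epow_state;
}
```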


  static irqreturn_t ras_epow_interrupt(int irq, void *dev_id);
  static irqreturn_t ras_error_interrupt(int irq, void *dev_id);

@@ -145,21 +147,27 @@ static void rtas_parse_epow_errlog(struct rtas_error_log *log)

  	switch (action_code) {
  	case EPOW_RESET:
-		pr_err("Non critical power or cooling issue cleared");
+		if (epow_state) {
+			pr_err("Non critical power or cooling issue cleared");
+			epow_state = false;
+		}
  		break;

  	case EPOW_WARN_COOLING:
-		pr_err("Non critical cooling issue reported by firmware");
-		pr_err("Check RTAS error log for details");
+		pr_err("Non critical cooling issue reported by firmware, "
+		       "Check RTAS error log for details");
+		epow_state = true;
  		break;

  	case EPOW_WARN_POWER:
-		pr_err("Non critical power issue reported by firmware");
-		pr_err("Check RTAS error log for details");
+		pr_err("Non critical power issue reported by firmware, "
+		       "Check RTAS error log for details");
+		epow_state = true;
  		break;

  	case EPOW_SYSTEM_SHUTDOWN:
  		handle_system_shutdown(epow_log->event_modifier);
+		epow_state = true;
  		break;

  	case EPOW_SYSTEM_HALT:
@@ -169,9 +177,8 @@ static void rtas_parse_epow_errlog(struct rtas_error_log *log)

  	case EPOW_MAIN_ENCLOSURE:
  	case EPOW_POWER_OFF:
-		pr_emerg("Critical power/cooling issue reported by firmware");
-		pr_emerg("Check RTAS error log for details");
-		pr_emerg("Immediate power off");
+		pr_emerg("Critical power/cooling issue reported by firmware, "
+			 "Check RTAS error log for details. Immediate power off.");
  		emergency_sync();
  		kernel_power_off();
  		break;
@@ -179,6 +186,7 @@ static void rtas_parse_epow_errlog(struct rtas_error_log 
*log

Re: [PATCH v3] powerpc/pseries: Limit EPOW reset event warnings

2015-07-14 Thread Vipin K Parashar


Patch looks good, though it seems that we can improve the
commit log description to better describe the problem and solution.
A few suggestions are below:

"Avoid multiple EPOW reset .."
is better suited as the one-line description of this problem.

On 07/14/2015 08:39 PM, Kamalesh Babulal wrote:

We print the respective warning after parsing EPOW interrupts,


Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts.


prompting user to take action depending upon the severity of the
event.


Please merge below line with above one.



Some times EPOW rest event warning, such as below could flood

   ^^  ^^
At times EPOW reset  event warning is found to be flooding..

kernel log, over a period of time.


Paste the multiple warnings log here.


Limit these warnings by use of


This patch avoids these multiple EPOW reset warnings by using a boolean flag.



epow_state flag, which is initialized to false and when any event


This flag is initialized to false and is set to true upon arrival of EPOW event.



gets reported, the flag set to true once an event gets acknowledged
by a reset.

The reset action is guarded by bool flag (set only if there was event


This same flag is checked and reset during EPOW_RESET scenario to
filter out valid EPOW reset events and avoid multiple warning logs.


reported previously) and ignore multiple resets, without real EPOW
event.
Also, merge adjacent pr_err/pr_emerg into single one to reduce

  ^
 merged

the number of lines printed per warning.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared
May 25 04:04:24 alp kernel: Non critical power or cooling issue cleared
May 25 04:07:18 alp kernel: Non critical power or cooling issue cleared
May 25 04:13:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:26 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:36 alp kernel: Non critical power or cooling issue cleared

Suggested-by: Vipin K Parashar vi...@linux.vnet.ibm.com
Cc: Anshuman Khandual khand...@linux.vnet.ibm.com
Cc: Anton Blanchard an...@samba.org
Cc: Michael Ellerman m...@ellerman.id.au
Signed-off-by: Kamalesh Babulal kamal...@linux.vnet.ibm.com
---
v3 Changes:
   - Limit warning printed by EPOW RESET event, by guarding it with bool flag.
 Instead of rate limiting all the EPOW events.

v2 Changes:
   - Merged multiple adjacent pr_err/pr_emerg into single line to reduce multi-line
     warnings, based on Michael's comments.

  arch/powerpc/platforms/pseries/ras.c | 25 +
  1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 02e4a17..b30396a 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -40,6 +40,9 @@ static int ras_check_exception_token;
  #define EPOW_SENSOR_TOKEN 9
  #define EPOW_SENSOR_INDEX 0

+/* Flag to limit EPOW RESET warning. */
+static bool epow_state;
+
  static irqreturn_t ras_epow_interrupt(int irq, void *dev_id);
  static irqreturn_t ras_error_interrupt(int irq, void *dev_id);

@@ -145,21 +148,27 @@ static void rtas_parse_epow_errlog(struct rtas_error_log *log)

  	switch (action_code) {
  	case EPOW_RESET:
-		pr_err("Non critical power or cooling issue cleared");
+		if (epow_state) {
+			pr_err("Non critical power or cooling issue cleared");
+			epow_state = false;
+		}
  		break;

  	case EPOW_WARN_COOLING:
-		pr_err("Non critical cooling issue reported by firmware");
-		pr_err("Check RTAS error log for details");
+		pr_err("Non critical cooling issue reported by firmware, "
+		       "Check RTAS error log for details");
+		epow_state = true;
  		break;

  	case EPOW_WARN_POWER:
-		pr_err("Non critical power issue reported by firmware");
-		pr_err("Check RTAS error log for details");
+		pr_err("Non critical power issue reported by firmware, "
+		       "Check RTAS error log for details");
+		epow_state = true;
  		break;

  	case EPOW_SYSTEM_SHUTDOWN:
  		handle_system_shutdown(epow_log->event_modifier);
+		epow_state

[PATCH v9] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-07-08 Thread Vipin K Parashar
This patch adds support for OPAL EPOW (Environmental and Power Warnings)
and DPO (Delayed Power Off) events for the PowerNV platform. These events
are generated on FSP (Flexible Service Processor) based systems. EPOW
events are generated due to various critical system conditions that
require system shutdown. A few examples of these conditions are high
ambient temperature or system running on UPS power with low UPS battery.
DPO event is generated in response to admin initiated system shutdown
request. Upon receipt of EPOW and DPO events the host kernel invokes
orderly_poweroff() for performing graceful system shutdown.

Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com
Acked-by: Vaibhav Jain vaib...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/opal-api.h|  40 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 173 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index e9e4c52..442995b 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -756,6 +756,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass to OPAL a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 958e941..a091c27 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(__be16 *epow_status, __be16 *num_epow_classes);
+int64_t opal_get_dpo_status(__be64 *dpo_timeout);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..58dc330 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -9,9 +9,12 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt)	"opal-power: "	fmt
+
 #include <linux/kernel.h>
 #include <linux/reboot.h>
 #include <linux/notifier.h>
+#include <linux/of.h>
 
 #include <asm/opal.h>
 #include <asm/machdep.h>
@@ -19,30 +22,116 @@
 #define SOFT_OFF 0x00
 #define SOFT_REBOOT 0x01
 
+/* Detect EPOW event */
+static bool detect_epow(void)
+{
+   u16 epow;
+   int i, rc;
+   __be16 epow_classes;
+   __be16 opal_epow_status[OPAL_SYSEPOW_MAX] = {0};
+
+   /*
+   * Check for EPOW event. Kernel sends supported EPOW classes info
+   * to OPAL. OPAL returns EPOW info along with classes present.
+   */
+   epow_classes = cpu_to_be16(OPAL_SYSEPOW_MAX);
+   rc = opal_get_epow_status(opal_epow_status, &epow_classes);
+   if (rc != OPAL_SUCCESS

[PATCH v9] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-07-08 Thread Vipin K Parashar
This patch adds support for OPAL EPOW (Environmental and Power Warnings)
and DPO (Delayed Power Off) events for the PowerNV platform. These events
are generated on FSP (Flexible Service Processor) based systems. EPOW
events are generated due to various critical system conditions that
require system shutdown. A few examples of these conditions are high
ambient temperature or system running on UPS power with low UPS battery.
DPO event is generated in response to admin initiated system shutdown
request. Upon receipt of EPOW and DPO events the host kernel invokes
orderly_poweroff() for performing graceful system shutdown.
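
As an illustration of how a host might interpret the per-class status words this patch introduces, here is a minimal user-space sketch. The enums are copied from the opal-api.h hunk of the patch; everything else is an assumption for illustration: the status words are taken to be already converted to CPU byte order, and epow_needs_shutdown(), including its choice to treat a power configuration change as informational, is hypothetical and not the logic in opal-power.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* EPOW classes and status bits, as defined in the opal-api.h hunk. */
enum OpalSysEpow {
	OPAL_SYSEPOW_POWER	= 0,	/* Power EPOW */
	OPAL_SYSEPOW_TEMP	= 1,	/* Temperature EPOW */
	OPAL_SYSEPOW_COOLING	= 2,	/* Cooling EPOW */
	OPAL_SYSEPOW_MAX	= 3,	/* Max EPOW categories */
};

enum OpalSysPower {
	OPAL_SYSPOWER_UPS	= 0x0001, /* System on UPS power */
	OPAL_SYSPOWER_CHNG	= 0x0002, /* System power config change */
	OPAL_SYSPOWER_FAIL	= 0x0004, /* System impending power failure */
	OPAL_SYSPOWER_INCL	= 0x0008, /* System incomplete power */
};

enum OpalSysTemp {
	OPAL_SYSTEMP_AMB	= 0x0001, /* System over ambient temperature */
	OPAL_SYSTEMP_INT	= 0x0002, /* System over internal temperature */
	OPAL_SYSTEMP_HMD	= 0x0004, /* System over ambient humidity */
};

/*
 * Hypothetical helper: returns true if any class reports a status bit
 * that this sketch treats as requiring shutdown.
 */
static bool epow_needs_shutdown(const uint16_t status[], int num_classes)
{
	for (int i = 0; i < num_classes && i < OPAL_SYSEPOW_MAX; i++) {
		uint16_t epow = status[i];

		/* Assumption: a power config change alone is informational. */
		if (i == OPAL_SYSEPOW_POWER)
			epow &= (uint16_t)~OPAL_SYSPOWER_CHNG;

		if (epow)
			return true;
	}
	return false;
}
```

Each buffer element is a bitmask for one class, so a single element can carry several conditions at once, for example OPAL_SYSTEMP_AMB | OPAL_SYSTEMP_HMD in the temperature slot.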

Changes in v9:
 - Edited commit log for EPOW acronym expansion and reviewers list.

Changes in v8:
 - Added logic to filter events which don't require system shutdown
   for EPOW scenario.
 - Re-phrased patch description.

Changes in v7:
 - Fixed logic to check epow support to avoid EPOW, DPO handling path
   for BMC systems.

Changes in v6:
 - Made below changes as suggested by Michael Ellerman on previous patch.
 - Changed EPOW, DPO notifier blocks to use opal_power_control_event()
   and enhanced opal_power_control_event() to handle EPOW and DPO events.
 - Reorganized code and added/changed few variable, function names removing
   older ones.
 - Minor cleanup like removing unused headers, blank lines etc.

Changes in v5:
 - Made changes to address review comments on previous patch.

Changes in v4:
 - Made changes to address review comments on previous patch.

Changes in v3:
 - Made changes to immediately call orderly_poweroff upon receipt of
   OPAL EPOW, DPO notifications.
 - Made code changes to address review comments on previous patch.
 - Made code changes to use existing OPAL EPOW API.
 - Removed patch to extract EPOW event timeout from OPAL device-tree.

Changes in v2:
 - Made code changes to improve code as per previous review comments.
 - Added patch to obtain EPOW event timeout values from OPAL device-tree.

Vipin K Parashar (1):
  powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV
platform

 arch/powerpc/include/asm/opal-api.h|  40 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 173 insertions(+), 18 deletions(-)

-- 
1.9.3


Re: [RESEND PATCH v8] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-26 Thread Vipin K Parashar


On 06/11/2015 05:13 PM, Vipin K Parashar wrote:

This patch adds support for OPAL EPOW (Early Power Off Warning) and


Hi Michael,
  Please use the expansion below for the EPOW acronym in the commit log
once you add this patch.
EPOW = Environmental and Power Warnings
It matches the PAPR expansion for EPOW. This way we will have the same
EPOW expansion for the pSeries and PowerNV platforms, avoiding any confusion.

Thanks for your help with this.


DPO (Delayed Power Off) events for the PowerNV platform. These events
are generated on FSP (Flexible Service Processor) based systems. EPOW
events are generated due to various critical system conditions that
require system shutdown. A few examples of these conditions are high
ambient temperature or system running on UPS power with low UPS battery.
DPO event is generated in response to admin initiated system shutdown
request. Upon receipt of EPOW and DPO events the host kernel invokes
orderly_poweroff() for performing graceful system shutdown.

Reviewed-by: Joel Stanley j...@jms.id.au
Reviewed-by: Vaibhav Jain vaib...@linux.vnet.ibm.com
Reviewed-by: Michael Ellerman m...@ellerman.id.au
Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com
---
  arch/powerpc/include/asm/opal-api.h|  40 +++
  arch/powerpc/include/asm/opal.h|   3 +-
  arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
  arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
  4 files changed, 173 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 0321a90..f460435 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -730,6 +730,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
  };

+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass to OPAL a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
  #endif /* __ASSEMBLY__ */

  #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..8b174f3 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
  int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
  int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t error_type, uint8_t mask_action);
  int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(__be16 *epow_status, __be16 *num_epow_classes);
+int64_t opal_get_dpo_status(__be64 *dpo_timeout);
  int64_t opal_set_system_attention_led(uint8_t led_action);
  int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..58dc330 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -9,9 +9,12 @@
   * 2 of the License, or (at your option) any later version.
   */

+#define pr_fmt(fmt)	"opal-power: "	fmt
+
  #include <linux/kernel.h>
  #include <linux/reboot.h>
  #include <linux/notifier.h>
+#include <linux/of.h>

  #include <asm/opal.h>
  #include <asm/machdep.h>
@@ -19,30 +22,116 @@
  #define SOFT_OFF 0x00
  #define SOFT_REBOOT 0x01

+/* Detect EPOW event */
+static

Re: [PATCH v2] powerpc/pseries: Ratelimit EPOW event warnings

2015-06-24 Thread Vipin K Parashar


On 06/02/2015 10:48 AM, Kamalesh Babulal wrote:

We print the respective warning after parsing EPOW interrupts,
prompting user to take action depending upon the severity of the
event.

Sometimes the same EPOW event warning, such as the one below, could flood
the kernel log over a period of time. So limit the warnings by using the
ratelimited variant of pr_err. Also, merge adjacent pr_err/pr_emerg into a
single one to reduce the number of lines printed per warning.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared
May 25 04:04:24 alp kernel: Non critical power or cooling issue cleared
May 25 04:07:18 alp kernel: Non critical power or cooling issue cleared
May 25 04:13:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:26 alp kernel: Non critical power or cooling issue cleared
May 25 04:22:36 alp kernel: Non critical power or cooling issue cleared


These messages are tens of seconds to minutes apart, so rate limiting won't help.
One solution could be to use a flag-based approach. Set a flag once an
EPOW condition is detected and check that flag upon receiving EPOW_RESET.
The EPOW condition clear message should be logged only if an EPOW was
previously detected, i.e. the flag is found set.
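
To make the rate-limiting point concrete, here is a small user-space model of a windowed rate limiter. It assumes the kernel's default ratelimit parameters (a burst of 10 messages per 5-second interval); struct ratelimit, ratelimit_allow() and count_allowed() are hypothetical names for this sketch and are not the kernel's own implementation.

```c
#include <stdbool.h>

/* Assumed defaults: at most 10 messages per 5-second window. */
#define RL_INTERVAL_SECS 5
#define RL_BURST 10

struct ratelimit {
	long window_start;	/* start of the current window, in seconds */
	int printed;		/* messages allowed so far in this window */
};

/* Returns true if a message arriving at time 'now' would be printed. */
static bool ratelimit_allow(struct ratelimit *rl, long now)
{
	if (now - rl->window_start >= RL_INTERVAL_SECS) {
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed >= RL_BURST)
		return false;
	rl->printed++;
	return true;
}

/* Counts how many of n timestamps (seconds, ascending) get through. */
static int count_allowed(const long *times, int n)
{
	struct ratelimit rl;
	int allowed = 0;

	if (n <= 0)
		return 0;

	rl.window_start = times[0];
	rl.printed = 0;
	for (int i = 0; i < n; i++)
		if (ratelimit_allow(&rl, times[i]))
			allowed++;
	return allowed;
}
```

Even the closest pair of messages in the log above is 10 seconds apart (04:22:26 and 04:22:36), so under these assumed defaults every one of them would still be printed; only a burst within a single window would be trimmed.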



Signed-off-by: Kamalesh Babulal kamal...@linux.vnet.ibm.com
Cc: Anshuman Khandual khand...@linux.vnet.ibm.com
Cc: Anton Blanchard an...@samba.org
Cc: Michael Ellerman m...@ellerman.id.au
---
v2 Changes:
  - Merged multiple adjacent pr_err/pr_emerg into single line to reduce multi-line
    warnings, based on Michael's comments.

  arch/powerpc/platforms/pseries/ras.c | 17 -
  1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 02e4a17..3620935 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -145,17 +145,17 @@ static void rtas_parse_epow_errlog(struct rtas_error_log *log)

  	switch (action_code) {
  	case EPOW_RESET:
-		pr_err("Non critical power or cooling issue cleared");
+		pr_err_ratelimited("Non critical power or cooling issue cleared");
  		break;

  	case EPOW_WARN_COOLING:
-		pr_err("Non critical cooling issue reported by firmware");
-		pr_err("Check RTAS error log for details");
+		pr_err_ratelimited("Non critical cooling issue reported by firmware,"
+				   " Check RTAS error log for details");
  		break;

  	case EPOW_WARN_POWER:
-		pr_err("Non critical power issue reported by firmware");
-		pr_err("Check RTAS error log for details");
+		pr_err_ratelimited("Non critical power issue reported by firmware,"
+				   " Check RTAS error log for details");
  		break;

  	case EPOW_SYSTEM_SHUTDOWN:
@@ -169,15 +169,14 @@ static void rtas_parse_epow_errlog(struct rtas_error_log *log)

  	case EPOW_MAIN_ENCLOSURE:
  	case EPOW_POWER_OFF:
-		pr_emerg("Critical power/cooling issue reported by firmware");
-		pr_emerg("Check RTAS error log for details");
-		pr_emerg("Immediate power off");
+		pr_emerg("Critical power/cooling issue reported by firmware,"
+			 " Check RTAS error log for details. Immediate power off");
  		emergency_sync();
  		kernel_power_off();
  		break;

  	default:
-		pr_err("Unknown power/cooling event (action code %d)",
+		pr_err_ratelimited("Unknown power/cooling event (action code %d)",
  			action_code);
  	}
  }



Re: [PATCH v8] powerpc/powernv: Poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-11 Thread Vipin K Parashar


On 06/11/2015 04:25 AM, Stewart Smith wrote:

Vipin K Parashar vi...@linux.vnet.ibm.com writes:

This patch adds support for FSP (Flexible Service Processor)
EPOW (Early Power Off Warning) and DPO (Delayed Power Off) events for

Not restricted to FSP systems, it's a generic OPAL API that any platform
could implement.


Re-phrased commit log message and re-posted patch.


[RESEND PATCH v8] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-11 Thread Vipin K Parashar
This patch adds support for OPAL EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events for the PowerNV platform. These events
are generated on FSP (Flexible Service Processor) based systems. EPOW
events are generated due to various critical system conditions that
require system shutdown. A few examples of these conditions are high
ambient temperature or system running on UPS power with low UPS battery.
DPO event is generated in response to admin initiated system shutdown
request. Upon receipt of EPOW and DPO events the host kernel invokes
orderly_poweroff() for performing graceful system shutdown.

Reviewed-by: Joel Stanley <j...@jms.id.au>
Reviewed-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <m...@ellerman.id.au>

Changes in v8:
 - Added logic to filter out events which don't require system shutdown
   in the EPOW scenario.
 - Re-phrased patch description.

Changes in v7:
 - Fixed logic to check epow support to avoid EPOW, DPO handling path
   for BMC systems.

Changes in v6:
 - Made below changes as suggested by Michael Ellerman on previous patch.
 - Changed EPOW, DPO notifier blocks to use opal_power_control_event()
   and enhanced opal_power_control_event() to handle EPOW and DPO events.
 - Reorganized code and added/changed a few variable and function names,
   removing older ones.
 - Minor cleanup like removing unused headers, blank lines etc.

Changes in v5:
 - Made changes to address review comments on previous patch.

Changes in v4:
 - Made changes to address review comments on previous patch.

Changes in v3:
 - Made changes to immediately call orderly_poweroff upon receipt of
   OPAL EPOW, DPO notifications.
 - Made code changes to address review comments on previous patch.
 - Made code changes to use existing OPAL EPOW API.
 - Removed patch to extract EPOW event timeout from OPAL device-tree.

Changes in v2:
 - Made code changes to improve code as per previous review comments.
 - Added patch to obtain EPOW event timeout values from OPAL device-tree.
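
As an illustration of the v8 filtering change, the following is a minimal
userspace sketch, not the code from the patch. It assumes the buffer layout
described in the opal-api.h comment (one big-endian 16-bit element per EPOW
class). Which bits count as "no shutdown needed" is an assumption made here
for illustration: power-class bits other than OPAL_SYSPOWER_UPS are masked
out. be16_decode() and epow_needs_shutdown() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EPOW classes and power-class bits, as listed in the opal-api.h hunk. */
enum { OPAL_SYSEPOW_POWER = 0, OPAL_SYSEPOW_TEMP = 1,
       OPAL_SYSEPOW_COOLING = 2, OPAL_SYSEPOW_MAX = 3 };
enum { OPAL_SYSPOWER_UPS = 0x0001, OPAL_SYSPOWER_CHNG = 0x0002,
       OPAL_SYSPOWER_FAIL = 0x0004, OPAL_SYSPOWER_INCL = 0x0008 };

/* Decode one big-endian 16-bit element, independent of host endianness. */
static uint16_t be16_decode(const uint8_t raw[2])
{
	return (uint16_t)((raw[0] << 8) | raw[1]);
}

/*
 * Return true if any EPOW class reports a condition that needs a shutdown.
 * Assumption (not taken from the patch): in the power class only the UPS
 * bit requires shutdown, so the other power bits are filtered out.
 */
static bool epow_needs_shutdown(const uint8_t status[][2], size_t classes)
{
	const uint16_t no_shutdown = OPAL_SYSPOWER_CHNG | OPAL_SYSPOWER_FAIL |
				     OPAL_SYSPOWER_INCL;

	for (size_t i = 0; i < classes && i < OPAL_SYSEPOW_MAX; i++) {
		uint16_t epow = be16_decode(status[i]);

		if (i == OPAL_SYSEPOW_POWER)
			epow &= (uint16_t)~no_shutdown;
		if (epow)
			return true;
	}
	return false;
}
```

With this sketch a buffer that only has OPAL_SYSPOWER_CHNG set is ignored,
while a UPS or temperature bit is reported as needing shutdown.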

Vipin K Parashar (1):
  powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV
platform

 arch/powerpc/include/asm/opal-api.h|  40 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 173 insertions(+), 18 deletions(-)

-- 
1.9.3


[RESEND PATCH v8] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-11 Thread Vipin K Parashar
This patch adds support for OPAL EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events for the PowerNV platform. These events
are generated on FSP (Flexible Service Processor) based systems. EPOW
events are generated due to various critical system conditions that
require system shutdown. A few examples of these conditions are high
ambient temperature or system running on UPS power with low UPS battery.
DPO event is generated in response to admin initiated system shutdown
request. Upon receipt of EPOW and DPO events the host kernel invokes
orderly_poweroff() for performing graceful system shutdown.

Reviewed-by: Joel Stanley <j...@jms.id.au>
Reviewed-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h|  40 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 173 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..f460435 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -730,6 +730,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass to OPAL a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..8b174f3 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t 
error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t 
led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(__be16 *epow_status, __be16 *num_epow_classes);
+int64_t opal_get_dpo_status(__be64 *dpo_timeout);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c 
b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..58dc330 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -9,9 +9,12 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt) "opal-power: " fmt
+
 #include <linux/kernel.h>
 #include <linux/reboot.h>
 #include <linux/notifier.h>
+#include <linux/of.h>
 
 #include <asm/opal.h>
 #include <asm/machdep.h>
@@ -19,30 +22,116 @@
 #define SOFT_OFF 0x00
 #define SOFT_REBOOT 0x01
 
+/* Detect EPOW event */
+static bool detect_epow(void)
+{
+   u16 epow;
+   int i, rc;
+   __be16 epow_classes;
+   __be16 opal_epow_status[OPAL_SYSEPOW_MAX] = {0};
+
+   /*
+   * Check for EPOW event. Kernel sends supported EPOW classes info
+   * to OPAL. OPAL returns EPOW info along with classes present.
+   */
+   epow_classes = cpu_to_be16(OPAL_SYSEPOW_MAX);
+   rc

[PATCH v8] powerpc/powernv: Poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-10 Thread Vipin K Parashar
This patch adds support for FSP (Flexible Service Processor)
EPOW (Early Power Off Warning) and DPO (Delayed Power Off) events for
the PowerNV platform. EPOW events are generated by FSP due to various
critical system conditions that require system shutdown. A few examples
of these conditions are high ambient temperature or system running on
UPS power with low UPS battery. DPO event is generated in response to
admin initiated system shutdown request. Upon receipt of EPOW and DPO
events the host kernel invokes orderly_poweroff() for performing
graceful system shutdown.

Reviewed-by: Joel Stanley <j...@jms.id.au>
Reviewed-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <m...@ellerman.id.au>

Changes in v8:
 - Added logic to filter out events which don't require system shutdown
   in the EPOW scenario.

Changes in v7:
 - Fixed logic to check epow support to avoid EPOW, DPO handling path
   for BMC systems.

Changes in v6:
 - Made below changes as suggested by Michael Ellerman on previous patch.
 - Changed EPOW, DPO notifier blocks to use opal_power_control_event()
   and enhanced opal_power_control_event() to handle EPOW and DPO events.
 - Reorganized code and added/changed a few variable and function names,
   removing older ones.
 - Minor cleanup like removing unused headers, blank lines etc.

Changes in v5:
 - Made changes to address review comments on previous patch.

Changes in v4:
 - Made changes to address review comments on previous patch.

Changes in v3:
 - Made changes to immediately call orderly_poweroff upon receipt of
   OPAL EPOW, DPO notifications.
 - Made code changes to address review comments on previous patch.
 - Made code changes to use existing OPAL EPOW API.
 - Removed patch to extract EPOW event timeout from OPAL device-tree.

Changes in v2:
 - Made code changes to improve code as per previous review comments.
 - Added patch to obtain EPOW event timeout values from OPAL device-tree.

Vipin K Parashar (1):
  powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV
platform

 arch/powerpc/include/asm/opal-api.h|  40 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 173 insertions(+), 18 deletions(-)

-- 
1.9.3


[PATCH v8] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-10 Thread Vipin K Parashar
This patch adds support for FSP (Flexible Service Processor)
EPOW (Early Power Off Warning) and DPO (Delayed Power Off) events for
the PowerNV platform. EPOW events are generated by FSP due to various
critical system conditions that require system shutdown. A few examples
of these conditions are high ambient temperature or system running on
UPS power with low UPS battery. DPO event is generated in response to
admin initiated system shutdown request. Upon receipt of EPOW and DPO
events the host kernel invokes orderly_poweroff() for performing
graceful system shutdown.

Reviewed-by: Joel Stanley <j...@jms.id.au>
Reviewed-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h|  40 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 173 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..f460435 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -730,6 +730,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass to OPAL a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..8b174f3 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t 
error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t 
led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(__be16 *epow_status, __be16 *num_epow_classes);
+int64_t opal_get_dpo_status(__be64 *dpo_timeout);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c 
b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..58dc330 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -9,9 +9,12 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt) "opal-power: " fmt
+
 #include <linux/kernel.h>
 #include <linux/reboot.h>
 #include <linux/notifier.h>
+#include <linux/of.h>
 
 #include <asm/opal.h>
 #include <asm/machdep.h>
@@ -19,30 +22,116 @@
 #define SOFT_OFF 0x00
 #define SOFT_REBOOT 0x01
 
+/* Detect EPOW event */
+static bool detect_epow(void)
+{
+   u16 epow;
+   int i, rc;
+   __be16 epow_classes;
+   __be16 opal_epow_status[OPAL_SYSEPOW_MAX] = {0};
+
+   /*
+   * Check for EPOW event. Kernel sends supported EPOW classes info
+   * to OPAL. OPAL returns EPOW info along with classes present.
+   */
+   epow_classes = cpu_to_be16(OPAL_SYSEPOW_MAX);
+   rc = opal_get_epow_status(opal_epow_status, epow_classes

Re: [PATCH v8] powerpc/powernv: Poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-10 Thread Vipin K Parashar


On 06/11/2015 04:25 AM, Stewart Smith wrote:

Vipin K Parashar <vi...@linux.vnet.ibm.com> writes:

This patch adds support for FSP (Flexible Service Processor)
EPOW (Early Power Off Warning) and DPO (Delayed Power Off) events for

Not restricted to FSP systems, it's a generic OPAL API that any platform
could implement.


Yes, the EPOW and DPO APIs are generic and can be used on any platform.
But the text says that the patch adds support for EPOW and DPO events, which
exist only on FSP-based systems as of today.


Re: [v6] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-08 Thread Vipin K Parashar


On 06/05/2015 03:31 AM, Michael Ellerman wrote:

On Thu, 2015-04-06 at 12:03:17 UTC, Vipin K Parashar wrote:

This patch adds support for FSP (Flexible Service Processor)
EPOW (Early Power Off Warning) and DPO (Delayed Power Off) events for
the PowerNV platform. EPOW events are generated by FSP due to various
critical system conditions that require system shutdown. A few examples
of these conditions are high ambient temperature or system running on
UPS power with low UPS battery. DPO event is generated in response to
admin initiated system shutdown request. Upon receipt of EPOW and DPO
events the host kernel invokes orderly_poweroff() for performing
graceful system shutdown.

Reviewed-by: Joel Stanley <j...@jms.id.au>
Reviewed-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>

Hi Vipin,

One issue, on mambo I'm seeing:

   [666973573,3] OPAL: Called with bad token 105 !
   opal-power: Existing DPO event detected.
   reboot: Failed to start orderly shutdown: forcing the issue
   reboot: Power down
   [684431322,5] OPAL: Shutdown request type 0x0...


ie. at boot it shuts down immediately.

The problem is in here I think:


+   /* Check for DPO event */
+   rc = opal_get_dpo_status(&opal_dpo_timeout);
+   if (rc != OPAL_WRONG_STATE) {
+   pr_info("Existing DPO event detected.\n");
+   return true;
+   }


Thanks for catching it. EPOW and DPO don't exist on BMC systems, so we
shouldn't be hitting this path on BMC/mambo. The bug is further down, where we
check for the epow device-tree node. It was introduced in this version of the
patch, when I renamed the epow_dpo_supported flag to supported and reorganized
the code. I will send out a fix for this. The if condition above also isn't
perfect, so I will fix it too in the new patch.
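
The boot-time shutdown reported above comes down to how the return code of
the DPO call is interpreted. The following is a minimal userspace sketch, not
kernel code. The return-code values are assumed to match opal-api.h, and
OPAL_PARAMETER stands in (as an assumption) for whatever error firmware
returns when the call is not implemented. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* OPAL return codes; values assumed to match opal-api.h. */
#define OPAL_SUCCESS		0
#define OPAL_PARAMETER		(-1)
#define OPAL_WRONG_STATE	(-14)

/* v6 check: anything other than OPAL_WRONG_STATE counts as a DPO event. */
static bool dpo_pending_v6(int64_t rc)
{
	return rc != OPAL_WRONG_STATE;
}

/* Stricter check, as used in v7: only success counts as a DPO event. */
static bool dpo_pending_v7(int64_t rc)
{
	return rc == OPAL_SUCCESS;
}
```

With the v6 check, a platform that rejects the call with any other error is
still treated as having a pending DPO event, which matches the immediate
power-off seen on mambo. The stricter check ignores such errors.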

For FSP systems, please use the FW patch below to avoid notifications for
events which don't cause EPOW. It is already in the recent skiboot tree.
Commit id 1954251ca83b8a458193e629d15da06d00643ae8

https://patchwork.ozlabs.org/patch/472303/



This also makes me think you probably haven't tested this on a BMC machine?

cheers




[PATCH v7] powerpc/powernv: Poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-08 Thread Vipin K Parashar
This patch adds support for FSP (Flexible Service Processor)
EPOW (Early Power Off Warning) and DPO (Delayed Power Off) events for
the PowerNV platform. EPOW events are generated by FSP due to various
critical system conditions that require system shutdown. A few examples
of these conditions are high ambient temperature or system running on
UPS power with low UPS battery. DPO event is generated in response to
admin initiated system shutdown request. Upon receipt of EPOW and DPO
events the host kernel invokes orderly_poweroff() for performing
graceful system shutdown.

Reviewed-by: Joel Stanley <j...@jms.id.au>
Reviewed-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <m...@ellerman.id.au>

Changes in v7:
 - Fixed logic to check epow support to avoid EPOW, DPO handling path on BMC.

Changes in v6:
 - Made below changes as suggested by Michael Ellerman on previous patch.
 - Changed EPOW, DPO notifier blocks to use opal_power_control_event()
   and enhanced opal_power_control_event() to handle EPOW and DPO events.
 - Reorganized code and added/changed a few variable and function names,
   removing older ones.
 - Minor cleanup like removing unused headers, blank lines etc.

Changes in v5:
 - Made changes to address review comments on previous patch.

Changes in v4:
 - Made changes to address review comments on previous patch.

Changes in v3:
 - Made changes to immediately call orderly_poweroff upon receipt of
   OPAL EPOW, DPO notifications.
 - Made code changes to address review comments on previous patch.
 - Made code changes to use existing OPAL EPOW API.
 - Removed patch to extract EPOW event timeout from OPAL device-tree.

Changes in v2:
 - Made code changes to improve code as per previous review comments.
 - Added patch to obtain EPOW event timeout values from OPAL device-tree.

Vipin K Parashar (1):
  powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV
platform

 arch/powerpc/include/asm/opal-api.h|  40 
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 126 +
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 152 insertions(+), 18 deletions(-)

-- 
1.9.3


[PATCH v7] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-08 Thread Vipin K Parashar
This patch adds support for FSP (Flexible Service Processor)
EPOW (Early Power Off Warning) and DPO (Delayed Power Off) events for
the PowerNV platform. EPOW events are generated by FSP due to various
critical system conditions that require system shutdown. A few examples
of these conditions are high ambient temperature or system running on
UPS power with low UPS battery. DPO event is generated in response to
admin initiated system shutdown request. Upon receipt of EPOW and DPO
events the host kernel invokes orderly_poweroff() for performing
graceful system shutdown.

Reviewed-by: Joel Stanley <j...@jms.id.au>
Reviewed-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h|  40 
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 126 +
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 152 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..f460435 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -730,6 +730,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass to OPAL a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..8b174f3 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t 
error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t 
led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(__be16 *epow_status, __be16 *num_epow_classes);
+int64_t opal_get_dpo_status(__be64 *dpo_timeout);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c 
b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..a27d390 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -9,9 +9,12 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt) "opal-power: " fmt
+
 #include <linux/kernel.h>
 #include <linux/reboot.h>
 #include <linux/notifier.h>
+#include <linux/of.h>
 
 #include <asm/opal.h>
 #include <asm/machdep.h>
@@ -19,30 +22,95 @@
 #define SOFT_OFF 0x00
 #define SOFT_REBOOT 0x01
 
+/* Detect existing EPOW, DPO events */
+static bool poweroff_pending(void)
+{
+   int i, rc;
+   __be16 epow_classes;
+   __be16 opal_epow_status[OPAL_SYSEPOW_MAX] = {0};
+   __be64 opal_dpo_timeout;
+
+   /* Check for DPO event */
+   rc = opal_get_dpo_status(&opal_dpo_timeout);
+   if (rc == OPAL_SUCCESS) {
+   pr_info("Existing DPO event detected.\n");
+   return true;
+   }
+
+   /*
+   * Check

[PATCH v6] powerpc/powernv: Poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-04 Thread Vipin K Parashar
This patch adds support for FSP (Flexible Service Processor)
EPOW (Early Power Off Warning) and DPO (Delayed Power Off) events for
the PowerNV platform. EPOW events are generated by FSP due to various
critical system conditions that require system shutdown. A few examples
of these conditions are high ambient temperature or system running on
UPS power with low UPS battery. DPO event is generated in response to
admin initiated system shutdown request. Upon receipt of EPOW and DPO
events the host kernel invokes orderly_poweroff() for performing
graceful system shutdown.

Reviewed-by: Joel Stanley <j...@jms.id.au>
Reviewed-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <m...@ellerman.id.au>

Changes in v6:
 - Made below changes as suggested by Michael Ellerman on previous patch.
 - Changed EPOW, DPO notifier blocks to use opal_power_control_event()
   and enhanced opal_power_control_event() to handle EPOW and DPO events.
 - Reorganized code and added/changed a few variable and function names,
   removing older ones.
 - Minor cleanup like removing unused headers, blank lines etc.

Changes in v5:
 - Made changes to address review comments on previous patch.

Changes in v4:
 - Made changes to address review comments on previous patch.

Changes in v3:
 - Made changes to immediately call orderly_poweroff upon receipt of
   OPAL EPOW, DPO notifications.
 - Made code changes to address review comments on previous patch.
 - Made code changes to use existing OPAL EPOW API.
 - Removed patch to extract EPOW event timeout from OPAL device-tree.

Changes in v2:
 - Made code changes to improve code as per previous review comments.
 - Added patch to obtain EPOW event timeout values from OPAL device-tree.

Vipin K Parashar (1):
  powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV
platform

 arch/powerpc/include/asm/opal-api.h|  40 
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 125 +
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 152 insertions(+), 17 deletions(-)

-- 
1.9.3


[PATCH v6] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-04 Thread Vipin K Parashar
This patch adds support for FSP (Flexible Service Processor)
EPOW (Early Power Off Warning) and DPO (Delayed Power Off) events for
the PowerNV platform. EPOW events are generated by FSP due to various
critical system conditions that require system shutdown. A few examples
of these conditions are high ambient temperature or system running on
UPS power with low UPS battery. DPO event is generated in response to
admin initiated system shutdown request. Upon receipt of EPOW and DPO
events the host kernel invokes orderly_poweroff() for performing
graceful system shutdown.

Reviewed-by: Joel Stanley <j...@jms.id.au>
Reviewed-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h|  40 
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 125 +
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 152 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..f460435 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -730,6 +730,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass to OPAL a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..8b174f3 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t 
error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t 
led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(__be16 *epow_status, __be16 *num_epow_classes);
+int64_t opal_get_dpo_status(__be64 *dpo_timeout);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c 
b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..b9f6620 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -9,9 +9,12 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt) "opal-power: " fmt
+
 #include <linux/kernel.h>
 #include <linux/reboot.h>
 #include <linux/notifier.h>
+#include <linux/of.h>
 
 #include <asm/opal.h>
 #include <asm/machdep.h>
@@ -19,30 +22,95 @@
 #define SOFT_OFF 0x00
 #define SOFT_REBOOT 0x01
 
+/* Detect existing EPOW, DPO events */
+static bool poweroff_pending(void)
+{
+   int i, rc;
+   __be16 epow_classes;
+   __be16 opal_epow_status[OPAL_SYSEPOW_MAX] = {0};
+   __be64 opal_dpo_timeout;
+
+   /* Check for DPO event */
+   rc = opal_get_dpo_status(&opal_dpo_timeout);
+   if (rc != OPAL_WRONG_STATE) {
+   pr_info("Existing DPO event detected.\n");
+   return true;
+   }
+
+   /*
+   * Check

Re: [v5] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-03 Thread Vipin K Parashar

Hi Michael,
  Thanks for the review. Responses below.

On 06/03/2015 10:43 AM, Michael Ellerman wrote:

On Mon, 2015-18-05 at 15:18:04 UTC, Vipin K Parashar wrote:

This patch adds support for FSP EPOW (Early Power Off Warning) and

Please spell out the acronyms the first time you use them, including FSP.


Will do.




DPO (Delayed Power Off) events for PowerNV platform. EPOW events are

 ^
the


the PowerNV platform.  Will edit.




generated by SPCN/FSP due to various critical system conditions that

SPCN?


Will remove SPCN. FSP should be sufficient.




need system shutdown. Few examples of these conditions are high

 ^
s/need/require/ ?   A few


Agreed.




ambient temperature or system running on UPS power with low UPS battery.
DPO event is generated in response to admin initiated system request.

Blank line between paragraphs please.


Sure




Upon receipt of EPOW and DPO events host kernel invokes

 ^
the host kernel


will edit


orderly_poweroff for performing graceful system shutdown. System admin

I like it if you spell functions with a trailing () to make it clear they are
functions, so this would be orderly_poweroff().


Agreed.




can also add systemd service shutdown scripts to perform any specific
actions like graceful guest shutdown upon system poweroff. libvirt-guests
is systemd service available on recent distros for management of guests
at system start/shutdown time.

This last part about the scripts is not relevant to the kernel patch so just
leave it out please.


Agreed.




Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com
Reviewed-by: Joel Stanley j...@jms.id.au
Reviewed-by: Vaibhav Jain vaib...@linux.vnet.ibm.com
---
  arch/powerpc/include/asm/opal-api.h|  44 
  arch/powerpc/include/asm/opal.h|   3 +-
  arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
  arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
  4 files changed, 179 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..90fa364 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -355,6 +355,10 @@ enum opal_msg_type {
OPAL_MSG_TYPE_MAX,
  };
  
+/* OPAL_MSG_SHUTDOWN parameter values */

+#define SOFT_OFF	0x00
+#define SOFT_REBOOT	0x01

I don't see this in the skiboot version of opal-api.h ?

They should be kept in sync.

If it's a Linux only define it should go in opal.h


Agreed. Won't add these definitions to opal-api.h, as they are not present
in the skiboot version of opal-api.h.



  struct opal_msg {
__be32 msg_type;
__be32 reserved;
@@ -730,6 +734,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
  };
  
+/*

+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};

I don't see the last three of these enums used at all, so please drop them.


The OPAL_SYSPOWER_CHNG / FAIL / INCL, OPAL_SYSTEMP_HMD and OPAL_SYSCOOL_INSF
enums aren't used here, but they are part of the skiboot version of
opal-api.h and thus need to be retained. PKVM2.1 uses these enums, so they
can't be removed from the skiboot opal-api.h.





  #endif /* __ASSEMBLY__ */
  
  #endif /* __OPAL_API_H */

diff

Re: [PATCH v5] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-06-01 Thread Vipin K Parashar

Hi Michael,
 Please make the minor additions below to the commit log once you
accept this patch. Thanks for your help with this. Let me know
if anything else is needed from me on this.

Regards,
Vipin

On 05/18/2015 08:48 PM, Vipin K Parashar wrote:

This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events for PowerNV platform. EPOW events are
generated by SPCN/FSP due to various critical system conditions that
need system shutdown. Few examples of these conditions are high
ambient temperature or system running on UPS power with low UPS battery.
DPO event is generated in response to admin initiated system request.


s/system request/shutdown request/


Upon receipt of EPOW and DPO events host kernel invokes
orderly_poweroff for performing graceful system shutdown. System admin
can also add systemd service shutdown scripts to perform any specific
actions like graceful guest shutdown upon system poweroff. libvirt-guests
is systemd service available on recent distros for management of guests
at system start/shutdown time.

Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com


Reviewed-by: Joel Stanley j...@jms.id.au
Reviewed-by: Vaibhav Jain vaib...@linux.vnet.ibm.com


---
  arch/powerpc/include/asm/opal-api.h|  44 
  arch/powerpc/include/asm/opal.h|   3 +-
  arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
  arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
  4 files changed, 179 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..90fa364 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -355,6 +355,10 @@ enum opal_msg_type {
OPAL_MSG_TYPE_MAX,
  };

+/* OPAL_MSG_SHUTDOWN parameter values */
+#define SOFT_OFF	0x00
+#define SOFT_REBOOT	0x01
+
  struct opal_msg {
__be32 msg_type;
__be32 reserved;
@@ -730,6 +734,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
  };

+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
  #endif /* __ASSEMBLY__ */

  #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..d30766f 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
  int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t 
data);
  int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t 
error_type, uint8_t mask_action);
  int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t 
led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(uint16_t *status, uint16_t *length);
+int64_t opal_get_dpo_status(int64_t *dpo_timeout);
  int64_t opal_set_system_attention_led(uint8_t led_action);
  int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c 
b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..581bbd8 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -1,5 +1,5 @@
  /*
- * PowerNV OPAL power control for graceful shutdown handling

[PATCH v5] powerpc/powernv: Poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-18 Thread Vipin K Parashar
This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events for PowerNV platform. EPOW events are
generated by SPCN/FSP due to various critical system conditions that
need system shutdown. Few examples of these conditions are high
ambient temperature or system running on UPS power with low UPS battery.
DPO event is generated in response to admin initiated shutdown request.
Upon receipt of EPOW and DPO events host kernel invokes
orderly_poweroff for performing graceful system shutdown. System admin
can also add systemd service shutdown scripts to perform any specific
actions like graceful guest shutdown upon system poweroff. libvirt-guests
is systemd service available on recent distros for management of guests
at system start/shutdown time.
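
For readers following the thread, the EPOW status buffer that the patch's
opal-api.h comment describes (one 16-bit element per EPOW class, each element
a set of status bits) can be illustrated with a small userspace sketch. The
class indexes and bit values below are copied from the patch; the helper
names (epow_read_be16, epow_decode, epow_event_present) and the "any set bit
counts as an event" policy are assumptions made for illustration only and are
not taken from the kernel code. The sketch also assumes the firmware hands
back big-endian elements, as the __be16 prototype in the later revision
suggests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Class indexes and status bits as listed in the patch's opal-api.h hunk. */
enum { SYSEPOW_POWER = 0, SYSEPOW_TEMP = 1, SYSEPOW_COOLING = 2, SYSEPOW_MAX = 3 };
enum { SYSPOWER_UPS = 0x0001, SYSPOWER_CHNG = 0x0002,
       SYSPOWER_FAIL = 0x0004, SYSPOWER_INCL = 0x0008 };
enum { SYSTEMP_AMB = 0x0001, SYSTEMP_INT = 0x0002, SYSTEMP_HMD = 0x0004 };
enum { SYSCOOL_INSF = 0x0001 };

/* Hypothetical helper: read one big-endian 16-bit element from raw bytes. */
static uint16_t epow_read_be16(const uint8_t *p)
{
	return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: decode up to `classes` elements of a raw
 * big-endian status buffer into host-order words. Returns the number
 * of elements decoded, capped at SYSEPOW_MAX.
 */
static size_t epow_decode(const uint8_t *raw, size_t classes,
			  uint16_t out[SYSEPOW_MAX])
{
	size_t i, n = classes < SYSEPOW_MAX ? classes : SYSEPOW_MAX;

	for (i = 0; i < n; i++)
		out[i] = epow_read_be16(raw + 2 * i);
	return n;
}

/* Assumed policy (not from the patch): any set bit counts as an event. */
static bool epow_event_present(const uint16_t *status, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (status[i])
			return true;
	return false;
}
```

A caller would pass the raw buffer and the class count reported by firmware
to epow_decode(), then test individual bits such as
status[SYSEPOW_POWER] & SYSPOWER_UPS. Capping the count at SYSEPOW_MAX keeps
a larger class count from overrunning the host's array.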

Changes in v5:
 - Made changes to address review comments on previous patch.

Changes in v4:
 - Made changes to address review comments on previous patch.

Changes in v3:
 - Made changes to immediately call orderly_poweroff upon receipt of
   OPAL EPOW, DPO notifications.
 - Made code changes to address review comments on previous patch.
 - Made code changes to use existing OPAL EPOW API.
 - Removed patch to extract EPOW event timeout from OPAL device-tree.

Changes in v2:
 - Made code changes to improve code as per previous review comments.
 - Added patch to obtain EPOW event timeout values from OPAL device-tree.

Vipin K Parashar (1):
  powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV
platform

 arch/powerpc/include/asm/opal-api.h|  44 
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 179 insertions(+), 16 deletions(-)

-- 
1.9.3

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v5] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-18 Thread Vipin K Parashar
This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events for PowerNV platform. EPOW events are
generated by SPCN/FSP due to various critical system conditions that
need system shutdown. Few examples of these conditions are high
ambient temperature or system running on UPS power with low UPS battery.
DPO event is generated in response to admin initiated system request.
Upon receipt of EPOW and DPO events host kernel invokes
orderly_poweroff for performing graceful system shutdown. System admin
can also add systemd service shutdown scripts to perform any specific
actions like graceful guest shutdown upon system poweroff. libvirt-guests
is systemd service available on recent distros for management of guests
at system start/shutdown time.

Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/opal-api.h|  44 
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 147 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 179 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..90fa364 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -355,6 +355,10 @@ enum opal_msg_type {
OPAL_MSG_TYPE_MAX,
 };
 
+/* OPAL_MSG_SHUTDOWN parameter values */
+#define SOFT_OFF	0x00
+#define SOFT_REBOOT	0x01
+
 struct opal_msg {
__be32 msg_type;
__be32 reserved;
@@ -730,6 +734,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..d30766f 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t 
error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t 
led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(uint16_t *status, uint16_t *length);
+int64_t opal_get_dpo_status(int64_t *dpo_timeout);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c 
b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..581bbd8 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -1,5 +1,5 @@
 /*
- * PowerNV OPAL power control for graceful shutdown handling
+ * PowerNV support for OPAL power-control, poweroff events
  *
  * Copyright 2015 IBM Corp.
  *
@@ -9,18 +9,87 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt)	"OPAL-POWER: "	fmt
+
 #include <linux/kernel.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
 #include <linux/reboot.h>
-#include <linux/notifier.h>
-
+#include <linux/of.h>
 #include <asm/opal.h>

Re: [PATCH v3] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-15 Thread Vipin K Parashar

Thanks for the review.
I have made the changes as suggested.

On 05/14/2015 08:51 PM, trigg wrote:



On 14-May-2015, at 16:16, Vipin K Parashar vi...@linux.vnet.ibm.com wrote:

This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events support for PowerNV platform. EPOW events
are generated by SPCN/FSP due to various critical system conditions that
need system shutdown. Few examples of these conditions are high ambient
temperature or system running on UPS power with low UPS battery. DPO event
is generated in response to admin initiated system shutdown request.
Upon receipt of EPOW and DPO events host kernel invokes orderly_poweroff
for performing graceful system shutdown. System admin can also add systemd
service shutdown scripts to perform any specific actions like graceful guest
shutdown upon system poweroff. libvirt-guests is systemd service available on
recent distros for management of guests at system start/shutdown time.

Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com
---
arch/powerpc/include/asm/opal-api.h|  44 +++
arch/powerpc/include/asm/opal.h|   3 +-
arch/powerpc/platforms/powernv/opal-power.c| 167 ++---
arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
4 files changed, 197 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..90fa364 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -355,6 +355,10 @@ enum opal_msg_type {
OPAL_MSG_TYPE_MAX,
};

+/* OPAL_MSG_SHUTDOWN parameter values */
+#define SOFT_OFF	0x00
+#define SOFT_REBOOT	0x01
+
struct opal_msg {
__be32 msg_type;
__be32 reserved;
@@ -730,6 +734,46 @@ struct opal_i2c_request {
__be64 buffer_ra;/* Buffer real address */
};

+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+OPAL_SYSEPOW_POWER= 0,/* Power EPOW */
+OPAL_SYSEPOW_TEMP= 1,/* Temperature EPOW */
+OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+OPAL_SYSPOWER_UPS= 0x0001, /* System on UPS power */
+OPAL_SYSPOWER_CHNG= 0x0002, /* System power config change */
+OPAL_SYSPOWER_FAIL= 0x0004, /* System impending power failure */
+OPAL_SYSPOWER_INCL= 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+OPAL_SYSCOOL_INSF= 0x0001, /* System insufficient cooling */
+};
+
#endif /* __ASSEMBLY__ */

#endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..d30766f 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t 
error_type, uint8_t mask_action);
int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t 
led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(uint16_t *status, uint16_t *length);
+int64_t opal_get_dpo_status(int64_t *dpo_timeout);
int64_t opal_set_system_attention_led(uint8_t led_action);
int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c 
b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..0a1e07b 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -1,5 +1,5 @@
/*
- * PowerNV OPAL power control for graceful shutdown handling
+ * PowerNV support for OPAL power-control, poweroff events
  *
  * Copyright 2015 IBM Corp.
  *
@@ -9,40 +9,137 @@
  * 2 of the License, or (at your option) any later version.
  */

+#define pr_fmt(fmt)	"OPAL-POWER: "	fmt
+
#include <linux/kernel.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
#include <linux/reboot.h>
-#include <linux/notifier.h>
-
+#include

[PATCH v4] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-15 Thread Vipin K Parashar
This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events for PowerNV platform. EPOW events are
generated by SPCN/FSP due to various critical system conditions that
need system shutdown. Few examples of these conditions are high ambient
temperature or system running on UPS power with low UPS battery. DPO event
is generated in response to admin initiated system shutdown request.
Upon receipt of EPOW and DPO events host kernel invokes orderly_poweroff
for performing graceful system shutdown. System admin can also add systemd
service shutdown scripts to perform any specific actions like graceful guest
shutdown upon system poweroff. libvirt-guests is systemd service available on
recent distros for management of guests at system start/shutdown time.

Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/opal-api.h|  44 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 153 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 186 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..90fa364 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -355,6 +355,10 @@ enum opal_msg_type {
OPAL_MSG_TYPE_MAX,
 };
 
+/* OPAL_MSG_SHUTDOWN parameter values */
+#define SOFT_OFF	0x00
+#define SOFT_REBOOT	0x01
+
 struct opal_msg {
__be32 msg_type;
__be32 reserved;
@@ -730,6 +734,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..d30766f 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t 
error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t 
led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(uint16_t *status, uint16_t *length);
+int64_t opal_get_dpo_status(int64_t *dpo_timeout);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c 
b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..c1dfa09 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -1,5 +1,5 @@
 /*
- * PowerNV OPAL power control for graceful shutdown handling
+ * PowerNV support for OPAL power-control, poweroff events
  *
  * Copyright 2015 IBM Corp.
  *
@@ -9,18 +9,94 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt)	"OPAL-POWER: "	fmt
+
 #include <linux/kernel.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
 #include <linux/reboot.h>
-#include <linux/notifier.h>
-
+#include <linux/of.h>
 #include <asm/opal.h>

[PATCH v4] powerpc/powernv: Poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-15 Thread Vipin K Parashar
This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events for PowerNV platform. EPOW events are
generated by SPCN/FSP due to various critical system conditions that
need system shutdown. Few examples of these conditions are high ambient
temperature or system running on UPS power with low UPS battery. DPO event
is generated in response to admin initiated system shutdown request.
Upon receipt of EPOW and DPO events host kernel invokes orderly_poweroff
for performing graceful system shutdown. System admin can also add systemd
service shutdown scripts to perform any specific actions like graceful guest
shutdown upon system poweroff. libvirt-guests is systemd service available on
recent distros for management of guests at system start/shutdown time.

Changes in v4:
 - Made changes to address review comments on previous patch.

Changes in v3:
 - Made changes to immediately call orderly_poweroff upon receipt of
   OPAL EPOW, DPO notifications.
 - Made code changes to address review comments on previous patch.
 - Made code changes to use existing OPAL EPOW API.
 - Removed device-tree patch.

Changes in v2:
 - Made code changes to improve code as per previous review comments.
 - Added patch to obtain EPOW event timeout values from OPAL device-tree.
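
The v3 change noted above (acting immediately on EPOW and DPO notifications
rather than waiting on a timeout) can be pictured as a simple mapping from
event to action. The following is only a userspace sketch under stated
assumptions: SOFT_OFF and SOFT_REBOOT are the OPAL_MSG_SHUTDOWN parameter
values defined in the patch, while the power_event and power_action enums and
the power_event_action() function are hypothetical names invented for this
illustration. In the kernel the poweroff action would correspond to calling
orderly_poweroff(); the exact handler structure is not shown in this thread
and is not reproduced here.

```c
#include <stdint.h>

/* OPAL_MSG_SHUTDOWN parameter values, as defined in the patch. */
#define SOFT_OFF	0x00
#define SOFT_REBOOT	0x01

/* Hypothetical event and action names, used only for this sketch. */
enum power_event { EVENT_EPOW, EVENT_DPO, EVENT_SHUTDOWN };
enum power_action { ACTION_NONE, ACTION_POWEROFF, ACTION_REBOOT };

/*
 * Map a notification to an action without any timer in between.
 * `param` is only meaningful for EVENT_SHUTDOWN; `epow_present` says
 * whether a status query found an EPOW condition (assumed input).
 */
static enum power_action power_event_action(enum power_event ev,
					    uint64_t param, int epow_present)
{
	switch (ev) {
	case EVENT_EPOW:
		return epow_present ? ACTION_POWEROFF : ACTION_NONE;
	case EVENT_DPO:
		return ACTION_POWEROFF;
	case EVENT_SHUTDOWN:
		if (param == SOFT_OFF)
			return ACTION_POWEROFF;
		if (param == SOFT_REBOOT)
			return ACTION_REBOOT;
		return ACTION_NONE;
	}
	return ACTION_NONE;
}
```

Keeping the decision in one pure function makes the "no timer" behaviour easy
to check: every input maps directly to an action, with nothing deferred.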

Vipin K Parashar (1):
  powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV
platform

 arch/powerpc/include/asm/opal-api.h|  44 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 153 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 186 insertions(+), 15 deletions(-)

-- 
1.9.3


[PATCH v3] powerpc/powernv: Poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-14 Thread Vipin K Parashar
This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events support for PowerNV platform. EPOW events
are generated by SPCN/FSP due to various critical system conditions that
need system shutdown. Few examples of these conditions are high ambient
temperature or system running on UPS power with low UPS battery. DPO event
is generated in response to admin initiated system shutdown request.
Upon receipt of EPOW and DPO events host kernel invokes orderly_poweroff
for performing graceful system shutdown. System admin can also add systemd
service shutdown scripts to perform any specific actions like graceful guest
shutdown upon system poweroff. libvirt-guests is systemd service available on
recent distros for management of guests at system start/shutdown time.

Changes in v3:
 - Made changes to immediately call orderly_poweroff upon receipt of
   OPAL EPOW, DPO notifications.
 - Made code changes to address review comments on previous patch.
 - Made code changes to use existing OPAL EPOW API.
 - Removed device-tree patch.

Changes in v2:
 - Made code changes to improve code as per previous review comments.
 - Added patch to obtain EPOW event timeout values from OPAL device-tree.

Vipin K Parashar (1):
  powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV
platform

 arch/powerpc/include/asm/opal-api.h|  44 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 167 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 197 insertions(+), 18 deletions(-)

-- 
1.9.3


[PATCH v3] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-14 Thread Vipin K Parashar
This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events support for PowerNV platform. EPOW events
are generated by SPCN/FSP due to various critical system conditions that
need system shutdown. Few examples of these conditions are high ambient
temperature or system running on UPS power with low UPS battery. DPO event
is generated in response to admin initiated system shutdown request.
Upon receipt of EPOW and DPO events host kernel invokes orderly_poweroff
for performing graceful system shutdown. System admin can also add systemd
service shutdown scripts to perform any specific actions like graceful guest
shutdown upon system poweroff. libvirt-guests is systemd service available on
recent distros for management of guests at system start/shutdown time.

Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/opal-api.h|  44 +++
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-power.c| 167 ++---
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 197 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 0321a90..90fa364 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -355,6 +355,10 @@ enum opal_msg_type {
OPAL_MSG_TYPE_MAX,
 };
 
+/* OPAL_MSG_SHUTDOWN parameter values */
+#define SOFT_OFF	0x00
+#define SOFT_REBOOT	0x01
+
 struct opal_msg {
__be32 msg_type;
__be32 reserved;
@@ -730,6 +734,46 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER  = 0,/* Power EPOW */
+   OPAL_SYSEPOW_TEMP   = 1,/* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING= 2,/* Cooling EPOW */
+   OPAL_SYSEPOW_MAX= 3,/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL  = 0x0004, /* System impending power failure */
+   OPAL_SYSPOWER_INCL  = 0x0008, /* System incomplete power */
+};
+
+/* Temperature EPOW */
+enum OpalSysTemp {
+   OPAL_SYSTEMP_AMB= 0x0001, /* System over ambient temperature */
+   OPAL_SYSTEMP_INT= 0x0002, /* System over internal temperature */
+   OPAL_SYSTEMP_HMD= 0x0004, /* System over ambient humidity */
+};
+
+/* Cooling EPOW */
+enum OpalSysCooling {
+   OPAL_SYSCOOL_INSF   = 0x0001, /* System insufficient cooling */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..d30766f 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
+int64_t opal_get_epow_status(uint16_t *status, uint16_t *length);
+int64_t opal_get_dpo_status(int64_t *dpo_timeout);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
diff --git a/arch/powerpc/platforms/powernv/opal-power.c b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..0a1e07b 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -1,5 +1,5 @@
 /*
- * PowerNV OPAL power control for graceful shutdown handling
+ * PowerNV support for OPAL power-control, poweroff events
  *
  * Copyright 2015 IBM Corp.
  *
@@ -9,40 +9,137 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt)	"OPAL-POWER: "	fmt
+
 #include <linux/kernel.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
 #include <linux/reboot.h>
-#include <linux/notifier.h>
-
+#include <linux/of.h>
 #include
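
For illustration only, here is a minimal userspace sketch of how a host
could walk the EPOW status buffer described in the opal-api.h comment of
this patch. The enum values are copied from the patch; the struct epow_bit
table and the epow_active_conditions() helper are hypothetical names that
do not appear in the patch, and any byte-order conversion of the buffer
returned by OPAL is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the opal-api.h hunk of this patch. */
enum OpalSysEpow {
	OPAL_SYSEPOW_POWER	= 0,
	OPAL_SYSEPOW_TEMP	= 1,
	OPAL_SYSEPOW_COOLING	= 2,
	OPAL_SYSEPOW_MAX	= 3,
};

enum OpalSysPower {
	OPAL_SYSPOWER_UPS	= 0x0001,
	OPAL_SYSPOWER_CHNG	= 0x0002,
	OPAL_SYSPOWER_FAIL	= 0x0004,
	OPAL_SYSPOWER_INCL	= 0x0008,
};

enum OpalSysTemp {
	OPAL_SYSTEMP_AMB	= 0x0001,
	OPAL_SYSTEMP_INT	= 0x0002,
	OPAL_SYSTEMP_HMD	= 0x0004,
};

enum OpalSysCooling {
	OPAL_SYSCOOL_INSF	= 0x0001,
};

/* Hypothetical lookup table: one entry per (class, bit) pair. */
struct epow_bit {
	int epow_class;
	uint16_t mask;
	const char *desc;
};

static const struct epow_bit epow_bits[] = {
	{ OPAL_SYSEPOW_POWER,   OPAL_SYSPOWER_UPS,  "System on UPS power" },
	{ OPAL_SYSEPOW_POWER,   OPAL_SYSPOWER_CHNG, "System power config change" },
	{ OPAL_SYSEPOW_POWER,   OPAL_SYSPOWER_FAIL, "System impending power failure" },
	{ OPAL_SYSEPOW_POWER,   OPAL_SYSPOWER_INCL, "System incomplete power" },
	{ OPAL_SYSEPOW_TEMP,    OPAL_SYSTEMP_AMB,   "System over ambient temperature" },
	{ OPAL_SYSEPOW_TEMP,    OPAL_SYSTEMP_INT,   "System over internal temperature" },
	{ OPAL_SYSEPOW_TEMP,    OPAL_SYSTEMP_HMD,   "System over ambient humidity" },
	{ OPAL_SYSEPOW_COOLING, OPAL_SYSCOOL_INSF,  "System insufficient cooling" },
};

/*
 * Hypothetical helper: count the status bits set in a buffer of
 * OPAL_SYSEPOW_MAX elements, optionally printing each condition.
 */
static int epow_active_conditions(const uint16_t *status, int verbose)
{
	int count = 0;
	size_t i;

	for (i = 0; i < sizeof(epow_bits) / sizeof(epow_bits[0]); i++) {
		if (status[epow_bits[i].epow_class] & epow_bits[i].mask) {
			if (verbose)
				printf("EPOW: %s\n", epow_bits[i].desc);
			count++;
		}
	}
	return count;
}
```

A caller would fill a uint16_t array of OPAL_SYSEPOW_MAX elements and pass
it to the helper; a return value of zero means no EPOW condition is
reported in any class.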

Re: [PATCH v2 1/2] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-13 Thread Vipin K Parashar


On 05/11/2015 12:19 PM, Michael Ellerman wrote:

On Thu, 2015-05-07 at 15:00 +0530, Vipin K Parashar wrote:

This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events on the PowerNV platform.  EPOW events
are generated by SPCN/FSP due to various critical system conditions that
need system shutdown.  A few examples of these conditions are high ambient
temperature or the system running on UPS power with a low UPS battery. A DPO
event is generated in response to an admin-initiated system shutdown request.
This patch enables the host kernel on the PowerNV platform to handle OPAL
notifications for these events and initiate system poweroff. Since EPOW
notifications are sent in advance of an impending shutdown event, this
patch also adds functionality to wait for the EPOW condition to return to
normal. The host allows MAX_POWEROFF_SYS_TIME (600 seconds) as system
poweroff time (time for host + guests shutdown) and waits for the remaining
time for the EPOW condition to return to normal. If the EPOW condition doesn't
return to normal in the calculated time, it proceeds with a graceful system
shutdown. For EPOW events with timeout values smaller than
MAX_POWEROFF_SYS_TIME it proceeds with system shutdown without waiting
for the EPOW condition to return to normal.


Can I suggest an alternative design:
  - when we receive a DPO event call orderly_poweroff()
  - when we receive an EPOW event call orderly_poweroff()

Thoughts?


Will make changes as per the suggested design above and send out the patch.



cheers




___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2 1/2] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-11 Thread Vipin K Parashar

Hi Joel,
   Thanks for review. My comments below.

On 05/08/2015 06:56 AM, Joel Stanley wrote:

Hello Vipin,

On Thu, May 7, 2015 at 7:00 PM, Vipin K Parashar
vi...@linux.vnet.ibm.com wrote:

This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events support for PowerNV platform.

I reviewed this patch for the changes it made to the existing poweroff
code, you still need someone to look at the EPOW code itself.


Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
  arch/powerpc/include/asm/opal-api.h            |  30 ++
  arch/powerpc/include/asm/opal.h                |   3 +-
  arch/powerpc/platforms/powernv/opal-power.c    | 379 +++--
  arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
  4 files changed, 391 insertions(+), 22 deletions(-)

  /* Internal functions */
  extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
diff --git a/arch/powerpc/platforms/powernv/opal-power.c b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..7c1b2f8 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -1,5 +1,5 @@
  /*
- * PowerNV OPAL power control for graceful shutdown handling
+ * PowerNV poweroff events support
   *
   * Copyright 2015 IBM Corp.
   *
@@ -9,58 +9,395 @@
   * 2 of the License, or (at your option) any later version.
   */

+#define pr_fmt(fmt)	"POWEROFF_EVENT: " fmt

OPAL_POWER?


Agreed.




+
  #include <linux/kernel.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
  #include <linux/reboot.h>
-#include <linux/notifier.h>
-
+#include <linux/of.h>
  #include <asm/opal.h>
  #include <asm/machdep.h>

-#define SOFT_OFF 0x00
-#define SOFT_REBOOT 0x01
+/* Power control event types */
+#define SOFT_OFF   0x00
+#define SOFT_REBOOT	0x01

While you're touching this code, I think these should be moved to opal-api.h


Sure will do.




+
+/* Max time for graceful system shutdown including guests. */
+#define MAX_POWEROFF_SYS_TIME  600
+
+/* IPMI power-control events notifier */
  static int opal_power_control_event(struct notifier_block *nb,
-   unsigned long msg_type, void *msg)
+   unsigned long msg_type, void *msg)
  {
-   struct opal_msg *power_msg = msg;
 uint64_t type;
+   struct opal_msg *power_msg = msg;

 	type = be64_to_cpu(power_msg->params[0]);

 switch (type) {
 case SOFT_REBOOT:
-		pr_info("OPAL: reboot requested\n");
+		pr_info("Reboot requested\n");

I prefer the OPAL prefix.


The OPAL prefix will get added by the pr_fmt defined above, so a separate
tag is not needed.





 orderly_reboot();
 break;
 case SOFT_OFF:
-		pr_info("OPAL: poweroff requested\n");
+		pr_info("Poweroff requested\n");

Ditto.


Same as above.




 orderly_poweroff(true);
 break;
 default:
-		pr_err("OPAL: power control type unexpected %016llx\n", type);
+		pr_err("Unknown event %llu\n", type);

Ditto.


Same as above.




 }

 return 0;
  }

+/* OPAL EPOW event notifier block */
+static struct notifier_block opal_epow_nb = {
+   .notifier_call  = opal_epow_event,
+   .next   = NULL,
+   .priority   = 0,
+};
+
+/* OPAL DPO event notifier block */
+static struct notifier_block opal_dpo_nb = {
+   .notifier_call  = opal_dpo_event,
+   .next   = NULL,
+   .priority   = 0,
+};
+
+/* OPAL Power control events */
  static struct notifier_block opal_power_control_nb = {
-   .notifier_call  = opal_power_control_event,
-   .next   = NULL,
-   .priority   = 0,
+   .notifier_call  = opal_power_control_event,
+   .next   = NULL,
+   .priority   = 0,
  };

Looks like you changed the whitespace?


Probably. Otherwise no change is needed here.




-static int __init opal_power_control_init(void)
+/* Poweroff events init */
+static int __init opal_poweroff_events_init(void)

This comment does not add any value.

Renaming the function doesn't add much either.


ok. Will retain original name.




  {
 int ret;
+   struct device_node *node_epow;

-	ret = opal_message_notifier_register(OPAL_MSG_SHUTDOWN,
-				&opal_power_control_nb);
-	if (ret) {
-		pr_err("%s: Can't register OPAL event notifier (%d)\n",
-			__func__, ret);
-		return ret;
+	/*
+	 * Determine EPOW, DPO support in hardware.
+	 */
+	node_epow = of_find_node_by_path("/ibm,opal/epow");
+	if (node_epow) {
+		if (of_device_is_compatible(node_epow, "ibm,opal-epow")) {
+			epow_supported = true;
+			dpo_supported = true;

Why are these separate flags? Do we have any systems that will support

Re: [PATCH v2 1/2] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-11 Thread Vipin K Parashar


On 05/11/2015 12:19 PM, Michael Ellerman wrote:

On Thu, 2015-05-07 at 15:00 +0530, Vipin K Parashar wrote:

This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events on the PowerNV platform.  EPOW events
are generated by SPCN/FSP due to various critical system conditions that
need system shutdown.  A few examples of these conditions are high ambient
temperature or the system running on UPS power with a low UPS battery. A DPO
event is generated in response to an admin-initiated system shutdown request.
This patch enables the host kernel on the PowerNV platform to handle OPAL
notifications for these events and initiate system poweroff. Since EPOW
notifications are sent in advance of an impending shutdown event, this
patch also adds functionality to wait for the EPOW condition to return to
normal. The host allows MAX_POWEROFF_SYS_TIME (600 seconds) as system
poweroff time (time for host + guests shutdown) and waits for the remaining
time for the EPOW condition to return to normal. If the EPOW condition doesn't
return to normal in the calculated time, it proceeds with a graceful system
shutdown. For EPOW events with timeout values smaller than
MAX_POWEROFF_SYS_TIME it proceeds with system shutdown without waiting
for the EPOW condition to return to normal.


Can I suggest an alternative design:
  - when we receive a DPO event call orderly_poweroff()
  - when we receive an EPOW event call orderly_poweroff()

Thoughts?


The current design calls orderly_poweroff() immediately upon a DPO event,
as there is no need to wait for user-initiated shutdowns.

An EPOW is sent ahead of time, in anticipation of a needed poweroff. A
typical example is an EPOW due to the system being on UPS power with a
15-minute timeout. Power may be restored within the timeout, in which case
a poweroff is not needed and HW sends an EPOW reset informing the host that
the EPOW condition has returned to normal. Another example is an EPOW due to
high ambient temperature with a 15-minute timeout. Here too, if the
temperature drops to manageable limits within the timeout window, HW sends
an EPOW reset to avoid shutdown.

To handle such cases the current design waits for
(HW timeout - MAX_POWEROFF_SYS_TIME) seconds before poweroff. If the EPOW
condition returns to normal within this time, the poweroff is cancelled. An
immediate poweroff is thus avoided when the EPOW condition returns to normal.

For EPOW cases with timeouts lower than MAX_POWEROFF_SYS_TIME seconds (such
as the system on UPS power with a low UPS battery, or a critically high
ambient temperature), it calls orderly_poweroff() immediately.

In short, the design implements an immediate orderly_poweroff() for DPO and
for EPOW cases whose timeout is less than MAX_POWEROFF_SYS_TIME, while for
EPOW cases with a timeout of at least MAX_POWEROFF_SYS_TIME it implements a
wait for the EPOW condition to return to normal.
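
The wait-time rule described above can be sketched in plain C as follows.
This is only an illustration of the arithmetic, not code from the patch:
epow_wait_seconds() is a hypothetical helper name, and treating a timeout
exactly equal to MAX_POWEROFF_SYS_TIME as "no wait" is an assumption.

```c
/* Max time for graceful system shutdown including guests (from the patch). */
#define MAX_POWEROFF_SYS_TIME	600

/*
 * Hypothetical helper: given the platform (HW) timeout of an EPOW event in
 * seconds, return how long the host may wait for the EPOW condition to
 * return to normal before it has to start orderly_poweroff().
 *
 * A result of 0 means "power off immediately".
 */
static int epow_wait_seconds(int hw_timeout)
{
	/* Not enough time for host + guests shutdown: do not wait at all. */
	if (hw_timeout < MAX_POWEROFF_SYS_TIME)
		return 0;

	/* Reserve MAX_POWEROFF_SYS_TIME for the shutdown itself. */
	return hw_timeout - MAX_POWEROFF_SYS_TIME;
}
```

With a 15-minute (900 second) timeout this leaves 300 seconds of waiting;
with a 20 second timeout the poweroff starts right away.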

Suggestions/thoughts ?

--Vipin

cheers





Re: [PATCH v2 1/2] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-11 Thread Vipin K Parashar


On 05/11/2015 02:31 PM, Vipin K Parashar wrote:


On 05/11/2015 12:19 PM, Michael Ellerman wrote:

On Thu, 2015-05-07 at 15:00 +0530, Vipin K Parashar wrote:

This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events on the PowerNV platform.  EPOW events
are generated by SPCN/FSP due to various critical system conditions that
need system shutdown.  A few examples of these conditions are high ambient
temperature or the system running on UPS power with a low UPS battery. A DPO
event is generated in response to an admin-initiated system shutdown request.
This patch enables the host kernel on the PowerNV platform to handle OPAL
notifications for these events and initiate system poweroff. Since EPOW
notifications are sent in advance of an impending shutdown event, this
patch also adds functionality to wait for the EPOW condition to return to
normal. The host allows MAX_POWEROFF_SYS_TIME (600 seconds) as system
poweroff time (time for host + guests shutdown) and waits for the remaining
time for the EPOW condition to return to normal. If the EPOW condition doesn't
return to normal in the calculated time, it proceeds with a graceful system
shutdown. For EPOW events with timeout values smaller than
MAX_POWEROFF_SYS_TIME it proceeds with system shutdown without waiting
for the EPOW condition to return to normal.


Can I suggest an alternative design:
  - when we receive a DPO event call orderly_poweroff()
  - when we receive an EPOW event call orderly_poweroff()

Thoughts?


The current design calls orderly_poweroff() immediately upon a DPO event,
as there is no need to wait for user-initiated shutdowns.

An EPOW is sent ahead of time, in anticipation of a needed poweroff. A
typical example is an EPOW due to the system being on UPS power with a
15-minute timeout. Power may be restored within the timeout, in which case
a poweroff is not needed and HW sends an EPOW reset informing the host that
the EPOW condition has returned to normal. Another example is an EPOW due to
high ambient temperature with a 15-minute timeout. Here too, if the
temperature drops to manageable limits within the timeout window, HW sends
an EPOW reset to avoid shutdown.

To handle such cases the current design waits for
(HW timeout - MAX_POWEROFF_SYS_TIME) seconds before poweroff. If the EPOW
condition returns to normal within this time, the poweroff is cancelled. An
immediate poweroff is thus avoided when the EPOW condition returns to normal.

For EPOW cases with timeouts lower than MAX_POWEROFF_SYS_TIME seconds (such
as the system on UPS power with a low UPS battery, or a critically high
ambient temperature), it calls orderly_poweroff() immediately.

In short, the design implements an immediate orderly_poweroff() for DPO and
for EPOW cases whose timeout is less than MAX_POWEROFF_SYS_TIME, while for
EPOW cases with a timeout of at least MAX_POWEROFF_SYS_TIME it implements a
wait for the EPOW condition to return to normal.

Correction for the line above:
... it implements a wait for EPOW to return to normal, followed by an
orderly_poweroff after the wait time.




Suggestions/thoughts ?

--Vipin

cheers





[PATCH v2 1/2] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-07 Thread Vipin K Parashar
This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events on the PowerNV platform.  EPOW events
are generated by SPCN/FSP due to various critical system conditions that
need system shutdown.  A few examples of these conditions are high ambient
temperature or the system running on UPS power with a low UPS battery. A DPO
event is generated in response to an admin-initiated system shutdown request.
This patch enables the host kernel on the PowerNV platform to handle OPAL
notifications for these events and initiate system poweroff. Since EPOW
notifications are sent in advance of an impending shutdown event, this
patch also adds functionality to wait for the EPOW condition to return to
normal. The host allows MAX_POWEROFF_SYS_TIME (600 seconds) as system
poweroff time (time for host + guests shutdown) and waits for the remaining
time for the EPOW condition to return to normal. If the EPOW condition doesn't
return to normal in the calculated time, it proceeds with a graceful system
shutdown. For EPOW events with timeout values smaller than
MAX_POWEROFF_SYS_TIME it proceeds with system shutdown without waiting
for the EPOW condition to return to normal.
The system admin can also add systemd service shutdown scripts to
perform specific actions, such as graceful guest shutdown, upon system
poweroff. libvirt-guests is a systemd service available on recent distros
for managing guests at system start/shutdown time.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h            |  30 ++
 arch/powerpc/include/asm/opal.h                |   3 +-
 arch/powerpc/platforms/powernv/opal-power.c    | 379 +++--
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 391 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 0321a90..03b3cef 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -730,6 +730,36 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass to OPAL a buffer of length OPAL_MAX_EPOW_CLASSES
+ * to fetch the system wide EPOW status. Each element in the returned buffer
+ * will contain the bitwise EPOW status for one EPOW sub class.
+ */
+
+/* EPOW types */
+enum OpalEpow {
+   OPAL_EPOW_POWER = 0,/* Power EPOW */
+   OPAL_EPOW_TEMP  = 1,/* Temperature EPOW */
+   OPAL_EPOW_COOLING   = 2,/* Cooling EPOW */
+   OPAL_MAX_EPOW_CLASSES   = 3,/* Max EPOW categories */
+};
+
+/* Power EPOW events */
+enum OpalEpowPower {
+   OPAL_EPOW_POWER_UPS = 0x1, /* System on UPS power */
+   OPAL_EPOW_POWER_UPS_LOW = 0x2, /* System on UPS power with low battery*/
+};
+
+/* Temperature EPOW events */
+enum OpalEpowTemp {
+   OPAL_EPOW_TEMP_HIGH_AMB = 0x1, /* High ambient temperature */
+   OPAL_EPOW_TEMP_CRIT_AMB = 0x2, /* Critical ambient temperature */
+   OPAL_EPOW_TEMP_HIGH_INT = 0x4, /* High internal temperature */
+   OPAL_EPOW_TEMP_CRIT_INT = 0x8, /* Critical internal temperature */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..0777864 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,6 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
@@ -200,6 +199,8 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, uint64_t buf,
uint64_t size, uint64_t token);
 int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size,
uint64_t token);
+int32_t opal_get_epow_status(__be32 *status, __be32 *num_classes);
+int32_t opal_get_dpo_status(__be32 *timeout);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
diff --git a/arch/powerpc/platforms/powernv/opal-power.c b/arch/powerpc/platforms/powernv/opal-power.c
index ac46c2c..7c1b2f8 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -1,5 +1,5 @@
 /*
- * PowerNV OPAL power control for graceful shutdown handling
+ * PowerNV poweroff events support
  *
  * Copyright 2015 IBM Corp.
  *
@@ -9,58 +9,395 @@
  * 2 of the License, or (at your option) any

[PATCH v2 0/2] Poweroff (EPOW, DPO) events support for PowerNV platform

2015-05-07 Thread Vipin K Parashar
This patchset adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events on the PowerNV platform.  EPOW events
are generated by SPCN/FSP due to various critical system conditions that
need system shutdown. A few examples of these conditions are high ambient
temperature or the system running on UPS power with a low UPS battery. A DPO
event is generated in response to an admin-initiated system shutdown request.
This patchset enables the host kernel on the PowerNV platform to handle OPAL
notifications for these events and initiate system poweroff. Since EPOW
notifications are sent in advance of an impending shutdown event,
functionality is also added to wait for the EPOW condition to return to
normal. EPOW event timeout values are available via OPAL-exported device
tree values under the EPOW node.
The host kernel allows MAX_POWEROFF_SYS_TIME (600 seconds) as system
poweroff time (time for host + guests shutdown) and waits for the remaining
time for the EPOW condition to return to normal. If the EPOW condition doesn't
return to normal in the calculated time, it proceeds with a graceful system
shutdown. For EPOW events with timeout values smaller than
MAX_POWEROFF_SYS_TIME it proceeds with system shutdown without waiting
for the EPOW condition to return to normal.
The system admin can also add systemd service shutdown scripts to
perform specific actions, such as graceful guest shutdown, upon system
poweroff. libvirt-guests is a systemd service available on recent distros
for managing guests at system start/shutdown time.

Changes in v2:
 - Made code changes to improve code as per previous review comments.
 - Added patch to obtain EPOW event timeout values from OPAL device-tree.

Vipin K Parashar (2):
  powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV
platform
  powerpc/powernv: Extract EPOW events timeout values from OPAL device
tree

 arch/powerpc/include/asm/opal-api.h            |  30 ++
 arch/powerpc/include/asm/opal.h                |   3 +-
 arch/powerpc/platforms/powernv/opal-power.c    | 440 +++--
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 4 files changed, 452 insertions(+), 22 deletions(-)

--
1.9.3


[PATCH v2 2/2] powerpc/powernv: Extract EPOW events timeout values from OPAL device tree

2015-05-07 Thread Vipin K Parashar
OPAL exports platform timeout values for various EPOW events under the
EPOW device tree node. The EPOW node contains a sub node for each EPOW
class. Under each class, platform timeout property files are located
for the EPOW events of that class. Each file contains the platform timeout
value, in seconds, for the corresponding EPOW event.
This patch adds support for extracting EPOW event timeout values from
the OPAL device tree. The property files below are parsed to extract
EPOW event timeout values.

 Power EPOW
 ===
 ups-timeout
 ups-low-timeout

 Temp EPOW
 ==
 high-ambient-temp-timeout
 crit-ambient-temp-timeout
 high-internal-temp-timeout
 crit-internal-temp-timeout

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/opal-power.c | 79 +
 1 file changed, 70 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/opal-power.c b/arch/powerpc/platforms/powernv/opal-power.c
index 7c1b2f8..5b015f3 100644
--- a/arch/powerpc/platforms/powernv/opal-power.c
+++ b/arch/powerpc/platforms/powernv/opal-power.c
@@ -60,15 +60,7 @@ static const char * const epow_events_map[] = {
 };
 
 /* Poweroff EPOW events timeout values in seconds */
-static const int epow_timeout[] = {
-   [EPOW_POWER_UPS]= 900,
-   [EPOW_POWER_UPS_LOW]= 20,
-   [EPOW_TEMP_HIGH_AMB]= 900,
-   [EPOW_TEMP_CRIT_AMB]= 20,
-   [EPOW_TEMP_HIGH_INT]= 900,
-   [EPOW_TEMP_CRIT_INT]= 20,
-   [EPOW_UNKNOWN]  = 0,
-};
+static int epow_timeout[MAX_EPOW_EVENTS];
 
 /* System poweroff function. */
 static void epow_poweroff(unsigned long event)
@@ -125,6 +117,72 @@ static void stop_epow_timer(void)
 	pr_info("Poweroff timer deactivated\n");
 }
 
+/* Extract timeout value from device tree property */
+static int get_timeout_value(struct device_node *node, const char *prop)
+{
+	const __be32 *pval;
+	int timeout = 0;
+
+	pval = of_get_property(node, prop, NULL);
+	if (pval)
+		timeout = be32_to_cpup(pval);
+	else
+		pr_err("Didn't find %s dt property\n", prop);
+
+	return timeout;
+}
+
+/* Get EPOW events timeout values from OPAL device tree */
+static void get_epow_timeouts(void)
+{
+	struct device_node *epow_power, *epow_temp;
+
+	/* EPOW power class event timeouts */
+	epow_power = of_find_node_by_path("/ibm,opal/epow/power");
+	if (epow_power) {
+		epow_timeout[EPOW_POWER_UPS] =
+			get_timeout_value(epow_power, "ups-timeout");
+		pr_info("Power EPOW ups-timeout = %d seconds\n",
+			epow_timeout[EPOW_POWER_UPS]);
+
+		epow_timeout[EPOW_POWER_UPS_LOW] =
+			get_timeout_value(epow_power, "ups-low-timeout");
+		pr_info("Power EPOW ups-low-timeout = %d seconds\n",
+			epow_timeout[EPOW_POWER_UPS_LOW]);
+
+		of_node_put(epow_power);
+	} else
+		pr_info("Power EPOW class not supported in OPAL\n");
+
+	/* EPOW temp class event timeouts */
+	epow_temp = of_find_node_by_path("/ibm,opal/epow/temp");
+	if (epow_temp) {
+		epow_timeout[EPOW_TEMP_HIGH_AMB] =
+			get_timeout_value(epow_temp, "high-ambient-temp-timeout");
+		pr_info("Temp EPOW high-ambient-temp-timeout = %d seconds\n",
+			epow_timeout[EPOW_TEMP_HIGH_AMB]);
+
+		epow_timeout[EPOW_TEMP_CRIT_AMB] =
+			get_timeout_value(epow_temp, "crit-ambient-temp-timeout");
+		pr_info("Temp EPOW crit-ambient-temp-timeout = %d seconds\n",
+			epow_timeout[EPOW_TEMP_CRIT_AMB]);
+
+		epow_timeout[EPOW_TEMP_HIGH_INT] =
+			get_timeout_value(epow_temp, "high-internal-temp-timeout");
+		pr_info("Temp EPOW high-internal-temp-timeout = %d seconds\n",
+			epow_timeout[EPOW_TEMP_HIGH_INT]);
+
+		epow_timeout[EPOW_TEMP_CRIT_INT] =
+			get_timeout_value(epow_temp, "crit-internal-temp-timeout");
+		pr_info("Temp EPOW crit-internal-temp-timeout = %d seconds\n",
+			epow_timeout[EPOW_TEMP_CRIT_INT]);
+
+		of_node_put(epow_temp);
+	} else
+		pr_info("Temp EPOW class not supported in OPAL\n");
+
+}
+
+
 /* Get DPO status */
 static bool get_dpo_status(int32_t *dpo_timeout)
 {
@@ -366,6 +424,9 @@ static int __init opal_poweroff_events_init(void)
 	init_timer(&epow_timer);
 	epow_timer.function = epow_poweroff;
 
+	/* Get EPOW events timeout value */
+	get_epow_timeouts();
+
 	/* Register EPOW event notifier */
 	ret = opal_message_notifier_register(OPAL_MSG_EPOW,
 					&opal_epow_nb);
-- 
1.9.3
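
As a rough userspace illustration of the property parsing in this patch,
the sketch below decodes a big-endian 32-bit cell (what be32_to_cpup() does
in the kernel) and falls back to 0 when a property is missing, as
get_timeout_value() does. The struct fake_prop type and the be32_decode()
and fake_get_timeout_value() helpers are hypothetical stand-ins for the
kernel's device tree API; only the property names come from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one device tree property holding a single cell. */
struct fake_prop {
	const char *name;
	uint8_t cell[4];	/* raw big-endian bytes, as stored in the tree */
};

/* Decode a 32-bit big-endian value regardless of host byte order. */
static uint32_t be32_decode(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical userspace analogue of get_timeout_value(): look up a
 * property by name and return its value in seconds, or 0 if it is absent.
 */
static int fake_get_timeout_value(const struct fake_prop *props, size_t n,
				  const char *name)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(props[i].name, name) == 0)
			return (int)be32_decode(props[i].cell);
	}

	/* Missing property: same fallback value as in the patch. */
	return 0;
}
```

One consequence of the 0 fallback is that a missing property is
indistinguishable from a timeout of zero seconds, so callers that need to
tell the two apart would have to check for the property separately.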

[PATCH] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

2015-04-30 Thread Vipin K Parashar
This patch adds support for FSP EPOW (Early Power Off Warning) and
DPO (Delayed Power Off) events on the PowerNV platform.  EPOW events
are generated by SPCN/FSP due to various critical system conditions that
need system shutdown.  A few examples of these conditions are high ambient
temperature or the system running on UPS power with a low UPS battery. A DPO
event is generated in response to an admin-initiated system shutdown request.
This patch enables the host kernel on the PowerNV platform to handle OPAL
notifications for these events and initiate system poweroff. Since EPOW
notifications are sent in advance of an impending shutdown event, this
patch also adds functionality to wait for the EPOW condition to return to
normal.  If the EPOW condition doesn't return to normal in the estimated time,
it proceeds with a graceful system shutdown. The system admin can also add
host userspace scripts to perform specific actions, such as graceful guest
shutdown, upon system poweroff.

Signed-off-by: Vipin K Parashar <vi...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h                |  30 ++
 arch/powerpc/include/asm/opal.h                    |   3 +-
 arch/powerpc/platforms/powernv/Makefile            |   1 +
 .../platforms/powernv/opal-poweroff-events.c       | 358 +
 arch/powerpc/platforms/powernv/opal-wrappers.S     |   1 +
 5 files changed, 392 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/powernv/opal-poweroff-events.c

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 0321a90..03b3cef 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -730,6 +730,36 @@ struct opal_i2c_request {
__be64 buffer_ra;   /* Buffer real address */
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass to OPAL a buffer of length OPAL_MAX_EPOW_CLASSES
+ * to fetch the system wide EPOW status. Each element in the returned buffer
+ * will contain the bitwise EPOW status for one EPOW sub class.
+ */
+
+/* EPOW types */
+enum OpalEpow {
+   OPAL_EPOW_POWER = 0,/* Power EPOW */
+   OPAL_EPOW_TEMP  = 1,/* Temperature EPOW */
+   OPAL_EPOW_COOLING   = 2,/* Cooling EPOW */
+   OPAL_MAX_EPOW_CLASSES   = 3,/* Max EPOW categories */
+};
+
+/* Power EPOW events */
+enum OpalEpowPower {
+   OPAL_EPOW_POWER_UPS = 0x1, /* System on UPS power */
+   OPAL_EPOW_POWER_UPS_LOW = 0x2, /* System on UPS power with low battery*/
+};
+
+/* Temperature EPOW events */
+enum OpalEpowTemp {
+   OPAL_EPOW_TEMP_HIGH_AMB = 0x1, /* High ambient temperature */
+   OPAL_EPOW_TEMP_CRIT_AMB = 0x2, /* Critical ambient temperature */
+   OPAL_EPOW_TEMP_HIGH_INT = 0x4, /* High internal temperature */
+   OPAL_EPOW_TEMP_CRIT_INT = 0x8, /* Critical internal temperature */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..0777864 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -141,7 +141,6 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
 int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
 int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t error_type, uint8_t mask_action);
 int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t led_type, uint8_t led_action);
-int64_t opal_get_epow_status(__be64 *status);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
__be16 *pci_error_type, __be16 *severity);
@@ -200,6 +199,8 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, uint64_t buf,
uint64_t size, uint64_t token);
 int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size,
uint64_t token);
+int32_t opal_get_epow_status(__be32 *status, __be32 *num_classes);
+int32_t opal_get_dpo_status(__be32 *timeout);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index 33e44f3..b817bdb 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -2,6 +2,7 @@ obj-y	+= setup.o opal-wrappers.o opal.o opal-async.o
 obj-y	+= opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o
 obj-y	+= rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o
 obj-y  += opal-msglog.o opal-hmi.o opal-power.o
+obj-y  += opal-poweroff-events.o
 
 obj-$(CONFIG_SMP)  += smp.o subcore.o subcore-asm.o
 obj-$(CONFIG_PCI)  += pci.o pci-p5ioc2.o pci-ioda.o
diff --git a/arch/powerpc/platforms/powernv/opal-poweroff-events.c 
b/arch

Re: [PATCH V4] powerpc, powernv: Add OPAL platform event driver

2015-03-02 Thread Vipin K Parashar

Hi Stewart,
  Tried to fake ACPI via acpi_bus_generate_netlink_event and found that
it needs other files which are arch-specific and use x86 assembly.

Regards,
Vipin


On 02/24/2015 03:14 PM, Vipin K Parashar wrote:

Hi Stewart,
 I looked into ACPI and found details about it. But before we go into
discussing it in more detail, I would like to share a brief overview of the
OPAL platform events (EPOW/DPO) work and the original design proposed.

As of now the OPAL platform events work supports two events:
EPOW (Early Power Off Warning) and DPO (Delayed Power Off).

On FSP-based systems the FSP notifies OPAL about EPOW and DPO events via the
mbox mechanism. Subsequently OPAL sends notifications for these events to the
pkvm kernel. The original design is to have a kernel driver maintain a queue
and add these events to the queue upon arrival. The pkvm driver also provides
a character device for the host to consume these events. A daemon is proposed
for the pkvm host to poll/read these events from the char device. This daemon
would process these events and take action to log them and shut down the
host. Apart from this it would also send the event info to VMs, where it is
handled by the OSes running on the VMs. Linux on VMs already has code in
place to handle these events, as it expects this info to reach it in PAPR
format under the EPOW (Environmental and Power Warnings) category.

EPOW mbox msgs are received for below events:
1. UPS events - UPS Battery Low, UPS Bypassed, UPS Utility Failure, 
UPS On
2. SPCN events - Configuration Change, Log SPCN Fault, Impending Power 
Failure, Power Incomplete
3. Temprature events - Over Ambient temperature, Over internal 
temperature.


Now ACPI:

Looked into ACPI and tried to figure out how ACPI userspace/kernel 
framework

can be helpful for our work.

ACPI user space consists of below components.
acpid - ACPI daemon to receive events from kernel
acpid provides events and actions files in /etc/acpi dir to configure 
actions

for various events.

acpi, acpi_listen, acpitool - Commands to query and set various ACPI 
supported parameters.
These tools work with various sysfs files to show/set various 
parameter values.


As if today acpid and other tools don't exist for POWER so would need 
to be ported.
acpid is useful for our work but other tools might not be helpful as 
they look into
various sysfs files created by various ACPI kernel drivers which we 
won't have.

Also we would need to map our EPOW/DPO events to acpid supported events
and few events link SPCN ones won't map straight away and might need 
to be

added in acpid as new events.

ACPI in kernel has various drivers for fan, battery, laptop buttons 
etc. They handle events
and uses netlink mechanism to sent out these events to userspace. Now 
looking into ACPI
code it seems that we would be reusing a small chunk of acpi code but 
instead end up adding
unnecessary complexity due to support a lot of stuff than needed by 
us. Here too mapping our
 EPOW/DPO events to ACPI defined structures in needed and we would 
need to add
new member varaibles in ACPI event structures for unmapped events like 
SPCN ones.


In nutshell it seems that by using ACPI we would end up adding lot 
more complexity with a little

gain of code reuse.

Netlink:

On technology side netlink seems to be a faster method compared to 
character driver. So that could be
a good alternative to use as a method of communication between our 
pkvm driver and userspace.
But EPOW/DPO events occur at very low rate unlike network subsystem 
which receive data packets
at a very high rate. So probably netlink could be a faster method but 
due to slow EPOW/DPO event

traffic a character driver might be sufficient.

We already have ppc64-diag package which is part of various distros so 
would be used for hosting
daemon code. Thus it takes off overhead of convincing distros for 
adding something extra.


This was my findings and opinions on alternatives. Apologies for a 
little lengthy text :-)


Let me know if i missed out anything and any suggestions that you 
would have.


Regards,
Vipin

On 02/11/2015 10:32 AM, Stewart Smith wrote:

Vipin K Parashar vi...@linux.vnet.ibm.com writes:

(1) Environmental and Power Warning (EPOW)
(2) Delayed Power Off (DPO)
The user interface for this driver is /dev/opal_event character
device file where the user space clients can poll and read for
new opal platform events. The expected sequence of events driven
from user space should be like the following.

(1) Open the character device file
(2) Poll on the file for POLLIN event
(3) When unblocked, must attempt to read 
OPAL_PLAT_EVENT_MAX_SIZE size

(4) Kernel driver will pass at most one opal_plat_event structure
(5) Poll again for more new events

A few thoughts from discussing with Michael and Joel:
- not convinced that a chardev is the most ideal way to notify
   userspace. It seems like yet-another powerpc specific notification
   mechanism, which isn't ideal.
- netlink probably isn't right

Re: [PATCH V4] powerpc, powernv: Add OPAL platform event driver

2015-02-24 Thread Vipin K Parashar

Hi Stewart,

I looked into ACPI and have some findings to share. But before we go
into the details, I would like to give a brief overview of the OPAL
platform events (EPOW/DPO) work and the design originally proposed.

As of now, the OPAL platform events work supports two events:
EPOW (Early Power Off Warning) and DPO (Delayed Power Off).

On FSP-based systems, the FSP notifies OPAL about EPOW and DPO events
via the mbox mechanism. OPAL then sends notifications for these events
to the pkvm kernel. The original design is to have a kernel driver
maintain a queue and add these events to the queue upon arrival. The
pkvm driver also provides a character device for the host to consume
these events. A daemon is proposed for the pkvm host to poll/read these
events from the char device. This daemon would process the events and
take action to log them and shut down the host. Apart from this, it
would also send the event info to VMs, where it is handled by the OSes
running on the VMs. Linux on VMs already has code in place to handle
these events, as it expects this info to reach it in PAPR format under
the EPOW (Environmental and Power Warnings) category.
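
The queue described above could be sketched as a small fixed-size FIFO.
This is only an illustration under assumptions: struct plat_event,
QUEUE_DEPTH and the queue_push()/queue_pop() helpers are hypothetical
names, not taken from the patch, and a real kernel driver would also
need locking and a wait queue, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_DEPTH 8	/* assumed depth, chosen for illustration only */

/* Hypothetical event record; the real layout is defined by the patch. */
struct plat_event {
	uint32_t type;		/* e.g. EPOW or DPO */
	uint32_t status;	/* event-specific status bits */
};

struct event_queue {
	struct plat_event slots[QUEUE_DEPTH];
	size_t head;		/* index of the oldest queued event */
	size_t count;		/* number of queued events */
};

/* Add an event at the tail; returns false if the queue is full. */
static bool queue_push(struct event_queue *q, struct plat_event ev)
{
	if (q->count == QUEUE_DEPTH)
		return false;
	q->slots[(q->head + q->count) % QUEUE_DEPTH] = ev;
	q->count++;
	return true;
}

/* Remove the oldest event into *out; returns false if the queue is empty. */
static bool queue_pop(struct event_queue *q, struct plat_event *out)
{
	if (q->count == 0)
		return false;
	*out = q->slots[q->head];
	q->head = (q->head + 1) % QUEUE_DEPTH;
	q->count--;
	return true;
}
```

In this picture the OPAL notification callback would call queue_push()
and the char device read path would call queue_pop(), so events reach
the daemon in arrival order.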

EPOW mbox msgs are received for the events below:
1. UPS events - UPS Battery Low, UPS Bypassed, UPS Utility Failure, UPS On
2. SPCN events - Configuration Change, Log SPCN Fault, Impending Power
   Failure, Power Incomplete
3. Temperature events - Over ambient temperature, Over internal temperature
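
One way to carry several events of one class together is a bitmask per
class. The sketch below shows the idea for the UPS class only. The bit
positions and all names (ups_event_bits, report_ups_events) are
assumptions made for illustration; the real values are defined by the
firmware interface and are not given in this mail.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit assignments for the UPS event class. */
enum ups_event_bits {
	UPS_BATTERY_LOW		= 1u << 0,
	UPS_BYPASSED		= 1u << 1,
	UPS_UTILITY_FAILURE	= 1u << 2,
	UPS_ON			= 1u << 3,
};

/* Names indexed by bit position, matching the enum above. */
static const char *const ups_event_names[] = {
	"UPS Battery Low",
	"UPS Bypassed",
	"UPS Utility Failure",
	"UPS On",
};

/* Print every UPS event set in status and return how many were set. */
static int report_ups_events(uint16_t status)
{
	int count = 0;

	for (unsigned int bit = 0; bit < 4; bit++) {
		if (status & (1u << bit)) {
			printf("UPS event: %s\n", ups_event_names[bit]);
			count++;
		}
	}
	return count;
}
```

With this layout a single 16-bit word can report, say, "battery low"
and "utility failure" at the same time, and the SPCN and temperature
classes would get their own words in the same way.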

Now ACPI:

I looked into ACPI and tried to figure out how the ACPI userspace/kernel
framework can be helpful for our work.

ACPI user space consists of the components below.
acpid - ACPI daemon that receives events from the kernel.
acpid provides events and actions files in the /etc/acpi dir to
configure actions for various events.

acpi, acpi_listen, acpitool - Commands to query and set various
ACPI-supported parameters.
These tools work with various sysfs files to show/set parameter values.


As of today, acpid and the other tools don't exist for POWER, so they
would need to be ported. acpid is useful for our work, but the other
tools might not be helpful, as they look into sysfs files created by
ACPI kernel drivers which we won't have.

Also, we would need to map our EPOW/DPO events to acpid-supported
events. A few events, like the SPCN ones, won't map straight away and
might need to be added to acpid as new events.

ACPI in the kernel has various drivers for fan, battery, laptop buttons
etc. They handle events and use the netlink mechanism to send these
events out to userspace. Looking into the ACPI code, it seems that we
would be reusing only a small chunk of it while adding unnecessary
complexity, because it supports a lot more than we need. Here too,
mapping our EPOW/DPO events to ACPI-defined structures is needed, and we
would need to add new member variables to the ACPI event structures for
unmapped events like the SPCN ones.

In a nutshell, it seems that by using ACPI we would end up adding a lot
more complexity for little gain in code reuse.

Netlink:

On the technology side, netlink seems to be a faster method compared to
a character driver, so it could be a good alternative for communication
between our pkvm driver and userspace. But EPOW/DPO events occur at a
very low rate, unlike the network subsystem, which receives data packets
at a very high rate. So netlink is probably faster, but given the slow
EPOW/DPO event traffic a character driver might be sufficient.

We already have the ppc64-diag package, which is part of various
distros, so it would be used for hosting the daemon code. This removes
the overhead of convincing distros to add something extra.


These were my findings and opinions on the alternatives. Apologies for
the slightly lengthy text :-)

Let me know if I missed anything, and any suggestions that you have.


Regards,
Vipin


Re: [PATCH V4] powerpc, powernv: Add OPAL platform event driver

2015-02-11 Thread Vipin K Parashar


On 02/11/2015 10:32 AM, Stewart Smith wrote:

Vipin K Parashar vi...@linux.vnet.ibm.com writes:

(1) Environmental and Power Warning (EPOW)
(2) Delayed Power Off (DPO)
The user interface for this driver is /dev/opal_event character
device file where the user space clients can poll and read for
new opal platform events. The expected sequence of events driven
from user space should be like the following.

(1) Open the character device file
(2) Poll on the file for POLLIN event
(3) When unblocked, must attempt to read OPAL_PLAT_EVENT_MAX_SIZE size
(4) Kernel driver will pass at most one opal_plat_event structure
(5) Poll again for more new events

A few thoughts from discussing with Michael and Joel:
- not convinced that a chardev is the most ideal way to notify
   userspace. It seems like yet-another powerpc specific notification
   mechanism, which isn't ideal.
- netlink probably isn't right either (although maybe *slightly*
   better?)
- it seems that the standard way is ACPI, so I wonder if we could emit
   an ACPI event and essentially fake having ACPI... that would make all
   existing userspace just work, right?
   Looking at acpi_bus_generate_netlink_event call in
   drivers/acpi/button.c it looks possible that we may be able to
   (relatively simply) do that?
Thanks Stewart, I will explore ACPI further and will also try to see if
we could use it to throw events to guests.

- What do UPSs do? It would seem that some common this is what's about
   to happen to your power would almost *have* to exist somewhat
   generically?
The UPS class tells about the status of the UPS attached to the system.
The FSP sends mbox messages with the UPS status, along with UPS status
bits which tell exactly what has changed in the UPS status, like UPS
installed, UPS battery low, UPS removed (bypassed). We plan to add
support for these UPS events in skiboot to provide more UPS details.

I strongly advocate for anything that doesn't require custom userspace
that's OPAL/POWER specific (that we then have to get into distros etc etc

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev



[PATCH V4] powerpc, powernv: Add OPAL platform event driver

2015-02-05 Thread Vipin K Parashar

This patch creates a new OPAL platform event character driver
which gives userspace clients access to these events so that they
can process them effectively. The following platform events are
currently supported by this platform driver.

(1) Environmental and Power Warning (EPOW)
(2) Delayed Power Off (DPO)

The user interface for this driver is /dev/opal_event character
device file where the user space clients can poll and read for
new opal platform events. The expected sequence of events driven
from user space should be like the following.

(1) Open the character device file
(2) Poll on the file for POLLIN event
(3) When unblocked, must attempt to read OPAL_PLAT_EVENT_MAX_SIZE size
(4) Kernel driver will pass at most one opal_plat_event structure
(5) Poll again for more new events
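
As a rough illustration of steps (2) to (4), a user space client could
wait for and read one event as in the sketch below. This is a minimal
sketch, not code from the patch: wait_for_event() and EVENT_BUF_SIZE are
hypothetical names, and the buffer size is only a placeholder for
OPAL_PLAT_EVENT_MAX_SIZE, whose real value comes from the exported
opal_platform_events.h header.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed placeholder for OPAL_PLAT_EVENT_MAX_SIZE; the real value may differ. */
#define EVENT_BUF_SIZE 4096

/*
 * Block for up to timeout_ms until fd is readable, then read one event
 * into buf. Returns the number of bytes read, 0 on timeout (or if the
 * descriptor did not report POLLIN), and -1 on error.
 */
static ssize_t wait_for_event(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ready = poll(&pfd, 1, timeout_ms);

	if (ready < 0)
		return -1;
	if (ready == 0 || !(pfd.revents & POLLIN))
		return 0;
	return read(fd, buf, len);
}
```

A client would open /dev/opal_event with open(2), call wait_for_event()
with a buffer of the maximum event size, handle the returned structure,
and then loop back to poll again as in step (5).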

The driver registers for OPAL message notifications corresponding to
individual OPAL events. When any of those event messages arrive in the
kernel, the callbacks are called to process them, which in turn unblocks
the polling thread on the character device file. The driver also registers
a timer function which will be called after a threshold amount of time to
shut down the system. The user space client receives the timeout value for
all individual OPAL platform events and hence must prepare the system and
eventually shut it down. In case the user client does not shut down the
system, the timer function will be called after the threshold and will
shut down the system explicitly.
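
The time a client has left before the forced shutdown can be derived
from the event arrival time and the timeout value. The helper below is a
sketch of that arithmetic only; shutdown_seconds_left() is a
hypothetical name, and the assumption that all three values are in
seconds on the same clock is mine, not something stated in the patch.

```c
#include <stdint.h>

/*
 * Seconds remaining before the forced shutdown fires.
 * arrival: time the event was received, timeout: grace period,
 * now: current time. Returns 0 once the grace period has expired.
 */
static uint64_t shutdown_seconds_left(uint64_t arrival, uint64_t timeout,
				      uint64_t now)
{
	uint64_t elapsed = now > arrival ? now - arrival : 0;

	return elapsed >= timeout ? 0 : timeout - elapsed;
}
```

For example, an event that arrived at t=100 with a 60 second timeout
leaves 40 seconds when queried at t=120 and 0 seconds from t=160 on.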

Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com
Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com
---
Changes in V4:
- Used miscdev in place of chardev
- Used module_platform_driver macro for registering platform driver
- Added endianness conversions before and after making OPAL calls
- Changed events data structure in opal_platform_events.h to use bitmask
  for various events in each event class
- Added some info prints
- Added code changes to return remaining time for DPO event for user space query
- Made O_NONBLOCK unsupported for file open call
- Changed actionable_epow function to exclude events and purged epow_exclude
  function

Changes in V3:
- Rebased the patch against the mainline

Changes in V2:
- Changed the function fetch_dpo_timeout
- Export opal_platform_events.h for user space consumption
- Posted here https://patchwork.ozlabs.org/patch/396725/

Original V1:
- Original patch
- Posted here http://patchwork.ozlabs.org/patch/394340/

 arch/powerpc/include/asm/opal.h                |  45 +-
 arch/powerpc/include/uapi/asm/Kbuild           |   1 +
 .../include/uapi/asm/opal_platform_events.h    |  90 +++
 arch/powerpc/platforms/powernv/Makefile        |   2 +-
 .../platforms/powernv/opal-platform-events.c   | 663 +
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 arch/powerpc/platforms/powernv/opal.c          |   8 +-
 7 files changed, 807 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/opal_platform_events.h
 create mode 100644 arch/powerpc/platforms/powernv/opal-platform-events.c

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index eb95b67..950839c 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -166,6 +166,7 @@ struct opal_sg_list {
 #define OPAL_UNREGISTER_DUMP_REGION	102
 #define OPAL_WRITE_TPO			103
 #define OPAL_READ_TPO			104
+#define OPAL_GET_DPO_STATUS		105
 #define OPAL_IPMI_SEND			107
 #define OPAL_IPMI_RECV			108
 #define OPAL_I2C_REQUEST		109
@@ -306,6 +307,7 @@ enum OpalMessageType {
OPAL_MSG_EPOW,
OPAL_MSG_SHUTDOWN,
OPAL_MSG_HMI_EVT,
+   OPAL_MSG_DPO,
OPAL_MSG_TYPE_MAX,
 };
 
@@ -421,6 +423,46 @@ struct opal_msg {
__be64 params[8];
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass OPAL a buffer of length OPAL_SYSEPOW_MAX,
+ * with individual elements being 16 bits wide, to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific to any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+   OPAL_SYSEPOW_POWER    = 0,    /* Power EPOW */
+   OPAL_SYSEPOW_TEMP     = 1,    /* Temperature EPOW */
+   OPAL_SYSEPOW_COOLING  = 2,    /* Cooling EPOW */
+   OPAL_SYSEPOW_MAX      = 3,    /* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+   OPAL_SYSPOWER_UPS   = 0x0001, /* System on UPS power */
+   OPAL_SYSPOWER_CHNG  = 0x0002, /* System power config change */
+   OPAL_SYSPOWER_FAIL