Re: [PATCH V4] powerpc, powernv: Add OPAL platform event driver
Hi Stewart,

I tried to fake ACPI via acpi_bus_generate_netlink_event and found that it needs other files which are arch-specific and use x86 assembly.

Regards,
Vipin

On 02/24/2015 03:14 PM, Vipin K Parashar wrote:
> Hi Stewart,
>
> I looked into ACPI and found details about it.
> [...]
Re: [PATCH V4] powerpc, powernv: Add OPAL platform event driver
Hi Stewart,

I looked into ACPI and found details about it. But before we discuss it in more detail, I would like to share a brief overview of the OPAL platform events (EPOW/DPO) work and the original design proposed.

As of now, the OPAL platform events work supports two events: EPOW (Early Power Off Warning) and DPO (Delayed Power Off). On FSP-based systems, the FSP notifies OPAL about EPOW and DPO events via the mbox mechanism. OPAL subsequently sends notifications for these events to the pkvm kernel.

The original design is to have a kernel driver maintain a queue and add these events to the queue upon arrival. The pkvm driver also provides a character device for the host to consume these events. A daemon is proposed for the pkvm host to poll/read these events from the char device. This daemon would process the events and take action to log them and shut down the host. Apart from this, it would also send the event info to the VMs, where it is handled by the OSes running on them. Linux on VMs already has code in place to handle these events, as it expects this info to reach it in PAPR format under the EPOW (Environmental and Power Warnings) category.

EPOW mbox msgs are received for the events below:

1. UPS events - UPS Battery Low, UPS Bypassed, UPS Utility Failure, UPS On
2. SPCN events - Configuration Change, Log SPCN Fault, Impending Power Failure, Power Incomplete
3. Temperature events - Over Ambient temperature, Over internal temperature

Now ACPI:

I looked into ACPI and tried to figure out how the ACPI userspace/kernel framework can be helpful for our work. ACPI user space consists of the components below.

acpid - ACPI daemon to receive events from the kernel. acpid provides events and actions files in the /etc/acpi dir to configure actions for various events.

acpi, acpi_listen, acpitool - Commands to query and set various ACPI-supported parameters. These tools work with various sysfs files to show/set parameter values.

As of today, acpid and the other tools don't exist for POWER, so they would need to be ported. acpid is useful for our work, but the other tools might not be helpful, as they look into sysfs files created by ACPI kernel drivers which we won't have. We would also need to map our EPOW/DPO events to acpid-supported events; a few events, like the SPCN ones, won't map straight away and might need to be added to acpid as new events.

ACPI in the kernel has various drivers for fan, battery, laptop buttons etc. They handle events and use the netlink mechanism to send these events out to userspace. Looking into the ACPI code, it seems that we would be reusing only a small chunk of it while adding unnecessary complexity, because it supports a lot more than we need. Here too, mapping our EPOW/DPO events to ACPI-defined structures is needed, and we would need to add new member variables to the ACPI event structures for unmapped events like the SPCN ones.

In a nutshell, it seems that by using ACPI we would end up adding a lot more complexity for little gain in code reuse.

Netlink:

On the technology side, netlink seems to be a faster method compared to a character driver, so it could be a good alternative for communication between our pkvm driver and userspace. But EPOW/DPO events occur at a very low rate, unlike the network subsystem, which receives data packets at a very high rate. So netlink could probably be faster, but given the slow EPOW/DPO event traffic a character driver might be sufficient.

We already have the ppc64-diag package, which is part of various distros, so it would be used for hosting the daemon code. That takes away the overhead of convincing distros to add something extra.

These were my findings and opinions on the alternatives. Apologies for the somewhat lengthy text :-) Let me know if I missed anything, and any suggestions that you have.

Regards,
Vipin

On 02/11/2015 10:32 AM, Stewart Smith wrote:
> A few thoughts from discussing with Michael and Joel:
> - not convinced that a chardev is the most ideal way to notify userspace. It seems like yet-another powerpc specific notification mechanism, which isn't ideal.
> - netlink probably isn't right either (although maybe *slightly* better?)
> - it seems that the standard way is ACPI, so I wonder if we could emit an ACPI event and essentially fake having ACPI... that would make all existing userspace just work, right?
> [...]
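The original design described in the message above (a driver-side queue that is filled as events arrive and drained one event at a time by a reader) can be modelled in plain C. This is only a user-space sketch under stated assumptions: `struct plat_event`, its fields, `QUEUE_DEPTH` and the function names are hypothetical and are not taken from the patch, and a real kernel driver would additionally need locking and a wait queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_DEPTH 8	/* hypothetical depth, not taken from the patch */

/* Hypothetical event record; the real layout lives in the patch's headers. */
enum plat_event_type { PLAT_EVENT_EPOW, PLAT_EVENT_DPO };

struct plat_event {
	enum plat_event_type type;
	uint64_t timeout_secs;	/* hypothetical payload field */
};

/* Fixed-size FIFO implemented as a ring buffer. */
struct event_queue {
	struct plat_event slots[QUEUE_DEPTH];
	size_t head;	/* index of the oldest queued event */
	size_t count;	/* number of events currently queued */
};

/* Called on event arrival; returns false if the queue is full. */
bool queue_push(struct event_queue *q, struct plat_event ev)
{
	assert(q != NULL);
	if (q->count == QUEUE_DEPTH)
		return false;
	q->slots[(q->head + q->count) % QUEUE_DEPTH] = ev;
	q->count++;
	return true;
}

/* Called on read; hands out at most one event, oldest first. */
bool queue_pop(struct event_queue *q, struct plat_event *out)
{
	assert(q != NULL && out != NULL);
	if (q->count == 0)
		return false;
	*out = q->slots[q->head];
	q->head = (q->head + 1) % QUEUE_DEPTH;
	q->count--;
	return true;
}
```

In this model an arriving EPOW or DPO notification maps to `queue_push`, and each successful read on the character device maps to one `queue_pop`. Dropping events when the queue is full is a choice made for the sketch only.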
Re: [PATCH V4] powerpc, powernv: Add OPAL platform event driver
On 02/11/2015 10:32 AM, Stewart Smith wrote:
> A few thoughts from discussing with Michael and Joel:
> - not convinced that a chardev is the most ideal way to notify userspace. It seems like yet-another powerpc specific notification mechanism, which isn't ideal.
> - netlink probably isn't right either (although maybe *slightly* better?)
> - it seems that the standard way is ACPI, so I wonder if we could emit an ACPI event and essentially fake having ACPI... that would make all existing userspace just work, right? Looking at acpi_bus_generate_netlink_event call in drivers/acpi/button.c it looks possible that we may be able to (relatively simply) do that?

Thanks Stewart, I will explore ACPI further and will also try to see if we could use it to send events to guests.

> - What do UPSs do? It would seem that some common "this is what's about to happen to your power" would almost *have* to exist somewhat generically?

The UPS class reports the UPS status of the system. The FSP sends mbox messages with the UPS status along with a UPS status bit which tells exactly what has changed in the UPS status, such as UPS installed, UPS battery low, or UPS removed (bypassed). We plan to add support for these UPS events in skiboot to provide more UPS details.

> I strongly advocate for anything that doesn't require custom userspace that's OPAL/POWER specific (that we then have to get into distros etc etc)

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH V4] powerpc, powernv: Add OPAL platform event driver
Vipin K Parashar vi...@linux.vnet.ibm.com writes:
>> - What do UPSs do? It would seem that some common "this is what's about to happen to your power" would almost *have* to exist somewhat generically?
>
> The UPS class reports the UPS status of the system. The FSP sends mbox messages with the UPS status along with a UPS status bit which tells exactly what has changed in the UPS status, such as UPS installed, UPS battery low, or UPS removed (bypassed). We plan to add support for these UPS events in skiboot to provide more UPS details.

I was thinking of UPSs on systems other than IBM FSP-based POWER systems, like if I went down the street and bought one for $100 and plugged it into my x86 desktop.
Re: [PATCH V4] powerpc, powernv: Add OPAL platform event driver
Vipin K Parashar vi...@linux.vnet.ibm.com writes:
> (1) Environmental and Power Warning (EPOW)
> (2) Delayed Power Off (DPO)
>
> The user interface for this driver is /dev/opal_event character device file where the user space clients can poll and read for new opal platform events. The expected sequence of events driven from user space should be like the following.
>
> (1) Open the character device file
> (2) Poll on the file for POLLIN event
> (3) When unblocked, must attempt to read OPAL_PLAT_EVENT_MAX_SIZE size
> (4) Kernel driver will pass at most one opal_plat_event structure
> (5) Poll again for more new events

A few thoughts from discussing with Michael and Joel:

- not convinced that a chardev is the most ideal way to notify userspace. It seems like yet-another powerpc specific notification mechanism, which isn't ideal.
- netlink probably isn't right either (although maybe *slightly* better?)
- it seems that the standard way is ACPI, so I wonder if we could emit an ACPI event and essentially fake having ACPI... that would make all existing userspace just work, right? Looking at acpi_bus_generate_netlink_event call in drivers/acpi/button.c it looks possible that we may be able to (relatively simply) do that?
- What do UPSs do? It would seem that some common "this is what's about to happen to your power" would almost *have* to exist somewhat generically?

I strongly advocate for anything that doesn't require custom userspace that's OPAL/POWER specific (that we then have to get into distros etc etc)
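For readers following the thread, the five-step user-space sequence quoted above (open, poll for POLLIN, read, poll again) can be sketched roughly as below. This is a minimal sketch, not the daemon from the proposal: `wait_and_read_event` and `EVENT_BUF_SIZE` are hypothetical names, and the buffer size is a placeholder because the value of OPAL_PLAT_EVENT_MAX_SIZE is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Placeholder only: a real client would use OPAL_PLAT_EVENT_MAX_SIZE from
 * the exported opal_platform_events.h header instead of this value.
 */
#define EVENT_BUF_SIZE 4096

/*
 * Steps (2) to (4): block in poll() until fd is readable, then issue a
 * single read(). Returns the number of bytes read, or -1 with errno set
 * (ETIMEDOUT if nothing arrived within timeout_ms milliseconds).
 */
ssize_t wait_and_read_event(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ready;

	assert(buf != NULL && len > 0);

	/* Restart poll() if a signal interrupts it. */
	do {
		ready = poll(&pfd, 1, timeout_ms);
	} while (ready < 0 && errno == EINTR);

	if (ready < 0)
		return -1;
	if (ready == 0) {
		errno = ETIMEDOUT;
		return -1;
	}
	if (!(pfd.revents & POLLIN)) {
		errno = EIO;	/* only POLLERR/POLLHUP/POLLNVAL was reported */
		return -1;
	}

	return read(fd, buf, len);
}
```

A client would open /dev/opal_event once (step 1), then call this function in a loop with a timeout of -1 to wait indefinitely, which covers step (5). The V4 changelog states that O_NONBLOCK is not supported on open, so a plain O_RDONLY open is assumed here.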
[PATCH V4] powerpc, powernv: Add OPAL platform event driver
This patch creates a new OPAL platform event character driver which gives user space clients access to these events so that they can process them effectively. The following platform events are currently supported by this platform driver.

	(1) Environmental and Power Warning (EPOW)
	(2) Delayed Power Off (DPO)

The user interface for this driver is the /dev/opal_event character device file, where user space clients can poll and read for new OPAL platform events. The expected sequence of events driven from user space should be like the following.

	(1) Open the character device file
	(2) Poll on the file for POLLIN event
	(3) When unblocked, must attempt to read OPAL_PLAT_EVENT_MAX_SIZE size
	(4) Kernel driver will pass at most one opal_plat_event structure
	(5) Poll again for more new events

The driver registers for OPAL message notifications corresponding to the individual OPAL events. When any of those event messages arrives in the kernel, the callbacks are called to process it, which in turn unblocks the thread polling on the character device file. The driver also registers a timer function which will be called after a threshold amount of time to shut down the system. The user space client receives the timeout value for each individual OPAL platform event and hence must prepare the system and eventually shut it down. If the user space client does not shut down the system, the timer function will be called after the threshold and will shut down the system explicitly.

Signed-off-by: Vipin K Parashar vi...@linux.vnet.ibm.com
Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com
---
Changes in V4:
	- Used miscdev in place of chardev
	- Used module_platform_driver macro for registering platform driver
	- Added endianness conversions before and after making OPAL calls
	- Changed events data structure in opal_platform_events.h to use
	  bitmask for various events in each event class
	- Added some info prints
	- Added code changes to return remaining time for DPO event for
	  user space query
	- Made O_NONBLOCK unsupported for file open call
	- Changed actionable_epow function to exclude events and purged
	  epow_exclude function

Changes in V3:
	- Rebased the patch against the mainline

Changes in V2:
	- Changed the function fetch_dpo_timeout
	- Export opal_platform_events.h for user space consumption
	- Posted here https://patchwork.ozlabs.org/patch/396725/

Original V1:
	- Original patch
	- Posted here http://patchwork.ozlabs.org/patch/394340/

 arch/powerpc/include/asm/opal.h                    |  45 +-
 arch/powerpc/include/uapi/asm/Kbuild               |   1 +
 .../include/uapi/asm/opal_platform_events.h        |  90 +++
 arch/powerpc/platforms/powernv/Makefile            |   2 +-
 .../platforms/powernv/opal-platform-events.c       | 663 +
 arch/powerpc/platforms/powernv/opal-wrappers.S     |   1 +
 arch/powerpc/platforms/powernv/opal.c              |   8 +-
 7 files changed, 807 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/opal_platform_events.h
 create mode 100644 arch/powerpc/platforms/powernv/opal-platform-events.c

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index eb95b67..950839c 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -166,6 +166,7 @@ struct opal_sg_list {
 #define OPAL_UNREGISTER_DUMP_REGION		102
 #define OPAL_WRITE_TPO				103
 #define OPAL_READ_TPO				104
+#define OPAL_GET_DPO_STATUS			105
 #define OPAL_IPMI_SEND				107
 #define OPAL_IPMI_RECV				108
 #define OPAL_I2C_REQUEST			109
@@ -306,6 +307,7 @@ enum OpalMessageType {
 	OPAL_MSG_EPOW,
 	OPAL_MSG_SHUTDOWN,
 	OPAL_MSG_HMI_EVT,
+	OPAL_MSG_DPO,
 	OPAL_MSG_TYPE_MAX,
 };
 
@@ -421,6 +423,46 @@ struct opal_msg {
 	__be64 params[8];
 };
 
+/*
+ * EPOW status sharing (OPAL and the host)
+ *
+ * The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
+ * with individual elements being 16 bits wide to fetch the system
+ * wide EPOW status. Each element in the buffer will contain the
+ * EPOW status in its bit representation for a particular EPOW sub
+ * class as defined here. So multiple detailed EPOW status bits
+ * specific for any sub class can be represented in a single buffer
+ * element as its bit representation.
+ */
+
+/* System EPOW type */
+enum OpalSysEpow {
+	OPAL_SYSEPOW_POWER	= 0,	/* Power EPOW */
+	OPAL_SYSEPOW_TEMP	= 1,	/* Temperature EPOW */
+	OPAL_SYSEPOW_COOLING	= 2,	/* Cooling EPOW */
+	OPAL_SYSEPOW_MAX	= 3,	/* Max EPOW categories */
+};
+
+/* Power EPOW */
+enum OpalSysPower {
+	OPAL_SYSPOWER_UPS	= 0x0001, /* System on UPS power */
+	OPAL_SYSPOWER_CHNG	= 0x0002, /* System power config change */
+	OPAL_SYSPOWER_FAIL	=