[PATCH v4 00/21] EEH reorganization

2012-02-24 Thread Gavin Shan
This series of patches reorganizes EEH so that it can support multiple
platforms in the future. The requirements come from the following aspects.

* The original EEH implementation only supports the pSeries platform, which
  is regarded as a guest system. The powernv platform is coming and EEH
  needs to be supported on powernv as well.
* Different platforms might run on different firmware. Furthermore,
  the firmware would supply different EEH interfaces to the kernel.
  Therefore, we have to abstract the current EEH implementation.

In order to accommodate the requirements, the series of patches reorganizes
the current EEH implementation.

* The original implementation is not clean enough. The necessary cleanup
  is done in some of the patches.
* struct eeh_ops has been introduced so that the EEH core components and the
  platform dependent implementation can be split up. That makes it possible
  for EEH to be supported on multiple platforms.
* struct eeh_dev has been introduced to replace struct pci_dn so that the
  EEH module works as independently as possible.
* EEH global statistics will be maintained in a collective fashion.

v1 - v2:

* Add the eeh_ prefix to function names where possible.
* The format of leading function comments won't be changed in order not to
  break automatic kernel document generation (e.g. by make pdfdocs).
* The names of local variables won't be changed without an explicit reason.
* Represent the PE's state as a bitmap.
* Some function names have been adjusted so that they are shorter and
  more meaningful.
* Platform operation name has been changed to pseries.
* Merge the cleanup patches where possible.
* Line length is kept appropriately short where possible.
* Fix alignment and spacing issues.

v2 - v3:
* Split the cleanup patch into 2: one for comment cleanup and another for
  renaming functions.
* Use pr_warning/pr_info/pr_debug instead of printk() calls where possible.
* Function names are adjusted a little so that they look more meaningful,
  according to comments from Michael/Ben.
* Useful comments have been kept according to Michael's comments.
* struct eeh_ops::set_eeh has been changed to eeh_ops::set_option.
* struct eeh_ops::name has been changed to char *.
* Remove the file name from the source file.
* Copyright (C) format has been changed since (C) isn't encouraged.
* The header files included in the source file have been sorted
  alphabetically.
* eeh_platform_init() has been replaced by eeh_pseries_init() to avoid
  duplicate functions when the kernel supports multiple platforms.
* F/W has been changed to Firmware.
* The maximal wait time to retrieve the PE's state is now defined by a macro.
* It also includes changes according to the minor comments from Michael.

v3 - v4:
* Fix some typos in the commit messages.
* Reduce code nesting according to Ram's suggestions.
* Additional pr_warning on failure to configure bridges.

The series of patches (v4) has been verified on a Firebird-L machine. In order
to carry out the test, you have to install IBM Power Tools from the IBM
internal yum source. The following command is used to force an EEH check on
the ethernet interface, which EEH and the device driver should eventually
recover successfully. You can keep pinging the blade before issuing the
command to force EEH. You should see that the network interface can't be
reached for a moment and that everything recovers a couple of seconds after
the forced EEH error. At the same time, you should see the EEH error log on
the system console.

* errinjct eeh -v -f 0 -p U78AE.001.WZS00M9-P1-C18-L1-T2 -a 0x0 -m 0x0


-

arch/powerpc/include/asm/device.h|3 -
arch/powerpc/include/asm/eeh.h   |  143 +---
arch/powerpc/include/asm/eeh_event.h |   33 +-
arch/powerpc/include/asm/ppc-pci.h   |   89 ++-
arch/powerpc/kernel/of_platform.c|3 -
arch/powerpc/kernel/rtas_pci.c   |3 -
arch/powerpc/platforms/pseries/Makefile  |3 +-
arch/powerpc/platforms/pseries/eeh.c | 1016 +++---
arch/powerpc/platforms/pseries/eeh_cache.c   |   44 +-
arch/powerpc/platforms/pseries/eeh_driver.c  |  213 +++---
arch/powerpc/platforms/pseries/eeh_event.c   |   55 +-
arch/powerpc/platforms/pseries/eeh_pseries.c |  565 --
arch/powerpc/platforms/pseries/eeh_sysfs.c   |   25 +-
arch/powerpc/platforms/pseries/msi.c |2 +-
arch/powerpc/platforms/pseries/pci_dlpar.c   |3 -
arch/powerpc/platforms/pseries/setup.c   |7 +-
include/linux/of.h   

[PATCH 08/21] pSeries platform EEH wait PE state

2012-02-24 Thread Gavin Shan
On the pSeries platform, the PE state might be temporarily unavailable.
In that case, the firmware returns the corresponding wait time, which
means the kernel has to wait for that long before it can get the PE
state.

The patch implements that. Besides, the function has been abstracted
through struct eeh_ops::wait_state so that the EEH core components can
support multiple platforms in the future.
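
For readers who want to play with the wait logic outside the kernel, here is
a minimal user-space sketch of the polling loop described above. It is not
the code in this patch: struct demo_fw, demo_get_state(), demo_wait_state()
and the DEMO_* values are hypothetical stand-ins, and the sleep is replaced
by a counter so that the behaviour can be inspected.

```c
/* Hypothetical stand-ins for illustration; not the values from asm/eeh.h. */
#define DEMO_STATE_UNAVAILABLE	(1 << 0)
#define DEMO_STATE_ACTIVE	(1 << 3)
#define DEMO_MAX_WAIT_MS	(300 * 1000)

/* Simulated firmware: reports "unavailable" for the first busy_polls queries. */
struct demo_fw {
	int busy_polls;		/* queries left that report "unavailable" */
	int wait_hint_ms;	/* wait time the firmware suggests */
	long slept_ms;		/* total simulated sleep, for inspection */
};

static int demo_get_state(struct demo_fw *fw, int *delay_ms)
{
	if (fw->busy_polls > 0) {
		fw->busy_polls--;
		*delay_ms = fw->wait_hint_ms;
		return DEMO_STATE_UNAVAILABLE;
	}
	*delay_ms = 0;
	return DEMO_STATE_ACTIVE;
}

/* Poll until a state is available or the wait budget is used up. */
int demo_wait_state(struct demo_fw *fw, int max_wait_ms)
{
	int rc, mwait;

	for (;;) {
		rc = demo_get_state(fw, &mwait);
		if (rc != DEMO_STATE_UNAVAILABLE)
			return rc;

		if (max_wait_ms <= 0)
			break;

		/* Clamp a bad or excessive wait hint from the firmware */
		if (mwait <= 0)
			mwait = 1000;
		else if (mwait > DEMO_MAX_WAIT_MS)
			mwait = DEMO_MAX_WAIT_MS;

		max_wait_ms -= mwait;
		fw->slept_ms += mwait;	/* kernel code would msleep(mwait) here */
	}

	return -2;	/* timed out */
}
```

The budget is only checked after a failed query, so the loop always asks the
firmware at least once, even when max_wait_ms is zero.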

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/ppc-pci.h   |1 -
 arch/powerpc/platforms/pseries/eeh.c |   46 +
 arch/powerpc/platforms/pseries/eeh_driver.c  |2 +-
 arch/powerpc/platforms/pseries/eeh_pseries.c |   47 +-
 4 files changed, 49 insertions(+), 47 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 6150349..1cfb2b0 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -58,7 +58,6 @@ struct pci_dev *pci_get_device_by_addr(unsigned long addr);
 void eeh_slot_error_detail (struct pci_dn *pdn, int severity);
 int eeh_pci_enable(struct pci_dn *pdn, int function);
 int eeh_reset_pe(struct pci_dn *);
-int eeh_wait_for_slot_status(struct pci_dn *pdn, int max_wait_msecs);
 void eeh_restore_bars(struct pci_dn *);
 void eeh_configure_bridge(struct pci_dn *);
 int rtas_write_config(struct pci_dn *, int where, int size, u32 val);
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 8d11f1f..b5b03d4 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -287,48 +287,6 @@ void eeh_slot_error_detail(struct pci_dn *pdn, int severity)
 }
 
 /**
- * eeh_wait_for_slot_status - Returns error status of slot
- * @pdn: pci device node
- * @max_wait_msecs: maximum number to millisecs to wait
- *
- * Return negative value if a permanent error, else return
- * Partition Endpoint (PE) status value.
- *
- * If @max_wait_msecs is positive, then this routine will
- * sleep until a valid status can be obtained, or until
- * the max allowed wait time is exceeded, in which case
- * a -2 is returned.
- */
-int eeh_wait_for_slot_status(struct pci_dn *pdn, int max_wait_msecs)
-{
-	int rc;
-	int mwait;
-
-	while (1) {
-		rc = eeh_ops->get_state(pdn->node, &mwait);
-		if (rc != EEH_STATE_UNAVAILABLE)
-			return rc;
-
-		if (max_wait_msecs <= 0) break;
-
-		if (mwait <= 0) {
-			printk(KERN_WARNING "EEH: Firmware returned bad wait value=%d\n",
-				mwait);
-			mwait = 1000;
-		} else if (mwait > 300*1000) {
-			printk(KERN_WARNING "EEH: Firmware is taking too long, time=%d\n",
-				mwait);
-			mwait = 300*1000;
-		}
-		max_wait_msecs -= mwait;
-		msleep(mwait);
-	}
-
-	printk(KERN_WARNING "EEH: Timed out waiting for slot status\n");
-	return -2;
-}
-
-/**
  * eeh_token_to_phys - Convert EEH address token to phys address
  * @token: I/O token, should be address in the form 0xA
  *
@@ -640,7 +598,7 @@ int eeh_pci_enable(struct pci_dn *pdn, int function)
 		printk(KERN_WARNING "EEH: Unexpected state change %d, err=%d dn=%s\n",
 		        function, rc, pdn->node->full_name);
 
-	rc = eeh_wait_for_slot_status(pdn, PCI_BUS_RESET_WAIT_MSEC);
+	rc = eeh_ops->wait_state(pdn->node, PCI_BUS_RESET_WAIT_MSEC);
 	if (rc > 0 && (rc & EEH_STATE_MMIO_ENABLED) &&
 	   (function == EEH_OPT_THAW_MMIO))
 		return 0;
@@ -838,7 +796,7 @@ int eeh_reset_pe(struct pci_dn *pdn)
 	for (i=0; i<3; i++) {
 		eeh_reset_pe_once(pdn);
 
-		rc = eeh_wait_for_slot_status(pdn, PCI_BUS_RESET_WAIT_MSEC);
+		rc = eeh_ops->wait_state(pdn->node, PCI_BUS_RESET_WAIT_MSEC);
 		if (rc == (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE))
 			return 0;
 
diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c
index 4c6e0c1c..584defe 100644
--- a/arch/powerpc/platforms/pseries/eeh_driver.c
+++ b/arch/powerpc/platforms/pseries/eeh_driver.c
@@ -396,7 +396,7 @@ struct pci_dn * handle_eeh_events (struct eeh_event *event)
 
 	/* Get the current PCI slot state. This can take a long time,
 	 * sometimes over 3 seconds for certain systems. */
-	rc = eeh_wait_for_slot_status (frozen_pdn, MAX_WAIT_FOR_RECOVERY*1000);
+	rc = eeh_ops->wait_state(frozen_pdn->node, MAX_WAIT_FOR_RECOVERY*1000);
 	if (rc < 0 || rc == EEH_STATE_NOT_SUPPORT) {
 		printk(KERN_WARNING "EEH: Permanent failure\n");
 		goto hard_fail;
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c 

[PATCH 21/21] pSeries platform config space access in EEH

2012-02-24 Thread Gavin Shan
With the original EEH implementation, access to the config space of
the corresponding PCI device is done through RTAS-specific functions,
which depend heavily on pci_dn. That would limit extending EEH to other
platforms like powernv, because other platforms might have different
ways to access PCI config space.

The patch splits out the functions used to access PCI config space
and implements them in the platform related EEH component. That will
help to support EEH on multiple platforms simultaneously in the future.
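
As an illustration of the split described above, here is a small user-space
sketch in which the core code reads config space only through an operations
table. Everything in it is hypothetical (struct demo_node, struct demo_ops,
the fake_* backend and demo_read_vendor()); the fake backend keeps the
registers in a byte array instead of calling into firmware.

```c
#include <stdint.h>

#define DEMO_CFG_SIZE 256

/* Hypothetical device node that carries its own fake config space. */
struct demo_node {
	uint8_t cfg[DEMO_CFG_SIZE];
};

/* Hypothetical subset of a platform operations table. */
struct demo_ops {
	int (*read_config)(struct demo_node *dn, int where, int size, uint32_t *val);
	int (*write_config)(struct demo_node *dn, int where, int size, uint32_t val);
};

static int demo_check(int where, int size)
{
	if (size != 1 && size != 2 && size != 4)
		return -1;
	if (where < 0 || where + size > DEMO_CFG_SIZE)
		return -1;
	return 0;
}

/* Fake backend: config space is stored little-endian in memory. */
static int fake_read_config(struct demo_node *dn, int where, int size, uint32_t *val)
{
	uint32_t v = 0;
	int i;

	if (demo_check(where, size))
		return -1;
	for (i = 0; i < size; i++)
		v |= (uint32_t)dn->cfg[where + i] << (8 * i);
	*val = v;
	return 0;
}

static int fake_write_config(struct demo_node *dn, int where, int size, uint32_t val)
{
	int i;

	if (demo_check(where, size))
		return -1;
	for (i = 0; i < size; i++)
		dn->cfg[where + i] = (uint8_t)(val >> (8 * i));
	return 0;
}

static const struct demo_ops fake_ops = {
	.read_config	= fake_read_config,
	.write_config	= fake_write_config,
};

/* The "core" side only ever sees this pointer, never the backend. */
const struct demo_ops *demo_ops = &fake_ops;

/* Core-side helper: read the 32-bit device/vendor word at offset 0. */
int demo_read_vendor(struct demo_node *dn, uint32_t *out)
{
	return demo_ops->read_config(dn, 0x00, 4, out);
}
```

Swapping in another platform would then mean pointing demo_ops at a different
table; demo_read_vendor() would not need to change.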

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh.h   |2 +
 arch/powerpc/platforms/pseries/eeh.c |   32 ++--
 arch/powerpc/platforms/pseries/eeh_pseries.c |   40 +-
 3 files changed, 57 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 226c9a5..121fa54 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -96,6 +96,8 @@ struct eeh_ops {
 	int (*wait_state)(struct device_node *dn, int max_wait);
 	int (*get_log)(struct device_node *dn, int severity, char *drv_log, unsigned long len);
 	int (*configure_bridge)(struct device_node *dn);
+	int (*read_config)(struct device_node *dn, int where, int size, u32 *val);
+	int (*write_config)(struct device_node *dn, int where, int size, u32 val);
 };
 
 /*
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index a8a8c27..899df26 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -127,11 +127,11 @@ static size_t eeh_gather_pci_data(struct eeh_dev *edev, char * buf, size_t len)
 	n += scnprintf(buf+n, len-n, "%s\n", dn->full_name);
 	printk(KERN_WARNING "EEH: of node=%s\n", dn->full_name);
 
-	rtas_read_config(PCI_DN(dn), PCI_VENDOR_ID, 4, &cfg);
+	eeh_ops->read_config(dn, PCI_VENDOR_ID, 4, &cfg);
 	n += scnprintf(buf+n, len-n, "dev/vend:%08x\n", cfg);
 	printk(KERN_WARNING "EEH: PCI device/vendor: %08x\n", cfg);
 
-	rtas_read_config(PCI_DN(dn), PCI_COMMAND, 4, &cfg);
+	eeh_ops->read_config(dn, PCI_COMMAND, 4, &cfg);
 	n += scnprintf(buf+n, len-n, "cmd/stat:%x\n", cfg);
 	printk(KERN_WARNING "EEH: PCI cmd/status register: %08x\n", cfg);
 
@@ -142,11 +142,11 @@ static size_t eeh_gather_pci_data(struct eeh_dev *edev, char * buf, size_t len)
 
 	/* Gather bridge-specific registers */
 	if (dev->class >> 16 == PCI_BASE_CLASS_BRIDGE) {
-		rtas_read_config(PCI_DN(dn), PCI_SEC_STATUS, 2, &cfg);
+		eeh_ops->read_config(dn, PCI_SEC_STATUS, 2, &cfg);
 		n += scnprintf(buf+n, len-n, "sec stat:%x\n", cfg);
 		printk(KERN_WARNING "EEH: Bridge secondary status: %04x\n", cfg);
 
-		rtas_read_config(PCI_DN(dn), PCI_BRIDGE_CONTROL, 2, &cfg);
+		eeh_ops->read_config(dn, PCI_BRIDGE_CONTROL, 2, &cfg);
 		n += scnprintf(buf+n, len-n, "brdg ctl:%x\n", cfg);
 		printk(KERN_WARNING "EEH: Bridge control: %04x\n", cfg);
 	}
@@ -154,11 +154,11 @@ static size_t eeh_gather_pci_data(struct eeh_dev *edev, char * buf, size_t len)
 	/* Dump out the PCI-X command and status regs */
 	cap = pci_find_capability(dev, PCI_CAP_ID_PCIX);
 	if (cap) {
-		rtas_read_config(PCI_DN(dn), cap, 4, &cfg);
+		eeh_ops->read_config(dn, cap, 4, &cfg);
 		n += scnprintf(buf+n, len-n, "pcix-cmd:%x\n", cfg);
 		printk(KERN_WARNING "EEH: PCI-X cmd: %08x\n", cfg);
 
-		rtas_read_config(PCI_DN(dn), cap+4, 4, &cfg);
+		eeh_ops->read_config(dn, cap+4, 4, &cfg);
 		n += scnprintf(buf+n, len-n, "pcix-stat:%x\n", cfg);
 		printk(KERN_WARNING "EEH: PCI-X status: %08x\n", cfg);
 	}
@@ -171,7 +171,7 @@ static size_t eeh_gather_pci_data(struct eeh_dev *edev, char * buf, size_t len)
 		       "EEH: PCI-E capabilities and status follow:\n");
 
 		for (i=0; i<=8; i++) {
-			rtas_read_config(PCI_DN(dn), cap+4*i, 4, &cfg);
+			eeh_ops->read_config(dn, cap+4*i, 4, &cfg);
 			n += scnprintf(buf+n, len-n, "%02x:%x\n", 4*i, cfg);
 			printk(KERN_WARNING "EEH: PCI-E %02x: %08x\n", i, cfg);
 		}
@@ -183,7 +183,7 @@ static size_t eeh_gather_pci_data(struct eeh_dev *edev, char * buf, size_t len)
 			       "EEH: PCI-E AER capability register set follows:\n");
 
 			for (i=0; i<14; i++) {
-				rtas_read_config(PCI_DN(dn), cap+4*i, 4, &cfg);
+				eeh_ops->read_config(dn, cap+4*i, 4, &cfg);
 				n += scnprintf(buf+n, len-n, "%02x:%x\n", 4*i, cfg);
 				printk(KERN_WARNING "EEH: PCI-E AER %02x: %08x\n", i, cfg);
 			}
@@ 

[PATCH 10/21] pSeries platform EEH error log retrieval

2012-02-24 Thread Gavin Shan
On the RTAS compliant pSeries platform, one dedicated RTAS call has
been introduced to retrieve the EEH temporary or permanent error log.

The patch implements retrieving the EEH error log through that
RTAS call. Besides, it has been abstracted by struct eeh_ops::get_log
so that the EEH core components can support multiple platforms in the future.

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh.h   |2 +
 arch/powerpc/include/asm/ppc-pci.h   |2 -
 arch/powerpc/platforms/pseries/eeh.c |   63 +-
 arch/powerpc/platforms/pseries/eeh_driver.c  |4 +-
 arch/powerpc/platforms/pseries/eeh_pseries.c |   47 +++-
 5 files changed, 51 insertions(+), 67 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 894ea6c..ad8f318 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -52,6 +52,8 @@ struct device_node;
 #define EEH_RESET_DEACTIVATE   0   /* Deactivate the PE reset  */
 #define EEH_RESET_HOT  1   /* Hot reset*/
 #define EEH_RESET_FUNDAMENTAL  3   /* Fundamental reset*/
+#define EEH_LOG_TEMP   1   /* EEH temporary error log  */
+#define EEH_LOG_PERM   2   /* EEH permanent error log  */
 
 struct eeh_ops {
char *name;
diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 1cfb2b0..bd1a84f 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -53,8 +53,6 @@ void pci_addr_cache_insert_device(struct pci_dev *dev);
 void pci_addr_cache_remove_device(struct pci_dev *dev);
 void pci_addr_cache_build(void);
 struct pci_dev *pci_get_device_by_addr(unsigned long addr);
-#define EEH_LOG_TEMP_FAILURE 1
-#define EEH_LOG_PERM_FAILURE 2
 void eeh_slot_error_detail (struct pci_dn *pdn, int severity);
 int eeh_pci_enable(struct pci_dn *pdn, int function);
 int eeh_reset_pe(struct pci_dn *);
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 4f329f5..39fcecb 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -87,7 +87,6 @@
 #define PCI_BUS_RESET_WAIT_MSEC (60*1000)
 
 /* RTAS tokens */
-static int ibm_slot_error_detail;
 static int ibm_configure_bridge;
 static int ibm_configure_pe;
 
@@ -100,14 +99,6 @@ EXPORT_SYMBOL(eeh_subsystem_enabled);
 /* Lock to avoid races due to multiple reports of an error */
 static DEFINE_RAW_SPINLOCK(confirm_error_lock);
 
-/* Buffer for reporting slot-error-detail rtas calls. Its here
- * in BSS, and not dynamically alloced, so that it ends up in
- * RMO where RTAS can access it.
- */
-static unsigned char slot_errbuf[RTAS_ERROR_LOG_MAX];
-static DEFINE_SPINLOCK(slot_errbuf_lock);
-static int eeh_error_buf_size;
-
 /* Buffer for reporting pci register dumps. Its here in BSS, and
  * not dynamically alloced, so that it ends up in RMO where RTAS
  * can access it.
@@ -127,46 +118,6 @@ static unsigned long slot_resets;
 #define IS_BRIDGE(class_code) (((class_code)>>16) == PCI_BASE_CLASS_BRIDGE)
 
 /**
- * eeh_rtas_slot_error_detail - Retrieve error log through RTAS call
- * @pdn: device node
- * @severity: temporary or permanent error log
- * @driver_log: driver log to be combined with the retrieved error log
- * @loglen: length of driver log
- *
- * This routine should be called to retrieve error log through the dedicated
- * RTAS call.
- */
-static void eeh_rtas_slot_error_detail(struct pci_dn *pdn, int severity,
-					char *driver_log, size_t loglen)
-{
-	int config_addr;
-	unsigned long flags;
-	int rc;
-
-	/* Log the error with the rtas logger */
-	spin_lock_irqsave(&slot_errbuf_lock, flags);
-	memset(slot_errbuf, 0, eeh_error_buf_size);
-
-	/* Use PE configuration address, if present */
-	config_addr = pdn->eeh_config_addr;
-	if (pdn->eeh_pe_config_addr)
-		config_addr = pdn->eeh_pe_config_addr;
-
-	rc = rtas_call(ibm_slot_error_detail,
-		       8, 1, NULL, config_addr,
-		       BUID_HI(pdn->phb->buid),
-		       BUID_LO(pdn->phb->buid),
-		       virt_to_phys(driver_log), loglen,
-		       virt_to_phys(slot_errbuf),
-		       eeh_error_buf_size,
-		       severity);
-
-	if (rc == 0)
-		log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0);
-	spin_unlock_irqrestore(&slot_errbuf_lock, flags);
-}
-
-/**
  * eeh_gather_pci_data - Copy assorted PCI config space registers to buff
  * @pdn: device to report data for
  * @buf: point to buffer in which to log
@@ -282,7 +233,7 @@ void eeh_slot_error_detail(struct pci_dn *pdn, int severity)
eeh_restore_bars(pdn);
loglen = eeh_gather_pci_data(pdn, pci_regs_buf, EEH_PCI_REGS_LOG_LEN);
 
-   

[PATCH 17/21] Replace pci_dn with eeh_dev for EEH core

2012-02-24 Thread Gavin Shan
The original EEH implementation depends heavily on struct pci_dn, and
EEH related information has to be put into pci_dn. Instead, we can
split struct pci_dn so that the EEH-specific information forms an
individual struct, which makes EEH more independent.

The patch replaces pci_dn with eeh_dev for EEH core.
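
To make the idea of the split concrete, here is a user-space sketch of an
EEH-specific struct that points back at the device tree node and the PCI
device through accessor macros. The layout and all demo_/DEMO_ names are
assumptions made for illustration; they are not the struct eeh_dev
definition from this series.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct device_node and struct pci_dev. */
struct demo_of_node {
	const char *full_name;
};

struct demo_pci_dev {
	unsigned int class_code;
};

/* EEH-only state lives here, with back pointers to the other objects. */
struct demo_eeh_dev {
	int mode;			/* EEH mode flags */
	int config_addr;		/* config address used for EEH calls */
	struct demo_of_node *dn;	/* associated device tree node */
	struct demo_pci_dev *pdev;	/* associated PCI device, may be NULL */
};

#define DEMO_EDEV_TO_OF_NODE(edev)	((edev)->dn)
#define DEMO_EDEV_TO_PCI_DEV(edev)	((edev)->pdev)

/* Core-side helper that only depends on the EEH struct and its accessors. */
const char *demo_edev_name(const struct demo_eeh_dev *edev)
{
	const struct demo_of_node *dn = DEMO_EDEV_TO_OF_NODE(edev);

	return (dn && dn->full_name) ? dn->full_name : "<unknown>";
}
```

Going through accessor macros keeps callers independent of how the back
pointers are stored.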

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/ppc-pci.h   |8 +-
 arch/powerpc/platforms/pseries/eeh.c |  269 ++
 2 files changed, 144 insertions(+), 133 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index c02d5a7..8c003eb 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -53,10 +53,10 @@ void pci_addr_cache_build(void);
 void pci_addr_cache_insert_device(struct pci_dev *dev);
 void pci_addr_cache_remove_device(struct pci_dev *dev);
 struct pci_dev *pci_addr_cache_get_device(unsigned long addr);
-void eeh_slot_error_detail (struct pci_dn *pdn, int severity);
-int eeh_pci_enable(struct pci_dn *pdn, int function);
-int eeh_reset_pe(struct pci_dn *);
-void eeh_restore_bars(struct pci_dn *);
+void eeh_slot_error_detail(struct eeh_dev *pdn, int severity);
+int eeh_pci_enable(struct eeh_dev *edev, int function);
+int eeh_reset_pe(struct eeh_dev *);
+void eeh_restore_bars(struct eeh_dev *);
 int rtas_write_config(struct pci_dn *, int where, int size, u32 val);
 int rtas_read_config(struct pci_dn *, int where, int size, u32 *val);
 void eeh_mark_slot(struct device_node *dn, int mode_flag);
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 646b520..84a8a0c 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -115,28 +115,29 @@ static unsigned long slot_resets;
 
 /**
  * eeh_gather_pci_data - Copy assorted PCI config space registers to buff
- * @pdn: device to report data for
+ * @edev: device to report data for
  * @buf: point to buffer in which to log
  * @len: amount of room in buffer
  *
  * This routine captures assorted PCI configuration space data,
  * and puts them into a buffer for RTAS error logging.
  */
-static size_t eeh_gather_pci_data(struct pci_dn *pdn, char * buf, size_t len)
+static size_t eeh_gather_pci_data(struct eeh_dev *edev, char * buf, size_t len)
 {
-	struct pci_dev *dev = pdn->pcidev;
+	struct device_node *dn = EEH_DEV_TO_OF_NODE(edev);
+	struct pci_dev *dev = EEH_DEV_TO_PCI_DEV(edev);
 	u32 cfg;
 	int cap, i;
 	int n = 0;
 
-	n += scnprintf(buf+n, len-n, "%s\n", pdn->node->full_name);
-	printk(KERN_WARNING "EEH: of node=%s\n", pdn->node->full_name);
+	n += scnprintf(buf+n, len-n, "%s\n", dn->full_name);
+	printk(KERN_WARNING "EEH: of node=%s\n", dn->full_name);
 
-	rtas_read_config(pdn, PCI_VENDOR_ID, 4, &cfg);
+	rtas_read_config(PCI_DN(dn), PCI_VENDOR_ID, 4, &cfg);
 	n += scnprintf(buf+n, len-n, "dev/vend:%08x\n", cfg);
 	printk(KERN_WARNING "EEH: PCI device/vendor: %08x\n", cfg);
 
-	rtas_read_config(pdn, PCI_COMMAND, 4, &cfg);
+	rtas_read_config(PCI_DN(dn), PCI_COMMAND, 4, &cfg);
 	n += scnprintf(buf+n, len-n, "cmd/stat:%x\n", cfg);
 	printk(KERN_WARNING "EEH: PCI cmd/status register: %08x\n", cfg);
 
@@ -147,11 +148,11 @@ static size_t eeh_gather_pci_data(struct pci_dn *pdn, char * buf, size_t len)
 
 	/* Gather bridge-specific registers */
 	if (dev->class >> 16 == PCI_BASE_CLASS_BRIDGE) {
-		rtas_read_config(pdn, PCI_SEC_STATUS, 2, &cfg);
+		rtas_read_config(PCI_DN(dn), PCI_SEC_STATUS, 2, &cfg);
 		n += scnprintf(buf+n, len-n, "sec stat:%x\n", cfg);
 		printk(KERN_WARNING "EEH: Bridge secondary status: %04x\n", cfg);
 
-		rtas_read_config(pdn, PCI_BRIDGE_CONTROL, 2, &cfg);
+		rtas_read_config(PCI_DN(dn), PCI_BRIDGE_CONTROL, 2, &cfg);
 		n += scnprintf(buf+n, len-n, "brdg ctl:%x\n", cfg);
 		printk(KERN_WARNING "EEH: Bridge control: %04x\n", cfg);
 	}
@@ -159,11 +160,11 @@ static size_t eeh_gather_pci_data(struct pci_dn *pdn, char * buf, size_t len)
 	/* Dump out the PCI-X command and status regs */
 	cap = pci_find_capability(dev, PCI_CAP_ID_PCIX);
 	if (cap) {
-		rtas_read_config(pdn, cap, 4, &cfg);
+		rtas_read_config(PCI_DN(dn), cap, 4, &cfg);
 		n += scnprintf(buf+n, len-n, "pcix-cmd:%x\n", cfg);
 		printk(KERN_WARNING "EEH: PCI-X cmd: %08x\n", cfg);
 
-		rtas_read_config(pdn, cap+4, 4, &cfg);
+		rtas_read_config(PCI_DN(dn), cap+4, 4, &cfg);
 		n += scnprintf(buf+n, len-n, "pcix-stat:%x\n", cfg);
 		printk(KERN_WARNING "EEH: PCI-X status: %08x\n", cfg);
 	}
@@ -176,7 +177,7 @@ static size_t eeh_gather_pci_data(struct pci_dn *pdn, char * buf, size_t len)
   EEH: PCI-E capabilities and status 

[PATCH 09/21] pSeries platform EEH reset PE

2012-02-24 Thread Gavin Shan
On the RTAS compliant pSeries platform, there is a dedicated RTAS call
(ibm,set-slot-reset) to reset the specified PE. Furthermore, two
types of reset are supported: hot and fundamental. Which type of
reset is actually used depends on the requirements of the PCI devices
included in the PE.

The patch implements resetting the PE on the pSeries platform through
the RTAS call. Besides, it has been abstracted through struct eeh_ops::reset
so that the EEH core components can support multiple platforms in the future.
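
The fallback from fundamental reset to hot reset can be sketched in user
space as follows. The reset option values (0, 1, 3) and the -8 return code
mirror what this patch uses; struct demo_pe, demo_fw_set_reset() and
demo_reset() are hypothetical, and the firmware call is simulated.

```c
#define DEMO_RESET_DEACTIVATE	0	/* Deactivate the PE reset */
#define DEMO_RESET_HOT		1	/* Hot reset */
#define DEMO_RESET_FUNDAMENTAL	3	/* Fundamental reset */
#define DEMO_FW_UNSUPPORTED	(-8)	/* firmware: option not supported */

/* Hypothetical PE: records the last reset option the firmware accepted. */
struct demo_pe {
	int supports_fundamental;
	int last_reset;
};

/* Simulated firmware call; a real platform would issue an RTAS call here. */
static int demo_fw_set_reset(struct demo_pe *pe, int option)
{
	if (option == DEMO_RESET_FUNDAMENTAL && !pe->supports_fundamental)
		return DEMO_FW_UNSUPPORTED;
	pe->last_reset = option;
	return 0;
}

/* Platform reset operation with the fundamental-to-hot fallback. */
int demo_reset(struct demo_pe *pe, int option)
{
	int rc = demo_fw_set_reset(pe, option);

	/* Fundamental reset not supported on this PE, try hot reset */
	if (rc == DEMO_FW_UNSUPPORTED && option == DEMO_RESET_FUNDAMENTAL)
		rc = demo_fw_set_reset(pe, DEMO_RESET_HOT);

	return rc;
}
```

Keeping the fallback inside the platform operation means the core code can
ask for a fundamental reset without knowing whether the PE supports it.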

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh.h   |3 +
 arch/powerpc/platforms/pseries/eeh.c |   63 +++---
 arch/powerpc/platforms/pseries/eeh_pseries.c |   25 ++-
 3 files changed, 33 insertions(+), 58 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 1d3c9e5..894ea6c 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -49,6 +49,9 @@ struct device_node;
 #define EEH_STATE_DMA_ACTIVE	(1 << 4)	/* Active DMA	*/
 #define EEH_STATE_MMIO_ENABLED	(1 << 5)	/* MMIO enabled	*/
 #define EEH_STATE_DMA_ENABLED	(1 << 6)	/* DMA enabled	*/
+#define EEH_RESET_DEACTIVATE   0   /* Deactivate the PE reset  */
+#define EEH_RESET_HOT  1   /* Hot reset*/
+#define EEH_RESET_FUNDAMENTAL  3   /* Fundamental reset*/
 
 struct eeh_ops {
char *name;
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index b5b03d4..4f329f5 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -87,7 +87,6 @@
 #define PCI_BUS_RESET_WAIT_MSEC (60*1000)
 
 /* RTAS tokens */
-static int ibm_set_slot_reset;
 static int ibm_slot_error_detail;
 static int ibm_configure_bridge;
 static int ibm_configure_pe;
@@ -607,54 +606,6 @@ int eeh_pci_enable(struct pci_dn *pdn, int function)
 }
 
 /**
- * eeh_slot_reset - Raises/Lowers the pci #RST line
- * @pdn: pci device node
- * @state: 1/0 to raise/lower the #RST
- *
- * Clear the EEH-frozen condition on a slot.  This routine
- * asserts the PCI #RST line if the 'state' argument is '1',
- * and drops the #RST line if 'state is '0'.  This routine is
- * safe to call in an interrupt context.
- */
-static void eeh_slot_reset(struct pci_dn *pdn, int state)
-{
-	int config_addr;
-	int rc;
-
-	BUG_ON(pdn==NULL);
-
-	if (!pdn->phb) {
-		printk(KERN_WARNING "EEH: in slot reset, device node %s has no phb\n",
-			pdn->node->full_name);
-		return;
-	}
-
-	/* Use PE configuration address, if present */
-	config_addr = pdn->eeh_config_addr;
-	if (pdn->eeh_pe_config_addr)
-		config_addr = pdn->eeh_pe_config_addr;
-
-	rc = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
-		       config_addr,
-		       BUID_HI(pdn->phb->buid),
-		       BUID_LO(pdn->phb->buid),
-		       state);
-
-	/* Fundamental-reset not supported on this PE, try hot-reset */
-	if (rc == -8 && state == 3) {
-		rc = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
-			       config_addr,
-			       BUID_HI(pdn->phb->buid),
-			       BUID_LO(pdn->phb->buid), 1);
-		if (rc)
-			printk(KERN_WARNING
-				"EEH: Unable to reset the failed slot,"
-				"#RST=%d dn=%s\n",
-				rc, pdn->node->full_name);
-	}
-}
-
-/**
  * pcibios_set_pcie_slot_reset - Set PCI-E reset state
  * @dev: pci device struct
  * @state: reset state to enter
@@ -665,17 +616,16 @@ static void eeh_slot_reset(struct pci_dn *pdn, int state)
 int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state state)
 {
 	struct device_node *dn = pci_device_to_OF_node(dev);
-	struct pci_dn *pdn = PCI_DN(dn);
 
 	switch (state) {
 	case pcie_deassert_reset:
-		eeh_slot_reset(pdn, 0);
+		eeh_ops->reset(dn, EEH_RESET_DEACTIVATE);
 		break;
 	case pcie_hot_reset:
-		eeh_slot_reset(pdn, 1);
+		eeh_ops->reset(dn, EEH_RESET_HOT);
 		break;
 	case pcie_warm_reset:
-		eeh_slot_reset(pdn, 3);
+		eeh_ops->reset(dn, EEH_RESET_FUNDAMENTAL);
 		break;
 	default:
 		return -EINVAL;
@@ -754,9 +704,9 @@ static void eeh_reset_pe_once(struct pci_dn *pdn)
 	eeh_set_pe_freset(pdn->node, &freset);
 
 	if (freset)
-		eeh_slot_reset(pdn, 3);
+		eeh_ops->reset(pdn->node, EEH_RESET_FUNDAMENTAL);
 	else
-		eeh_slot_reset(pdn, 1);
+		eeh_ops->reset(pdn->node, EEH_RESET_HOT);
 
/* The PCI bus requires that the reset be held 

[PATCH 02/21] Cleanup on function names of EEH core

2012-02-24 Thread Gavin Shan
EEH has been implemented on the pSeries platform. The original
code is a little bit messy. The patch cleans up the current EEH
implementation so that it looks cleaner.

* Add the eeh prefix to function names where possible.
* Some function names have been adjusted so that they look
  shorter and more meaningful.

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/ppc-pci.h  |8 +-
 arch/powerpc/platforms/pseries/eeh.c|  102 +--
 arch/powerpc/platforms/pseries/eeh_driver.c |   10 ++--
 arch/powerpc/platforms/pseries/msi.c|2 +-
 4 files changed, 59 insertions(+), 63 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 221d82f..605a970 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -58,16 +58,16 @@ struct pci_dev *pci_get_device_by_addr(unsigned long addr);
 void eeh_slot_error_detail (struct pci_dn *pdn, int severity);
 #define EEH_THAW_MMIO 2
 #define EEH_THAW_DMA  3
-int rtas_pci_enable(struct pci_dn *pdn, int function);
-int rtas_set_slot_reset (struct pci_dn *);
+int eeh_pci_enable(struct pci_dn *pdn, int function);
+int eeh_reset_pe(struct pci_dn *);
 int eeh_wait_for_slot_status(struct pci_dn *pdn, int max_wait_msecs);
 void eeh_restore_bars(struct pci_dn *);
-void rtas_configure_bridge(struct pci_dn *);
+void eeh_configure_bridge(struct pci_dn *);
 int rtas_write_config(struct pci_dn *, int where, int size, u32 val);
 int rtas_read_config(struct pci_dn *, int where, int size, u32 *val);
 void eeh_mark_slot(struct device_node *dn, int mode_flag);
 void eeh_clear_slot(struct device_node *dn, int mode_flag);
-struct device_node *find_device_pe(struct device_node *dn);
+struct device_node *eeh_find_device_pe(struct device_node *dn);
 
 void eeh_sysfs_add_device(struct pci_dev *pdev);
 void eeh_sysfs_remove_device(struct pci_dev *pdev);
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 5f6d37b..fa88589 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -130,7 +130,7 @@ static unsigned long slot_resets;
 #define IS_BRIDGE(class_code) (((class_code)>>16) == PCI_BASE_CLASS_BRIDGE)
 
 /**
- * rtas_slot_error_detail - Retrieve error log through RTAS call
+ * eeh_rtas_slot_error_detail - Retrieve error log through RTAS call
  * @pdn: device node
  * @severity: temporary or permanent error log
  * @driver_log: driver log to be combined with the retrieved error log
@@ -139,7 +139,7 @@ static unsigned long slot_resets;
  * This routine should be called to retrieve error log through the dedicated
  * RTAS call.
  */
-static void rtas_slot_error_detail(struct pci_dn *pdn, int severity,
+static void eeh_rtas_slot_error_detail(struct pci_dn *pdn, int severity,
char *driver_log, size_t loglen)
 {
int config_addr;
@@ -170,7 +170,7 @@ static void rtas_slot_error_detail(struct pci_dn *pdn, int severity,
 }
 
 /**
- * gather_pci_data - Copy assorted PCI config space registers to buff
+ * eeh_gather_pci_data - Copy assorted PCI config space registers to buff
  * @pdn: device to report data for
  * @buf: point to buffer in which to log
  * @len: amount of room in buffer
@@ -178,7 +178,7 @@ static void rtas_slot_error_detail(struct pci_dn *pdn, int severity,
  * This routine captures assorted PCI configuration space data,
  * and puts them into a buffer for RTAS error logging.
  */
-static size_t gather_pci_data(struct pci_dn *pdn, char * buf, size_t len)
+static size_t eeh_gather_pci_data(struct pci_dn *pdn, char * buf, size_t len)
 {
 	struct pci_dev *dev = pdn->pcidev;
 	u32 cfg;
@@ -258,7 +258,7 @@ static size_t gather_pci_data(struct pci_dn *pdn, char * buf, size_t len)
 		for_each_child_of_node(pdn->node, dn) {
 			pdn = PCI_DN(dn);
 			if (pdn)
-				n += gather_pci_data(pdn, buf+n, len-n);
+				n += eeh_gather_pci_data(pdn, buf+n, len-n);
 		}
 	}
 
@@ -280,23 +280,23 @@ void eeh_slot_error_detail(struct pci_dn *pdn, int severity)
size_t loglen = 0;
pci_regs_buf[0] = 0;
 
-   rtas_pci_enable(pdn, EEH_THAW_MMIO);
-   rtas_configure_bridge(pdn);
+   eeh_pci_enable(pdn, EEH_THAW_MMIO);
+   eeh_configure_bridge(pdn);
eeh_restore_bars(pdn);
-   loglen = gather_pci_data(pdn, pci_regs_buf, EEH_PCI_REGS_LOG_LEN);
+   loglen = eeh_gather_pci_data(pdn, pci_regs_buf, EEH_PCI_REGS_LOG_LEN);
 
-   rtas_slot_error_detail(pdn, severity, pci_regs_buf, loglen);
+   eeh_rtas_slot_error_detail(pdn, severity, pci_regs_buf, loglen);
 }
 
 /**
- * read_slot_reset_state - Read the reset state of a device node's slot
+ * eeh_read_slot_reset_state - Read the reset state of a device node's slot
  * @dn: device node to read
  

[PATCH 03/21] Platform dependent EEH operations

2012-02-24 Thread Gavin Shan
EEH has so far been implemented only on the RTAS-compliant pSeries
platform, so every EEH operation eventually ends up as an RTAS call.
That limits how EEH can be extended. In order to support EEH on
multiple platforms, such as pseries and powernv, simultaneously, we
have to split the platform dependent EEH operations out of the
current implementation.

The patch addresses supporting EEH on multiple platforms. The platform
dependent EEH operations are abstracted by struct eeh_ops, and the
EEH core components are built on top of the registered EEH operations.
With this mechanism, all an individual platform needs to do is implement
its platform dependent EEH operations.

For now, only the pseries platform is covered by the mechanism, so
other platforms that need EEH, like powernv, still have to be considered.
Besides, this patch only provides the framework; the implementation of
the operations for the pseries platform follows later.
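
As a rough illustration of the registration mechanism described above,
here is a minimal user-space sketch. It is not the code from this patch:
struct device_node is a stand-in, the callback table is trimmed to three
members, and the register/unregister policy (one named backend at a time),
eeh_core_set_option() and the toy_* backend are assumptions made for the
example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's struct device_node, so the sketch compiles alone. */
struct device_node {
	const char *full_name;
};

/* Trimmed-down version of the callback table from this patch. */
struct eeh_ops {
	const char *name;
	int (*init)(void);
	int (*set_option)(struct device_node *dn, int option);
};

/* The one registered backend; NULL until a platform registers. */
static struct eeh_ops *eeh_ops;

/* Assumed policy: a backend needs a name, and only one may be registered. */
static int eeh_ops_register(struct eeh_ops *ops)
{
	if (!ops || !ops->name)
		return -EINVAL;
	if (eeh_ops && eeh_ops != ops)
		return -EEXIST;
	eeh_ops = ops;
	return 0;
}

/* Unregister by name; fails if that backend is not the registered one. */
static int eeh_ops_unregister(const char *name)
{
	if (!name || !eeh_ops || strcmp(eeh_ops->name, name) != 0)
		return -EINVAL;
	eeh_ops = NULL;
	return 0;
}

/* Core-side call site (hypothetical): only the registered table is used. */
static int eeh_core_set_option(struct device_node *dn, int option)
{
	assert(eeh_ops && eeh_ops->set_option);
	return eeh_ops->set_option(dn, option);
}

/* Toy backend standing in for pseries: accepts options 0..3. */
static int toy_init(void)
{
	return 0;
}

static int toy_set_option(struct device_node *dn, int option)
{
	(void)dn;
	return (option >= 0 && option <= 3) ? 0 : -EINVAL;
}

static struct eeh_ops toy_pseries_ops = {
	.name = "pseries",
	.init = toy_init,
	.set_option = toy_set_option,
};
```

The point of the design is that the core only ever dereferences eeh_ops,
so adding a second platform means supplying another table rather than
touching the core.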

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh.h   |   32 +
 arch/powerpc/platforms/pseries/Makefile  |2 +-
 arch/powerpc/platforms/pseries/eeh.c |   53 
 arch/powerpc/platforms/pseries/eeh_pseries.c |  183 ++
 arch/powerpc/platforms/pseries/setup.c   |1 +
 5 files changed, 270 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/platforms/pseries/eeh_pseries.c

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 2328877..0666c52 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -31,6 +31,26 @@ struct device_node;
 
 #ifdef CONFIG_EEH
 
+/*
+ * The struct is used to trace the registered EEH operation
+ * callback functions. Actually, those operation callback
+ * functions are heavily platform dependent. That means the
+ * platform should register its own EEH operation callback
+ * functions before any EEH further operations.
+ */
+struct eeh_ops {
+   char *name;
+   int (*init)(void);
+   int (*set_option)(struct device_node *dn, int option);
+   int (*get_pe_addr)(struct device_node *dn);
+   int (*get_state)(struct device_node *dn, int *state);
+   int (*reset)(struct device_node *dn, int option);
+   int (*wait_state)(struct device_node *dn, int max_wait);
+   int (*get_log)(struct device_node *dn, int severity, char *drv_log, unsigned long len);
+   int (*configure_bridge)(struct device_node *dn);
+};
+
+extern struct eeh_ops *eeh_ops;
 extern int eeh_subsystem_enabled;
 
 /* Values for eeh_mode bits in device_node */
@@ -47,6 +67,11 @@ extern int eeh_subsystem_enabled;
 #define EEH_MAX_ALLOWED_FREEZES 5
 
 void __init eeh_init(void);
+#ifdef CONFIG_PPC_PSERIES
+int __init eeh_pseries_init(void);
+#endif
+int __init eeh_ops_register(struct eeh_ops *ops);
+int __exit eeh_ops_unregister(const char *name);
 unsigned long eeh_check_failure(const volatile void __iomem *token,
unsigned long val);
 int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev);
@@ -73,6 +98,13 @@ void eeh_remove_bus_device(struct pci_dev *);
 #else /* !CONFIG_EEH */
 static inline void eeh_init(void) { }
 
+#ifdef CONFIG_PPC_PSERIES
+static inline int eeh_pseries_init(void)
+{
+   return 0;
+}
+#endif /* CONFIG_PPC_PSERIES */
+
 static inline unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val)
 {
return val;
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index 236db46..9aa5581 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -6,7 +6,7 @@ obj-y   := lpar.o hvCall.o nvram.o reconfig.o \
   firmware.o power.o dlpar.o mobility.o
 obj-$(CONFIG_SMP)  += smp.o
 obj-$(CONFIG_SCANLOG)  += scanlog.o
-obj-$(CONFIG_EEH)  += eeh.o eeh_cache.o eeh_driver.o eeh_event.o eeh_sysfs.o
+obj-$(CONFIG_EEH)  += eeh.o eeh_cache.o eeh_driver.o eeh_event.o eeh_sysfs.o eeh_pseries.o
 obj-$(CONFIG_KEXEC)+= kexec.o
 obj-$(CONFIG_PCI)  += pci.o pci_dlpar.o
 obj-$(CONFIG_PSERIES_MSI)  += msi.o
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index fa88589..b0e3fb0 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -97,6 +97,9 @@ static int ibm_get_config_addr_info2;
 static int ibm_configure_bridge;
 static int ibm_configure_pe;
 
+/* Platform dependent EEH operations */
+struct eeh_ops *eeh_ops = NULL;
+
 int eeh_subsystem_enabled;
 EXPORT_SYMBOL(eeh_subsystem_enabled);
 
@@ -1208,6 +1211,56 @@ static void *eeh_early_enable(struct device_node *dn, void *data)
 }
 
 /**
+ * eeh_ops_register - Register platform dependent EEH operations
+ * @ops: platform dependent EEH operations
+ *
+ * Register the platform dependent EEH operation callback
+ * functions. The platform should 

[PATCH 20/21] Introduce struct eeh_stats for EEH

2012-02-24 Thread Gavin Shan
With the original EEH implementation, the EEH global statistics
are kept in individual global variables, which makes the code a
little hard to maintain.

The patch introduces struct eeh_stats to hold the EEH global
statistics so that they can be maintained collectively.
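
A minimal user-space sketch of the idea, assuming nothing beyond the
field names in the diff below: the counters live in one struct, so they
can be reset or reported as a unit. The helpers eeh_stats_reset() and
eeh_stats_format() are hypothetical and only stand in for what the
procfs show routine does with seq_printf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same counters as in the patch, grouped in one struct. */
struct eeh_stats {
	unsigned long no_device;	/* PCI device not found */
	unsigned long no_dn;		/* OF node not found */
	unsigned long no_cfg_addr;	/* Config address not found */
	unsigned long ignored_check;	/* EEH check skipped */
	unsigned long total_mmio_ffs;	/* Total EEH checks */
	unsigned long false_positives;	/* Unnecessary EEH checks */
	unsigned long slot_resets;	/* PE reset */
};

static struct eeh_stats eeh_stats;

/* Hypothetical helper: clear every counter in one go. */
static void eeh_stats_reset(void)
{
	memset(&eeh_stats, 0, sizeof(eeh_stats));
}

/* Hypothetical helper: print a few counters into a caller-supplied buffer. */
static int eeh_stats_format(char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len,
			"eeh_total_mmio_ffs=%lu\n"
			"eeh_false_positives=%lu\n"
			"eeh_slot_resets=%lu\n",
			eeh_stats.total_mmio_ffs,
			eeh_stats.false_positives,
			eeh_stats.slot_resets);
}
```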

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh.h   |   15 +
 arch/powerpc/platforms/pseries/eeh.c |   37 +++--
 2 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 1310971..226c9a5 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -98,6 +98,21 @@ struct eeh_ops {
int (*configure_bridge)(struct device_node *dn);
 };
 
+/*
+ * The struct is used to maintain the EEH global statistic
+ * information. Besides, the EEH global statistics will be
+ * exported to user space through procfs
+ */
+struct eeh_stats {
+   unsigned long no_device;        /* PCI device not found     */
+   unsigned long no_dn;            /* OF node not found        */
+   unsigned long no_cfg_addr;      /* Config address not found */
+   unsigned long ignored_check;    /* EEH check skipped        */
+   unsigned long total_mmio_ffs;   /* Total EEH checks         */
+   unsigned long false_positives;  /* Unnecessary EEH checks   */
+   unsigned long slot_resets;      /* PE reset                 */
+};
+
 extern struct eeh_ops *eeh_ops;
 extern int eeh_subsystem_enabled;
 
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 759d5af..a8a8c27 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -102,14 +102,8 @@ static DEFINE_RAW_SPINLOCK(confirm_error_lock);
 #define EEH_PCI_REGS_LOG_LEN 4096
 static unsigned char pci_regs_buf[EEH_PCI_REGS_LOG_LEN];
 
-/* System monitoring statistics */
-static unsigned long no_device;
-static unsigned long no_dn;
-static unsigned long no_cfg_addr;
-static unsigned long ignored_check;
-static unsigned long total_mmio_ffs;
-static unsigned long false_positives;
-static unsigned long slot_resets;
+/* EEH global statistics */
+static struct eeh_stats eeh_stats;
 
 #define IS_BRIDGE(class_code) (((class_code)>>16) == PCI_BASE_CLASS_BRIDGE)
 
@@ -392,13 +386,13 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
int rc = 0;
const char *location;
 
-   total_mmio_ffs++;
+   eeh_stats.total_mmio_ffs++;
 
if (!eeh_subsystem_enabled)
return 0;
 
if (!dn) {
-   no_dn++;
+   eeh_stats.no_dn++;
return 0;
}
dn = eeh_find_device_pe(dn);
@@ -407,14 +401,14 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
/* Access to IO BARs might get this far and still not want checking. */
if (!(edev->mode & EEH_MODE_SUPPORTED) ||
edev->mode & EEH_MODE_NOCHECK) {
-   ignored_check++;
+   eeh_stats.ignored_check++;
pr_debug("EEH: Ignored check (%x) for %s %s\n",
edev->mode, eeh_pci_name(dev), dn->full_name);
return 0;
}
 
if (!edev->config_addr && !edev->pe_config_addr) {
-   no_cfg_addr++;
+   eeh_stats.no_cfg_addr++;
return 0;
}
 
@@ -460,13 +454,13 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
(ret == EEH_STATE_NOT_SUPPORT) ||
(ret & (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) ==
(EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) {
-   false_positives++;
+   eeh_stats.false_positives++;
edev->false_positives ++;
rc = 0;
goto dn_unlock;
}
 
-   slot_resets++;
+   eeh_stats.slot_resets++;
  
/* Avoid repeated reports of this failure, including problems
 * with other functions on this device, and functions under
@@ -513,7 +507,7 @@ unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned lon
addr = eeh_token_to_phys((unsigned long __force) token);
dev = pci_addr_cache_get_device(addr);
if (!dev) {
-   no_device++;
+   eeh_stats.no_device++;
return val;
}
 
@@ -1174,7 +1168,7 @@ static int proc_eeh_show(struct seq_file *m, void *v)
 {
if (0 == eeh_subsystem_enabled) {
seq_printf(m, "EEH Subsystem is globally disabled\n");
-   seq_printf(m, "eeh_total_mmio_ffs=%ld\n", total_mmio_ffs);
+   seq_printf(m, "eeh_total_mmio_ffs=%ld\n", eeh_stats.total_mmio_ffs);
} else {
seq_printf(m, "EEH Subsystem is enabled\n");
seq_printf(m,
@@ -1185,10 +1179,13 @@ static int 

[PATCH 16/21] Replace pci_dn with eeh_dev for EEH address cache

2012-02-24 Thread Gavin Shan
With the original EEH implementation, struct pci_dn is used while
building the PCI I/O address cache, which is used to look up the PCI
device that owns a given physical I/O address. Besides, pci_dn is
associated with the corresponding PCI device while its I/O cache is
being built.

The patch replaces struct pci_dn with struct eeh_dev so that the EEH
address cache no longer depends on struct pci_dn. That will help EEH
become an independent module in future. Besides, the binding between
eeh_dev and the PCI device is now done while building the PCI device
I/O cache.
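
To make the lookup and the binding concrete, here is a small user-space
sketch. It deliberately swaps the rb-tree used by eeh_cache.c for a
sorted array with binary search, and it uses cut-down stand-ins for
struct pci_dev and struct eeh_dev (the real patch stores the back
pointer in dev->dev.archdata.edev). The names bind_eeh_dev() and
addr_cache_get_device() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct pci_dev;

/* Cut-down stand-ins for the kernel structures. */
struct eeh_dev {
	struct pci_dev *pdev;	/* associated PCI device */
	int mode;
};

struct pci_dev {
	struct eeh_dev *edev;	/* associated EEH device */
	const char *name;
};

/* One cached I/O range; addr_hi is inclusive. */
struct addr_range {
	unsigned long addr_lo;
	unsigned long addr_hi;
	struct pci_dev *pcidev;
};

/* Link both directions, as the patch does while building the cache. */
static void bind_eeh_dev(struct pci_dev *pdev, struct eeh_dev *edev)
{
	assert(pdev && edev);
	pdev->edev = edev;
	edev->pdev = pdev;
}

/*
 * Binary search over ranges that are sorted by addr_lo and do not
 * overlap. Returns the owning device, or NULL if no range matches.
 */
static struct pci_dev *addr_cache_get_device(const struct addr_range *r,
					     size_t n, unsigned long addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < r[mid].addr_lo)
			hi = mid;
		else if (addr > r[mid].addr_hi)
			lo = mid + 1;
		else
			return r[mid].pcidev;
	}
	return NULL;
}
```

Once the two structures point at each other, code that starts from an
address can reach the EEH state through the PCI device without going
through pci_dn at all.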

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/eeh_cache.c |   27 ---
 1 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh_cache.c b/arch/powerpc/platforms/pseries/eeh_cache.c
index 7c36a9c..0e8f6e9 100644
--- a/arch/powerpc/platforms/pseries/eeh_cache.c
+++ b/arch/powerpc/platforms/pseries/eeh_cache.c
@@ -175,7 +175,7 @@ pci_addr_cache_insert(struct pci_dev *dev, unsigned long alo,
 static void __pci_addr_cache_insert_device(struct pci_dev *dev)
 {
struct device_node *dn;
-   struct pci_dn *pdn;
+   struct eeh_dev *edev;
int i;
 
dn = pci_device_to_OF_node(dev);
@@ -184,13 +184,19 @@ static void __pci_addr_cache_insert_device(struct pci_dev *dev)
return;
}
 
+   edev = OF_NODE_TO_EEH_DEV(dn);
+   if (!edev) {
+   pr_warning("PCI: no EEH dev found for dn=%s\n",
+   dn->full_name);
+   return;
+   }
+
/* Skip any devices for which EEH is not enabled. */
-   pdn = PCI_DN(dn);
-   if (!(pdn->eeh_mode & EEH_MODE_SUPPORTED) ||
-   pdn->eeh_mode & EEH_MODE_NOCHECK) {
+   if (!(edev->mode & EEH_MODE_SUPPORTED) ||
+   edev->mode & EEH_MODE_NOCHECK) {
 #ifdef DEBUG
-   printk(KERN_INFO "PCI: skip building address cache for=%s - %s\n",
-  pci_name(dev), pdn->node->full_name);
+   pr_info("PCI: skip building address cache for=%s - %s\n",
+   pci_name(dev), dn->full_name);
 #endif
return;
}
@@ -281,6 +287,7 @@ void pci_addr_cache_remove_device(struct pci_dev *dev)
 void __init pci_addr_cache_build(void)
 {
struct device_node *dn;
+   struct eeh_dev *edev;
struct pci_dev *dev = NULL;
 
spin_lock_init(&pci_io_addr_cache_root.piar_lock);
@@ -291,8 +298,14 @@ void __init pci_addr_cache_build(void)
dn = pci_device_to_OF_node(dev);
if (!dn)
continue;
+
+   edev = OF_NODE_TO_EEH_DEV(dn);
+   if (!edev)
+   continue;
+
pci_dev_get(dev);  /* matching put is in eeh_remove_device() */
-   PCI_DN(dn)->pcidev = dev;
+   dev->dev.archdata.edev = edev;
+   edev->pdev = dev;
 
eeh_sysfs_add_device(dev);
}
-- 
1.7.5.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 07/21] pSeries platform PE state retrieval

2012-02-24 Thread Gavin Shan
On the pSeries platform, there are 2 dedicated RTAS calls to
retrieve the state of a PE: ibm,read-slot-reset-state and
ibm,read-slot-reset-state2.

The patch implements retrieval of the PE's state according to the
given PE address. Besides, the implementation is abstracted through
struct eeh_ops::get_state so that EEH core components can support
multiple platforms in future.
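
The state returned through get_state is a bitmap, so callers test bits
instead of comparing against magic RTAS values. The sketch below shows
one way such a result could be classified. The EEH_STATE_* values match
the ones added by this patch; enum eeh_verdict and eeh_classify_state()
are hypothetical names used only for illustration, and error returns
(negative values) are assumed to be handled by the caller beforehand.

```c
#include <assert.h>

#define EEH_STATE_UNAVAILABLE	(1 << 0)	/* State unavailable */
#define EEH_STATE_NOT_SUPPORT	(1 << 1)	/* EEH not supported */
#define EEH_STATE_RESET_ACTIVE	(1 << 2)	/* Active reset */
#define EEH_STATE_MMIO_ACTIVE	(1 << 3)	/* Active MMIO */
#define EEH_STATE_DMA_ACTIVE	(1 << 4)	/* Active DMA */
#define EEH_STATE_MMIO_ENABLED	(1 << 5)	/* MMIO enabled */
#define EEH_STATE_DMA_ENABLED	(1 << 6)	/* DMA enabled */

/* Hypothetical coarse verdicts derived from one state bitmap. */
enum eeh_verdict {
	EEH_VERDICT_OK,		/* nothing to recover */
	EEH_VERDICT_RETRY,	/* state temporarily unavailable */
	EEH_VERDICT_FROZEN,	/* PE needs recovery */
};

static enum eeh_verdict eeh_classify_state(int state)
{
	const int active = EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE;

	/* Negative values are errors, not bitmaps. */
	assert(state >= 0);

	/* Same test the core uses to count a false positive. */
	if (state == EEH_STATE_NOT_SUPPORT || (state & active) == active)
		return EEH_VERDICT_OK;
	if (state & EEH_STATE_UNAVAILABLE)
		return EEH_VERDICT_RETRY;
	return EEH_VERDICT_FROZEN;
}
```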

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh.h   |8 ++
 arch/powerpc/platforms/pseries/eeh.c |   96 -
 arch/powerpc/platforms/pseries/eeh_driver.c  |2 +-
 arch/powerpc/platforms/pseries/eeh_pseries.c |   70 ++-
 4 files changed, 94 insertions(+), 82 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 76f7b3f..1d3c9e5 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -42,6 +42,14 @@ struct device_node;
 #define EEH_OPT_ENABLE 1   /* EEH enable   */
 #define EEH_OPT_THAW_MMIO  2   /* MMIO enable  */
 #define EEH_OPT_THAW_DMA   3   /* DMA enable   */
+#define EEH_STATE_UNAVAILABLE  (1 << 0)    /* State unavailable    */
+#define EEH_STATE_NOT_SUPPORT  (1 << 1)    /* EEH not supported    */
+#define EEH_STATE_RESET_ACTIVE (1 << 2)    /* Active reset         */
+#define EEH_STATE_MMIO_ACTIVE  (1 << 3)    /* Active MMIO          */
+#define EEH_STATE_DMA_ACTIVE   (1 << 4)    /* Active DMA           */
+#define EEH_STATE_MMIO_ENABLED (1 << 5)    /* MMIO enabled         */
+#define EEH_STATE_DMA_ENABLED  (1 << 6)    /* DMA enabled          */
+
 struct eeh_ops {
char *name;
int (*init)(void);
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 00797e0..8d11f1f 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -88,8 +88,6 @@
 
 /* RTAS tokens */
 static int ibm_set_slot_reset;
-static int ibm_read_slot_reset_state;
-static int ibm_read_slot_reset_state2;
 static int ibm_slot_error_detail;
 static int ibm_configure_bridge;
 static int ibm_configure_pe;
@@ -289,37 +287,6 @@ void eeh_slot_error_detail(struct pci_dn *pdn, int severity)
 }
 
 /**
- * eeh_read_slot_reset_state - Read the reset state of a device node's slot
- * @dn: device node to read
- * @rets: array to return results in
- *
- * Read the reset state of a device node's slot through platform dependent
- * function call.
- */
-static int eeh_read_slot_reset_state(struct pci_dn *pdn, int rets[])
-{
-   int token, outputs;
-   int config_addr;
-
-   if (ibm_read_slot_reset_state2 != RTAS_UNKNOWN_SERVICE) {
-   token = ibm_read_slot_reset_state2;
-   outputs = 4;
-   } else {
-   token = ibm_read_slot_reset_state;
-   rets[2] = 0; /* fake PE Unavailable info */
-   outputs = 3;
-   }
-
-   /* Use PE configuration address, if present */
-   config_addr = pdn->eeh_config_addr;
-   if (pdn->eeh_pe_config_addr)
-   config_addr = pdn->eeh_pe_config_addr;
-
-   return rtas_call(token, 3, outputs, rets, config_addr,
-BUID_HI(pdn->phb->buid), BUID_LO(pdn->phb->buid));
-}
-
-/**
  * eeh_wait_for_slot_status - Returns error status of slot
  * @pdn: pci device node
  * @max_wait_msecs: maximum number to millisecs to wait
@@ -335,21 +302,15 @@ static int eeh_read_slot_reset_state(struct pci_dn *pdn, int rets[])
 int eeh_wait_for_slot_status(struct pci_dn *pdn, int max_wait_msecs)
 {
int rc;
-   int rets[3];
int mwait;
 
while (1) {
-   rc = eeh_read_slot_reset_state(pdn, rets);
-   if (rc) return rc;
-   if (rets[1] == 0) return -1;  /* EEH is not supported */
-
-   if (rets[0] != 5) return rets[0]; /* return actual status */
-
-   if (rets[2] == 0) return -1; /* permanently unavailable */
+   rc = eeh_ops->get_state(pdn->node, &mwait);
+   if (rc != EEH_STATE_UNAVAILABLE)
+   return rc;
 
if (max_wait_msecs <= 0) break;
 
-   mwait = rets[2];
if (mwait <= 0) {
printk(KERN_WARNING "EEH: Firmware returned bad wait value=%d\n",
mwait);
@@ -522,7 +483,6 @@ void eeh_clear_slot(struct device_node *dn, int mode_flag)
 int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
 {
int ret;
-   int rets[3];
unsigned long flags;
struct pci_dn *pdn;
int rc = 0;
@@ -584,40 +544,18 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
 * function zero of a multi-function device.
 * In any case they must share a common PHB.
 */
-   ret = eeh_read_slot_reset_state(pdn, rets);
-
-   /* If the call to firmware failed, punt 

[PATCH 05/21] pSeries platform EEH operation

2012-02-24 Thread Gavin Shan
There are 4 EEH operations covered by the dedicated RTAS call
ibm,set-eeh-option: enable EEH, disable EEH, enable MMIO and enable
DMA. At an early stage of system boot, the kernel tries to enable EEH
on each device node related to a PCI device. MMIO and DMA for a
particular PE should be enabled when recovering from EEH errors
so that the PE can function properly again.

The patch implements those operations and abstracts them through
struct eeh_ops::set_option. That helps EEH to support multiple
platforms in future.

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh.h   |4 ++
 arch/powerpc/include/asm/ppc-pci.h   |2 -
 arch/powerpc/platforms/pseries/eeh.c |   26 ++--
 arch/powerpc/platforms/pseries/eeh_driver.c  |4 +-
 arch/powerpc/platforms/pseries/eeh_pseries.c |   39 +-
 5 files changed, 48 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 0666c52..76f7b3f 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -38,6 +38,10 @@ struct device_node;
  * platform should register its own EEH operation callback
  * functions before any EEH further operations.
  */
+#define EEH_OPT_DISABLE    0   /* EEH disable  */
+#define EEH_OPT_ENABLE 1   /* EEH enable   */
+#define EEH_OPT_THAW_MMIO  2   /* MMIO enable  */
+#define EEH_OPT_THAW_DMA   3   /* DMA enable   */
 struct eeh_ops {
char *name;
int (*init)(void);
diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 605a970..6150349 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -56,8 +56,6 @@ struct pci_dev *pci_get_device_by_addr(unsigned long addr);
 #define EEH_LOG_TEMP_FAILURE 1
 #define EEH_LOG_PERM_FAILURE 2
 void eeh_slot_error_detail (struct pci_dn *pdn, int severity);
-#define EEH_THAW_MMIO 2
-#define EEH_THAW_DMA  3
 int eeh_pci_enable(struct pci_dn *pdn, int function);
 int eeh_reset_pe(struct pci_dn *);
 int eeh_wait_for_slot_status(struct pci_dn *pdn, int max_wait_msecs);
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index bb6de6c..70a9617 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -87,7 +87,6 @@
 #define PCI_BUS_RESET_WAIT_MSEC (60*1000)
 
 /* RTAS tokens */
-static int ibm_set_eeh_option;
 static int ibm_set_slot_reset;
 static int ibm_read_slot_reset_state;
 static int ibm_read_slot_reset_state2;
@@ -283,7 +282,7 @@ void eeh_slot_error_detail(struct pci_dn *pdn, int severity)
size_t loglen = 0;
pci_regs_buf[0] = 0;
 
-   eeh_pci_enable(pdn, EEH_THAW_MMIO);
+   eeh_pci_enable(pdn, EEH_OPT_THAW_MMIO);
eeh_configure_bridge(pdn);
eeh_restore_bars(pdn);
loglen = eeh_gather_pci_data(pdn, pci_regs_buf, EEH_PCI_REGS_LOG_LEN);
@@ -698,26 +697,15 @@ EXPORT_SYMBOL(eeh_check_failure);
  */
 int eeh_pci_enable(struct pci_dn *pdn, int function)
 {
-   int config_addr;
int rc;
 
-   /* Use PE configuration address, if present */
-   config_addr = pdn->eeh_config_addr;
-   if (pdn->eeh_pe_config_addr)
-   config_addr = pdn->eeh_pe_config_addr;
-
-   rc = rtas_call(ibm_set_eeh_option, 4, 1, NULL,
-  config_addr,
-  BUID_HI(pdn->phb->buid),
-  BUID_LO(pdn->phb->buid),
-   function);
-
+   rc = eeh_ops->set_option(pdn->node, function);
if (rc)
printk(KERN_WARNING "EEH: Unexpected state change %d, err=%d dn=%s\n",
function, rc, pdn->node->full_name);
 
rc = eeh_wait_for_slot_status(pdn, PCI_BUS_RESET_WAIT_MSEC);
-   if ((rc == 4) && (function == EEH_THAW_MMIO))
+   if ((rc == 4) && (function == EEH_OPT_THAW_MMIO))
return 0;
 
return rc;
@@ -1158,9 +1146,7 @@ static void *eeh_early_enable(struct device_node *dn, void *data)
if (regs) {
/* First register entry is addr (00BBSS00)  */
/* Try to enable eeh */
-   ret = rtas_call(ibm_set_eeh_option, 4, 1, NULL,
-   regs[0], info->buid_hi, info->buid_lo,
-   EEH_ENABLE);
+   ret = eeh_ops->set_option(dn, EEH_OPT_ENABLE);
 
enable = 0;
if (ret == 0) {
@@ -1299,7 +1285,6 @@ void __init eeh_init(void)
if (np == NULL)
return;
 
-   ibm_set_eeh_option = rtas_token("ibm,set-eeh-option");
ibm_set_slot_reset = rtas_token("ibm,set-slot-reset");
ibm_read_slot_reset_state2 = rtas_token("ibm,read-slot-reset-state2");
ibm_read_slot_reset_state = rtas_token("ibm,read-slot-reset-state");
@@ -1309,9 +1294,6 @@ void __init eeh_init(void)
ibm_configure_bridge = 

[PATCH 13/21] Cleanup on function names of EEH aux components

2012-02-24 Thread Gavin Shan
The patch does some cleanup on the function names of EEH
aux components. Currently, only a couple of function names from
eeh_cache have been adjusted:

* The function names now have the prefix pci_addr_cache.
* pci_addr_cache_build() has been moved in the header file
  to reflect the function call sequence.

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/ppc-pci.h |4 ++--
 arch/powerpc/platforms/pseries/eeh.c   |2 +-
 arch/powerpc/platforms/pseries/eeh_cache.c |8 
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index b4b18d8..c02d5a7 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -49,10 +49,10 @@ extern unsigned long pci_probe_only;
 
 #ifdef CONFIG_EEH
 
+void pci_addr_cache_build(void);
 void pci_addr_cache_insert_device(struct pci_dev *dev);
 void pci_addr_cache_remove_device(struct pci_dev *dev);
-void pci_addr_cache_build(void);
-struct pci_dev *pci_get_device_by_addr(unsigned long addr);
+struct pci_dev *pci_addr_cache_get_device(unsigned long addr);
 void eeh_slot_error_detail (struct pci_dn *pdn, int severity);
 int eeh_pci_enable(struct pci_dn *pdn, int function);
 int eeh_reset_pe(struct pci_dn *);
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index bd4ed83..646b520 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -511,7 +511,7 @@ unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned lon
 
/* Finding the phys addr + pci device; this is pretty quick. */
addr = eeh_token_to_phys((unsigned long __force) token);
-   dev = pci_get_device_by_addr(addr);
+   dev = pci_addr_cache_get_device(addr);
if (!dev) {
no_device++;
return val;
diff --git a/arch/powerpc/platforms/pseries/eeh_cache.c b/arch/powerpc/platforms/pseries/eeh_cache.c
index 850c00c..7c36a9c 100644
--- a/arch/powerpc/platforms/pseries/eeh_cache.c
+++ b/arch/powerpc/platforms/pseries/eeh_cache.c
@@ -59,7 +59,7 @@ static struct pci_io_addr_cache {
spinlock_t piar_lock;
 } pci_io_addr_cache_root;
 
-static inline struct pci_dev *__pci_get_device_by_addr(unsigned long addr)
+static inline struct pci_dev *__pci_addr_cache_get_device(unsigned long addr)
 {
struct rb_node *n = pci_io_addr_cache_root.rb_root.rb_node;
 
@@ -83,7 +83,7 @@ static inline struct pci_dev *__pci_get_device_by_addr(unsigned long addr)
 }
 
 /**
- * pci_get_device_by_addr - Get device, given only address
+ * pci_addr_cache_get_device - Get device, given only address
  * @addr: mmio (PIO) phys address or i/o port number
  *
  * Given an mmio phys address, or a port number, find a pci device
@@ -92,13 +92,13 @@ static inline struct pci_dev *__pci_get_device_by_addr(unsigned long addr)
  * from zero (that is, they do *not* have pci_io_addr added in).
  * It is safe to call this function within an interrupt.
  */
-struct pci_dev *pci_get_device_by_addr(unsigned long addr)
+struct pci_dev *pci_addr_cache_get_device(unsigned long addr)
 {
struct pci_dev *dev;
unsigned long flags;
 
spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags);
-   dev = __pci_get_device_by_addr(addr);
+   dev = __pci_addr_cache_get_device(addr);
spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
return dev;
 }
-- 
1.7.5.4



[PATCH 15/21] Replace pci_dn with eeh_dev for EEH sysfs

2012-02-24 Thread Gavin Shan
With the original EEH implementation, all EEH related statistics
were put into struct pci_dn. We've introduced struct eeh_dev to
replace struct pci_dn in EEH core components, including the EEH
sysfs component.

The patch shows EEH statistics from struct eeh_dev instead of struct
pci_dn in the EEH sysfs component. Besides, it also fixes the EEH device
retrieval from the PCI device, which was introduced by the previous patch
in the series.
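
The sysfs attributes are generated by a macro that pastes the attribute
name into a show function and reads one member of the device structure.
The following user-space sketch mimics that pattern under stated
assumptions: struct eeh_dev is reduced to three members, the generated
functions take the eeh_dev directly rather than a struct device, and
snprintf() replaces the kernel's sprintf() into a page-sized buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Cut-down stand-in for struct eeh_dev; only the members shown here. */
struct eeh_dev {
	unsigned int mode;
	unsigned int config_addr;
	int check_count;
};

/*
 * Generate eeh_show_<name>(): format one member of the EEH device
 * into buf. Returns the number of characters written, or 0 when
 * there is no EEH device, like the macro in the patch.
 */
#define EEH_SHOW_ATTR(_name, _memb, _format)				\
static int eeh_show_##_name(const struct eeh_dev *edev,			\
			    char *buf, size_t len)			\
{									\
	assert(buf && len > 0);						\
	if (!edev)							\
		return 0;						\
	return snprintf(buf, len, _format "\n", edev->_memb);		\
}

EEH_SHOW_ATTR(eeh_mode,        mode,        "0x%x")
EEH_SHOW_ATTR(eeh_config_addr, config_addr, "0x%x")
EEH_SHOW_ATTR(eeh_check_count, check_count, "%d")
```

Because the member name is a macro argument, dropping the eeh_ prefix
from the eeh_dev members only changes the second argument of each
invocation, while the attribute names seen by user space stay the same.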

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/eeh_sysfs.c |   23 ++-
 1 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh_sysfs.c b/arch/powerpc/platforms/pseries/eeh_sysfs.c
index 5e4eab1..01a62fc 100644
--- a/arch/powerpc/platforms/pseries/eeh_sysfs.c
+++ b/arch/powerpc/platforms/pseries/eeh_sysfs.c
@@ -41,24 +41,21 @@ static ssize_t eeh_show_##_name(struct device *dev,  \
struct device_attribute *attr, char *buf)  \
 {\
struct pci_dev *pdev = to_pci_dev(dev);   \
-   struct device_node *dn = pci_device_to_OF_node(pdev); \
-   struct pci_dn *pdn;   \
+   struct eeh_dev *edev = PCI_DEV_TO_EEH_DEV(pdev);  \
  \
-   if (!dn || PCI_DN(dn) == NULL)\
-   return 0;  \
+   if (!edev)\
+   return 0; \
  \
-   pdn = PCI_DN(dn); \
-   return sprintf(buf, _format "\n", pdn->_memb);\
+   return sprintf(buf, _format "\n", edev->_memb);   \
 }\
 static DEVICE_ATTR(_name, S_IRUGO, eeh_show_##_name, NULL);
 
-
-EEH_SHOW_ATTR(eeh_mode, eeh_mode, "0x%x");
-EEH_SHOW_ATTR(eeh_config_addr, eeh_config_addr, "0x%x");
-EEH_SHOW_ATTR(eeh_pe_config_addr, eeh_pe_config_addr, "0x%x");
-EEH_SHOW_ATTR(eeh_check_count, eeh_check_count, "%d");
-EEH_SHOW_ATTR(eeh_freeze_count, eeh_freeze_count, "%d");
-EEH_SHOW_ATTR(eeh_false_positives, eeh_false_positives, "%d");
+EEH_SHOW_ATTR(eeh_mode,            mode,            "0x%x");
+EEH_SHOW_ATTR(eeh_config_addr,     config_addr,     "0x%x");
+EEH_SHOW_ATTR(eeh_pe_config_addr,  pe_config_addr,  "0x%x");
+EEH_SHOW_ATTR(eeh_check_count,     check_count,     "%d"  );
+EEH_SHOW_ATTR(eeh_freeze_count,    freeze_count,    "%d"  );
+EEH_SHOW_ATTR(eeh_false_positives, false_positives, "%d"  );
 
 void eeh_sysfs_add_device(struct pci_dev *pdev)
 {
-- 
1.7.5.4



[PATCH 12/21] Cleanup on comments of EEH aux components

2012-02-24 Thread Gavin Shan
There are several EEH aux components and the patch does some cleanup
for them so that they look cleaner.

* Duplicated comments have been removed from the header file.
* Comments have been reorganized so that they read more cleanly.
* The leading comments of functions have been adjusted a little
  so that the output of make pdfdocs is more uniform.
* Function calls of the form xxx () have been replaced by xxx().

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh_event.h|   34 ++-
 arch/powerpc/platforms/pseries/eeh_cache.c  |9 +-
 arch/powerpc/platforms/pseries/eeh_driver.c |  136 ---
 arch/powerpc/platforms/pseries/eeh_event.c  |   23 ++---
 arch/powerpc/platforms/pseries/eeh_sysfs.c  |2 +-
 5 files changed, 107 insertions(+), 97 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh_event.h b/arch/powerpc/include/asm/eeh_event.h
index cc3cb04..25ebf6a 100644
--- a/arch/powerpc/include/asm/eeh_event.h
+++ b/arch/powerpc/include/asm/eeh_event.h
@@ -1,6 +1,4 @@
 /*
- * eeh_event.h
- *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
@@ -22,32 +20,20 @@
 #define ASM_POWERPC_EEH_EVENT_H
 #ifdef __KERNEL__
 
-/** EEH event -- structure holding pci controller data that describes
- *  a change in the isolation status of a PCI slot.  A pointer
- *  to this struct is passed as the data pointer in a notify callback.
+/*
+ * structure holding pci controller data that describes a
+ * change in the isolation status of a PCI slot.  A pointer
+ * to this struct is passed as the data pointer in a notify
+ * callback.
  */
 struct eeh_event {
-   struct list_head list;
-   struct device_node  *dn;   /* struct device node */
-   struct pci_dev   *dev;  /* affected device */
+   struct list_head    list;   /* to form event queue  */
+   struct device_node  *dn;    /* struct device node   */
+   struct pci_dev      *dev;   /* affected device      */
 };
 
-/**
- * eeh_send_failure_event - generate a PCI error event
- * @dev pci device
- *
- * This routine builds a PCI error event which will be delivered
- * to all listeners on the eeh_notifier_chain.
- *
- * This routine can be called within an interrupt context;
- * the actual event will be delivered in a normal context
- * (from a workqueue).
- */
-int eeh_send_failure_event (struct device_node *dn,
-struct pci_dev *dev);
-
-/* Main recovery function */
-struct pci_dn * handle_eeh_events (struct eeh_event *);
+int eeh_send_failure_event(struct device_node *dn, struct pci_dev *dev);
+struct pci_dn *handle_eeh_events(struct eeh_event *);
 
 #endif /* __KERNEL__ */
 #endif /* ASM_POWERPC_EEH_EVENT_H */
diff --git a/arch/powerpc/platforms/pseries/eeh_cache.c b/arch/powerpc/platforms/pseries/eeh_cache.c
index fc5ae76..850c00c 100644
--- a/arch/powerpc/platforms/pseries/eeh_cache.c
+++ b/arch/powerpc/platforms/pseries/eeh_cache.c
@@ -1,5 +1,4 @@
 /*
- * eeh_cache.c
  * PCI address cache; allows the lookup of PCI devices based on I/O address
  *
  * Copyright IBM Corporation 2004
@@ -47,8 +46,7 @@
  * than any hash algo I could think of for this problem, even
  * with the penalty of slow pointer chases for d-cache misses).
  */
-struct pci_io_addr_range
-{
+struct pci_io_addr_range {
struct rb_node rb_node;
unsigned long addr_lo;
unsigned long addr_hi;
@@ -56,8 +54,7 @@ struct pci_io_addr_range
unsigned int flags;
 };
 
-static struct pci_io_addr_cache
-{
+static struct pci_io_addr_cache {
struct rb_root rb_root;
spinlock_t piar_lock;
 } pci_io_addr_cache_root;
@@ -166,7 +163,7 @@ pci_addr_cache_insert(struct pci_dev *dev, unsigned long alo,
 
 #ifdef DEBUG
printk(KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n",
- alo, ahi, pci_name (dev));
+ alo, ahi, pci_name(dev));
 #endif
 
rb_link_node(&piar->rb_node, parent, p);
diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c
index 61450e1..3f25fab 100644
--- a/arch/powerpc/platforms/pseries/eeh_driver.c
+++ b/arch/powerpc/platforms/pseries/eeh_driver.c
@@ -33,8 +33,14 @@
 #include <asm/prom.h>
 #include <asm/rtas.h>
 
-
-static inline const char * pcid_name (struct pci_dev *pdev)
+/**
+ * eeh_pcid_name - Retrieve name of PCI device driver
+ * @pdev: PCI device
+ *
+ * This routine is used to retrieve the name of PCI device driver
+ * if that's valid.
+ */
+static inline const char *pcid_name(struct pci_dev *pdev)
 {
if (pdev && pdev->dev.driver)
return pdev->dev.driver->name;
@@ -64,7 +70,14 @@ static void print_device_node_tree(struct pci_dn *pdn, int dent)
 #endif
 
 /**
- * 

[PATCH 18/21] Replace pci_dn with eeh_dev for EEH aux components

2012-02-24 Thread Gavin Shan
The original EEH implementation depends heavily on struct pci_dn,
and EEH related information had to be put into pci_dn. Actually, we
can split struct pci_dn so that the EEH sensitive information forms an
individual struct, which makes EEH more independent.

The patch replaces pci_dn with eeh_dev for EEH aux components like
event and driver. Also, the eeh_event struct has been adjusted a
little: since eeh_dev already links the associated FDT (Flat Device
Tree) node and PCI device, it's not necessary for eeh_event to
trace the FDT node and PCI device. It can simply trace eeh_dev.

The patch also renames function pcid_name() to eeh_pcid_name(), which
was missed in the previous patch where the EEH aux components
were cleaned up.

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh_event.h|7 +-
 arch/powerpc/platforms/pseries/eeh.c|2 +-
 arch/powerpc/platforms/pseries/eeh_driver.c |   81 +--
 arch/powerpc/platforms/pseries/eeh_event.c  |   36 ++--
 4 files changed, 63 insertions(+), 63 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh_event.h b/arch/powerpc/include/asm/eeh_event.h
index 25ebf6a..c68b012 100644
--- a/arch/powerpc/include/asm/eeh_event.h
+++ b/arch/powerpc/include/asm/eeh_event.h
@@ -28,12 +28,11 @@
  */
 struct eeh_event {
    struct list_head    list;   /* to form event queue  */
-   struct device_node  *dn;/* struct device node   */
-   struct pci_dev  *dev;   /* affected device  */
+   struct eeh_dev  *edev;  /* EEH device   */
 };
 
-int eeh_send_failure_event(struct device_node *dn, struct pci_dev *dev);
-struct pci_dn *handle_eeh_events(struct eeh_event *);
+int eeh_send_failure_event(struct eeh_dev *edev);
+struct eeh_dev *handle_eeh_events(struct eeh_event *);
 
 #endif /* __KERNEL__ */
 #endif /* ASM_POWERPC_EEH_EVENT_H */
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 84a8a0c..759d5af 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -475,7 +475,7 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
eeh_mark_slot(dn, EEH_MODE_ISOLATED);
    raw_spin_unlock_irqrestore(&confirm_error_lock, flags);
 
-   eeh_send_failure_event(edev->dn, edev->pdev);
+   eeh_send_failure_event(edev);
 
/* Most EEH events are due to device driver bugs.  Having
 * a stack trace will help the device-driver authors figure
diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c
index 3f25fab..3bf1d10 100644
--- a/arch/powerpc/platforms/pseries/eeh_driver.c
+++ b/arch/powerpc/platforms/pseries/eeh_driver.c
@@ -40,7 +40,7 @@
  * This routine is used to retrieve the name of PCI device driver
  * if that's valid.
  */
-static inline const char *pcid_name(struct pci_dev *pdev)
+static inline const char *eeh_pcid_name(struct pci_dev *pdev)
 {
    if (pdev && pdev->dev.driver)
        return pdev->dev.driver->name;
@@ -81,7 +81,7 @@ static void print_device_node_tree(struct pci_dn *pdn, int dent)
  */
 static void eeh_disable_irq(struct pci_dev *dev)
 {
-   struct device_node *dn = pci_device_to_OF_node(dev);
+   struct eeh_dev *edev = PCI_DEV_TO_EEH_DEV(dev);
 
/* Don't disable MSI and MSI-X interrupts. They are
 * effectively disabled by the DMA Stopped state
@@ -93,7 +93,7 @@ static void eeh_disable_irq(struct pci_dev *dev)
    if (!irq_has_action(dev->irq))
        return;
 
-   PCI_DN(dn)->eeh_mode |= EEH_MODE_IRQ_DISABLED;
+   edev->mode |= EEH_MODE_IRQ_DISABLED;
    disable_irq_nosync(dev->irq);
 }
 
@@ -106,10 +106,10 @@ static void eeh_disable_irq(struct pci_dev *dev)
  */
 static void eeh_enable_irq(struct pci_dev *dev)
 {
-   struct device_node *dn = pci_device_to_OF_node(dev);
+   struct eeh_dev *edev = PCI_DEV_TO_EEH_DEV(dev);
 
-   if ((PCI_DN(dn)->eeh_mode) & EEH_MODE_IRQ_DISABLED) {
-       PCI_DN(dn)->eeh_mode &= ~EEH_MODE_IRQ_DISABLED;
+   if ((edev->mode) & EEH_MODE_IRQ_DISABLED) {
+       edev->mode &= ~EEH_MODE_IRQ_DISABLED;
        enable_irq(dev->irq);
}
 }
@@ -270,20 +270,20 @@ static int eeh_report_failure(struct pci_dev *dev, void *userdata)
 
 /**
  * eeh_reset_device - Perform actual reset of a pci slot
- * @pe_dn: PE associated device node
+ * @edev: PE associated EEH device
  * @bus: PCI bus corresponding to the isolcated slot
  *
  * This routine must be called to do reset on the indicated PE.
  * During the reset, udev might be invoked because those affected
  * PCI devices will be removed and then added.
  */
-static int eeh_reset_device(struct pci_dn *pe_dn, struct pci_bus *bus)
+static int eeh_reset_device(struct eeh_dev *edev, struct pci_bus *bus)
 {
struct device_node *dn;
int cnt, rc;
 
  

[PATCH 14/21] Introduce EEH device

2012-02-24 Thread Gavin Shan
The original EEH implementation depends heavily on struct pci_dn.
However, EEH shouldn't depend on it, because EEH needn't share much
information with other PCI components. In other words, EEH should
work independently.

The patch introduces struct eeh_dev so that the EEH core components
no longer need to work on struct pci_dn in future. Also, struct pci_dn
and struct eeh_dev instances are created dynamically, and the binding
between EEH device, OF node and PCI device is implemented as well.

The EEH devices are created after the PHBs have been detected and
initialized, but before PCI enumeration has started. Apart from that,
a PHB might be created dynamically through the DLPAR component, and
the EEH devices should be created in that case as well. Another case
is an OF node created dynamically by DR (Dynamic Reconfiguration),
which is defined by PAPR. For those OF nodes, EEH devices should be
created accordingly. The binding between EEH device and OF node is
done when the EEH device is initially created.

The binding between EEH device and PCI device should be done after
PCI enumeration has finished. Besides, PCI hotplug also needs the
binding so that the EEH devices can be traced from the newly added
PCI buses or PCI devices.
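
The two binding stages described above can be sketched in plain C. This
is only an illustration under stated assumptions: the structs are
cut-down stand-ins for the kernel ones, and the sketch_ function names
are hypothetical, not functions taken from this patch.

```c
#include <assert.h>
#include <stddef.h>

struct eeh_dev;

/* Cut-down stand-ins; only the fields needed for the binding. */
struct device_node {
	const char *full_name;
	struct eeh_dev *edev;	/* set when the EEH device is created */
};

struct pci_dev {
	struct device_node *of_node;
	struct eeh_dev *edev;	/* set once PCI enumeration is done */
};

struct eeh_dev {
	struct device_node *dn;
	struct pci_dev *pdev;
};

/* Stage 1: bind EEH device and OF node when the EEH device is created. */
static void sketch_bind_of_node(struct eeh_dev *edev, struct device_node *dn)
{
	assert(edev && dn);
	edev->dn = dn;
	edev->pdev = NULL;	/* no PCI device exists yet */
	dn->edev = edev;
}

/*
 * Stage 2: after enumeration (or hotplug) has produced the pci_dev,
 * reach the EEH device through the OF node and complete the binding.
 * Returns 0 on success, -1 if no EEH device is bound to the OF node.
 */
static int sketch_bind_pci_dev(struct pci_dev *pdev)
{
	assert(pdev);
	if (!pdev->of_node || !pdev->of_node->edev)
		return -1;
	pdev->edev = pdev->of_node->edev;
	pdev->edev->pdev = pdev;
	return 0;
}
```

Calling stage 2 before stage 1 fails in this sketch, which mirrors the
ordering constraint stated above.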

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/device.h  |3 ++
 arch/powerpc/include/asm/eeh.h |   51 
 arch/powerpc/kernel/of_platform.c  |3 ++
 arch/powerpc/kernel/rtas_pci.c |3 ++
 arch/powerpc/platforms/pseries/Makefile|3 +-
 arch/powerpc/platforms/pseries/pci_dlpar.c |3 ++
 arch/powerpc/platforms/pseries/setup.c |6 +++-
 include/linux/of.h |3 ++
 8 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/device.h b/arch/powerpc/include/asm/device.h
index d57c08a..4668344 100644
--- a/arch/powerpc/include/asm/device.h
+++ b/arch/powerpc/include/asm/device.h
@@ -31,6 +31,9 @@ struct dev_archdata {
 #ifdef CONFIG_SWIOTLB
dma_addr_t  max_direct_dma_addr;
 #endif
+#ifdef CONFIG_EEH
+   void        *edev;
+#endif
 };
 
 struct pdev_archdata {
diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index ad8f318..1310971 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -32,6 +32,37 @@ struct device_node;
 #ifdef CONFIG_EEH
 
 /*
+ * The struct is used to trace EEH state for the associated
+ * PCI device node or PCI device. In future, it might
+ * represent PE as well so that the EEH device to form
+ * another tree except the currently existing tree of PCI
+ * buses and PCI devices
+ */
+#define EEH_MODE_SUPPORTED     (1<<0)  /* EEH supported on the device  */
+#define EEH_MODE_NOCHECK       (1<<1)  /* EEH check should be skipped  */
+#define EEH_MODE_ISOLATED      (1<<2)  /* The device has been isolated */
+#define EEH_MODE_RECOVERING    (1<<3)  /* Recovering the device        */
+#define EEH_MODE_IRQ_DISABLED  (1<<4)  /* Interrupt disabled           */
+
+struct eeh_dev {
+   int mode;   /* EEH mode */
+   int class_code; /* Class code of the device */
+   int config_addr;/* Config address   */
+   int pe_config_addr; /* PE config address*/
+   int check_count;/* Times of ignored error   */
+   int freeze_count;   /* Times of froze up*/
+   int false_positives;/* Times of reported #ff's  */
+   u32 config_space[16];   /* Saved PCI config space   */
+   struct pci_controller *phb; /* Associated PHB   */
+   struct device_node *dn; /* Associated device node   */
+   struct pci_dev *pdev;   /* Associated PCI device*/
+};
+#define EEH_DEV_TO_OF_NODE(edev)   (edev->dn)
+#define EEH_DEV_TO_PCI_DEV(edev)   (edev->pdev)
+#define OF_NODE_TO_EEH_DEV(dn)     ((struct eeh_dev *)(dn->edev))
+#define PCI_DEV_TO_EEH_DEV(pdev)   ((struct eeh_dev *)(pdev->dev.archdata.edev))
+
+/*
  * The struct is used to trace the registered EEH operation
  * callback functions. Actually, those operation callback
  * functions are heavily platform dependent. That means the
@@ -70,19 +101,15 @@ struct eeh_ops {
 extern struct eeh_ops *eeh_ops;
 extern int eeh_subsystem_enabled;
 
-/* Values for eeh_mode bits in device_node */
-#define EEH_MODE_SUPPORTED     (1<<0)
-#define EEH_MODE_NOCHECK       (1<<1)
-#define EEH_MODE_ISOLATED      (1<<2)
-#define EEH_MODE_RECOVERING    (1<<3)
-#define EEH_MODE_IRQ_DISABLED  (1<<4)
-
 /*
  * Max number of EEH freezes allowed before we consider the device
  * to be permanently disabled.
  */
 #define EEH_MAX_ALLOWED_FREEZES 5
 
+void * __devinit eeh_dev_init(struct device_node *dn, void *data);
+void __devinit 

[PATCH 04/21] pSeries platform EEH initialization

2012-02-24 Thread Gavin Shan
The platform specific EEH operations have been abstracted by
struct eeh_ops. The individual platforms, including pSeries, need
to do the necessary initialization before the platform dependent EEH
operations can work properly.

The patch addresses that and does the necessary platform
initialization for the pSeries platform. More specifically, it figures
out the tokens of the EEH related RTAS calls.
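
The idea of resolving service names to tokens once and rejecting a
platform that lacks a required service can be sketched as follows. This
is a user-space illustration only: the firmware table, its token values
and every sketch_ name are made up for the example, and
SKETCH_UNKNOWN_SERVICE merely stands in for RTAS_UNKNOWN_SERVICE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_UNKNOWN_SERVICE (-1)	/* stand-in for RTAS_UNKNOWN_SERVICE */

struct sketch_service {
	const char *name;
	int token;
};

/* Hypothetical firmware table; the token values are invented. */
static const struct sketch_service sketch_fw[] = {
	{ "ibm,set-eeh-option",         0x21 },
	{ "ibm,set-slot-reset",         0x22 },
	{ "ibm,read-slot-reset-state2", 0x23 },
};

/* Look a service up by name; unknown names yield the sentinel. */
static int sketch_token(const char *name)
{
	size_t i;

	assert(name);
	for (i = 0; i < sizeof(sketch_fw) / sizeof(sketch_fw[0]); i++)
		if (strcmp(sketch_fw[i].name, name) == 0)
			return sketch_fw[i].token;
	return SKETCH_UNKNOWN_SERVICE;
}

/* A service with a newer and an older variant is usable if either exists. */
static int sketch_either(const char *newer, const char *older)
{
	return sketch_token(newer) != SKETCH_UNKNOWN_SERVICE ||
	       sketch_token(older) != SKETCH_UNKNOWN_SERVICE;
}

/* Returns 0 if all required services are present, -1 otherwise. */
static int sketch_init(void)
{
	if (sketch_token("ibm,set-eeh-option") == SKETCH_UNKNOWN_SERVICE)
		return -1;
	if (sketch_token("ibm,set-slot-reset") == SKETCH_UNKNOWN_SERVICE)
		return -1;
	if (!sketch_either("ibm,read-slot-reset-state2",
			   "ibm,read-slot-reset-state"))
		return -1;
	return 0;
}
```

Failing early at init time keeps the later EEH operations free of
per-call checks for missing mandatory services.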

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/eeh.c |   12 ++
 arch/powerpc/platforms/pseries/eeh_pseries.c |   55 ++
 2 files changed, 67 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index b0e3fb0..bb6de6c 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -1279,6 +1279,18 @@ void __init eeh_init(void)
 {
struct device_node *phb, *np;
struct eeh_early_enable_info info;
+   int ret;
+
+   /* call platform initialization function */
+   if (!eeh_ops) {
+       pr_warning("%s: Platform EEH operation not found\n",
+           __func__);
+       return;
+   } else if ((ret = eeh_ops->init())) {
+       pr_warning("%s: Failed to call platform init function (%d)\n",
+           __func__, ret);
+       return;
+   }
 
    raw_spin_lock_init(&confirm_error_lock);
    spin_lock_init(&slot_errbuf_lock);
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 61a9050..1a9410a 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -45,6 +45,17 @@
 #include <asm/ppc-pci.h>
 #include <asm/rtas.h>
 
+/* RTAS tokens */
+static int ibm_set_eeh_option;
+static int ibm_set_slot_reset;
+static int ibm_read_slot_reset_state;
+static int ibm_read_slot_reset_state2;
+static int ibm_slot_error_detail;
+static int ibm_get_config_addr_info;
+static int ibm_get_config_addr_info2;
+static int ibm_configure_bridge;
+static int ibm_configure_pe;
+
 /**
  * pseries_eeh_init - EEH platform dependent initialization
  *
@@ -52,6 +63,50 @@
  */
 static int pseries_eeh_init(void)
 {
+   /* figure out EEH RTAS function call tokens */
+   ibm_set_eeh_option          = rtas_token("ibm,set-eeh-option");
+   ibm_set_slot_reset          = rtas_token("ibm,set-slot-reset");
+   ibm_read_slot_reset_state2  = rtas_token("ibm,read-slot-reset-state2");
+   ibm_read_slot_reset_state   = rtas_token("ibm,read-slot-reset-state");
+   ibm_slot_error_detail       = rtas_token("ibm,slot-error-detail");
+   ibm_get_config_addr_info2   = rtas_token("ibm,get-config-addr-info2");
+   ibm_get_config_addr_info    = rtas_token("ibm,get-config-addr-info");
+   ibm_configure_pe            = rtas_token("ibm,configure-pe");
+   ibm_configure_bridge        = rtas_token ("ibm,configure-bridge");
+
+   /* necessary sanity check */
+   if (ibm_set_eeh_option == RTAS_UNKNOWN_SERVICE) {
+       pr_warning("%s: RTAS service ibm,set-eeh-option invalid\n",
+           __func__);
+       return -EINVAL;
+   } else if (ibm_set_slot_reset == RTAS_UNKNOWN_SERVICE) {
+       pr_warning("%s: RTAS service ibm, set-slot-reset invalid\n",
+           __func__);
+       return -EINVAL;
+   } else if (ibm_read_slot_reset_state2 == RTAS_UNKNOWN_SERVICE &&
+          ibm_read_slot_reset_state == RTAS_UNKNOWN_SERVICE) {
+       pr_warning("%s: RTAS service ibm,read-slot-reset-state2 and "
+           "ibm,read-slot-reset-state invalid\n",
+           __func__);
+       return -EINVAL;
+   } else if (ibm_slot_error_detail == RTAS_UNKNOWN_SERVICE) {
+       pr_warning("%s: RTAS service ibm,slot-error-detail invalid\n",
+           __func__);
+       return -EINVAL;
+   } else if (ibm_get_config_addr_info2 == RTAS_UNKNOWN_SERVICE &&
+          ibm_get_config_addr_info == RTAS_UNKNOWN_SERVICE) {
+       pr_warning("%s: RTAS service ibm,get-config-addr-info2 and "
+           "ibm,get-config-addr-info invalid\n",
+           __func__);
+       return -EINVAL;
+   } else if (ibm_configure_pe == RTAS_UNKNOWN_SERVICE &&
+          ibm_configure_bridge == RTAS_UNKNOWN_SERVICE) {
+       pr_warning("%s: RTAS service ibm,configure-pe and "
+           "ibm,configure-bridge invalid\n",
+           __func__);
+       return -EINVAL;
+   }
+
return 0;
 }
 
-- 
1.7.5.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 06/21] pSeries platform EEH PE address retrieval

2012-02-24 Thread Gavin Shan
There are 2 types of addresses used for EEH operations. The first
one is the BDF (Bus/Device/Function) address, which is retrieved
from the reg property of the corresponding FDT node. The other one
is the PE address, which has to be queried from firmware through an
RTAS call on the pSeries platform. When issuing an EEH operation, the
PE address takes precedence over the BDF address.

The patch implements retrieving the PE address for a given BDF
address on the pSeries platform. Also, struct eeh_early_enable_info
has been removed, since the information can be obtained directly from
dn->pdn->phb->buid, which simplifies the code.
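
The precedence rule can be shown with a few lines of C. The struct and
function names below are hypothetical and exist only for this sketch; it
assumes, as the code in this series does, that a PE address of 0 means
no PE address is known.

```c
#include <assert.h>

struct sketch_eeh_addr {
	int config_addr;	/* BDF address from the "reg" property */
	int pe_config_addr;	/* PE address from firmware, 0 if unknown */
};

/* Pick the address to pass to an EEH operation: PE address if known. */
static int sketch_effective_addr(const struct sketch_eeh_addr *a)
{
	assert(a);
	return a->pe_config_addr ? a->pe_config_addr : a->config_addr;
}
```

A device without a PE address falls back to its BDF address, so callers
never need to distinguish the two cases themselves.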

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/eeh.c |   67 +
 arch/powerpc/platforms/pseries/eeh_pseries.c |   46 +-
 2 files changed, 48 insertions(+), 65 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 70a9617..00797e0 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -91,8 +91,6 @@ static int ibm_set_slot_reset;
 static int ibm_read_slot_reset_state;
 static int ibm_read_slot_reset_state2;
 static int ibm_slot_error_detail;
-static int ibm_get_config_addr_info;
-static int ibm_get_config_addr_info2;
 static int ibm_configure_bridge;
 static int ibm_configure_pe;
 
@@ -1048,56 +1046,6 @@ void eeh_configure_bridge(struct pci_dn *pdn)
}
 }
 
-#define EEH_ENABLE 1
-
-struct eeh_early_enable_info {
-   unsigned int buid_hi;
-   unsigned int buid_lo;
-};
-
-/**
- * eeh_get_pe_addr - Retrieve PE address with given BDF address
- * @config_addr: BDF address
- * @info: BUID of the associated PHB
- *
- * There're 2 kinds of addresses existing in EEH core components:
- * BDF address and PE address. Besides, there has dedicated platform
- * dependent function call to retrieve the PE address according to
- * the given BDF address. Further more, we prefer PE address on BDF
- * address in EEH core components.
- */
-static int eeh_get_pe_addr(int config_addr,
-struct eeh_early_enable_info *info)
-{
-   unsigned int rets[3];
-   int ret;
-
-   /* Use latest config-addr token on power6 */
-   if (ibm_get_config_addr_info2 != RTAS_UNKNOWN_SERVICE) {
-       /* Make sure we have a PE in hand */
-       ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
-           config_addr, info->buid_hi, info->buid_lo, 1);
-       if (ret || (rets[0]==0))
-           return 0;
-
-       ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
-           config_addr, info->buid_hi, info->buid_lo, 0);
-       if (ret)
-           return 0;
-       return rets[0];
-   }
-
-   /* Use older config-addr token on power5 */
-   if (ibm_get_config_addr_info != RTAS_UNKNOWN_SERVICE) {
-       ret = rtas_call(ibm_get_config_addr_info, 4, 2, rets,
-           config_addr, info->buid_hi, info->buid_lo, 0);
-       if (ret)
-           return 0;
-       return rets[0];
-   }
-   return 0;
-}
-
 /**
  * eeh_early_enable - Early enable EEH on the indicated device
  * @dn: device node
@@ -1110,7 +1058,6 @@ static int eeh_get_pe_addr(int config_addr,
 static void *eeh_early_enable(struct device_node *dn, void *data)
 {
unsigned int rets[3];
-   struct eeh_early_enable_info *info = data;
int ret;
    const u32 *class_code = of_get_property(dn, "class-code", NULL);
    const u32 *vendor_id = of_get_property(dn, "vendor-id", NULL);
@@ -1155,7 +1102,7 @@ static void *eeh_early_enable(struct device_node *dn, void *data)
        /* If the newer, better, ibm,get-config-addr-info is supported,
         * then use that instead.
         */
-       pdn->eeh_pe_config_addr = eeh_get_pe_addr(pdn->eeh_config_addr, info);
+       pdn->eeh_pe_config_addr = eeh_ops->get_pe_addr(dn);
 
/* Some older systems (Power4) allow the
 * ibm,set-eeh-option call to succeed even on nodes
@@ -1264,7 +1211,6 @@ int __exit eeh_ops_unregister(const char *name)
 void __init eeh_init(void)
 {
struct device_node *phb, *np;
-   struct eeh_early_enable_info info;
int ret;
 
/* call platform initialization function */
@@ -1289,8 +1235,6 @@ void __init eeh_init(void)
    ibm_read_slot_reset_state2 = rtas_token("ibm,read-slot-reset-state2");
    ibm_read_slot_reset_state = rtas_token("ibm,read-slot-reset-state");
    ibm_slot_error_detail = rtas_token("ibm,slot-error-detail");
-   ibm_get_config_addr_info = rtas_token("ibm,get-config-addr-info");
-   ibm_get_config_addr_info2 = rtas_token("ibm,get-config-addr-info2");
    ibm_configure_bridge = rtas_token("ibm,configure-bridge");

[PATCH 01/21] Cleanup on comments of EEH core

2012-02-24 Thread Gavin Shan
EEH has been implemented on the pSeries platform, but the original
code is somewhat untidy. The patch cleans up the current EEH
implementation so that it reads more cleanly.

* Duplicated comments have been removed from the corresponding
  header files.
* Comments have been reorganized so that they read more cleanly.
* The leading comments of functions have been adjusted a little
  so that the output of make pdfdocs is more uniform.
* Function definitions and calls use the unified format xxx().
  That means the format xxx () has been replaced by xxx().
* There are multiple functions implemented for resetting a PE.
  Those functions have been moved around so that they are adjacent
  to each other, which reflects their relationship.

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh.h   |   26 +--
 arch/powerpc/include/asm/ppc-pci.h   |   71 +--
 arch/powerpc/platforms/pseries/eeh.c |  400 +++---
 3 files changed, 276 insertions(+), 221 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 66ea9b8..2328877 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -1,6 +1,6 @@
 /*
- * eeh.h
  * Copyright (C) 2001  Dave Engebretsen & Todd Inglett IBM Corporation.
+ * Copyright 2001-2012 IBM Corporation.
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -40,8 +40,10 @@ extern int eeh_subsystem_enabled;
 #define EEH_MODE_RECOVERING    (1<<3)
 #define EEH_MODE_IRQ_DISABLED  (1<<4)
 
-/* Max number of EEH freezes allowed before we consider the device
- * to be permanently disabled. */
+/*
+ * Max number of EEH freezes allowed before we consider the device
+ * to be permanently disabled.
+ */
 #define EEH_MAX_ALLOWED_FREEZES 5
 
 void __init eeh_init(void);
@@ -49,26 +51,8 @@ unsigned long eeh_check_failure(const volatile void __iomem *token,
unsigned long val);
 int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev);
 void __init pci_addr_cache_build(void);
-
-/**
- * eeh_add_device_early
- * eeh_add_device_late
- *
- * Perform eeh initialization for devices added after boot.
- * Call eeh_add_device_early before doing any i/o to the
- * device (including config space i/o).  Call eeh_add_device_late
- * to finish the eeh setup for this device.
- */
 void eeh_add_device_tree_early(struct device_node *);
 void eeh_add_device_tree_late(struct pci_bus *);
-
-/**
- * eeh_remove_device_recursive - undo EEH for device & children.
- * @dev: pci device to be removed
- *
- * As above, this removes the device; it also removes child
- * pci devices as well.
- */
 void eeh_remove_bus_device(struct pci_dev *);
 
 /**
diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 6d42297..221d82f 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -47,92 +47,27 @@ extern int rtas_setup_phb(struct pci_controller *phb);
 
 extern unsigned long pci_probe_only;
 
-/*  EEH internal-use-only related routines  */
 #ifdef CONFIG_EEH
 
 void pci_addr_cache_insert_device(struct pci_dev *dev);
 void pci_addr_cache_remove_device(struct pci_dev *dev);
 void pci_addr_cache_build(void);
 struct pci_dev *pci_get_device_by_addr(unsigned long addr);
-
-/**
- * eeh_slot_error_detail -- record and EEH error condition to the log
- * @pdn:  pci device node
- * @severity: EEH_LOG_TEMP_FAILURE or EEH_LOG_PERM_FAILURE
- *
- * Obtains the EEH error details from the RTAS subsystem,
- * and then logs these details with the RTAS error log system.
- */
 #define EEH_LOG_TEMP_FAILURE 1
 #define EEH_LOG_PERM_FAILURE 2
 void eeh_slot_error_detail (struct pci_dn *pdn, int severity);
-
-/**
- * rtas_pci_enable - enable IO transfers for this slot
- * @pdn:   pci device node
- * @function:  either EEH_THAW_MMIO or EEH_THAW_DMA 
- *
- * Enable I/O transfers to this slot 
- */
 #define EEH_THAW_MMIO 2
 #define EEH_THAW_DMA  3
 int rtas_pci_enable(struct pci_dn *pdn, int function);
-
-/**
- * rtas_set_slot_reset -- unfreeze a frozen slot
- * @pdn:   pci device node
- *
- * Clear the EEH-frozen condition on a slot.  This routine
- * does this by asserting the PCI #RST line for 1/8th of
- * a second; this routine will sleep while the adapter is
- * being reset.
- *
- * Returns a non-zero value if the reset failed.
- */
 int rtas_set_slot_reset (struct pci_dn *);
 int eeh_wait_for_slot_status(struct pci_dn *pdn, int max_wait_msecs);
-
-/** 
- * eeh_restore_bars - Restore device configuration info.
- * @pdn:   pci device node
- *
- * A reset of a PCI device will clear out its config space.
- * This routines will restore the config space for this
- * device, and is children, to values previously 

[PATCH 19/21] Replace pci_dn with eeh_dev for EEH on pSeries

2012-02-24 Thread Gavin Shan
The pci_dn has been replaced with eeh_dev. To stay consistent with
that change, the EEH platform implementation on pSeries also needs
to be adjusted a little so that it depends on eeh_dev instead of
pci_dn.

The patch replaces pci_dn with eeh_dev there. The corresponding
information is retrieved from eeh_dev instead of pci_dn.

Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/eeh_pseries.c |   96 +-
 1 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 4ed06b2..a3345c2 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -144,11 +144,11 @@ static int pseries_eeh_init(void)
 static int pseries_eeh_set_option(struct device_node *dn, int option)
 {
int ret = 0;
-   struct pci_dn *pdn;
+   struct eeh_dev *edev;
const u32 *reg;
int config_addr;
 
-   pdn = PCI_DN(dn);
+   edev = OF_NODE_TO_EEH_DEV(dn);
 
/*
 * When we're enabling or disabling EEH functioality on
@@ -165,9 +165,9 @@ static int pseries_eeh_set_option(struct device_node *dn, int option)
 
case EEH_OPT_THAW_MMIO:
case EEH_OPT_THAW_DMA:
-       config_addr = pdn->eeh_config_addr;
-       if (pdn->eeh_pe_config_addr)
-           config_addr = pdn->eeh_pe_config_addr;
+       config_addr = edev->config_addr;
+       if (edev->pe_config_addr)
+           config_addr = edev->pe_config_addr;
break;
 
default:
@@ -177,8 +177,8 @@ static int pseries_eeh_set_option(struct device_node *dn, int option)
}
 
ret = rtas_call(ibm_set_eeh_option, 4, 1, NULL,
-           config_addr, BUID_HI(pdn->phb->buid),
-           BUID_LO(pdn->phb->buid), option);
+           config_addr, BUID_HI(edev->phb->buid),
+           BUID_LO(edev->phb->buid), option);
 
return ret;
 }
@@ -198,11 +198,11 @@ static int pseries_eeh_set_option(struct device_node *dn, int option)
  */
 static int pseries_eeh_get_pe_addr(struct device_node *dn)
 {
-   struct pci_dn *pdn;
+   struct eeh_dev *edev;
int ret = 0;
int rets[3];
 
-   pdn = PCI_DN(dn);
+   edev = OF_NODE_TO_EEH_DEV(dn);
 
if (ibm_get_config_addr_info2 != RTAS_UNKNOWN_SERVICE) {
/*
@@ -211,15 +211,15 @@ static int pseries_eeh_get_pe_addr(struct device_node *dn)
 * meaningless.
 */
ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
-               pdn->eeh_config_addr, BUID_HI(pdn->phb->buid),
-               BUID_LO(pdn->phb->buid), 1);
+               edev->config_addr, BUID_HI(edev->phb->buid),
+               BUID_LO(edev->phb->buid), 1);
if (ret || (rets[0] == 0))
return 0;
 
/* Retrieve the associated PE config address */
ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
-               pdn->eeh_config_addr, BUID_HI(pdn->phb->buid),
-               BUID_LO(pdn->phb->buid), 0);
+               edev->config_addr, BUID_HI(edev->phb->buid),
+               BUID_LO(edev->phb->buid), 0);
if (ret) {
            pr_warning("%s: Failed to get PE address for %s\n",
                __func__, dn->full_name);
@@ -231,8 +231,8 @@ static int pseries_eeh_get_pe_addr(struct device_node *dn)
 
if (ibm_get_config_addr_info != RTAS_UNKNOWN_SERVICE) {
ret = rtas_call(ibm_get_config_addr_info, 4, 2, rets,
-               pdn->eeh_config_addr, BUID_HI(pdn->phb->buid),
-               BUID_LO(pdn->phb->buid), 0);
+               edev->config_addr, BUID_HI(edev->phb->buid),
+               BUID_LO(edev->phb->buid), 0);
if (ret) {
            pr_warning("%s: Failed to get PE address for %s\n",
                __func__, dn->full_name);
@@ -260,28 +260,28 @@ static int pseries_eeh_get_pe_addr(struct device_node *dn)
  */
 static int pseries_eeh_get_state(struct device_node *dn, int *state)
 {
-   struct pci_dn *pdn;
+   struct eeh_dev *edev;
int config_addr;
int ret;
int rets[4];
int result;
 
/* Figure out PE config address if possible */
-   pdn = PCI_DN(dn);
-   config_addr = pdn->eeh_config_addr;
-   if (pdn->eeh_pe_config_addr)
-       config_addr = pdn->eeh_pe_config_addr;
+   edev = OF_NODE_TO_EEH_DEV(dn);
+   config_addr = edev->config_addr;
+   if (edev->pe_config_addr)
+       config_addr = edev->pe_config_addr;
 
if (ibm_read_slot_reset_state2 != 

[PATCH 08/23] PCI, powerpc: Register busn_res for root buses

2012-02-24 Thread Yinghai Lu
Signed-off-by: Yinghai Lu ying...@kernel.org
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/pci-common.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index cce98d7..501f29b 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1732,6 +1732,8 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
    bus->secondary = hose->first_busno;
    hose->bus = bus;
 
+   pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_busno);
+
/* Get probe mode and perform scan */
mode = PCI_PROBE_NORMAL;
    if (node && ppc_md.pci_probe_mode)
@@ -1742,8 +1744,11 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
of_scan_bus(node, bus);
}
 
-   if (mode == PCI_PROBE_NORMAL)
+   if (mode == PCI_PROBE_NORMAL) {
+   pci_bus_update_busn_res_end(bus, 255);
        hose->last_busno = bus->subordinate = pci_scan_child_bus(bus);
+       pci_bus_update_busn_res_end(bus, bus->subordinate);
+   }
 
/* Platform gets a chance to do some global fixups before
 * we proceed to resource allocation
-- 
1.7.7


Re: [PATCH 14/21] Introduce EEH device

2012-02-24 Thread Stephen Rothwell
Hi Gavin,

On Fri, 24 Feb 2012 17:38:11 +0800 Gavin Shan sha...@linux.vnet.ibm.com wrote:

> +#define EEH_DEV_TO_OF_NODE(edev) (edev->dn)
> +#define EEH_DEV_TO_PCI_DEV(edev) (edev->pdev)
> +#define OF_NODE_TO_EEH_DEV(dn)   ((struct eeh_dev *)(dn->edev))
> +#define PCI_DEV_TO_EEH_DEV(pdev) ((struct eeh_dev *)(pdev->dev.archdata.edev))

You need to put parentheses around the use of macro arguments ... or,
even better, make these into static inline accessor functions (with
lower case names).
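
Both suggestions can be illustrated with a small standalone sketch. The
struct definitions are cut-down stand-ins and the sketch_ names are
hypothetical; this is not the code that was eventually merged.

```c
#include <assert.h>
#include <stddef.h>

struct eeh_dev;

struct device_node {
	struct eeh_dev *edev;
};

struct eeh_dev {
	struct device_node *dn;
};

/*
 * Parenthesised macro argument: the argument may be any expression.
 * Without the inner parentheses, an argument such as "edevs + 1"
 * would expand to "edevs + 1->dn", which does not compile.
 */
#define SKETCH_EEH_DEV_TO_OF_NODE(edev)	((edev)->dn)

/* Inline accessor: same effect, and the argument type is checked. */
static inline struct eeh_dev *sketch_of_node_to_eeh_dev(struct device_node *dn)
{
	assert(dn);
	return dn->edev;
}
```

The inline function additionally rejects a pointer of the wrong type at
compile time, which the macro form cannot do.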

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/



Re: [PATCH 14/21] Introduce EEH device

2012-02-24 Thread Stephen Rothwell
Hi Gavin,

On Fri, 24 Feb 2012 17:38:11 +0800 Gavin Shan sha...@linux.vnet.ibm.com wrote:

> diff --git a/arch/powerpc/include/asm/device.h b/arch/powerpc/include/asm/device.h
> index d57c08a..4668344 100644
> --- a/arch/powerpc/include/asm/device.h
> +++ b/arch/powerpc/include/asm/device.h
> @@ -31,6 +31,9 @@ struct dev_archdata {
>  #ifdef CONFIG_SWIOTLB
>     dma_addr_t  max_direct_dma_addr;
>  #endif
> +#ifdef CONFIG_EEH
> +   void        *edev;
> +#endif
>  };
>
>  struct pdev_archdata {
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index ad8f318..1310971 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> +#define OF_NODE_TO_EEH_DEV(dn)   ((struct eeh_dev *)(dn->edev))
> +#define PCI_DEV_TO_EEH_DEV(pdev) ((struct eeh_dev *)(pdev->dev.archdata.edev))

If the edev fields of dev_archdata and device_node are always going to be
struct eeh_dev *, why not declare them as such and avoid the casting?
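
A forward declaration is enough to declare a typed pointer, so the
header holding the field does not need the full struct definition. The
sketch below assumes cut-down, hypothetical structs (sketch_dev_archdata
stands in for dev_archdata) purely to show the idea.

```c
#include <assert.h>
#include <stddef.h>

struct eeh_dev;			/* forward declaration suffices for a pointer */

struct sketch_dev_archdata {
	struct eeh_dev *edev;	/* typed field, no cast at the use site */
};

/* The full definition can live in a different header. */
struct eeh_dev {
	int mode;
};

static int sketch_mode(const struct sketch_dev_archdata *ad)
{
	assert(ad && ad->edev);
	return ad->edev->mode;	/* no (struct eeh_dev *) cast needed */
}
```

With a typed field, assigning a pointer of an unrelated type draws a
compiler diagnostic, whereas a void * field accepts it silently.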

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/



Re: [PATCH 20/21] Introduce struct eeh_stats for EEH

2012-02-24 Thread Stephen Rothwell
Hi Gavin,

On Fri, 24 Feb 2012 17:38:17 +0800 Gavin Shan sha...@linux.vnet.ibm.com wrote:

> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index 1310971..226c9a5 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -98,6 +98,21 @@ struct eeh_ops {
>     int (*configure_bridge)(struct device_node *dn);
>  };
>
> +/*
> + * The struct is used to maintain the EEH global statistic
> + * information. Besides, the EEH global statistics will be
> + * exported to user space through procfs
> + */
> +struct eeh_stats {
> +   unsigned long no_device;        /* PCI device not found     */
> +   unsigned long no_dn;            /* OF node not found        */
> +   unsigned long no_cfg_addr;      /* Config address not found */
> +   unsigned long ignored_check;    /* EEH check skipped        */
> +   unsigned long total_mmio_ffs;   /* Total EEH checks         */
> +   unsigned long false_positives;  /* Unnecessary EEH checks   */
> +   unsigned long slot_resets;      /* PE reset                 */
> +};

If this is used in only one place, there is not much point in putting it
in a header file.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/



RE: [PATCH 20/21] Introduce struct eeh_stats for EEH

2012-02-24 Thread David Laight
 
> +/*
> + * The struct is used to maintain the EEH global statistic
> + * information. Besides, the EEH global statistics will be
> + * exported to user space through procfs
> + */
> +struct eeh_stats {
> +   unsigned long no_device;        /* PCI device not found     */
> +   unsigned long no_dn;            /* OF node not found        */
> +   unsigned long no_cfg_addr;      /* Config address not found */
> +   unsigned long ignored_check;    /* EEH check skipped        */
> +   unsigned long total_mmio_ffs;   /* Total EEH checks         */
> +   unsigned long false_positives;  /* Unnecessary EEH checks   */
> +   unsigned long slot_resets;      /* PE reset                 */
> +};

Why 'unsigned long'? Surely either 'unsigned int'
or a fixed-width type.
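
For illustration, a fixed-width variant of the counters might look like
the sketch below. 'unsigned long' is 32 bits wide on 32-bit powerpc and
64 bits wide on 64-bit powerpc, so the counter range depends on the
build, while a fixed-width type keeps it identical everywhere. The
struct and function names are hypothetical, and user-space uint64_t is
used here where kernel code would normally use u64.

```c
#include <assert.h>
#include <stdint.h>

/* Same counters as struct eeh_stats, with an explicit 64-bit width. */
struct sketch_eeh_stats {
	uint64_t no_device;		/* PCI device not found     */
	uint64_t no_dn;			/* OF node not found        */
	uint64_t no_cfg_addr;		/* Config address not found */
	uint64_t ignored_check;		/* EEH check skipped        */
	uint64_t total_mmio_ffs;	/* Total EEH checks         */
	uint64_t false_positives;	/* Unnecessary EEH checks   */
	uint64_t slot_resets;		/* PE reset                 */
};

/* Record one PE reset and return the updated count. */
static uint64_t sketch_count_slot_reset(struct sketch_eeh_stats *stats)
{
	assert(stats);
	return ++stats->slot_resets;
}
```

Whether 64 bits or a plain 'unsigned int' is the better fit depends on
how large the counters are expected to grow; the sketch only shows that
the width no longer changes with the ABI.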

David



[PATCH 01/37] powerpc/booke: Set CPU_FTR_DEBUG_LVL_EXC on 32-bit

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

Currently 32-bit only cares about this for choice of exception
vector, which is done in core-specific code.  However, KVM will
want to distinguish as well.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/cputable.h |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index ad55a1c..6a034a2 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -376,7 +376,8 @@ extern const char *powerpc_base_platform;
 #define CPU_FTRS_47X   (CPU_FTRS_440x6)
 #define CPU_FTRS_E200  (CPU_FTR_USE_TB | CPU_FTR_SPE_COMP | \
CPU_FTR_NODSISRALIGN | CPU_FTR_COHERENT_ICACHE | \
-   CPU_FTR_UNIFIED_ID_CACHE | CPU_FTR_NOEXECUTE)
+   CPU_FTR_UNIFIED_ID_CACHE | CPU_FTR_NOEXECUTE | \
+   CPU_FTR_DEBUG_LVL_EXC)
 #define CPU_FTRS_E500  (CPU_FTR_MAYBE_CAN_DOZE | CPU_FTR_USE_TB | \
CPU_FTR_SPE_COMP | CPU_FTR_MAYBE_CAN_NAP | CPU_FTR_NODSISRALIGN | \
CPU_FTR_NOEXECUTE)
@@ -385,7 +386,7 @@ extern const char *powerpc_base_platform;
CPU_FTR_NODSISRALIGN | CPU_FTR_NOEXECUTE)
 #define CPU_FTRS_E500MC(CPU_FTR_USE_TB | CPU_FTR_NODSISRALIGN | \
CPU_FTR_L2CSR | CPU_FTR_LWSYNC | CPU_FTR_NOEXECUTE | \
-   CPU_FTR_DBELL)
+   CPU_FTR_DBELL | CPU_FTR_DEBUG_LVL_EXC)
 #define CPU_FTRS_E5500 (CPU_FTR_USE_TB | CPU_FTR_NODSISRALIGN | \
CPU_FTR_L2CSR | CPU_FTR_LWSYNC | CPU_FTR_NOEXECUTE | \
CPU_FTR_DBELL | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
-- 
1.6.0.2



[PATCH 05/37] KVM: PPC: booke: Move vm core init/destroy out of booke.c

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

e500mc will want to do lpid allocation/deallocation here.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/44x.c   |9 +
 arch/powerpc/kvm/booke.c |9 -
 arch/powerpc/kvm/e500.c  |9 +
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c
index 879a1a7..50e7dbc 100644
--- a/arch/powerpc/kvm/44x.c
+++ b/arch/powerpc/kvm/44x.c
@@ -163,6 +163,15 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
kmem_cache_free(kvm_vcpu_cache, vcpu_44x);
 }
 
+int kvmppc_core_init_vm(struct kvm *kvm)
+{
+   return 0;
+}
+
+void kvmppc_core_destroy_vm(struct kvm *kvm)
+{
+}
+
 static int __init kvmppc_44x_init(void)
 {
int r;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index a2456c7..2ee9bae 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -932,15 +932,6 @@ void kvmppc_core_commit_memory_region(struct kvm *kvm,
 {
 }
 
-int kvmppc_core_init_vm(struct kvm *kvm)
-{
-   return 0;
-}
-
-void kvmppc_core_destroy_vm(struct kvm *kvm)
-{
-}
-
 void kvmppc_set_tcr(struct kvm_vcpu *vcpu, u32 new_tcr)
 {
vcpu->arch.tcr = new_tcr;
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index 2d5fe04..ac6c9ae 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -226,6 +226,15 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
kmem_cache_free(kvm_vcpu_cache, vcpu_e500);
 }
 
+int kvmppc_core_init_vm(struct kvm *kvm)
+{
+   return 0;
+}
+
+void kvmppc_core_destroy_vm(struct kvm *kvm)
+{
+}
+
 static int __init kvmppc_e500_init(void)
 {
int r, i;
-- 
1.6.0.2



[PATCH 04/37] KVM: PPC: booke: add booke-level vcpu load/put

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

This gives us a place to put load/put actions that correspond to
code that is booke-specific but not specific to a particular core.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/44x.c   |3 +++
 arch/powerpc/kvm/booke.c |8 
 arch/powerpc/kvm/booke.h |3 +++
 arch/powerpc/kvm/e500.c  |3 +++
 4 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c
index 7b612a7..879a1a7 100644
--- a/arch/powerpc/kvm/44x.c
+++ b/arch/powerpc/kvm/44x.c
@@ -29,15 +29,18 @@
 #include <asm/kvm_ppc.h>
 
 #include "44x_tlb.h"
+#include "booke.h"
 
 void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
+   kvmppc_booke_vcpu_load(vcpu, cpu);
kvmppc_44x_tlb_load(vcpu);
 }
 
 void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
 {
kvmppc_44x_tlb_put(vcpu);
+   kvmppc_booke_vcpu_put(vcpu);
 }
 
 int kvmppc_core_check_processor_compat(void)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ee9e1ee..a2456c7 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -968,6 +968,14 @@ void kvmppc_decrementer_func(unsigned long data)
kvmppc_set_tsr_bits(vcpu, TSR_DIS);
 }
 
+void kvmppc_booke_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+{
+}
+
+void kvmppc_booke_vcpu_put(struct kvm_vcpu *vcpu)
+{
+}
+
 int __init kvmppc_booke_init(void)
 {
unsigned long ivor[16];
diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
index 2fe2027..05d1d99 100644
--- a/arch/powerpc/kvm/booke.h
+++ b/arch/powerpc/kvm/booke.h
@@ -71,4 +71,7 @@ void kvmppc_save_guest_spe(struct kvm_vcpu *vcpu);
 /* high-level function, manages flags, host state */
 void kvmppc_vcpu_disable_spe(struct kvm_vcpu *vcpu);
 
+void kvmppc_booke_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
+void kvmppc_booke_vcpu_put(struct kvm_vcpu *vcpu);
+
 #endif /* __KVM_BOOKE_H__ */
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index ddcd896..2d5fe04 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -36,6 +36,7 @@ void kvmppc_core_load_guest_debugstate(struct kvm_vcpu *vcpu)
 
 void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
+   kvmppc_booke_vcpu_load(vcpu, cpu);
kvmppc_e500_tlb_load(vcpu, cpu);
 }
 
@@ -47,6 +48,8 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
if (vcpu->arch.shadow_msr & MSR_SPE)
kvmppc_vcpu_disable_spe(vcpu);
 #endif
+
+   kvmppc_booke_vcpu_put(vcpu);
 }
 
 int kvmppc_core_check_processor_compat(void)
-- 
1.6.0.2
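
The nesting introduced by this patch can be sketched in plain C: on load the
booke-level hook runs before the core-specific TLB load, and on put the order
is reversed. All names below are hypothetical stand-ins; the trace buffer
exists only to make the call order observable in this sketch.

```c
#include <assert.h>
#include <string.h>

static char trace[64];

/* Append a marker to the trace so the call order can be inspected. */
static void note(const char *s)
{
	assert(strlen(trace) + strlen(s) < sizeof(trace));
	strcat(trace, s);
}

/* Stand-ins for the booke-level hooks (empty in the patch itself). */
static void sketch_booke_vcpu_load(void) { note("B+"); }
static void sketch_booke_vcpu_put(void)  { note("B-"); }

/* Stand-ins for the core-specific TLB load/put. */
static void sketch_core_tlb_load(void) { note("C+"); }
static void sketch_core_tlb_put(void)  { note("C-"); }

/* Load: generic booke state first, then core-specific state. */
static void sketch_core_vcpu_load(void)
{
	sketch_booke_vcpu_load();
	sketch_core_tlb_load();
}

/* Put: tear down in the opposite order. */
static void sketch_core_vcpu_put(void)
{
	sketch_core_tlb_put();
	sketch_booke_vcpu_put();
}
```

Calling sketch_core_vcpu_load() followed by sketch_core_vcpu_put() leaves
"B+C+C-B-" in the trace, which is the symmetric ordering used in 44x.c and
e500.c above.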



[PATCH 00/37] KVM: PPC: e500mc support v2

2012-02-24 Thread Alexander Graf
This is Scott's e500mc RFC patch set, rebased, stripped of its pt_regs
parts, and fixed for bisectability. On top of that, I addressed all the
comments I had on the code, as well as the FIXMEs in his code.

I verified that this patch set works just fine on e500mc and doesn't
break e500v2, so I would say it's good to go as it is, unless someone
has strong objections to how things are done. Everything hereafter
I would prefer to do based on a working upstream version rather than
a downstream fork, as that way exposure is a lot higher.

v1 -> v2:

  - ESR -> GESR
  - introduce and use constants for doorbell
  - drop e500mc ifdefs for doorbell
  - fix whitespace
  - use explicit preempt counts in inst fixup
  - rework e500v2 kconfig patch
  - add patches 31-37

Alexander Graf (22):
  KVM: PPC: e500mc: Add doorbell emulation support
  KVM: PPC: e500mc: implicitly set MSR_GS
  KVM: PPC: e500mc: Move r1/r2 restoration very early
  KVM: PPC: e500mc: add load inst fixup
  KVM: PPC: rename CONFIG_KVM_E500 -> CONFIG_KVM_E500V2
  KVM: PPC: make e500v2 kvm and e500mc cpu mutually exclusive
  KVM: PPC: booke: remove leftover debugging
  KVM: PPC: booke: deliver program int on emulation failure
  KVM: PPC: booke: rework rescheduling checks
  KVM: PPC: booke: BOOKE_IRQPRIO_MAX is n+1
  KVM: PPC: bookehv: fix exit timing
  KVM: PPC: bookehv: remove negation for CONFIG_64BIT
  KVM: PPC: bookehv: remove SET_VCPU
  KVM: PPC: bookehv: disable MAS register updates early
  KVM: PPC: bookehv: add comment about shadow_msr
  KVM: PPC: booke: Readd debug abort code for machine check
  KVM: PPC: booke: add GS documentation for program interrupt
  KVM: PPC: bookehv: remove unused code
  KVM: PPC: e500: fix typo in tlb code
  KVM: PPC: booke: Support perfmon interrupts
  KVM: PPC: booke: expose guest registers on irq reinject
  KVM: PPC: booke: Reinject performance monitor interrupts

Scott Wood (15):
  powerpc/booke: Set CPU_FTR_DEBUG_LVL_EXC on 32-bit
  powerpc/e500: split CPU_FTRS_ALWAYS/CPU_FTRS_POSSIBLE
  KVM: PPC: factor out lpid allocator from book3s_64_mmu_hv
  KVM: PPC: booke: add booke-level vcpu load/put
  KVM: PPC: booke: Move vm core init/destroy out of booke.c
  KVM: PPC: e500: rename e500_tlb.h to e500.h
  KVM: PPC: e500: merge asm/kvm_e500.h into arch/powerpc/kvm/e500.h
  KVM: PPC: e500: clean up arch/powerpc/kvm/e500.h
  KVM: PPC: e500: refactor core-specific TLB code
  KVM: PPC: e500: Track TLB1 entries with a bitmap
  KVM: PPC: e500: emulate tlbilx
  powerpc/booke: Provide exception macros with interrupt name
  KVM: PPC: booke: category E.HV (GS-mode) support
  KVM: PPC: booke: standard PPC floating point support
  KVM: PPC: e500mc support

 arch/powerpc/include/asm/cputable.h |   21 +-
 arch/powerpc/include/asm/dbell.h|3 +
 arch/powerpc/include/asm/hw_irq.h   |1 +
 arch/powerpc/include/asm/kvm.h  |1 +
 arch/powerpc/include/asm/kvm_asm.h  |8 +
 arch/powerpc/include/asm/kvm_book3s.h   |3 +
 arch/powerpc/include/asm/kvm_booke.h|3 +
 arch/powerpc/include/asm/kvm_booke_hv_asm.h |   49 +++
 arch/powerpc/include/asm/kvm_e500.h |   96 -
 arch/powerpc/include/asm/kvm_host.h |   22 +-
 arch/powerpc/include/asm/kvm_ppc.h  |   10 +-
 arch/powerpc/include/asm/mmu-book3e.h   |6 +
 arch/powerpc/include/asm/processor.h|3 +
 arch/powerpc/include/asm/reg.h  |2 +
 arch/powerpc/include/asm/reg_booke.h|   34 ++
 arch/powerpc/include/asm/system.h   |1 +
 arch/powerpc/kernel/asm-offsets.c   |   15 +-
 arch/powerpc/kernel/cpu_setup_fsl_booke.S   |1 +
 arch/powerpc/kernel/head_44x.S  |   23 +-
 arch/powerpc/kernel/head_booke.h|   69 ++-
 arch/powerpc/kernel/head_fsl_booke.S|   98 -
 arch/powerpc/kvm/44x.c  |   12 +
 arch/powerpc/kvm/Kconfig|   28 +-
 arch/powerpc/kvm/Makefile   |   15 +-
 arch/powerpc/kvm/book3s.c   |4 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c |   26 +-
 arch/powerpc/kvm/booke.c|  470 +
 arch/powerpc/kvm/booke.h|   57 +++-
 arch/powerpc/kvm/booke_emulate.c|   23 +-
 arch/powerpc/kvm/bookehv_interrupts.S   |  606 +++
 arch/powerpc/kvm/e500.c |  372 ++---
 arch/powerpc/kvm/e500.h |  302 +
 arch/powerpc/kvm/e500_emulate.c |  110 +-
 arch/powerpc/kvm/e500_tlb.c |  588 +++---
 arch/powerpc/kvm/e500_tlb.h |  174 
 arch/powerpc/kvm/e500mc.c   |  342 +++
 arch/powerpc/kvm/powerpc.c  |   47 ++-
 arch/powerpc/kvm/timing.h   |6 +
 38 files changed, 2797 insertions(+), 854 deletions(-)
 create mode 100644 

[PATCH 03/37] KVM: PPC: factor out lpid allocator from book3s_64_mmu_hv

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

We'll use it on e500mc as well.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |3 ++
 arch/powerpc/include/asm/kvm_booke.h  |3 ++
 arch/powerpc/include/asm/kvm_ppc.h|5 
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |   26 +---
 arch/powerpc/kvm/powerpc.c|   34 +
 5 files changed, 55 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index aa795cc..046041f 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -452,4 +452,7 @@ static inline bool kvmppc_critical_section(struct kvm_vcpu *vcpu)
 
 #define INS_DCBZ   0x7c0007ec
 
+/* LPIDs we support with this build -- runtime limit may be lower */
+#define KVMPPC_NR_LPIDS   (LPID_RSVD + 1)
+
 #endif /* __ASM_KVM_BOOK3S_H__ */
diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
index a90e091..b7cd335 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -23,6 +23,9 @@
 #include <linux/types.h>
 #include <linux/kvm_host.h>
 
+/* LPIDs we support with this build -- runtime limit may be lower */
+#define KVMPPC_NR_LPIDS   64
+
 static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val)
 {
vcpu->arch.gpr[num] = val;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 9d6dee0..731e920 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -204,4 +204,9 @@ int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
 int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
 struct kvm_dirty_tlb *cfg);
 
+long kvmppc_alloc_lpid(void);
+void kvmppc_claim_lpid(long lpid);
+void kvmppc_free_lpid(long lpid);
+void kvmppc_init_lpid(unsigned long nr_lpids);
+
 #endif /* __POWERPC_KVM_PPC_H__ */
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index ddc485a..d031ce1 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -36,13 +36,11 @@
 
 /* POWER7 has 10-bit LPIDs, PPC970 has 6-bit LPIDs */
 #define MAX_LPID_970   63
-#define NR_LPIDS   (LPID_RSVD + 1)
-unsigned long lpid_inuse[BITS_TO_LONGS(NR_LPIDS)];
 
 long kvmppc_alloc_hpt(struct kvm *kvm)
 {
unsigned long hpt;
-   unsigned long lpid;
+   long lpid;
struct revmap_entry *rev;
struct kvmppc_linear_info *li;
 
@@ -72,14 +70,9 @@ long kvmppc_alloc_hpt(struct kvm *kvm)
}
kvm->arch.revmap = rev;
 
-   /* Allocate the guest's logical partition ID */
-   do {
-   lpid = find_first_zero_bit(lpid_inuse, NR_LPIDS);
-   if (lpid >= NR_LPIDS) {
-   pr_err("kvm_alloc_hpt: No LPIDs free\n");
-   goto out_freeboth;
-   }
-   } while (test_and_set_bit(lpid, lpid_inuse));
+   lpid = kvmppc_alloc_lpid();
+   if (lpid < 0)
+   goto out_freeboth;
 
kvm->arch.sdr1 = __pa(hpt) | (HPT_ORDER - 18);
kvm->arch.lpid = lpid;
@@ -96,7 +89,7 @@ long kvmppc_alloc_hpt(struct kvm *kvm)
 
 void kvmppc_free_hpt(struct kvm *kvm)
 {
-   clear_bit(kvm->arch.lpid, lpid_inuse);
+   kvmppc_free_lpid(kvm->arch.lpid);
vfree(kvm->arch.revmap);
if (kvm->arch.hpt_li)
kvm_release_hpt(kvm->arch.hpt_li);
@@ -171,8 +164,7 @@ int kvmppc_mmu_hv_init(void)
if (!cpu_has_feature(CPU_FTR_HVMODE))
return -EINVAL;
 
-   memset(lpid_inuse, 0, sizeof(lpid_inuse));
-
+   /* POWER7 has 10-bit LPIDs, PPC970 and e500mc have 6-bit LPIDs */
if (cpu_has_feature(CPU_FTR_ARCH_206)) {
host_lpid = mfspr(SPRN_LPID);   /* POWER7 */
rsvd_lpid = LPID_RSVD;
@@ -181,9 +173,11 @@ int kvmppc_mmu_hv_init(void)
rsvd_lpid = MAX_LPID_970;
}
 
-   set_bit(host_lpid, lpid_inuse);
+   kvmppc_init_lpid(rsvd_lpid + 1);
+
+   kvmppc_claim_lpid(host_lpid);
/* rsvd_lpid is reserved for use in partition switching */
-   set_bit(rsvd_lpid, lpid_inuse);
+   kvmppc_claim_lpid(rsvd_lpid);
 
return 0;
 }
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 00d7e34..9806ea5 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -808,6 +808,40 @@ out:
return r;
 }
 
+static unsigned long lpid_inuse[BITS_TO_LONGS(KVMPPC_NR_LPIDS)];
+static unsigned long nr_lpids;
+
+long kvmppc_alloc_lpid(void)
+{
+   long lpid;
+
+   do {
+   lpid = find_first_zero_bit(lpid_inuse, KVMPPC_NR_LPIDS);
+   if (lpid >= nr_lpids) {
+   pr_err("%s: No LPIDs 
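
Since the patch is cut off above, here is a minimal userspace sketch of the
kind of bitmap allocator it factors out. It is single-threaded and therefore
does not reproduce the atomic test_and_set_bit() retry loop seen in the
removed book3s code; all sketch_* names, SKETCH_NR_LPIDS and the -1 error
return are assumptions for illustration, not the kernel implementation.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define SKETCH_NR_LPIDS	64	/* build-time maximum (assumed value) */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS	((SKETCH_NR_LPIDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

static unsigned long lpid_inuse[NR_WORDS];
static unsigned long nr_lpids;	/* runtime limit, may be below the maximum */

/* Reset the bitmap and set the runtime limit. */
static void sketch_init_lpid(unsigned long nr)
{
	assert(nr <= SKETCH_NR_LPIDS);
	nr_lpids = nr;
	memset(lpid_inuse, 0, sizeof(lpid_inuse));
}

/* Mark an LPID as in use, e.g. the host's own or a reserved one. */
static void sketch_claim_lpid(long lpid)
{
	unsigned long n = (unsigned long)lpid;

	lpid_inuse[n / BITS_PER_WORD] |= 1UL << (n % BITS_PER_WORD);
}

/* Return an LPID to the pool. */
static void sketch_free_lpid(long lpid)
{
	unsigned long n = (unsigned long)lpid;

	lpid_inuse[n / BITS_PER_WORD] &= ~(1UL << (n % BITS_PER_WORD));
}

/* Hand out the lowest free LPID below the runtime limit, or -1 if none. */
static long sketch_alloc_lpid(void)
{
	unsigned long i;

	for (i = 0; i < nr_lpids; i++) {
		if (!(lpid_inuse[i / BITS_PER_WORD] &
		      (1UL << (i % BITS_PER_WORD)))) {
			sketch_claim_lpid((long)i);
			return (long)i;
		}
	}
	return -1;
}
```

Usage mirrors kvmppc_mmu_hv_init() above: initialise with the runtime limit,
claim the host and reserved LPIDs, then allocate one per guest.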

[PATCH 06/37] KVM: PPC: e500: rename e500_tlb.h to e500.h

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

This is in preparation for merging in the contents of
arch/powerpc/include/asm/kvm_e500.h.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/e500.c |2 +-
 arch/powerpc/kvm/{e500_tlb.h => e500.h} |6 +++---
 arch/powerpc/kvm/e500_emulate.c |2 +-
 arch/powerpc/kvm/e500_tlb.c |2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)
 rename arch/powerpc/kvm/{e500_tlb.h => e500.h} (98%)

diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index ac6c9ae..5c450ba 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -24,7 +24,7 @@
 #include <asm/kvm_ppc.h>
 
 #include "booke.h"
-#include "e500_tlb.h"
+#include "e500.h"
 
 void kvmppc_core_load_host_debugstate(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/powerpc/kvm/e500_tlb.h b/arch/powerpc/kvm/e500.h
similarity index 98%
rename from arch/powerpc/kvm/e500_tlb.h
rename to arch/powerpc/kvm/e500.h
index 5c6d2d7..02ecde2 100644
--- a/arch/powerpc/kvm/e500_tlb.h
+++ b/arch/powerpc/kvm/e500.h
@@ -12,8 +12,8 @@
  * published by the Free Software Foundation.
  */
 
-#ifndef __KVM_E500_TLB_H__
-#define __KVM_E500_TLB_H__
+#ifndef KVM_E500_H
+#define KVM_E500_H
 
 #include <linux/kvm_host.h>
 #include <asm/mmu-book3e.h>
@@ -171,4 +171,4 @@ static inline int tlbe_is_host_safe(const struct kvm_vcpu *vcpu,
return 1;
 }
 
-#endif /* __KVM_E500_TLB_H__ */
+#endif /* KVM_E500_H */
diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index 6d0b2bd..2a1a228 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -17,7 +17,7 @@
 #include <asm/kvm_e500.h>
 
 #include "booke.h"
-#include "e500_tlb.h"
+#include "e500.h"
 
 #define XOP_TLBIVAX 786
 #define XOP_TLBSX   914
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 6e53e41..1d623a0 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -29,7 +29,7 @@
 #include <asm/kvm_e500.h>
 
 #include "../mm/mmu_decl.h"
-#include "e500_tlb.h"
+#include "e500.h"
 #include "trace.h"
 #include "timing.h"
 
-- 
1.6.0.2



[PATCH 02/37] powerpc/e500: split CPU_FTRS_ALWAYS/CPU_FTRS_POSSIBLE

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

Split e500 (v1/v2) and e500mc/e5500 to allow optimization of feature
checks that differ between the two.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/cputable.h |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index 6a034a2..2022f2d 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -483,8 +483,10 @@ enum {
CPU_FTRS_E200 |
 #endif
 #ifdef CONFIG_E500
-   CPU_FTRS_E500 | CPU_FTRS_E500_2 | CPU_FTRS_E500MC |
-   CPU_FTRS_E5500 |
+   CPU_FTRS_E500 | CPU_FTRS_E500_2 |
+#endif
+#ifdef CONFIG_PPC_E500MC
+   CPU_FTRS_E500MC | CPU_FTRS_E5500 |
 #endif
0,
 };
@@ -528,8 +530,10 @@ enum {
CPU_FTRS_E200 &
 #endif
 #ifdef CONFIG_E500
-   CPU_FTRS_E500 & CPU_FTRS_E500_2 & CPU_FTRS_E500MC &
-   CPU_FTRS_E5500 &
+   CPU_FTRS_E500 & CPU_FTRS_E500_2 &
+#endif
+#ifdef CONFIG_PPC_E500MC
+   CPU_FTRS_E500MC & CPU_FTRS_E5500 &
 #endif
CPU_FTRS_POSSIBLE,
 };
-- 
1.6.0.2
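
To show why the split helps, here is a small self-contained sketch of the
POSSIBLE/ALWAYS pattern. POSSIBLE is the OR of the feature sets of every core
the build supports and ALWAYS is their AND, so a check can be decided at
compile time when a feature is in ALWAYS or absent from POSSIBLE. The FTR_*
bits, the two cores and the SKETCH_HAVE_* switches are hypothetical, and
sketch_has_feature() is only modelled on the kernel's feature test.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	FTR_A = 1 << 0,
	FTR_B = 1 << 1,
	FTR_C = 1 << 2,
};

#define FTRS_CORE_X	(FTR_A | FTR_B)
#define FTRS_CORE_Y	(FTR_A | FTR_C)

/* Build configuration for the sketch: only core X is compiled in. */
#define SKETCH_HAVE_CORE_X 1
#define SKETCH_HAVE_CORE_Y 0

/* Union of the features of every configured core. */
enum {
	FTRS_POSSIBLE =
#if SKETCH_HAVE_CORE_X
		FTRS_CORE_X |
#endif
#if SKETCH_HAVE_CORE_Y
		FTRS_CORE_Y |
#endif
		0,
};

/* Intersection of the features of every configured core. */
enum {
	FTRS_ALWAYS =
#if SKETCH_HAVE_CORE_X
		FTRS_CORE_X &
#endif
#if SKETCH_HAVE_CORE_Y
		FTRS_CORE_Y &
#endif
		FTRS_POSSIBLE,
};

/* Only features in POSSIBLE but not in ALWAYS need the runtime mask. */
static bool sketch_has_feature(unsigned long runtime_features,
			       unsigned long feature)
{
	assert(feature != 0);
	return (FTRS_ALWAYS & feature) ||
	       (FTRS_POSSIBLE & runtime_features & feature);
}
```

With core Y left out of the build, FTR_B is always present and FTR_C is never
possible, so neither check depends on the runtime mask. Enabling both cores
would shrink ALWAYS to FTR_A alone, which is the effect the patch avoids by
keeping e500 (v1/v2) and e500mc/e5500 under separate config options.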



[PATCH 11/37] KVM: PPC: e500: emulate tlbilx

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

tlbilx is the new, preferred invalidation instruction.  It is not
found on e500 prior to e500mc, but there should be no harm in
supporting it on all e500.

Based on code from Ashish Kalra ashish.ka...@freescale.com.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/e500.h |1 +
 arch/powerpc/kvm/e500_emulate.c |9 ++
 arch/powerpc/kvm/e500_tlb.c |   52 +++
 3 files changed, 62 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index f4dee55..ce3f163 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -124,6 +124,7 @@ int kvmppc_e500_emul_mt_mmucsr0(struct kvmppc_vcpu_e500 *vcpu_e500,
 int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu);
 int kvmppc_e500_emul_tlbre(struct kvm_vcpu *vcpu);
 int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *vcpu, int ra, int rb);
+int kvmppc_e500_emul_tlbilx(struct kvm_vcpu *vcpu, int rt, int ra, int rb);
 int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb);
 int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500);
 void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *vcpu_e500);
diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index c80794d..af02c18 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -22,6 +22,7 @@
 #define XOP_TLBSX   914
 #define XOP_TLBRE   946
 #define XOP_TLBWE   978
+#define XOP_TLBILX  18
 
 int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int inst, int *advance)
@@ -29,6 +30,7 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
int emulated = EMULATE_DONE;
int ra;
int rb;
+   int rt;
 
switch (get_op(inst)) {
case 31:
@@ -47,6 +49,13 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
emulated = kvmppc_e500_emul_tlbsx(vcpu,rb);
break;
 
+   case XOP_TLBILX:
+   ra = get_ra(inst);
+   rb = get_rb(inst);
+   rt = get_rt(inst);
+   emulated = kvmppc_e500_emul_tlbilx(vcpu, rt, ra, rb);
+   break;
+
case XOP_TLBIVAX:
ra = get_ra(inst);
rb = get_rb(inst);
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index c8ce51d..6eb5d65 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -631,6 +631,58 @@ int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *vcpu, int ra, int rb)
return EMULATE_DONE;
 }
 
+static void tlbilx_all(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel,
+  int pid, int rt)
+{
+   struct kvm_book3e_206_tlb_entry *tlbe;
+   int tid, esel;
+
+   /* invalidate all entries */
+   for (esel = 0; esel < vcpu_e500->gtlb_params[tlbsel].entries; esel++) {
+   tlbe = get_entry(vcpu_e500, tlbsel, esel);
+   tid = get_tlb_tid(tlbe);
+   if (rt == 0 || tid == pid) {
+   inval_gtlbe_on_host(vcpu_e500, tlbsel, esel);
+   kvmppc_e500_gtlbe_invalidate(vcpu_e500, tlbsel, esel);
+   }
+   }
+}
+
+static void tlbilx_one(struct kvmppc_vcpu_e500 *vcpu_e500, int pid,
+  int ra, int rb)
+{
+   int tlbsel, esel;
+   gva_t ea;
+
+   ea = kvmppc_get_gpr(&vcpu_e500->vcpu, rb);
+   if (ra)
+   ea += kvmppc_get_gpr(&vcpu_e500->vcpu, ra);
+
+   for (tlbsel = 0; tlbsel < 2; tlbsel++) {
+   esel = kvmppc_e500_tlb_index(vcpu_e500, ea, tlbsel, pid, -1);
+   if (esel >= 0) {
+   inval_gtlbe_on_host(vcpu_e500, tlbsel, esel);
+   kvmppc_e500_gtlbe_invalidate(vcpu_e500, tlbsel, esel);
+   break;
+   }
+   }
+}
+
+int kvmppc_e500_emul_tlbilx(struct kvm_vcpu *vcpu, int rt, int ra, int rb)
+{
+   struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
+   int pid = get_cur_spid(vcpu);
+
+   if (rt == 0 || rt == 1) {
+   tlbilx_all(vcpu_e500, 0, pid, rt);
+   tlbilx_all(vcpu_e500, 1, pid, rt);
+   } else if (rt == 3) {
+   tlbilx_one(vcpu_e500, pid, ra, rb);
+   }
+
+   return EMULATE_DONE;
+}
+
 int kvmppc_e500_emul_tlbre(struct kvm_vcpu *vcpu)
 {
struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
-- 
1.6.0.2
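
The dispatch on rt in kvmppc_e500_emul_tlbilx() can be modelled with a toy
TLB: rt == 0 invalidates everything, rt == 1 invalidates entries whose TID
matches the current PID, rt == 3 invalidates the single entry matching both
PID and address, and other values are ignored. The toy_* types are
hypothetical, and the exact-address comparison is a simplification of the
page-range lookup done by kvmppc_e500_tlb_index().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ENTRIES 4

struct toy_tlbe {
	bool valid;
	int tid;
	unsigned long ea;
};

/* Invalidate entries in a toy TLB according to the tlbilx type field. */
static void toy_tlbilx(struct toy_tlbe *tlb, int rt, int pid,
		       unsigned long ea)
{
	int i;

	assert(tlb != NULL);
	for (i = 0; i < TOY_ENTRIES; i++) {
		if (!tlb[i].valid)
			continue;
		if (rt == 0 ||
		    (rt == 1 && tlb[i].tid == pid) ||
		    (rt == 3 && tlb[i].tid == pid && tlb[i].ea == ea)) {
			tlb[i].valid = false;
			if (rt == 3)
				break;	/* at most one entry by address */
		}
	}
}
```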



[PATCH 10/37] KVM: PPC: e500: Track TLB1 entries with a bitmap

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

Rather than invalidate everything when a TLB1 entry needs to be
taken down, keep track of which host TLB1 entries are used for
a given guest TLB1 entry, and invalidate just those entries.

Based on code from Ashish Kalra ashish.ka...@freescale.com
and Liu Yu yu@freescale.com.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/e500.h |5 +++
 arch/powerpc/kvm/e500_tlb.c |   72 ---
 2 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 34cef08..f4dee55 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -2,6 +2,7 @@
  * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved.
  *
  * Author: Yu Liu yu@freescale.com
+ * Ashish Kalra ashish.ka...@freescale.com
  *
  * Description:
  * This file is based on arch/powerpc/kvm/44x_tlb.h and
@@ -25,6 +26,7 @@
 
 #define E500_TLB_VALID 1
 #define E500_TLB_DIRTY 2
+#define E500_TLB_BITMAP 4
 
 struct tlbe_ref {
pfn_t pfn;
@@ -82,6 +84,9 @@ struct kvmppc_vcpu_e500 {
struct page **shared_tlb_pages;
int num_shared_tlb_pages;
 
+   u64 *g2h_tlb1_map;
+   unsigned int *h2g_tlb1_rmap;
+
 #ifdef CONFIG_KVM_E500
u32 pid[E500_PID_NUM];
 
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 9925fc6..c8ce51d 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -2,6 +2,7 @@
  * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved.
  *
  * Author: Yu Liu, yu@freescale.com
+ * Ashish Kalra, ashish.ka...@freescale.com
  *
  * Description:
  * This file is based on arch/powerpc/kvm/44x_tlb.c,
@@ -175,8 +176,28 @@ static void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500,
struct kvm_book3e_206_tlb_entry *gtlbe =
get_entry(vcpu_e500, tlbsel, esel);
 
-   if (tlbsel == 1) {
-   kvmppc_e500_tlbil_all(vcpu_e500);
+   if (tlbsel == 1 &&
+   vcpu_e500->gtlb_priv[1][esel].ref.flags & E500_TLB_BITMAP) {
+   u64 tmp = vcpu_e500->g2h_tlb1_map[esel];
+   int hw_tlb_indx;
+   unsigned long flags;
+
+   local_irq_save(flags);
+   while (tmp) {
+   hw_tlb_indx = __ilog2_u64(tmp & -tmp);
+   mtspr(SPRN_MAS0,
+ MAS0_TLBSEL(1) |
+ MAS0_ESEL(to_htlb1_esel(hw_tlb_indx)));
+   mtspr(SPRN_MAS1, 0);
+   asm volatile("tlbwe");
+   vcpu_e500->h2g_tlb1_rmap[hw_tlb_indx] = 0;
+   tmp &= tmp - 1;
+   }
+   mb();
+   vcpu_e500->g2h_tlb1_map[esel] = 0;
+   vcpu_e500->gtlb_priv[1][esel].ref.flags &= ~E500_TLB_BITMAP;
+   local_irq_restore(flags);
+
return;
}
 
@@ -282,6 +303,16 @@ static inline void kvmppc_e500_ref_release(struct tlbe_ref *ref)
}
 }
 
+static void clear_tlb1_bitmap(struct kvmppc_vcpu_e500 *vcpu_e500)
+{
+   if (vcpu_e500->g2h_tlb1_map)
+   memset(vcpu_e500->g2h_tlb1_map, 0,
+  sizeof(u64) * vcpu_e500->gtlb_params[1].entries);
+   if (vcpu_e500->h2g_tlb1_rmap)
+   memset(vcpu_e500->h2g_tlb1_rmap, 0,
+  sizeof(unsigned int) * host_tlb_params[1].entries);
+}
+
 static void clear_tlb_privs(struct kvmppc_vcpu_e500 *vcpu_e500)
 {
int tlbsel = 0;
@@ -511,7 +542,7 @@ static void kvmppc_e500_tlb0_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 /* XXX for both one-one and one-to-many , for now use TLB1 */
 static int kvmppc_e500_tlb1_map(struct kvmppc_vcpu_e500 *vcpu_e500,
u64 gvaddr, gfn_t gfn, struct kvm_book3e_206_tlb_entry *gtlbe,
-   struct kvm_book3e_206_tlb_entry *stlbe)
+   struct kvm_book3e_206_tlb_entry *stlbe, int esel)
 {
struct tlbe_ref *ref;
unsigned int victim;
@@ -524,6 +555,14 @@ static int kvmppc_e500_tlb1_map(struct kvmppc_vcpu_e500 *vcpu_e500,
ref = &vcpu_e500->tlb_refs[1][victim];
kvmppc_e500_shadow_map(vcpu_e500, gvaddr, gfn, gtlbe, 1, stlbe, ref);
 
+   vcpu_e500->g2h_tlb1_map[esel] |= (u64)1 << victim;
+   vcpu_e500->gtlb_priv[1][esel].ref.flags |= E500_TLB_BITMAP;
+   if (vcpu_e500->h2g_tlb1_rmap[victim]) {
+   unsigned int idx = vcpu_e500->h2g_tlb1_rmap[victim];
+   vcpu_e500->g2h_tlb1_map[idx] &= ~(1ULL << victim);
+   }
+   vcpu_e500->h2g_tlb1_rmap[victim] = esel;
+
return victim;
 }
 
@@ -728,7 +767,7 @@ int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
 * are mapped on the fly. */
stlbsel = 1;
sesel = kvmppc_e500_tlb1_map(vcpu_e500, eaddr,
-  
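
The invalidation loop in inval_gtlbe_on_host() walks the set bits of the
per-guest-entry bitmap: `tmp & -tmp` isolates the lowest set bit and
`tmp &= tmp - 1` clears it. A minimal userspace sketch of that idiom follows;
the helper names are hypothetical and the shift loop stands in for
__ilog2_u64().

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit; the bitmap must be non-zero. */
static int lowest_set_bit(uint64_t bitmap)
{
	uint64_t lowest;
	int idx = 0;

	assert(bitmap != 0);
	lowest = bitmap & -bitmap;	/* isolate the lowest set bit */
	while (lowest >>= 1)
		idx++;
	return idx;
}

/*
 * Visit every set bit, lowest first. Stores up to max indices in out
 * and returns how many were stored.
 */
static int collect_set_bits(uint64_t bitmap, int *out, int max)
{
	int n = 0;

	while (bitmap && n < max) {
		out[n++] = lowest_set_bit(bitmap);
		bitmap &= bitmap - 1;	/* clear the lowest set bit */
	}
	return n;
}
```

Each iteration costs one step per set bit rather than one per bitmap
position, which suits the case where a guest TLB1 entry is backed by only a
few host TLB1 entries.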

[PATCH 08/37] KVM: PPC: e500: clean up arch/powerpc/kvm/e500.h

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

Move vcpu to the beginning of vcpu_e500 to give it appropriate
prominence, especially if more fields end up getting added to the
end of vcpu_e500 (and vcpu ends up in the middle).

Remove gratuitous extern and add parameter names to prototypes.

Signed-off-by: Scott Wood scottw...@freescale.com
[agraf: fix bisectability]
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/e500.h |   25 ++---
 1 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 51d13bd..a48af00 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -42,6 +42,8 @@ struct kvmppc_e500_tlb_params {
 };
 
 struct kvmppc_vcpu_e500 {
+   struct kvm_vcpu vcpu;
+
/* Unmodified copy of the guest's TLB -- shared with host userspace. */
struct kvm_book3e_206_tlb_entry *gtlb_arch;
 
@@ -85,8 +87,6 @@ struct kvmppc_vcpu_e500 {
 
struct page **shared_tlb_pages;
int num_shared_tlb_pages;
-
-   struct kvm_vcpu vcpu;
 };
 
 static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu)
@@ -113,19 +113,22 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu)
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
 
-extern void kvmppc_dump_tlbs(struct kvm_vcpu *);
-extern int kvmppc_e500_emul_mt_mmucsr0(struct kvmppc_vcpu_e500 *, ulong);
-extern int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *);
-extern int kvmppc_e500_emul_tlbre(struct kvm_vcpu *);
-extern int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *, int, int);
-extern int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *, int);
-extern int kvmppc_e500_tlb_search(struct kvm_vcpu *, gva_t, unsigned int, int);
 extern void kvmppc_e500_tlb_put(struct kvm_vcpu *);
 extern void kvmppc_e500_tlb_load(struct kvm_vcpu *, int);
-extern int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *);
-extern void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *);
 extern void kvmppc_e500_tlb_setup(struct kvmppc_vcpu_e500 *);
 extern void kvmppc_e500_recalc_shadow_pid(struct kvmppc_vcpu_e500 *);
+int kvmppc_e500_emul_mt_mmucsr0(struct kvmppc_vcpu_e500 *vcpu_e500,
+   ulong value);
+int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu);
+int kvmppc_e500_emul_tlbre(struct kvm_vcpu *vcpu);
+int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *vcpu, int ra, int rb);
+int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb);
+int kvmppc_e500_tlb_search(struct kvm_vcpu *, gva_t, unsigned int, int);
+int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500);
+void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *vcpu_e500);
+
+void kvmppc_get_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs);
+int kvmppc_set_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs);
 
 /* TLB helper functions */
 static inline unsigned int
-- 
1.6.0.2
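
As background for the vcpu move, a helper like to_e500() recovers the
enclosing structure from a pointer to the embedded member. The sketch below
assumes a container_of-style implementation; the toy_* types and the
sketch_container_of macro are hypothetical stand-ins. The conversion works
for any member offset, so placing vcpu first is a matter of layout and
prominence, as the commit message says.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_vcpu {
	int id;
};

struct toy_vcpu_e500 {
	struct toy_vcpu vcpu;	/* embedded generic vcpu, placed first */
	int extra;		/* core-specific state follows */
};

/* Recover the core-specific structure from the generic vcpu pointer. */
static struct toy_vcpu_e500 *toy_to_e500(struct toy_vcpu *vcpu)
{
	assert(vcpu != NULL);
	return sketch_container_of(vcpu, struct toy_vcpu_e500, vcpu);
}
```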



[PATCH 12/37] powerpc/booke: Provide exception macros with interrupt name

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

DO_KVM will need to identify the particular exception type.

There is an existing set of arbitrary numbers that Linux passes,
but it's an undocumented mess that sort of corresponds to server/classic
exception vectors but not really.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kernel/head_44x.S   |   23 +--
 arch/powerpc/kernel/head_booke.h |   41 ++
 arch/powerpc/kernel/head_fsl_booke.S |   52 +-
 3 files changed, 68 insertions(+), 48 deletions(-)

diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S
index 7dd2981..d1192c5 100644
--- a/arch/powerpc/kernel/head_44x.S
+++ b/arch/powerpc/kernel/head_44x.S
@@ -248,10 +248,11 @@ _ENTRY(_start);
 
 interrupt_base:
/* Critical Input Interrupt */
-   CRITICAL_EXCEPTION(0x0100, CriticalInput, unknown_exception)
+   CRITICAL_EXCEPTION(0x0100, CRITICAL, CriticalInput, unknown_exception)
 
/* Machine Check Interrupt */
-   CRITICAL_EXCEPTION(0x0200, MachineCheck, machine_check_exception)
+   CRITICAL_EXCEPTION(0x0200, MACHINE_CHECK, MachineCheck, \
+  machine_check_exception)
MCHECK_EXCEPTION(0x0210, MachineCheckA, machine_check_exception)
 
/* Data Storage Interrupt */
@@ -261,7 +262,8 @@ interrupt_base:
INSTRUCTION_STORAGE_EXCEPTION
 
/* External Input Interrupt */
-   EXCEPTION(0x0500, ExternalInput, do_IRQ, EXC_XFER_LITE)
+   EXCEPTION(0x0500, BOOKE_INTERRUPT_EXTERNAL, ExternalInput, \
+ do_IRQ, EXC_XFER_LITE)
 
/* Alignment Interrupt */
ALIGNMENT_EXCEPTION
@@ -273,29 +275,32 @@ interrupt_base:
 #ifdef CONFIG_PPC_FPU
FP_UNAVAILABLE_EXCEPTION
 #else
-   EXCEPTION(0x2010, FloatingPointUnavailable, unknown_exception, EXC_XFER_EE)
+   EXCEPTION(0x2010, BOOKE_INTERRUPT_FP_UNAVAIL, \
+ FloatingPointUnavailable, unknown_exception, EXC_XFER_EE)
 #endif
/* System Call Interrupt */
START_EXCEPTION(SystemCall)
-   NORMAL_EXCEPTION_PROLOG
+   NORMAL_EXCEPTION_PROLOG(BOOKE_INTERRUPT_SYSCALL)
EXC_XFER_EE_LITE(0x0c00, DoSyscall)
 
/* Auxiliary Processor Unavailable Interrupt */
-   EXCEPTION(0x2020, AuxillaryProcessorUnavailable, unknown_exception, EXC_XFER_EE)
+   EXCEPTION(0x2020, BOOKE_INTERRUPT_AP_UNAVAIL, \
+ AuxillaryProcessorUnavailable, unknown_exception, EXC_XFER_EE)
 
/* Decrementer Interrupt */
DECREMENTER_EXCEPTION
 
/* Fixed Internal Timer Interrupt */
/* TODO: Add FIT support */
-   EXCEPTION(0x1010, FixedIntervalTimer, unknown_exception, EXC_XFER_EE)
+   EXCEPTION(0x1010, BOOKE_INTERRUPT_FIT, FixedIntervalTimer, \
+ unknown_exception, EXC_XFER_EE)
 
/* Watchdog Timer Interrupt */
/* TODO: Add watchdog support */
 #ifdef CONFIG_BOOKE_WDT
-   CRITICAL_EXCEPTION(0x1020, WatchdogTimer, WatchdogException)
+   CRITICAL_EXCEPTION(0x1020, WATCHDOG, WatchdogTimer, WatchdogException)
 #else
-   CRITICAL_EXCEPTION(0x1020, WatchdogTimer, unknown_exception)
+   CRITICAL_EXCEPTION(0x1020, WATCHDOG, WatchdogTimer, unknown_exception)
 #endif
 
/* Data TLB Error Interrupt */
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index fc921bf..06ab353 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -2,6 +2,8 @@
 #define __HEAD_BOOKE_H__
 
 #include <asm/ptrace.h>    /* for STACK_FRAME_REGS_MARKER */
+#include <asm/kvm_asm.h>
+
 /*
  * Macros used for common Book-e exception handling
  */
@@ -28,7 +30,7 @@
  */
 #define THREAD_NORMSAVE(offset)(THREAD_NORMSAVES + (offset * 4))
 
-#define NORMAL_EXCEPTION_PROLOG \
+#define NORMAL_EXCEPTION_PROLOG(intno) \
mtspr   SPRN_SPRG_WSCRATCH0, r10;   /* save one register */  \
mfspr   r10, SPRN_SPRG_THREAD;   \
stw r11, THREAD_NORMSAVE(0)(r10);\
@@ -113,7 +115,7 @@
  * registers as the normal prolog above. Instead we use a portion of the
  * critical/machine check exception stack at low physical addresses.
  */
-#define EXC_LEVEL_EXCEPTION_PROLOG(exc_level, exc_level_srr0, exc_level_srr1) \
+#define EXC_LEVEL_EXCEPTION_PROLOG(exc_level, intno, exc_level_srr0, exc_level_srr1) \
mtspr   SPRN_SPRG_WSCRATCH_##exc_level,r8;   \
BOOKE_LOAD_EXC_LEVEL_STACK(exc_level);/* r8 points to the exc_level stack*/ \
stw r9,GPR9(r8);/* save various registers  */\
@@ -162,12 +164,13 @@
SAVE_4GPRS(3, r11);  \
SAVE_2GPRS(7, r11)
 

[PATCH 09/37] KVM: PPC: e500: refactor core-specific TLB code

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

The PID handling is e500v1/v2-specific, and is moved to e500.c.

The MMU sregs code and kvmppc_core_vcpu_translate will be shared with
e500mc, and is moved from e500.c to e500_tlb.c.
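
The shadow-ID bookkeeping that this patch moves into e500.c can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the kernel code: the per-CPU variables are replaced by ordinary globals (one "core" only), preemption is ignored, and the sid_* function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_TIDS 256

struct id {
	unsigned long val;	/* shadow id, 0 means "no mapping" */
	struct id **pentry;	/* back-pointer into the per-core table */
};

/* Stand-ins for the per-CPU variables; a single "core" in this sketch. */
static struct id *pcpu_entry[NUM_TIDS];
static unsigned long last_used_sid;

/* Hand out the next shadow id, or -1 once ids 1..255 are used up. */
static int sid_setup_one(struct id *entry)
{
	unsigned long sid = ++last_used_sid;

	if (sid < NUM_TIDS) {
		pcpu_entry[sid] = entry;
		entry->val = sid;
		entry->pentry = &pcpu_entry[sid];
		return (int)sid;
	}
	return -1;
}

/* A mapping is valid only if both sides still point at each other. */
static int sid_lookup(struct id *entry)
{
	if (entry && entry->val != 0 &&
	    pcpu_entry[entry->val] == entry &&
	    entry->pentry == &pcpu_entry[entry->val])
		return (int)entry->val;
	return -1;
}

/* Forget every mapping on this "core"; stale entries then fail lookup. */
static void sid_destroy_all(void)
{
	last_used_sid = 0;
	memset(pcpu_entry, 0, sizeof(pcpu_entry));
}
```

The two-way check in sid_lookup() is what lets sid_destroy_all() invalidate every vcpu-side entry by clearing only the per-core table, without walking the vcpu tables.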

Partially based on patches from Liu Yu yu@freescale.com.

Signed-off-by: Scott Wood scottw...@freescale.com
[agraf: fix bisectability]
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |2 +
 arch/powerpc/kvm/e500.c |  357 +++
 arch/powerpc/kvm/e500.h |   62 -
 arch/powerpc/kvm/e500_emulate.c |6 +-
 arch/powerpc/kvm/e500_tlb.c |  460 +--
 5 files changed, 473 insertions(+), 414 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 52eb9c1..47612cc 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -426,6 +426,8 @@ struct kvm_vcpu_arch {
ulong fault_esr;
ulong queued_dear;
ulong queued_esr;
+   u32 tlbcfg[4];
+   u32 mmucfg;
 #endif
gpa_t paddr_accessed;
 
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index 76b35d8..b479ed7 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -22,9 +22,281 @@
 #include <asm/tlbflush.h>
 #include <asm/kvm_ppc.h>
 
+#include "../mm/mmu_decl.h"
 #include "booke.h"
 #include "e500.h"
 
+struct id {
+   unsigned long val;
+   struct id **pentry;
+};
+
+#define NUM_TIDS 256
+
+/*
+ * This table provide mappings from:
+ * (guestAS,guestTID,guestPR) --> ID of physical cpu
+ * guestAS [0..1]
+ * guestTID[0..255]
+ * guestPR [0..1]
+ * ID  [1..255]
+ * Each vcpu keeps one vcpu_id_table.
+ */
+struct vcpu_id_table {
+   struct id id[2][NUM_TIDS][2];
+};
+
+/*
+ * This table provide reversed mappings of vcpu_id_table:
+ * ID --> address of vcpu_id_table item.
+ * Each physical core has one pcpu_id_table.
+ */
+struct pcpu_id_table {
+   struct id *entry[NUM_TIDS];
+};
+
+static DEFINE_PER_CPU(struct pcpu_id_table, pcpu_sids);
+
+/* This variable keeps last used shadow ID on local core.
+ * The valid range of shadow ID is [1..255] */
+static DEFINE_PER_CPU(unsigned long, pcpu_last_used_sid);
+
+/*
+ * Allocate a free shadow id and setup a valid sid mapping in given entry.
+ * A mapping is only valid when vcpu_id_table and pcpu_id_table are match.
+ *
+ * The caller must have preemption disabled, and keep it that way until
+ * it has finished with the returned shadow id (either written into the
+ * TLB or arch.shadow_pid, or discarded).
+ */
+static inline int local_sid_setup_one(struct id *entry)
+{
+   unsigned long sid;
+   int ret = -1;
+
+   sid = ++(__get_cpu_var(pcpu_last_used_sid));
+   if (sid < NUM_TIDS) {
+   __get_cpu_var(pcpu_sids).entry[sid] = entry;
+   entry->val = sid;
+   entry->pentry = &__get_cpu_var(pcpu_sids).entry[sid];
+   ret = sid;
+   }
+
+   /*
+* If sid == NUM_TIDS, we've run out of sids.  We return -1, and
+* the caller will invalidate everything and start over.
+*
+* sid > NUM_TIDS indicates a race, which we disable preemption to
+* avoid.
+*/
+   WARN_ON(sid > NUM_TIDS);
+
+   return ret;
+}
+
+/*
+ * Check if given entry contain a valid shadow id mapping.
+ * An ID mapping is considered valid only if
+ * both vcpu and pcpu know this mapping.
+ *
+ * The caller must have preemption disabled, and keep it that way until
+ * it has finished with the returned shadow id (either written into the
+ * TLB or arch.shadow_pid, or discarded).
+ */
+static inline int local_sid_lookup(struct id *entry)
+{
+   if (entry && entry->val != 0 &&
+   __get_cpu_var(pcpu_sids).entry[entry->val] == entry &&
+   entry->pentry == &__get_cpu_var(pcpu_sids).entry[entry->val])
+   return entry->val;
+   return -1;
+}
+
+/* Invalidate all id mappings on local core -- call with preempt disabled */
+static inline void local_sid_destroy_all(void)
+{
+   __get_cpu_var(pcpu_last_used_sid) = 0;
+   memset(&__get_cpu_var(pcpu_sids), 0, sizeof(__get_cpu_var(pcpu_sids)));
+}
+
+static void *kvmppc_e500_id_table_alloc(struct kvmppc_vcpu_e500 *vcpu_e500)
+{
+   vcpu_e500->idt = kzalloc(sizeof(struct vcpu_id_table), GFP_KERNEL);
+   return vcpu_e500->idt;
+}
+
+static void kvmppc_e500_id_table_free(struct kvmppc_vcpu_e500 *vcpu_e500)
+{
+   kfree(vcpu_e500->idt);
+   vcpu_e500->idt = NULL;
+}
+
+/* Map guest pid to shadow.
+ * We use PID to keep shadow of current guest non-zero PID,
+ * and use PID1 to keep shadow of guest zero PID.
+ * So that guest tlbe with TID=0 can be accessed at any time */
+static void kvmppc_e500_recalc_shadow_pid(struct kvmppc_vcpu_e500 *vcpu_e500)
+{
+   preempt_disable();
+   vcpu_e500->vcpu.arch.shadow_pid = kvmppc_e500_get_sid(vcpu_e500,
+  

[PATCH 07/37] KVM: PPC: e500: merge asm/kvm_e500.h into arch/powerpc/kvm/e500.h

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

Keeping two separate headers for e500-specific things was a
pain, and wasn't even organized along any logical boundary.

There was TLB stuff in asm/kvm_e500.h despite the existence of
arch/powerpc/kvm/e500_tlb.h, and nothing in asm/kvm_e500.h needed
to be referenced from outside arch/powerpc/kvm.

Signed-off-by: Scott Wood scottw...@freescale.com
[agraf: fix bisectability]
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_e500.h |   96 ---
 arch/powerpc/kvm/e500.c |1 -
 arch/powerpc/kvm/e500.h |   82 --
 arch/powerpc/kvm/e500_emulate.c |1 -
 arch/powerpc/kvm/e500_tlb.c |1 -
 5 files changed, 78 insertions(+), 103 deletions(-)
 delete mode 100644 arch/powerpc/include/asm/kvm_e500.h

diff --git a/arch/powerpc/include/asm/kvm_e500.h b/arch/powerpc/include/asm/kvm_e500.h
deleted file mode 100644
index 8cd50a5..0000000
--- a/arch/powerpc/include/asm/kvm_e500.h
+++ /dev/null
@@ -1,96 +0,0 @@
-/*
- * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved.
- *
- * Author: Yu Liu, yu@freescale.com
- *
- * Description:
- * This file is derived from arch/powerpc/include/asm/kvm_44x.h,
- * by Hollis Blanchard holl...@us.ibm.com.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- */
-
-#ifndef __ASM_KVM_E500_H__
-#define __ASM_KVM_E500_H__
-
-#include <linux/kvm_host.h>
-
-#define BOOKE_INTERRUPT_SIZE 36
-
-#define E500_PID_NUM   3
-#define E500_TLB_NUM   2
-
-#define E500_TLB_VALID 1
-#define E500_TLB_DIRTY 2
-
-struct tlbe_ref {
-   pfn_t pfn;
-   unsigned int flags; /* E500_TLB_* */
-};
-
-struct tlbe_priv {
-   struct tlbe_ref ref; /* TLB0 only -- TLB1 uses tlb_refs */
-};
-
-struct vcpu_id_table;
-
-struct kvmppc_e500_tlb_params {
-   int entries, ways, sets;
-};
-
-struct kvmppc_vcpu_e500 {
-   /* Unmodified copy of the guest's TLB -- shared with host userspace. */
-   struct kvm_book3e_206_tlb_entry *gtlb_arch;
-
-   /* Starting entry number in gtlb_arch[] */
-   int gtlb_offset[E500_TLB_NUM];
-
-   /* KVM internal information associated with each guest TLB entry */
-   struct tlbe_priv *gtlb_priv[E500_TLB_NUM];
-
-   struct kvmppc_e500_tlb_params gtlb_params[E500_TLB_NUM];
-
-   unsigned int gtlb_nv[E500_TLB_NUM];
-
-   /*
-* information associated with each host TLB entry --
-* TLB1 only for now.  If/when guest TLB1 entries can be
-* mapped with host TLB0, this will be used for that too.
-*
-* We don't want to use this for guest TLB0 because then we'd
-* have the overhead of doing the translation again even if
-* the entry is still in the guest TLB (e.g. we swapped out
-* and back, and our host TLB entries got evicted).
-*/
-   struct tlbe_ref *tlb_refs[E500_TLB_NUM];
-   unsigned int host_tlb1_nv;
-
-   u32 host_pid[E500_PID_NUM];
-   u32 pid[E500_PID_NUM];
-   u32 svr;
-
-   /* vcpu id table */
-   struct vcpu_id_table *idt;
-
-   u32 l1csr0;
-   u32 l1csr1;
-   u32 hid0;
-   u32 hid1;
-   u32 tlb0cfg;
-   u32 tlb1cfg;
-   u64 mcar;
-
-   struct page **shared_tlb_pages;
-   int num_shared_tlb_pages;
-
-   struct kvm_vcpu vcpu;
-};
-
-static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu)
-{
-   return container_of(vcpu, struct kvmppc_vcpu_e500, vcpu);
-}
-
-#endif /* __ASM_KVM_E500_H__ */
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index 5c450ba..76b35d8 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -20,7 +20,6 @@
 #include <asm/reg.h>
 #include <asm/cputable.h>
 #include <asm/tlbflush.h>
-#include <asm/kvm_e500.h>
 #include <asm/kvm_ppc.h>
 
 #include "booke.h"
diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 02ecde2..51d13bd 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -1,11 +1,12 @@
 /*
  * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved.
  *
- * Author: Yu Liu, yu@freescale.com
+ * Author: Yu Liu yu@freescale.com
  *
  * Description:
- * This file is based on arch/powerpc/kvm/44x_tlb.h,
- * by Hollis Blanchard holl...@us.ibm.com.
+ * This file is based on arch/powerpc/kvm/44x_tlb.h and
+ * arch/powerpc/include/asm/kvm_44x.h by Hollis Blanchard holl...@us.ibm.com,
+ * Copyright IBM Corp. 2007-2008
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License, version 2, as
@@ -18,7 +19,80 @@
 #include <linux/kvm_host.h>
 #include <asm/mmu-book3e.h>
 #include <asm/tlb.h>
-#include <asm/kvm_e500.h>
+
+#define E500_PID_NUM   3
+#define E500_TLB_NUM   2
+
+#define E500_TLB_VALID 1
+#define 

[PATCH 15/37] KVM: PPC: e500mc support

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

Add processor support for e500mc, using hardware virtualization support
(GS-mode).

Current issues include:
 - No support for external proxy (coreint) interrupt mode in the guest.

Includes work by Ashish Kalra ashish.ka...@freescale.com,
Varun Sethi varun.se...@freescale.com, and
Liu Yu yu@freescale.com.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/cputable.h   |6 +-
 arch/powerpc/include/asm/kvm.h|1 +
 arch/powerpc/kernel/cpu_setup_fsl_booke.S |1 +
 arch/powerpc/kernel/head_fsl_booke.S  |   46 
 arch/powerpc/kvm/Kconfig  |   17 ++-
 arch/powerpc/kvm/Makefile |   11 +
 arch/powerpc/kvm/e500.h   |   13 +-
 arch/powerpc/kvm/e500_emulate.c   |   24 ++-
 arch/powerpc/kvm/e500_tlb.c   |   21 ++-
 arch/powerpc/kvm/e500mc.c |  342 +
 arch/powerpc/kvm/powerpc.c|6 +-
 11 files changed, 476 insertions(+), 12 deletions(-)
 create mode 100644 arch/powerpc/kvm/e500mc.c

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index 2022f2d..598cd24 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -168,6 +168,7 @@ extern const char *powerpc_base_platform;
 #define CPU_FTR_LWSYNC         ASM_CONST(0x0000000008000000)
 #define CPU_FTR_NOEXECUTE      ASM_CONST(0x0000000010000000)
 #define CPU_FTR_INDEXED_DCR    ASM_CONST(0x0000000020000000)
+#define CPU_FTR_EMB_HV         ASM_CONST(0x0000000040000000)
 
 /*
  * Add the 64-bit processor unique features in the top half of the word;
@@ -386,11 +387,11 @@ extern const char *powerpc_base_platform;
CPU_FTR_NODSISRALIGN | CPU_FTR_NOEXECUTE)
 #define CPU_FTRS_E500MC (CPU_FTR_USE_TB | CPU_FTR_NODSISRALIGN | \
CPU_FTR_L2CSR | CPU_FTR_LWSYNC | CPU_FTR_NOEXECUTE | \
-   CPU_FTR_DBELL | CPU_FTR_DEBUG_LVL_EXC)
+   CPU_FTR_DBELL | CPU_FTR_DEBUG_LVL_EXC | CPU_FTR_EMB_HV)
 #define CPU_FTRS_E5500 (CPU_FTR_USE_TB | CPU_FTR_NODSISRALIGN | \
CPU_FTR_L2CSR | CPU_FTR_LWSYNC | CPU_FTR_NOEXECUTE | \
CPU_FTR_DBELL | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
-   CPU_FTR_DEBUG_LVL_EXC)
+   CPU_FTR_DEBUG_LVL_EXC | CPU_FTR_EMB_HV)
 #define CPU_FTRS_GENERIC_32 (CPU_FTR_COMMON | CPU_FTR_NODSISRALIGN)
 
 /* 64-bit CPUs */
@@ -535,6 +536,7 @@ enum {
 #ifdef CONFIG_PPC_E500MC
CPU_FTRS_E500MC & CPU_FTRS_E5500 &
 #endif
+   ~CPU_FTR_EMB_HV &   /* can be removed at runtime */
CPU_FTRS_POSSIBLE,
 };
 #endif /* __powerpc64__ */
diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index b921c3f..1bea4d8 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -277,6 +277,7 @@ struct kvm_sync_regs {
 #define KVM_CPU_E500V2 2
 #define KVM_CPU_3S_32  3
 #define KVM_CPU_3S_64  4
+#define KVM_CPU_E500MC 5
 
 /* for KVM_CAP_SPAPR_TCE */
 struct kvm_create_spapr_tce {
diff --git a/arch/powerpc/kernel/cpu_setup_fsl_booke.S b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
index 8053db0..69fdd23 100644
--- a/arch/powerpc/kernel/cpu_setup_fsl_booke.S
+++ b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
@@ -73,6 +73,7 @@ _GLOBAL(__setup_cpu_e500v2)
mtlr    r4
blr
 _GLOBAL(__setup_cpu_e500mc)
+   mr  r5, r4
mflr    r4
bl  __e500_icache_setup
bl  __e500_dcache_setup
diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
index 418931f..88c0a35 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -380,10 +380,16 @@ interrupt_base:
mtspr   SPRN_SPRG_WSCRATCH0, r10 /* Save some working registers */
mfspr   r10, SPRN_SPRG_THREAD
stw r11, THREAD_NORMSAVE(0)(r10)
+#ifdef CONFIG_KVM_BOOKE_HV
+BEGIN_FTR_SECTION
+   mfspr   r11, SPRN_SRR1
+END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
+#endif
stw r12, THREAD_NORMSAVE(1)(r10)
stw r13, THREAD_NORMSAVE(2)(r10)
mfcr    r13
stw r13, THREAD_NORMSAVE(3)(r10)
+   DO_KVM  BOOKE_INTERRUPT_DTLB_MISS SPRN_SRR1
mfspr   r10, SPRN_DEAR  /* Get faulting address */
 
/* If we are faulting a kernel address, we have to use the
@@ -468,10 +474,16 @@ interrupt_base:
mtspr   SPRN_SPRG_WSCRATCH0, r10 /* Save some working registers */
mfspr   r10, SPRN_SPRG_THREAD
stw r11, THREAD_NORMSAVE(0)(r10)
+#ifdef CONFIG_KVM_BOOKE_HV
+BEGIN_FTR_SECTION
+   mfspr   r11, SPRN_SRR1
+END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
+#endif
stw r12, THREAD_NORMSAVE(1)(r10)
stw r13, THREAD_NORMSAVE(2)(r10)
mfcr    r13
stw r13, THREAD_NORMSAVE(3)(r10)
+

[PATCH 13/37] KVM: PPC: booke: category E.HV (GS-mode) support

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

Chips such as e500mc that implement category E.HV in Power ISA 2.06
provide hardware virtualization features, including a new MSR mode for
guest state.  The guest OS can perform many operations without trapping
into the hypervisor, including transitions to and from guest userspace.

Since we can use SRR1[GS] to reliably tell whether an exception came from
guest state, instead of messing around with IVPR, we use DO_KVM similarly
to book3s.
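
The routing decision described above boils down to a single bit test on the saved SRR1 value. A minimal C sketch follows, assuming MSR[GS] sits at bit 28 counting from the least significant bit of the 32-bit value; the enum and function names are hypothetical and exist only for this illustration, and the real check is done in assembly by DO_KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of MSR[GS] for this sketch. */
#define MSR_GS (UINT32_C(1) << 28)

enum exit_path { HOST_HANDLER, KVM_HANDLER };

/* True if the interrupted context was running in guest state. */
static bool from_guest_state(uint32_t srr1)
{
	return (srr1 & MSR_GS) != 0;
}

/* Exceptions taken in guest state go to KVM, everything else to the host. */
static enum exit_path route_exception(uint32_t srr1)
{
	return from_guest_state(srr1) ? KVM_HANDLER : HOST_HANDLER;
}
```

Because the bit is part of the state saved by hardware at exception entry, the test does not depend on where IVPR points, which is why no IVPR switching is needed.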

Current issues include:
 - Machine checks from guest state are not routed to the host handler.
 - The guest can cause a host oops by executing an emulated instruction
   in a page that lacks read permission.  Existing e500/4xx support has
   the same problem.

Includes work by Ashish Kalra ashish.ka...@freescale.com,
Varun Sethi varun.se...@freescale.com, and
Liu Yu yu@freescale.com.

Signed-off-by: Scott Wood scottw...@freescale.com
[agraf: remove pt_regs usage]
Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - ESR -> GESR
---
 arch/powerpc/include/asm/dbell.h|1 +
 arch/powerpc/include/asm/kvm_asm.h  |8 +
 arch/powerpc/include/asm/kvm_booke_hv_asm.h |   49 +++
 arch/powerpc/include/asm/kvm_host.h |   19 +-
 arch/powerpc/include/asm/kvm_ppc.h  |3 +
 arch/powerpc/include/asm/mmu-book3e.h   |6 +
 arch/powerpc/include/asm/processor.h|3 +
 arch/powerpc/include/asm/reg.h  |2 +
 arch/powerpc/include/asm/reg_booke.h|   34 ++
 arch/powerpc/kernel/asm-offsets.c   |   15 +-
 arch/powerpc/kernel/head_booke.h|   28 ++-
 arch/powerpc/kvm/Kconfig|3 +
 arch/powerpc/kvm/booke.c|  309 ---
 arch/powerpc/kvm/booke.h|   24 +-
 arch/powerpc/kvm/booke_emulate.c|   23 +-
 arch/powerpc/kvm/bookehv_interrupts.S   |  587 +++
 arch/powerpc/kvm/powerpc.c  |5 +
 arch/powerpc/kvm/timing.h   |6 +
 18 files changed, 1058 insertions(+), 67 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_booke_hv_asm.h
 create mode 100644 arch/powerpc/kvm/bookehv_interrupts.S

diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h
index efa74ac..d7365b0 100644
--- a/arch/powerpc/include/asm/dbell.h
+++ b/arch/powerpc/include/asm/dbell.h
@@ -19,6 +19,7 @@
 
 #define PPC_DBELL_MSG_BRDCAST  (0x04000000)
 #define PPC_DBELL_TYPE(x)  (((x) & 0xf) << (63-36))
+#define PPC_DBELL_LPID(x)  ((x) << (63 - 49))
 enum ppc_dbell {
PPC_DBELL = 0,  /* doorbell */
PPC_DBELL_CRIT = 1, /* critical doorbell */
diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index 7b1f0e0..0978152 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -48,6 +48,14 @@
 #define BOOKE_INTERRUPT_SPE_FP_DATA 33
 #define BOOKE_INTERRUPT_SPE_FP_ROUND 34
 #define BOOKE_INTERRUPT_PERFORMANCE_MONITOR 35
+#define BOOKE_INTERRUPT_DOORBELL 36
+#define BOOKE_INTERRUPT_DOORBELL_CRITICAL 37
+
+/* booke_hv */
+#define BOOKE_INTERRUPT_GUEST_DBELL 38
+#define BOOKE_INTERRUPT_GUEST_DBELL_CRIT 39
+#define BOOKE_INTERRUPT_HV_SYSCALL 40
+#define BOOKE_INTERRUPT_HV_PRIV 41
 
 /* book3s */
 
diff --git a/arch/powerpc/include/asm/kvm_booke_hv_asm.h b/arch/powerpc/include/asm/kvm_booke_hv_asm.h
new file mode 100644
index 0000000..30a600f
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_booke_hv_asm.h
@@ -0,0 +1,49 @@
+/*
+ * Copyright 2010-2011 Freescale Semiconductor, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef ASM_KVM_BOOKE_HV_ASM_H
+#define ASM_KVM_BOOKE_HV_ASM_H
+
+#ifdef __ASSEMBLY__
+
+/*
+ * All exceptions from guest state must go through KVM
+ * (except for those which are delivered directly to the guest) --
+ * there are no exceptions for which we fall through directly to
+ * the normal host handler.
+ *
+ * Expected inputs (normal exceptions):
+ *   SCRATCH0 = saved r10
+ *   r10 = thread struct
+ *   r11 = appropriate SRR1 variant (currently used as scratch)
+ *   r13 = saved CR
+ *   *(r10 + THREAD_NORMSAVE(0)) = saved r11
+ *   *(r10 + THREAD_NORMSAVE(2)) = saved r13
+ *
+ * Expected inputs (crit/mcheck/debug exceptions):
+ *   appropriate SCRATCH = saved r8
+ *   r8 = exception level stack frame
+ *   r9 = *(r8 + _CCR) = saved CR
+ *   r11 = appropriate SRR1 variant (currently used as scratch)
+ *   *(r8 + GPR9) = saved r9
+ *   *(r8 + GPR10) = saved r10 (r10 not yet clobbered)
+ *   *(r8 + GPR11) = saved r11
+ */
+.macro DO_KVM intno srr1
+#ifdef CONFIG_KVM_BOOKE_HV
+BEGIN_FTR_SECTION
+   mtocrf  0x80, r11   /* check MSR[GS] without clobbering reg */
+   bf  3, kvmppc_resume_\intno\()_\srr1
+   b   

[PATCH 18/37] KVM: PPC: e500mc: Move r1/r2 restoration very early

2012-02-24 Thread Alexander Graf
If we hit any exception whatsoever in the restore path and r1/r2 aren't the
host registers, we don't get a working oops. So it's always a good idea to
restore them as early as possible.

This time, it actually has practical reasons to do so too, since we need to
have the host page fault handler fix up our guest instruction read code. And
for that to work we need r1/r2 restored.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/bookehv_interrupts.S |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
index 9eaeebd..63023ae 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -67,6 +67,12 @@
  * saved in vcpu: cr, ctr, r3-r13
  */
 .macro kvm_handler_common intno, srr0, flags
+   /* Restore host stack pointer */
+   PPC_STL r1, VCPU_GPR(r1)(r4)
+   PPC_STL r2, VCPU_GPR(r2)(r4)
+   PPC_LL  r1, VCPU_HOST_STACK(r4)
+   PPC_LL  r2, HOST_R2(r1)
+
mfspr   r10, SPRN_PID
lwz r8, VCPU_HOST_PID(r4)
PPC_LL  r11, VCPU_SHARED(r4)
@@ -290,10 +296,8 @@ _GLOBAL(kvmppc_resume_host)
/* Save remaining volatile guest register state to vcpu. */
mfspr   r3, SPRN_VRSAVE
PPC_STL r0, VCPU_GPR(r0)(r4)
-   PPC_STL r1, VCPU_GPR(r1)(r4)
mflr    r5
mfspr   r6, SPRN_SPRG4
-   PPC_STL r2, VCPU_GPR(r2)(r4)
PPC_STL r5, VCPU_LR(r4)
mfspr   r7, SPRN_SPRG5
PPC_STL r3, VCPU_VRSAVE(r4)
@@ -334,10 +338,6 @@ _GLOBAL(kvmppc_resume_host)
mtspr   SPRN_EPCR, r3
isync
 
-   /* Restore host stack pointer */
-   PPC_LL  r1, VCPU_HOST_STACK(r4)
-   PPC_LL  r2, HOST_R2(r1)
-
/* Switch to kernel stack and jump to handler. */
PPC_LL  r3, HOST_RUN(r1)
mr  r5, r14 /* intno */
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 17/37] KVM: PPC: e500mc: implicitly set MSR_GS

2012-02-24 Thread Alexander Graf
When setting MSR for an e500mc guest, we implicitly always set MSR_GS
to make sure the guest is in guest state. Since we have this implicit
rule there, we don't need to explicitly pass MSR_GS to set_msr().

Remove all explicit setters of MSR_GS.
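
As a rough illustration of the implicit rule, consider a setter that always ORs in the guest-state bit, so callers only describe the bits they want to keep. This is a toy model, not the booke.c code: the struct, the hv flag and both function names are hypothetical, and the MSR bit positions are assumptions of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed Book-E MSR bit positions for this sketch. */
#define MSR_GS (UINT32_C(1) << 28)
#define MSR_CE (UINT32_C(1) << 17)
#define MSR_EE (UINT32_C(1) << 15)
#define MSR_ME (UINT32_C(1) << 12)
#define MSR_DE (UINT32_C(1) << 9)

struct toy_vcpu {
	uint32_t msr;
	bool hv;	/* true when the guest runs in GS-mode */
};

/* The setter owns MSR_GS: callers never need to pass it. */
static void toy_set_msr(struct toy_vcpu *v, uint32_t new_msr)
{
	if (v->hv)
		new_msr |= MSR_GS;
	v->msr = new_msr;
}

/* Non-critical interrupt delivery keeps only CE, ME and DE. */
static void toy_deliver_noncrit(struct toy_vcpu *v)
{
	uint32_t msr_mask = MSR_CE | MSR_ME | MSR_DE;

	toy_set_msr(v, v->msr & msr_mask);
}
```

With the bit handled in one place, a mask of 0 (as in the machine check case after this patch) is still safe for a GS-mode guest.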

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c |   11 +--
 1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 85bd5b8..fcbe928 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -280,7 +280,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 unsigned int priority)
 {
int allowed = 0;
-   ulong uninitialized_var(msr_mask);
+   ulong msr_mask = 0;
bool update_esr = false, update_dear = false;
ulong crit_raw = vcpu->arch.shared->critical;
ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
@@ -322,20 +322,19 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
case BOOKE_IRQPRIO_AP_UNAVAIL:
case BOOKE_IRQPRIO_ALIGNMENT:
allowed = 1;
-   msr_mask = MSR_GS | MSR_CE | MSR_ME | MSR_DE;
+   msr_mask = MSR_CE | MSR_ME | MSR_DE;
int_class = INT_CLASS_NONCRIT;
break;
case BOOKE_IRQPRIO_CRITICAL:
case BOOKE_IRQPRIO_DBELL_CRIT:
allowed = vcpu->arch.shared->msr & MSR_CE;
allowed = allowed && !crit;
-   msr_mask = MSR_GS | MSR_ME;
+   msr_mask = MSR_ME;
int_class = INT_CLASS_CRIT;
break;
case BOOKE_IRQPRIO_MACHINE_CHECK:
allowed = vcpu->arch.shared->msr & MSR_ME;
allowed = allowed && !crit;
-   msr_mask = MSR_GS;
int_class = INT_CLASS_MC;
break;
case BOOKE_IRQPRIO_DECREMENTER:
@@ -346,13 +345,13 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
case BOOKE_IRQPRIO_DBELL:
allowed = vcpu->arch.shared->msr & MSR_EE;
allowed = allowed && !crit;
-   msr_mask = MSR_GS | MSR_CE | MSR_ME | MSR_DE;
+   msr_mask = MSR_CE | MSR_ME | MSR_DE;
int_class = INT_CLASS_NONCRIT;
break;
case BOOKE_IRQPRIO_DEBUG:
allowed = vcpu->arch.shared->msr & MSR_DE;
allowed = allowed && !crit;
-   msr_mask = MSR_GS | MSR_ME;
+   msr_mask = MSR_ME;
int_class = INT_CLASS_CRIT;
break;
}
-- 
1.6.0.2



[PATCH 16/37] KVM: PPC: e500mc: Add doorbell emulation support

2012-02-24 Thread Alexander Graf
When one vcpu wants to kick another, it can issue a special IPI instruction
called msgsnd. This patch emulates this instruction, its clearing counterpart
and the infrastructure required to actually trigger that interrupt inside
a guest vcpu.
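
To make the message decoding concrete, here is a small standalone C model of the doorbell path. It is a sketch under stated assumptions rather than the kernel implementation: the constants mirror the dbell.h definitions this series touches but should be treated as assumed values, the priority numbers are hypothetical, and pending exceptions are modelled as a plain bitmask instead of the kernel's bit operations and vcpu kick.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings for this sketch. */
#define DBELL_TYPE(x)     (((uint32_t)(x) & 0xf) << (63 - 36))
#define DBELL_TYPE_MASK   DBELL_TYPE(0xf)
#define DBELL_MSG_BRDCAST UINT32_C(0x04000000)
#define DBELL_PIR_MASK    UINT32_C(0x3fff)

enum { DBELL = 0, DBELL_CRIT = 1 };		/* message types */
enum { PRIO_DBELL = 3, PRIO_DBELL_CRIT = 4 };	/* hypothetical priority bits */

struct toy_vcpu {
	uint32_t pir;		/* processor id the guest sees */
	unsigned long pending;	/* bitmask of pending exceptions */
};

/* Map the message type field to an interrupt priority, or -1. */
static int dbell2prio(uint32_t param)
{
	switch (param & DBELL_TYPE_MASK) {
	case DBELL_TYPE(DBELL):
		return PRIO_DBELL;
	case DBELL_TYPE(DBELL_CRIT):
		return PRIO_DBELL_CRIT;
	default:
		return -1;
	}
}

/* msgsnd: mark the doorbell pending on every vcpu the message targets. */
static int toy_msgsnd(struct toy_vcpu *vcpus, size_t n, uint32_t param)
{
	int prio = dbell2prio(param);
	uint32_t pir = param & DBELL_PIR_MASK;
	size_t i;

	if (prio < 0)
		return -1;

	for (i = 0; i < n; i++)
		if ((param & DBELL_MSG_BRDCAST) || vcpus[i].pir == pir)
			vcpus[i].pending |= 1UL << prio;
	return 0;
}

/* msgclr: drop a pending doorbell of the given type on the local vcpu. */
static int toy_msgclr(struct toy_vcpu *vcpu, uint32_t param)
{
	int prio = dbell2prio(param);

	if (prio < 0)
		return -1;

	vcpu->pending &= ~(1UL << prio);
	return 0;
}
```

In this model the type is decoded from the register contents, the PIR field selects a single target, and the broadcast bit overrides the PIR match.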

With this patch, SMP guests on e500mc work.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - introduce and use constants
  - drop e500mc ifdefs
---
 arch/powerpc/include/asm/dbell.h |2 +
 arch/powerpc/kvm/booke.c |2 +
 arch/powerpc/kvm/e500_emulate.c  |   68 ++
 3 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h
index d7365b0..154c067 100644
--- a/arch/powerpc/include/asm/dbell.h
+++ b/arch/powerpc/include/asm/dbell.h
@@ -19,7 +19,9 @@
 
 #define PPC_DBELL_MSG_BRDCAST  (0x04000000)
 #define PPC_DBELL_TYPE(x)  (((x) & 0xf) << (63-36))
+#define PPC_DBELL_TYPE_MASK    PPC_DBELL_TYPE(0xf)
 #define PPC_DBELL_LPID(x)  ((x) << (63 - 49))
+#define PPC_DBELL_PIR_MASK 0x3fff
 enum ppc_dbell {
PPC_DBELL = 0,  /* doorbell */
PPC_DBELL_CRIT = 1, /* critical doorbell */
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 0b77be1..85bd5b8 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -326,6 +326,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
int_class = INT_CLASS_NONCRIT;
break;
case BOOKE_IRQPRIO_CRITICAL:
+   case BOOKE_IRQPRIO_DBELL_CRIT:
allowed = vcpu->arch.shared->msr & MSR_CE;
allowed = allowed && !crit;
msr_mask = MSR_GS | MSR_ME;
@@ -342,6 +343,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
keep_irq = true;
/* fall through */
case BOOKE_IRQPRIO_EXTERNAL:
+   case BOOKE_IRQPRIO_DBELL:
allowed = vcpu->arch.shared->msr & MSR_EE;
allowed = allowed && !crit;
msr_mask = MSR_GS | MSR_CE | MSR_ME | MSR_DE;
diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index 98b6c1c..99155f8 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -14,16 +14,74 @@
 
 #include <asm/kvm_ppc.h>
 #include <asm/disassemble.h>
+#include <asm/dbell.h>
 
 #include "booke.h"
 #include "e500.h"
 
+#define XOP_MSGSND  206
+#define XOP_MSGCLR  238
 #define XOP_TLBIVAX 786
 #define XOP_TLBSX   914
 #define XOP_TLBRE   946
 #define XOP_TLBWE   978
 #define XOP_TLBILX  18
 
+#ifdef CONFIG_KVM_E500MC
+static int dbell2prio(ulong param)
+{
+   int msg = param & PPC_DBELL_TYPE_MASK;
+   int prio = -1;
+
+   switch (msg) {
+   case PPC_DBELL_TYPE(PPC_DBELL):
+   prio = BOOKE_IRQPRIO_DBELL;
+   break;
+   case PPC_DBELL_TYPE(PPC_DBELL_CRIT):
+   prio = BOOKE_IRQPRIO_DBELL_CRIT;
+   break;
+   default:
+   break;
+   }
+
+   return prio;
+}
+
+static int kvmppc_e500_emul_msgclr(struct kvm_vcpu *vcpu, int rb)
+{
+   ulong param = vcpu->arch.gpr[rb];
+   int prio = dbell2prio(param);
+
+   if (prio < 0)
+   return EMULATE_FAIL;
+
+   clear_bit(prio, &vcpu->arch.pending_exceptions);
+   return EMULATE_DONE;
+}
+
+static int kvmppc_e500_emul_msgsnd(struct kvm_vcpu *vcpu, int rb)
+{
+   ulong param = vcpu->arch.gpr[rb];
+   int prio = dbell2prio(param);
+   int pir = param & PPC_DBELL_PIR_MASK;
+   int i;
+   struct kvm_vcpu *cvcpu;
+
+   if (prio < 0)
+   return EMULATE_FAIL;
+
+   kvm_for_each_vcpu(i, cvcpu, vcpu->kvm) {
+   int cpir = cvcpu->arch.shared->pir;
+   if ((param & PPC_DBELL_MSG_BRDCAST) || (cpir == pir)) {
+   set_bit(prio, &cvcpu->arch.pending_exceptions);
+   kvm_vcpu_kick(cvcpu);
+   }
+   }
+
+   return EMULATE_DONE;
+}
+#endif
+
 int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int inst, int *advance)
 {
@@ -36,6 +94,16 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
case 31:
switch (get_xop(inst)) {
 
+#ifdef CONFIG_KVM_E500MC
+   case XOP_MSGSND:
+   emulated = kvmppc_e500_emul_msgsnd(vcpu, get_rb(inst));
+   break;
+
+   case XOP_MSGCLR:
+   emulated = kvmppc_e500_emul_msgclr(vcpu, get_rb(inst));
+   break;
+#endif
+
case XOP_TLBRE:
emulated = kvmppc_e500_emul_tlbre(vcpu);
break;
-- 
1.6.0.2



[PATCH 14/37] KVM: PPC: booke: standard PPC floating point support

2012-02-24 Thread Alexander Graf
From: Scott Wood scottw...@freescale.com

e500mc has a normal PPC FPU, rather than SPE which is found
on e500v1/v2.
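
The state handling this implies around guest entry is a swap: park the host thread's FP state, load the guest's, run, then undo both steps. A minimal userspace sketch of that pattern follows. All types and names here are hypothetical stand-ins, the guest is modelled as a callback, and the actual register loading done by load_up_fpu()/giveup_fpu() is not represented.

```c
#include <assert.h>
#include <stdint.h>

struct fp_state {
	uint64_t fpr[32];
	uint32_t fpscr;
};

struct toy_thread {
	struct fp_state fp;
	int fpexc_mode;
};

struct toy_vcpu {
	struct fp_state fp;
	int fpu_active;
};

typedef int (*guest_fn)(struct toy_thread *);

/* Swap host FP state out, run the guest on the thread's state, swap back. */
static int toy_vcpu_run(struct toy_thread *t, struct toy_vcpu *v,
			guest_fn run_guest)
{
	struct fp_state host = t->fp;	/* host state parked on the stack */
	int host_mode = t->fpexc_mode;
	int ret;

	t->fp = v->fp;			/* guest state becomes thread state */
	v->fpu_active = 1;

	ret = run_guest(t);

	v->fpu_active = 0;
	v->fp = t->fp;			/* capture what the guest changed */

	t->fp = host;			/* restore the host's view */
	t->fpexc_mode = host_mode;
	return ret;
}

/* Example "guest" that modifies one FP register. */
static int bump_fpr0(struct toy_thread *t)
{
	t->fp.fpr[0] += 1;
	return 0;
}
```

Keeping the guest state in the thread structure while the guest runs is what allows the host's normal lazy-FP machinery to save it if another thread needs the FPU.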

Based on code from Liu Yu yu@freescale.com.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/system.h |1 +
 arch/powerpc/kvm/booke.c  |   44 +
 arch/powerpc/kvm/booke.h  |   30 +
 3 files changed, 75 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/system.h b/arch/powerpc/include/asm/system.h
index c377457..73eee86 100644
--- a/arch/powerpc/include/asm/system.h
+++ b/arch/powerpc/include/asm/system.h
@@ -140,6 +140,7 @@ extern void via_cuda_init(void);
 extern void read_rtc_time(void);
 extern void pmac_find_display(void);
 extern void giveup_fpu(struct task_struct *);
+extern void load_up_fpu(void);
 extern void disable_kernel_fp(void);
 extern void enable_kernel_fp(void);
 extern void flush_fp_to_thread(struct task_struct *);
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 75dbaeb..0b77be1 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -457,6 +457,11 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 {
int ret;
+#ifdef CONFIG_PPC_FPU
+   unsigned int fpscr;
+   int fpexc_mode;
+   u64 fpr[32];
+#endif
 
if (!vcpu->arch.sane) {
kvm_run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
@@ -479,7 +484,46 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
}
 
kvm_guest_enter();
+
+#ifdef CONFIG_PPC_FPU
+   /* Save userspace FPU state in stack */
+   enable_kernel_fp();
+   memcpy(fpr, current->thread.fpr, sizeof(current->thread.fpr));
+   fpscr = current->thread.fpscr.val;
+   fpexc_mode = current->thread.fpexc_mode;
+
+   /* Restore guest FPU state to thread */
+   memcpy(current->thread.fpr, vcpu->arch.fpr, sizeof(vcpu->arch.fpr));
+   current->thread.fpscr.val = vcpu->arch.fpscr;
+
+   /*
+* Since we can't trap on MSR_FP in GS-mode, we consider the guest
+* as always using the FPU.  Kernel usage of FP (via
+* enable_kernel_fp()) in this thread must not occur while
+* vcpu->fpu_active is set.
+*/
+   vcpu->fpu_active = 1;
+
+   kvmppc_load_guest_fp(vcpu);
+#endif
+
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
+
+#ifdef CONFIG_PPC_FPU
+   kvmppc_save_guest_fp(vcpu);
+
+   vcpu->fpu_active = 0;
+
+   /* Save guest FPU state from thread */
+   memcpy(vcpu->arch.fpr, current->thread.fpr, sizeof(vcpu->arch.fpr));
+   vcpu->arch.fpscr = current->thread.fpscr.val;
+
+   /* Restore userspace FPU state from stack */
+   memcpy(current->thread.fpr, fpr, sizeof(current->thread.fpr));
+   current->thread.fpscr.val = fpscr;
+   current->thread.fpexc_mode = fpexc_mode;
+#endif
+
kvm_guest_exit();
 
 out:
diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
index d53bcf2..3bf5eda 100644
--- a/arch/powerpc/kvm/booke.h
+++ b/arch/powerpc/kvm/booke.h
@@ -96,4 +96,34 @@ enum int_class {
 
 void kvmppc_set_pending_interrupt(struct kvm_vcpu *vcpu, enum int_class type);
 
+/*
+ * Load up guest vcpu FP state if it's needed.
+ * It also set the MSR_FP in thread so that host know
+ * we're holding FPU, and then host can help to save
+ * guest vcpu FP state if other threads require to use FPU.
+ * This simulates an FP unavailable fault.
+ *
+ * It requires to be called with preemption disabled.
+ */
+static inline void kvmppc_load_guest_fp(struct kvm_vcpu *vcpu)
+{
+#ifdef CONFIG_PPC_FPU
+   if (vcpu->fpu_active && !(current->thread.regs->msr & MSR_FP)) {
+   load_up_fpu();
+   current->thread.regs->msr |= MSR_FP;
+   }
+#endif
+}
+
+/*
+ * Save guest vcpu FP state into thread.
+ * It requires to be called with preemption disabled.
+ */
+static inline void kvmppc_save_guest_fp(struct kvm_vcpu *vcpu)
+{
+#ifdef CONFIG_PPC_FPU
+   if (vcpu->fpu_active && (current->thread.regs->msr & MSR_FP))
+   giveup_fpu(current);
+#endif
+}
 #endif /* __KVM_BOOKE_H__ */
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 22/37] KVM: PPC: booke: remove leftover debugging

2012-02-24 Thread Alexander Graf
The e500mc patches left some debug code in that we don't need. Remove it.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c |5 -
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 9fcc760..17d5318 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -469,11 +469,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
return -EINVAL;
}
 
-   if (!current->thread.kvm_vcpu) {
-   WARN(1, "no vcpu\n");
-   return -EPERM;
-   }
-
local_irq_disable();
 
kvmppc_core_prepare_to_enter(vcpu);
-- 
1.6.0.2



[PATCH 21/37] KVM: PPC: make e500v2 kvm and e500mc cpu mutually exclusive

2012-02-24 Thread Alexander Graf
We can't run e500v2 kvm on e500mc kernels, so indicate that by
making the 2 options mutually exclusive in kconfig.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/Kconfig |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 44a998d..f4dacb9 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -120,7 +120,7 @@ config KVM_EXIT_TIMING
 
 config KVM_E500V2
bool "KVM support for PowerPC E500v2 processors"
-   depends on EXPERIMENTAL && E500
+   depends on EXPERIMENTAL && E500 && !PPC_E500MC
select KVM
select KVM_MMIO
---help---
-- 
1.6.0.2



[PATCH 19/37] KVM: PPC: e500mc: add load inst fixup

2012-02-24 Thread Alexander Graf
There's always a chance we're unable to read a guest instruction. The guest
could have its TLB entry mapped executable but not readable, or something odd
could happen and our TLB gets flushed. So it's a good idea to be prepared for
that case and have a fallback that allows us to fix things up.

Add fixup code that keeps guest code from potentially crashing our host kernel.
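
For readers unfamiliar with the fixup mechanism, here is a minimal user-space
sketch of the idea, not the kernel implementation: a table pairs the address
of an instruction that may fault with a recovery address, and a faulting load
yields an invalid marker instead of crashing. The names ex_entry, find_fixup,
guarded_load and INST_FETCH_FAILED are hypothetical and only illustrate the
lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's KVM_INST_FETCH_FAILED marker. */
#define INST_FETCH_FAILED 0xffffffffu

/* One table entry: a possibly faulting instruction and its fixup address. */
struct ex_entry {
	uintptr_t insn;		/* address of the instruction that may fault */
	uintptr_t fixup;	/* address to resume at if it does */
};

/* Return the fixup address registered for pc, or 0 if none exists. */
static uintptr_t find_fixup(const struct ex_entry *tbl, size_t n, uintptr_t pc)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].insn == pc)
			return tbl[i].fixup;
	return 0;
}

/*
 * Model of a guarded load at address pc. Returns 0 and stores either the
 * loaded value or the invalid marker; returns -1 when the load faults and
 * no fixup is registered (the case where a real kernel would oops).
 */
static int guarded_load(const struct ex_entry *tbl, size_t n, uintptr_t pc,
			int faults, uint32_t value, uint32_t *out)
{
	if (!faults) {
		*out = value;
		return 0;
	}
	if (find_fixup(tbl, n, pc)) {
		*out = INST_FETCH_FAILED;
		return 0;
	}
	return -1;
}
```

In the patch itself the table entry is emitted by the assembler into
__ex_table and the recovery code lives in .fixup; the sketch only models the
lookup and the resulting value.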

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - fix whitespace
  - use explicit preempt counts
---
 arch/powerpc/kvm/bookehv_interrupts.S |   30 +-
 1 files changed, 29 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index 63023ae..f7dc3f6 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -28,6 +28,7 @@
 #include <asm/asm-compat.h>
 #include <asm/asm-offsets.h>
 #include <asm/bitsperlong.h>
+#include <asm/thread_info.h>
 
 #include "../kernel/head_booke.h" /* for THREAD_NORMSAVE() */
 
@@ -171,9 +172,36 @@
PPC_STL r30, VCPU_GPR(r30)(r4)
PPC_STL r31, VCPU_GPR(r31)(r4)
mtspr   SPRN_EPLC, r8
+
+   /* disable preemption, so we are sure we hit the fixup handler */
+#ifdef CONFIG_PPC64
+   clrrdi  r8,r1,THREAD_SHIFT
+#else
+   rlwinm  r8,r1,0,0,31-THREAD_SHIFT   /* current thread_info */
+#endif
+   li  r7, 1
+   stw r7, TI_PREEMPT(r8)
+
isync
-   lwepx   r9, 0, r5
+
+   /*
+* In case the read goes wrong, we catch it and write an invalid value
+* in LAST_INST instead.
+*/
+1: lwepx   r9, 0, r5
+2:
+.section .fixup, "ax"
+3: li  r9, KVM_INST_FETCH_FAILED
+   b   2b
+.previous
+.section __ex_table,"a"
+   PPC_LONG_ALIGN
+   PPC_LONG 1b,3b
+.previous
+
mtspr   SPRN_EPLC, r3
+   li  r7, 0
+   stw r7, TI_PREEMPT(r8)
stw r9, VCPU_LAST_INST(r4)
.endif
 
-- 
1.6.0.2



[PATCH 20/37] KVM: PPC: rename CONFIG_KVM_E500 -> CONFIG_KVM_E500V2

2012-02-24 Thread Alexander Graf
The CONFIG_KVM_E500 option really indicates that we're running on a V2 machine,
not on a machine of the generic E500 class. So indicate that properly and
change the config name accordingly.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/Kconfig|8 
 arch/powerpc/kvm/Makefile   |4 ++--
 arch/powerpc/kvm/booke.c|2 +-
 arch/powerpc/kvm/e500.h |6 +++---
 arch/powerpc/kvm/e500_tlb.c |2 +-
 arch/powerpc/kvm/powerpc.c  |8 
 6 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 58f6e68..44a998d 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -109,7 +109,7 @@ config KVM_440
 
 config KVM_EXIT_TIMING
bool "Detailed exit timing"
-   depends on KVM_440 || KVM_E500 || KVM_E500MC
+   depends on KVM_440 || KVM_E500V2 || KVM_E500MC
---help---
  Calculate elapsed time for every exit/enter cycle. A per-vcpu
  report is available in debugfs kvm/vm#_vcpu#_timing.
@@ -118,14 +118,14 @@ config KVM_EXIT_TIMING
 
  If unsure, say N.
 
-config KVM_E500
-   bool "KVM support for PowerPC E500 processors"
+config KVM_E500V2
+   bool "KVM support for PowerPC E500v2 processors"
depends on EXPERIMENTAL && E500
select KVM
select KVM_MMIO
---help---
  Support running unmodified E500 guest kernels in virtual machines on
- E500 host processors.
+ E500v2 host processors.
 
  This module provides access to the hardware capabilities through
  a character device node named /dev/kvm.
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 62febd7..25225ae 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -36,7 +36,7 @@ kvm-e500-objs := \
e500.o \
e500_tlb.o \
e500_emulate.o
-kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs)
+kvm-objs-$(CONFIG_KVM_E500V2) := $(kvm-e500-objs)
 
 kvm-e500mc-objs := \
$(common-objs-y) \
@@ -98,7 +98,7 @@ kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs)
 kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
 
 obj-$(CONFIG_KVM_440) += kvm.o
-obj-$(CONFIG_KVM_E500) += kvm.o
+obj-$(CONFIG_KVM_E500V2) += kvm.o
 obj-$(CONFIG_KVM_E500MC) += kvm.o
 obj-$(CONFIG_KVM_BOOK3S_64) += kvm.o
 obj-$(CONFIG_KVM_BOOK3S_32) += kvm.o
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index fcbe928..9fcc760 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -762,7 +762,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
gpa_t gpaddr;
gfn_t gfn;
 
-#ifdef CONFIG_KVM_E500
+#ifdef CONFIG_KVM_E500V2
if (!(vcpu->arch.shared->msr & MSR_PR) &&
(eaddr & PAGE_MASK) == vcpu->arch.magic_page_ea) {
kvmppc_map_magic(vcpu);
diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 3143085..7967f3f 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -39,7 +39,7 @@ struct tlbe_priv {
struct tlbe_ref ref; /* TLB0 only -- TLB1 uses tlb_refs */
 };
 
-#ifdef CONFIG_KVM_E500
+#ifdef CONFIG_KVM_E500V2
 struct vcpu_id_table;
 #endif
 
@@ -89,7 +89,7 @@ struct kvmppc_vcpu_e500 {
u64 *g2h_tlb1_map;
unsigned int *h2g_tlb1_rmap;
 
-#ifdef CONFIG_KVM_E500
+#ifdef CONFIG_KVM_E500V2
u32 pid[E500_PID_NUM];
 
/* vcpu id table */
@@ -136,7 +136,7 @@ void kvmppc_get_sregs_e500_tlb(struct kvm_vcpu *vcpu, 
struct kvm_sregs *sregs);
 int kvmppc_set_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs);
 
 
-#ifdef CONFIG_KVM_E500
+#ifdef CONFIG_KVM_E500V2
 unsigned int kvmppc_e500_get_sid(struct kvmppc_vcpu_e500 *vcpu_e500,
 unsigned int as, unsigned int gid,
 unsigned int pr, int avoid_recursion);
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index e232bb4..279e10a 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -156,7 +156,7 @@ static inline void write_host_tlbe(struct kvmppc_vcpu_e500 
*vcpu_e500,
}
 }
 
-#ifdef CONFIG_KVM_E500
+#ifdef CONFIG_KVM_E500V2
 void kvmppc_map_magic(struct kvm_vcpu *vcpu)
 {
struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 58a084f..26c6a8d 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -74,7 +74,7 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
}
case HC_VENDOR_KVM | KVM_HC_FEATURES:
r = HC_EV_SUCCESS;
-#if defined(CONFIG_PPC_BOOK3S) || defined(CONFIG_KVM_E500)
+#if defined(CONFIG_PPC_BOOK3S) || defined(CONFIG_KVM_E500V2)
/* XXX Missing magic page on 44x */
r2 |= (1 << KVM_FEATURE_MAGIC_PAGE);
 #endif
@@ -230,7 +230,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case 

[PATCH 26/37] KVM: PPC: bookehv: fix exit timing

2012-02-24 Thread Alexander Graf
When using exit timing stats, we clobber r9 in the NEED_EMU case,
so better move that part down a few lines and fix it that way.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/bookehv_interrupts.S |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index f7dc3f6..215381e 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -83,10 +83,6 @@
stw r10, VCPU_GUEST_PID(r4)
mtspr   SPRN_PID, r8
 
-   .if \flags & NEED_EMU
-   lwz r9, VCPU_KVM(r4)
-   .endif
-
 #ifdef CONFIG_KVM_EXIT_TIMING
/* save exit time */
 1: mfspr   r7, SPRN_TBRU
@@ -98,6 +94,10 @@
PPC_STL r9, VCPU_TIMING_EXIT_TBU(r4)
 #endif
 
+   .if \flags & NEED_EMU
+   lwz r9, VCPU_KVM(r4)
+   .endif
+
oris r8, r6, MSR_CE@h
 #ifndef CONFIG_64BIT
stw r6, (VCPU_SHARED_MSR + 4)(r11)
-- 
1.6.0.2



[PATCH 24/37] KVM: PPC: booke: rework rescheduling checks

2012-02-24 Thread Alexander Graf
Instead of checking whether we should reschedule only when we exited
due to an interrupt, let's always check before entering the guest back
again. This gets the target more in line with the other archs.

Also while at it, generalize the whole thing so that eventually we could
have a single kvmppc_prepare_to_enter function for all ppc targets that
does signal and reschedule checking for us.
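
The control flow described above can be sketched in plain C, as a simulation
rather than kernel code. The sim_* names and the counters in struct sim_vcpu
are hypothetical stand-ins for need_resched(), the core hook and
signal_pending(); the point is only that every event that re-enables
interrupts sends the loop back to the start.

```c
#include <stdbool.h>

/* Hypothetical per-vcpu state standing in for the real kernel checks. */
struct sim_vcpu {
	int resched_pending;	/* times need_resched() will still be true */
	int core_reenabled;	/* times the core hook re-enables interrupts */
	bool signal_pending;	/* models signal_pending(current) */
	int iterations;		/* loop passes, for inspection */
};

static bool sim_need_resched(struct sim_vcpu *v)
{
	if (v->resched_pending > 0) {
		v->resched_pending--;
		return true;
	}
	return false;
}

/* Returns nonzero when interrupts were enabled in between. */
static int sim_core_prepare_to_enter(struct sim_vcpu *v)
{
	if (v->core_reenabled > 0) {
		v->core_reenabled--;
		return 1;
	}
	return 0;
}

/* Returns nonzero if a signal is pending and check_signal is true. */
static int sim_prepare_to_enter(struct sim_vcpu *v, bool check_signal)
{
	int r = 0;

	for (;;) {
		v->iterations++;
		if (sim_need_resched(v))
			continue;	/* rescheduled: start over */
		if (sim_core_prepare_to_enter(v))
			continue;	/* interrupts were on: start over */
		if (check_signal && v->signal_pending)
			r = 1;
		break;
	}
	return r;
}
```

With two pending reschedules and one core event, the loop takes four passes
before it falls through to the signal check.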

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h |2 +-
 arch/powerpc/kvm/book3s.c  |4 ++-
 arch/powerpc/kvm/booke.c   |   70 ---
 3 files changed, 52 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index e709975..7f0a3da 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -95,7 +95,7 @@ extern int kvmppc_core_vcpu_translate(struct kvm_vcpu *vcpu,
 extern void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 extern void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu);
 
-extern void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu);
+extern int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu);
 extern int kvmppc_core_pending_dec(struct kvm_vcpu *vcpu);
 extern void kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong flags);
 extern void kvmppc_core_queue_dec(struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 7d54f4e..c8ead7b 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -258,7 +258,7 @@ static bool clear_irqprio(struct kvm_vcpu *vcpu, unsigned 
int priority)
return true;
 }
 
-void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
+int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 {
unsigned long *pending = &vcpu->arch.pending_exceptions;
unsigned long old_pending = vcpu->arch.pending_exceptions;
@@ -283,6 +283,8 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 
/* Tell the guest about our interrupt status */
kvmppc_update_int_pending(vcpu, *pending, old_pending);
+
+   return 0;
 }
 
 pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 9979be1..3fcec2c 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -439,8 +439,9 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu 
*vcpu)
 }
 
 /* Check pending exceptions and deliver one, if possible. */
-void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
+int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 {
+   int r = 0;
WARN_ON_ONCE(!irqs_disabled());
 
kvmppc_core_check_exceptions(vcpu);
@@ -451,8 +452,44 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
local_irq_disable();
 
kvmppc_set_exit_type(vcpu, EMULATED_MTMSRWE_EXITS);
-   kvmppc_core_check_exceptions(vcpu);
+   r = 1;
};
+
+   return r;
+}
+
+/*
+ * Common checks before entering the guest world.  Call with interrupts
+ * disabled.
+ *
+ * returns !0 if a signal is pending and check_signal is true
+ */
+static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal)
+{
+   int r = 0;
+
+   WARN_ON_ONCE(!irqs_disabled());
+   while (true) {
+   if (need_resched()) {
+   local_irq_enable();
+   cond_resched();
+   local_irq_disable();
+   continue;
+   }
+
+   if (kvmppc_core_prepare_to_enter(vcpu)) {
+   /* interrupts got enabled in between, so we
+  are back at square 1 */
+   continue;
+   }
+
+   if (check_signal && signal_pending(current))
+   r = 1;
+
+   break;
+   }
+
+   return r;
 }
 
 int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
@@ -470,10 +507,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
}
 
local_irq_disable();
-
-   kvmppc_core_prepare_to_enter(vcpu);
-
-   if (signal_pending(current)) {
+   if (kvmppc_prepare_to_enter(vcpu, true)) {
kvm_run->exit_reason = KVM_EXIT_INTR;
ret = -EINTR;
goto out;
@@ -598,25 +632,21 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 
switch (exit_nr) {
case BOOKE_INTERRUPT_MACHINE_CHECK:
-   kvm_resched(vcpu);
r = RESUME_GUEST;
break;
 
case BOOKE_INTERRUPT_EXTERNAL:
kvmppc_account_exit(vcpu, EXT_INTR_EXITS);
-   kvm_resched(vcpu);
r = RESUME_GUEST;
break;
 
case BOOKE_INTERRUPT_DECREMENTER:
kvmppc_account_exit(vcpu, DEC_EXITS);
-   kvm_resched(vcpu);
  

[PATCH 28/37] KVM: PPC: bookehv: remove SET_VCPU

2012-02-24 Thread Alexander Graf
The SET_VCPU macro is a leftover from times when the vcpu struct wasn't
stored in the thread on vcpu_load/put. It's not needed anymore. Remove it.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/bookehv_interrupts.S |8 
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index c5a0796..469bd3f 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -35,9 +35,6 @@
 #define GET_VCPU(vcpu, thread) \
PPC_LL  vcpu, THREAD_KVM_VCPU(thread)
 
-#define SET_VCPU(vcpu) \
-   PPC_STL vcpu, (THREAD + THREAD_KVM_VCPU)(r2)
-
 #define LONGBYTES  (BITS_PER_LONG / 8)
 
 #define VCPU_GPR(n)(VCPU_GPRS + (n * LONGBYTES))
@@ -517,11 +514,6 @@ lightweight_exit:
lwz r3, VCPU_GUEST_PID(r4)
mtspr   SPRN_PID, r3
 
-   /* Save vcpu pointer for the exception handlers
-* must be done before loading guest r2.
-*/
-// SET_VCPU(r4)
-
PPC_LL  r11, VCPU_SHARED(r4)
/* Save host mas4 and mas6 and load guest MAS registers */
mfspr   r3, SPRN_MAS4
-- 
1.6.0.2



[PATCH 25/37] KVM: PPC: booke: BOOKE_IRQPRIO_MAX is n+1

2012-02-24 Thread Alexander Graf
The semantics of BOOKE_IRQPRIO_MAX changed to denote the highest available
irqprio + 1, so let's reflect that in the code too.
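
A small sketch of the bound in question, assuming a hypothetical IRQPRIO_MAX
of 8 (the kernel's BOOKE_IRQPRIO_MAX has a different value): when the constant
means "highest priority + 1", the loop condition has to be a strict
less-than, otherwise one nonexistent priority is examined.

```c
/* Hypothetical priority count; highest valid priority is IRQPRIO_MAX - 1. */
#define IRQPRIO_MAX 8

/*
 * Return the lowest pending priority whose delivery succeeds, or -1.
 * deliverable is a bitmap of priorities that would be accepted.
 */
static int first_delivered(unsigned long pending, unsigned long deliverable)
{
	unsigned int prio;

	for (prio = 0; prio < IRQPRIO_MAX; prio++) {
		if (!(pending & (1UL << prio)))
			continue;	/* not pending */
		if (deliverable & (1UL << prio))
			return (int)prio;
	}
	return -1;
}
```

A bit set at position IRQPRIO_MAX itself is never looked at, which is the
behaviour the n+1 semantics call for.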

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 3fcec2c..288bc05 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -425,7 +425,7 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu 
*vcpu)
}
 
priority = __ffs(*pending);
-   while (priority <= BOOKE_IRQPRIO_MAX) {
+   while (priority < BOOKE_IRQPRIO_MAX) {
if (kvmppc_booke_irqprio_deliver(vcpu, priority))
break;
 
-- 
1.6.0.2



[PATCH 30/37] KVM: PPC: bookehv: add comment about shadow_msr

2012-02-24 Thread Alexander Graf
For BookE HV the guest-visible MSR is shared->msr and is identical to
the MSR that is in use while the guest is running, because we can't trap
reads from or writes to the MSR.

So shadow_msr is unused there. Indicate that with a comment.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index ed95f53..633d68f 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -386,6 +386,7 @@ struct kvm_vcpu_arch {
 #endif
u32 vrsave; /* also USPRG0 */
u32 mmucr;
+   /* shadow_msr is unused for BookE HV */
ulong shadow_msr;
ulong csrr0;
ulong csrr1;
-- 
1.6.0.2



[PATCH 31/37] KVM: PPC: booke: Readd debug abort code for machine check

2012-02-24 Thread Alexander Graf
When during guest execution we get a machine check interrupt, we don't
know how to handle it yet. So let's add the error printing code back
again that we dropped accidentally earlier and tell user space that something
went really wrong.
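
To make the value user space sees concrete, here is a minimal sketch of how
the debug exit reason is composed, assuming the shift in the patch is a left
shift by 32. The helper names are hypothetical; only the arithmetic is the
point.

```c
#include <stdint.h>

/* Upper 32 bits become 0xfffffffe, lower 32 bits carry the MCSR value. */
static uint64_t mcheck_exit_reason(uint32_t mcsr)
{
	uint64_t reason = ~1ULL << 32;

	reason |= mcsr;
	return reason;
}

/* Recover the MCSR value from an exit reason built as above. */
static uint32_t mcheck_exit_mcsr(uint64_t reason)
{
	return (uint32_t)reason;
}
```

The upper half is deliberately not a valid exit reason, so user space can
tell the debug value apart from a regular one and still read MCSR from the
lower half.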

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 288bc05..451ba16 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -632,7 +632,12 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 
switch (exit_nr) {
case BOOKE_INTERRUPT_MACHINE_CHECK:
-   r = RESUME_GUEST;
+   printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
+   kvmppc_dump_vcpu(vcpu);
+   /* For debugging, send invalid exit reason to user space */
+   run->hw.hardware_exit_reason = ~1ULL << 32;
+   run->hw.hardware_exit_reason |= mfspr(SPRN_MCSR);
+   r = RESUME_HOST;
break;
 
case BOOKE_INTERRUPT_EXTERNAL:
-- 
1.6.0.2



[PATCH 32/37] KVM: PPC: booke: add GS documentation for program interrupt

2012-02-24 Thread Alexander Graf
The comment for program interrupts triggered when using bookehv was
misleading. Update it to mention why MSR_GS indicates that we have
to inject an interrupt into the guest again, not emulate it.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c |   10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 451ba16..7adef28 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -683,8 +683,14 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 
case BOOKE_INTERRUPT_PROGRAM:
if (vcpu->arch.shared->msr & (MSR_PR | MSR_GS)) {
-   /* Program traps generated by user-level software must be handled
-    * by the guest kernel. */
+   /*
+* Program traps generated by user-level software must
+* be handled by the guest kernel.
+*
+* In GS mode, hypervisor privileged instructions trap
+* on BOOKE_INTERRUPT_HV_PRIV, not here, so these are
+* actual program interrupts, handled by the guest.
+*/
kvmppc_core_queue_program(vcpu, vcpu->arch.fault_esr);
r = RESUME_GUEST;
kvmppc_account_exit(vcpu, USR_PR_INST);
-- 
1.6.0.2



[PATCH 29/37] KVM: PPC: bookehv: disable MAS register updates early

2012-02-24 Thread Alexander Graf
We need to make sure that no MAS updates happen automatically while we
have the guest MAS registers loaded. So move the disabling code a bit
higher up so that it covers the full time we have guest values in MAS
registers.

The race this patch fixes should never occur, but it makes the code a
bit more logical to do it this way around.
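
The ordering argument can be sketched as a tiny state model, not as the real
assembly: the disable bit is set before guest MAS values are loaded and
cleared only after they are gone. The EPCR_DMIUH value and the sim_cpu
structure are assumptions for illustration.

```c
#include <stdint.h>

/* Assumed bit value for illustration; check the kernel's SPRN_EPCR_DMIUH. */
#define EPCR_DMIUH 0x00400000u

struct sim_cpu {
	uint32_t epcr;		/* models the EPCR register */
	int guest_mas_loaded;	/* nonzero while guest MAS values are live */
};

static void enter_guest_mas(struct sim_cpu *c)
{
	c->epcr |= EPCR_DMIUH;		/* block automatic MAS updates first */
	c->guest_mas_loaded = 1;	/* only then load guest MAS values */
}

static void leave_guest_mas(struct sim_cpu *c)
{
	c->guest_mas_loaded = 0;	/* host MAS values restored */
	c->epcr &= ~EPCR_DMIUH;		/* automatic updates allowed again */
}

/* Invariant: guest MAS values are never live with updates enabled. */
static int mas_state_ok(const struct sim_cpu *c)
{
	return !c->guest_mas_loaded || (c->epcr & EPCR_DMIUH) != 0;
}
```

Swapping the two statements in enter_guest_mas() would open the window the
commit message describes, even if nothing is expected to hit it in practice.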

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/bookehv_interrupts.S |   10 ++
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index 469bd3f..021d087 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -358,6 +358,7 @@ _GLOBAL(kvmppc_resume_host)
mtspr   SPRN_MAS4, r6
stw r5, VCPU_SHARED_MAS7_3+0(r11)
mtspr   SPRN_MAS6, r8
+   /* Enable MAS register updates via exception */
mfspr   r3, SPRN_EPCR
rlwinm  r3, r3, 0, ~SPRN_EPCR_DMIUH
mtspr   SPRN_EPCR, r3
@@ -515,6 +516,11 @@ lightweight_exit:
mtspr   SPRN_PID, r3
 
PPC_LL  r11, VCPU_SHARED(r4)
+   /* Disable MAS register updates via exception */
+   mfspr   r3, SPRN_EPCR
+   oris r3, r3, SPRN_EPCR_DMIUH@h
+   mtspr   SPRN_EPCR, r3
+   isync
/* Save host mas4 and mas6 and load guest MAS registers */
mfspr   r3, SPRN_MAS4
stw r3, VCPU_HOST_MAS4(r4)
@@ -538,10 +544,6 @@ lightweight_exit:
lwz r5, VCPU_SHARED_MAS7_3+0(r11)
mtspr   SPRN_MAS6, r3
mtspr   SPRN_MAS7, r5
-   /* Disable MAS register updates via exception */
-   mfspr   r3, SPRN_EPCR
-   oris r3, r3, SPRN_EPCR_DMIUH@h
-   mtspr   SPRN_EPCR, r3
 
/*
 * Host interrupt handlers may have clobbered these guest-readable
-- 
1.6.0.2



[PATCH 34/37] KVM: PPC: e500: fix typo in tlb code

2012-02-24 Thread Alexander Graf
The tlbncfg registers should be populated with their respective TLB's
values. Fix the obvious typo.
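
As an illustration of what "their respective TLB's values" means, here is a
sketch that builds each configuration word from its own parameters. The mask
and shift values and the gtlb_params layout are assumptions made for the
example, not taken from the patch.

```c
#include <stdint.h>

/* Assumed field layout, stated here for illustration only. */
#define CFG_N_ENTRY	0x00000fffu
#define CFG_ASSOC	0xff000000u
#define CFG_ASSOC_SHIFT	24

struct gtlb_params {
	uint32_t entries;
	uint32_t ways;
};

/* Replace the entry-count and associativity fields of one config word. */
static uint32_t build_tlbcfg(uint32_t hw_cfg, const struct gtlb_params *p)
{
	uint32_t cfg = hw_cfg & ~(CFG_N_ENTRY | CFG_ASSOC);

	cfg |= p->entries;
	cfg |= p->ways << CFG_ASSOC_SHIFT;
	return cfg;
}

/* Fill both words; word i only ever sees the parameters of TLB i. */
static void fill_tlbcfg(uint32_t out[2], const uint32_t hw[2],
			const struct gtlb_params params[2])
{
	int i;

	for (i = 0; i < 2; i++)
		out[i] = build_tlbcfg(hw[i], &params[i]);
}
```

Indexing both sides with the same i makes the kind of mix-up fixed here
harder to write in the first place.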

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/e500_tlb.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 279e10a..e05232b 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -1268,8 +1268,8 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 
*vcpu_e500)
 
vcpu->arch.tlbcfg[1] = mfspr(SPRN_TLB1CFG) &
 ~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
-   vcpu->arch.tlbcfg[0] |= vcpu_e500->gtlb_params[1].entries;
-   vcpu->arch.tlbcfg[0] |=
+   vcpu->arch.tlbcfg[1] |= vcpu_e500->gtlb_params[1].entries;
+   vcpu->arch.tlbcfg[1] |=
vcpu_e500->gtlb_params[1].ways << TLBnCFG_ASSOC_SHIFT;
 
return 0;
-- 
1.6.0.2



[PATCH 27/37] KVM: PPC: bookehv: remove negation for CONFIG_64BIT

2012-02-24 Thread Alexander Graf
Instead of doing

  #ifndef CONFIG_64BIT
  ...
  #else
  ...
  #endif

we should rather do

  #ifdef CONFIG_64BIT
  ...
  #else
  ...
  #endif

which is a lot easier to read. Change the bookehv implementation to
stick with this rule.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/bookehv_interrupts.S |   24 
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index 215381e..c5a0796 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -99,10 +99,10 @@
.endif
 
oris r8, r6, MSR_CE@h
-#ifndef CONFIG_64BIT
-   stw r6, (VCPU_SHARED_MSR + 4)(r11)
-#else
+#ifdef CONFIG_64BIT
std r6, (VCPU_SHARED_MSR)(r11)
+#else
+   stw r6, (VCPU_SHARED_MSR + 4)(r11)
 #endif
ori r8, r8, MSR_ME | MSR_RI
PPC_STL r5, VCPU_PC(r4)
@@ -344,10 +344,10 @@ _GLOBAL(kvmppc_resume_host)
stw r5, VCPU_SHARED_MAS0(r11)
mfspr   r7, SPRN_MAS2
stw r6, VCPU_SHARED_MAS1(r11)
-#ifndef CONFIG_64BIT
-   stw r7, (VCPU_SHARED_MAS2 + 4)(r11)
-#else
+#ifdef CONFIG_64BIT
std r7, (VCPU_SHARED_MAS2)(r11)
+#else
+   stw r7, (VCPU_SHARED_MAS2 + 4)(r11)
 #endif
mfspr   r5, SPRN_MAS3
mfspr   r6, SPRN_MAS4
@@ -530,10 +530,10 @@ lightweight_exit:
stw r3, VCPU_HOST_MAS6(r4)
lwz r3, VCPU_SHARED_MAS0(r11)
lwz r5, VCPU_SHARED_MAS1(r11)
-#ifndef CONFIG_64BIT
-   lwz r6, (VCPU_SHARED_MAS2 + 4)(r11)
-#else
+#ifdef CONFIG_64BIT
ld  r6, (VCPU_SHARED_MAS2)(r11)
+#else
+   lwz r6, (VCPU_SHARED_MAS2 + 4)(r11)
 #endif
lwz r7, VCPU_SHARED_MAS7_3+4(r11)
lwz r8, VCPU_SHARED_MAS4(r11)
@@ -572,10 +572,10 @@ lightweight_exit:
PPC_LL  r6, VCPU_CTR(r4)
PPC_LL  r7, VCPU_CR(r4)
PPC_LL  r8, VCPU_PC(r4)
-#ifndef CONFIG_64BIT
-   lwz r9, (VCPU_SHARED_MSR + 4)(r11)
-#else
+#ifdef CONFIG_64BIT
ld  r9, (VCPU_SHARED_MSR)(r11)
+#else
+   lwz r9, (VCPU_SHARED_MSR + 4)(r11)
 #endif
PPC_LL  r0, VCPU_GPR(r0)(r4)
PPC_LL  r1, VCPU_GPR(r1)(r4)
-- 
1.6.0.2



[PATCH 33/37] KVM: PPC: bookehv: remove unused code

2012-02-24 Thread Alexander Graf
There was some unused code in the exit code path that must have been
a leftover from earlier iterations. While it did no harm, it's superfluous
and thus should be removed.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/bookehv_interrupts.S |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index 021d087..b1c099b 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -112,9 +112,6 @@
 * appropriate for the exception type).
 */
cmpw r6, r8
-   .if \flags & NEED_EMU
-   lwz r9, KVM_LPID(r9)
-   .endif
beq 1f
mfmsr   r7
.if \srr0 != SPRN_MCSRR0 && \srr0 != SPRN_CSRR0
-- 
1.6.0.2



[PATCH 35/37] KVM: PPC: booke: Support perfmon interrupts

2012-02-24 Thread Alexander Graf
When during guest context we get a performance monitor interrupt, we
currently bail out and oops. Let's route it to its correct handler
instead.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 7adef28..423701b 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -677,6 +677,10 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
r = RESUME_GUEST;
break;
 
+   case BOOKE_INTERRUPT_PERFORMANCE_MONITOR:
+   r = RESUME_GUEST;
+   break;
+
case BOOKE_INTERRUPT_HV_PRIV:
r = emulation_exit(run, vcpu);
break;
-- 
1.6.0.2



[PATCH 37/37] KVM: PPC: booke: Reinject performance monitor interrupts

2012-02-24 Thread Alexander Graf
When we get a performance monitor interrupt, we need to make sure that
the host receives it. So reinject it like we reinject the other host
destined interrupts.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/hw_irq.h |1 +
 arch/powerpc/kvm/booke.c  |3 +++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h 
b/arch/powerpc/include/asm/hw_irq.h
index bb712c9..904e66c 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -12,6 +12,7 @@
 #include <asm/processor.h>
 
 extern void timer_interrupt(struct pt_regs *);
+extern void performance_monitor_exception(struct pt_regs *regs);
 
 #ifdef CONFIG_PPC64
 #include <asm/paca.h>
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 709fd45..c391357 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -630,6 +630,9 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
case BOOKE_INTERRUPT_MACHINE_CHECK:
/* FIXME */
break;
+   case BOOKE_INTERRUPT_PERFORMANCE_MONITOR:
+   performance_monitor_exception(&regs);
+   break;
}
 }
 
-- 
1.6.0.2



[PATCH 36/37] KVM: PPC: booke: expose guest registers on irq reinject

2012-02-24 Thread Alexander Graf
When reinjecting an interrupt into the host interrupt handler after we're
back in host kernel land, let's tell the kernel about all the guest state
that the interrupt happened at.

This helps getting reasonable numbers out of perf.
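
A minimal sketch of the approach, using trimmed-down hypothetical structures
(sim_regs, sim_guest) instead of the kernel's pt_regs and vcpu: the handler
gets a copy of the host frame with the guest-visible fields overwritten, so
the host's own frame stays untouched.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down register frames for illustration only. */
struct sim_regs {
	uint64_t gpr[32];
	uint64_t nip;
	uint64_t msr;
	uint64_t softe;		/* host-only bookkeeping, not guest state */
};

struct sim_guest {
	uint64_t gpr[32];
	uint64_t pc;
	uint64_t msr;
};

/* Overwrite the guest-visible fields of a register frame. */
static void sim_fill_regs(const struct sim_guest *g, struct sim_regs *regs)
{
	int i;

	for (i = 0; i < 32; i++)
		regs->gpr[i] = g->gpr[i];
	regs->nip = g->pc;
	regs->msr = g->msr;
}

/*
 * Build the frame handed to a host handler: start from a copy of the
 * host frame, then patch in the guest state.
 */
static struct sim_regs sim_reinject_frame(const struct sim_regs *host,
					  const struct sim_guest *g)
{
	struct sim_regs regs = *host;

	sim_fill_regs(g, &regs);
	return regs;
}
```

A profiler sampling from such a frame attributes the interrupt to the guest
program counter rather than to the host code that happened to reinject it.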

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c |   54 +
 1 files changed, 39 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 423701b..709fd45 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -593,37 +593,61 @@ static int emulation_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu)
}
 }
 
-/**
- * kvmppc_handle_exit
- *
- * Return value is in the form (errcode<<2 | RESUME_FLAG_HOST | RESUME_FLAG_NV)
- */
-int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
-   unsigned int exit_nr)
+static void kvmppc_fill_pt_regs(struct kvm_vcpu *vcpu, struct pt_regs *regs)
 {
-   int r = RESUME_HOST;
+   int i;
 
-   /* update before a new last_exit_type is rewritten */
-   kvmppc_update_timing_stats(vcpu);
+   for (i = 0; i < 32; i++)
+   regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
+   regs->nip = vcpu->arch.pc;
+   regs->msr = vcpu->arch.shared->msr;
+   regs->ctr = vcpu->arch.ctr;
+   regs->link = vcpu->arch.lr;
+   regs->xer = kvmppc_get_xer(vcpu);
+   regs->ccr = kvmppc_get_cr(vcpu);
+   regs->dar = get_guest_dear(vcpu);
+   regs->dsisr = get_guest_esr(vcpu);
+}
+
+static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
+unsigned int exit_nr)
+{
+   struct pt_regs regs = *current->thread.regs;
 
+   kvmppc_fill_pt_regs(vcpu, &regs);
switch (exit_nr) {
case BOOKE_INTERRUPT_EXTERNAL:
-   do_IRQ(current->thread.regs);
+   do_IRQ(&regs);
break;
-
case BOOKE_INTERRUPT_DECREMENTER:
-   timer_interrupt(current->thread.regs);
+   timer_interrupt(&regs);
break;
-
 #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_BOOK3E_64)
case BOOKE_INTERRUPT_DOORBELL:
-   doorbell_exception(current->thread.regs);
+   doorbell_exception(&regs);
break;
 #endif
case BOOKE_INTERRUPT_MACHINE_CHECK:
/* FIXME */
break;
}
+}
+
+/**
+ * kvmppc_handle_exit
+ *
+ * Return value is in the form (errcode<<2 | RESUME_FLAG_HOST | RESUME_FLAG_NV)
+ */
+int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
+   unsigned int exit_nr)
+{
+   int r = RESUME_HOST;
+
+   /* update before a new last_exit_type is rewritten */
+   kvmppc_update_timing_stats(vcpu);
+
+   /* restart interrupts if they were meant for the host */
+   kvmppc_restart_interrupt(vcpu, exit_nr);
 
local_irq_enable();
 
-- 
1.6.0.2



Re: [PATCH 00/14] DMA-mapping framework redesign preparation

2012-02-24 Thread Arnd Bergmann
On Friday 23 December 2011, Marek Szyprowski wrote:
> The solution we found is to introduce new public DMA mapping functions
> with an additional attributes argument: dma_alloc_attrs() and
> dma_free_attrs(). This way all different kinds of architecture-specific
> buffer mappings can be hidden behind the attributes without the need of
> creating several versions of dma_alloc_ function.

Since the patches are now in linux-next, we should make sure that they
can actually get merged into 3.4.

I've looked at all the patches again and found them to be straightforward
and helpful; I hope we can get them merged next time. Please add my

Reviewed-by: Arnd Bergmann <a...@arndb.de>


[PATCH] sparsemem/bootmem: catch greater than section size allocations

2012-02-24 Thread Nishanth Aravamudan
While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
Overcommit) on powerpc, we tripped the following:

kernel BUG at mm/bootmem.c:483!
cpu 0x0: Vector: 700 (Program Check) at [c0c03940]
pc: c0a62bd8: .alloc_bootmem_core+0x90/0x39c
lr: c0a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
sp: c0c03bc0
   msr: 80021032
  current = 0xc0b0cce0
  paca= 0xc1d8
pid   = 0, comm = swapper
kernel BUG at mm/bootmem.c:483!
enter ? for help
[c0c03c80] c0a64bcc
.sparse_early_usemaps_alloc_node+0x84/0x29c
[c0c03d50] c0a64f10 .sparse_init+0x12c/0x28c
[c0c03e20] c0a474f4 .setup_arch+0x20c/0x294
[c0c03ee0] c0a4079c .start_kernel+0xb4/0x460
[c0c03f90] c0009670 .start_here_common+0x1c/0x2c

This is

BUG_ON(limit && goal + size > limit);

and after some debugging, it seems that

goal = 0x700
limit = 0x800

and sparse_early_usemaps_alloc_node -
sparse_early_usemaps_alloc_pgdat_section - alloc_bootmem_section calls

return alloc_bootmem_section(usemap_size() * count, section_nr);

This is on a system with 8TB available via the AMS pool, and as a quirk
of AMS in firmware, all of that memory shows up in node 0. So, we end up
with an allocation that will fail the goal/limit constraints. In theory,
we could fall back to alloc_bootmem_node() in
sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
defined, we'll BUG_ON() instead. A simple solution appears to be to
disable the limit check if the size of the allocation in
alloc_bootmem_section() exceeds the section size.
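
[Editor's sketch, not part of the patch] The limit handling described above
can be modelled in user space roughly as follows. The PAGE_SHIFT and
SECTION_SIZE_BITS values are assumptions picked for illustration (the real
ones are per-architecture), and section_alloc_limit() and violates_limit()
are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; the real ones are per-architecture. */
#define PAGE_SHIFT         12
#define SECTION_SIZE_BITS  24
#define PFN_SECTION_SHIFT  (SECTION_SIZE_BITS - PAGE_SHIFT)
#define BYTES_PER_SECTION  (1ULL << SECTION_SIZE_BITS)

/* First page frame number of a section. */
static uint64_t section_nr_to_pfn(uint64_t section_nr)
{
	return section_nr << PFN_SECTION_SHIFT;
}

/*
 * Upper bound handed to the allocator. A return value of 0 means
 * "no limit", which is what the patch selects once the request is
 * larger than a single section.
 */
static uint64_t section_alloc_limit(uint64_t size, uint64_t section_nr)
{
	assert(size > 0);
	if (size > BYTES_PER_SECTION)
		return 0;
	return section_nr_to_pfn(section_nr + 1) << PAGE_SHIFT;
}

/* The condition behind the BUG_ON(): nonzero when the request cannot fit. */
static int violates_limit(uint64_t goal, uint64_t size, uint64_t limit)
{
	return limit && goal + size > limit;
}
```

With these assumed sizes, a request of BYTES_PER_SECTION + 1 bytes starting
at the base of section 7 violates a limit placed at the start of section 8,
but passes once the limit is 0.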

Signed-off-by: Nishanth Aravamudan <n...@us.ibm.com>
Cc: Dave Hansen <haveb...@us.ibm.com>
Cc: Anton Blanchard <an...@au1.ibm.com>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Ben Herrenschmidt <b...@kernel.crashing.org>
Cc: Robert Jennings <r...@linux.vnet.ibm.com>
Cc: linux...@kvack.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 include/linux/mmzone.h |2 ++
 mm/bootmem.c   |5 -
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 650ba2f..4176834 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -967,6 +967,8 @@ static inline unsigned long early_pfn_to_nid(unsigned long pfn)
  * PA_SECTION_SHIFTphysical address to/from section number
  * PFN_SECTION_SHIFT   pfn to/from section number
  */
+#define BYTES_PER_SECTION  (1UL << SECTION_SIZE_BITS)
+
 #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
 
 #define PA_SECTION_SHIFT   (SECTION_SIZE_BITS)
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 668e94d..5cbbc76 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -770,7 +770,10 @@ void * __init alloc_bootmem_section(unsigned long size,
 
    pfn = section_nr_to_pfn(section_nr);
    goal = pfn << PAGE_SHIFT;
-   limit = section_nr_to_pfn(section_nr + 1) << PAGE_SHIFT;
+   if (size > BYTES_PER_SECTION)
+       limit = 0;
+   else
+       limit = section_nr_to_pfn(section_nr + 1) << PAGE_SHIFT;
    bdata = &bootmem_node_data[early_pfn_to_nid(pfn)];
 
    return alloc_bootmem_core(bdata, size, SMP_CACHE_BYTES, goal, limit);
-- 
1.7.5.4



Re: warnings from drivers/tty/ehv_bytechan.c

2012-02-24 Thread gre...@linuxfoundation.org
On Mon, Feb 20, 2012 at 01:24:22PM +, Tabi Timur-B04825 wrote:
> Stephen Rothwell wrote:
> > console_initcall() is not defined for modules.
>
> Hmmm... the patch you posted is a good short-term fix, but I wonder if it
> makes sense for the driver to support modules at all.  I have this in the
> driver:
>
> #include <linux/module.h>
> ...
> module_init(ehv_bc_init);
> module_exit(ehv_bc_exit);
>
> although to be honest, I can't remember the last time I tried to compile
> it as a module.
>
> The problem stems from the fact that it's a console driver *and* a tty
> driver.  It makes sense that a tty driver can be compiled as a module, but
> not a console driver.
>
> So Greg, can I do something like this:
>
> #ifdef MODULE
> module_initcall(ehv_bc_console_init)
> #else
> console_initcall(ehv_bc_console_init);
> #endif

Sure, something like that is fine, but if the code really can't be a
module, why not just fix the Kconfig file to enforce this properly
instead?

thanks,

greg k-h


Re: warnings from drivers/tty/ehv_bytechan.c

2012-02-24 Thread Timur Tabi
gre...@linuxfoundation.org wrote:
> Sure, something like that is fine, but if the code really can't be a
> module, why not just fix the Kconfig file to enforce this properly
> instead?

That's the simplest approach, for use.  The TTY portion of the driver can
be used as a module.  Is there any real value in loading a TTY driver as a
module?  In this case, the console support for byte channels would not be
available.

-- 
Timur Tabi
Linux kernel developer at Freescale



Re: warnings from drivers/tty/ehv_bytechan.c

2012-02-24 Thread gre...@linuxfoundation.org
On Fri, Feb 24, 2012 at 04:00:12PM -0600, Timur Tabi wrote:
> gre...@linuxfoundation.org wrote:
> > Sure, something like that is fine, but if the code really can't be a
> > module, why not just fix the Kconfig file to enforce this properly
> > instead?
>
> That's the simplest approach, for use.  The TTY portion of the driver can
> be used as a module.  Is there any real value in loading a TTY driver as a
> module?

Depends on the hardware it supports :)

> In this case, the console support for byte channels would not be
> available.

Then it doesn't make sense, right?

thanks,

greg k-h



Re: warnings from drivers/tty/ehv_bytechan.c

2012-02-24 Thread Timur Tabi
gre...@linuxfoundation.org wrote:
> > That's the simplest approach, for use.  The TTY portion of the driver can
> > be used as a module.  Is there any real value in loading a TTY driver as a
> > module?
>
> Depends on the hardware it supports :)
>
> > In this case, the console support for byte channels would not be
> > available.
>
> Then it doesn't make sense, right?

I guess that's my question.  Is there a real use case for having console
output go to the serial port, and TTY go to a byte channel?  Even if you
wanted to do that, I suppose you don't need to load the byte channel
driver as a module to get that behavior.

Anyway, that's all academic.  A more important question is: now that the
driver can't be compiled as a module, should I change module_init() to
something else (like device_initcall)?

Should I remove this line?

#include <linux/module.h>

-- 
Timur Tabi
Linux kernel developer at Freescale



Re: warnings from drivers/tty/ehv_bytechan.c

2012-02-24 Thread gre...@linuxfoundation.org
On Fri, Feb 24, 2012 at 04:15:04PM -0600, Timur Tabi wrote:
> gre...@linuxfoundation.org wrote:
> > > That's the simplest approach, for use.  The TTY portion of the driver can
> > > be used as a module.  Is there any real value in loading a TTY driver as a
> > > module?
> >
> > Depends on the hardware it supports :)
> >
> > > In this case, the console support for byte channels would not be
> > > available.
> >
> > Then it doesn't make sense, right?
>
> I guess that's my question.  Is there a real use case for having console
> output go to the serial port, and TTY go to a byte channel?  Even if you
> wanted to do that, I suppose you don't need to load the byte channel
> driver as a module to get that behavior.
>
> Anyway, that's all academic.  A more important question is: now that the
> driver can't be compiled as a module, should I change module_init() to
> something else (like device_initcall)?
>
> Should I remove this line?
>
> #include <linux/module.h>

No, no need to, leave it as-is if it builds properly.

greg k-h


Re: [PATCH 09/24] PCI, powerpc: Register busn_res for root buses

2012-02-24 Thread Jesse Barnes
On Thu, 23 Feb 2012 12:51:30 -0800
Bjorn Helgaas <bhelg...@google.com> wrote:

> On Thu, Feb 23, 2012 at 12:25 PM, Jesse Barnes <jbar...@virtuousgeek.org> wrote:
> > On Fri, 10 Feb 2012 08:35:58 +1100
> > Benjamin Herrenschmidt <b...@kernel.crashing.org> wrote:
> >
> > > On Thu, 2012-02-09 at 11:24 -0800, Bjorn Helgaas wrote:
> > > > My point is that the interface between the arch and the PCI core
> > > > should be simply the arch telling the core this is the range of bus
> > > > numbers you can use.  If the firmware doesn't give you the HW limits,
> > > > that's the arch's problem.  If you want to assume 0..255 are
> > > > available, again, that's the arch's decision.
> > > >
> > > > But the answer to the question what bus numbers are available to me
> > > > depends only on the host bridge HW configuration.  It does not depend
> > > > on what pci_scan_child_bus() found.  Therefore, I think we can come up
> > > > with a design where pci_bus_update_busn_res_end() is unnecessary.
> > >
> > > In an ideal world yes. In a world where there are reverse engineered
> > > platforms on which we aren't 100% sure how things actually work under the
> > > hood and have the code just adapt on what's there (and try to fix it
> > > up -sometimes-), things can get a bit murky :-)
> > >
> > > But yes, I see your point. As for what is the correct setting that
> > > needs to be done so that the patch doesn't end up a regression for us,
> > > I'll have to dig into some ancient HW to dbl check a few things. I hope
> > > 0...255 will just work but I can't guarantee it.
> > >
> > > What I'll probably do is constrain the core to the values in
> > > hose->min/max, and update selected platforms to put 0..255 in there when
> > > I know for sure they can cope.
> >
> > But I think the point is, can't we initialize the busn resource after
> > the first & last bus numbers have been determined?  E.g. rather than
> > Yinghai's current:
> > +       pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_busno);
> > +
> >         /* Get probe mode and perform scan */
> >         mode = PCI_PROBE_NORMAL;
> >         if (node && ppc_md.pci_probe_mode)
> > @@ -1742,8 +1744,11 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
> >                 of_scan_bus(node, bus);
> >         }
> >
> > -       if (mode == PCI_PROBE_NORMAL)
> > +       if (mode == PCI_PROBE_NORMAL) {
> > +               pci_bus_update_busn_res_end(bus, 255);
> >                 hose->last_busno = bus->subordinate = pci_scan_child_bus(bus);
> > +               pci_bus_update_busn_res_end(bus, bus->subordinate);
> > +       }
> >
> > we'd have something more like:
> >
> >         /* Get probe mode and perform scan */
> >         mode = PCI_PROBE_NORMAL;
> >         if (node && ppc_md.pci_probe_mode)
> > @@ -1742,8 +1744,11 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
> >                 of_scan_bus(node, bus);
> >         }
> >
> >         if (mode == PCI_PROBE_NORMAL)
> >                 hose->last_busno = bus->subordinate = pci_scan_child_bus(bus);
> >
> > +       pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_busno);
> >
> > since we should have the final bus range by then?  Setting the end to
> > 255 and then changing it again doesn't make sense; and definitely makes
> > the code hard to follow.
 
> I have two issues here:
>
> 1) hose->last_busno is currently the highest bus number found by
> pci_scan_child_bus().  If I understand correctly,
> pci_bus_insert_busn_res() is supposed to update the core's idea of the
> host bridge's bus number aperture.  (Actually, I guess it just updates
> the *end* of the aperture, since we supply the start directly to
> pci_scan_root_bus()).  The aperture and the highest bus number we
> found are not related, except that we should have:
>
> hose->first_busno <= bus->subordinate <= hose->last_busno
>
> If we set the aperture to [first_busno - last_busno], we artificially
> prevent some hotplug.

Oh true, we'll need to allocate any extra bus number space somehow so
that hot plug of bridges is possible in the future w/o renumbering
(until our glorious future when we can move resources on the fly by
stopping drivers).
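
[Editor's sketch, not from the thread] The hotplug concern can be made
concrete with a small user-space model. It assumes the aperture is a plain
inclusive range; struct busn_aperture and busn_spare() are hypothetical
names introduced only for illustration and are not part of the PCI core.

```c
#include <assert.h>

/* Hypothetical stand-in for a host bridge's bus number aperture. */
struct busn_aperture {
	int start;	/* first bus number in the aperture */
	int end;	/* last bus number in the aperture (inclusive) */
};

/*
 * Bus numbers still free above the highest one found by the scan.
 * Zero means a hot-added bridge cannot get a new bus number without
 * renumbering.
 */
static int busn_spare(struct busn_aperture ap, int highest_found)
{
	/* The invariant from the mail: start <= subordinate <= end. */
	assert(ap.start <= highest_found && highest_found <= ap.end);
	return ap.end - highest_found;
}
```

If the aperture end is set to the highest bus number the scan found, the
spare count is 0; an aperture describing the full hardware range (for
example 0..255) leaves room for bridges added later.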

 
> 2) We already have a way to add resources to a root bus: the
> pci_add_resource() used to add I/O port and MMIO apertures.  I think
> it'd be a lot simpler to just use that same interface for the bus
> number aperture, e.g.,
>
>     pci_add_resource(&resources, &hose->io_space);
>     pci_add_resource(&resources, &hose->mem_space);
>     pci_add_resource(&resources, &hose->busnr_space);
>     bus = pci_scan_root_bus(dev, next_busno, pci_ops, sysdata, &resources);
>
> This is actually a bit redundant, since next_busno should be the
> same as hose->busnr_space->start.  So if we adopted this approach, we
> might want to eventually drop the next_busno argument.

Yeah that would be nice, the call would certainly make more sense that
way.

-- 
Jesse Barnes, Intel Open Source Technology Center



Re: warnings from drivers/tty/ehv_bytechan.c

2012-02-24 Thread Scott Wood
On 02/24/2012 04:15 PM, Timur Tabi wrote:
> gre...@linuxfoundation.org wrote:
> > > That's the simplest approach, for use.  The TTY portion of the driver can
> > > be used as a module.  Is there any real value in loading a TTY driver as a
> > > module?
> >
> > Depends on the hardware it supports :)
> >
> > > In this case, the console support for byte channels would not be
> > > available.
> >
> > Then it doesn't make sense, right?
>
> I guess that's my question.  Is there a real use case for having console
> output go to the serial port, and TTY go to a byte channel?

Sure -- you could be using the byte channel for inter-partition
communication, or just not have enough serial ports for all of this
partition's needs.

It looks like the usual pattern is to have a separate kconfig for the
console part, and have that be a bool that depends on the tristate tty
driver being y.

> Even if you
> wanted to do that, I suppose you don't need to load the byte channel
> driver as a module to get that behavior.

Right, though that could be said about all (most?) modules.

Probably not that important in this particular case, though.  I can see
people wanting to use byte channel but not caring about console, and I
can see people wanting to build a generic kernel that supports byte
channels, but I don't think there's much overlap between the two.

-Scott



Re: [PATCH 33/37] KVM: PPC: bookehv: remove unused code

2012-02-24 Thread Scott Wood
On 02/24/2012 08:26 AM, Alexander Graf wrote:
> There was some unused code in the exit code path that must have been
> a leftover from earlier iterations. While it did no hard, it's superfluous
> and thus should be removed.

s/hard/harm/ -- and that assumes exit timing wasn't enabled. :-)

 
> Signed-off-by: Alexander Graf <ag...@suse.de>
> ---
>  arch/powerpc/kvm/bookehv_interrupts.S |    3 ---
>  1 files changed, 0 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
> index 021d087..b1c099b 100644
> --- a/arch/powerpc/kvm/bookehv_interrupts.S
> +++ b/arch/powerpc/kvm/bookehv_interrupts.S
> @@ -112,9 +112,6 @@
>      * appropriate for the exception type).
>      */
>     cmpw    r6, r8
> -   .if     \flags & NEED_EMU
> -   lwz     r9, KVM_LPID(r9)
> -   .endif
>     beq     1f
>     mfmsr   r7
>     .if     \srr0 != SPRN_MCSRR0 && \srr0 != SPRN_CSRR0

Can also remove lwz r9, VCPU_KVM(r4).

-Scott



Re: [PATCH 35/37] KVM: PPC: booke: Support perfmon interrupts

2012-02-24 Thread Scott Wood
On 02/24/2012 08:26 AM, Alexander Graf wrote:
> When during guest context we get a performance monitor interrupt, we
> currently bail out and oops. Let's route it to its correct handler
> instead.
>
> Signed-off-by: Alexander Graf <ag...@suse.de>
> ---
>  arch/powerpc/kvm/booke.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 7adef28..423701b 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -677,6 +677,10 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>         r = RESUME_GUEST;
>         break;
>
> +   case BOOKE_INTERRUPT_PERFORMANCE_MONITOR:
> +       r = RESUME_GUEST;
> +       break;
> +
>     case BOOKE_INTERRUPT_HV_PRIV:
>         r = emulation_exit(run, vcpu);
>         break;

Why do we need to call timer_interrupt() explicitly, but can rely on
automatic retriggering for perfmon?

-Scott



Re: [PATCH 36/37] KVM: PPC: booke: expose guest registers on irq reinject

2012-02-24 Thread Scott Wood
On 02/24/2012 08:26 AM, Alexander Graf wrote:
> +static void kvmppc_fill_pt_regs(struct kvm_vcpu *vcpu, struct pt_regs *regs)
>  {
> -   int r = RESUME_HOST;
> +   int i;
>
> -   /* update before a new last_exit_type is rewritten */
> -   kvmppc_update_timing_stats(vcpu);
> +   for (i = 0; i < 32; i++)
> +       regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
> +   regs->nip = vcpu->arch.pc;
> +   regs->msr = vcpu->arch.shared->msr;
> +   regs->ctr = vcpu->arch.ctr;
> +   regs->link = vcpu->arch.lr;
> +   regs->xer = kvmppc_get_xer(vcpu);
> +   regs->ccr = kvmppc_get_cr(vcpu);
> +   regs->dar = get_guest_dear(vcpu);
> +   regs->dsisr = get_guest_esr(vcpu);
> +}

How much overhead does this add to every interrupt?  Can't we keep this
to the minimum that perf cares about?

> +
> +static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
> +                                     unsigned int exit_nr)
> +{
> +   struct pt_regs regs = *current->thread.regs;
>
> +   kvmppc_fill_pt_regs(vcpu, &regs);

Why are you copying out of current->thread.regs?  That's old junk data,
set by some previous exception and possibly overwritten since.

-Scott



[PATCH] powerpc/prom: bump up maximum size of properties

2012-02-24 Thread Nishanth Aravamudan
On a 16TB system (using AMS/CMO), I get:

WARNING: ignoring large property [/ibm,dynamic-reconfiguration-memory] ibm,dynamic-memory length 0x0017ffec

and significantly less memory is thus shown to the partition. As far as
I can tell, the constant used is arbitrary, but bump it up to 2MB, which
covers the above property (approximately 1.5MB).

With this patch, the kernel does see all of the system memory on the
16TB system.
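
[Editor's sketch, not part of the patch] The arithmetic behind the bump can
be checked in user space. The comparison below assumes a property is dropped
when its length exceeds the cap, which is how the warning above reads;
property_fits() is a hypothetical helper, not a function from prom_init.c.

```c
#include <assert.h>

/* The cap before and after the patch. */
#define OLD_MAX_PROPERTY_LENGTH	(1UL * 1024 * 1024)
#define NEW_MAX_PROPERTY_LENGTH	(2UL * 1024 * 1024)

/* Length reported in the warning for ibm,dynamic-memory. */
#define DYNAMIC_MEMORY_LENGTH	0x0017ffecUL

/* Returns 1 when a property of this length would be kept under the cap. */
static int property_fits(unsigned long length, unsigned long max_length)
{
	assert(max_length > 0);
	return length <= max_length;
}
```

0x0017ffec is 1572844 bytes, just under 1.5MB, so the property exceeds the
old 1MB cap and fits under the new 2MB one.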

Signed-off-by: Nishanth Aravamudan <n...@us.ibm.com>
Cc: Anton Blanchard <an...@au1.ibm.com>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Robert Jennings <r...@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/prom_init.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index eca626e..0bf0ccc 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -53,7 +53,7 @@
  * ensure that we don't lose things like the interrupt-map property
  * on a PCI-PCI bridge.
  */
-#define MAX_PROPERTY_LENGTH    (1UL * 1024 * 1024)
+#define MAX_PROPERTY_LENGTH    (2UL * 1024 * 1024)
 
 /*
  * Eventually bump that one up
-- 
1.7.5.4



Re: [PATCH 09/24] PCI, powerpc: Register busn_res for root buses

2012-02-24 Thread Yinghai Lu
On Fri, Feb 24, 2012 at 2:24 PM, Jesse Barnes <jbar...@virtuousgeek.org> wrote:
> On Thu, 23 Feb 2012 12:51:30 -0800
> Bjorn Helgaas <bhelg...@google.com> wrote:
> > 2) We already have a way to add resources to a root bus: the
> > pci_add_resource() used to add I/O port and MMIO apertures.  I think
> > it'd be a lot simpler to just use that same interface for the bus
> > number aperture, e.g.,
> >
> >     pci_add_resource(&resources, &hose->io_space);
> >     pci_add_resource(&resources, &hose->mem_space);
> >     pci_add_resource(&resources, &hose->busnr_space);
> >     bus = pci_scan_root_bus(dev, next_busno, pci_ops, sysdata, &resources);
> >
> > This is actually a bit redundant, since next_busno should be the
> > same as hose->busnr_space->start.  So if we adopted this approach, we
> > might want to eventually drop the next_busno argument.
>
> Yeah that would be nice, the call would certainly make more sense that
> way.

No, I don't think so.

Using pci_add_resource() would require creating a dummy resource for the
bus range.

There are lots of pci_scan_root_bus() callers, and those users do not know
the bus end before the scan, so we could just hide pci_insert_busn_res() in
pci_scan_root_bus() and update the busn_res end there.

Other arches like x86, ia64, powerpc and sparc will insert the exact bus
range between pci_create_root_bus() and pci_scan_child_bus(), and will not
need to update the busn_res end.

Please check v7 of this patchset:

git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
for-pci-busn-alloc

It should be clean and have minimum lines of change.

Thanks

  Yinghai