[PATCH powerpc/next RESEND] powerpc: spinlock: Fix spin_unlock_wait()
There is an ordering issue with spin_unlock_wait() on powerpc, because
the spin_lock primitive is an ACQUIRE, and an ACQUIRE only orders the
load part of the operation against the memory operations following it.
Therefore the following event sequence can happen:

	CPU 1			CPU 2			CPU 3
	==================	====================	==============
							spin_unlock(&lock);
				spin_lock(&lock):
				  r1 = *lock; // r1 == 0;
				o = object;
				o = READ_ONCE(object); // reordered here
	object = NULL;
	smp_mb();
	spin_unlock_wait(&lock);
				  *lock = 1;
	smp_mb();
	o->dead = true;
				< o = READ_ONCE(object); > // reordered upwards
				if (o) // true
					BUG_ON(o->dead); // true!!

To fix this, we add a "nop" ll/sc loop in arch_spin_unlock_wait() on ppc
(arch_spin_is_locked_sync()). The "nop" ll/sc loop reads the lock value
and writes it back atomically; in this way it synchronizes the view of
the lock on CPU1 with that on CPU2. Therefore in the scenario above,
either CPU2 will fail to get the lock at first or CPU1 will see the lock
acquired by CPU2; both cases eliminate this bug. This is a similar idea
to what Will Deacon did for ARM64 in:

  "arm64: spinlock: serialise spin_unlock_wait against concurrent lockers"

Furthermore, if arch_spin_is_locked_sync() finds that the lock is
locked, we don't need to do the "nop" ll/sc trick again; we can just do
a normal load+check loop waiting for the lock to be released. In that
case spin_unlock_wait() is called while someone is holding the lock, and
the store part of arch_spin_is_locked_sync() happens before the
unlocking of the current lock holder, which means
arch_spin_is_locked_sync() happens before the next lock acquisition.
With the smp_mb() preceding spin_unlock_wait(), the store of object is
guaranteed to be observed by the next lock holder.

Please note spin_unlock_wait() on powerpc is still not an ACQUIRE after
this fix; callers should add the necessary barriers if they want to
promote it, as all the current callers do.

This patch therefore fixes the issue and also cleans up
arch_spin_unlock_wait() a little bit by removing superfluous memory
barriers in loops and consolidating the implementations for PPC32 and
PPC64 into one.

Suggested-by: "Paul E. McKenney"
Signed-off-by: Boqun Feng
Reviewed-by: "Paul E. McKenney"
---
 arch/powerpc/include/asm/spinlock.h | 48 -
 arch/powerpc/lib/locks.c            | 16 -
 2 files changed, 42 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index 523673d7583c..0a517c1a751e 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -64,6 +64,25 @@ static inline int arch_spin_is_locked(arch_spinlock_t *lock)
 }
 
 /*
+ * Use a ll/sc loop to read the lock value, the STORE part of this operation is
+ * used for making later lock operation observe it.
+ */
+static inline bool arch_spin_is_locked_sync(arch_spinlock_t *lock)
+{
+	arch_spinlock_t tmp;
+
+	__asm__ __volatile__(
+"1:" PPC_LWARX(%0, 0, %2, 1) "\n"
+"	stwcx. %0, 0, %2\n"
+"	bne- 1b\n"
+	: "=&r" (tmp), "+m" (*lock)
+	: "r" (lock)
+	: "cr0", "xer");
+
+	return !arch_spin_value_unlocked(tmp);
+}
+
+/*
  * This returns the old value in the lock, so we succeeded
  * in getting the lock if the return value is 0.
  */
@@ -162,12 +181,29 @@ static inline void arch_spin_unlock(arch_spinlock_t *lock)
 	lock->slock = 0;
 }
 
-#ifdef CONFIG_PPC64
-extern void arch_spin_unlock_wait(arch_spinlock_t *lock);
-#else
-#define arch_spin_unlock_wait(lock) \
-	do { while (arch_spin_is_locked(lock)) cpu_relax(); } while (0)
-#endif
+static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
+{
+	/*
+	 * Make sure previous loads and stores are observed by other cpu, this
+	 * pairs with the ACQUIRE barrier in lock.
+	 */
+	smp_mb();
+
+	if (!arch_spin_is_locked_sync(lock))
+		return;
+
+	while (!arch_spin_value_unlocked(*lock)) {
+		HMT_low();
+		if (SHARED_PROCESSOR)
+			__spin_yield(lock);
+	}
+	HMT_medium();
+
+	/*
+	 * No barrier here, caller either relies on the control dependency or
+	 * should add a necessary barrier afterwards.
+	 */
+}
 
 /*
  * Read-write spinlocks, allowing multiple readers
diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
index f7deebdf3365..b7b1237d4aa6 100644
--- a/arch/powerpc/lib/locks.c
+++ b/arch/powerpc/lib/locks.c
@@ -68,19 +68,3 @@ void __rw_yield(arch_rwlock_t *rw)
 		get_hard_smp_processor_id(holder_cpu),
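
The shape of the new arch_spin_unlock_wait() can be sketched in portable C11, purely as an illustration. Everything named sketch_* is hypothetical, and atomic_fetch_add() of zero merely stands in for the lwarx/stwcx. "nop" loop; this user-space version is not claimed to give the same ordering guarantees as the powerpc assembly, and a compiler may legally lower the no-op read-modify-write differently.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space lock word: 0 means unlocked. */
typedef struct {
	atomic_uint slock;
} sketch_spinlock_t;

/*
 * Read the lock with a read-modify-write that stores back the same value.
 * Adding 0 stands in for the "nop" ll/sc loop (assumption of this sketch).
 */
static bool sketch_spin_is_locked_sync(sketch_spinlock_t *lock)
{
	unsigned int val = atomic_fetch_add_explicit(&lock->slock, 0,
						     memory_order_relaxed);

	return val != 0;
}

static void sketch_spin_unlock_wait(sketch_spinlock_t *lock)
{
	/* Full fence, playing the role of smp_mb() in the patch. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Lock observed free through the RMW: nothing to wait for. */
	if (!sketch_spin_is_locked_sync(lock))
		return;

	/* Lock was held: a plain load+check loop is enough from here on. */
	while (atomic_load_explicit(&lock->slock, memory_order_relaxed) != 0)
		;

	/* No trailing barrier, as in the patch; callers add one if needed. */
}
```

As in the patch, the read-modify-write is done only once; the wait loop itself uses plain loads.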
Re: [PATCH v8 17/45] powerpc/powernv/ioda1: Improve DMA32 segment track
On 04/20/2016 10:49 AM, Gavin Shan wrote:
> On Tue, Apr 19, 2016 at 11:50:10AM +1000, Alexey Kardashevskiy wrote:
>> On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>> In the current implementation, the DMA32 segments required by one
>>> specific PE aren't calculated from the information held in the PE
>>> independently. This conflicts with the PCI hotplug design, which is
>>> PE centralized, meaning the PE's DMA32 segments should be calculated
>>> from the information held in the PE independently.
>>>
>>> This introduces an array (@dma32_segmap) for every PHB to track the
>>> DMA32 segment usage. Besides, this moves the logic calculating a PE's
>>> consumed DMA32 segments to pnv_pci_ioda1_setup_dma_pe() so that the
>>> PE's DMA32 segments are calculated/allocated from the information
>>> held in the PE (DMA32 weight). Also the logic is improved: we try to
>>> allocate as many DMA32 segments as we can. It's acceptable that fewer
>>> DMA32 segments than the expected number are allocated.
>>>
>>> Signed-off-by: Gavin Shan
>>
>> This DMA segments business was the reason why I have not even tried
>> implementing DDW for POWER7 - it is way too different from POWER8 and
>> there is no chance that anyone outside Ozlabs will ever try using this
>> in practice; the same applies to PCI hotplug on POWER7.
>>
>> I am suggesting to ditch all IODA1 changes from this patchset as this
>> code will hang around (unused) for maybe a year or so and then will be
>> gone as p5ioc2.
>
> As far as I know, some P7 boxes outside Ozlabs have the software stack.
> At least, I was relying heavily on P7 box + PowerNV based Linux until
> September of last year.

And yet you have not replaced a single physical device on any of our
power7 boxes ;)

> My original thoughts are as below. If they're convincing, I can drop
> some of the IODA1 changes, but not all of them obviously:
>
> - In case a customer wants to use this combo (P7 box + PowerNV) for any
>   reason.

I have serious doubts we have any customer like this. Or a developer who
would want this. And OPAL on P7 does not support this either.

> - In case developers want to use this combo (P7 box + PowerNV) for any
>   reason. For example, no P8 boxes can be found for one particular
>   project, but an available P7 box is still ok for that.

Testing POWER8 PCI hotplug on POWER7 machine is kind of pointless anyway.

> - EEH supported on P7/P8 needs hotplug in some cases: when hitting
>   excessive failures, PCI devices and their platform resources (PE,
>   DMA, M32/M64 mapping etc) should be purged.

EEH recovery should not require resource reallocation, no?

> - Current implementation has P7/P8 mixed up to some extent, which isn't
>   so good, as Ben pointed out a long time ago. It's impossible not to
>   affect the P7IOC piece if the P8 piece is changed in order to support
>   hotplug.

This is understandable. I'll leave it to Ben.

>>> ---
>>>  arch/powerpc/platforms/powernv/pci-ioda.c | 111 +-
>>>  arch/powerpc/platforms/powernv/pci.h      |   7 +-
>>>  2 files changed, 66 insertions(+), 52 deletions(-)
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> index 0fc2309..59782fba 100644
>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> @@ -2007,20 +2007,54 @@ static unsigned int pnv_pci_ioda_total_dma_weight(struct pnv_phb *phb)
>>>  }
>>>
>>>  static void pnv_pci_ioda1_setup_dma_pe(struct pnv_phb *phb,
>>> -				       struct pnv_ioda_pe *pe,
>>> -				       unsigned int base,
>>> -				       unsigned int segs)
>>> +				       struct pnv_ioda_pe *pe)
>>>  {
>>>  	struct page *tce_mem = NULL;
>>>  	struct iommu_table *tbl;
>>> -	unsigned int tce32_segsz, i;
>>> +	unsigned int weight, total_weight;
>>> +	unsigned int tce32_segsz, base, segs, i;
>>>  	int64_t rc;
>>>  	void *addr;
>>>
>>>  	/* XXX FIXME: Handle 64-bit only DMA devices */
>>>  	/* XXX FIXME: Provide 64-bit DMA facilities & non-4K TCE tables etc.. */
>>>  	/* XXX FIXME: Allocate multi-level tables on PHB3 */
>>> +	total_weight = pnv_pci_ioda_total_dma_weight(phb);
>>> +	weight = pnv_pci_ioda_pe_dma_weight(pe);
>>> +
>>> +	segs = (weight * phb->ioda.dma32_count) / total_weight;
>>> +	if (!segs)
>>> +		segs = 1;
>>> +
>>> +	/*
>>> +	 * Allocate contiguous DMA32 segments. We begin with the expected
>>> +	 * number of segments. With one more attempt, the number of DMA32
>>> +	 * segments to be allocated is decreased by one until one segment
>>> +	 * is allocated successfully.
>>> +	 */
>>> +	while (segs) {
>>> +		for (base = 0; base <= phb->ioda.dma32_count - segs; base++) {
>>> +			for (i = base; i < base + segs; i++) {
>>> +				if (phb->ioda.dma32_segmap[i] !=
>>> +				    IODA_INVALID_PE)
>>> +					break;
>>> +			}
>>> +
>>> +			if (i >= base + segs)
>>> +
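
The quoted hunk is cut off before the loop body finishes, so the following is only a sketch of the "try the expected count, then shrink by one" search described in the comment. All sketch_* names are hypothetical, the segment map is a plain int array, and what happens after a free window is found (here: mark the window with the PE number and return) is an assumption, not the patch's code.

```c
#define SKETCH_INVALID_PE	(-1)

/*
 * Expected segment count from the PE's share of the total DMA weight,
 * never less than one. total_weight must be non-zero.
 */
static unsigned int sketch_expected_segs(unsigned int weight,
					 unsigned int total_weight,
					 unsigned int count)
{
	unsigned int segs = (weight * count) / total_weight;

	return segs ? segs : 1;
}

/*
 * Claim up to `want` contiguous free entries of segmap[0..count) for `pe`,
 * retrying with one segment fewer after each failed scan. Returns the
 * number of segments claimed (0 if none); *base_out gets the first index.
 */
static unsigned int sketch_alloc_dma32_segs(int *segmap, unsigned int count,
					    unsigned int want, int pe,
					    unsigned int *base_out)
{
	unsigned int segs, base, i;

	if (want > count)
		want = count;

	for (segs = want; segs; segs--) {
		for (base = 0; base + segs <= count; base++) {
			for (i = base; i < base + segs; i++)
				if (segmap[i] != SKETCH_INVALID_PE)
					break;

			if (i < base + segs)
				continue;	/* window contains a used segment */

			for (i = base; i < base + segs; i++)
				segmap[i] = pe;
			*base_out = base;
			return segs;
		}
	}

	return 0;
}
```

With a map of { free, used, free, free, used, free } and three segments requested, the first pass finds no window of three, so the search falls back to two and claims indices 2 and 3.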
Re: [PATCH V11 0/4]perf/powerpc: Add ability to sample intr machine state in powerpc
On Wed, 2016-04-20 at 00:57 -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Apr 18, 2016 at 03:17:11PM +0530, Anju T wrote:
> > On Saturday 20 February 2016 10:32 AM, Anju T wrote:
> > >
> > >  arch/powerpc/Kconfig                        |  1 +
> > >  arch/powerpc/include/uapi/asm/perf_regs.h   | 50
> > >  arch/powerpc/perf/Makefile                  |  1 +
> > >  arch/powerpc/perf/perf_regs.c               | 91 +
> > >  tools/perf/arch/powerpc/include/perf_regs.h | 69 ++
> > >  tools/perf/arch/powerpc/util/Build          |  1 +
> > >  tools/perf/arch/powerpc/util/perf_regs.c    | 49
> > >  tools/perf/config/Makefile                  |  5 ++
> > >  8 files changed, 267 insertions(+)
> > >  create mode 100644 arch/powerpc/include/uapi/asm/perf_regs.h
> > >  create mode 100644 arch/powerpc/perf/perf_regs.c
> > >  create mode 100644 tools/perf/arch/powerpc/include/perf_regs.h
> > >  create mode 100644 tools/perf/arch/powerpc/util/perf_regs.c
> > >
> >
> > Hi,
> >
> > Can this be taken into the next tree?
>
> Even the bits in tools/perf/ are arch specific, so I guess this goes via
> the powerpc tree? Michael?

Yeah if that's OK with you. It doesn't look like it will generate much in
the way of merge conflicts.

Do you want to send an ack?

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v8 38/45] powerpc/powernv: Functions to get/set PCI slot status
On 04/20/2016 12:36 PM, Gavin Shan wrote: On Tue, Apr 19, 2016 at 07:39:34PM +1000, Alexey Kardashevskiy wrote: On 02/17/2016 02:44 PM, Gavin Shan wrote: This exports 4 functins, which base on the corresponding OPAL s/functins/functions/ Thanks. APIs to get/set PCI slot status. Those functions are going to be used by PowerNV PCI hotplug driver: pnv_pci_get_device_tree()opal_get_device_tree() pnv_pci_get_presence_state() opal_pci_get_presence_state() pnv_pci_get_power_state()opal_pci_get_power_state() pnv_pci_set_power_state()opal_pci_set_power_state() Besides, the patch also exports pnv_pci_hotplug_notifier_{register, unregister}() to allow registration and unregistration of PCI hotplug notifier, which will be used to receive PCI hotplug message from skiboot firmware in PowerNV PCI hotplug driver. Signed-off-by: Gavin Shan--- arch/powerpc/include/asm/opal-api.h| 17 ++- arch/powerpc/include/asm/opal.h| 4 ++ arch/powerpc/include/asm/pnv-pci.h | 7 +++ arch/powerpc/platforms/powernv/opal-wrappers.S | 4 ++ arch/powerpc/platforms/powernv/pci.c | 66 ++ 5 files changed, 97 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h index f8faaae..a6af338 100644 --- a/arch/powerpc/include/asm/opal-api.h +++ b/arch/powerpc/include/asm/opal-api.h @@ -158,7 +158,11 @@ #define OPAL_LEDS_SET_INDICATOR 115 #define OPAL_CEC_REBOOT2 116 #define OPAL_CONSOLE_FLUSH117 -#define OPAL_LAST 117 +#define OPAL_GET_DEVICE_TREE 118 +#define OPAL_PCI_GET_PRESENCE_STATE119 +#define OPAL_PCI_GET_POWER_STATE 120 +#define OPAL_PCI_SET_POWER_STATE 121 +#define OPAL_LAST 121 /* Device tree flags */ @@ -344,6 +348,16 @@ enum OpalPciResetState { OPAL_ASSERT_RESET = 1 }; +enum OpalPciSlotPresentenceState { + OPAL_PCI_SLOT_EMPTY = 0, + OPAL_PCI_SLOT_PRESENT = 1 +}; + +enum OpalPciSlotPowerState { + OPAL_PCI_SLOT_POWER_OFF = 0, + OPAL_PCI_SLOT_POWER_ON = 1 +}; + enum OpalSlotLedType { OPAL_SLOT_LED_TYPE_ID = 0, /* IDENTIFY LED */ 
OPAL_SLOT_LED_TYPE_FAULT = 1, /* FAULT LED */ @@ -378,6 +392,7 @@ enum opal_msg_type { OPAL_MSG_DPO, OPAL_MSG_PRD, OPAL_MSG_OCC, + OPAL_MSG_PCI_HOTPLUG, OPAL_MSG_TYPE_MAX, }; diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h index 9e0039f..899bcb941 100644 --- a/arch/powerpc/include/asm/opal.h +++ b/arch/powerpc/include/asm/opal.h @@ -209,6 +209,10 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, uint64_t buf, uint64_t size, uint64_t token); int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size, uint64_t token); +int64_t opal_get_device_tree(uint32_t phandle, uint64_t buf, uint64_t len); +int64_t opal_pci_get_presence_state(uint64_t id, uint8_t *state); +int64_t opal_pci_get_power_state(uint64_t id, uint8_t *state); +int64_t opal_pci_set_power_state(uint64_t id, uint8_t state); /* Internal functions */ extern int early_init_dt_scan_opal(unsigned long node, const char *uname, diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h index 6f77f71..d9d095b 100644 --- a/arch/powerpc/include/asm/pnv-pci.h +++ b/arch/powerpc/include/asm/pnv-pci.h @@ -13,6 +13,13 @@ #include #include +extern int pnv_pci_get_device_tree(uint32_t phandle, void *buf, uint64_t len); +extern int pnv_pci_get_presence_state(uint64_t id, uint8_t *state); +extern int pnv_pci_get_power_state(uint64_t id, uint8_t *state); +extern int pnv_pci_set_power_state(uint64_t id, uint8_t state); +extern int pnv_pci_hotplug_notifier_register(struct notifier_block *nb); +extern int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb); + int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode); int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq, unsigned int virq); diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S index e45b88a..3ea1a855 100644 --- a/arch/powerpc/platforms/powernv/opal-wrappers.S +++ 
b/arch/powerpc/platforms/powernv/opal-wrappers.S @@ -302,3 +302,7 @@ OPAL_CALL(opal_prd_msg, OPAL_PRD_MSG); OPAL_CALL(opal_leds_get_ind, OPAL_LEDS_GET_INDICATOR); OPAL_CALL(opal_leds_set_ind, OPAL_LEDS_SET_INDICATOR); OPAL_CALL(opal_console_flush, OPAL_CONSOLE_FLUSH); +OPAL_CALL(opal_get_device_tree,OPAL_GET_DEVICE_TREE); +OPAL_CALL(opal_pci_get_presence_state, OPAL_PCI_GET_PRESENCE_STATE);
Re: [PATCH v8 37/45] powerpc/powernv: Use firmware PCI slot reset infrastructure
On 04/20/2016 12:33 PM, Gavin Shan wrote:
> On Tue, Apr 19, 2016 at 07:34:55PM +1000, Alexey Kardashevskiy wrote:
>> On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>> The skiboot firmware might provide the PCI slot reset capability
>>> which is identified by property "ibm,reset-by-firmware" on the
>>> PCI slot associated device node. This checks the property. If it
>>> exists, the reset request is routed to firmware. Otherwise, the
>>> reset is done by kernel as before.
>>>
>>> Signed-off-by: Gavin Shan
>>> ---
>>>  arch/powerpc/platforms/powernv/eeh-powernv.c | 41 +++-
>>>  1 file changed, 40 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>> index e23b063..c8a5217 100644
>>> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>> @@ -789,7 +789,7 @@ static int pnv_eeh_root_reset(struct pci_controller *hose, int option)
>>>  	return ret;
>>>  }
>>>
>>> -static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
>>> +static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
>>>  {
>>>  	struct pci_dn *pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
>>>  	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>>> @@ -840,6 +840,45 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
>>>  	return 0;
>>>  }
>>>
>>> +static int pnv_eeh_bridge_reset(struct pci_dev *pdev, int option)
>>> +{
>>> +	struct pci_controller *hose;
>>> +	struct pnv_phb *phb;
>>> +	struct device_node *dn = pdev ? pci_device_to_OF_node(pdev) : NULL;
>>> +	uint64_t id = (0x1ul << 60);
>>
>> What is this 1<<60 for?
>
> As you replied in other threads, it's worthy to have some macros for this
> piece of business. This bit indicates the ID of the slot behind a switch
> port. If this bit is cleared, the ID represents a PHB slot.
>
>>> +	uint8_t scope;
>>> +	int64_t rc;
>>> +
>>> +	/*
>>> +	 * If the firmware can't handle it, we will issue hot reset
>>> +	 * on the secondary bus despite the requested reset type.
>>> +	 */
>>> +	if (!dn || !of_get_property(dn, "ibm,reset-by-firmware", NULL))
>>> +		return __pnv_eeh_bridge_reset(pdev, option);
>>> +
>>> +	/* The firmware can handle the request */
>>> +	switch (option) {
>>> +	case EEH_RESET_HOT:
>>> +		scope = OPAL_RESET_PCI_HOT;
>>> +		break;
>>> +	case EEH_RESET_FUNDAMENTAL:
>>> +		scope = OPAL_RESET_PCI_FUNDAMENTAL;
>>> +		break;
>>> +	case EEH_RESET_DEACTIVATE:
>>> +		return 0;
>>> +	default:
>>> +		dev_warn(&pdev->dev, "%s: Unsupported reset %d\n",
>>> +			 __func__, option);
>>
>> Can the userspace trigger this case (via VFIO-EEH) and flood dmesg?
>
> It depends on how you defined message flooding actually. It's abnormal
> path caused by program internal error, not external users.

Can QEMU be changed to do something special (cause reset with a wrong
option) via VFIO/EEH interface in a loop to make this message appear? Or
the call with a wrong option will never reach this point?

>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	hose = pci_bus_to_host(pdev->bus);
>>> +	phb = hose->private_data;
>>> +	id |= (pdev->bus->number << 24) | (pdev->devfn << 16) | phb->opal_id;
>>> +	rc = opal_pci_reset(id, scope, OPAL_ASSERT_RESET);
>>> +	return pnv_pci_poll(id, rc, NULL);
>>> +}
>>> +
>>>  static int pnv_pci_dev_reset_type(struct pci_dev *pdev, void *data)
>>>  {
>>>  	int *freset = data;
>>
>> --
>> Alexey

--
Alexey
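
The macros suggested in the review could look roughly like the sketch below. The names are hypothetical; only the layout is taken from the quoted code and reply: bit 60 marks a slot behind a switch port, the bus number sits at bit 24, devfn at bit 16, and the PHB's OPAL ID in the low bits. That the OPAL ID fits below bit 16 is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical names; the bit positions follow the expression in the patch. */
#define SKETCH_SLOT_ID_SWITCH_PORT	(UINT64_C(1) << 60)

/* ID of a slot behind a switch port. */
#define SKETCH_SLOT_ID(phb_id, bus, devfn)				\
	(SKETCH_SLOT_ID_SWITCH_PORT |					\
	 ((uint64_t)(bus) << 24) |					\
	 ((uint64_t)(devfn) << 16) |					\
	 (uint64_t)(phb_id))

/* ID of a PHB slot: bit 60 stays clear. */
#define SKETCH_PHB_SLOT_ID(phb_id)	((uint64_t)(phb_id))

static inline int sketch_slot_id_is_switch_port(uint64_t id)
{
	return (id & SKETCH_SLOT_ID_SWITCH_PORT) != 0;
}
```

Casting each field to uint64_t before shifting keeps the arithmetic 64 bits wide regardless of the argument types.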
Re: [PATCH v8 36/45] powerpc/powernv: Support PCI slot ID
On 04/20/2016 12:28 PM, Gavin Shan wrote: On Tue, Apr 19, 2016 at 07:28:20PM +1000, Alexey Kardashevskiy wrote: On 02/17/2016 02:44 PM, Gavin Shan wrote: PowerNV platforms runs on top of skiboot firmware that includes changes to support PCI slots. PCI slots are identified by PHB's ID or the combo of that and PCI slot ID. This changes the EEH PowerNV backend to support PCI slots: * Rename arguments of opal_pci_reset() and opal_pci_poll(). * One more argument (PCI slot's state) added to opal_pci_poll(). * Drop pnv_eeh_phb_poll() and introduce a enhanced similar function pnv_pci_poll() that will be used by PowerNV hotplug backends. Signed-off-by: Gavin Shan--- arch/powerpc/include/asm/opal.h | 4 +-- arch/powerpc/platforms/powernv/eeh-powernv.c | 42 ++-- arch/powerpc/platforms/powernv/pci.c | 21 ++ arch/powerpc/platforms/powernv/pci.h | 1 + 4 files changed, 32 insertions(+), 36 deletions(-) diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h index 07a99e6..9e0039f 100644 --- a/arch/powerpc/include/asm/opal.h +++ b/arch/powerpc/include/asm/opal.h @@ -131,7 +131,7 @@ int64_t opal_pci_map_pe_dma_window(uint64_t phb_id, uint16_t pe_number, uint16_t int64_t opal_pci_map_pe_dma_window_real(uint64_t phb_id, uint16_t pe_number, uint16_t dma_window_number, uint64_t pci_start_addr, uint64_t pci_mem_size); -int64_t opal_pci_reset(uint64_t phb_id, uint8_t reset_scope, uint8_t assert_state); +int64_t opal_pci_reset(uint64_t id, uint8_t reset_scope, uint8_t assert_state); int64_t opal_pci_get_hub_diag_data(uint64_t hub_id, void *diag_buffer, uint64_t diag_buffer_len); @@ -148,7 +148,7 @@ int64_t opal_get_dpo_status(__be64 *dpo_timeout); int64_t opal_set_system_attention_led(uint8_t led_action); int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe, __be16 *pci_error_type, __be16 *severity); -int64_t opal_pci_poll(uint64_t phb_id); +int64_t opal_pci_poll(uint64_t id, uint8_t *state); int64_t opal_return_cpu(void); int64_t 
opal_check_token(uint64_t token); int64_t opal_reinit_cpus(uint64_t flags); diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c index c7454ba..e23b063 100644 --- a/arch/powerpc/platforms/powernv/eeh-powernv.c +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c @@ -717,28 +717,11 @@ static int pnv_eeh_get_state(struct eeh_pe *pe, int *delay) return ret; } -static s64 pnv_eeh_phb_poll(struct pnv_phb *phb) -{ - s64 rc = OPAL_HARDWARE; - - while (1) { - rc = opal_pci_poll(phb->opal_id); - if (rc <= 0) - break; - - if (system_state < SYSTEM_RUNNING) - udelay(1000 * rc); - else - msleep(rc); - } - - return rc; -} - int pnv_eeh_phb_reset(struct pci_controller *hose, int option) { struct pnv_phb *phb = hose->private_data; s64 rc = OPAL_HARDWARE; + int ret; pr_debug("%s: Reset PHB#%x, option=%d\n", __func__, hose->global_number, option); @@ -753,8 +736,6 @@ int pnv_eeh_phb_reset(struct pci_controller *hose, int option) rc = opal_pci_reset(phb->opal_id, OPAL_RESET_PHB_COMPLETE, OPAL_DEASSERT_RESET); - if (rc < 0) - goto out; /* * Poll state of the PHB until the request is done @@ -762,24 +743,22 @@ int pnv_eeh_phb_reset(struct pci_controller *hose, int option) * reset followed by hot reset on root bus. So we also * need the PCI bus settlement delay. 
*/ - rc = pnv_eeh_phb_poll(phb); - if (option == EEH_RESET_DEACTIVATE) { + ret = pnv_pci_poll(phb->opal_id, rc, NULL); + if (option == EEH_RESET_DEACTIVATE && !ret) { if (system_state < SYSTEM_RUNNING) udelay(1000 * EEH_PE_RST_SETTLE_TIME); else msleep(EEH_PE_RST_SETTLE_TIME); } -out: - if (rc != OPAL_SUCCESS) - return -EIO; - return 0; + return ret; } static int pnv_eeh_root_reset(struct pci_controller *hose, int option) { struct pnv_phb *phb = hose->private_data; s64 rc = OPAL_HARDWARE; + int ret; pr_debug("%s: Reset PHB#%x, option=%d\n", __func__, hose->global_number, option); @@ -801,18 +780,13 @@ static int pnv_eeh_root_reset(struct pci_controller *hose, int option) rc = opal_pci_reset(phb->opal_id, OPAL_RESET_PCI_HOT,
Re: [V2, 02/68] powerpc/mm/nohash: Return correctly from flush_tlb_page
On Sat, 2016-09-04 at 06:12:58 UTC, "Aneesh Kumar K.V" wrote:
> if it is a hugetlb address return without calling __flush_tlb_page.

Why?

cheers
Re: [PATCH V11 0/4]perf/powerpc: Add ability to sample intr machine state in powerpc
Em Mon, Apr 18, 2016 at 03:17:11PM +0530, Anju T escreveu: > On Saturday 20 February 2016 10:32 AM, Anju T wrote: > >This short patch series adds the ability to sample the interrupted > >machine state for each hardware sample. > > > >To test this patchset, > >Eg: > > > >$ perf record -I? # list supported registers > > > >output: > >available registers: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 > >r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26 r27 r28 r29 r30 r31 nip msr > >orig_r3 ctr link xer ccr softe trap dar dsisr > > > > usage: perf record [] [] > > or: perf record [] -- [] > > > > -I, --intr-regs[=] > > sample selected machine registers on interrupt, > > use -I ? to list register names > > > > > >$ perf record -I ls # record machine state at interrupt > >$ perf script -D # read the perf.data file > > > >Sample output obtained for this patchset/ output looks like as follows: > > > >496768515470 0x1988 [0x188]: PERF_RECORD_SAMPLE(IP, 0x1): 4522/4522: > >0xc01e538c period: 1 addr: 0 > >... intr regs: mask 0x7ff ABI 64-bit > > r00xc01e5e34 > > r10xc00fe733f9a0 > > r20xc1523100 > > r30xc00ffaadeb60 > > r40xc3456800 > > r50x73a9b5e000 > > r60x1e00 > > r70x0 > > r80x0 > > r90x0 > > r10 0x1 > > r11 0x0 > > r12 0x24022822 > > r13 0xcfeec180 > > r14 0x0 > > r15 0xc01e4be18800 > > r16 0x0 > > r17 0xc00ffaac5000 > > r18 0xc00fe733f8a0 > > r19 0xc1523100 > > r20 0xc009fd1c > > r21 0xc00fcaa69000 > > r22 0xc01e4968 > > r23 0xc1523100 > > r24 0xc00fe733f850 > > r25 0xc00fcaa69000 > > r26 0xc3b8fcf0 > > r27 0xfead > > r28 0x0 > > r29 0xc00fcaa69000 > > r30 0x1 > > r31 0x0 > > nip 0xc01dd320 > > msr 0x90009032 > > orig_r3 0xc01e538c > > ctr 0xc009d550 > > link 0xc01e5e34 > > xer 0x0 > > ccr 0x84022882 > > softe 0x0 > > trap 0xf01 > > dar 0x0 > > dsisr 0xf0004006004 > > ... thread: :4522:4522 > > .. 
dso: > > /root/.debug/.build-id/b0/ef11b1a1629e62ac9de75199117ee5ef9469e9 > >:4522 4522 496.768515: 1 cycles: c01e538c > > .perf_event_context_sched_in (/boot/vmlinux) > > > > > > > >Changes from v10: > > > >- Included SOFTE as suggested by mpe > >- The name of registers displayed is changed from > > gpr* to r* also the macro names changed from > > PERF_REG_POWERPC_GPR* to PERF_REG_POWERPC_R*. > >- The conflict in returning the ABI is resolved. > >- #define PERF_REG_SP is again changed to PERF_REG_POWERPC_R1 > >- Comment in tools/perf/config/Makefile is updated. > >- removed the "Reviewed-By" tag as the patch has logic changes. > > > > > >Changes from V9: > > > >- Changed the name displayed for link register from "lnk" to "link" in > > tools/perf/arch/powerpc/include/perf_regs.h > > > >changes from V8: > > > >- Corrected the indentation issue in the Makefile mentioned in 3rd patch > > > >Changes from V7: > > > >- Addressed the new line issue in 3rd patch. > > > >Changes from V6: > > > >- Corrected the typo in patch tools/perf: Map the ID values with register > >names. > > ie #define PERF_REG_SP PERF_REG_POWERPC_R1 should be #define PERF_REG_SP > > PERF_REG_POWERPC_GPR1 > > > > > >Changes from V5: > > > >- Enabled perf_sample_regs_user also in this patch set.Functions added in > >arch/powerpc/perf/perf_regs.c > >- Added Maddy's patch to this patchset for enabling -I? option which will > > list the supported register names. > > > > > >Changes from V4: > > > >- Removed the softe and MQ from all patches > >- Switch case is replaced with an array in the 3rd patch > > > >Changes from V3: > > > >- Addressed the comments by Sukadev regarding the nits in the descriptions. > >- Modified the subject of first patch. > >- Included the sample output in the 3rd patch also. > > > >Changes from V2: > > > >- tools/perf/config/Makefile is moved to the patch tools/perf. > >- The patchset is reordered. 
> >- perf_regs_load() function is used for the dwarf unwind test.Since it is > >not required here, > > it is removed from tools/perf/arch/powerpc/include/perf_regs.h > >- PERF_REGS_POWERPC_RESULT is removed. > > > >Changes from V1: > > > >- Solved the name missmatch issue in the from and signed-off field of the > >patch series. > >- Added necessary comments in the 3rd patch ie perf/powerpc ,as suggested by > >Maddy. > > > > > > > >Anju T (3): > > perf/powerpc: assign an id to each powerpc register > > perf/powerpc: add support for sampling intr machine state > > tools/perf: Map the ID values with register names > > > >Madhavan Srinivasan (1): > > tool/perf: Add
Re: [PATCH v8 30/45] powerpc/pci: Delay populating pdn
On 04/20/2016 12:13 PM, Gavin Shan wrote:
> On Tue, Apr 19, 2016 at 06:19:20PM +1000, Alexey Kardashevskiy wrote:
>> On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>> The pdn (struct pci_dn) instances are allocated from memblock or
>>> bootmem when creating PCI controllers (hoses) in setup_arch(). PCI
>>> hotplug, which will be supported by subsequent patches, releases PCI
>>> device nodes and their corresponding pdn on an unplugging event. The
>>> memory chunks for pdn instances allocated from memblock or bootmem
>>> are hard to reuse after being released.
>>>
>>> This delays creating pdn by pci_devs_phb_init() from setup_arch() to
>>> core_initcall() so that they are allocated from slab. The memory
>>> consumed by pdn can be released to the system without problem during
>>> PCI unplugging time. It indicates that pci_dn is unavailable in
>>> setup_arch() and the fixup on pdn (like AGP's) can't be carried out
>>> at that time. We have to do that in
>>> ppc_md.pcibios_root_bridge_prepare() on maple/pasemi/powermac
>>> platforms where/when the pdn is available.
>>>
>>> Meanwhile, the EEH device is created when pdn is populated, meaning
>>> pdn and EEH device have the same life cycle. In turn, we needn't call
>>> eeh_dev_init() to create the EEH device explicitly.
>>>
>>> Signed-off-by: Gavin Shan
>>
>> Uff. It would not hurt to mention that pcibios_root_bridge_prepare is
>> called from subsys_initcall() which is executed after core_initcall()
>> so the code flow does not change.
>
> Yes, will do in next revision.
>
>> Have you checked if there is anything in between
>> core_initcall(pci_devs_phb_init) and subsys_initcall(pcibios_init)
>> which might need device tree nodes?
>>
>> For example, subsys_initcall(pcibios_init) calls (eventually)
>> pnv_pci_ioda_fixup(), if we are unlucky and pcibios_init() (and
>> therefore pnv_pci_ioda_fixup() or what pseries/others do) is called
>> before pcibios_init() - won't we crash or something?
>
> I don't catch what you were asking. device-tree nodes (struct
> device_node) are always there. This patch doesn't affect them. Perhaps
> you were talking about pdn (PCI_DN). If it's the case, this patch delays
> creating pdn from setup_arch() to core_initcall(pci_devs_phb_init).

While thinking of explaining what I wanted to ask, I found my answer :)

pcibios_init() calls ppc_md.pcibios_root_bridge_prepare() first, then
ppc_md.pcibios_fixup() so we are fine here with ordering.

> I don't see anything that needs pdn between setup_arch() and
> core_initcall(). The changes introduced to powermac/pasemi platforms
> are: move fixing the child pdns of the specific PHB's pdn from
> setup_arch() to subsys_initcall(pcibios_init). I don't see anything
> between them that needs the fixed pdns. I don't understand how
> pcibios_init() is called before pcibios_init() in your

pcibios_init() is used twice in the sentence above :)

Anyway,

Reviewed-by: Alexey Kardashevskiy

> context. Sorry for my bad English. Perhaps you're asking about the
> calling sequence of core_initcall() and subsys_initcall()? If so,
> they're defined like below:
>
> #define core_initcall(fn)	__define_initcall(fn, 1)
> #define subsys_initcall(fn)	__define_initcall(fn, 4)
>
>> --
>> Alexey
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Alexey
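
The ordering argument in this thread rests on initcall levels being run in ascending order, so a level 1 (core) call always finishes before a level 4 (subsys) call regardless of registration order. A toy user-space model, with hypothetical sketch_* names and stub functions standing in for pci_devs_phb_init() and pcibios_init(), might look like this:

```c
#include <stddef.h>

typedef int (*sketch_initcall_t)(void);

struct sketch_initcall {
	int level;		/* 1 = core, 4 = subsys, as in the defines above */
	sketch_initcall_t fn;
};

/* Records the level of each call in the order it actually ran. */
static int sketch_trace[8];
static int sketch_trace_len;

/* Stub standing in for pci_devs_phb_init(): would create the pdn. */
static int sketch_pci_devs_phb_init(void)
{
	sketch_trace[sketch_trace_len++] = 1;
	return 0;
}

/* Stub standing in for pcibios_init(): would consume the pdn. */
static int sketch_pcibios_init(void)
{
	sketch_trace[sketch_trace_len++] = 4;
	return 0;
}

/* Run the registered calls level by level, lowest level first. */
static void sketch_do_initcalls(const struct sketch_initcall *calls, size_t n)
{
	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == level)
				calls[i].fn();
}
```

Even if the subsys call is listed first in the table, the core call runs first, which is why the pdn exists by the time pcibios_init() needs it.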
Re: [PATCH V2] powerpc: Implement {cmp}xchg for u8 and u16
Hello, Boqun

On 2016-04-19 17:18, Boqun Feng wrote:
> Hi Xinhui,
>
> On Tue, Apr 19, 2016 at 02:29:34PM +0800, Pan Xinhui wrote:
>> From: Pan Xinhui
>>
>> Implement xchg{u8,u16}{local,relaxed}, and
>> cmpxchg{u8,u16}{,local,acquire,relaxed}.
>>
>> It works on all ppc.
>>
>
> Nice work!
>

Thank you.

> AFAICT, your work doesn't depend on anything that is ppc-specific, right?
> So maybe we can use it as a general approach for a fallback
> implementation on the archs without u8/u16 atomics. ;-)
>
>> Suggested-by: Peter Zijlstra (Intel)
>> Signed-off-by: Pan Xinhui
>> ---
>> change from V1:
>> 	rework totally.
>> ---
>>  arch/powerpc/include/asm/cmpxchg.h | 83 ++
>>  1 file changed, 83 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
>> index 44efe73..79a1f45 100644
>> --- a/arch/powerpc/include/asm/cmpxchg.h
>> +++ b/arch/powerpc/include/asm/cmpxchg.h
>> @@ -7,6 +7,37 @@
>>  #include
>>  #include
>>
>> +#ifdef __BIG_ENDIAN
>> +#define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
>> +#else
>> +#define BITOFF_CAL(size, off)	(off * BITS_PER_BYTE)
>> +#endif
>> +
>> +static __always_inline unsigned long
>> +__cmpxchg_u32_local(volatile unsigned int *p, unsigned long old,
>> +			unsigned long new);
>> +
>> +#define __XCHG_GEN(cmp, type, sfx, u32sfx, skip, v) \
>> +static __always_inline u32 \
>> +__##cmp##xchg_##type##sfx(v void *ptr, u32 old, u32 new) \
>> +{ \
>> +	int size = sizeof (type); \
>> +	int off = (unsigned long)ptr % sizeof(u32); \
>> +	volatile u32 *p = ptr - off; \
>> +	int bitoff = BITOFF_CAL(size, off); \
>> +	u32 bitmask = ((0x1 << size * BITS_PER_BYTE) - 1) << bitoff; \
>> +	u32 oldv, newv; \
>> +	u32 ret; \
>> +	do { \
>> +		oldv = READ_ONCE(*p); \
>> +		ret = (oldv & bitmask) >> bitoff; \
>> +		if (skip && ret != old) \
>> +			break; \
>> +		newv = (oldv & ~bitmask) | (new << bitoff); \
>> +	} while (__cmpxchg_u32##u32sfx((v void*)p, oldv, newv) != oldv);\
>
> Forgive me if this is too paranoid, but I think we can save the
> READ_ONCE() in the loop if we change the code into the following,
> because cmpxchg will return the "new" value, if the cmp part fails.
>
>	newv = READ_ONCE(*p);
>
>	do {
>		oldv = newv;
>		ret = (oldv & bitmask) >> bitoff;
>		if (skip && ret != old)
>			break;
>		newv = (oldv & ~bitmask) | (new << bitoff);
>		newv = __cmpxchg_u32##u32sfx((void *)p, oldv, newv);
>	} while(newv != oldv);
>
>> +	return ret; \
>> +}

A little optimization. Patch V3 will include your code, thanks.

>> +
>>  /*
>>   * Atomic exchange
>>   *
>> @@ -14,6 +45,19 @@
>>   * the previous value stored there.
>>   */
>>
>> +#define XCHG_GEN(type, sfx, v) \
>> +    __XCHG_GEN(_, type, sfx, _local, 0, v) \
>                                ^^^
>
> This should be sfx, right? Otherwise, all the newly added xchg will
> call __cmpxchg_u32_local, this will result in wrong ordering guarantees.
>

I mean that. But I will think of the ordering issue for a while. :)

>> +static __always_inline u32 __xchg_##type##sfx(v void *p, u32 n) \
>> +{ \
>> +	return ___xchg_##type##sfx(p, 0, n); \
>> +}
>> +
>> +XCHG_GEN(u8, _local, volatile);
>
> I don't think we need the "volatile" modifier here, because READ_ONCE()
> and __cmpxchg_u32_* all have "volatile" semantics IIUC, so maybe we can
> save a parameter for the __XCHG_GEN macro.
>

Such cleanup work can be done in a separate patch. Here I just make the
compiler happy.

thanks
xinhui

> Regards,
> Boqun
>
>> +XCHG_GEN(u8, _relaxed, );
>> +XCHG_GEN(u16, _local, volatile);
>> +XCHG_GEN(u16, _relaxed, );
>> +#undef XCHG_GEN
>> +
>>  static __always_inline unsigned long
>>  __xchg_local(volatile void *ptr, unsigned long x,
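For readers who want to experiment with the idea outside the kernel, the loop suggested in the review (load once, then reuse the value handed back by a failed compare-and-exchange) can be sketched in portable C11. This is only a sketch under stated assumptions, not the code from the patch: cmpxchg_small() and bitoff_cal() are hypothetical names, C11 atomic_compare_exchange_weak() (sequentially consistent by default) stands in for the __cmpxchg_u32* helpers, and the caller passes the aligned 32-bit word plus a byte offset explicitly instead of deriving them from a byte pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

/* Bit offset of a size-byte field at byte offset off inside its 32-bit word. */
static unsigned int bitoff_cal(size_t size, size_t off)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return (unsigned int)((sizeof(uint32_t) - size - off) * BITS_PER_BYTE);
#else
	return (unsigned int)(off * BITS_PER_BYTE);
#endif
}

/*
 * Hypothetical helper (not a kernel API): compare-and-exchange a 1- or
 * 2-byte field that lives at byte offset off inside the 32-bit word *word.
 * Returns the previous value of the field; the store happened only if the
 * return value equals old.
 */
static uint32_t cmpxchg_small(_Atomic uint32_t *word, size_t off, size_t size,
			      uint32_t old, uint32_t newval)
{
	assert((size == 1 || size == 2) && off + size <= sizeof(uint32_t));

	unsigned int bitoff = bitoff_cal(size, off);
	uint32_t bitmask = ((UINT32_C(1) << (size * BITS_PER_BYTE)) - 1) << bitoff;
	/* Single initial load; later iterations reuse the value from the failed CAS. */
	uint32_t oldv = atomic_load_explicit(word, memory_order_relaxed);
	uint32_t ret;

	for (;;) {
		ret = (oldv & bitmask) >> bitoff;
		if (ret != old)
			break;	/* the compare part failed, nothing is stored */
		/* Mask the new value so it cannot spill into neighbouring bytes. */
		uint32_t newv = (oldv & ~bitmask) | ((newval << bitoff) & bitmask);
		/* On failure, oldv is overwritten with the word's current contents. */
		if (atomic_compare_exchange_weak(word, &oldv, newv))
			break;
	}
	return ret;
}
```

The neighbouring bytes of the word are carried over unchanged from oldv, which is why a concurrent update to a neighbour only forces a retry and never gets lost.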
Re: [PATCH v8 29/45] powerpc/pci: Export pci_traverse_device_nodes()
On 04/20/2016 11:27 AM, Gavin Shan wrote: On Tue, Apr 19, 2016 at 03:51:03PM +1000, Alexey Kardashevskiy wrote: On 02/17/2016 02:44 PM, Gavin Shan wrote: This renames traverse_pci_devices() to pci_traverse_device_nodes(). The function traverses all subordinate device nodes of the specified one. Also, below cleanup applied to the function. No logical changes introduced. * Rename "pre" to "fn". * Avoid assignment in if condition reported from checkpatch.pl. Signed-off-by: Gavin Shan--- arch/powerpc/include/asm/ppc-pci.h | 6 +++--- arch/powerpc/kernel/pci_dn.c | 15 ++- arch/powerpc/platforms/pseries/msi.c | 4 ++-- 3 files changed, 15 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h index ca0c5bf..8753e4e 100644 --- a/arch/powerpc/include/asm/ppc-pci.h +++ b/arch/powerpc/include/asm/ppc-pci.h @@ -33,9 +33,9 @@ extern struct pci_dev *isa_bridge_pcidev; /* may be NULL if no ISA bus */ struct device_node; struct pci_dn; -typedef void *(*traverse_func)(struct device_node *me, void *data); Why removing this typedef? Typedef's are good. Anyway, Could you please provide more details why it's good? I removed it because it was used for only once. I have some thoughts but never mind, nobody seems to care about this and typedefs are considered bad by the CodingStyle. Reviewed-by: Alexey Kardashevskiy -void *traverse_pci_devices(struct device_node *start, traverse_func pre, - void *data); +void *pci_traverse_device_nodes(struct device_node *start, + void *(*fn)(struct device_node *, void *), + void *data); void *traverse_pci_dn(struct pci_dn *root, void *(*fn)(struct pci_dn *, void *), void *data); diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c index ce10281..ecdccce 100644 --- a/arch/powerpc/kernel/pci_dn.c +++ b/arch/powerpc/kernel/pci_dn.c @@ -372,8 +372,9 @@ EXPORT_SYMBOL_GPL(pci_remove_device_node_info); * one of these nodes we also assume its siblings are non-pci for * performance. 
*/ -void *traverse_pci_devices(struct device_node *start, traverse_func pre, - void *data) +void *pci_traverse_device_nodes(struct device_node *start, + void *(*fn)(struct device_node *, void *), + void *data) { struct device_node *dn, *nextdn; void *ret; @@ -388,8 +389,11 @@ void *traverse_pci_devices(struct device_node *start, traverse_func pre, if (classp) class = of_read_number(classp, 1); - if (pre && ((ret = pre(dn, data)) != NULL)) - return ret; + if (fn) { + ret = fn(dn, data); + if (ret) + return ret; + } /* If we are a PCI bridge, go down */ if (dn->child && ((class >> 8) == PCI_CLASS_BRIDGE_PCI || @@ -411,6 +415,7 @@ void *traverse_pci_devices(struct device_node *start, traverse_func pre, } return NULL; } +EXPORT_SYMBOL_GPL(pci_traverse_device_nodes); static struct pci_dn *pci_dn_next_one(struct pci_dn *root, struct pci_dn *pdn) @@ -487,7 +492,7 @@ void pci_devs_phb_init_dynamic(struct pci_controller *phb) } /* Update dn->phb ptrs for new phb and children devices */ - traverse_pci_devices(dn, add_pdn, phb); + pci_traverse_device_nodes(dn, add_pdn, phb); } /** diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c index 272e9ec..543a638 100644 --- a/arch/powerpc/platforms/pseries/msi.c +++ b/arch/powerpc/platforms/pseries/msi.c @@ -305,7 +305,7 @@ static int msi_quota_for_device(struct pci_dev *dev, int request) memset(, 0, sizeof(struct msi_counts)); /* Work out how many devices we have below this PE */ - traverse_pci_devices(pe_dn, count_non_bridge_devices, ); + pci_traverse_device_nodes(pe_dn, count_non_bridge_devices, ); if (counts.num_devices == 0) { pr_err("rtas_msi: found 0 devices under PE for %s\n", @@ -320,7 +320,7 @@ static int msi_quota_for_device(struct pci_dev *dev, int request) /* else, we have some more calculating to do */ counts.requestor = pci_device_to_OF_node(dev); counts.request = request; - traverse_pci_devices(pe_dn, count_spare_msis, ); + pci_traverse_device_nodes(pe_dn, count_spare_msis, ); /* If 
the quota isn't an integer multiple of the total, we can * use the remainder as spare MSIs for anyone that wants them. */ -- Alexey -- Alexey ___ Linuxppc-dev mailing list
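The calling convention discussed in this patch (a callback that stops the walk by returning non-NULL, with the assignment moved out of the if condition) can be illustrated with a small stand-alone sketch. Everything here is an assumption for illustration: struct node is a hypothetical stand-in for struct device_node, traverse_nodes() and match_name() are made-up names, and unlike the kernel function this version descends into every child rather than only below PCI bridges.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node: only the links we need. */
struct node {
	const char *name;
	struct node *parent;
	struct node *child;
	struct node *sibling;
};

/*
 * Visit every node below start in pre-order. If fn returns non-NULL,
 * stop immediately and hand that value back to the caller.
 */
static void *traverse_nodes(struct node *start,
			    void *(*fn)(struct node *, void *), void *data)
{
	struct node *n = start->child;

	while (n) {
		if (fn) {
			void *ret = fn(n, data);

			if (ret)
				return ret;
		}

		if (n->child) {
			n = n->child;	/* go down first */
			continue;
		}

		/* No child: climb until some node has a sibling, or we are back at start. */
		while (n && n != start && !n->sibling)
			n = n->parent;
		if (!n || n == start)
			break;
		n = n->sibling;
	}
	return NULL;
}

/* Example callback: stop at the first node whose name equals the string in data. */
static void *match_name(struct node *n, void *data)
{
	return strcmp(n->name, (const char *)data) == 0 ? n : NULL;
}
```

A caller would use it the same way the patch uses pci_traverse_device_nodes(dn, add_pdn, phb): pass the start node, the callback, and an opaque data pointer.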
Re: [PATCH v8 21/45] powerpc/powernv: Create PEs at PCI hot plugging time
On Wed, Apr 20, 2016 at 01:00:38PM +1000, Alexey Kardashevskiy wrote: >On 04/20/2016 11:12 AM, Gavin Shan wrote: >>On Tue, Apr 19, 2016 at 02:16:42PM +1000, Alexey Kardashevskiy wrote: >>>On 02/17/2016 02:44 PM, Gavin Shan wrote: Currently, the PEs and their associated resources are assigned in ppc_md.pcibios_fixup() except those used by SRIOV VFs. >>> >>>But this new code does not affect IOV and VF's PEs will still be created >>>somewhere else rather than pnv_pci_setup_bridge()? >>> >> >>Correct. VF PEs cannot be created in pnv_pci_setup_bridge() as the PF's >>IOV capability isn't enabled at that point. >> >>> The function is called for once after PCI probing and resources assignment is completed. So it isn't hotplug friendly. This creates PEs dynamically by ppc_md.pcibios_setup_bridge(), which is called on the event during system bootup and PCI hotplug: updating PCI bridge's windows after resource assignment/reassignment are done. For partial hotplug case, where not all PCI devices belonging to the PE are unplugged and plugged again, we just need unbinding/binding the affected PCI devices with the corresponding PE without creating new one. As there is no upstream bridge for root bus that needs to be covered by PE, we have to create PE for root bus in ppc_md.pcibios_setup_bridge() before any other PEs can be created, as PE for root bus is the ancestor to anyone else. >>> >>>We did not need a root bus PE before? What is the other PE reserved for? >>>Comments only say "reserved"... >>> >> >>No, A PE for root bus is needed before. > >Ok. We needed a PE for the root bus and we need it now. What changed? Why do >you reserve another PE? > Originally, all PEs (include the one for root bus) were created at PHB fixup time in pnv_pci_ioda_fixup(). With this patch, all PEs are created in pnv_pci_setup_bridge(). pnv_pci_setup_bridge() is called for every PCI buses other than root bus. It means pnv_pci_setup_bridge() isn't called for root bus. 
So we have to create PE for root bus before the left PEs are created there. The PE# for root bus is reserved in advance and used in pnv_pci_setup_bridge() at that point. > >> >other PEs can be for the PCI bus >>originated from root port and the subordinate domains. >> Also, the windows of root port or the upstream port of PCIe switch behind root port are extended to be PHB's apertures to accommodate the additional resources needed by newly plugged devices based on the fact: hotpluggable slot is behind root port or downstream port of the PCIe switch behind root port. The extension for those PCI brdiges' windows is done in ppc_md.pcibios_setup_bridge() as well. >>> >>> >>>This patch seems to be doing way too many things, hard to follow. >>> >>>Could you please split the patch into smaller chunks? For example (you can do >>>it totally different): >>>- move pnv_pci_ioda_setup_opal_tce_kill() >>>- move PE creation from pnv_pci_ioda_fixup() to pnv_pci_setup_bridge(); >>>- add pnv_pci_fixup_bridge_resources() >>>- add an extra reserved PE for the root bus (and all this magic with >>>root_pe_idx/root_pe_populated) >>>- ... >>> >> >>I'll evaluate it later. It's always nice to have small patches. Thanks >>for the comments. >> >>> >>> >>> >>>-- >>>Alexey >>> >> >>-- >>To unsubscribe from this list: send the line "unsubscribe linux-pci" in >>the body of a message to majord...@vger.kernel.org >>More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > >-- >Alexey > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v8 24/45] powerpc/pci: Rename pcibios_{add,remove}_pci_devices()
On 04/20/2016 11:23 AM, Gavin Shan wrote: On Tue, Apr 19, 2016 at 03:28:36PM +1000, Alexey Kardashevskiy wrote: On 02/17/2016 02:44 PM, Gavin Shan wrote: This renames pcibios_{add,remove}_pci_devices() to avoid conflicts with names of the weak functions in PCI subsystem, which have the prefix "pcibios". No logical changes introduced. Signed-off-by: Gavin Shan--- arch/powerpc/include/asm/pci-bridge.h | 4 ++-- arch/powerpc/kernel/eeh_driver.c | 12 ++-- arch/powerpc/kernel/pci-hotplug.c | 15 +++ drivers/pci/hotplug/rpadlpar_core.c | 2 +- drivers/pci/hotplug/rpaphp_core.c | 4 ++-- drivers/pci/hotplug/rpaphp_pci.c | 2 +- 6 files changed, 19 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h index 4dd6ef4..c817f38 100644 --- a/arch/powerpc/include/asm/pci-bridge.h +++ b/arch/powerpc/include/asm/pci-bridge.h @@ -263,10 +263,10 @@ static inline struct eeh_dev *pdn_to_eeh_dev(struct pci_dn *pdn) extern struct pci_bus *pcibios_find_pci_bus(struct device_node *dn); /** Remove all of the PCI devices under this bus */ -extern void pcibios_remove_pci_devices(struct pci_bus *bus); +extern void pci_remove_pci_devices(struct pci_bus *bus); pci_lala_pci_lala() ("pci" is used twice) looks weird, if the prefix is "pci", what other device types can they handle?... May be pcihp_add_devices(), pcihp_remove_devices() as these as defined in pci-hotplug.c? I assume you're talking about drivers/pci/hotplug/pci_hotplug_core.c. No, the helpers you are renaming are in pci-hotplug.c which uses "pci_" as a prefix even though the file is supposed to be about hotplug. pci_hotplug_core.c uses pci_hp_ prefix rather than pcihp_. I will rename them to pci_hp_*() in next revision. Anyway, this will work too. gwshan@gwshan:~/sandbox/linux$ find . 
-name pci-hotplug.c ./arch/powerpc/kernel/pci-hotplug.c gwshan@gwshan:~/sandbox/linux$ grep pci*hp arch/powerpc/kernel/pci-hotplug.c /** Discover new pci devices under this bus, and add them */ -extern void pcibios_add_pci_devices(struct pci_bus *bus); +extern void pci_add_pci_devices(struct pci_bus *bus); extern void isa_bridge_find_early(struct pci_controller *hose); diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c index fb6207d..59e53fe 100644 --- a/arch/powerpc/kernel/eeh_driver.c +++ b/arch/powerpc/kernel/eeh_driver.c @@ -621,7 +621,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus, * We don't remove the corresponding PE instances because * we need the information afterwords. The attached EEH * devices are expected to be attached soon when calling -* into pcibios_add_pci_devices(). +* into pci_add_pci_devices(). */ eeh_pe_state_mark(pe, EEH_PE_KEEP); if (bus) { @@ -630,7 +630,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus, } else { eeh_pe_state_clear(pe, EEH_PE_PRI_BUS); pci_lock_rescan_remove(); - pcibios_remove_pci_devices(bus); + pci_remove_pci_devices(bus); pci_unlock_rescan_remove(); } } else if (frozen_bus) { @@ -681,7 +681,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus, if (pe->type & EEH_PE_VF) eeh_add_virt_device(edev, NULL); else - pcibios_add_pci_devices(bus); + pci_add_pci_devices(bus); } else if (frozen_bus && rmv_data->removed) { pr_info("EEH: Sleep 5s ahead of partial hotplug\n"); ssleep(5); @@ -691,7 +691,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus, if (pe->type & EEH_PE_VF) eeh_add_virt_device(edev, NULL); else - pcibios_add_pci_devices(frozen_bus); + pci_add_pci_devices(frozen_bus); } eeh_pe_state_clear(pe, EEH_PE_KEEP); @@ -896,7 +896,7 @@ perm_error: eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED); pci_lock_rescan_remove(); - pcibios_remove_pci_devices(frozen_bus); + pci_remove_pci_devices(frozen_bus); 
pci_unlock_rescan_remove(); } } @@ -981,7 +981,7 @@ static void eeh_handle_special_event(void) bus = eeh_pe_bus_get(phb_pe); eeh_pe_dev_traverse(pe, eeh_report_failure, NULL); - pcibios_remove_pci_devices(bus); + pci_remove_pci_devices(bus); }
Re: [PATCH v8 22/45] powerpc/powernv/ioda1: Support releasing IODA1 TCE table
On 04/20/2016 11:15 AM, Gavin Shan wrote: On Tue, Apr 19, 2016 at 02:28:51PM +1000, Alexey Kardashevskiy wrote: On 02/17/2016 02:44 PM, Gavin Shan wrote: pnv_pci_ioda_table_free_pages() can be reused to release the IODA1 TCE table when releasing IODA1 PE in subsequent patches. This renames the following functions to support releasing IODA1 TCE table: pnv_pci_ioda2_table_free_pages() to pnv_pci_ioda_table_free_pages(), pnv_pci_ioda2_table_do_free_pages() to pnv_pci_ioda_table_do_free_pages(). No logical changes introduced. I can only see renaming here but it seems (from IODA_architecture_04-14-2008.pdf) that IODA1 does not support multi-level TCE tables in the way IODA2 does. Note that the change was proposed by you in last round. Hm. I do not recall proposing exactly that :-/ Yes, TVE on P7IOC doesn't support multiple levels of TCE tables. I thought it supports 2 levels. In this case, we will always have "tbl->it_indirect_levels" to 1, right? Nope, it will be 0. But it is still ugly to use release function but not to use its allocating counterpart which is pnv_pci_ioda2_table_alloc_pages(). I suggest having pnv_pci_ioda1_table_free_pages() which will be just a single free_pages() call. If you need some ioda*-common code to free a table, then define pnv_ioda1_iommu_ops::free(). Signed-off-by: Gavin Shan--- arch/powerpc/platforms/powernv/pci-ioda.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index d360607..077f9db 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -51,7 +51,7 @@ #define POWERNV_IOMMU_DEFAULT_LEVELS 1 #define POWERNV_IOMMU_MAX_LEVELS 5 -static void pnv_pci_ioda2_table_free_pages(struct iommu_table *tbl); +static void pnv_pci_ioda_table_free_pages(struct iommu_table *tbl); static void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level, const char *fmt, ...) 
@@ -1352,7 +1352,7 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe iommu_group_put(pe->table_group.group); BUG_ON(pe->table_group.group); } - pnv_pci_ioda2_table_free_pages(tbl); + pnv_pci_ioda_table_free_pages(tbl); iommu_free_table(tbl, of_node_full_name(dev->dev.of_node)); } @@ -1946,7 +1946,7 @@ static void pnv_ioda2_tce_free(struct iommu_table *tbl, long index, static void pnv_ioda2_table_free(struct iommu_table *tbl) { - pnv_pci_ioda2_table_free_pages(tbl); + pnv_pci_ioda_table_free_pages(tbl); iommu_free_table(tbl, "pnv"); } @@ -2448,7 +2448,7 @@ static __be64 *pnv_pci_ioda2_table_do_alloc_pages(int nid, unsigned shift, return addr; } -static void pnv_pci_ioda2_table_do_free_pages(__be64 *addr, +static void pnv_pci_ioda_table_do_free_pages(__be64 *addr, unsigned long size, unsigned level); static long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset, @@ -2487,7 +2487,7 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset, * release partially allocated table. 
*/ if (offset < tce_table_size) { - pnv_pci_ioda2_table_do_free_pages(addr, + pnv_pci_ioda_table_do_free_pages(addr, 1ULL << (level_shift - 3), levels - 1); return -ENOMEM; } @@ -2505,7 +2505,7 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset, return 0; } -static void pnv_pci_ioda2_table_do_free_pages(__be64 *addr, +static void pnv_pci_ioda_table_do_free_pages(__be64 *addr, unsigned long size, unsigned level) { const unsigned long addr_ul = (unsigned long) addr & @@ -2521,7 +2521,7 @@ static void pnv_pci_ioda2_table_do_free_pages(__be64 *addr, if (!(hpa & (TCE_PCI_READ | TCE_PCI_WRITE))) continue; - pnv_pci_ioda2_table_do_free_pages(__va(hpa), size, + pnv_pci_ioda_table_do_free_pages(__va(hpa), size, level - 1); } } @@ -2529,7 +2529,7 @@ static void pnv_pci_ioda2_table_do_free_pages(__be64 *addr, free_pages(addr_ul, get_order(size << 3)); } -static void pnv_pci_ioda2_table_free_pages(struct iommu_table *tbl) +static void pnv_pci_ioda_table_free_pages(struct iommu_table *tbl) { const unsigned long size = tbl->it_indirect_levels ? tbl->it_level_size : tbl->it_size; @@ -2537,7 +2537,7 @@ static void pnv_pci_ioda2_table_free_pages(struct iommu_table *tbl) if (!tbl->it_size) return; - pnv_pci_ioda2_table_do_free_pages((__be64 *)tbl->it_base, size, +
Re: [PATCH v8 39/45] powerpc/powernv: Select OF_DYNAMIC
On Tue, Apr 19, 2016 at 07:42:01PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>The device tree will change dynamically in PowerNV PCI hotplug
>>driver. This enables CONFIG_OF_DYNAMIC to support that.
>>
>>Signed-off-by: Gavin Shan
>>---
>> arch/powerpc/platforms/powernv/Kconfig | 1 +
>> 1 file changed, 1 insertion(+)
>>
>>diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
>>index 604190c..e7b1ad7 100644
>>--- a/arch/powerpc/platforms/powernv/Kconfig
>>+++ b/arch/powerpc/platforms/powernv/Kconfig
>>@@ -18,6 +18,7 @@ config PPC_POWERNV
>> 	select CPU_FREQ_GOV_ONDEMAND
>> 	select CPU_FREQ_GOV_CONSERVATIVE
>> 	select PPC_DOORBELL
>>+	select OF_DYNAMIC
>
>
>Why not enable it in 45/45 under config HOTPLUG_PCI_POWERNV? Is there any
>benefit of having it always on if HOTPLUG_PCI_POWERNV is not enabled?
>

Agreed, I will move it accordingly in the next revision. Note that we will
have to move it back here once something else depends on OF_DYNAMIC in the
future.

>> 	default y
>>
>> config OPAL_PRD
>>
>
>
>--
>Alexey
>
Re: [PATCH V2 01/68] powerpc/cxl: Use REGION_ID instead of opencoding
On Wed, 2016-04-13 at 08:12 +0530, Aneesh Kumar K.V wrote:
> "Aneesh Kumar K.V" writes:
> > Also note that the `~` operation is wrong.
> >
> > Cc: Frederic Barrat
> > Cc: Andrew Donnellan
> > Acked-by: Ian Munsie
> > Signed-off-by: Aneesh Kumar K.V
> > ---
> >  drivers/misc/cxl/fault.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
> > index 9a8650bcb042..9a236543da23 100644
> > --- a/drivers/misc/cxl/fault.c
> > +++ b/drivers/misc/cxl/fault.c
> > @@ -152,7 +152,7 @@ static void cxl_handle_page_fault(struct cxl_context *ctx,
> >  	access = _PAGE_PRESENT;
> >  	if (dsisr & CXL_PSL_DSISR_An_S)
> >  		access |= _PAGE_RW;
> > -	if ((!ctx->kernel) || ~(dar & (1ULL << 63)))
> > +	if ((!ctx->kernel) || (REGION_ID(dar) == USER_REGION_ID))
> >  		access |= _PAGE_USER;
> >
> >  	if (dsisr & DSISR_NOHPTE)
>
> Posted an updated version of this patch alone with improved commit
> message here
>
> http://mid.gmane.org/1460482475-20782-1-git-send-email-aneesh.ku...@linux.vnet.ibm.com

I never saw it. And that link is empty?

cheers
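To see why the commit message calls the `~` operation wrong, it may help to evaluate both conditions side by side. The sketch below is illustrative only: the function names are made up, and it assumes (as a labeled assumption, not something stated in this thread) that the region ID is the top four bits of the effective address and that user space is region 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration: region ID = top 4 address bits, user = 0. */
#define SKETCH_REGION_SHIFT	60
#define SKETCH_USER_REGION_ID	0

/*
 * The removed condition. Bitwise NOT yields zero only when every bit of
 * its operand is set, but the masked value has at most bit 63 set, so
 * this is true for every address.
 */
static bool is_user_addr_buggy(uint64_t dar)
{
	return ~(dar & (UINT64_C(1) << 63)) != 0;
}

/* The replacement condition, under the assumptions above. */
static bool is_user_addr_fixed(uint64_t dar)
{
	return (dar >> SKETCH_REGION_SHIFT) == SKETCH_USER_REGION_ID;
}
```

With the old test, a kernel context faulting on a kernel address would still have been granted _PAGE_USER, because the second half of the `||` could never be false.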
Re: [PATCH v8 21/45] powerpc/powernv: Create PEs at PCI hot plugging time
On 04/20/2016 11:12 AM, Gavin Shan wrote: On Tue, Apr 19, 2016 at 02:16:42PM +1000, Alexey Kardashevskiy wrote: On 02/17/2016 02:44 PM, Gavin Shan wrote: Currently, the PEs and their associated resources are assigned in ppc_md.pcibios_fixup() except those used by SRIOV VFs. But this new code does not affect IOV and VF's PEs will still be created somewhere else rather than pnv_pci_setup_bridge()? Correct. VF PEs cannot be created in pnv_pci_setup_bridge() as the PF's IOV capability isn't enabled at that point. The function is called for once after PCI probing and resources assignment is completed. So it isn't hotplug friendly. This creates PEs dynamically by ppc_md.pcibios_setup_bridge(), which is called on the event during system bootup and PCI hotplug: updating PCI bridge's windows after resource assignment/reassignment are done. For partial hotplug case, where not all PCI devices belonging to the PE are unplugged and plugged again, we just need unbinding/binding the affected PCI devices with the corresponding PE without creating new one. As there is no upstream bridge for root bus that needs to be covered by PE, we have to create PE for root bus in ppc_md.pcibios_setup_bridge() before any other PEs can be created, as PE for root bus is the ancestor to anyone else. We did not need a root bus PE before? What is the other PE reserved for? Comments only say "reserved"... No, A PE for root bus is needed before. Ok. We needed a PE for the root bus and we need it now. What changed? Why do you reserve another PE? other PEs can be for the PCI bus originated from root port and the subordinate domains. Also, the windows of root port or the upstream port of PCIe switch behind root port are extended to be PHB's apertures to accommodate the additional resources needed by newly plugged devices based on the fact: hotpluggable slot is behind root port or downstream port of the PCIe switch behind root port. 
The extension for those PCI brdiges' windows is done in ppc_md.pcibios_setup_bridge() as well. This patch seems to be doing way too many things, hard to follow. Could you please split the patch into smaller chunks? For example (you can do it totally different): - move pnv_pci_ioda_setup_opal_tce_kill() - move PE creation from pnv_pci_ioda_fixup() to pnv_pci_setup_bridge(); - add pnv_pci_fixup_bridge_resources() - add an extra reserved PE for the root bus (and all this magic with root_pe_idx/root_pe_populated) - ... I'll evaluate it later. It's always nice to have small patches. Thanks for the comments. -- Alexey -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Alexey ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [V2, 68/68] powerpc/mm/radix: Use firmware feature to disable radix
On Sat, 2016-09-04 at 06:14:04 UTC, "Aneesh Kumar K.V" wrote:
> We can depend on ibm,pa-features to enable/disable radix. This gives us
> a nice way to test p9 hash config, by changing device tree property.

I think we might want to be more careful here.

You set MMU_FTR_RADIX in the cputable entry. So it's on by default on P9
cpus.

Then if there is an ibm,pa-features property *and* it is >= 41 bytes long,
the below feature entry will hit. In that case the firmware controls whether
it's on or off.

I think it would be clearer if we removed RADIX from the cputable, and the
below became the only way to turn it on. Would that break anything?

> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 7030b035905d..a4d1f44364b8 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -165,6 +165,7 @@ static struct ibm_pa_feature {
>  	 * which is 0 if the kernel doesn't support TM.
>  	 */
>  	{CPU_FTR_TM_COMP, 0, 0,		22, 0, 0},
> +	{0, MMU_FTR_RADIX, 0,		40, 0, 0},

So that says bit 0 of byte 40 enables MMU_FTR_RADIX. Where is that
documented?

cheers
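The interaction described in this reply (a default from the cputable that firmware overrides only when the property is long enough) can be sketched as a small predicate. This is a hedged illustration, not the prom.c implementation: pa_feature_enabled() is a hypothetical name, ftrs/len are assumed to describe the attribute bytes with any property header already skipped, and bits are assumed to be numbered IBM-style with bit 0 as the most significant bit of the byte.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether bit pabit of byte pabyte is set.
 * Assumption: bit 0 is the most significant bit of the byte.
 * When the property is absent or too short to contain that byte, the
 * firmware has no say and the caller-supplied default is returned.
 */
static bool pa_feature_enabled(const uint8_t *ftrs, size_t len,
			       size_t pabyte, unsigned int pabit,
			       bool default_on)
{
	if (ftrs == NULL || pabyte >= len || pabit > 7)
		return default_on;

	return (ftrs[pabyte] & (0x80u >> pabit)) != 0;
}
```

For the entry in the patch, the call would be pa_feature_enabled(ftrs, len, 40, 0, default_on): with fewer than 41 bytes the default wins, otherwise byte 40 decides. Removing RADIX from the cputable, as suggested above, corresponds to always passing false as the default.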
Re: [PATCH v8 40/45] drivers/of: Split unflatten_dt_node()
On Wed, Feb 17, 2016 at 08:30:42AM -0600, Rob Herring wrote:
>On Tue, Feb 16, 2016 at 9:44 PM, Gavin Shan wrote:
>> The function unflatten_dt_node() is called recursively to unflatten
>> device nodes and properties in the FDT blob. It looks complicated
>> and hard to be understood.
>>
>> This splits the function into 3 functions: populate_properties(),
>> populate_node() and unflatten_dt_node(). populate_properties(),
>> which is called by populate_node(), creates properties for the
>> indicated device node. The latter one creates the device nodes
>> from FDT blob. populate_node() gets the offset in FDT blob for
>> next device nodes and then calls populate_node(). No logical
>> changes introduced.
>>
>> Signed-off-by: Gavin Shan
>> ---
>>  drivers/of/fdt.c | 249 ---
>>  1 file changed, 147 insertions(+), 102 deletions(-)
>
>One nit, otherwise:
>
>Acked-by: Rob Herring
>
>[...]
>
>> +		/* And we process the "ibm,phandle" property
>> +		 * used in pSeries dynamic device tree
>> +		 * stuff
>> +		 */
>> +		if (!strcmp(pname, "ibm,phandle"))
>> +			np->phandle = be32_to_cpup(val);
>> +
>> +		pp->name = (char *)pname;
>> +		pp->length = sz;
>> +		pp->value = (__be32 *)val;
>
>This cast should not be needed.
>

Rob, very sorry to respond so late. I will fix it up in the next revision.

>> +		*pprev = pp;
>> +		pprev = &pp->next;
>> +	}
>
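The `*pprev = pp;` assignment at the end of the quoted hunk is the usual tail-pointer idiom for building a singly linked list in order without special-casing the first element. A minimal stand-alone sketch follows; struct prop and link_props() are hypothetical names chosen for illustration and are not taken from drivers/of/fdt.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct property: a name and a next link. */
struct prop {
	const char *name;
	struct prop *next;
};

/*
 * Link props[0..n) into a list in array order. pprev always points at
 * the slot that should receive the next element: first the list head,
 * afterwards the next field of the element appended last.
 */
static struct prop *link_props(struct prop *props, size_t n)
{
	struct prop *head = NULL;
	struct prop **pprev = &head;

	for (size_t i = 0; i < n; i++) {
		*pprev = &props[i];
		pprev = &props[i].next;
	}
	*pprev = NULL;	/* terminate the list */

	return head;
}
```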
Re: [PATCH v8 13/45] powerpc/powernv/ioda1: M64 support on P7IOC
On 04/20/2016 10:22 AM, Gavin Shan wrote: On Wed, Apr 13, 2016 at 05:47:59PM +1000, Alexey Kardashevskiy wrote: On 02/17/2016 02:43 PM, Gavin Shan wrote: This enables M64 window on P7IOC, which has been enabled on PHB3. Different from PHB3 where 16 M64 BARs are supported and each of them can be owned by one particular PE# exclusively or divided evenly to 256 segments, every P7IOC PHB has 16 M64 BARs and each of them are divided to 8 segments. So every P7IOC PHB supports 128 M64 segments in total. P7IOC has M64DT, which helps mapping one particular M64 segment# to arbitrary PE#. PHB3 doesn't have M64DT, indicating that one M64 segment can only be pinned to the fixed PE#. In order to have same code to support M64 on P7IOC and PHB3, we just provide 128 M64 segments on every P7IOC PHB and each of them is pinned to the fixed PE# by bypassing the function of M64DT. In turn, we just need different phb->init_m64() for P7IOC and PHB3 to support M64. The comment is not quite correct - in addition to pnv_ioda1_init_m64(), you also need to hack pnv_ioda_pick_m64_pe(). Right, will talk about the changes to pnv_ioda_pick_m64_pe() in the commit log of next revision. Signed-off-by: Gavin Shan--- arch/powerpc/platforms/powernv/pci-ioda.c | 86 +-- arch/powerpc/platforms/powernv/pci.h | 3 ++ 2 files changed, 86 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index 1dc663a..8488238 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -246,6 +246,64 @@ static void pnv_ioda_reserve_dev_m64_pe(struct pci_dev *pdev, } } +static int pnv_ioda1_init_m64(struct pnv_phb *phb) +{ + struct resource *r; + int index; + + /* +* There are 16 M64 BARs, each of which has 8 segments. So +* there are as many M64 segments as the maximum number of +* PEs, which is 128. 
+*/ + for (index = 0; index < PNV_IODA1_M64_NUM; index++) { + unsigned long base, segsz = phb->ioda.m64_segsize; + int64_t rc; + + base = phb->ioda.m64_base + + index * PNV_IODA1_M64_SEGS * segsz; + rc = opal_pci_set_phb_mem_window(phb->opal_id, + OPAL_M64_WINDOW_TYPE, index, base, 0, + PNV_IODA1_M64_SEGS * segsz); + if (rc != OPAL_SUCCESS) { + pr_warn(" Error %lld setting M64 PHB#%d-BAR#%d\n", + rc, phb->hose->global_number, index); + goto fail; + } + + rc = opal_pci_phb_mmio_enable(phb->opal_id, + OPAL_M64_WINDOW_TYPE, index, + OPAL_ENABLE_M64_SPLIT); + if (rc != OPAL_SUCCESS) { + pr_warn(" Error %lld enabling M64 PHB#%d-BAR#%d\n", + rc, phb->hose->global_number, index); + goto fail; + } + } + + /* +* Exclude the segment used by the reserved PE, which +* is expected to be 0 or last supported PE#. +*/ + r = >hose->mem_resources[1]; + if (phb->ioda.reserved_pe_idx == 0) + r->start += phb->ioda.m64_segsize; + else if (phb->ioda.reserved_pe_idx == (phb->ioda.total_pe_num - 1)) + r->end -= phb->ioda.m64_segsize; + else + pr_warn(" Cannot cut M64 segment for reserved PE#%d\n", + phb->ioda.reserved_pe_idx); + + return 0; + +fail: + for ( ; index >= 0; index--) + opal_pci_phb_mmio_enable(phb->opal_id, + OPAL_M64_WINDOW_TYPE, index, OPAL_DISABLE_M64); + + return -EIO; +} + static void pnv_ioda_reserve_m64_pe(struct pci_bus *bus, unsigned long *pe_bitmap, bool all) @@ -315,6 +373,26 @@ static int pnv_ioda_pick_m64_pe(struct pci_bus *bus, bool all) pe->master = master_pe; list_add_tail(>list, _pe->slaves); } + + /* +* P7IOC supports M64DT, which helps mapping M64 segment +* to one particular PE#. However, PHB3 has fixed mapping +* between M64 segment and PE#. In order to have same logic +* for P7IOC and PHB3, we enforce fixed mapping between M64 +* segment and PE# on P7IOC. +*/ + if (phb->type == PNV_PHB_IODA1) { + int64_t rc; + + rc = opal_pci_map_pe_mmio_window(phb->opal_id, + pe->pe_number, OPAL_M64_WINDOW_TYPE, + pe->pe_number /
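The segment arithmetic described in the commit message (16 M64 BARs of 8 segments each, 128 segments in total, each pinned to a fixed PE#) can be checked with a few lines of stand-alone C. This is a sketch under stated assumptions: the SKETCH_* constants, struct m64_slot and both helpers are hypothetical names, and the PE# n to segment n pinning in m64_slot_for_pe() is one straightforward reading of "fixed mapping", not a reconstruction of the truncated hunk above.

```c
#include <stdint.h>

#define SKETCH_M64_NUM	16	/* M64 BARs per P7IOC PHB, per the commit message */
#define SKETCH_M64_SEGS	8	/* segments per M64 BAR, per the commit message */

/* Position of one M64 segment: which BAR, and which segment inside it. */
struct m64_slot {
	unsigned int bar;
	unsigned int seg;
};

/* Start address of an M64 BAR when BARs are laid out back to back. */
static uint64_t m64_bar_base(uint64_t m64_base, uint64_t segsz, unsigned int bar)
{
	return m64_base + (uint64_t)bar * SKETCH_M64_SEGS * segsz;
}

/* Assumed fixed mapping: PE# n owns global segment n. */
static struct m64_slot m64_slot_for_pe(unsigned int pe_number)
{
	struct m64_slot s = {
		pe_number / SKETCH_M64_SEGS,
		pe_number % SKETCH_M64_SEGS,
	};

	return s;
}
```

The base computation mirrors the `phb->ioda.m64_base + index * PNV_IODA1_M64_SEGS * segsz` expression in pnv_ioda1_init_m64() quoted above.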
RE: [PATCH 5/5] drivers/net: support hdlc function for QE-UCC
On 20/04/2016 12:22AM, Christophe Leroy wrote:
> -----Original Message-----
> From: Christophe Leroy [mailto:christophe.le...@c-s.fr]
> Sent: Wednesday, April 20, 2016 12:22 AM
> To: Qiang Zhao ; da...@davemloft.net
> Cc: gre...@linuxfoundation.org; Xiaobo Xie ;
> linux-ker...@vger.kernel.org; o...@buserror.net; net...@vger.kernel.org;
> a...@linux-foundation.org; linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH 5/5] drivers/net: support hdlc function for QE-UCC
>
> On 30/03/2016 10:50, Zhao Qiang wrote:
> > The driver add hdlc support for Freescale QUICC Engine.
> > It support NMSI and TSA mode.
> When using TSA, how does the TSA gets configured ? Especially how do you
> describe which Timeslot is switched to HDLC channels ?

The TSA is configured statically according to the device tree node.
As for which timeslot is switched to HDLC channels, the
"fsl,tx-timeslot-mask" property in the device tree describes it.

> Is it possible to route some Timeslots to one UCC for HDLC, and route some
> others to another UCC for an ALSA sound driver ?

The feature you describe is not supported at present.

> The QE also have a QMC which allows to split all timeslots to a given UCC into
> independent channels that can either be used with HDLC or transparents (for
> audio for instance). Do you intend to also support QMC ?

Newer QE versions use UMCC instead of the QMC found in older QE versions;
we have started to develop UMCC support.

> According to the compatible property, it looks like your driver is for
> freescale
> T1040. The MPC83xx also has a Quick Engine, would it work on it too ?

The driver is common, but it has been tested on T1040. If you want to test
on MPC83xx, a corresponding node needs to be added to the MPC83xx device
tree.

-Zhao Qiang
Re: [PATCH v8 00/45] powerpc/powernv: PCI hotplug support
On Fri, Apr 15, 2016 at 11:10:21AM -0500, Rob Herring wrote:
>On Wed, Apr 13, 2016 at 8:30 PM, Gavin Shan wrote:
>> On Thu, Apr 14, 2016 at 09:57:32AM +1000, Alistair Popple wrote:
>>>Hi Gavin,
>>>
>>>>>Why exactly cannot EEH reset changes go to a smaller separate patchset
>>>>>(before hotplug)?
>>>>>
>>>>As I explained before, the patchset's order is: PCI generic part,
>>>>PowerNV PCI related, EEH related, device-tree part and hotplug driver.
>>>>The EEH reset change is included in PATCH[37/45]. There is no point
>>>>in reordering the patches.
>>>
>>>I don't understand all of the dependencies but if possible splitting the
>>>series up into a set of smaller self-contained patch series makes things
>>>easier to review and may make it easier for you to get this functionality
>>>reviewed and accepted into upstream.
>>>
>>
>> Thanks, Alistair. I will move those cleanup/refactor related patches
>> to form a separate series which is expected to be merged first. That
>> will help the reviewers to focus on the patches with complicated
>> changes as you suggested. Alexey, please let me know whether that is
>> what you would like to see.
>
>As I said last cycle, I'll happily take the DT refactoring patches
>separately, but you have to tell me if you want me to apply them and
>it has to be well before the merge window.
>

Thanks, Rob. I hope to post the next revision (v9) soon, and the device-tree
related cleanup patches in it should be ready for the next merge window.

>Rob
>
Re: [PATCH v8 38/45] powerpc/powernv: Functions to get/set PCI slot status
On Tue, Apr 19, 2016 at 07:39:34PM +1000, Alexey Kardashevskiy wrote: >On 02/17/2016 02:44 PM, Gavin Shan wrote: >>This exports 4 functins, which base on the corresponding OPAL > > >s/functins/functions/ > Thanks. >>APIs to get/set PCI slot status. Those functions are going to >>be used by PowerNV PCI hotplug driver: >> >>pnv_pci_get_device_tree()opal_get_device_tree() >>pnv_pci_get_presence_state() opal_pci_get_presence_state() >>pnv_pci_get_power_state()opal_pci_get_power_state() >>pnv_pci_set_power_state()opal_pci_set_power_state() >> >>Besides, the patch also exports pnv_pci_hotplug_notifier_{register, >>unregister}() to allow registration and unregistration of PCI hotplug >>notifier, which will be used to receive PCI hotplug message from >>skiboot firmware in PowerNV PCI hotplug driver. >> >>Signed-off-by: Gavin Shan>>--- >> arch/powerpc/include/asm/opal-api.h| 17 ++- >> arch/powerpc/include/asm/opal.h| 4 ++ >> arch/powerpc/include/asm/pnv-pci.h | 7 +++ >> arch/powerpc/platforms/powernv/opal-wrappers.S | 4 ++ >> arch/powerpc/platforms/powernv/pci.c | 66 >> ++ >> 5 files changed, 97 insertions(+), 1 deletion(-) >> >>diff --git a/arch/powerpc/include/asm/opal-api.h >>b/arch/powerpc/include/asm/opal-api.h >>index f8faaae..a6af338 100644 >>--- a/arch/powerpc/include/asm/opal-api.h >>+++ b/arch/powerpc/include/asm/opal-api.h >>@@ -158,7 +158,11 @@ >> #define OPAL_LEDS_SET_INDICATOR 115 >> #define OPAL_CEC_REBOOT2116 >> #define OPAL_CONSOLE_FLUSH 117 >>-#define OPAL_LAST117 >>+#define OPAL_GET_DEVICE_TREE 118 >>+#define OPAL_PCI_GET_PRESENCE_STATE 119 >>+#define OPAL_PCI_GET_POWER_STATE 120 >>+#define OPAL_PCI_SET_POWER_STATE 121 >>+#define OPAL_LAST121 >> >> /* Device tree flags */ >> >>@@ -344,6 +348,16 @@ enum OpalPciResetState { >> OPAL_ASSERT_RESET = 1 >> }; >> >>+enum OpalPciSlotPresentenceState { >>+ OPAL_PCI_SLOT_EMPTY = 0, >>+ OPAL_PCI_SLOT_PRESENT = 1 >>+}; >>+ >>+enum OpalPciSlotPowerState { >>+ OPAL_PCI_SLOT_POWER_OFF = 0, >>+ OPAL_PCI_SLOT_POWER_ON = 1 
>>+}; >>+ >> enum OpalSlotLedType { >> OPAL_SLOT_LED_TYPE_ID = 0, /* IDENTIFY LED */ >> OPAL_SLOT_LED_TYPE_FAULT = 1, /* FAULT LED */ >>@@ -378,6 +392,7 @@ enum opal_msg_type { >> OPAL_MSG_DPO, >> OPAL_MSG_PRD, >> OPAL_MSG_OCC, >>+ OPAL_MSG_PCI_HOTPLUG, >> OPAL_MSG_TYPE_MAX, >> }; >> >>diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h >>index 9e0039f..899bcb941 100644 >>--- a/arch/powerpc/include/asm/opal.h >>+++ b/arch/powerpc/include/asm/opal.h >>@@ -209,6 +209,10 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, >>uint64_t buf, >> uint64_t size, uint64_t token); >> int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size, >> uint64_t token); >>+int64_t opal_get_device_tree(uint32_t phandle, uint64_t buf, uint64_t len); >>+int64_t opal_pci_get_presence_state(uint64_t id, uint8_t *state); >>+int64_t opal_pci_get_power_state(uint64_t id, uint8_t *state); >>+int64_t opal_pci_set_power_state(uint64_t id, uint8_t state); >> >> /* Internal functions */ >> extern int early_init_dt_scan_opal(unsigned long node, const char *uname, >>diff --git a/arch/powerpc/include/asm/pnv-pci.h >>b/arch/powerpc/include/asm/pnv-pci.h >>index 6f77f71..d9d095b 100644 >>--- a/arch/powerpc/include/asm/pnv-pci.h >>+++ b/arch/powerpc/include/asm/pnv-pci.h >>@@ -13,6 +13,13 @@ >> #include >> #include >> >>+extern int pnv_pci_get_device_tree(uint32_t phandle, void *buf, uint64_t >>len); >>+extern int pnv_pci_get_presence_state(uint64_t id, uint8_t *state); >>+extern int pnv_pci_get_power_state(uint64_t id, uint8_t *state); >>+extern int pnv_pci_set_power_state(uint64_t id, uint8_t state); >>+extern int pnv_pci_hotplug_notifier_register(struct notifier_block *nb); >>+extern int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb); >>+ >> int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode); >> int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq, >> unsigned int virq); >>diff --git 
a/arch/powerpc/platforms/powernv/opal-wrappers.S >>b/arch/powerpc/platforms/powernv/opal-wrappers.S >>index e45b88a..3ea1a855 100644 >>--- a/arch/powerpc/platforms/powernv/opal-wrappers.S >>+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S >>@@ -302,3 +302,7 @@ OPAL_CALL(opal_prd_msg, >>OPAL_PRD_MSG); >> OPAL_CALL(opal_leds_get_ind, >> OPAL_LEDS_GET_INDICATOR); >> OPAL_CALL(opal_leds_set_ind, >> OPAL_LEDS_SET_INDICATOR); >> OPAL_CALL(opal_console_flush, OPAL_CONSOLE_FLUSH);
Re: [PATCH v8 37/45] powerpc/powernv: Use firmware PCI slot reset infrastructure
On Tue, Apr 19, 2016 at 07:34:55PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>The skiboot firmware might provide the PCI slot reset capability
>>which is identified by property "ibm,reset-by-firmware" on the
>>PCI slot associated device node.
>>
>>This checks the property. If it exists, the reset request is routed
>>to firmware. Otherwise, the reset is done by kernel as before.
>>
>>Signed-off-by: Gavin Shan
>>---
>> arch/powerpc/platforms/powernv/eeh-powernv.c | 41 +++-
>> 1 file changed, 40 insertions(+), 1 deletion(-)
>>
>>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>index e23b063..c8a5217 100644
>>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>@@ -789,7 +789,7 @@ static int pnv_eeh_root_reset(struct pci_controller *hose, int option)
>> 	return ret;
>> }
>>
>>-static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
>>+static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
>> {
>> 	struct pci_dn *pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
>> 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>>@@ -840,6 +840,45 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
>> 	return 0;
>> }
>>
>>+static int pnv_eeh_bridge_reset(struct pci_dev *pdev, int option)
>>+{
>>+	struct pci_controller *hose;
>>+	struct pnv_phb *phb;
>>+	struct device_node *dn = pdev ? pci_device_to_OF_node(pdev) : NULL;
>>+	uint64_t id = (0x1ul << 60);
>
>
>What is this 1<<60 for?
>
>

As you said in other threads, it's worth having some macros for this.
When this bit is set, the ID identifies a slot behind a switch port. If the
bit is cleared, the ID represents a PHB slot.

>>+	uint8_t scope;
>>+	int64_t rc;
>>+
>>+	/*
>>+	 * If the firmware can't handle it, we will issue hot reset
>>+	 * on the secondary bus despite the requested reset type.
>>+	 */
>>+	if (!dn || !of_get_property(dn, "ibm,reset-by-firmware", NULL))
>>+		return __pnv_eeh_bridge_reset(pdev, option);
>>+
>>+	/* The firmware can handle the request */
>>+	switch (option) {
>>+	case EEH_RESET_HOT:
>>+		scope = OPAL_RESET_PCI_HOT;
>>+		break;
>>+	case EEH_RESET_FUNDAMENTAL:
>>+		scope = OPAL_RESET_PCI_FUNDAMENTAL;
>>+		break;
>>+	case EEH_RESET_DEACTIVATE:
>>+		return 0;
>>+	default:
>>+		dev_warn(&pdev->dev, "%s: Unsupported reset %d\n",
>>+			 __func__, option);
>
>
>Can the userspace trigger this case (via VFIO-EEH) and flood dmesg?
>

It depends on how you define message flooding, actually. This is an abnormal
path caused by an internal program error, not by external users.

>
>
>>+		return -EINVAL;
>>+	}
>>+
>>+	hose = pci_bus_to_host(pdev->bus);
>>+	phb = hose->private_data;
>>+	id |= (pdev->bus->number << 24) | (pdev->devfn << 16) | phb->opal_id;
>>+	rc = opal_pci_reset(id, scope, OPAL_ASSERT_RESET);
>>+	return pnv_pci_poll(id, rc, NULL);
>>+}
>>+
>> static int pnv_pci_dev_reset_type(struct pci_dev *pdev, void *data)
>> {
>> 	int *freset = data;
>>
>
>
>--
>Alexey
>
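As a follow-up to the "1<<60" question, here is a minimal sketch of what helper macros for the slot ID could look like. The names PNV_SLOT_ID_FLAG, pnv_slot_id_phb(), pnv_slot_id_switch() and pnv_slot_id_is_switch() are hypothetical; they are not taken from the patch or from skiboot. The bit layout simply mirrors the expression in the quoted diff (bit 60 set for a slot behind a switch port, bus number at bit 24, devfn at bit 16, PHB OPAL ID in the low bits), and the sketch assumes the PHB OPAL ID does not overlap those fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the layout is copied from the quoted diff. */
#define PNV_SLOT_ID_FLAG	(UINT64_C(1) << 60)	/* set: slot behind a switch port */
#define PNV_SLOT_ID_BUS_SHIFT	24
#define PNV_SLOT_ID_DEVFN_SHIFT	16

/* ID of a PHB slot: the PHB's OPAL ID with the flag bit clear. */
static inline uint64_t pnv_slot_id_phb(uint64_t phb_opal_id)
{
	assert(!(phb_opal_id & PNV_SLOT_ID_FLAG));
	return phb_opal_id;
}

/* ID of a slot behind a switch port on the given PHB. */
static inline uint64_t pnv_slot_id_switch(uint64_t phb_opal_id,
					  uint8_t bus, uint8_t devfn)
{
	/* Widen before shifting so bus numbers >= 0x80 cannot hit the sign bit. */
	return PNV_SLOT_ID_FLAG |
	       ((uint64_t)bus << PNV_SLOT_ID_BUS_SHIFT) |
	       ((uint64_t)devfn << PNV_SLOT_ID_DEVFN_SHIFT) |
	       phb_opal_id;
}

/* Non-zero when the ID refers to a slot behind a switch port. */
static inline int pnv_slot_id_is_switch(uint64_t id)
{
	return (id & PNV_SLOT_ID_FLAG) != 0;
}
```

The casts to uint64_t are a choice made in this sketch only: an 8-bit bus number is promoted to int before shifting, so shifting values of 0x80 and above left by 24 would otherwise touch the sign bit.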
Re: [PATCH v8 36/45] powerpc/powernv: Support PCI slot ID
On Tue, Apr 19, 2016 at 07:28:20PM +1000, Alexey Kardashevskiy wrote: >On 02/17/2016 02:44 PM, Gavin Shan wrote: >>PowerNV platforms runs on top of skiboot firmware that includes >>changes to support PCI slots. PCI slots are identified by PHB's >>ID or the combo of that and PCI slot ID. >> >>This changes the EEH PowerNV backend to support PCI slots: >> >>* Rename arguments of opal_pci_reset() and opal_pci_poll(). >>* One more argument (PCI slot's state) added to opal_pci_poll(). >>* Drop pnv_eeh_phb_poll() and introduce a enhanced similar >> function pnv_pci_poll() that will be used by PowerNV hotplug >> backends. >> >>Signed-off-by: Gavin Shan>>--- >> arch/powerpc/include/asm/opal.h | 4 +-- >> arch/powerpc/platforms/powernv/eeh-powernv.c | 42 >> ++-- >> arch/powerpc/platforms/powernv/pci.c | 21 ++ >> arch/powerpc/platforms/powernv/pci.h | 1 + >> 4 files changed, 32 insertions(+), 36 deletions(-) >> >>diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h >>index 07a99e6..9e0039f 100644 >>--- a/arch/powerpc/include/asm/opal.h >>+++ b/arch/powerpc/include/asm/opal.h >>@@ -131,7 +131,7 @@ int64_t opal_pci_map_pe_dma_window(uint64_t phb_id, >>uint16_t pe_number, uint16_t >> int64_t opal_pci_map_pe_dma_window_real(uint64_t phb_id, uint16_t pe_number, >> uint16_t dma_window_number, uint64_t >> pci_start_addr, >> uint64_t pci_mem_size); >>-int64_t opal_pci_reset(uint64_t phb_id, uint8_t reset_scope, uint8_t >>assert_state); >>+int64_t opal_pci_reset(uint64_t id, uint8_t reset_scope, uint8_t >>assert_state); >> >> int64_t opal_pci_get_hub_diag_data(uint64_t hub_id, void *diag_buffer, >> uint64_t diag_buffer_len); >>@@ -148,7 +148,7 @@ int64_t opal_get_dpo_status(__be64 *dpo_timeout); >> int64_t opal_set_system_attention_led(uint8_t led_action); >> int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe, >> __be16 *pci_error_type, __be16 *severity); >>-int64_t opal_pci_poll(uint64_t phb_id); >>+int64_t opal_pci_poll(uint64_t id, uint8_t 
*state); >> int64_t opal_return_cpu(void); >> int64_t opal_check_token(uint64_t token); >> int64_t opal_reinit_cpus(uint64_t flags); >>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c >>b/arch/powerpc/platforms/powernv/eeh-powernv.c >>index c7454ba..e23b063 100644 >>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c >>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c >>@@ -717,28 +717,11 @@ static int pnv_eeh_get_state(struct eeh_pe *pe, int >>*delay) >> return ret; >> } >> >>-static s64 pnv_eeh_phb_poll(struct pnv_phb *phb) >>-{ >>- s64 rc = OPAL_HARDWARE; >>- >>- while (1) { >>- rc = opal_pci_poll(phb->opal_id); >>- if (rc <= 0) >>- break; >>- >>- if (system_state < SYSTEM_RUNNING) >>- udelay(1000 * rc); >>- else >>- msleep(rc); >>- } >>- >>- return rc; >>-} >>- >> int pnv_eeh_phb_reset(struct pci_controller *hose, int option) >> { >> struct pnv_phb *phb = hose->private_data; >> s64 rc = OPAL_HARDWARE; >>+ int ret; >> >> pr_debug("%s: Reset PHB#%x, option=%d\n", >> __func__, hose->global_number, option); >>@@ -753,8 +736,6 @@ int pnv_eeh_phb_reset(struct pci_controller *hose, int >>option) >> rc = opal_pci_reset(phb->opal_id, >> OPAL_RESET_PHB_COMPLETE, >> OPAL_DEASSERT_RESET); >>- if (rc < 0) >>- goto out; >> >> /* >> * Poll state of the PHB until the request is done >>@@ -762,24 +743,22 @@ int pnv_eeh_phb_reset(struct pci_controller *hose, int >>option) >> * reset followed by hot reset on root bus. So we also >> * need the PCI bus settlement delay. 
>> */ >>- rc = pnv_eeh_phb_poll(phb); >>- if (option == EEH_RESET_DEACTIVATE) { >>+ ret = pnv_pci_poll(phb->opal_id, rc, NULL); >>+ if (option == EEH_RESET_DEACTIVATE && !ret) { >> if (system_state < SYSTEM_RUNNING) >> udelay(1000 * EEH_PE_RST_SETTLE_TIME); >> else >> msleep(EEH_PE_RST_SETTLE_TIME); >> } >>-out: >>- if (rc != OPAL_SUCCESS) >>- return -EIO; >> >>- return 0; >>+ return ret; >> } >> >> static int pnv_eeh_root_reset(struct pci_controller *hose, int option) >> { >> struct pnv_phb *phb = hose->private_data; >> s64 rc = OPAL_HARDWARE; >>+ int ret; >> >> pr_debug("%s: Reset PHB#%x, option=%d\n", >> __func__, hose->global_number, option); >>@@ -801,18 +780,13 @@ static int pnv_eeh_root_reset(struct pci_controller >>*hose, int option) >> rc = opal_pci_reset(phb->opal_id,
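The removed pnv_eeh_phb_poll() loop treats a positive return value from the firmware as "wait this many milliseconds and poll again", and anything else as completion or failure. The sketch below illustrates that protocol in plain user-space C. It is not the pnv_pci_poll() implementation from the patch, which is not shown in this thread; poll_until_done(), fake_poll(), fake_delay() and struct fake_fw are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical callbacks standing in for the firmware poll and the delay. */
typedef int64_t (*poll_fn)(uint64_t id, void *ctx);
typedef void (*delay_fn)(int64_t ms, void *ctx);

/*
 * rc is the result of the request that started the operation:
 *   rc > 0  : wait rc milliseconds, then poll again
 *   rc == 0 : operation finished
 *   rc < 0  : firmware reported an error
 */
static int poll_until_done(uint64_t id, int64_t rc,
			   poll_fn poll, delay_fn delay, void *ctx)
{
	assert(poll && delay);

	while (rc > 0) {
		delay(rc, ctx);
		rc = poll(id, ctx);
	}

	return rc == 0 ? 0 : -EIO;
}

/* Fake firmware: asks for 10ms more while 'remaining' is positive. */
struct fake_fw {
	int remaining;
	int64_t waited_ms;
};

static int64_t fake_poll(uint64_t id, void *ctx)
{
	struct fake_fw *fw = ctx;

	(void)id;
	return fw->remaining-- > 0 ? 10 : 0;
}

static void fake_delay(int64_t ms, void *ctx)
{
	struct fake_fw *fw = ctx;

	fw->waited_ms += ms;	/* record instead of sleeping */
}
```

Passing the initial return code into the loop, as the new pnv_pci_poll(id, rc, NULL) call site does, lets the caller drop its own "if (rc < 0) goto out" check.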
Re: [PATCH v8 30/45] powerpc/pci: Delay populating pdn
On Tue, Apr 19, 2016 at 06:19:20PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>The pdn (struct pci_dn) instances are allocated from memblock or
>>bootmem when creating PCI controller (hoses) in setup_arch(). PCI
>>hotplug, which will be supported by subsequent patches, releases
>>PCI device nodes and their corresponding pdn on unplugging event.
>>The memory chunks for pdn instances allocated from memblock or
>>bootmem are hard to reuse after being released.
>>
>>This delays creating pdn by pci_devs_phb_init() from setup_arch()
>>to core_initcall() so that they are allocated from slab. The memory
>>consumed by pdn can be released to system without problem during
>>PCI unplugging time. It indicates that pci_dn is unavailable in
>>setup_arch() and the fixup on pdn (like AGP's) can't be carried
>>out that time. We have to do that in ppc_md.pcibios_root_bridge_prepare()
>>on maple/pasemi/powermac platforms where/when the pdn is available.
>>
>>Meanwhile, the EEH device is created when pdn is populated,
>>meaning pdn and EEH device have same life cycle. In turn, we needn't
>>call eeh_dev_init() to create EEH device explicitly.
>>
>>Signed-off-by: Gavin Shan
>
>
>Uff. It would not hurt to mention that pcibios_root_bridge_prepare is called
>from subsys_initcall() which is executed after core_initcall() so the code
>flow does not change.
>

Yes, will do in the next revision.

>Have you checked if there is anything in between
>core_initcall(pci_devs_phb_init) and subsys_initcall(pcibios_init) which
>might need device tree nodes? For example, subsys_initcall(pcibios_init)
>calls (eventually) pnv_pci_ioda_fixup(), if we are unlucky and pcibios_init()
>(and therefore pnv_pci_ioda_fixup() or what pseries/others do) is called
>before pcibios_init() - won't we crash or something?
>

I don't follow what you are asking. Device-tree nodes (struct device_node)
are always there; this patch doesn't affect them.

Perhaps you were talking about pdn (PCI_DN). If so, this patch delays
creating pdn from setup_arch() to core_initcall(pci_devs_phb_init). I don't
see anything that needs pdn between setup_arch() and core_initcall(). The
change introduced for the powermac/pasemi platforms is: fixing up the child
pdns of the specific PHB's pdn moves from setup_arch() to
subsys_initcall(pcibios_init). I don't see anything between them that needs
the fixed-up pdns.

I don't understand how pcibios_init() could be called before pcibios_init()
in your context. Sorry for my bad English. Perhaps you're asking about the
calling sequence of core_initcall() and subsys_initcall()? If so, they're
defined as below:

#define core_initcall(fn)	__define_initcall(fn, 1)
#define subsys_initcall(fn)	__define_initcall(fn, 4)

>
>--
>Alexey
>
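The ordering argument above rests on initcall levels: level 1 (core) runs before level 4 (subsys), regardless of registration order. The following user-space sketch simulates that with a plain table. The kernel itself uses linker sections rather than a table, so struct initcall, calls[] and run_initcalls() are illustrative assumptions; only the level numbers and the two function names come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative model of an initcall entry: a level and a name. */
struct initcall {
	int level;		/* 0 (pure) .. 7 (late) */
	const char *name;
};

/* Registered in arbitrary order, as happens across object files. */
static const struct initcall calls[] = {
	{ 4, "pcibios_init" },		/* subsys_initcall */
	{ 1, "pci_devs_phb_init" },	/* core_initcall */
};

/*
 * "Run" every initcall level by level and record the resulting order.
 * Returns the number of entries written to order[].
 */
static size_t run_initcalls(const struct initcall *tbl, size_t n,
			    const char **order)
{
	size_t done = 0;

	assert(tbl && order);

	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < n; i++)
			if (tbl[i].level == level)
				order[done++] = tbl[i].name;

	return done;
}
```

In this model pci_devs_phb_init always lands before pcibios_init, which is the property the patch relies on so that pdn instances exist by the time the subsys-level PCI setup runs.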
Re: [PATCH v8 45/45] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
On Tue, 19 Apr 2016 20:36:48 Alexey Kardashevskiy wrote: > On 02/17/2016 02:44 PM, Gavin Shan wrote: > > This adds standalone driver to support PCI hotplug for PowerPC PowerNV > > platform that runs on top of skiboot firmware. The firmware identifies > > hotpluggable slots and marked their device tree node with proper > > "ibm,slot-pluggable" and "ibm,reset-by-firmware". The driver scans > > device tree nodes to create/register PCI hotplug slot accordingly. > > > > The PCI slots are organized in fashion of tree, which means one > > PCI slot might have parent PCI slot and parent PCI slot possibly > > contains multiple child PCI slots. At the plugging time, the parent > > PCI slot is populated before its children. The child PCI slots are > > removed before their parent PCI slot can be removed from the system. > > > > If the skiboot firmware doesn't support slot status retrieval, the PCI > > slot device node shouldn't have property "ibm,reset-by-firmware". In > > that case, none of valid PCI slots will be detected from device tree. > > The skiboot firmware doesn't export the capability to access attention > > LEDs yet and it's something for TBD. > > > > Signed-off-by: Gavin Shan> > Acked-by: Bjorn Helgaas > > --- > > drivers/pci/hotplug/Kconfig | 12 + > > drivers/pci/hotplug/Makefile | 3 + > > drivers/pci/hotplug/pnv_php.c | 870 > > ++ > > 3 files changed, 885 insertions(+) > > create mode 100644 drivers/pci/hotplug/pnv_php.c > > > > diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig > > index df8caec..167c8ce 100644 > > --- a/drivers/pci/hotplug/Kconfig > > +++ b/drivers/pci/hotplug/Kconfig > > @@ -113,6 +113,18 @@ config HOTPLUG_PCI_SHPC > > > > When in doubt, say N. 
> > > > +config HOTPLUG_PCI_POWERNV > > + tristate "PowerPC PowerNV PCI Hotplug driver" > > + depends on PPC_POWERNV && EEH > > + help > > + Say Y here if you run PowerPC PowerNV platform that supports > > + PCI Hotplug > > + > > + To compile this driver as a module, choose M here: the > > + module will be called pnv-php. > > + > > + When in doubt, say N. > > + > > config HOTPLUG_PCI_RPA > > tristate "RPA PCI Hotplug driver" > > depends on PPC_PSERIES && EEH > > diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile > > index b616e75..e33cdda 100644 > > --- a/drivers/pci/hotplug/Makefile > > +++ b/drivers/pci/hotplug/Makefile > > @@ -14,6 +14,7 @@ obj-$(CONFIG_HOTPLUG_PCI_PCIE)+= pciehp.o > > obj-$(CONFIG_HOTPLUG_PCI_CPCI_ZT5550) += cpcihp_zt5550.o > > obj-$(CONFIG_HOTPLUG_PCI_CPCI_GENERIC)+= cpcihp_generic.o > > obj-$(CONFIG_HOTPLUG_PCI_SHPC)+= shpchp.o > > +obj-$(CONFIG_HOTPLUG_PCI_POWERNV) += pnv-php.o > > obj-$(CONFIG_HOTPLUG_PCI_RPA) += rpaphp.o > > obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR) += rpadlpar_io.o > > obj-$(CONFIG_HOTPLUG_PCI_SGI) += sgi_hotplug.o > > @@ -50,6 +51,8 @@ ibmphp-objs := ibmphp_core.o \ > > acpiphp-objs := acpiphp_core.o \ > > acpiphp_glue.o > > > > +pnv-php-objs := pnv_php.o > > + > > rpaphp-objs := rpaphp_core.o \ > > rpaphp_pci.o\ > > rpaphp_slot.o > > diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c > > new file mode 100644 > > index 000..364ec36 > > --- /dev/null > > +++ b/drivers/pci/hotplug/pnv_php.c > > @@ -0,0 +1,870 @@ > > +/* > > + * PCI Hotplug Driver for PowerPC PowerNV platform. > > + * > > + * Copyright Gavin Shan, IBM Corporation 2015. > > + * > > + * This program is free software; you can redistribute it and/or modify > > + * it under the terms of the GNU General Public License as published by > > + * the Free Software Foundation; either version 2 of the License, or > > + * (at your option) any later version. 
> > + */ > > + > > +#include > > +#include > > +#include > > +#include > > + > > +#include > > +#include > > +#include > > + > > +#define DRIVER_VERSION "0.1" > > +#define DRIVER_AUTHOR "Gavin Shan, IBM Corporation" > > +#define DRIVER_DESC"PowerPC PowerNV PCI Hotplug Driver" > > + > > +struct pnv_php_slot { > > + struct hotplug_slot slot; > > + struct hotplug_slot_infoslot_info; > > + uint64_tid; > > + char*name; > > + int slot_no; > > + struct kref kref; > > +#define PNV_PHP_STATE_INITIALIZED 0 > > +#define PNV_PHP_STATE_REGISTERED 1 > > +#define PNV_PHP_STATE_POPULATED2 > > + int state; > > + struct device_node *dn; > > + struct pci_dev *pdev; > > + struct pci_bus *bus; > > + bool
Re: [PATCH v8 35/45] powerpc/powernv: Fundamental reset in pnv_pci_reset_secondary_bus()
On Tue, Apr 19, 2016 at 07:04:19PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>In pnv_pci_reset_secondary_bus(), we should issue fundamental reset
>>if any one subordinate device of the specified bus is requesting that.
>>Otherwise, the device might not come up after the reset.
>>
>>Signed-off-by: Gavin Shan
>
>
>Reviewed-by: Alexey Kardashevskiy
>
>
>Out of curiosity - what does "fundamental" reset actually do?
>

Please refer to the skiboot patches: it powers the target slot off and on.

>
>>---
>> arch/powerpc/platforms/powernv/eeh-powernv.c | 21 ++++++++++++++++++++-
>> 1 file changed, 20 insertions(+), 1 deletion(-)
>>
>>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>index 593b8dc..c7454ba 100644
>>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>@@ -866,9 +866,28 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
>> 	return 0;
>> }
>>
>>+static int pnv_pci_dev_reset_type(struct pci_dev *pdev, void *data)
>>+{
>>+	int *freset = data;
>>+
>>+	/*
>>+	 * Stop the iteration immediately if there has any one
>>+	 * PCI device requesting fundamental reset.
>>+	 */
>>+	*freset |= pdev->needs_freset;
>>+	return *freset;
>>+}
>>+
>> void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
>> {
>>-	pnv_eeh_bridge_reset(dev, EEH_RESET_HOT);
>>+	int option, freset = 0;
>>+
>>+	if (dev->subordinate)
>>+		pci_walk_bus(dev->subordinate,
>>+			     pnv_pci_dev_reset_type, &freset);
>>+
>>+	option = freset ? EEH_RESET_FUNDAMENTAL : EEH_RESET_HOT;
>>+	pnv_eeh_bridge_reset(dev, option);
>> 	pnv_eeh_bridge_reset(dev, EEH_RESET_DEACTIVATE);
>> }
>>
>>
>
>
>--
>Alexey
>
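The callback in the quoted patch relies on the walker stopping as soon as a callback returns non-zero. A small user-space sketch of that pattern follows. struct fake_dev, walk_devices() and dev_reset_type() are hypothetical stand-ins for struct pci_dev, pci_walk_bus() and pnv_pci_dev_reset_type(), and a flat array replaces the bus hierarchy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct fake_dev {
	int needs_freset;
};

typedef int (*walk_cb)(struct fake_dev *dev, void *data);

/*
 * Call cb for each device until it returns non-zero.
 * Returns how many devices were visited.
 */
static size_t walk_devices(struct fake_dev *devs, size_t n,
			   walk_cb cb, void *data)
{
	size_t visited = 0;

	assert(cb != NULL);

	for (size_t i = 0; i < n; i++) {
		visited++;
		if (cb(&devs[i], data))
			break;	/* callback asked to stop */
	}

	return visited;
}

/* Same shape as the callback in the patch: latch the flag, stop once set. */
static int dev_reset_type(struct fake_dev *dev, void *data)
{
	int *freset = data;

	*freset |= dev->needs_freset;
	return *freset;
}
```

With three devices of which only the second needs a fundamental reset, the walk visits two devices and leaves the flag set, so the caller can pick the fundamental reset without scanning the rest of the bus.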
Re: [PATCH] cxl: Increase timeout for detection of AFU mmio hang
Acked-by: Ian Munsie
[PATCH] powerpc/512x: clk: Remove CLK_IS_ROOT
This flag is a no-op now (see commit 47b0eeb3dc8a "clk: Deprecate
CLK_IS_ROOT", 2016-02-02) so remove it.

Cc: Gerhard Sittig
Signed-off-by: Stephen Boyd
---
 arch/powerpc/platforms/512x/clock-commonclk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/512x/clock-commonclk.c b/arch/powerpc/platforms/512x/clock-commonclk.c
index c50ea76ba66c..6081fbd75330 100644
--- a/arch/powerpc/platforms/512x/clock-commonclk.c
+++ b/arch/powerpc/platforms/512x/clock-commonclk.c
@@ -221,7 +221,7 @@ static bool soc_has_mclk_mux0_canin(void)
 /* convenience wrappers around the common clk API */
 static inline struct clk *mpc512x_clk_fixed(const char *name, int rate)
 {
-	return clk_register_fixed_rate(NULL, name, NULL, CLK_IS_ROOT, rate);
+	return clk_register_fixed_rate(NULL, name, NULL, 0, rate);
 }

 static inline struct clk *mpc512x_clk_factor(
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Re: [PATCH v8 29/45] powerpc/pci: Export pci_traverse_device_nodes()
On Tue, Apr 19, 2016 at 03:51:03PM +1000, Alexey Kardashevskiy wrote: >On 02/17/2016 02:44 PM, Gavin Shan wrote: >>This renames traverse_pci_devices() to pci_traverse_device_nodes(). >>The function traverses all subordinate device nodes of the specified >>one. Also, below cleanup applied to the function. No logical changes >>introduced. >> >>* Rename "pre" to "fn". >>* Avoid assignment in if condition reported from checkpatch.pl. >> >>Signed-off-by: Gavin Shan>>--- >> arch/powerpc/include/asm/ppc-pci.h | 6 +++--- >> arch/powerpc/kernel/pci_dn.c | 15 ++- >> arch/powerpc/platforms/pseries/msi.c | 4 ++-- >> 3 files changed, 15 insertions(+), 10 deletions(-) >> >>diff --git a/arch/powerpc/include/asm/ppc-pci.h >>b/arch/powerpc/include/asm/ppc-pci.h >>index ca0c5bf..8753e4e 100644 >>--- a/arch/powerpc/include/asm/ppc-pci.h >>+++ b/arch/powerpc/include/asm/ppc-pci.h >>@@ -33,9 +33,9 @@ extern struct pci_dev *isa_bridge_pcidev; /* may be NULL >>if no ISA bus */ >> struct device_node; >> struct pci_dn; >> >>-typedef void *(*traverse_func)(struct device_node *me, void *data); > > > >Why removing this typedef? Typedef's are good. > >Anyway, > Could you please provide more details why it's good? I removed it because it was used for only once. > >Reviewed-by: Alexey Kardashevskiy > > > > >>-void *traverse_pci_devices(struct device_node *start, traverse_func pre, >>- void *data); >>+void *pci_traverse_device_nodes(struct device_node *start, >>+ void *(*fn)(struct device_node *, void *), >>+ void *data); >> void *traverse_pci_dn(struct pci_dn *root, >>void *(*fn)(struct pci_dn *, void *), >>void *data); >>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c >>index ce10281..ecdccce 100644 >>--- a/arch/powerpc/kernel/pci_dn.c >>+++ b/arch/powerpc/kernel/pci_dn.c >>@@ -372,8 +372,9 @@ EXPORT_SYMBOL_GPL(pci_remove_device_node_info); >> * one of these nodes we also assume its siblings are non-pci for >> * performance. 
>> */ >>-void *traverse_pci_devices(struct device_node *start, traverse_func pre, >>- void *data) >>+void *pci_traverse_device_nodes(struct device_node *start, >>+ void *(*fn)(struct device_node *, void *), >>+ void *data) >> { >> struct device_node *dn, *nextdn; >> void *ret; >>@@ -388,8 +389,11 @@ void *traverse_pci_devices(struct device_node *start, >>traverse_func pre, >> if (classp) >> class = of_read_number(classp, 1); >> >>- if (pre && ((ret = pre(dn, data)) != NULL)) >>- return ret; >>+ if (fn) { >>+ ret = fn(dn, data); >>+ if (ret) >>+ return ret; >>+ } >> >> /* If we are a PCI bridge, go down */ >> if (dn->child && ((class >> 8) == PCI_CLASS_BRIDGE_PCI || >>@@ -411,6 +415,7 @@ void *traverse_pci_devices(struct device_node *start, >>traverse_func pre, >> } >> return NULL; >> } >>+EXPORT_SYMBOL_GPL(pci_traverse_device_nodes); >> >> static struct pci_dn *pci_dn_next_one(struct pci_dn *root, >>struct pci_dn *pdn) >>@@ -487,7 +492,7 @@ void pci_devs_phb_init_dynamic(struct pci_controller *phb) >> } >> >> /* Update dn->phb ptrs for new phb and children devices */ >>- traverse_pci_devices(dn, add_pdn, phb); >>+ pci_traverse_device_nodes(dn, add_pdn, phb); >> } >> >> /** >>diff --git a/arch/powerpc/platforms/pseries/msi.c >>b/arch/powerpc/platforms/pseries/msi.c >>index 272e9ec..543a638 100644 >>--- a/arch/powerpc/platforms/pseries/msi.c >>+++ b/arch/powerpc/platforms/pseries/msi.c >>@@ -305,7 +305,7 @@ static int msi_quota_for_device(struct pci_dev *dev, int >>request) >> memset(, 0, sizeof(struct msi_counts)); >> >> /* Work out how many devices we have below this PE */ >>- traverse_pci_devices(pe_dn, count_non_bridge_devices, ); >>+ pci_traverse_device_nodes(pe_dn, count_non_bridge_devices, ); >> >> if (counts.num_devices == 0) { >> pr_err("rtas_msi: found 0 devices under PE for %s\n", >>@@ -320,7 +320,7 @@ static int msi_quota_for_device(struct pci_dev *dev, int >>request) >> /* else, we have some more calculating to do */ >> counts.requestor = 
pci_device_to_OF_node(dev); >> counts.request = request; >>- traverse_pci_devices(pe_dn, count_spare_msis, ); >>+ pci_traverse_device_nodes(pe_dn, count_spare_msis, ); >> >> /* If the quota isn't an integer multiple of the total, we can >> * use the remainder as spare MSIs for anyone that wants them. */ >> > > >-- >Alexey > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org
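To make the typedef question concrete, the sketch below shows a depth-first traversal driven by a callback that can stop the walk by returning a non-NULL pointer, with the callback type spelled once as a typedef. It is a simplified user-space illustration: struct node, traverse_nodes() and count_nodes() are hypothetical, the walk is recursive, and it descends into every child, whereas the kernel function is iterative and only descends below PCI bridges.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node. */
struct node {
	const char *name;
	struct node *child;	/* first child */
	struct node *sibling;	/* next sibling */
};

/* The typedef form under discussion; the patch spells this type inline. */
typedef void *(*traverse_func)(struct node *me, void *data);

/*
 * Visit every node below 'start' depth first. A non-NULL return value
 * from fn stops the traversal and is handed back to the caller.
 */
static void *traverse_nodes(struct node *start, traverse_func fn, void *data)
{
	assert(start != NULL);

	for (struct node *dn = start->child; dn; dn = dn->sibling) {
		void *ret;

		if (fn) {
			ret = fn(dn, data);
			if (ret)
				return ret;
		}

		ret = traverse_nodes(dn, fn, data);
		if (ret)
			return ret;
	}

	return NULL;
}

/* Example callback: count visited nodes, never stop early. */
static void *count_nodes(struct node *me, void *data)
{
	(void)me;
	++*(int *)data;
	return NULL;
}
```

Either spelling of the callback type compiles to the same thing; the typedef mainly pays off when the type is repeated in several prototypes, which is the trade-off raised in the review.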
Re: [PATCH v8 28/45] powerpc/pci: Introduce pci_remove_device_node_info()
On Tue, Apr 19, 2016 at 03:48:26PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>This implements and exports pci_remove_device_node_info(). It's
>>used to remove the pdn (struct pci_dn) for the indicated device
>>node. The function is going to be used by PowerNV PCI hotplug
>>driver.
>>
>>Signed-off-by: Gavin Shan
>
>Kind of strange that there is no such helper for pseries, is there?
>

I couldn't find one, actually. If you find one, please let me know, thanks!

>
>Reviewed-by: Alexey Kardashevskiy
>
>
>>---
>> arch/powerpc/include/asm/pci-bridge.h |  1 +
>> arch/powerpc/kernel/pci_dn.c          | 23 +++++++++++++++++++++++
>> 2 files changed, 24 insertions(+)
>>
>>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>>index 72a9d4e..c6310e2 100644
>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>@@ -240,6 +240,7 @@ extern struct pci_dn *add_dev_pci_data(struct pci_dev *pdev);
>> extern void remove_dev_pci_data(struct pci_dev *pdev);
>> extern struct pci_dn *pci_add_device_node_info(struct pci_controller *hose,
>> 					       struct device_node *dn);
>>+extern void pci_remove_device_node_info(struct device_node *dn);
>>
>> static inline int pci_device_from_OF_node(struct device_node *np,
>> 					  u8 *bus, u8 *devfn)
>>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
>>index 0a249ff..ce10281 100644
>>--- a/arch/powerpc/kernel/pci_dn.c
>>+++ b/arch/powerpc/kernel/pci_dn.c
>>@@ -331,6 +331,29 @@ struct pci_dn *pci_add_device_node_info(struct pci_controller *hose,
>> }
>> EXPORT_SYMBOL_GPL(pci_add_device_node_info);
>>
>>+void pci_remove_device_node_info(struct device_node *dn)
>>+{
>>+	struct pci_dn *pdn = dn ? PCI_DN(dn) : NULL;
>>+#ifdef CONFIG_EEH
>>+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>>+
>>+	if (edev)
>>+		edev->pdn = NULL;
>>+#endif
>>+
>>+	if (!pdn)
>>+		return;
>>+
>>+	WARN_ON(!list_empty(&pdn->child_list));
>>+	list_del(&pdn->list);
>>+	if (pdn->parent)
>>+		of_node_put(pdn->parent->node);
>>+
>>+	dn->data = NULL;
>>+	kfree(pdn);
>>+}
>>+EXPORT_SYMBOL_GPL(pci_remove_device_node_info);
>>+
>> /*
>>  * Traverse a device tree stopping each PCI device in the tree.
>>  * This is done depth first. As each node is processed, a "pre"
>>
>
>
>--
>Alexey
>
Re: [PATCH v8 24/45] powerpc/pci: Rename pcibios_{add,remove}_pci_devices()
On Tue, Apr 19, 2016 at 03:28:36PM +1000, Alexey Kardashevskiy wrote: >On 02/17/2016 02:44 PM, Gavin Shan wrote: >>This renames pcibios_{add,remove}_pci_devices() to avoid conflicts >>with names of the weak functions in PCI subsystem, which have the >>prefix "pcibios". No logical changes introduced. >> >>Signed-off-by: Gavin Shan>>--- >> arch/powerpc/include/asm/pci-bridge.h | 4 ++-- >> arch/powerpc/kernel/eeh_driver.c | 12 ++-- >> arch/powerpc/kernel/pci-hotplug.c | 15 +++ >> drivers/pci/hotplug/rpadlpar_core.c | 2 +- >> drivers/pci/hotplug/rpaphp_core.c | 4 ++-- >> drivers/pci/hotplug/rpaphp_pci.c | 2 +- >> 6 files changed, 19 insertions(+), 20 deletions(-) >> >>diff --git a/arch/powerpc/include/asm/pci-bridge.h >>b/arch/powerpc/include/asm/pci-bridge.h >>index 4dd6ef4..c817f38 100644 >>--- a/arch/powerpc/include/asm/pci-bridge.h >>+++ b/arch/powerpc/include/asm/pci-bridge.h >>@@ -263,10 +263,10 @@ static inline struct eeh_dev *pdn_to_eeh_dev(struct >>pci_dn *pdn) >> extern struct pci_bus *pcibios_find_pci_bus(struct device_node *dn); >> >> /** Remove all of the PCI devices under this bus */ >>-extern void pcibios_remove_pci_devices(struct pci_bus *bus); >>+extern void pci_remove_pci_devices(struct pci_bus *bus); > > >pci_lala_pci_lala() ("pci" is used twice) looks weird, if the prefix is >"pci", what other device types can they handle?... > >May be pcihp_add_devices(), pcihp_remove_devices() as these as defined in >pci-hotplug.c? > I assume you're talking about drivers/pci/hotplug/pci_hotplug_core.c. pci_hotplug_core.c uses pci_hp_ prefix rather than pcihp_. I will rename them to pci_hp_*() in next revision. gwshan@gwshan:~/sandbox/linux$ find . 
-name pci-hotplug.c ./arch/powerpc/kernel/pci-hotplug.c gwshan@gwshan:~/sandbox/linux$ grep pci*hp arch/powerpc/kernel/pci-hotplug.c > >> >> /** Discover new pci devices under this bus, and add them */ >>-extern void pcibios_add_pci_devices(struct pci_bus *bus); >>+extern void pci_add_pci_devices(struct pci_bus *bus); >> >> >> extern void isa_bridge_find_early(struct pci_controller *hose); >>diff --git a/arch/powerpc/kernel/eeh_driver.c >>b/arch/powerpc/kernel/eeh_driver.c >>index fb6207d..59e53fe 100644 >>--- a/arch/powerpc/kernel/eeh_driver.c >>+++ b/arch/powerpc/kernel/eeh_driver.c >>@@ -621,7 +621,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct >>pci_bus *bus, >> * We don't remove the corresponding PE instances because >> * we need the information afterwords. The attached EEH >> * devices are expected to be attached soon when calling >>- * into pcibios_add_pci_devices(). >>+ * into pci_add_pci_devices(). >> */ >> eeh_pe_state_mark(pe, EEH_PE_KEEP); >> if (bus) { >>@@ -630,7 +630,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct >>pci_bus *bus, >> } else { >> eeh_pe_state_clear(pe, EEH_PE_PRI_BUS); >> pci_lock_rescan_remove(); >>- pcibios_remove_pci_devices(bus); >>+ pci_remove_pci_devices(bus); >> pci_unlock_rescan_remove(); >> } >> } else if (frozen_bus) { >>@@ -681,7 +681,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct >>pci_bus *bus, >> if (pe->type & EEH_PE_VF) >> eeh_add_virt_device(edev, NULL); >> else >>- pcibios_add_pci_devices(bus); >>+ pci_add_pci_devices(bus); >> } else if (frozen_bus && rmv_data->removed) { >> pr_info("EEH: Sleep 5s ahead of partial hotplug\n"); >> ssleep(5); >>@@ -691,7 +691,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct >>pci_bus *bus, >> if (pe->type & EEH_PE_VF) >> eeh_add_virt_device(edev, NULL); >> else >>- pcibios_add_pci_devices(frozen_bus); >>+ pci_add_pci_devices(frozen_bus); >> } >> eeh_pe_state_clear(pe, EEH_PE_KEEP); >> >>@@ -896,7 +896,7 @@ perm_error: >> 
eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED); >> >> pci_lock_rescan_remove(); >>- pcibios_remove_pci_devices(frozen_bus); >>+ pci_remove_pci_devices(frozen_bus); >> pci_unlock_rescan_remove(); >> } >> } >>@@ -981,7 +981,7 @@ static void eeh_handle_special_event(void) >> bus = eeh_pe_bus_get(phb_pe); >> eeh_pe_dev_traverse(pe, >> eeh_report_failure, NULL); >>- pcibios_remove_pci_devices(bus); >>+ pci_remove_pci_devices(bus); >> } >> pci_unlock_rescan_remove(); >> } >>diff --git a/arch/powerpc/kernel/pci-hotplug.c
Re: [PATCH v8 22/45] powerpc/powernv/ioda1: Support releasing IODA1 TCE table
On Tue, Apr 19, 2016 at 02:28:51PM +1000, Alexey Kardashevskiy wrote: >On 02/17/2016 02:44 PM, Gavin Shan wrote: >>pnv_pci_ioda_table_free_pages() can be reused to release the IODA1 >>TCE table when releasing IODA1 PE in subsequent patches. >> >>This renames the following functions to support releasing IODA1 TCE >>table: pnv_pci_ioda2_table_free_pages() to pnv_pci_ioda_table_free_pages(), >>pnv_pci_ioda2_table_do_free_pages() to pnv_pci_ioda_table_do_free_pages(). >>No logical changes introduced. > >I can only see renaming here but it seems (from >IODA_architecture_04-14-2008.pdf) that IODA1 does not support multi-level TCE >tables in the way IODA2 does. > Note that the change was proposed by you in last round. Yes, TVE on P7IOC doesn't support multiple levels of TCE tables. In this case, we will always have "tbl->it_indirect_levels" to 1, right? >> >>Signed-off-by: Gavin Shan>>--- >> arch/powerpc/platforms/powernv/pci-ioda.c | 18 +- >> 1 file changed, 9 insertions(+), 9 deletions(-) >> >>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >>b/arch/powerpc/platforms/powernv/pci-ioda.c >>index d360607..077f9db 100644 >>--- a/arch/powerpc/platforms/powernv/pci-ioda.c >>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c >>@@ -51,7 +51,7 @@ >> #define POWERNV_IOMMU_DEFAULT_LEVELS1 >> #define POWERNV_IOMMU_MAX_LEVELS5 >> >>-static void pnv_pci_ioda2_table_free_pages(struct iommu_table *tbl); >>+static void pnv_pci_ioda_table_free_pages(struct iommu_table *tbl); >> >> static void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level, >> const char *fmt, ...) 
>>@@ -1352,7 +1352,7 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev >>*dev, struct pnv_ioda_pe >> iommu_group_put(pe->table_group.group); >> BUG_ON(pe->table_group.group); >> } >>- pnv_pci_ioda2_table_free_pages(tbl); >>+ pnv_pci_ioda_table_free_pages(tbl); >> iommu_free_table(tbl, of_node_full_name(dev->dev.of_node)); >> } >> >>@@ -1946,7 +1946,7 @@ static void pnv_ioda2_tce_free(struct iommu_table *tbl, >>long index, >> >> static void pnv_ioda2_table_free(struct iommu_table *tbl) >> { >>- pnv_pci_ioda2_table_free_pages(tbl); >>+ pnv_pci_ioda_table_free_pages(tbl); >> iommu_free_table(tbl, "pnv"); >> } >> >>@@ -2448,7 +2448,7 @@ static __be64 *pnv_pci_ioda2_table_do_alloc_pages(int >>nid, unsigned shift, >> return addr; >> } >> >>-static void pnv_pci_ioda2_table_do_free_pages(__be64 *addr, >>+static void pnv_pci_ioda_table_do_free_pages(__be64 *addr, >> unsigned long size, unsigned level); >> >> static long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset, >>@@ -2487,7 +2487,7 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid, >>__u64 bus_offset, >> * release partially allocated table. 
>> */ >> if (offset < tce_table_size) { >>- pnv_pci_ioda2_table_do_free_pages(addr, >>+ pnv_pci_ioda_table_do_free_pages(addr, >> 1ULL << (level_shift - 3), levels - 1); >> return -ENOMEM; >> } >>@@ -2505,7 +2505,7 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid, >>__u64 bus_offset, >> return 0; >> } >> >>-static void pnv_pci_ioda2_table_do_free_pages(__be64 *addr, >>+static void pnv_pci_ioda_table_do_free_pages(__be64 *addr, >> unsigned long size, unsigned level) >> { >> const unsigned long addr_ul = (unsigned long) addr & >>@@ -2521,7 +2521,7 @@ static void pnv_pci_ioda2_table_do_free_pages(__be64 >>*addr, >> if (!(hpa & (TCE_PCI_READ | TCE_PCI_WRITE))) >> continue; >> >>- pnv_pci_ioda2_table_do_free_pages(__va(hpa), size, >>+ pnv_pci_ioda_table_do_free_pages(__va(hpa), size, >> level - 1); >> } >> } >>@@ -2529,7 +2529,7 @@ static void pnv_pci_ioda2_table_do_free_pages(__be64 >>*addr, >> free_pages(addr_ul, get_order(size << 3)); >> } >> >>-static void pnv_pci_ioda2_table_free_pages(struct iommu_table *tbl) >>+static void pnv_pci_ioda_table_free_pages(struct iommu_table *tbl) >> { >> const unsigned long size = tbl->it_indirect_levels ? >> tbl->it_level_size : tbl->it_size; >>@@ -2537,7 +2537,7 @@ static void pnv_pci_ioda2_table_free_pages(struct >>iommu_table *tbl) >> if (!tbl->it_size) >> return; >> >>- pnv_pci_ioda2_table_do_free_pages((__be64 *)tbl->it_base, size, >>+ pnv_pci_ioda_table_do_free_pages((__be64 *)tbl->it_base, size, >> tbl->it_indirect_levels); >> } >> >> > > >-- >Alexey > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v8 21/45] powerpc/powernv: Create PEs at PCI hot plugging time
On Tue, Apr 19, 2016 at 02:16:42PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>Currently, the PEs and their associated resources are assigned
>>in ppc_md.pcibios_fixup() except those used by SRIOV VFs.
>
>But this new code does not affect IOV and VF's PEs will still be created
>somewhere else rather than pnv_pci_setup_bridge()?
>

Correct. VF PEs cannot be created in pnv_pci_setup_bridge() as the PF's
IOV capability isn't enabled at that point.

>
>>The
>>function is called once after PCI probing and resource
>>assignment are completed. So it isn't hotplug friendly.
>>
>>This creates PEs dynamically by ppc_md.pcibios_setup_bridge(), which
>>is called on the event during system bootup and PCI hotplug: updating
>>PCI bridge's windows after resource assignment/reassignment are done.
>>For partial hotplug case, where not all PCI devices belonging to the
>>PE are unplugged and plugged again, we just need unbinding/binding
>>the affected PCI devices with the corresponding PE without creating
>>new one.
>>
>>As there is no upstream bridge for root bus that needs to be covered
>>by PE, we have to create PE for root bus in ppc_md.pcibios_setup_bridge()
>>before any other PEs can be created, as PE for root bus is the ancestor
>>to anyone else.
>
>We did not need a root bus PE before? What is the other PE reserved for?
>Comments only say "reserved"...
>

No, a PE for the root bus was needed before as well. The other PEs can be
for the PCI buses originating from the root port and the subordinate domains.

>>
>>Also, the windows of root port or the upstream port of PCIe switch behind
>>root port are extended to be PHB's apertures to accommodate the additional
>>resources needed by newly plugged devices based on the fact: hotpluggable
>>slot is behind root port or downstream port of the PCIe switch behind
>>root port. The extension for those PCI bridges' windows is done in
>>ppc_md.pcibios_setup_bridge() as well.
>
>
>This patch seems to be doing way too many things, hard to follow.
>
>Could you please split the patch into smaller chunks? For example (you can do
>it totally differently):
>- move pnv_pci_ioda_setup_opal_tce_kill()
>- move PE creation from pnv_pci_ioda_fixup() to pnv_pci_setup_bridge();
>- add pnv_pci_fixup_bridge_resources()
>- add an extra reserved PE for the root bus (and all this magic with
>root_pe_idx/root_pe_populated)
>- ...
>

I'll evaluate it later. It's always nice to have small patches. Thanks for
the comments.

>
>
>
>--
>Alexey
>
Re: [PATCH v8 20/45] powerpc/powernv: Allocate PE# in reverse order
On Tue, Apr 19, 2016 at 01:07:59PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>PE number for one particular PE can be allocated dynamically or
>>reserved according to the consumed M64 (64-bits prefetchable)
>>segments of the PE. The M64 resources, and hence their segments
>>and PE number are assigned/reserved in ascending order. The PE
>>numbers are allocated dynamically in ascending order as well.
>>It's not a problem as the PE numbers are reserved and then
>>allocated all at once in fine order. However, it will introduce
>>conflicts when PCI hotplug is supported: the PE number to be
>>reserved for newly added PE might have been assigned.
>>
>>To resolve above conflicts, this forces the PE number to be
>>allocated dynamically in reverse order. With this patch applied,
>>the PE numbers are reserved in ascending order, but allocated
>>dynamically in reverse order.
>
>
>The patch is probably ok, the commit log is not - I do not follow it. Some
>PEs are reserved (for what? why does the absolute PE number matter? put it in
>the commit log), that means that the corresponding bits in pe_alloc[] should
>be set so when you will be allocating PEs for a just plugged device, you
>won't pick them and you will pick free ones, and the order should not matter.
>I would think that "reservation" happens once at the boot time so you set
>"used" bits for the reserved PEs then and after that the dynamic allocator
>will skip them.
>

I will enhance the commit log in the next revision, perhaps picking part of
the words below:

On PHB3, there are 16 M64 BARs in hardware. The last one is split evenly
into 256 segments. Each segment can be associated with a fixed PE#
(segment#x <-> PE#x), which is how the hardware was designed. If a plugged
PE has M64 (64-bit prefetchable memory) resources, its PE# is equal to the
segment#. Otherwise, the PE# is allocated dynamically. The M64 resources
are assigned from the low end to the high end, meaning the reserved PE#s
(according to the M64 segments) grow from the low end to the high end. If
dynamic allocation also started from the low end, it would be quite likely
to hand out a PE# that should have been reserved because of an M64 segment.
That is the conflict the patch tries to resolve. The PE# reservation doesn't
happen only once at boot time because it's unknown how many PEs and how much
M64 resource will be hot added.

>
>>
>>Signed-off-by: Gavin Shan
>>---
>> arch/powerpc/platforms/powernv/pci-ioda.c | 14 ++
>> 1 file changed, 6 insertions(+), 8 deletions(-)
>>
>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
>>b/arch/powerpc/platforms/powernv/pci-ioda.c
>>index f182ca7..565725b 100644
>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>@@ -144,16 +144,14 @@ static void pnv_ioda_reserve_pe(struct pnv_phb *phb,
>>int pe_no)
>>
>> static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
>> {
>>-        unsigned long pe;
>>+        unsigned long pe = phb->ioda.total_pe_num - 1;
>>
>>-        do {
>>-                pe = find_next_zero_bit(phb->ioda.pe_alloc,
>>-                                        phb->ioda.total_pe_num, 0);
>>-                if (pe >= phb->ioda.total_pe_num)
>>-                        return NULL;
>>-        } while(test_and_set_bit(pe, phb->ioda.pe_alloc));
>>+        for (pe = phb->ioda.total_pe_num - 1; pe >= 0; pe--) {
>>+                if (!test_and_set_bit(pe, phb->ioda.pe_alloc))
>>+                        return pnv_ioda_init_pe(phb, pe);
>>+        }
>>
>>-        return pnv_ioda_init_pe(phb, pe);
>>+        return NULL;
>> }
>>
>> static void pnv_ioda_free_pe(struct pnv_ioda_pe *pe)
>>
>
>
>--
>Alexey
>
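For readers following the reserve-low/allocate-high idea discussed in this message, here is a minimal userspace sketch in C. It is not the kernel code: `pe_bitmap`, `test_and_set()`, `reserve_pe()`, `alloc_pe()` and the fixed `TOTAL_PE_NUM` of 256 are hypothetical names chosen for illustration, and the bit operation is a plain, non-atomic stand-in for the kernel's test_and_set_bit(). The loop counter is deliberately signed, because a `pe >= 0` test on an unsigned counter can never be false.

```c
#include <limits.h>
#include <stddef.h>

#define TOTAL_PE_NUM  256  /* assumption: matches the 256 segments mentioned above */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

struct pe_bitmap {
    unsigned long bits[(TOTAL_PE_NUM + BITS_PER_WORD - 1) / BITS_PER_WORD];
};

/* Non-atomic stand-in for test_and_set_bit(): sets the bit and
 * returns 1 if it was already set, 0 otherwise. */
static int test_and_set(struct pe_bitmap *map, unsigned int nr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    unsigned long *word = &map->bits[nr / BITS_PER_WORD];
    int old = (*word & mask) != 0;

    *word |= mask;
    return old;
}

/* Reserve a fixed PE# (segment#x <-> PE#x). Returns 0 on success,
 * -1 if the number is out of range or already taken. */
static int reserve_pe(struct pe_bitmap *map, unsigned int pe_no)
{
    if (pe_no >= TOTAL_PE_NUM)
        return -1;
    return test_and_set(map, pe_no) ? -1 : 0;
}

/* Allocate a PE# dynamically, scanning from the highest number down.
 * Returns the PE# or -1 when every number is in use. */
static long alloc_pe(struct pe_bitmap *map)
{
    long pe;  /* signed on purpose, so the loop can terminate */

    for (pe = TOTAL_PE_NUM - 1; pe >= 0; pe--) {
        if (!test_and_set(map, (unsigned int)pe))
            return pe;
    }
    return -1;
}
```

With this layout, reservations made for M64 segments fill the bitmap from the bottom while dynamic allocations fill it from the top, so the two only collide once the bitmap is nearly full.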
Re: [PATCH v8 18/45] powerpc/powernv: Increase PE# capacity
On Tue, Apr 19, 2016 at 12:02:23PM +1000, Alexey Kardashevskiy wrote: >On 02/17/2016 02:44 PM, Gavin Shan wrote: >>Each PHB maintains an array helping to translate 2-bytes Request >>ID (RID) to PE# with the assumption that PE# takes one byte, meaning >>that we can't have more than 256 PEs. However, pci_dn->pe_number >>already had 4-bytes for the PE#. >> >>This extends the PE# capacity for every PHB. After that, the PE number >>is represented by 4-bytes value. Then we can reuse IODA_INVALID_PE to >>check the PE# in phb->pe_rmap[] is valid or not. > > >This should be merged into "[PATCH v8 21/45] powerpc/powernv: Create PEs at >PCI hot plugging time" as it does not make sense alone (this patch does the >initialization but only 3 patches apart this default value is analyzed -> >hard to review). > Indeed, will move accordingly in next revision. >>Signed-off-by: Gavin Shan>>Reviewed-by: Daniel Axtens >>--- >> arch/powerpc/platforms/powernv/pci-ioda.c | 6 +- >> arch/powerpc/platforms/powernv/pci.h | 7 ++- >> 2 files changed, 7 insertions(+), 6 deletions(-) >> >>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >>b/arch/powerpc/platforms/powernv/pci-ioda.c >>index 59782fba..7800897 100644 >>--- a/arch/powerpc/platforms/powernv/pci-ioda.c >>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c >>@@ -757,7 +757,7 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, >>struct pnv_ioda_pe *pe) >> >> /* Clear the reverse map */ >> for (rid = pe->rid; rid < rid_end; rid++) >>- phb->ioda.pe_rmap[rid] = 0; >>+ phb->ioda.pe_rmap[rid] = IODA_INVALID_PE; >> >> /* Release from all parents PELT-V */ >> while (parent) { >>@@ -3387,6 +3387,10 @@ static void __init pnv_pci_init_ioda_phb(struct >>device_node *np, >> if (prop32) >> phb->ioda.reserved_pe_idx = be32_to_cpup(prop32); >> >>+ /* Invalidate RID to PE# mapping */ >>+ for (i = 0; i < ARRAY_SIZE(phb->ioda.pe_rmap); ++i) >>+ phb->ioda.pe_rmap[i] = IODA_INVALID_PE; >>+ >> /* Parse 64-bit MMIO range */ >> 
pnv_ioda_parse_m64_window(phb); >> >>diff --git a/arch/powerpc/platforms/powernv/pci.h >>b/arch/powerpc/platforms/powernv/pci.h >>index 350e630..928cf81 100644 >>--- a/arch/powerpc/platforms/powernv/pci.h >>+++ b/arch/powerpc/platforms/powernv/pci.h >>@@ -160,11 +160,8 @@ struct pnv_phb { >> struct list_headpe_list; >> struct mutexpe_list_mutex; >> >>- /* Reverse map of PEs, will have to extend if >>- * we are to support more than 256 PEs, indexed >>- * bus { bus, devfn } >>- */ >>- unsigned char pe_rmap[0x1]; >>+ /* Reverse map of PEs, indexed by {bus, devfn} */ >>+ int pe_rmap[0x1]; >> >> /* TCE cache invalidate registers (physical and >> * remapped) >> > > >-- >Alexey > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
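The reason for the wider element type and the explicit "invalid" value can be shown with a small standalone C sketch. Everything here is illustrative: `INVALID_PE` stands in for IODA_INVALID_PE (its real value is not shown in this thread), the table size of 0x10000 entries is an assumption that follows from a 2-byte RID, and the helper names are hypothetical.

```c
#include <stdint.h>

#define INVALID_PE   (-1)      /* hypothetical stand-in for IODA_INVALID_PE */
#define RMAP_ENTRIES 0x10000u  /* assumption: one entry per 2-byte RID */

/* Reverse map from Request ID to PE#, wide enough for PE# above 255. */
static int pe_rmap[RMAP_ENTRIES];

/* Build a RID from bus number (high byte) and devfn (low byte). */
static uint16_t make_rid(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)(((unsigned int)bus << 8) | devfn);
}

/* Mark every RID as unmapped; zero-fill would wrongly mean "PE#0". */
static void rmap_init(void)
{
    unsigned int i;

    for (i = 0; i < RMAP_ENTRIES; i++)
        pe_rmap[i] = INVALID_PE;
}

static void rmap_set(uint16_t rid, int pe_no)
{
    pe_rmap[rid] = pe_no;
}

static int rmap_lookup(uint16_t rid)
{
    return pe_rmap[rid];
}

static int rmap_is_mapped(uint16_t rid)
{
    return pe_rmap[rid] != INVALID_PE;
}
```

With a one-byte element and 0 as the cleared value, an unmapped RID and a RID mapped to PE#0 look identical, and a PE# of 256 or more cannot be stored at all. The sentinel plus an `int` element avoids both problems.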
Re: [PATCH v8 17/45] powerpc/powernv/ioda1: Improve DMA32 segment track
On Tue, Apr 19, 2016 at 11:50:10AM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>In current implementation, the DMA32 segments required by one specific
>>PE isn't calculated with the information held in the PE independently.
>>It conflicts with the PCI hotplug design: PE centralized, meaning the
>>PE's DMA32 segments should be calculated from the information held in
>>the PE independently.
>>
>>This introduces an array (@dma32_segmap) for every PHB to track the
>>DMA32 segment usage. Besides, this moves the logic calculating PE's
>>consumed DMA32 segments to pnv_pci_ioda1_setup_dma_pe() so that PE's
>>DMA32 segments are calculated/allocated from the information held in
>>the PE (DMA32 weight). Also the logic is improved: we try to allocate
>>as many DMA32 segments as we can. It's acceptable that number of DMA32
>>segments less than the expected number are allocated.
>>
>>Signed-off-by: Gavin Shan
>
>
>This DMA segments business was the reason why I have not even tried
>implementing DDW for POWER7 - it is way too different from POWER8 and there
>is no chance that anyone outside Ozlabs will ever try using this in practice;
>the same applies to PCI hotplug on POWER7.
>
>I am suggesting to ditch all IODA1 changes from this patchset as this code
>will hang around (unused) for maybe a year or so and then will be gone as
>p5ioc2.
>

As far as I know, some P7 boxes outside Ozlabs have the software stack. At
least, I was relying heavily on P7 box + PowerNV based Linux until September
of last year. My original thoughts are as below. If they're convincing, I can
drop some of the IODA1 changes, but obviously not all of them:

- In case customers want to use this combo (P7 box + PowerNV) for any reason.
- In case developers want to use this combo (P7 box + PowerNV) for any reason.
  For example, no P8 box can be found for one particular project, but an
  available P7 box is still OK for that.
- EEH supported on P7/P8 needs hotplug some cases: when hitting excessive failures, PCI devices and their platform resources (PE, DMA, M32/M64 mapping etc) should be purged. - Current implementation has P7/P8 mixed up to some extent which isn't so good as Ben pointed long time ago. It's impossible not to affect P7IOC piece if P8 piece is changed in order to support hotplug. >>--- >> arch/powerpc/platforms/powernv/pci-ioda.c | 111 >> +- >> arch/powerpc/platforms/powernv/pci.h | 7 +- >> 2 files changed, 66 insertions(+), 52 deletions(-) >> >>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >>b/arch/powerpc/platforms/powernv/pci-ioda.c >>index 0fc2309..59782fba 100644 >>--- a/arch/powerpc/platforms/powernv/pci-ioda.c >>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c >>@@ -2007,20 +2007,54 @@ static unsigned int >>pnv_pci_ioda_total_dma_weight(struct pnv_phb *phb) >> } >> >> static void pnv_pci_ioda1_setup_dma_pe(struct pnv_phb *phb, >>-struct pnv_ioda_pe *pe, >>-unsigned int base, >>-unsigned int segs) >>+struct pnv_ioda_pe *pe) >> { >> >> struct page *tce_mem = NULL; >> struct iommu_table *tbl; >>- unsigned int tce32_segsz, i; >>+ unsigned int weight, total_weight; >>+ unsigned int tce32_segsz, base, segs, i; >> int64_t rc; >> void *addr; >> >> /* XXX FIXME: Handle 64-bit only DMA devices */ >> /* XXX FIXME: Provide 64-bit DMA facilities & non-4K TCE tables etc.. */ >> /* XXX FIXME: Allocate multi-level tables on PHB3 */ >>+ total_weight = pnv_pci_ioda_total_dma_weight(phb); >>+ weight = pnv_pci_ioda_pe_dma_weight(pe); >>+ >>+ segs = (weight * phb->ioda.dma32_count) / total_weight; >>+ if (!segs) >>+ segs = 1; >>+ >>+ /* >>+ * Allocate contiguous DMA32 segments. We begin with the expected >>+ * number of segments. With one more attempt, the number of DMA32 >>+ * segments to be allocated is decreased by one until one segment >>+ * is allocated successfully. 
>>+ */ >>+ while (segs) { >>+ for (base = 0; base <= phb->ioda.dma32_count - segs; base++) { >>+ for (i = base; i < base + segs; i++) { >>+ if (phb->ioda.dma32_segmap[i] != >>+ IODA_INVALID_PE) >>+ break; >>+ } >>+ >>+ if (i >= base + segs) >>+ break; >>+ } >>+ >>+ if (i >= base + segs) >>+ break; >>+ >>+ segs--; >>+ } >>+ >>+ if (!segs) { >>+ pe_warn(pe, "No available DMA32 segments\n"); >>+ return; >>+ } >> >> tbl = pnv_pci_table_alloc(phb->hose->node); >> iommu_register_group(>table_group, phb->hose->global_number,
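The segment search quoted above (start from the expected share, look for a contiguous free run, shrink by one and retry) can be tried outside the kernel with the following C sketch. It is a simplified illustration, not the patch itself: `INVALID_PE`, `expected_segs()`, `find_free_run()` and `alloc_dma32_segs()` are hypothetical names, the segment map is a plain `int` array, and the guard against a zero total weight is an addition of this sketch.

```c
#define INVALID_PE (-1)  /* hypothetical stand-in for IODA_INVALID_PE */

/* Proportional share of the segments for one PE, at least one segment.
 * The zero-total guard is an assumption added for this sketch. */
static unsigned int expected_segs(unsigned int weight,
                                  unsigned int total_weight,
                                  unsigned int count)
{
    unsigned int segs;

    if (total_weight == 0)
        return 0;
    segs = (weight * count) / total_weight;
    return segs ? segs : 1;
}

/* Return the base index of the first run of `segs` free entries,
 * or -1 if no such run exists. */
static int find_free_run(const int *segmap, unsigned int count,
                         unsigned int segs)
{
    unsigned int base, i;

    if (segs == 0 || segs > count)
        return -1;

    for (base = 0; base + segs <= count; base++) {
        for (i = base; i < base + segs; i++) {
            if (segmap[i] != INVALID_PE)
                break;
        }
        if (i == base + segs)
            return (int)base;
    }
    return -1;
}

/* Try to claim `want` contiguous segments for `pe`, retrying with one
 * segment fewer each time. Returns the number of segments claimed
 * (0 on failure) and stores the first index in *base_out. */
static unsigned int alloc_dma32_segs(int *segmap, unsigned int count,
                                     unsigned int want, int pe,
                                     unsigned int *base_out)
{
    unsigned int segs, i;

    if (want > count)
        want = count;

    for (segs = want; segs > 0; segs--) {
        int base = find_free_run(segmap, count, segs);

        if (base >= 0) {
            for (i = 0; i < segs; i++)
                segmap[(unsigned int)base + i] = pe;
            *base_out = (unsigned int)base;
            return segs;
        }
    }
    return 0;
}
```

Splitting the run search into its own function keeps the "did we find a run?" check in one place, instead of testing the inner loop index again after each loop level.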
Re: [PATCH v8 16/45] powerpc/powernv: Remove DMA32 PE list
On Wed, Apr 13, 2016 at 06:59:40PM +1000, Alexey Kardashevskiy wrote: >On 02/17/2016 02:43 PM, Gavin Shan wrote: >>PEs are put into PHB DMA32 list (phb->ioda.pe_dma_list) according >>to their DMA32 weight. The PEs on the list are iterated to setup >>their TCE32 tables at system booting time. The list is used for >>once and there is for keep having it. > >"there is no need to keep it" may be? > Sorry, I should have fixed it in early revision. Will fix it up in next revision. >> >>This moves the logic calculating DMA32 weight of PHB and PE to >>pnv_ioda_setup_dma() to drop PHB's DMA32 list. Also, every PE >>traces the consumed DMA32 segment by @tce32_seg and @tce32_segcount >>are useless and they're removed. >> >>Signed-off-by: Gavin Shan> > >Reviewed-by: Alexey Kardashevskiy > >with few comments below... > >>--- >> arch/powerpc/platforms/powernv/pci-ioda.c | 168 >> +- >> arch/powerpc/platforms/powernv/pci.h | 19 >> 2 files changed, 75 insertions(+), 112 deletions(-) >> >>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >>b/arch/powerpc/platforms/powernv/pci-ioda.c >>index e60cff6..0fc2309 100644 >>--- a/arch/powerpc/platforms/powernv/pci-ioda.c >>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c >>@@ -886,44 +886,6 @@ out: >> return 0; >> } >> >>-static void pnv_ioda_link_pe_by_weight(struct pnv_phb *phb, >>-struct pnv_ioda_pe *pe) >>-{ >>- struct pnv_ioda_pe *lpe; >>- >>- list_for_each_entry(lpe, >ioda.pe_dma_list, dma_link) { >>- if (lpe->dma_weight < pe->dma_weight) { >>- list_add_tail(>dma_link, >dma_link); >>- return; >>- } >>- } >>- list_add_tail(>dma_link, >ioda.pe_dma_list); >>-} >>- >>-static unsigned int pnv_ioda_dma_weight(struct pci_dev *dev) >>-{ >>- /* This is quite simplistic. The "base" weight of a device >>- * is 10. 0 means no DMA is to be accounted for it. 
>>- */ >>- >>- /* If it's a bridge, no DMA */ >>- if (dev->hdr_type != PCI_HEADER_TYPE_NORMAL) >>- return 0; >>- >>- /* Reduce the weight of slow USB controllers */ >>- if (dev->class == PCI_CLASS_SERIAL_USB_UHCI || >>- dev->class == PCI_CLASS_SERIAL_USB_OHCI || >>- dev->class == PCI_CLASS_SERIAL_USB_EHCI) >>- return 3; >>- >>- /* Increase the weight of RAID (includes Obsidian) */ >>- if ((dev->class >> 8) == PCI_CLASS_STORAGE_RAID) >>- return 15; >>- >>- /* Default */ >>- return 10; >>-} >>- >> #ifdef CONFIG_PCI_IOV >> static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset) >> { >>@@ -1028,7 +990,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct >>pci_dev *dev) >> pe->flags = PNV_IODA_PE_DEV; >> pe->pdev = dev; >> pe->pbus = NULL; >>- pe->tce32_seg = -1; >> pe->mve_number = -1; >> pe->rid = dev->bus->number << 8 | pdn->devfn; >> >>@@ -1044,16 +1005,6 @@ static struct pnv_ioda_pe >>*pnv_ioda_setup_dev_PE(struct pci_dev *dev) >> return NULL; >> } >> >>- /* Assign a DMA weight to the device */ >>- pe->dma_weight = pnv_ioda_dma_weight(dev); >>- if (pe->dma_weight != 0) { >>- phb->ioda.dma_weight += pe->dma_weight; >>- phb->ioda.dma_pe_count++; >>- } >>- >>- /* Link the PE */ >>- pnv_ioda_link_pe_by_weight(phb, pe); >>- >> return pe; >> } >> >>@@ -1071,7 +1022,6 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, >>struct pnv_ioda_pe *pe) >> } >> pdn->pcidev = dev; >> pdn->pe_number = pe->pe_number; >>- pe->dma_weight += pnv_ioda_dma_weight(dev); >> if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate) >> pnv_ioda_setup_same_PE(dev->subordinate, pe); >> } >>@@ -1108,10 +1058,8 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, >>bool all) >> pe->flags |= (all ? 
PNV_IODA_PE_BUS_ALL : PNV_IODA_PE_BUS); >> pe->pbus = bus; >> pe->pdev = NULL; >>- pe->tce32_seg = -1; >> pe->mve_number = -1; >> pe->rid = bus->busn_res.start << 8; >>- pe->dma_weight = 0; >> >> if (all) >> pe_info(pe, "Secondary bus %d..%d associated with PE#%d\n", >>@@ -1133,17 +1081,6 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, >>bool all) >> >> /* Put PE to the list */ >> list_add_tail(>list, >ioda.pe_list); >>- >>- /* Account for one DMA PE if at least one DMA capable device exist >>- * below the bridge >>- */ >>- if (pe->dma_weight != 0) { >>- phb->ioda.dma_weight += pe->dma_weight; >>- phb->ioda.dma_pe_count++; >>- } >>- >>- /* Link the PE */ >>- pnv_ioda_link_pe_by_weight(phb, pe); >> } >> >> static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev) >>@@ -1184,7 +1121,6 @@ static
Re: [PATCH v8 13/45] powerpc/powernv/ioda1: M64 support on P7IOC
On Wed, Apr 13, 2016 at 05:47:59PM +1000, Alexey Kardashevskiy wrote: >On 02/17/2016 02:43 PM, Gavin Shan wrote: >>This enables M64 window on P7IOC, which has been enabled on PHB3. >>Different from PHB3 where 16 M64 BARs are supported and each of >>them can be owned by one particular PE# exclusively or divided >>evenly to 256 segments, every P7IOC PHB has 16 M64 BARs and each >>of them are divided to 8 segments. So every P7IOC PHB supports >>128 M64 segments in total. P7IOC has M64DT, which helps mapping >>one particular M64 segment# to arbitrary PE#. PHB3 doesn't have >>M64DT, indicating that one M64 segment can only be pinned to the >>fixed PE#. In order to have same code to support M64 on P7IOC and >>PHB3, we just provide 128 M64 segments on every P7IOC PHB and each >>of them is pinned to the fixed PE# by bypassing the function of >>M64DT. In turn, we just need different phb->init_m64() for P7IOC >>and PHB3 to support M64. > >The comment is not quite correct - in addition to pnv_ioda1_init_m64(), you >also need to hack pnv_ioda_pick_m64_pe(). > Right, will talk about the changes to pnv_ioda_pick_m64_pe() in the commit log of next revision. > >> >>Signed-off-by: Gavin Shan>>--- >> arch/powerpc/platforms/powernv/pci-ioda.c | 86 >> +-- >> arch/powerpc/platforms/powernv/pci.h | 3 ++ >> 2 files changed, 86 insertions(+), 3 deletions(-) >> >>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >>b/arch/powerpc/platforms/powernv/pci-ioda.c >>index 1dc663a..8488238 100644 >>--- a/arch/powerpc/platforms/powernv/pci-ioda.c >>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c >>@@ -246,6 +246,64 @@ static void pnv_ioda_reserve_dev_m64_pe(struct pci_dev >>*pdev, >> } >> } >> >>+static int pnv_ioda1_init_m64(struct pnv_phb *phb) >>+{ >>+ struct resource *r; >>+ int index; >>+ >>+ /* >>+ * There are 16 M64 BARs, each of which has 8 segments. So >>+ * there are as many M64 segments as the maximum number of >>+ * PEs, which is 128. 
>>+ */ >>+ for (index = 0; index < PNV_IODA1_M64_NUM; index++) { >>+ unsigned long base, segsz = phb->ioda.m64_segsize; >>+ int64_t rc; >>+ >>+ base = phb->ioda.m64_base + >>+index * PNV_IODA1_M64_SEGS * segsz; >>+ rc = opal_pci_set_phb_mem_window(phb->opal_id, >>+ OPAL_M64_WINDOW_TYPE, index, base, 0, >>+ PNV_IODA1_M64_SEGS * segsz); >>+ if (rc != OPAL_SUCCESS) { >>+ pr_warn(" Error %lld setting M64 PHB#%d-BAR#%d\n", >>+ rc, phb->hose->global_number, index); >>+ goto fail; >>+ } >>+ >>+ rc = opal_pci_phb_mmio_enable(phb->opal_id, >>+ OPAL_M64_WINDOW_TYPE, index, >>+ OPAL_ENABLE_M64_SPLIT); >>+ if (rc != OPAL_SUCCESS) { >>+ pr_warn(" Error %lld enabling M64 PHB#%d-BAR#%d\n", >>+ rc, phb->hose->global_number, index); >>+ goto fail; >>+ } >>+ } >>+ >>+ /* >>+ * Exclude the segment used by the reserved PE, which >>+ * is expected to be 0 or last supported PE#. >>+ */ >>+ r = >hose->mem_resources[1]; >>+ if (phb->ioda.reserved_pe_idx == 0) >>+ r->start += phb->ioda.m64_segsize; >>+ else if (phb->ioda.reserved_pe_idx == (phb->ioda.total_pe_num - 1)) >>+ r->end -= phb->ioda.m64_segsize; >>+ else >>+ pr_warn(" Cannot cut M64 segment for reserved PE#%d\n", >>+ phb->ioda.reserved_pe_idx); >>+ >>+ return 0; >>+ >>+fail: >>+ for ( ; index >= 0; index--) >>+ opal_pci_phb_mmio_enable(phb->opal_id, >>+ OPAL_M64_WINDOW_TYPE, index, OPAL_DISABLE_M64); >>+ >>+ return -EIO; >>+} >>+ >> static void pnv_ioda_reserve_m64_pe(struct pci_bus *bus, >> unsigned long *pe_bitmap, >> bool all) >>@@ -315,6 +373,26 @@ static int pnv_ioda_pick_m64_pe(struct pci_bus *bus, >>bool all) >> pe->master = master_pe; >> list_add_tail(>list, _pe->slaves); >> } >>+ >>+ /* >>+ * P7IOC supports M64DT, which helps mapping M64 segment >>+ * to one particular PE#. However, PHB3 has fixed mapping >>+ * between M64 segment and PE#. In order to have same logic >>+ * for P7IOC and PHB3, we enforce fixed mapping between M64 >>+ * segment and PE# on P7IOC. 
>>+ */ >>+ if (phb->type == PNV_PHB_IODA1) { >>+ int64_t rc; >>+ >>+ rc = opal_pci_map_pe_mmio_window(phb->opal_id, >>+ pe->pe_number, OPAL_M64_WINDOW_TYPE, >>+
Re: [PATCH v8 11/45] powerpc/powernv: Track M64 segment consumption
On Wed, Apr 13, 2016 at 05:09:45PM +1000, Alexey Kardashevskiy wrote: >On 02/17/2016 02:43 PM, Gavin Shan wrote: >>When unplugging PCI devices, their parent PEs might be offline. >>The consumed M64 resource by the PEs should be released at that >>time. As we track M32 segment consumption, this introduces an >>array to the PHB to track the mapping between M64 segment and >>PE number. >> >>Signed-off-by: Gavin Shan> > >Reviewed-by: Alexey Kardashevskiy > >but it would not hurt to mention in the commit log why M64 segment is not >tracked/setup by the existing (at this point, at least) >pnv_ioda_setup_one_res(). > Right, I'll add something for it to the commit log in next revision, thanks! > >>--- >> arch/powerpc/platforms/powernv/pci-ioda.c | 10 -- >> arch/powerpc/platforms/powernv/pci.h | 1 + >> 2 files changed, 9 insertions(+), 2 deletions(-) >> >>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >>b/arch/powerpc/platforms/powernv/pci-ioda.c >>index 7330a73..fc0374a 100644 >>--- a/arch/powerpc/platforms/powernv/pci-ioda.c >>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c >>@@ -305,6 +305,7 @@ static int pnv_ioda2_pick_m64_pe(struct pci_bus *bus, >>bool all) >> phb->ioda.total_pe_num) { >> pe = >ioda.pe_array[i]; >> >>+ phb->ioda.m64_segmap[pe->pe_number] = pe->pe_number; >> if (!master_pe) { >> pe->flags |= PNV_IODA_PE_MASTER; >> INIT_LIST_HEAD(>slaves); >>@@ -3245,7 +3246,7 @@ static void __init pnv_pci_init_ioda_phb(struct >>device_node *np, >> { >> struct pci_controller *hose; >> struct pnv_phb *phb; >>- unsigned long size, m32map_off, pemap_off, iomap_off = 0; >>+ unsigned long size, m64map_off, m32map_off, pemap_off, iomap_off = 0; >> const __be64 *prop64; >> const __be32 *prop32; >> int i, len; >>@@ -3332,6 +,8 @@ static void __init pnv_pci_init_ioda_phb(struct >>device_node *np, >> >> /* Allocate aux data & arrays. 
We don't have IO ports on PHB3 */ >> size = _ALIGN_UP(phb->ioda.total_pe_num / 8, sizeof(unsigned long)); >>+ m64map_off = size; >>+ size += phb->ioda.total_pe_num * sizeof(phb->ioda.m64_segmap[0]); >> m32map_off = size; >> size += phb->ioda.total_pe_num * sizeof(phb->ioda.m32_segmap[0]); >> if (phb->type == PNV_PHB_IODA1) { >>@@ -3342,9 +3345,12 @@ static void __init pnv_pci_init_ioda_phb(struct >>device_node *np, >> size += phb->ioda.total_pe_num * sizeof(struct pnv_ioda_pe); >> aux = memblock_virt_alloc(size, 0); >> phb->ioda.pe_alloc = aux; >>+ phb->ioda.m64_segmap = aux + m64map_off; >> phb->ioda.m32_segmap = aux + m32map_off; >>- for (i = 0; i < phb->ioda.total_pe_num; i++) >>+ for (i = 0; i < phb->ioda.total_pe_num; i++) { >>+ phb->ioda.m64_segmap[i] = IODA_INVALID_PE; >> phb->ioda.m32_segmap[i] = IODA_INVALID_PE; >>+ } >> if (phb->type == PNV_PHB_IODA1) { >> phb->ioda.io_segmap = aux + iomap_off; >> for (i = 0; i < phb->ioda.total_pe_num; i++) >>diff --git a/arch/powerpc/platforms/powernv/pci.h >>b/arch/powerpc/platforms/powernv/pci.h >>index 36c4965..866a5ea 100644 >>--- a/arch/powerpc/platforms/powernv/pci.h >>+++ b/arch/powerpc/platforms/powernv/pci.h >>@@ -146,6 +146,7 @@ struct pnv_phb { >> struct pnv_ioda_pe *pe_array; >> >> /* M32 & IO segment maps */ >>+ int *m64_segmap; >> int *m32_segmap; >> int *io_segmap; >> >> > > >-- >Alexey > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
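The "one allocation, several arrays at computed offsets" layout used in the hunk above can be illustrated with a standalone C sketch. This is only an approximation under stated assumptions: `struct segmaps`, `segmaps_init()` and `INVALID_PE` are hypothetical, calloc() replaces the kernel's boot-time allocator, only the allocation bitmap and the M64/M32 maps are laid out, and the bitmap size is rounded up to whole bytes before alignment.

```c
#include <stddef.h>
#include <stdlib.h>

#define INVALID_PE (-1)  /* hypothetical stand-in for IODA_INVALID_PE */

struct segmaps {
    void *aux;               /* single backing allocation */
    unsigned long *pe_alloc; /* PE# allocation bitmap, at offset 0 */
    int *m64_segmap;         /* M64 segment -> PE# */
    int *m32_segmap;         /* M32 segment -> PE# */
};

static size_t align_up(size_t n, size_t a)
{
    return (n + a - 1) / a * a;
}

/* Carve the bitmap and both segment maps out of one zeroed buffer.
 * Returns 0 on success, -1 if the allocation fails. */
static int segmaps_init(struct segmaps *s, unsigned int total_pe_num)
{
    size_t size, m64_off, m32_off;
    unsigned int i;

    /* Bitmap first, padded so the arrays after it stay aligned. */
    size = align_up((total_pe_num + 7) / 8, sizeof(unsigned long));
    m64_off = size;
    size += total_pe_num * sizeof(int);
    m32_off = size;
    size += total_pe_num * sizeof(int);

    s->aux = calloc(1, size);
    if (!s->aux)
        return -1;

    s->pe_alloc = s->aux;
    s->m64_segmap = (int *)((char *)s->aux + m64_off);
    s->m32_segmap = (int *)((char *)s->aux + m32_off);

    /* Zero means "PE#0", so every segment starts out as unowned. */
    for (i = 0; i < total_pe_num; i++) {
        s->m64_segmap[i] = INVALID_PE;
        s->m32_segmap[i] = INVALID_PE;
    }
    return 0;
}
```

The order in which offsets are recorded has to match the order in which sizes are accumulated. That is why inserting a new map, as the patch does for M64, touches both the offset bookkeeping and the pointer setup.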
Re: [PATCH v8 09/45] powerpc/powernv: Simplify pnv_ioda_setup_pe_seg()
On Wed, Apr 13, 2016 at 04:45:39PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:43 PM, Gavin Shan wrote:
>>The original implementation of pnv_ioda_setup_pe_seg() configures
>>IO and M32 segments by separate logics, which can be merged by
>>by caching @segmap, @seg_size, @win in advance. This shouldn't
>>cause any behavioural changes.
>>
>>Signed-off-by: Gavin Shan
>>---
>> arch/powerpc/platforms/powernv/pci-ioda.c | 62 ++-
>> 1 file changed, 28 insertions(+), 34 deletions(-)
>>
>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
>>b/arch/powerpc/platforms/powernv/pci-ioda.c
>>index 44cc5f3..fd7d382 100644
>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>@@ -2940,8 +2940,10 @@ static void pnv_ioda_setup_pe_seg(struct
>>pci_controller *hose,
>>         struct pnv_phb *phb = hose->private_data;
>>         struct pci_bus_region region;
>>         struct resource *res;
>>-        int i, index;
>>-        int rc;
>>+        unsigned int segsize;
>>+        int *segmap, index, i;
>>+        uint16_t win;
>>+        int64_t rc;
>>
>>         /*
>>          * NOTE: We only care PCI bus based PE for now. For PCI
>>@@ -2958,23 +2960,9 @@ static void pnv_ioda_setup_pe_seg(struct
>>pci_controller *hose,
>>                 if (res->flags & IORESOURCE_IO) {
>>                         region.start = res->start - phb->ioda.io_pci_base;
>>                         region.end = res->end - phb->ioda.io_pci_base;
>>-                        index = region.start / phb->ioda.io_segsize;
>>-
>>-                        while (index < phb->ioda.total_pe_num &&
>>-                               region.start <= region.end) {
>>-                                phb->ioda.io_segmap[index] = pe->pe_number;
>>-                                rc = opal_pci_map_pe_mmio_window(phb->opal_id,
>>-                                        pe->pe_number, OPAL_IO_WINDOW_TYPE, 0,
>>index);
>>-                                if (rc != OPAL_SUCCESS) {
>>-                                        pr_err("%s: OPAL error %d when mapping
>>IO "
>>-                                               "segment #%d to PE#%d\n",
>>-                                               __func__, rc, index,
>>pe->pe_number);
>>-                                        break;
>>-                                }
>>-
>>-                                region.start += phb->ioda.io_segsize;
>>-                                index++;
>>-                        }
>>+                        segsize = phb->ioda.io_segsize;
>>+                        segmap = phb->ioda.io_segmap;
>>+                        win = OPAL_IO_WINDOW_TYPE;
>>                 } else if ((res->flags & IORESOURCE_MEM) &&
>>                            !pnv_pci_is_mem_pref_64(res->flags)) {
>>                         region.start = res->start -
>>@@ -2983,23 +2971,29 @@ static void pnv_ioda_setup_pe_seg(struct
>>pci_controller *hose,
>>                         region.end = res->end -
>>                                      hose->mem_offset[0] -
>>                                      phb->ioda.m32_pci_base;
>>-                        index = region.start / phb->ioda.m32_segsize;
>>-
>>-                        while (index < phb->ioda.total_pe_num &&
>>-                               region.start <= region.end) {
>>-                                phb->ioda.m32_segmap[index] = pe->pe_number;
>>-                                rc = opal_pci_map_pe_mmio_window(phb->opal_id,
>>-                                        pe->pe_number, OPAL_M32_WINDOW_TYPE, 0,
>>index);
>>-                                if (rc != OPAL_SUCCESS) {
>>-                                        pr_err("%s: OPAL error %d when mapping
>>M32 "
>>-                                               "segment#%d to PE#%d",
>>-                                               __func__, rc, index,
>>pe->pe_number);
>>-                                        break;
>>-                                }
>>+                        segsize = phb->ioda.m32_segsize;
>>+                        segmap = phb->ioda.m32_segmap;
>>+                        win = OPAL_M32_WINDOW_TYPE;
>>+                } else {
>>+                        continue;
>>+                }
>>
>>-                                region.start += phb->ioda.m32_segsize;
>>-                                index++;
>>+                index = region.start / segsize;
>>+                while (index < phb->ioda.total_pe_num &&
>>+                       region.start <= region.end) {
>>+                        segmap[index] = pe->pe_number;
>>+                        rc = opal_pci_map_pe_mmio_window(phb->opal_id,
>>+                                pe->pe_number, win, 0, index);
>>+                        if (rc != OPAL_SUCCESS) {
>>+                                pr_warn("%s: Error %lld mapping (%d) seg#%d to
>>PHB#%d-PE#%d\n",
>>+                                        __func__, rc, win, index,
>>+                                        pe->phb->hose->global_number,
>>+
Re: [PATCH v8 03/45] powerpc/pci: Cleanup on struct pci_controller_ops
On Wed, Apr 13, 2016 at 03:52:25PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:43 PM, Gavin Shan wrote:
>>Each PHB has one instance of "struct pci_controller_ops", which
>>includes various callbacks called by PCI subsystem. In the definition
>>of this struct, some callbacks have explicit names for its arguments,
>>but the left don't have.
>>
>>This adds all explicit names of the arguments to the callbacks in
>>"struct pci_controller_ops" so that the code looks consistent.
>>
>>Signed-off-by: Gavin Shan
>>Reviewed-by: Daniel Axtens
>
>With tiny nit below,
>
>Reviewed-by: Alexey Kardashevskiy
>
>
>
>>---
>> arch/powerpc/include/asm/pci-bridge.h | 13 +++--
>> 1 file changed, 7 insertions(+), 6 deletions(-)
>>
>>diff --git a/arch/powerpc/include/asm/pci-bridge.h
>>b/arch/powerpc/include/asm/pci-bridge.h
>>index b688d04..4dd6ef4 100644
>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>@@ -21,18 +21,19 @@ struct pci_controller_ops {
>>         void            (*dma_dev_setup)(struct pci_dev *dev);
>>         void            (*dma_bus_setup)(struct pci_bus *bus);
>>
>>-        int             (*probe_mode)(struct pci_bus *);
>>+        int             (*probe_mode)(struct pci_bus *bus);
>>
>>         /* Called when pci_enable_device() is called. Returns true to
>>          * allow assignment/enabling of the device. */
>>-        bool            (*enable_device_hook)(struct pci_dev *);
>>+        bool            (*enable_device_hook)(struct pci_dev *dev);
>
>
>"pdev" is slightly better as it is of the "pci_dev" type (4130 occurrences of
>"pci_dev *pdev" and just 2833 of "pci_dev *dev" in the current kernel), "dev"
>is for "struct device".
>

Thanks for your review. I don't know if "dev" is for "struct device" only.
Usually, "dev" and "pdev" are interchangeably used for "struct pci_dev".
Especially the code written in old days uses "dev" for "struct pci_dev"
heavily. Yes, I agree "pdev" is better than "dev" in this case and I'm going
to fix this up in next revision.

>>
>>-        void            (*disable_device)(struct pci_dev *);
>>+        void            (*disable_device)(struct pci_dev *dev);
>>
>>-        void            (*release_device)(struct pci_dev *);
>>+        void            (*release_device)(struct pci_dev *dev);
>>
>>         /* Called during PCI resource reassignment */
>>-        resource_size_t (*window_alignment)(struct pci_bus *, unsigned long
>>type);
>>+        resource_size_t (*window_alignment)(struct pci_bus *bus,
>>+                                            unsigned long type);
>>         void            (*setup_bridge)(struct pci_bus *bus,
>>                                         unsigned long type);
>>         void            (*reset_secondary_bus)(struct pci_dev *dev);
>>@@ -46,7 +47,7 @@ struct pci_controller_ops {
>>         int             (*dma_set_mask)(struct pci_dev *dev, u64 dma_mask);
>>         u64             (*dma_get_required_mask)(struct pci_dev *dev);
>>
>>-        void            (*shutdown)(struct pci_controller *);
>>+        void            (*shutdown)(struct pci_controller *hose);
>> };
>>
>> /*
>>
>
>
>--
>Alexey
>
Re: [V2] powerpc/Kconfig: Update config option based on page size.
On 20/04/16 00:59, Aneesh Kumar K.V wrote:
> Michael Ellerman writes:
>
>> On Fri, 2016-19-02 at 05:38:47 UTC, Rashmica Gupta wrote:
>>> Currently on PPC64 changing kernel pagesize from 4K to 64K leaves
>>> FORCE_MAX_ZONEORDER set to 13 - which produces a compile error.
>>>
>> ...
>>> So, update the range of FORCE_MAX_ZONEORDER from 9-64 to 8-9 for 64K pages
>>> and from 13-64 to 9-13 for 4K pages.
>>>
>>> Signed-off-by: Rashmica Gupta
>>> Reviewed-by: Balbir Singh
>>
>> Applied to powerpc next, thanks.
>>
>> https://git.kernel.org/powerpc/c/a7ee539584acf4a565b7439cea
>>
>
> HPAGE_PMD_ORDER is not something we should check w.r.t 4k linux page
> size. We do have the below constraint w.r.t hugetlb pages
>
> static inline bool hstate_is_gigantic(struct hstate *h)
> {
>         return huge_page_order(h) >= MAX_ORDER;
> }
>
> That require MAX_ORDER to be greater than 12.
>

The build will fail for MAX_ZONEORDER beyond the specified limits.
MAX_ORDER > 12 for what page size? My understanding is this

1. gigantic refers to the fact the regular allocators cannot allocate this page
2. Use alloc_contig_range() with CONFIG_CMA for gigantic pages

I could be wrong

> Did we test hugetlbfs 4k config with this patch ? Will it work if we
> start marking hugepage as gigantic page ?

Nope.. I did not

Thanks for the review!
Balbir Singh
Re: [PATCH 0/5] Live patching for powerpc
On Wed, 20 Apr 2016, Balbir Singh wrote:

> Thanks, do we have a summary of what the relocation changes look like?

This work is queued in

        livepatching.git#for-4.7/arch-independent-klp-relocations

--
Jiri Kosina
SUSE Labs
Re: [PATCH 0/5] Live patching for powerpc
On 16/04/16 01:07, Jiri Kosina wrote:
> On Thu, 14 Apr 2016, Michael Ellerman wrote:
>
>> Topic branch here:
>>
>>   https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/log/?h=topic/livepatch
>>
>> I will merge that before Monday (my time) if I don't hear any objections.
>
> I've now pulled this into livepatching.git#for-4.7/livepatching-ppc64 and
> merged that branch into for-next as well.
>
> That branch already contains all the relocation changes queued for 4.7, so
> as much testing of the merged result as possible on ppc64 would be
> appreciated.

Thanks, do we have a summary of what the relocation changes look like?

Balbir Singh.
Re: [PATCH v2 1/1] powerpc/86xx: Add support for Emerson/Artesyn MVME7100
On Tue, 2016-04-19 at 10:33 +0200, Alessio Igor Bogani wrote:
> Hi Scott,
>
> Thanks for reviewing it!
>
> On 19 April 2016 at 06:26, Scott Wood wrote:
> > On Mon, 2016-04-18 at 09:57 +0200, Alessio Igor Bogani wrote:
> > > +     pci0: pcie@f1008000 {
> > > +             reg = <0xf1008000 0x1000>;
> > > +             ranges = <0x0200 0x0 0x8000 0x8000 0x0
> > > 0x5000
> > > +                       0x0100 0x0 0x 0xf000 0x0
> > > 0x0080>;
> [...]
> > > +
> > > +     pci1: pcie@f1009000 {
> > > +             compatible = "fsl,mpc8641-pcie";
> > > +             device_type = "pci";
> > > +             #size-cells = <2>;
> > > +             #address-cells = <3>;
> > > +             reg = <0xf1009000 0x1000>;
> > > +             bus-range = <0 0xff>;
> >
> > Why are pci0 and pci1 so different? Why does mpc8641si-post.dtsi not have
> > pci1?
>
> You are right. The MPC8641 processor offers two pci so
> mpc8641si-post.dtsi should be the right place where to define both.
> What about the boards which don't use the pci1? Will 'status =
> "disabled"' be enough?

Yes.

-Scott
Re: Trouble with DMA on PPC linux question
Ben,

Benjamin Herrenschmidt wrote on 04/19/2016 01:45:40 AM:

> From: Benjamin Herrenschmidt
> To: bruce_leon...@selinc.com, linuxppc-dev@lists.ozlabs.org
> Date: 04/19/2016 01:46 AM
> Subject: Re: Trouble with DMA on PPC linux question
>
> On Mon, 2016-04-18 at 14:54 -0700, bruce_leon...@selinc.com wrote:
> >
> > On the DMA transactions that work, the virtual address I hand to
> > dma_map_single() is something like 0xe084 and the dma_addr_t result is
> > 0x1084 which is less than my 512Mb limit. On the transactions that
> > don't work, the virtual address is 0xd539 with the mapped result being
> > 0x2539, which is past my upper bound on my RAM. In fact it's not even
> > in my memory map, there's a hole there.
>
> Where does this virtual address come from ?
>
> The kernel has two types of virtual addresses. Those coming from the
> linear mapping (the stuff you get from kmalloc() for example, or
> get_pages()) which can be translated using that simple substraction.
>
> The other is the vmalloc space, and that is a non-linear mapping of
> random pages.
>
> If your vaddr comes from the latter it can't be passed to
> dma_map_single as-is, you need to get to the underlying pages first.
>
> Ben.
>

That's a good question. I'm not sure where the addresses come from right
now (they're handed to me from the MTD layer), but I'll certainly dig into
that and see.

Thanks for the help! I appreciate the pointer.

Bruce
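Ben's point about the two kinds of kernel virtual address can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `PAGE_OFFSET_ASSUMED` (0xc0000000, a common 32-bit default) and the helper names are hypothetical, and only the 512 MB RAM size comes from the thread. In a real driver the check would more likely use the kernel's own `is_vmalloc_addr()` and `vmalloc_to_page()` helpers rather than hand-written arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration only; real values are platform specific. */
#define PAGE_OFFSET_ASSUMED 0xc0000000u  /* assumed start of the linear map */
#define RAM_SIZE_ASSUMED    (512u << 20) /* 512 MB of RAM, as in the thread */

/* True if vaddr lies inside the linear mapping of RAM. */
bool in_linear_map(uint32_t vaddr)
{
    return vaddr >= PAGE_OFFSET_ASSUMED &&
           vaddr - PAGE_OFFSET_ASSUMED < RAM_SIZE_ASSUMED;
}

/*
 * Translate by simple subtraction, but only for linear-map addresses.
 * Returns false for anything else (for example a vmalloc-style address),
 * where subtraction would yield a bogus physical address past the end
 * of RAM.
 */
bool linear_virt_to_phys(uint32_t vaddr, uint32_t *paddr)
{
    if (!in_linear_map(vaddr))
        return false;
    *paddr = vaddr - PAGE_OFFSET_ASSUMED;
    return true;
}
```

An address handed over by another layer that fails this kind of check cannot be mapped by subtraction; its backing pages have to be looked up individually before they can be given to the DMA API.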
[PATCH] cxl: Increase timeout for detection of AFU mmio hang
PSL designers recommend a larger value for the mmio hang pulse, 256 us
instead of 1 us. The CAIA architecture states that it needs to be
smaller than 1/2 of the RTOS timeout set in the PHB for outbound
non-posted transactions, which is still (easily) the case here.

Signed-off-by: Frederic Barrat
---
Needs to be applied on top of
http://patchwork.ozlabs.org/patch/604029/

 drivers/misc/cxl/pci.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 94fd3f7..0a9c15b 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -375,8 +375,10 @@ static int init_implementation_adapter_regs(struct cxl *adapter, struct pci_dev
                 return -ENODEV;
         }

+        psl_dsnctl = 0x9000ULL; /* pteupd ttype, scdone */
+        psl_dsnctl |= (0x2ULL << (63-38)); /* MMIO hang pulse: 256 us */
         /* Tell PSL where to route data to */
-        psl_dsnctl = 0x9200ULL | (chipid << (63-5));
+        psl_dsnctl |= (chipid << (63-5));
         psl_dsnctl |= (capp_unit_id << (63-13));

         cxl_p1_write(adapter, CXL_PSL_DSNDCTL, psl_dsnctl);
--
1.9.1
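The `(63-N)` shifts in the patch follow IBM bit numbering, where bit 0 is the most significant bit of the 64-bit register. A minimal sketch of how those fields compose is below; `ibm_field()` and `dsnctl_routing_fields()` are hypothetical helper names, and the base constant from the patch is deliberately left out because it is not reproduced reliably in this archive.

```c
#include <stdint.h>

/*
 * Place `value` so that its least significant bit lands on IBM bit
 * `ibm_lsb` of a 64-bit register (IBM numbering: bit 0 is the MSB,
 * bit 63 is the LSB).
 */
uint64_t ibm_field(uint64_t value, unsigned int ibm_lsb)
{
    return value << (63 - ibm_lsb);
}

/*
 * Combine only the fields whose positions appear in the patch: the MMIO
 * hang pulse selector (value 0x2, ending at bit 38), the chip id (ending
 * at bit 5) and the CAPP unit id (ending at bit 13).
 */
uint64_t dsnctl_routing_fields(uint64_t chipid, uint64_t capp_unit_id)
{
    uint64_t v = 0;

    v |= ibm_field(0x2, 38);          /* MMIO hang pulse: 256 us */
    v |= ibm_field(chipid, 5);        /* where to route data to */
    v |= ibm_field(capp_unit_id, 13);
    return v;
}
```

Splitting the assignment into separate OR steps, as the patch does, keeps each field's position and meaning visible instead of folding them into one opaque constant.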
Re: [PATCH 5/5] drivers/net: support hdlc function for QE-UCC
On 30/03/2016 10:50, Zhao Qiang wrote:
> The driver add hdlc support for Freescale QUICC Engine.
> It support NMSI and TSA mode.

When using TSA, how does the TSA get configured ? Especially how do you
describe which Timeslot is switched to HDLC channels ?
Is it possible to route some Timeslots to one UCC for HDLC, and route
some others to another UCC for an ALSA sound driver ?

The QE also has a QMC which allows to split all timeslots to a given
UCC into independent channels that can either be used with HDLC or
transparent (for audio for instance). Do you intend to also support QMC ?

According to the compatible property, it looks like your driver is for
freescale T1040. The MPC83xx also has a Quick Engine, would it work on
it too ?

Christophe

> Signed-off-by: Zhao Qiang
> ---
>  MAINTAINERS                    |    6 +
>  drivers/net/wan/Kconfig        |   12 +
>  drivers/net/wan/Makefile       |    1 +
>  drivers/net/wan/fsl_ucc_hdlc.c | 1339
>  drivers/net/wan/fsl_ucc_hdlc.h |  140 +
>  include/soc/fsl/qe/ucc_fast.h  |    4 +
>  6 files changed, 1502 insertions(+)
>  create mode 100644 drivers/net/wan/fsl_ucc_hdlc.c
>  create mode 100644 drivers/net/wan/fsl_ucc_hdlc.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 74bbff3..428d6ed 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4572,6 +4572,12 @@ F: drivers/net/ethernet/freescale/gianfar*
>  X: drivers/net/ethernet/freescale/gianfar_ptp.c
>  F: Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
>
> +FREESCALE QUICC ENGINE UCC HDLC DRIVER
> +M: Zhao Qiang
> +L: linuxppc-dev@lists.ozlabs.org
> +S: Maintained
> +F: drivers/net/wan/fsl_ucc_hdlc*
> +
>  FREESCALE QUICC ENGINE UCC UART DRIVER
>  M: Timur Tabi
>  L: linuxppc-dev@lists.ozlabs.org
> diff --git a/drivers/net/wan/Kconfig b/drivers/net/wan/Kconfig
> index a2fdd15..cc424b2 100644
> --- a/drivers/net/wan/Kconfig
> +++ b/drivers/net/wan/Kconfig
> @@ -280,6 +280,18 @@ config DSCC4
>            To compile this driver as a module, choose M here: the
>            module will be called dscc4.
>
> +config FSL_UCC_HDLC
> +        tristate "Freescale QUICC Engine HDLC support"
> +        depends on HDLC
> +        select QE_TDM
> +        select QUICC_ENGINE
> +        help
> +          Driver for Freescale QUICC Engine HDLC controller. The driver
> +          support HDLC run on NMSI and TDM mode.
> +
> +          To compile this driver as a module, choose M here: the
> +          module will be called fsl_ucc_hdlc.
> +
>  config DSCC4_PCISYNC
>          bool "Etinc PCISYNC features"
>          depends on DSCC4
> diff --git a/drivers/net/wan/Makefile b/drivers/net/wan/Makefile
> index c135ef4..25fec40 100644
> --- a/drivers/net/wan/Makefile
> +++ b/drivers/net/wan/Makefile
> @@ -32,6 +32,7 @@ obj-$(CONFIG_WANXL) += wanxl.o
>  obj-$(CONFIG_PCI200SYN) += pci200syn.o
>  obj-$(CONFIG_PC300TOO) += pc300too.o
>  obj-$(CONFIG_IXP4XX_HSS) += ixp4xx_hss.o
> +obj-$(CONFIG_FSL_UCC_HDLC) += fsl_ucc_hdlc.o
>
>  clean-files := wanxlfw.inc
>  $(obj)/wanxl.o: $(obj)/wanxlfw.inc
> diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
> new file mode 100644
> index 000..9958ec1
> --- /dev/null
> +++ b/drivers/net/wan/fsl_ucc_hdlc.c
> @@ -0,0 +1,1339 @@
> +/* Freescale QUICC Engine HDLC Device Driver
> + *
> + * Copyright 2014 Freescale Semiconductor Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License as published by the
> + * Free Software Foundation; either version 2 of the License, or (at your
> + * option) any later version.
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include "fsl_ucc_hdlc.h"
> +
> +#define DRV_DESC "Freescale QE UCC HDLC Driver"
> +#define DRV_NAME "ucc_hdlc"
> +
> +#define TDM_PPPOHT_SLIC_MAXIN
> +/* #define DEBUG */
> +/* #define QE_HDLC_TEST */
> +#define BROKEN_FRAME_INFO
> +
> +static struct ucc_tdm_info utdm_primary_info = {
> +        .uf_info = {
> +                .tsa = 0,
> +                .cdp = 0,
> +                .cds = 1,
> +                .ctsp = 1,
> +                .ctss = 1,
> +                .revd = 0,
> +                .urfs = 256,
> +                .utfs = 256,
> +                .urfet = 128,
> +                .urfset = 192,
> +                .utfet = 128,
> +                .utftt = 0x40,
> +                .ufpt = 256,
> +                .mode = UCC_FAST_PROTOCOL_MODE_HDLC,
> +                .ttx_trx = UCC_FAST_GUMR_TRANSPARENT_TTX_TRX_NORMAL,
> +                .tenc = UCC_FAST_TX_ENCODING_NRZ,
> +                .renc = UCC_FAST_RX_ENCODING_NRZ,
> +                .tcrc = UCC_FAST_16_BIT_CRC,
> +                .synl = UCC_FAST_SYNC_LEN_NOT_USED,
> +        },
> +
> +        .si_info = {
> +#ifdef CONFIG_FSL_PQ_MDS_T1
> +                .simr_rfsd = 1,
Re: [V2] powerpc/Kconfig: Update config option based on page size.
Michael Ellerman writes:

> On Fri, 2016-19-02 at 05:38:47 UTC, Rashmica Gupta wrote:
>> Currently on PPC64 changing kernel pagesize from 4K to 64K leaves
>> FORCE_MAX_ZONEORDER set to 13 - which produces a compile error.
>>
> ...
>> So, update the range of FORCE_MAX_ZONEORDER from 9-64 to 8-9 for 64K pages
>> and from 13-64 to 9-13 for 4K pages.
>>
>> Signed-off-by: Rashmica Gupta
>> Reviewed-by: Balbir Singh
>
> Applied to powerpc next, thanks.
>
> https://git.kernel.org/powerpc/c/a7ee539584acf4a565b7439cea
>

HPAGE_PMD_ORDER is not something we should check w.r.t 4k linux page
size. We do have the below constraint w.r.t hugetlb pages

static inline bool hstate_is_gigantic(struct hstate *h)
{
        return huge_page_order(h) >= MAX_ORDER;
}

That requires MAX_ORDER to be greater than 12.

Did we test hugetlbfs 4k config with this patch ? Will it work if we
start marking hugepage as gigantic page ?

-aneesh
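The "greater than 12" figure can be checked with a little arithmetic. The sketch below assumes a 16 MB huge page, which the mail does not state; with a 4K base page that size has order 12, so it is treated as gigantic unless MAX_ORDER is at least 13. The helper names are hypothetical and only mirror the comparison in hstate_is_gigantic().

```c
#include <stdbool.h>

/* Smallest order such that page_size << order covers huge_size. */
unsigned int huge_page_order_for(unsigned long huge_size,
                                 unsigned long page_size)
{
    unsigned int order = 0;

    while ((page_size << order) < huge_size)
        order++;
    return order;
}

/* Same comparison as the quoted hstate_is_gigantic(). */
bool is_gigantic(unsigned int order, unsigned int max_order)
{
    return order >= max_order;
}
```

Under that assumption, a 4K configuration with MAX_ORDER lowered to 12 would classify the 16 MB page as gigantic, while the same huge page on a 64K kernel has order 8 and stays below a MAX_ORDER of 9.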
Re: [RFC] powerpc/devtree: Parse new DRC mem/cpu/dev device, tree elements
On 03/15/2016 12:15 AM, linuxppc-dev-requ...@lists.ozlabs.org wrote:
> Documentation/devicetree/bindings ? or link to PAPR where it's specified?
>
> --
> Stewart Smith
> OPAL Architect, IBM.

Here's the link to the Notes PAPR database's issue:

notes://D01DBR12/86256680004635D2/565907e362ce41e28625636a000fba97/b2fa2e426b3222fd85257e810050443c

In case you don't have access to the database, here are the document
headings attached to the issue:

10/21/15 : Updated Section: C.6.6.2 ibm,dynamic-reconfiguration-memory to
remove lmb-size from ibm,dynamic-memory-v2; corrected typo in C.6.6.2.
(See attached file: C PAPR Binding.doc)

10/21/15 : changed ibm,drc-index to drc-index in Section 7.3.28 and
R1-7.3.28-14; added ibm,drc-info in the root and vdevice nodes in the
Partition Migration/Hibernation section of Table 129.
(See attached file: Chapter 7 .doc)

10/21/15 : Updated reference " ... the drc-xxx or ibm,drc-info property ..."
in Section 13.5.3.2. Added "See 13.5.2.8 for additional information." to the
first paragraph of sections 13.5.2.2, 13.5.2.4, 13.5.2.5, 13.5.2.6 : In the
first paragraph of each section, add
(See attached file: Chapter 13.doc)
(See attached file: Chapter 17 and 18.doc)
(See attached file: Chapter 5 .doc)

--
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line 363-5196
External: (512) 286-5196
Cell: (512) 466-0650
m...@linux.vnet.ibm.com
Re: [PATCH v5] powerpc/pci: Assign fixed PHB number based on device-tree properties
On 04/19/2016 04:27 AM, Ian Munsie wrote:
> Thanks for addressing my feedback :)
>
> Reviewed-by: Ian Munsie

Thanks very much for reviewing Ian =)

Cheers,

Guilherme
Re: [RFC v6 06/10] PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag
On 2016/4/18 19:30, David Laight wrote:
> From: Yongji Xie
>> Sent: 18 April 2016 11:59
>> We introduce a new pci_bus_flags, PCI_BUS_FLAGS_MSI_REMAP
>> which indicates all devices on the bus are protected by the
>> hardware which supports IRQ remapping(intel naming).
>>
>> This flag will be used to know whether it's safe to expose
>> MSI-X tables of PCI BARs to userspace. Because the capability
>> of IRQ remapping can guarantee the PCI device cannot trigger
>> MSIs that correspond to interrupt IDs of other devices.
>
> I'm worried that this entire series is going to break drivers for
> existing hardware.
>
> I understand some of the reasoning for 'vm pass through' configurations,
> but there will be PCIe devices out there that have the MSI-X tables in
> the same BAR as other device registers.
> If you are lucky nothing else is in the same 4k area, but I wouldn't
> assume it.

Thanks for your comments. But I didn't get your point here. Why will
exposing MSI-X table to userspace break the driver for hardware which
have the MSI-X tables in the same BAR as other device registers? Could
you give me more details?

The reason why we want to mmap MSI-X table is that there may be some
other critical device registers in the same page as the MSI-X table. We
prefer to handle the mmio access to these registers in guest rather than
in QEMU. So we would like to see there is something else in the same
4k/64k area.

> In any case, if the hardware can't police the card's master transfers
> there is nothing to stop a different bus master block on the card from
> raising MSI-X interrupts - they are just a PCIe write.
>
> So all you are doing is raising the bar slightly and giving a very
> false sense of security.

Do you mean we can request a DMA to the target address area that raises
MSI-X interrupts? But for PPC64 with IODA bridge, this invalid PCIe write
will be prevented on PHB before raising MSI-X interrupt. And I think the
capability of interrupt remapping or ITS can also do the same thing. If
hardware didn't support this, we would not expose MSI-X table in my patch.

Thanks,
Yongji
Re: [PATCH v8 45/45] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
On 02/17/2016 02:44 PM, Gavin Shan wrote:
> This adds standalone driver to support PCI hotplug for PowerPC PowerNV
> platform that runs on top of skiboot firmware. The firmware identifies
> hotpluggable slots and marked their device tree node with proper
> "ibm,slot-pluggable" and "ibm,reset-by-firmware". The driver scans
> device tree nodes to create/register PCI hotplug slot accordingly.
>
> The PCI slots are organized in fashion of tree, which means one
> PCI slot might have parent PCI slot and parent PCI slot possibly
> contains multiple child PCI slots. At the plugging time, the parent
> PCI slot is populated before its children. The child PCI slots are
> removed before their parent PCI slot can be removed from the system.
>
> If the skiboot firmware doesn't support slot status retrieval, the PCI
> slot device node shouldn't have property "ibm,reset-by-firmware". In
> that case, none of valid PCI slots will be detected from device tree.
> The skiboot firmware doesn't export the capability to access attention
> LEDs yet and it's something for TBD.
>
> Signed-off-by: Gavin Shan
> Acked-by: Bjorn Helgaas
> ---
>  drivers/pci/hotplug/Kconfig   |  12 +
>  drivers/pci/hotplug/Makefile  |   3 +
>  drivers/pci/hotplug/pnv_php.c | 870 ++
>  3 files changed, 885 insertions(+)
>  create mode 100644 drivers/pci/hotplug/pnv_php.c
>
> diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
> index df8caec..167c8ce 100644
> --- a/drivers/pci/hotplug/Kconfig
> +++ b/drivers/pci/hotplug/Kconfig
> @@ -113,6 +113,18 @@ config HOTPLUG_PCI_SHPC
>
>            When in doubt, say N.
>
> +config HOTPLUG_PCI_POWERNV
> +        tristate "PowerPC PowerNV PCI Hotplug driver"
> +        depends on PPC_POWERNV && EEH
> +        help
> +          Say Y here if you run PowerPC PowerNV platform that supports
> +          PCI Hotplug
> +
> +          To compile this driver as a module, choose M here: the
> +          module will be called pnv-php.
> +
> +          When in doubt, say N.
> +
>  config HOTPLUG_PCI_RPA
>          tristate "RPA PCI Hotplug driver"
>          depends on PPC_PSERIES && EEH
> diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
> index b616e75..e33cdda 100644
> --- a/drivers/pci/hotplug/Makefile
> +++ b/drivers/pci/hotplug/Makefile
> @@ -14,6 +14,7 @@ obj-$(CONFIG_HOTPLUG_PCI_PCIE) += pciehp.o
>  obj-$(CONFIG_HOTPLUG_PCI_CPCI_ZT5550) += cpcihp_zt5550.o
>  obj-$(CONFIG_HOTPLUG_PCI_CPCI_GENERIC) += cpcihp_generic.o
>  obj-$(CONFIG_HOTPLUG_PCI_SHPC) += shpchp.o
> +obj-$(CONFIG_HOTPLUG_PCI_POWERNV) += pnv-php.o
>  obj-$(CONFIG_HOTPLUG_PCI_RPA) += rpaphp.o
>  obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR) += rpadlpar_io.o
>  obj-$(CONFIG_HOTPLUG_PCI_SGI) += sgi_hotplug.o
> @@ -50,6 +51,8 @@ ibmphp-objs := ibmphp_core.o \
>  acpiphp-objs := acpiphp_core.o \
>                  acpiphp_glue.o
>
> +pnv-php-objs := pnv_php.o
> +
>  rpaphp-objs := rpaphp_core.o \
>                 rpaphp_pci.o \
>                 rpaphp_slot.o
> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
> new file mode 100644
> index 000..364ec36
> --- /dev/null
> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -0,0 +1,870 @@
> +/*
> + * PCI Hotplug Driver for PowerPC PowerNV platform.
> + *
> + * Copyright Gavin Shan, IBM Corporation 2015.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +
> +#define DRIVER_VERSION "0.1"
> +#define DRIVER_AUTHOR  "Gavin Shan, IBM Corporation"
> +#define DRIVER_DESC    "PowerPC PowerNV PCI Hotplug Driver"
> +
> +struct pnv_php_slot {
> +        struct hotplug_slot             slot;
> +        struct hotplug_slot_info        slot_info;
> +        uint64_t                        id;
> +        char                            *name;
> +        int                             slot_no;
> +        struct kref                     kref;
> +#define PNV_PHP_STATE_INITIALIZED       0
> +#define PNV_PHP_STATE_REGISTERED        1
> +#define PNV_PHP_STATE_POPULATED         2
> +        int                             state;
> +        struct device_node              *dn;
> +        struct pci_dev                  *pdev;
> +        struct pci_bus                  *bus;
> +        bool                            power_state_check;
> +        int                             power_state_confirmed;
> +#define PNV_PHP_POWER_CONFIRMED_INVALID 0
> +#define PNV_PHP_POWER_CONFIRMED_SUCCESS 1
> +#define PNV_PHP_POWER_CONFIRMED_FAIL    2
> +        struct opal_msg                 *msg;
> +        void                            *fdt;
> +        void                            *dt;
> +        struct of_changeset             ocs;
> +
Re: [5/5] powerpc/livepatch: Add live patching support on ppc64le
On Wed, 2016-13-04 at 12:53:23 UTC, Michael Ellerman wrote:
> Add the kconfig logic & assembly support for handling live patched
> functions. This depends on DYNAMIC_FTRACE_WITH_REGS, which in turn
> depends on the new -mprofile-kernel ftrace ABI, which is only supported
> currently on ppc64le.
...
>
> Signed-off-by: Michael Ellerman
> Reviewed-by: Torsten Duwe
> Reviewed-by: Balbir Singh

Applied to powerpc next.

https://git.kernel.org/powerpc/c/85baa095497f3e590df9f6c893

cheers
Re: [4/5] powerpc/livepatch: Add livepatch stack to struct thread_info
On Wed, 2016-13-04 at 12:53:22 UTC, Michael Ellerman wrote:
> In order to support live patching we need to maintain an alternate
> stack of TOC & LR values. We use the base of the stack for this, and
> store the "live patch stack pointer" in struct thread_info.
>
> Unlike the other fields of thread_info, we can not statically initialise
> that value, so it must be done at run time.
>
> This patch just adds the code to support that, it is not enabled until
> the next patch which actually adds live patch support.
>
> Signed-off-by: Michael Ellerman
> Acked-by: Balbir Singh

Applied to powerpc next.

https://git.kernel.org/powerpc/c/5d31a96e6c0187f2c5d7004e00

cheers
Re: [3/5] powerpc/livepatch: Add livepatch header
On Wed, 2016-13-04 at 12:53:21 UTC, Michael Ellerman wrote:
> Add the powerpc specific livepatch definitions. In particular we provide
> a non-default implementation of klp_get_ftrace_location().
>
> This is required because the location of the mcount call is not constant
> when using -mprofile-kernel (which we always do for live patching).
>
> Signed-off-by: Torsten Duwe
> Signed-off-by: Balbir Singh
> Signed-off-by: Michael Ellerman

Applied to powerpc next.

https://git.kernel.org/powerpc/c/f63e6d89876034c21ecd18bb1c

cheers
Re: [2/5] livepatch: Allow architectures to specify an alternate ftrace location
On Wed, 2016-13-04 at 12:53:20 UTC, Michael Ellerman wrote:
> When livepatch tries to patch a function it takes the function address
> and asks ftrace to install the livepatch handler at that location.
> ftrace will look for an mcount call site at that exact address.
>
> On powerpc the mcount location is not the first instruction of the
> function, and in fact it's not at a constant offset from the start of
> the function. To accommodate this add a hook which arch code can
> override to customise the behaviour.
>
> Signed-off-by: Torsten Duwe
> Signed-off-by: Balbir Singh
> Signed-off-by: Petr Mladek
> Signed-off-by: Michael Ellerman

Applied to powerpc next.

https://git.kernel.org/powerpc/c/28e7cbd3e0f5fefec892842d13

cheers
Re: [1/5] ftrace: Make ftrace_location_range() global
On Wed, 2016-13-04 at 12:53:19 UTC, Michael Ellerman wrote:
> In order to support live patching on powerpc we would like to call
> ftrace_location_range(), so make it global.
>
> Signed-off-by: Torsten Duwe
> Signed-off-by: Balbir Singh
> Signed-off-by: Michael Ellerman

Applied to powerpc next.

https://git.kernel.org/powerpc/c/04cf31a759ef575f750a63777c

cheers
Re: [3/3] powerpc: Update TM user feature bits in scan_features()
On Fri, 2016-15-04 at 02:08:19 UTC, Unknown sender due to SPF wrote:
> We need to update the user TM feature bits (PPC_FEATURE2_HTM and
> PPC_FEATURE2_HTM) to mirror what we do with the kernel TM feature
> bit.
>
> At the moment, if firmware reports TM is not available we turn off
> the kernel TM feature bit but leave the userspace ones on. Userspace
> thinks it can execute TM instructions and it dies trying.
>
> This (together with a QEMU patch) fixes PR KVM, which doesn't currently
> support TM.
>
> Signed-off-by: Anton Blanchard
> Cc: sta...@vger.kernel.org

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/4705e02498d6d5a7ab98dfee95

cheers
Re: [2/3] powerpc: Update cpu_user_features2 in scan_features()
On Fri, 2016-15-04 at 02:07:24 UTC, Unknown sender due to SPF wrote:
> scan_features() updates cpu_user_features but not cpu_user_features2.
>
> Amongst other things, cpu_user_features2 contains the user TM feature
> bits which we must keep in sync with the kernel TM feature bit.
>
> Signed-off-by: Anton Blanchard
> Cc: sta...@vger.kernel.org

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/beff82374b259d726e2625ec6c

cheers
Re: [v2, 1/3] powerpc: scan_features() updates incorrect bits for REAL_LE
On Mon, 2016-18-04 at 10:36:07 UTC, Michael Ellerman wrote:
> From: Anton Blanchard
>
> The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU
> feature value, meaning all the remaining elements initialise the wrong
> values.
...
>
> Fix the code by adding the missing initialisation of the MMU feature.
>
> Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It
> would be unsafe to start using it as old kernels incorrectly set it.
>
> Fixes: 44ae3ab3358e ("powerpc: Free up some CPU feature bits by moving out
> MMU-related features")
> Signed-off-by: Anton Blanchard
> Cc: sta...@vger.kernel.org
> [mpe: Flesh out changelog, add comment reserving 0x4]
> Signed-off-by: Michael Ellerman

Applied to powerpc fixes.

https://git.kernel.org/powerpc/c/6997e57d693b07289694239e52

cheers
[PATCH v3 2/2] cpufreq: powernv: Ramp-down global pstate slower than local-pstate
The frequency transition latency from pmin to pmax is observed to be on
the order of a few milliseconds, and it usually incurs a performance
penalty during sudden frequency ramp-up requests.

This patch set solves this problem by using an entity called "global
pstates". The global pstate is a chip-level entity, so the global entity
(voltage) is managed across the cores. The local pstate is a core-level
entity, so the local entity (frequency) is managed across threads.

This patch brings down the global pstate at a slower rate than the local
pstate. Holding the global pstate higher than the local pstate makes
subsequent ramp-ups faster.

A per-policy structure is maintained to keep track of the global and
local pstate changes. The global pstate is brought down using a parabolic
equation. The ramp-down time to pmin is set to ~5 seconds. To make sure
that the global pstates are dropped at regular intervals, a timer is
queued every 2 seconds during the ramp-down phase, which eventually
brings the global pstate down to the local pstate.

Iozone results show a fairly consistent performance boost.
YCSB on redis shows improved max latencies in most cases.

Iozone write/rewrite tests were run with file sizes of 200704 KB and
401408 KB and different record sizes. The following table shows I/O
operations/sec with and without the patch.
Iozone Results (in op/sec) (mean over 3 iterations)
---------------------------------------------------------
file size-                 with        without     %
recordsize-IOtype          patch       patch       change
---------------------------------------------------------
200704-1-SeqWrite          1616532     1615425      0.06
200704-1-Rewrite           2423195     2303130      5.21
200704-2-SeqWrite          1628577     1602620      1.61
200704-2-Rewrite           2428264     2312154      5.02
200704-4-SeqWrite          1617605     1617182      0.02
200704-4-Rewrite           2430524     2351238      3.37
200704-8-SeqWrite          1629478     1600436      1.81
200704-8-Rewrite           2415308     2298136      5.09
200704-16-SeqWrite         1619632     1618250      0.08
200704-16-Rewrite          2396650     2352591      1.87
200704-32-SeqWrite         1632544     1598083      2.15
200704-32-Rewrite          2425119     2329743      4.09
200704-64-SeqWrite         1617812     1617235      0.03
200704-64-Rewrite          2402021     2321080      3.48
200704-128-SeqWrite        1631998     1600256      1.98
200704-128-Rewrite         2422389     2304954      5.09
200704-256-SeqWrite        1617065     1616962      0.00
200704-256-Rewrite         2432539     2301980      5.67
200704-512-SeqWrite        1632599     1598656      2.12
200704-512-Rewrite         2429270     2323676      4.54
200704-1024-SeqWrite       1618758     1616156      0.16
200704-1024-Rewrite        2431631     2315889      4.99
401408-1-SeqWrite          1631479     1608132      1.45
401408-1-Rewrite           2501550     2459409      1.71
401408-2-SeqWrite          1617095     1626069     -0.55
401408-2-Rewrite           2507557     2443621      2.61
401408-4-SeqWrite          1629601     1611869      1.10
401408-4-Rewrite           2505909     2462098      1.77
401408-8-SeqWrite          1617110     1626968     -0.60
401408-8-Rewrite           2512244     2456827      2.25
401408-16-SeqWrite         1632609     1609603      1.42
401408-16-Rewrite          2500792     2451405      2.01
401408-32-SeqWrite         1619294     1628167     -0.54
401408-32-Rewrite          2510115     2451292      2.39
401408-64-SeqWrite         1632709     1603746      1.80
401408-64-Rewrite          2506692     2433186      3.02
401408-128-SeqWrite        1619284     1627461     -0.50
401408-128-Rewrite         2518698     2453361      2.66
401408-256-SeqWrite        1634022     1610681      1.44
401408-256-Rewrite         2509987     2446328      2.60
401408-512-SeqWrite        1617524     1628016     -0.64
401408-512-Rewrite         2504409     2442899      2.51
401408-1024-SeqWrite       1629812     1611566      1.13
401408-1024-Rewrite        2507620     2442968      2.64

Tested with YCSB workload (50% update + 50% read) over redis for 1 million records and 1 million operation.
Each test was carried out with target operations per second and persistence disabled. Max-latency (in us)( mean over 5 iterations )
[PATCH v3 0/2] cpufreq: powernv: Ramp-down global pstate slower than local-pstate
The frequency transition latency from pmin to pmax is observed to be on
the order of a few milliseconds, and it usually incurs a performance
penalty during sudden frequency ramp-up requests.

This patch set solves this problem by using a chip-level entity called
"global pstates". The global pstate manages elements shared across
dependent core chiplets; typically, the element that needs to be managed
is the voltage setting. By holding the global pstate higher than the
local pstate for some amount of time (~5 seconds), subsequent ramp-ups
can be made faster.

(1/2) removes the flag use of cpufreq_policy->driver_data, so that it
can be used for tracking global pstates.
(2/2) adds code for global pstate management.

- The iozone results with this patchset show improvements in almost all
  cases.
- YCSB workload on redis with various target operations per second shows
  better MaxLatency with this patch.

Changes from v1:
- Fixed coding style
- Added a routine to reset global_pstate_info instead of hacky memset
- Handled case where cpufreq_table_validate_and_show() fails
- Changed int queue_gpstate_timer() to void queue_gpstate_timer()

Changes from v2:
- Dropped the unrelated change.

Akshay Adiga (1):
  cpufreq: powernv: Ramp-down global pstate slower than local-pstate

Shilpasri G Bhat (1):
  cpufreq: powernv: Remove flag use-case of policy->driver_data

 drivers/cpufreq/powernv-cpufreq.c | 269
 1 file changed, 256 insertions(+), 13 deletions(-)

-- 
2.5.5
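
As a rough illustration of the ramp-down described in this series, the sketch below models a global pstate that closes the gap to the local pstate quadratically over about 5 seconds. It is not the driver code: the constant `RAMP_DOWN_TIME_MS`, the helper names, and the convention that a larger integer means a higher pstate are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical constant: the "~5 seconds" ramp-down window from the text. */
#define RAMP_DOWN_TIME_MS 5000u

/*
 * Percentage (0..100) of the global-to-local gap that has been closed
 * after elapsed_ms. The curve is quadratic in time, so the global pstate
 * drops slowly at first and faster towards the end of the window.
 */
static unsigned int ramp_down_percent(unsigned int elapsed_ms)
{
	uint64_t t = elapsed_ms;

	if (elapsed_ms >= RAMP_DOWN_TIME_MS)
		return 100;
	return (unsigned int)((100u * t * t) /
			      ((uint64_t)RAMP_DOWN_TIME_MS * RAMP_DOWN_TIME_MS));
}

/*
 * Assumed model: pstates are plain integers and a larger value means a
 * higher voltage/frequency. "highest" is the highest local pstate seen
 * since the ramp-down started, "local" is the current local pstate.
 */
static int global_pstate_after(int highest, int local, unsigned int elapsed_ms)
{
	int drop;

	if (local >= highest)
		return local;	/* ramping up: global follows local at once */

	drop = (int)(ramp_down_percent(elapsed_ms) *
		     (unsigned int)(highest - local) / 100u);
	return highest - drop;
}
```

In this model a periodic timer (every 2 seconds in the description) would call `global_pstate_after()` with the time elapsed since the last ramp-up and apply the result, until the global value meets the local one.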
[PATCH v3 1/2] cpufreq: powernv: Remove flag use-case of policy->driver_data
From: Shilpasri G Bhat

commit 1b0289848d5d ("cpufreq: powernv: Add sysfs attributes to show
throttle stats") used policy->driver_data as a flag for one-time creation
of throttle sysfs files. Instead of this use 'kernfs_find_and_get()' to
check if the attribute already exists. This is required as
policy->driver_data is used for other purposes in the later patch.

Signed-off-by: Shilpasri G Bhat
Signed-off-by: Akshay Adiga
Acked-by: Viresh Kumar
---
 drivers/cpufreq/powernv-cpufreq.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 39ac78c..e2e2219 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -455,13 +455,15 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
 static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
 {
 	int base, i;
+	struct kernfs_node *kn;
 
 	base = cpu_first_thread_sibling(policy->cpu);
 
 	for (i = 0; i < threads_per_core; i++)
 		cpumask_set_cpu(base + i, policy->cpus);
 
-	if (!policy->driver_data) {
+	kn = kernfs_find_and_get(policy->kobj.sd, throttle_attr_grp.name);
+	if (!kn) {
 		int ret;
 
 		ret = sysfs_create_group(&policy->kobj, &throttle_attr_grp);
@@ -470,11 +472,8 @@ static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
 				policy->cpu);
 			return ret;
 		}
-		/*
-		 * policy->driver_data is used as a flag for one-time
-		 * creation of throttle sysfs files.
-		 */
-		policy->driver_data = policy;
+	} else {
+		kernfs_put(kn);
 	}
 	return cpufreq_table_validate_and_show(policy, powernv_freqs);
 }
-- 
2.5.5
Re: [PATCH v2 2/2] cpufreq: powernv: Ramp-down global pstate slower than local-pstate
Hi Viresh, On 04/18/2016 03:48 PM, Viresh Kumar wrote: On 15-04-16, 11:58, Akshay Adiga wrote: static int powernv_cpufreq_reboot_notifier(struct notifier_block *nb, - unsigned long action, void *unused) + unsigned long action, void *unused) Unrelated change.. better don't add such changes.. Posting out v3 with out this unrelated change. { int cpu; struct cpufreq_policy cpu_policy; @@ -603,15 +843,18 @@ static struct notifier_block powernv_cpufreq_opal_nb = { static void powernv_cpufreq_stop_cpu(struct cpufreq_policy *policy) { struct powernv_smp_call_data freq_data; - + struct global_pstate_info *gpstates = policy->driver_data; You removed a blank line here and I feel the code looks better with that. freq_data.pstate_id = powernv_pstate_info.min; + freq_data.gpstate_id = powernv_pstate_info.min; smp_call_function_single(policy->cpu, set_pstate, _data, 1); + del_timer_sync(>timer); } static struct cpufreq_driver powernv_cpufreq_driver = { .name = "powernv-cpufreq", .flags = CPUFREQ_CONST_LOOPS, .init = powernv_cpufreq_cpu_init, + .exit = powernv_cpufreq_cpu_exit, .verify = cpufreq_generic_frequency_table_verify, .target_index = powernv_cpufreq_target_index, .get= powernv_cpufreq_get, None of the above comments are mandatory for you to fix.. Acked-by: Viresh KumarThanks for Ack :) ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] cxl: Add a kernel thread to check the coherent platform function's state
On Mon, 2016-04-18 at 15:05 +0200, Christophe Lombard wrote:
> In the POWERVM environement, the PHYP CoherentAccel component manages

PowerVM is correct I think.

> the state of the Coherant Accelerator Processor Interface adapter and
                                                           ^ (CAPI)
> virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and
> interrupts - and provides a new set of HCALLs for the OS APIs to utilize
                                          ^ hcall (as below?)
> AFUs.

AFUs ? (you define it below)

> During the course of operation, a coherent platform function can
> encounter errors. Some possible reason for errors are:
> • Hardware recoverable and unrecoverable errors
> • Transient and over-threshold correctable errors
>
> PHYP implements its own state model for the coherent platform function.
> The current state of this Acclerator Fonction Unit (AFU) is available
> through a hcall.
>
> In case of low-level troubles (or error injection), The PHYP component
> may reset the card and change the AFU state. The PHYP interface doesn't
> provide any way to be notified when that happens.

Ugh.

> The current implementation of the cxl driver, for the POWERVM
> environment, follows the general error recovery procedures required to

What are "the general error recovery procedures" ?

> reset operation of the coherent platform function. The platform firmware
> resets and reconfigures hardware when an external action is required -
> attach/detach a process, link ok,

Platform firmware does that at our request or by itself?

> The purpose of this patch is to interact with the external driver

What's an external driver?

> (where the AFU is shown) even if no action is required. A kernel thread

But no action is required, so why do we need to do anything?

> is needed to check every x seconds the current state of the AFU to see
> if we need to enter an error recovery path.

I don't really understand what this is doing and why we want it.

It sounds like we're waking the cpu up every 3 seconds and having it poll
the hypervisor, for each AFU?

As far as the implementation, I can't see any reason why you need your own
kthreads, can't you just use queue_work() ?

cheers
[PATCH V3 1/2] cpufreq: qoriq: Remove __exit macro from .exit callback
.exit callback (qoriq_cpufreq_cpu_exit()) is also used during suspend.
So __exit macro should be removed or the function will be discarded.

Signed-off-by: Jia Hongtao
---
 drivers/cpufreq/qoriq-cpufreq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/qoriq-cpufreq.c b/drivers/cpufreq/qoriq-cpufreq.c
index b23e525..3a3fe39 100644
--- a/drivers/cpufreq/qoriq-cpufreq.c
+++ b/drivers/cpufreq/qoriq-cpufreq.c
@@ -301,7 +301,7 @@ err_np:
 	return -ENODEV;
 }
 
-static int __exit qoriq_cpufreq_cpu_exit(struct cpufreq_policy *policy)
+static int qoriq_cpufreq_cpu_exit(struct cpufreq_policy *policy)
 {
 	struct cpu_data *data = policy->driver_data;
 
@@ -348,7 +348,7 @@ static struct cpufreq_driver qoriq_cpufreq_driver = {
 	.name		= "qoriq_cpufreq",
 	.flags		= CPUFREQ_CONST_LOOPS,
 	.init		= qoriq_cpufreq_cpu_init,
-	.exit		= __exit_p(qoriq_cpufreq_cpu_exit),
+	.exit		= qoriq_cpufreq_cpu_exit,
 	.verify		= cpufreq_generic_frequency_table_verify,
 	.target_index	= qoriq_cpufreq_target,
 	.get		= cpufreq_generic_get,
-- 
2.1.0.27.g96db324
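
To see why the `__exit_p()` wrapper matters here, the userspace sketch below mimics its behaviour for a built-in (non-module) driver, where the wrapper evaluates to NULL and the callback is never reachable. `EXIT_P`, `struct policy`, `struct driver` and the helper functions are stand-ins invented for this example; only the NULL-when-built-in behaviour is meant to reflect the kernel macro.

```c
#include <stddef.h>

/*
 * Stand-in for the kernel's __exit_p(): with MODULE undefined (a built-in
 * driver) the wrapper throws the function pointer away and yields NULL.
 */
#ifdef MODULE
#define EXIT_P(fn) (fn)
#else
#define EXIT_P(fn) NULL
#endif

/* Hypothetical, heavily reduced policy and driver structures. */
struct policy {
	int freed;
};

struct driver {
	int (*exit)(struct policy *p);
};

/* Example cleanup callback: records that per-policy data was released. */
int demo_cpu_exit(struct policy *p)
{
	p->freed = 1;
	return 0;
}

/* What a suspend path would do: call .exit only if the driver set it. */
int run_exit(const struct driver *drv, struct policy *p)
{
	if (!drv->exit)
		return -1;	/* callback missing: nothing gets cleaned up */
	return drv->exit(p);
}

/* Before the patch: the callback is wrapped and therefore lost. */
const struct driver wrapped_driver = { .exit = EXIT_P(demo_cpu_exit) };

/* After the patch: the callback is referenced directly. */
const struct driver fixed_driver = { .exit = demo_cpu_exit };
```

With the wrapper in place the suspend path finds no callback at all, which is why a function that must run outside module unload should not be tagged `__exit` or registered through `__exit_p()`.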
Re: [PATCH v8 39/45] powerpc/powernv: Select OF_DYNAMIC
On 02/17/2016 02:44 PM, Gavin Shan wrote:
> The device tree will change dynamically in PowerNV PCI hotplug driver.
> This enables CONFIG_OF_DYNAMIC to support that.
>
> Signed-off-by: Gavin Shan
> ---
>  arch/powerpc/platforms/powernv/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
> index 604190c..e7b1ad7 100644
> --- a/arch/powerpc/platforms/powernv/Kconfig
> +++ b/arch/powerpc/platforms/powernv/Kconfig
> @@ -18,6 +18,7 @@ config PPC_POWERNV
>  	select CPU_FREQ_GOV_ONDEMAND
>  	select CPU_FREQ_GOV_CONSERVATIVE
>  	select PPC_DOORBELL
> +	select OF_DYNAMIC

Why not to enable it in 45/45 under config HOTPLUG_PCI_POWERNV? Is there any
benefit of having it always on if HOTPLUG_PCI_POWERNV is not enabled?

>  	default y
>
>  config OPAL_PRD

-- 
Alexey
Re: [PATCH v8 38/45] powerpc/powernv: Functions to get/set PCI slot status
On 02/17/2016 02:44 PM, Gavin Shan wrote: This exports 4 functins, which base on the corresponding OPAL s/functins/functions/ APIs to get/set PCI slot status. Those functions are going to be used by PowerNV PCI hotplug driver: pnv_pci_get_device_tree()opal_get_device_tree() pnv_pci_get_presence_state() opal_pci_get_presence_state() pnv_pci_get_power_state()opal_pci_get_power_state() pnv_pci_set_power_state()opal_pci_set_power_state() Besides, the patch also exports pnv_pci_hotplug_notifier_{register, unregister}() to allow registration and unregistration of PCI hotplug notifier, which will be used to receive PCI hotplug message from skiboot firmware in PowerNV PCI hotplug driver. Signed-off-by: Gavin Shan--- arch/powerpc/include/asm/opal-api.h| 17 ++- arch/powerpc/include/asm/opal.h| 4 ++ arch/powerpc/include/asm/pnv-pci.h | 7 +++ arch/powerpc/platforms/powernv/opal-wrappers.S | 4 ++ arch/powerpc/platforms/powernv/pci.c | 66 ++ 5 files changed, 97 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h index f8faaae..a6af338 100644 --- a/arch/powerpc/include/asm/opal-api.h +++ b/arch/powerpc/include/asm/opal-api.h @@ -158,7 +158,11 @@ #define OPAL_LEDS_SET_INDICATOR 115 #define OPAL_CEC_REBOOT2 116 #define OPAL_CONSOLE_FLUSH117 -#define OPAL_LAST 117 +#define OPAL_GET_DEVICE_TREE 118 +#define OPAL_PCI_GET_PRESENCE_STATE119 +#define OPAL_PCI_GET_POWER_STATE 120 +#define OPAL_PCI_SET_POWER_STATE 121 +#define OPAL_LAST 121 /* Device tree flags */ @@ -344,6 +348,16 @@ enum OpalPciResetState { OPAL_ASSERT_RESET = 1 }; +enum OpalPciSlotPresentenceState { + OPAL_PCI_SLOT_EMPTY = 0, + OPAL_PCI_SLOT_PRESENT = 1 +}; + +enum OpalPciSlotPowerState { + OPAL_PCI_SLOT_POWER_OFF = 0, + OPAL_PCI_SLOT_POWER_ON = 1 +}; + enum OpalSlotLedType { OPAL_SLOT_LED_TYPE_ID = 0, /* IDENTIFY LED */ OPAL_SLOT_LED_TYPE_FAULT = 1, /* FAULT LED */ @@ -378,6 +392,7 @@ enum opal_msg_type { OPAL_MSG_DPO, OPAL_MSG_PRD, OPAL_MSG_OCC, + 
OPAL_MSG_PCI_HOTPLUG, OPAL_MSG_TYPE_MAX, }; diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h index 9e0039f..899bcb941 100644 --- a/arch/powerpc/include/asm/opal.h +++ b/arch/powerpc/include/asm/opal.h @@ -209,6 +209,10 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, uint64_t buf, uint64_t size, uint64_t token); int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size, uint64_t token); +int64_t opal_get_device_tree(uint32_t phandle, uint64_t buf, uint64_t len); +int64_t opal_pci_get_presence_state(uint64_t id, uint8_t *state); +int64_t opal_pci_get_power_state(uint64_t id, uint8_t *state); +int64_t opal_pci_set_power_state(uint64_t id, uint8_t state); /* Internal functions */ extern int early_init_dt_scan_opal(unsigned long node, const char *uname, diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h index 6f77f71..d9d095b 100644 --- a/arch/powerpc/include/asm/pnv-pci.h +++ b/arch/powerpc/include/asm/pnv-pci.h @@ -13,6 +13,13 @@ #include #include +extern int pnv_pci_get_device_tree(uint32_t phandle, void *buf, uint64_t len); +extern int pnv_pci_get_presence_state(uint64_t id, uint8_t *state); +extern int pnv_pci_get_power_state(uint64_t id, uint8_t *state); +extern int pnv_pci_set_power_state(uint64_t id, uint8_t state); +extern int pnv_pci_hotplug_notifier_register(struct notifier_block *nb); +extern int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb); + int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode); int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq, unsigned int virq); diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S index e45b88a..3ea1a855 100644 --- a/arch/powerpc/platforms/powernv/opal-wrappers.S +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S @@ -302,3 +302,7 @@ OPAL_CALL(opal_prd_msg, OPAL_PRD_MSG); OPAL_CALL(opal_leds_get_ind, OPAL_LEDS_GET_INDICATOR); 
OPAL_CALL(opal_leds_set_ind, OPAL_LEDS_SET_INDICATOR); OPAL_CALL(opal_console_flush, OPAL_CONSOLE_FLUSH); +OPAL_CALL(opal_get_device_tree,OPAL_GET_DEVICE_TREE); +OPAL_CALL(opal_pci_get_presence_state, OPAL_PCI_GET_PRESENCE_STATE); +OPAL_CALL(opal_pci_get_power_state,OPAL_PCI_GET_POWER_STATE); +OPAL_CALL(opal_pci_set_power_state,
[PATCH v2] powerpc: define the fman node for the kmcoge4 DTS
Now that the FMAN mac driver has been merged the fman node is relevant. The kmcoge4 board implements 3 ethernet interfaces, 1 with a RGMII phy and 2 with fixed 1 Giga SGMII links. Signed-off-by: Valentin Longchamp--- arch/powerpc/boot/dts/fsl/kmcoge4.dts | 37 +++ 1 file changed, 37 insertions(+) diff --git a/arch/powerpc/boot/dts/fsl/kmcoge4.dts b/arch/powerpc/boot/dts/fsl/kmcoge4.dts index 6858ec9..67bfcec 100644 --- a/arch/powerpc/boot/dts/fsl/kmcoge4.dts +++ b/arch/powerpc/boot/dts/fsl/kmcoge4.dts @@ -106,6 +106,43 @@ sata@221000 { status = "disabled"; }; + + fman0: fman@40 { + enet0: ethernet@e { + phy-connection-type = "sgmii"; + fixed-link { + speed = <1000>; + full-duplex; + }; + }; + mdio0: mdio@e1120 { + front_phy: ethernet-phy@11 { + reg = <0x11>; + }; + }; + + enet1: ethernet@e2000 { + phy-connection-type = "sgmii"; + fixed-link { + speed = <1000>; + full-duplex; + }; + }; + enet2: ethernet@e4000 { + status = "disabled"; + }; + + enet3: ethernet@e6000 { + status = "disabled"; + }; + enet4: ethernet@e8000 { + phy-handle = <_phy>; + phy-connection-type = "rgmii"; + }; + enet5: ethernet@f { + status = "disabled"; + }; + }; }; rio: rapidio@ffe0c { -- 1.8.3.1 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] cxl: Add a kernel thread to check the coherent platform function's state
On 19/04/2016 04:40, Andrew Donnellan wrote: On 18/04/16 23:05, Christophe Lombard wrote: In the POWERVM environement, the PHYP CoherentAccel component manages environment the state of the Coherant Accelerator Processor Interface adapter and Coherent virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and interrupts - and provides a new set of HCALLs for the OS APIs to utilize AFUs. During the course of operation, a coherent platform function can encounter errors. Some possible reason for errors are: • Hardware recoverable and unrecoverable errors • Transient and over-threshold correctable errors PHYP implements its own state model for the coherent platform function. The current state of this Acclerator Fonction Unit (AFU) is available Accelerator Function Unit through a hcall. In case of low-level troubles (or error injection), The PHYP component the may reset the card and change the AFU state. The PHYP interface doesn't provide any way to be notified when that happens. The current implementation of the cxl driver, for the POWERVM environment, follows the general error recovery procedures required to reset operation of the coherent platform function. The platform firmware resets and reconfigures hardware when an external action is required - attach/detach a process, link ok, The purpose of this patch is to interact with the external driver (where the AFU is shown) even if no action is required. A kernel thread is needed to check every x seconds the current state of the AFU to see if we need to enter an error recovery path. Signed-off-by: Christophe LombardA few minor issues below. diff --git a/drivers/misc/cxl/guest.c b/drivers/misc/cxl/guest.c index 8213372..06dfe7f 100644 --- a/drivers/misc/cxl/guest.c +++ b/drivers/misc/cxl/guest.c @@ -19,6 +19,10 @@ #define CXL_SLOT_RESET_EVENT2 #define CXL_RESUME_EVENT3 +#define CXL_KTHREAD "cxl_kthread" + +void stop_state_thread(struct cxl_afu *afu); static? [...] 
-static int afu_do_recovery(struct cxl_afu *afu) +static int handle_state_thread(void *data) { -int rc; +struct cxl_afu *afu; +int rc = 0; It looks like we don't use rc (see also comment below). -/* many threads can arrive here, in case of detach_all for example. - * Only one needs to drive the recovery - */ -if (mutex_trylock(>guest->recovery_lock)) { -rc = afu_update_state(afu); -mutex_unlock(>guest->recovery_lock); -return rc; +pr_devel("in %s\n", __func__); + +afu = (struct cxl_afu*)data; CodingStyle: space between cxl_afu and * +do { +set_current_state(TASK_INTERRUPTIBLE); + +if (afu) { +afu_update_state(afu); Should we be checking the retval here? Right, We have to check the retval here. Thanks +if (afu->guest->previous_state == H_STATE_PERM_UNAVAILABLE) +goto out; +} else +return -ENODEV; +schedule_timeout(msecs_to_jiffies(3000)); +} while(!kthread_should_stop()); CodingStyle: space between while and ( + +out: +afu->guest->kthread_tsk = NULL; +return rc; +} + +void start_state_thread(struct cxl_afu *afu) static? +{ +if (afu->guest->kthread_tsk) +return; + +/* start kernel thread to handle the state of the afu */ +afu->guest->kthread_tsk = kthread_run(_state_thread, + (void *)afu, CXL_KTHREAD); +if (IS_ERR(afu->guest->kthread_tsk)) { +pr_devel("cannot start state kthread\n"); +afu->guest->kthread_tsk = NULL; } -return 0; +} + +void stop_state_thread(struct cxl_afu *afu) static? Thanks for the review. I will send a patch update. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH V3 1/2] cpufreq: qoriq: Remove __exit macro from .exit callback
On 19-04-16, 17:00, Jia Hongtao wrote:
> .exit callback (qoriq_cpufreq_cpu_exit()) is also used during suspend.
> So __exit macro should be removed or the function will be discarded.
>
> Signed-off-by: Jia Hongtao
> ---
>  drivers/cpufreq/qoriq-cpufreq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Acked-by: Viresh Kumar

-- 
viresh
Re: [PATCH v8 33/45] powerpc/powernv: Simplify pnv_eeh_reset()
On 02/17/2016 02:44 PM, Gavin Shan wrote: This drops unnecessary nested if statements in pnv_eeh_reset() to improve the code readability. After the changes, the unused local variable "ret" is dropped as well. No logical changes introduced. Signed-off-by: Gavin ShanReviewed-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/eeh-powernv.c | 67 +--- 1 file changed, 31 insertions(+), 36 deletions(-) diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c index 69e41ce..9226df1 100644 --- a/arch/powerpc/platforms/powernv/eeh-powernv.c +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c @@ -1009,8 +1009,9 @@ static int pnv_eeh_reset_vf_pe(struct eeh_pe *pe, int option) static int pnv_eeh_reset(struct eeh_pe *pe, int option) { struct pci_controller *hose = pe->phb; + struct pnv_phb *phb; struct pci_bus *bus; - int ret; + int64_t rc; /* * For PHB reset, we always have complete reset. For those PEs whose @@ -1026,45 +1027,39 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option) * reset. The side effect is that EEH core has to clear the frozen * state explicitly after BAR restore. */ - if (pe->type & EEH_PE_PHB) { - ret = pnv_eeh_phb_reset(hose, option); - } else { - struct pnv_phb *phb; - s64 rc; + if (pe->type & EEH_PE_PHB) + return pnv_eeh_phb_reset(hose, option); - /* -* The frozen PE might be caused by PAPR error injection -* registers, which are expected to be cleared after hitting -* frozen PE as stated in the hardware spec. Unfortunately, -* that's not true on P7IOC. So we have to clear it manually -* to avoid recursive EEH errors during recovery. 
-*/ - phb = hose->private_data; - if (phb->model == PNV_PHB_MODEL_P7IOC && - (option == EEH_RESET_HOT || - option == EEH_RESET_FUNDAMENTAL)) { - rc = opal_pci_reset(phb->opal_id, - OPAL_RESET_PHB_ERROR, - OPAL_ASSERT_RESET); - if (rc != OPAL_SUCCESS) { - pr_warn("%s: Failure %lld clearing " - "error injection registers\n", - __func__, rc); - return -EIO; - } + /* +* The frozen PE might be caused by PAPR error injection +* registers, which are expected to be cleared after hitting +* frozen PE as stated in the hardware spec. Unfortunately, +* that's not true on P7IOC. So we have to clear it manually +* to avoid recursive EEH errors during recovery. +*/ + phb = hose->private_data; + if (phb->model == PNV_PHB_MODEL_P7IOC && + (option == EEH_RESET_HOT || +option == EEH_RESET_FUNDAMENTAL)) { + rc = opal_pci_reset(phb->opal_id, + OPAL_RESET_PHB_ERROR, + OPAL_ASSERT_RESET); + if (rc != OPAL_SUCCESS) { + pr_warn("%s: Failure %lld clearing error injection registers\n", + __func__, rc); + return -EIO; } - - bus = eeh_pe_bus_get(pe); - if (pe->type & EEH_PE_VF) - ret = pnv_eeh_reset_vf_pe(pe, option); - else if (pci_is_root_bus(bus) || - pci_is_root_bus(bus->parent)) - ret = pnv_eeh_root_reset(hose, option); - else - ret = pnv_eeh_bridge_reset(bus->self, option); } - return ret; + bus = eeh_pe_bus_get(pe); + if (pe->type & EEH_PE_VF) + return pnv_eeh_reset_vf_pe(pe, option); + + if (pci_is_root_bus(bus) || + pci_is_root_bus(bus->parent)) + return pnv_eeh_root_reset(hose, option); + + return pnv_eeh_bridge_reset(bus->self, option); } /** -- Alexey ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Trouble with DMA on PPC linux question
On Mon, 2016-04-18 at 14:54 -0700, bruce_leon...@selinc.com wrote:
>
> On the DMA transactions that work, the virtual address I hand to
> dma_map_single() is something like 0xe084 and the dma_addr_t result is
> 0x1084 which is less than my 512Mb limit. On the transactions that
> don't work, the virtual address is 0xd539 with the mapped result being
> 0x2539, which is past my upper bound on my RAM. In fact it's not even
> in my memory map, there's a hole there.

Where does this virtual address come from?

The kernel has two types of virtual addresses. The first kind comes from
the linear mapping (what you get from kmalloc() or get_pages(), for
example) and can be translated with a simple subtraction.

The other kind is the vmalloc space, which is a non-linear mapping of
random pages. If your vaddr comes from the latter, it can't be passed to
dma_map_single() as-is; you need to get to the underlying pages first.

Ben.
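
The distinction Ben describes can be sketched in userspace C. The model below treats translation by subtraction as valid only for addresses inside a linear mapping of RAM and rejects everything else. `LINEAR_BASE` is a made-up base address and `RAM_SIZE` reflects the 512 MB mentioned in the question; neither value is taken from the poster's board, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for illustration only; real values are platform specific. */
#define LINEAR_BASE 0xc0000000u	/* assumed start of the linear mapping */
#define RAM_SIZE    0x20000000u	/* 512 MiB of RAM, as in the question */

/* True if vaddr falls inside the modelled linear mapping of RAM. */
static bool in_linear_map(uint32_t vaddr)
{
	return vaddr >= LINEAR_BASE && (vaddr - LINEAR_BASE) < RAM_SIZE;
}

/*
 * Translate by subtraction, but only for linear-map addresses. For any
 * other address (for example one from a vmalloc-style area) the
 * subtraction would produce a bogus "physical" address, so report
 * failure instead.
 */
static bool linear_virt_to_phys(uint32_t vaddr, uint32_t *phys)
{
	if (!in_linear_map(vaddr))
		return false;
	*phys = vaddr - LINEAR_BASE;
	return true;
}
```

In the kernel itself, `is_vmalloc_addr()` answers the "which kind of address is this" question and `vmalloc_to_page()` finds the page behind a vmalloc address, so a driver can map that page rather than handing the virtual address to dma_map_single().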
Re: [PATCH v8 30/45] powerpc/pci: Delay populating pdn
On 02/17/2016 02:44 PM, Gavin Shan wrote:
> The pdn (struct pci_dn) instances are allocated from memblock or bootmem
> when creating PCI controller (hoses) in setup_arch(). PCI hotplug, which
> will be supported by proceeding patches, releases PCI device nodes and
> their corresponding pdn on unplugging event. The memory chunks for pdn
> instances allocated from memblock or bootmem are hard to reused after
> being released.
>
> This delays creating pdn by pci_devs_phb_init() from setup_arch() to
> core_initcall() so that they are allocated from slab. The memory consumed
> by pdn can be released to system without problem during PCI unplugging
> time. It indicates that pci_dn is unavailable in setup_arch() and the the
> fixup on pdn (like AGP's) can't be carried out that time. We have to do
> that in ppc_md.pcibios_root_bridge_prepare() on maple/pasemi/powermac
> platforms where/when the pdn is available.
>
> At the mean while, the EEH device is created when pdn is populated,
> meaning pdn and EEH device have same life cycle. In turn, we needn't call
> eeh_dev_init() to create EEH device explicitly.
>
> Signed-off-by: Gavin Shan

Uff. It would not hurt to mention that pcibios_root_bridge_prepare is
called from subsys_initcall() which is executed after core_initcall() so
the code flow does not change.

Have you checked if there is anything in between
core_initcall(pci_devs_phb_init) and subsys_initcall(pcibios_init) which
might need device tree nodes?

For example, subsys_initcall(pcibios_init) calls (eventually)
pnv_pci_ioda_fixup(), if we are unlucky and pcibios_init() (and therefore
pnv_pci_ioda_fixup() or what pseries/others do) is called before
pcibios_init() - won't we crash or something?

-- 
Alexey
Re: [PATCH 1/2] cpufreq: qoriq: Fix cooling device registration issue during suspend
> -----Original Message-----
> From: Viresh Kumar [mailto:viresh.ku...@linaro.org]
> Sent: Monday, April 18, 2016 6:33 PM
> To: Hongtao Jia
> Cc: linux...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; Scott Wood;
> Yuantian Tang
> Subject: Re: [PATCH 1/2] cpufreq: qoriq: Fix cooling device registration issue
> during suspend
>
> On 18-04-16, 15:59, Jia Hongtao wrote:
> > Cooling device is registered by ready callback. It's also invoked
> > while system resuming from sleep (Enabling non-boot cpus). Thus
> > cooling device may be multiple registered. Stop_cpu callback is
> > invoked during suspend (Disabling non-boot cpus). So matchable
> > unregistration is added to fix this issue.
> >
> > Signed-off-by: Jia Hongtao
> > ---
> >  drivers/cpufreq/qoriq-cpufreq.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/cpufreq/qoriq-cpufreq.c
> > b/drivers/cpufreq/qoriq-cpufreq.c index b23e525..1c2fdc1 100644
> > --- a/drivers/cpufreq/qoriq-cpufreq.c
> > +++ b/drivers/cpufreq/qoriq-cpufreq.c
> > @@ -305,6 +305,7 @@ static int __exit qoriq_cpufreq_cpu_exit(struct
> > cpufreq_policy *policy) {
> >  	struct cpu_data *data = policy->driver_data;
> >
> > +	cpufreq_cooling_unregister(data->cdev);
> >  	kfree(data->pclk);
> >  	kfree(data->table);
> >  	kfree(data);
> > @@ -323,6 +324,12 @@ static int qoriq_cpufreq_target(struct cpufreq_policy
> *policy,
> >  	return clk_set_parent(policy->clk, parent); }
> >
> > +static void qoriq_cpufreq_stop_cpu(struct cpufreq_policy *policy) {
> > +	struct cpu_data *cpud = policy->driver_data;
> > +
> > +	cpufreq_cooling_unregister(cpud->cdev);
> > +}
> >
> > static void qoriq_cpufreq_ready(struct cpufreq_policy *policy) { @@
> > -352,6 +359,7 @@ static struct cpufreq_driver qoriq_cpufreq_driver = {
> >  	.verify = cpufreq_generic_frequency_table_verify,
> >  	.target_index = qoriq_cpufreq_target,
> >  	.get = cpufreq_generic_get,
> > +	.stop_cpu = qoriq_cpufreq_stop_cpu,
> >  	.ready = qoriq_cpufreq_ready,
> >  	.attr = cpufreq_generic_attr,
> > };
>
> You don't need to do it from stop_cpu(), please use
> qoriq_cpufreq_cpu_exit() for this.

Thanks. The new patch will be submitted soon.

-Hongtao.

>
> --
> viresh
[PATCH V2] cpufreq: qoriq: Fix cooling device registration issue during suspend
Cooling device is registered by ready callback. It's also invoked while
system resuming from sleep (Enabling non-boot cpus). Thus cooling device
may be multiple registered. Matchable unregistration is added to exit
callback to fix this issue.

Signed-off-by: Jia Hongtao
---
Changes for V2:
* Using qoriq_cpufreq_cpu_exit() callback instead of adding stop_cpu().

 drivers/cpufreq/qoriq-cpufreq.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/qoriq-cpufreq.c b/drivers/cpufreq/qoriq-cpufreq.c
index b23e525..0b85f90 100644
--- a/drivers/cpufreq/qoriq-cpufreq.c
+++ b/drivers/cpufreq/qoriq-cpufreq.c
@@ -301,10 +301,11 @@ err_np:
 	return -ENODEV;
 }
 
-static int __exit qoriq_cpufreq_cpu_exit(struct cpufreq_policy *policy)
+static int qoriq_cpufreq_cpu_exit(struct cpufreq_policy *policy)
 {
 	struct cpu_data *data = policy->driver_data;
 
+	cpufreq_cooling_unregister(data->cdev);
 	kfree(data->pclk);
 	kfree(data->table);
 	kfree(data);
@@ -348,7 +349,7 @@ static struct cpufreq_driver qoriq_cpufreq_driver = {
 	.name		= "qoriq_cpufreq",
 	.flags		= CPUFREQ_CONST_LOOPS,
 	.init		= qoriq_cpufreq_cpu_init,
-	.exit		= __exit_p(qoriq_cpufreq_cpu_exit),
+	.exit		= qoriq_cpufreq_cpu_exit,
 	.verify		= cpufreq_generic_frequency_table_verify,
 	.target_index	= qoriq_cpufreq_target,
 	.get		= cpufreq_generic_get,
-- 
2.1.0.27.g96db324
[PATCH V2] powerpc: Implement {cmp}xchg for u8 and u16
From: Pan Xinhui

Implement xchg{u8,u16}{local,relaxed}, and
cmpxchg{u8,u16}{,local,acquire,relaxed}.

It works on all ppc.

Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Pan Xinhui
---
change from V1:
	rework totally.
---
 arch/powerpc/include/asm/cmpxchg.h | 83 ++
 1 file changed, 83 insertions(+)

diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 44efe73..79a1f45 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -7,6 +7,37 @@
 #include
 #include
 
+#ifdef __BIG_ENDIAN
+#define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
+#else
+#define BITOFF_CAL(size, off)	(off * BITS_PER_BYTE)
+#endif
+
+static __always_inline unsigned long
+__cmpxchg_u32_local(volatile unsigned int *p, unsigned long old,
+			unsigned long new);
+
+#define __XCHG_GEN(cmp, type, sfx, u32sfx, skip, v)			\
+static __always_inline u32						\
+__##cmp##xchg_##type##sfx(v void *ptr, u32 old, u32 new)		\
+{									\
+	int size = sizeof (type);					\
+	int off = (unsigned long)ptr % sizeof(u32);			\
+	volatile u32 *p = ptr - off;					\
+	int bitoff = BITOFF_CAL(size, off);				\
+	u32 bitmask = ((0x1 << size * BITS_PER_BYTE) - 1) << bitoff;	\
+	u32 oldv, newv;							\
+	u32 ret;							\
+	do {								\
+		oldv = READ_ONCE(*p);					\
+		ret = (oldv & bitmask) >> bitoff;			\
+		if (skip && ret != old)					\
+			break;						\
+		newv = (oldv & ~bitmask) | (new << bitoff);		\
+	} while (__cmpxchg_u32##u32sfx((v void*)p, oldv, newv) != oldv);\
+	return ret;							\
+}
+
 /*
  * Atomic exchange
  *
@@ -14,6 +45,19 @@
  * the previous value stored there.
  */
 
+#define XCHG_GEN(type, sfx, v)						\
+		__XCHG_GEN(_, type, sfx, _local, 0, v)			\
+static __always_inline u32 __xchg_##type##sfx(v void *p, u32 n)	\
+{									\
+	return ___xchg_##type##sfx(p, 0, n);				\
+}
+
+XCHG_GEN(u8, _local, volatile);
+XCHG_GEN(u8, _relaxed, );
+XCHG_GEN(u16, _local, volatile);
+XCHG_GEN(u16, _relaxed, );
+#undef XCHG_GEN
+
 static __always_inline unsigned long
 __xchg_u32_local(volatile void *p, unsigned long val)
 {
@@ -88,6 +132,10 @@ static __always_inline unsigned long
 __xchg_local(volatile void *ptr, unsigned long x, unsigned int size)
 {
 	switch (size) {
+	case 1:
+		return __xchg_u8_local(ptr, x);
+	case 2:
+		return __xchg_u16_local(ptr, x);
 	case 4:
 		return __xchg_u32_local(ptr, x);
 #ifdef CONFIG_PPC64
@@ -103,6 +151,10 @@ static __always_inline unsigned long
 __xchg_relaxed(void *ptr, unsigned long x, unsigned int size)
 {
 	switch (size) {
+	case 1:
+		return __xchg_u8_relaxed(ptr, x);
+	case 2:
+		return __xchg_u16_relaxed(ptr, x);
 	case 4:
 		return __xchg_u32_relaxed(ptr, x);
 #ifdef CONFIG_PPC64
@@ -226,6 +278,21 @@ __cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new)
 	return prev;
 }
 
+
+#define CMPXCHG_GEN(type, sfx, v)				\
+	__XCHG_GEN(cmp, type, sfx, sfx, 1, v)
+
+CMPXCHG_GEN(u8, , volatile);
+CMPXCHG_GEN(u8, _local, volatile);
+CMPXCHG_GEN(u8, _relaxed, );
+CMPXCHG_GEN(u8, _acquire, );
+CMPXCHG_GEN(u16, , volatile);
+CMPXCHG_GEN(u16, _local, volatile);
+CMPXCHG_GEN(u16, _relaxed, );
+CMPXCHG_GEN(u16, _acquire, );
+#undef CMPXCHG_GEN
+#undef __XCHG_GEN
+
 #ifdef CONFIG_PPC64
 static __always_inline unsigned long
 __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
@@ -316,6 +383,10 @@ __cmpxchg(volatile void *ptr, unsigned long old, unsigned long new,
 	  unsigned int size)
 {
 	switch (size) {
+	case 1:
+		return __cmpxchg_u8(ptr, old, new);
+	case 2:
+		return __cmpxchg_u16(ptr, old, new);
 	case 4:
 		return __cmpxchg_u32(ptr, old, new);
 #ifdef CONFIG_PPC64
@@ -332,6 +403,10 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
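The technique in this patch (emulating a 1- or 2-byte cmpxchg by looping on
a 32-bit cmpxchg of the containing aligned word) can be tried in userspace.
What follows is a minimal sketch under stated assumptions, not the kernel
code: it uses the GCC/Clang __atomic builtins in place of the
__cmpxchg_u32* helpers, detects endianness through the compiler's
__BYTE_ORDER__ macro, and the names bitoff_cal() and cmpxchg_small() are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit offset of a sub-word field inside its containing 32-bit word,
 * following the BITOFF_CAL idea. Endianness detection via __BYTE_ORDER__
 * is an assumption of this sketch (GCC and Clang define it).
 */
static unsigned int bitoff_cal(unsigned int size, unsigned int off)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return (unsigned int)(sizeof(uint32_t) - size - off) * 8;
#else
	(void)size;
	return off * 8;
#endif
}

/*
 * Hypothetical helper: compare-and-swap a 1- or 2-byte value at ptr by
 * retrying a 32-bit compare-and-swap on the aligned word that contains it.
 * Returns the value found in the field; the swap happened iff it equals old.
 */
static uint32_t cmpxchg_small(void *ptr, uint32_t old, uint32_t new_val,
			      unsigned int size)
{
	uintptr_t addr = (uintptr_t)ptr;
	unsigned int off = (unsigned int)(addr % sizeof(uint32_t));
	uint32_t *p = (uint32_t *)(addr - off);
	unsigned int bitoff;
	uint32_t mask, oldv, newv, cur;

	assert(size == 1 || size == 2);
	assert(off + size <= sizeof(uint32_t));	/* field must not straddle words */

	bitoff = bitoff_cal(size, off);
	mask = ((1u << (size * 8)) - 1) << bitoff;

	oldv = __atomic_load_n(p, __ATOMIC_RELAXED);
	for (;;) {
		cur = (oldv & mask) >> bitoff;
		if (cur != old)
			return cur;	/* comparison failed, nothing written */
		newv = (oldv & ~mask) | ((new_val << bitoff) & mask);
		/* On failure oldv is refreshed with the current word; retry. */
		if (__atomic_compare_exchange_n(p, &oldv, newv, 0,
						__ATOMIC_SEQ_CST,
						__ATOMIC_RELAXED))
			return cur;
	}
}
```

One deliberate difference from the macro above: this sketch masks the
shifted new value with the field mask, so stray high bits in new_val cannot
spill into neighbouring bytes. The retry loop is needed because another
thread may change a neighbouring byte of the same word between the load and
the 32-bit compare-and-swap.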
Re: [PATCH] cxl: static-ify variables to fix sparse warnings
Acked-by: Ian Munsie