Re: [PATCH tip/locking/core v4 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
On Wed, Oct 14, 2015 at 01:19:17PM -0700, Paul E. McKenney wrote:
>
> Am I missing something here? If not, it seems to me that you need
> the leading lwsync to instead be a sync.
>
> Of course, if I am not missing something, then this applies also to the
> value-returning RMW atomic operations that you pulled this pattern from.
> If so, it would seem that I didn't think through all the possibilities
> back when PPC_ATOMIC_EXIT_BARRIER moved to sync... In fact, I believe
> that I worried about the RMW atomic operation acting as a barrier,
> but not as the load/store itself. :-/
>

Paul, I know this may be difficult, but could you recall why the
__futex_atomic_op() and futex_atomic_cmpxchg_inatomic() also got
involved in the movement of PPC_ATOMIC_EXIT_BARRIER to "sync"?

I did some searching, but couldn't find the discussion of that patch.

I ask this because I recall Peter once brought up a discussion:

https://lkml.org/lkml/2015/8/26/596

Peter's conclusion seems to be that we could (though didn't want to)
live with futex atomics not being full barriers.


Peter, just to be clear, I'm not in favor of relaxing futex atomics. But
if I make PPC_ATOMIC_ENTRY_BARRIER a "sync", it will also strengthen the
futex atomics; I just wonder whether such strengthening is a -fix- or
not, considering that I want this patch to go to the -stable tree.

Of course, while waiting for your answer, I will try to figure this out
by myself ;-)

Regards,
Boqun

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation
On Mon, Oct 19, 2015 at 12:23:24PM +0200, Peter Zijlstra wrote:
> On Mon, Oct 19, 2015 at 09:17:18AM +0800, Boqun Feng wrote:
> > This is confusing me right now. ;-)
> >
> > Let's use a simple example for only one primitive. As I understand it,
> > if we say a primitive A is "fully ordered", we actually mean:
> >
> > 1. The memory operations preceding (in program order) A can't be
> >    reordered after the memory operations following (in PO) A.
> >
> > and
> >
> > 2. The memory operation(s) in A can't be reordered before the
> >    memory operations preceding (in PO) A, nor after the memory
> >    operations following (in PO) A.
> >
> > If we say A is a "full barrier", we actually mean:
> >
> > 1. The memory operations preceding (in program order) A can't be
> >    reordered after the memory operations following (in PO) A.
> >
> > and
> >
> > 2. The memory ordering guarantee in #1 is visible globally.
> >
> > Is that correct? Or is "full barrier" stronger than I understand,
> > i.e. is there a third property of "full barrier":
> >
> > 3. The memory operation(s) in A can't be reordered before the
> >    memory operations preceding (in PO) A, nor after the memory
> >    operations following (in PO) A.
> >
> > IOW, is "full barrier" a stronger version of "fully ordered" or not?
>
> Yes, that was how I used it.
>
> Now of course; the big question is do we want to promote this usage or
> come up with a different set of words describing this stuff.
>
> I think separating the ordering from the transitivity is useful, for we
> can then talk about and specify them independently.
>

Great idea!
> That is, we can say:
>
> LOAD-ACQUIRE: orders LOAD->{LOAD,STORE}
>               weak transitivity (RCpc)
>
> MB: orders {LOAD,STORE}->{LOAD,STORE} (fully ordered)
>     strong transitivity (RCsc)
>

It would be helpful to have this kind of description for each primitive
mentioned in memory-barriers.txt, which, IMO, is better than a
description like the following:

"""
Any atomic operation that modifies some state in memory and returns
information about the state (old or new) implies an SMP-conditional
general memory barrier (smp_mb()) on each side of the actual operation
(with the exception of
"""

I'm assuming that the arrow "->" stands for program order, and the word
"orders" means that a primitive guarantees some program order becomes
the memory operation order, so that the description above can be
rewritten as:

value-returning atomics:
    orders {LOAD,STORE}->RmW(atomic operation)->{LOAD,STORE}
    strong transitivity

which is much simpler and clearer for discussion and reasoning.

Regards,
Boqun

> etc..
>
> Also, in the above I used weak and strong transitivity, but that too is
> of course up for grabs.
[PATCH V6 1/6] powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BAR
On PHB_IODA2, we enable SRIOV devices by mapping the IOV BAR with M64 BARs. If a SRIOV device's IOV BAR is not 64bit-prefetchable, it is not assigned from the 64bit prefetchable window, which means an M64 BAR can't work on it.

The reason is that PCI bridges support only 2 windows, and the kernel code programs bridges so that one window is 32bit-nonprefetchable and the other one is 64bit-prefetchable. So if a device's IOV BAR is 64bit and non-prefetchable, it will be mapped into 32bit space and therefore M64 cannot be used for it.

This patch makes this explicit and truncates the IOV resource in this case to save MMIO space.

Signed-off-by: Wei Yang
Reviewed-by: Gavin Shan
Acked-by: Alexey Kardashevskiy
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 85cbc96..f042fed 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -908,9 +908,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 		if (!res->flags || !res->parent)
 			continue;
 
-		if (!pnv_pci_is_mem_pref_64(res->flags))
-			continue;
-
 		/*
 		 * The actual IOV BAR range is determined by the start address
 		 * and the actual size for num_vfs VFs BAR. This check is to
@@ -939,9 +936,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 		if (!res->flags || !res->parent)
 			continue;
 
-		if (!pnv_pci_is_mem_pref_64(res->flags))
-			continue;
-
 		size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
 		res2 = *res;
 		res->start += size * offset;
@@ -1221,9 +1215,6 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 		if (!res->flags || !res->parent)
 			continue;
 
-		if (!pnv_pci_is_mem_pref_64(res->flags))
-			continue;
-
 		for (j = 0; j < vf_groups; j++) {
 			do {
 				win = find_next_zero_bit(&phb->ioda.m64_bar_alloc,
@@ -1510,6 +1501,12 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 	pdn = pci_get_pdn(pdev);
 
 	if (phb->type == PNV_PHB_IODA2) {
+		if (!pdn->vfs_expanded) {
+			dev_info(&pdev->dev, "don't support this SRIOV device"
+				" with non 64bit-prefetchable IOV BAR\n");
+			return -ENOSPC;
+		}
+
 		/* Calculate available PE for required VFs */
 		mutex_lock(&phb->ioda.pe_alloc_mutex);
 		pdn->offset = bitmap_find_next_zero_area(
@@ -2775,9 +2772,10 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		if (!res->flags || res->parent)
 			continue;
 		if (!pnv_pci_is_mem_pref_64(res->flags)) {
-			dev_warn(&pdev->dev, " non M64 VF BAR%d: %pR\n",
+			dev_warn(&pdev->dev, "Don't support SR-IOV with"
+				" non M64 VF BAR%d: %pR.\n",
 				 i, res);
-			continue;
+			goto truncate_iov;
 		}
 
 		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
@@ -2796,11 +2794,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		res = &pdev->resource[i + PCI_IOV_RESOURCES];
 		if (!res->flags || res->parent)
 			continue;
-		if (!pnv_pci_is_mem_pref_64(res->flags)) {
-			dev_warn(&pdev->dev, "Skipping expanding VF BAR%d: %pR\n",
-				 i, res);
-			continue;
-		}
 		dev_dbg(&pdev->dev, " Fixing VF BAR%d: %pR to\n", i, res);
 		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
@@ -2810,6 +2803,15 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 			 i, res, mul);
 	}
 	pdn->vfs_expanded = mul;
+
+	return;
+
+truncate_iov:
+	/* To save MMIO space, IOV BAR is truncated. */
+	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+		res = &pdev->resource[i + PCI_IOV_RESOURCES];
+		res->end = res->start - 1;
+	}
 }
 #endif /* CONFIG_PCI_IOV */
-- 
2.5.0
[PATCH V6 3/6] powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR
In the current implementation, when a VF BAR is bigger than 64MB, it uses 4 M64 BARs in Single PE mode to cover the number of VFs required to be enabled. By doing so, several VFs would be in one VF group, which leads to interference between VFs in the same group.

This patch changes the design by using one M64 BAR in Single PE mode for one VF BAR. This gives absolute isolation for VFs.

Based on Gavin's comments, m64_wins is renamed to m64_map, which means the index number of the M64 BAR used to map the VF BAR. This patch also makes sure the VF BAR size is bigger than 32MB when the M64 BAR is used in Single PE mode.

Signed-off-by: Wei Yang
Reviewed-by: Gavin Shan
Acked-by: Alexey Kardashevskiy
---
 arch/powerpc/include/asm/pci-bridge.h     |   5 +-
 arch/powerpc/platforms/powernv/pci-ioda.c | 177 ++++++++++++------------------
 2 files changed, 75 insertions(+), 107 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 712add5..8aeba4c 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -214,10 +214,9 @@ struct pci_dn {
 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
 	u16     num_vfs;		/* number of VFs enabled */
 	int     offset;			/* PE# for the first VF PE */
-#define M64_PER_IOV 4
-	int     m64_per_iov;
+	bool    m64_single_mode;	/* Use M64 BAR in Single Mode */
 #define IODA_INVALID_M64        (-1)
-	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
+	int     (*m64_map)[PCI_SRIOV_NUM_BARS];
 #endif /* CONFIG_PCI_IOV */
 #endif
 	struct list_head child_list;
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 629ab1b..dc64026 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1148,29 +1148,36 @@ static void pnv_pci_ioda_setup_PEs(void)
 }
 
 #ifdef CONFIG_PCI_IOV
-static int pnv_pci_vf_release_m64(struct pci_dev *pdev)
+static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
 {
 	struct pci_bus        *bus;
 	struct pci_controller *hose;
 	struct pnv_phb        *phb;
 	struct pci_dn         *pdn;
 	int                    i, j;
+	int                    m64_bars;
 
 	bus = pdev->bus;
 	hose = pci_bus_to_host(bus);
 	phb = hose->private_data;
 	pdn = pci_get_pdn(pdev);
 
+	if (pdn->m64_single_mode)
+		m64_bars = num_vfs;
+	else
+		m64_bars = 1;
+
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-		for (j = 0; j < M64_PER_IOV; j++) {
-			if (pdn->m64_wins[i][j] == IODA_INVALID_M64)
+		for (j = 0; j < m64_bars; j++) {
+			if (pdn->m64_map[j][i] == IODA_INVALID_M64)
 				continue;
 			opal_pci_phb_mmio_enable(phb->opal_id,
-				OPAL_M64_WINDOW_TYPE, pdn->m64_wins[i][j], 0);
-			clear_bit(pdn->m64_wins[i][j], &phb->ioda.m64_bar_alloc);
-			pdn->m64_wins[i][j] = IODA_INVALID_M64;
+				OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 0);
+			clear_bit(pdn->m64_map[j][i], &phb->ioda.m64_bar_alloc);
+			pdn->m64_map[j][i] = IODA_INVALID_M64;
 		}
 
+	kfree(pdn->m64_map);
 	return 0;
 }
 
@@ -1187,8 +1194,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 	int                    total_vfs;
 	resource_size_t        size, start;
 	int                    pe_num;
-	int                    vf_groups;
-	int                    vf_per_group;
+	int                    m64_bars;
 
 	bus = pdev->bus;
 	hose = pci_bus_to_host(bus);
@@ -1196,26 +1202,26 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 	pdn = pci_get_pdn(pdev);
 	total_vfs = pci_sriov_get_totalvfs(pdev);
 
-	/* Initialize the m64_wins to IODA_INVALID_M64 */
-	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-		for (j = 0; j < M64_PER_IOV; j++)
-			pdn->m64_wins[i][j] = IODA_INVALID_M64;
+	if (pdn->m64_single_mode)
+		m64_bars = num_vfs;
+	else
+		m64_bars = 1;
+
+	pdn->m64_map = kmalloc(sizeof(*pdn->m64_map) * m64_bars, GFP_KERNEL);
+	if (!pdn->m64_map)
+		return -ENOMEM;
+	/* Initialize the m64_map to IODA_INVALID_M64 */
+	for (i = 0; i < m64_bars ; i++)
+		for (j = 0; j < PCI_SRIOV_NUM_BARS; j++)
+			pdn->m64_map[i][j] = IODA_INVALID_M64;
 
-	if (pdn->m64_per_iov == M64_PER_IOV) {
-		vf_groups = (num_vfs <= M64_PER_IOV) ? num_vfs: M64_PER_IOV;
-
[PATCH V6 2/6] powerpc/powernv: simplify the calculation of iov resource alignment
The alignment of the IOV BAR on the PowerNV platform is the total size of the IOV BAR. No matter whether the IOV BAR is extended by roundup_pow_of_two(total_vfs) or by the max PE number (256), the total size can be calculated as (vfs_expanded * VF_BAR_size).

This patch simplifies pnv_pci_iov_resource_alignment() by removing the first case.

Signed-off-by: Wei Yang
Reviewed-by: Gavin Shan
Acked-by: Alexey Kardashevskiy
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index f042fed..629ab1b 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2997,17 +2997,21 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
 {
 	struct pci_dn *pdn = pci_get_pdn(pdev);
-	resource_size_t align, iov_align;
-
-	iov_align = resource_size(&pdev->resource[resno]);
-	if (iov_align)
-		return iov_align;
+	resource_size_t align;
 
+	/*
+	 * On PowerNV platform, IOV BAR is mapped by M64 BAR to enable the
+	 * SR-IOV. While from hardware perspective, the range mapped by M64
+	 * BAR should be size aligned.
+	 *
+	 * This function returns the total IOV BAR size if M64 BAR is in
+	 * Shared PE mode or just VF BAR size if not.
+	 */
 	align = pci_iov_resource_size(pdev, resno);
-	if (pdn->vfs_expanded)
-		return pdn->vfs_expanded * align;
+	if (!pdn->vfs_expanded)
+		return align;
 
-	return align;
+	return pdn->vfs_expanded * align;
 }
 #endif /* CONFIG_PCI_IOV */
-- 
2.5.0
[PATCH V6 6/6] powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE mode
When M64 BAR is set to Single PE mode, the PE# assigned to VF could be sparse. This patch restructures the code to allocate sparse PE# for VFs when M64 BAR is set to Single PE mode. Also it rename the offset to pe_num_map to reflect the content is the PE number. Signed-off-by: Wei YangReviewed-by: Gavin Shan Acked-by: Alexey Kardashevskiy --- arch/powerpc/include/asm/pci-bridge.h | 2 +- arch/powerpc/platforms/powernv/pci-ioda.c | 81 +++ 2 files changed, 63 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h index 8aeba4c..b3a226b 100644 --- a/arch/powerpc/include/asm/pci-bridge.h +++ b/arch/powerpc/include/asm/pci-bridge.h @@ -213,7 +213,7 @@ struct pci_dn { #ifdef CONFIG_PCI_IOV u16 vfs_expanded; /* number of VFs IOV BAR expanded */ u16 num_vfs;/* number of VFs enabled*/ - int offset; /* PE# for the first VF PE */ + int *pe_num_map;/* PE# for the first VF PE or array */ boolm64_single_mode;/* Use M64 BAR in Single Mode */ #define IODA_INVALID_M64(-1) int (*m64_map)[PCI_SRIOV_NUM_BARS]; diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index a8c55f5..b8dfd31 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -1243,7 +1243,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs) /* Map the M64 here */ if (pdn->m64_single_mode) { - pe_num = pdn->offset + j; + pe_num = pdn->pe_num_map[j]; rc = opal_pci_map_pe_mmio_window(phb->opal_id, pe_num, OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 0); @@ -1347,7 +1347,7 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev) struct pnv_phb*phb; struct pci_dn *pdn; struct pci_sriov *iov; - u16 num_vfs; + u16num_vfs, i; bus = pdev->bus; hose = pci_bus_to_host(bus); @@ -1361,14 +1361,21 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev) if (phb->type == PNV_PHB_IODA2) { if (!pdn->m64_single_mode) - pnv_pci_vf_resource_shift(pdev, -pdn->offset); + 
pnv_pci_vf_resource_shift(pdev, -*pdn->pe_num_map); /* Release M64 windows */ pnv_pci_vf_release_m64(pdev, num_vfs); /* Release PE numbers */ - bitmap_clear(phb->ioda.pe_alloc, pdn->offset, num_vfs); - pdn->offset = 0; + if (pdn->m64_single_mode) { + for (i = 0; i < num_vfs; i++) { + if (pdn->pe_num_map[i] != IODA_INVALID_PE) + pnv_ioda_free_pe(phb, pdn->pe_num_map[i]); + } + } else + bitmap_clear(phb->ioda.pe_alloc, *pdn->pe_num_map, num_vfs); + /* Releasing pe_num_map */ + kfree(pdn->pe_num_map); } } @@ -1394,7 +1401,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs) /* Reserve PE for each VF */ for (vf_index = 0; vf_index < num_vfs; vf_index++) { - pe_num = pdn->offset + vf_index; + if (pdn->m64_single_mode) + pe_num = pdn->pe_num_map[vf_index]; + else + pe_num = *pdn->pe_num_map + vf_index; pe = >ioda.pe_array[pe_num]; pe->pe_number = pe_num; @@ -1436,6 +1446,7 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs) struct pnv_phb*phb; struct pci_dn *pdn; intret; + u16i; bus = pdev->bus; hose = pci_bus_to_host(bus); @@ -1458,20 +1469,44 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs) return -EBUSY; } + /* Allocating pe_num_map */ + if (pdn->m64_single_mode) + pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map) * num_vfs, + GFP_KERNEL); + else + pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map), GFP_KERNEL); + + if (!pdn->pe_num_map) + return -ENOMEM; + + if (pdn->m64_single_mode) + for (i = 0; i < num_vfs; i++) + pdn->pe_num_map[i] = IODA_INVALID_PE; + /*
Re: [PATCH tip/locking/core v4 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
On Tue, Oct 20, 2015 at 03:15:32PM +0800, Boqun Feng wrote:
> On Wed, Oct 14, 2015 at 01:19:17PM -0700, Paul E. McKenney wrote:
> >
> > Am I missing something here? If not, it seems to me that you need
> > the leading lwsync to instead be a sync.
> >
> > Of course, if I am not missing something, then this applies also to the
> > value-returning RMW atomic operations that you pulled this pattern from.
> > If so, it would seem that I didn't think through all the possibilities
> > back when PPC_ATOMIC_EXIT_BARRIER moved to sync... In fact, I believe
> > that I worried about the RMW atomic operation acting as a barrier,
> > but not as the load/store itself. :-/
> >
>
> Paul, I know this may be difficult, but could you recall why the
> __futex_atomic_op() and futex_atomic_cmpxchg_inatomic() also got
> involved into the movement of PPC_ATOMIC_EXIT_BARRIER to "sync"?
>
> I did some search, but couldn't find the discussion of that patch.
>
> I ask this because I recall Peter once brought up a discussion:
>
> https://lkml.org/lkml/2015/8/26/596
>
> Peter's conclusion seems to be that we could (though didn't want to) live
> with futex atomics not being full barriers.
>
>
> Peter, just to be clear, I'm not in favor of relaxing futex atomics. But if
> I make PPC_ATOMIC_ENTRY_BARRIER being "sync", it will also strengthen
> the futex atomics, just wonder whether such strengthening is a -fix- or
> not, considering that I want this patch to go to -stable tree.

So Linus argued that since we only need to order against user accesses
(true) and priv changes typically imply strong barriers (open), we might
want to allow archs to rely on those instead of mandating they have
explicit barriers in the futex primitives.

And I indeed forgot to follow up on that discussion.

So: does PPC imply full barriers on user<->kernel boundaries? If so,
it's not critical to the futex atomic implementations what extra
barriers are added.
If not; then strengthening the futex ops is indeed (probably) a good thing :-)
[PATCH V6 0/6] Redesign SR-IOV on PowerNV
In the original design, it tries to group VFs to enable a larger number of VFs in the system when a VF BAR is bigger than 64MB. This design has a flaw: an error on one VF will interfere with other VFs in the same group.

This patch series changes that design by using an M64 BAR in Single PE mode to cover only one VF BAR. By doing so, it gives absolute isolation between VFs.

v6:
 * add the minimum size check when M64 BAR is in Single PE mode
 * truncate IOV BAR when powernv can't handle it

v5:
 * rebase on top of v4.3-rc4, with commit 68230242cdb "net/mlx4_core: Add
   port attribute when tracking counters" reverted
 * add some reasoning in the change log of Patch 1
 * make pnv_pci_iov_resource_alignment() easier to read
 * initialize pe_num_map[] just after it is allocated
 * tested ssh from guest to host via VF, then shut down the guest
 * no code change

v4:
 * rebase the code on top of v4.2-rc7
 * switch back to using the dynamic version of pe_num_map and m64_map
 * split the memory allocation and PE assignment of pe_num_map to make it
   easier to read
 * check pe_num_map value before freeing PE
 * add the rename reason for pe_num_map and m64_map in the change log

v3:
 * return -ENOSPC when a VF has a non-64bit prefetchable BAR
 * rename offset to pe_num_map and define it statically
 * change commit log based on comments
 * define m64_map statically

v2:
 * clean up iov bar alignment calculation
 * change m64s to m64_bars
 * add a field to represent that M64 Single PE mode will be used
 * change m64_wins to m64_map
 * calculate the gate instead of hard coding it
 * dynamically allocate m64_map
 * dynamically allocate PE#
 * add a case to calculate iov bar alignment when M64 Single PE is used
 * when M64 Single PE is used, compare num_vfs with the number of M64 BARs
   available in the system first

Wei Yang (6):
  powerpc/powernv: don't enable SRIOV when VF BAR has non
    64bit-prefetchable BAR
  powerpc/powernv: simplify the calculation of iov resource alignment
  powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR
  powerpc/powernv: replace the hard coded boundary with gate
  powerpc/powernv: boundary the total VF BAR size instead of the
    individual one
  powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE
    mode

 arch/powerpc/include/asm/pci-bridge.h     |   7 +-
 arch/powerpc/platforms/powernv/pci-ioda.c | 346 ++++++++++++++++--------------
 2 files changed, 191 insertions(+), 162 deletions(-)

-- 
2.5.0
[PATCH V6 5/6] powerpc/powernv: boundary the total VF BAR size instead of the individual one
Each VF could have 6 BARs at most. When the total BAR size exceeds the gate, after expanding it will also exhaust the M64 window.

This patch limits the boundary by checking the total VF BAR size instead of the individual BAR.

Signed-off-by: Wei Yang
Reviewed-by: Gavin Shan
Acked-by: Alexey Kardashevskiy
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index de5a194..a8c55f5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2701,7 +2701,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 	const resource_size_t gate = phb->ioda.m64_segsize >> 2;
 	struct resource *res;
 	int i;
-	resource_size_t size;
+	resource_size_t size, total_vf_bar_sz;
 	struct pci_dn *pdn;
 	int mul, total_vfs;
@@ -2714,6 +2714,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 	total_vfs = pci_sriov_get_totalvfs(pdev);
 	mul = phb->ioda.total_pe;
+	total_vf_bar_sz = 0;
 
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
 		res = &pdev->resource[i + PCI_IOV_RESOURCES];
@@ -2726,7 +2727,8 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 			goto truncate_iov;
 		}
 
-		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
+		total_vf_bar_sz += pci_iov_resource_size(pdev,
+				i + PCI_IOV_RESOURCES);
 
 		/*
 		 * If bigger than quarter of M64 segment size, just round up
 		 * power of two.
 		 *
 		 * Generally, one M64 BAR maps one IOV BAR. To avoid conflict
 		 * with other devices, IOV BAR size is expanded to be
 		 * (total_pe * VF_BAR_size). When VF_BAR_size is half of M64
 		 * segment size, the expanded size would equal to half of the
 		 * whole M64 space size, which will exhaust the M64 Space and
 		 * limit the system flexibility. This is a design decision to
 		 * set the boundary to quarter of the M64 segment size.
 		 */
-		if (size > gate) {
-			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size "
-				"is bigger than %lld, roundup power2\n",
-				 i, res, gate);
+		if (total_vf_bar_sz > gate) {
 			mul = roundup_pow_of_two(total_vfs);
+			dev_info(&pdev->dev,
+				"VF BAR Total IOV size %llx > %llx, roundup to %d VFs\n",
+				total_vf_bar_sz, gate, mul);
 			pdn->m64_single_mode = true;
 			break;
 		}
-- 
2.5.0
[PATCH V6 4/6] powerpc/powernv: replace the hard coded boundary with gate
At the moment the 64bit-prefetchable window can be at most 64GB, which is currently obtained from the device tree. This means that in shared mode the maximum supported VF BAR size is 64GB/256 = 256MB, and a VF BAR of that size could exhaust the whole 64bit-prefetchable window. As a design decision, a boundary of 64MB is set on the VF BAR size. Since a 64MB VF BAR would occupy a quarter of the 64bit-prefetchable window, this is affordable.

This patch replaces the magic limit of 64MB with a "gate", which is 1/4 of the M64 segment size (m64_segsize >> 2), and adds a comment to explain the reason for it.

Signed-off-by: Wei Yang
Reviewed-by: Gavin Shan
Acked-by: Alexey Kardashevskiy
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index dc64026..de5a194 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2696,8 +2696,9 @@ static void pnv_pci_init_ioda_msis(struct pnv_phb *phb) { }
 #ifdef CONFIG_PCI_IOV
 static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 {
-	struct pci_controller *hose;
-	struct pnv_phb *phb;
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	const resource_size_t gate = phb->ioda.m64_segsize >> 2;
 	struct resource *res;
 	int i;
 	resource_size_t size;
@@ -2707,9 +2708,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 	if (!pdev->is_physfn || pdev->is_added)
 		return;
 
-	hose = pci_bus_to_host(pdev->bus);
-	phb = hose->private_data;
-
 	pdn = pci_get_pdn(pdev);
 	pdn->vfs_expanded = 0;
 	pdn->m64_single_mode = false;
@@ -2730,10 +2728,22 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
 
-		/* bigger than 64M */
-		if (size > (1 << 26)) {
-			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size is bigger than 64M, roundup power2\n",
-				 i, res);
+		/*
+		 * If bigger than quarter of M64 segment size, just round up
+		 * power of two.
+		 *
+		 * Generally, one M64 BAR maps one IOV BAR. To avoid conflict
+		 * with other devices, IOV BAR size is expanded to be
+		 * (total_pe * VF_BAR_size). When VF_BAR_size is half of M64
+		 * segment size, the expanded size would equal to half of the
+		 * whole M64 space size, which will exhaust the M64 Space and
+		 * limit the system flexibility. This is a design decision to
+		 * set the boundary to quarter of the M64 segment size.
+		 */
+		if (size > gate) {
+			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size "
+				"is bigger than %lld, roundup power2\n",
+				 i, res, gate);
 			mul = roundup_pow_of_two(total_vfs);
 			pdn->m64_single_mode = true;
 			break;
-- 
2.5.0
Re: [RFC][PATCH 3/3]perf/powerpc :add support for sampling intr machine state
On Tuesday 20 October 2015 09:50 AM, Madhavan Srinivasan wrote: On Monday 19 October 2015 05:48 PM, Anju T wrote: From: AnjuThe registers to sample are passed through the sample_regs_intr bitmask. The name and bit position for each register is defined in asm/perf_regs.h. This feature can be enabled by using -I option with perf record command. To display the sampled register values use perf script -D. The kernel uses the "PERF" register ids to find offset of the register in 'struct pt_regs'. CONFIG_HAVE_PERF_REGS will enable sampling of the interrupted machine state. Signed-off-by: Anju T --- arch/powerpc/Kconfig | 1 + arch/powerpc/perf/Makefile| 2 +- arch/powerpc/perf/perf_regs.c | 85 +++ tools/perf/config/Makefile| 4 ++ 4 files changed, 91 insertions(+), 1 deletion(-) create mode 100644 arch/powerpc/perf/perf_regs.c diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 9a7057e..c4ce60d 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -119,6 +119,7 @@ config PPC select GENERIC_ATOMIC64 if PPC32 select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE select HAVE_PERF_EVENTS + select HAVE_PERF_REGS select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_HW_BREAKPOINT if PERF_EVENTS && PPC_BOOK3S_64 select ARCH_WANT_IPC_PARSE_VERSION diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile index f9c083a..8e7f545 100644 --- a/arch/powerpc/perf/Makefile +++ b/arch/powerpc/perf/Makefile @@ -7,7 +7,7 @@ obj64-$(CONFIG_PPC_PERF_CTRS) += power4-pmu.o ppc970-pmu.o power5-pmu.o \ power5+-pmu.o power6-pmu.o power7-pmu.o \ power8-pmu.o obj32-$(CONFIG_PPC_PERF_CTRS) += mpc7450-pmu.o - +obj-$(CONFIG_PERF_EVENTS) += perf_regs.o obj-$(CONFIG_FSL_EMB_PERF_EVENT) += core-fsl-emb.o obj-$(CONFIG_FSL_EMB_PERF_EVENT_E500) += e500-pmu.o e6500-pmu.o diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c new file mode 100644 index 000..7a71de2 --- /dev/null +++ b/arch/powerpc/perf/perf_regs.c @@ -0,0 +1,85 @@ +#include +#include +#include +#include 
+#include +#include +#include +#include + +#define PT_REGS_OFFSET(id, r) [id] = offsetof(struct pt_regs, r) + +#define REG_RESERVED (~((1ULL << PERF_REG_POWERPC_MAX) - 1)) + +static unsigned int pt_regs_offset[PERF_REG_POWERPC_MAX] = { + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR0, gpr[0]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR1, gpr[1]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR2, gpr[2]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR3, gpr[3]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR4, gpr[4]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR5, gpr[5]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR6, gpr[6]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR7, gpr[7]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR8, gpr[8]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR9, gpr[9]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR10, gpr[10]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR11, gpr[11]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR12, gpr[12]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR13, gpr[13]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR14, gpr[14]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR15, gpr[15]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR16, gpr[16]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR17, gpr[17]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR18, gpr[18]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR19, gpr[19]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR20, gpr[20]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR21, gpr[21]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR22, gpr[22]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR23, gpr[23]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR24, gpr[24]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR25, gpr[25]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR26, gpr[26]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR27, gpr[27]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR28, gpr[28]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR29, gpr[29]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR30, gpr[30]), + PT_REGS_OFFSET(PERF_REG_POWERPC_GPR31, gpr[31]), + PT_REGS_OFFSET(PERF_REG_POWERPC_NIP, nip), + PT_REGS_OFFSET(PERF_REG_POWERPC_MSR, msr), + PT_REGS_OFFSET(PERF_REG_POWERPC_ORIG_R3, orig_gpr3), + 
PT_REGS_OFFSET(PERF_REG_POWERPC_CTR, ctr), + PT_REGS_OFFSET(PERF_REG_POWERPC_LNK, link), + PT_REGS_OFFSET(PERF_REG_POWERPC_XER, xer), + PT_REGS_OFFSET(PERF_REG_POWERPC_CCR, ccr), +#ifdef __powerpc64__ + PT_REGS_OFFSET(PERF_REG_POWERPC_SOFTE, softe), +#else + PT_REGS_OFFSET(PERF_REG_POWERPC_MQ, mq), +#endif + PT_REGS_OFFSET(PERF_REG_POWERPC_TRAP, trap), + PT_REGS_OFFSET(PERF_REG_POWERPC_DAR, dar), + PT_REGS_OFFSET(PERF_REG_POWERPC_DSISR, dsisr), +
Re: [PATCH 1/3] perf/powerpc:add ability to sample intr machine state in power
Hi maddy,

On Tuesday 20 October 2015 09:46 AM, Madhavan Srinivasan wrote:

  On Monday 19 October 2015 05:48 PM, Anju T wrote:

    From: Anju

    The enum definition assigns an 'id' to each register in power.

  I guess it should be "each register in "struct pt_regs" of arch/powerpc

Right, that seems better. Will change the description like that.
Thanks a lot for reviewing the patch.

    The order of these values in the enum definition is based on the
    corresponding macros in arch/powerpc/include/uapi/asm/ptrace.h.

    Signed-off-by: Anju T
    ---
     arch/powerpc/include/uapi/asm/perf_regs.h | 55 +++
     1 file changed, 55 insertions(+)
     create mode 100644 arch/powerpc/include/uapi/asm/perf_regs.h

    diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
    new file mode 100644
    index 000..b97727c
    --- /dev/null
    +++ b/arch/powerpc/include/uapi/asm/perf_regs.h
    @@ -0,0 +1,55 @@
    +#ifndef _ASM_POWERPC_PERF_REGS_H
    +#define _ASM_POWERPC_PERF_REGS_H
    +
    +enum perf_event_powerpc_regs {
    +	PERF_REG_POWERPC_GPR0,
    +	PERF_REG_POWERPC_GPR1,
    +	PERF_REG_POWERPC_GPR2,
    +	PERF_REG_POWERPC_GPR3,
    +	PERF_REG_POWERPC_GPR4,
    +	PERF_REG_POWERPC_GPR5,
    +	PERF_REG_POWERPC_GPR6,
    +	PERF_REG_POWERPC_GPR7,
    +	PERF_REG_POWERPC_GPR8,
    +	PERF_REG_POWERPC_GPR9,
    +	PERF_REG_POWERPC_GPR10,
    +	PERF_REG_POWERPC_GPR11,
    +	PERF_REG_POWERPC_GPR12,
    +	PERF_REG_POWERPC_GPR13,
    +	PERF_REG_POWERPC_GPR14,
    +	PERF_REG_POWERPC_GPR15,
    +	PERF_REG_POWERPC_GPR16,
    +	PERF_REG_POWERPC_GPR17,
    +	PERF_REG_POWERPC_GPR18,
    +	PERF_REG_POWERPC_GPR19,
    +	PERF_REG_POWERPC_GPR20,
    +	PERF_REG_POWERPC_GPR21,
    +	PERF_REG_POWERPC_GPR22,
    +	PERF_REG_POWERPC_GPR23,
    +	PERF_REG_POWERPC_GPR24,
    +	PERF_REG_POWERPC_GPR25,
    +	PERF_REG_POWERPC_GPR26,
    +	PERF_REG_POWERPC_GPR27,
    +	PERF_REG_POWERPC_GPR28,
    +	PERF_REG_POWERPC_GPR29,
    +	PERF_REG_POWERPC_GPR30,
    +	PERF_REG_POWERPC_GPR31,
    +	PERF_REG_POWERPC_NIP,
    +	PERF_REG_POWERPC_MSR,
    +	PERF_REG_POWERPC_ORIG_R3,
    +	PERF_REG_POWERPC_CTR,
    +	PERF_REG_POWERPC_LNK,
    +	PERF_REG_POWERPC_XER,
    +	PERF_REG_POWERPC_CCR,
    +#ifdef __powerpc64__
    +	PERF_REG_POWERPC_SOFTE,
    +#else
    +	PERF_REG_POWERPC_MQ,
    +#endif
    +	PERF_REG_POWERPC_TRAP,
    +	PERF_REG_POWERPC_DAR,
    +	PERF_REG_POWERPC_DSISR,
    +	PERF_REG_POWERPC_RESULT,
    +	PERF_REG_POWERPC_MAX,
    +};
    +#endif /* _ASM_POWERPC_PERF_REGS_H */

Thanks
Anju

___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: linux-next: build warning after merge of the powerpc tree
On Tue, 2015-10-20 at 16:21 +1100, Stephen Rothwell wrote:
> Hi all,
>
> After merging the powerpc tree, today's linux-next build (powerpc
> allyesconfig) produced this warning:
>
> WARNING: vmlinux.o(.text+0x9367c): Section mismatch in reference from the
> function .msi_bitmap_alloc() to the function
> .init.text:.memblock_virt_alloc_try_nid()
> The function .msi_bitmap_alloc() references
> the function __init .memblock_virt_alloc_try_nid().
> This is often because .msi_bitmap_alloc lacks a __init
> annotation or the annotation of .memblock_virt_alloc_try_nid is wrong.
>
> Introduced (probably) by commit
>
> cb2d3883c603 ("powerpc/msi: Free the bitmap if it was slab allocated")

Yeah that's correct, though it should be safe in practice.

I'm not sure why you only saw that now though, the patch has been in next
since the 13th of October.

cheers
Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation
On Mon, Oct 12, 2015 at 04:30:48PM -0700, Paul E. McKenney wrote: > On Fri, Oct 09, 2015 at 07:33:28PM +0100, Will Deacon wrote: > > On Fri, Oct 09, 2015 at 10:43:27AM -0700, Paul E. McKenney wrote: > > > On Fri, Oct 09, 2015 at 10:51:29AM +0100, Will Deacon wrote: [snip] > > > > > We could also include a link to the ppcmem/herd web frontends and your > > > > lwn.net article. (ppcmem is already linked, but it's not obvious that > > > > you can run litmus tests in your browser). > > > > > > I bet that the URLs for the web frontends are not stable long term. > > > Don't get me wrong, PPCMEM/ARMMEM has been there for me for a goodly > > > number of years, but professors do occasionally move from one institution > > > to another. For but one example, Susmit Sarkar is now at University > > > of St. Andrews rather than at Cambridge. > > > > > > So to make this work, we probably need to be thinking in terms of > > > asking the researchers for permission to include their ocaml code in the > > > Linux-kernel source tree. I would be strongly in favor of this, actually. > > > > > > Thoughts? > > > > I'm extremely hesitant to import a bunch of dubiously licensed, academic > > ocaml code into the kernel. Even if we did, who would maintain it? > > > > A better solution might be to host a mirror of the code on kernel.org, > > along with a web front-end for people to play with (the tests we're talking > > about here do seem to run ok in my browser). > > I am not too worried about how this happens, but we should avoid > constraining the work of our academic partners. The reason I was thinking > in terms of in the kernel was to avoid version-synchronization issues. > "Wait, this is Linux kernel v4.17, which means that you need to use > version 8.3.5.1 of the tooling... And with these four patches as well." > Maybe including only the models' code(arm.cat, ppc.cat, etc.) 
into the kernel, rather than the whole code base, could also solve the
version-synchronization problem to some degree, and avoid having to
maintain the whole tool? I'm assuming that changes to the verifier's
code, as opposed to the models' code, are unlikely to change the result
of a litmus test.

Regards,
Boqun
Re: [PATCH 0/5] drivers/tty: make more bool drivers explicitly non-modular
On 18/10/2015 at 18:21:13 -0400, Paul Gortmaker wrote:
> The one common thread here for all the patches is that we also
> scrap the .remove functions which would only be used for module
> unload (impossible) and driver unbind. For the drivers here, there
> doesn't seem to be a sensible unbind use case (vs. e.g. a multiport
> PCI ethernet driver where one port is unbound and passed through to
> a kvm guest or similar). Hence we just explicitly disallow any
> driver unbind operations to help prevent root from doing something
> illogical to the machine that they could have done previously.
>
> We've already done this for drivers/tty/serial/mpsc.c previously.
>
> Build tested for allmodconfig on ARM64 and powerpc for tty/tty-testing.

So, how does this actually build test atmel_serial? A proper solution
would be to actually make it a tristate and allow building as a module.
I think it currently fails because of console_initcall() but that is
certainly fixable.

-- 
Alexandre Belloni, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
[RFC PATCH 5/7] powerpc/mm: update frag size
--- arch/powerpc/include/asm/book3s/64/hash-64k.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h index 5062c6d423fd..a28dbfe2baed 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h @@ -39,14 +39,14 @@ */ #define PTE_RPN_SHIFT (30) /* - * we support 8 fragments per PTE page of 64K size. + * we support 32 fragments per PTE page of 64K size. */ -#define PTE_FRAG_NR8 +#define PTE_FRAG_NR32 /* * We use a 2K PTE page fragment and another 4K for storing * real_pte_t hash index. Rounding the entire thing to 8K */ -#define PTE_FRAG_SIZE_SHIFT 13 +#define PTE_FRAG_SIZE_SHIFT 11 #define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) /* -- 2.5.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 3/7] powerpc/nohash: Update 64K nohash config to have 32 pte fragement
They don't need to track 4k subpage slot details and hence don't need second half of pgtable_t. Signed-off-by: Aneesh Kumar K.V--- arch/powerpc/include/asm/nohash/64/pgtable-64k.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h index 1d8e26e8167b..dbd9de9264c2 100644 --- a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h +++ b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h @@ -10,14 +10,14 @@ #define PGD_INDEX_SIZE 12 /* - * we support 8 fragments per PTE page of 64K size + * we support 32 fragments per PTE page of 64K size */ -#define PTE_FRAG_NR8 +#define PTE_FRAG_NR32 /* * We use a 2K PTE page fragment and another 4K for storing * real_pte_t hash index. Rounding the entire thing to 8K */ -#define PTE_FRAG_SIZE_SHIFT 13 +#define PTE_FRAG_SIZE_SHIFT 11 #define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) -- 2.5.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 7/7] powerpc/mm: get rid of real_pte_t
Now that we don't track 4k subpage slot details, get rid of real_pte Signed-off-by: Aneesh Kumar K.V--- arch/powerpc/include/asm/book3s/64/hash-64k.h| 15 - arch/powerpc/include/asm/book3s/64/pgtable.h | 24 arch/powerpc/include/asm/nohash/64/pgtable-64k.h | 3 +-- arch/powerpc/include/asm/nohash/64/pgtable.h | 17 +- arch/powerpc/include/asm/page.h | 15 - arch/powerpc/include/asm/tlbflush.h | 4 ++-- arch/powerpc/mm/hash64_64k.c | 28 +++- arch/powerpc/mm/hash_native_64.c | 4 ++-- arch/powerpc/mm/hash_utils_64.c | 4 ++-- arch/powerpc/mm/init_64.c| 3 +-- arch/powerpc/mm/tlb_hash64.c | 15 ++--- arch/powerpc/platforms/pseries/lpar.c| 4 ++-- 12 files changed, 44 insertions(+), 92 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h index 19e0afb36fa8..90d4c3bfbafd 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h @@ -43,8 +43,7 @@ */ #define PTE_FRAG_NR32 /* - * We use a 2K PTE page fragment and another 4K for storing - * real_pte_t hash index. Rounding the entire thing to 8K + * We use a 2K PTE page fragment */ #define PTE_FRAG_SIZE_SHIFT 11 #define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) @@ -58,21 +57,15 @@ #define PUD_MASKED_BITS0x1ff #ifndef __ASSEMBLY__ - /* * With 64K pages on hash table, we have a special PTE format that * uses a second "half" of the page table to encode sub-page information * in order to deal with 64K made of 4K HW pages. 
Thus we override the * generic accessors and iterators here */ -#define __real_pte __real_pte -extern real_pte_t __real_pte(unsigned long addr, pte_t pte, pte_t *ptep); -extern unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long hash, - unsigned long vpn, int ssize, bool *valid); -static inline pte_t __rpte_to_pte(real_pte_t rpte) -{ - return rpte.pte; -} +#define pte_to_hidx pte_to_hidx +extern unsigned long pte_to_hidx(pte_t rpte, unsigned long hash, +unsigned long vpn, int ssize, bool *valid); /* * Trick: we set __end to va + 64k, which happens works for * a 16M page as well as we want only one iteration diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index 79a90ca7b9f6..1d5648e25fcb 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -35,36 +35,30 @@ #define __HAVE_ARCH_PTE_SPECIAL #ifndef __ASSEMBLY__ - /* * This is the default implementation of various PTE accessors, it's * used in all cases except Book3S with 64K pages where we have a * concept of sub-pages */ -#ifndef __real_pte - -#ifdef CONFIG_STRICT_MM_TYPECHECKS -#define __real_pte(a,e,p) ((real_pte_t){(e)}) -#define __rpte_to_pte(r) ((r).pte) -#else -#define __real_pte(a,e,p) (e) -#define __rpte_to_pte(r) (__pte(r)) +#ifndef pte_to_hidx +#define pte_to_hidx(pte, index)(pte_val(pte) >> _PAGE_F_GIX_SHIFT) #endif -#define __rpte_to_hidx(r,index)(pte_val(__rpte_to_pte(r)) >>_PAGE_F_GIX_SHIFT) -#define pte_iterate_hashed_subpages(vpn, psize, shift) \ - do {\ - shift = mmu_psize_defs[psize].shift;\ +#ifndef pte_iterate_hashed_subpages +#define pte_iterate_hashed_subpages(vpn, psize, shift) \ + do {\ + shift = mmu_psize_defs[psize].shift;\ #define pte_iterate_hashed_end() } while(0) +#endif /* * We expect this to be called only for user addresses or kernel virtual * addresses other than the linear mapping. 
*/ +#ifndef pte_pagesize_index #define pte_pagesize_index(mm, addr, pte) MMU_PAGE_4K - -#endif /* __real_pte */ +#endif static inline void pmd_set(pmd_t *pmdp, unsigned long val) { diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h index dbd9de9264c2..0f075799ae97 100644 --- a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h +++ b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h @@ -14,8 +14,7 @@ */ #define PTE_FRAG_NR32 /* - * We use a 2K PTE page fragment and another 4K for storing - * real_pte_t hash index. Rounding the entire thing to 8K + * We use a 2K PTE page fragment */ #define PTE_FRAG_SIZE_SHIFT 11 #define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h index 37b5a62d18f4..ddde5f16c385 100644 ---
Re: linux-next: build warning after merge of the powerpc tree
Hi Michael,

On Tue, 20 Oct 2015 21:06:51 +1100 Michael Ellerman wrote:
>
> On Tue, 2015-10-20 at 16:21 +1100, Stephen Rothwell wrote:
> >
> > After merging the powerpc tree, today's linux-next build (powerpc
> > allyesconfig) produced this warning:
> >
> > WARNING: vmlinux.o(.text+0x9367c): Section mismatch in reference from the
> > function .msi_bitmap_alloc() to the function
> > .init.text:.memblock_virt_alloc_try_nid()
> > The function .msi_bitmap_alloc() references
> > the function __init .memblock_virt_alloc_try_nid().
> > This is often because .msi_bitmap_alloc lacks a __init
> > annotation or the annotation of .memblock_virt_alloc_try_nid is wrong.
> >
> > Introduced (probably) by commit
> >
> > cb2d3883c603 ("powerpc/msi: Free the bitmap if it was slab allocated")
>
> Yeah that's correct, though it should be safe in practice.
>
> I'm not sure why you only saw that now though, the patch has been in next
> since the 13th of October.

I don't always notice new warnings immediately among all the others :-(

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au
[RFC PATCH 4/7] powerpc/mm: Don't track 4k subpage information with 64k linux page size
We search the hash table to find the slot information. This slows down the lookup, but we do that only for 4k subpage config Signed-off-by: Aneesh Kumar K.V--- arch/powerpc/include/asm/book3s/64/hash-64k.h | 33 +-- arch/powerpc/include/asm/machdep.h| 2 + arch/powerpc/include/asm/page.h | 4 +- arch/powerpc/mm/hash64_64k.c | 59 --- arch/powerpc/mm/hash_native_64.c | 23 ++- arch/powerpc/mm/hash_utils_64.c | 5 ++- arch/powerpc/mm/pgtable_64.c | 6 ++- arch/powerpc/platforms/pseries/lpar.c | 17 +++- 8 files changed, 96 insertions(+), 53 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h index 681657cabbe4..5062c6d423fd 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h @@ -67,51 +67,22 @@ */ #define __real_pte __real_pte extern real_pte_t __real_pte(unsigned long addr, pte_t pte, pte_t *ptep); -static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index) -{ - if ((pte_val(rpte.pte) & _PAGE_COMBO)) - return (unsigned long) rpte.hidx[index] >> 4; - return (pte_val(rpte.pte) >> _PAGE_F_GIX_SHIFT) & 0xf; -} - +extern unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long hash, + unsigned long vpn, int ssize, bool *valid); static inline pte_t __rpte_to_pte(real_pte_t rpte) { return rpte.pte; } /* - * we look at the second half of the pte page to determine whether - * the sub 4k hpte is valid. We use 8 bits per each index, and we have - * 16 index mapping full 64K page. Hence for each - * 64K linux page we use 128 bit from the second half of pte page. 
- * The encoding in the second half of the page is as below: - * [ index 15 ] .[index 0] - * [bit 127 ..bit 0] - * fomat of each index - * bit 7 bit0 - * [one bit secondary][ 3 bit hidx][1 bit valid][000] - */ -static inline bool __rpte_sub_valid(real_pte_t rpte, unsigned long index) -{ - unsigned char index_val = rpte.hidx[index]; - - if ((index_val >> 3) & 0x1) - return true; - return false; -} - -/* * Trick: we set __end to va + 64k, which happens works for * a 16M page as well as we want only one iteration */ #define pte_iterate_hashed_subpages(rpte, psize, vpn, index, shift)\ do {\ unsigned long __end = vpn + (1UL << (PAGE_SHIFT - VPN_SHIFT)); \ - unsigned __split = (psize == MMU_PAGE_4K || \ - psize == MMU_PAGE_64K_AP); \ shift = mmu_psize_defs[psize].shift;\ for (index = 0; vpn < __end; index++, \ vpn += (1L << (shift - VPN_SHIFT))) { \ - if (!__split || __rpte_sub_valid(rpte, index)) \ do { #define pte_iterate_hashed_end() } while(0); } } while(0) diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index cab6753f1be5..40df21982ae1 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -61,6 +61,8 @@ struct machdep_calls { unsigned long addr, unsigned char *hpte_slot_array, int psize, int ssize, int local); + + unsigned long (*get_hpte_v)(unsigned long slot); /* special for kexec, to be called in real mode, linear mapping is * destroyed as well */ void(*hpte_clear_all)(void); diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index f63b2761cdd0..bbdf9e6cc8b1 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -295,7 +295,7 @@ static inline pte_basic_t pte_val(pte_t x) * the "second half" part of the PTE for pseudo 64k pages */ #if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC_STD_MMU_64) -typedef struct { pte_t pte; unsigned char *hidx; } real_pte_t; +typedef struct { pte_t pte; } real_pte_t; #else typedef struct { pte_t 
pte; } real_pte_t; #endif @@ -347,7 +347,7 @@ static inline pte_basic_t pte_val(pte_t pte) } #if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC_STD_MMU_64) -typedef struct { pte_t pte; unsigned char *hidx; } real_pte_t; +typedef struct { pte_t pte; } real_pte_t; #else typedef pte_t real_pte_t; #endif diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c index 84867a1491a2..e063895694e9 100644 --- a/arch/powerpc/mm/hash64_64k.c +++
[RESEND PATCH 1/2] powerpc: platforms: mpc52xx_lpbfifo: Fix module autoload for OF platform driver
From: Luis de BethencourtThis platform driver has a OF device ID table but the OF module alias information is not created so module autoloading won't work. Signed-off-by: Luis de Bethencourt --- arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c index 251dcb9..7bb42a0 100644 --- a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c +++ b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c @@ -568,6 +568,7 @@ static const struct of_device_id mpc52xx_lpbfifo_match[] = { { .compatible = "fsl,mpc5200-lpbfifo", }, {}, }; +MODULE_DEVICE_TABLE(of, mpc52xx_lpbfifo_match); static struct platform_driver mpc52xx_lpbfifo_driver = { .driver = { -- 2.5.3 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 1/7] powerpc/mm: Don't hardcode page table size
pte and pmd table size are dependent on config items. Don't hard code the same. This make sure we use the right value when masking pmd entries and also while checking pmd_bad Signed-off-by: Aneesh Kumar K.V--- arch/powerpc/include/asm/book3s/64/hash-64k.h| 30 ++-- arch/powerpc/include/asm/nohash/64/pgtable-64k.h | 22 + arch/powerpc/include/asm/pgalloc-64.h| 10 arch/powerpc/mm/init_64.c| 4 4 files changed, 41 insertions(+), 25 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h index 957d66d13a97..565f9418c25f 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h @@ -25,12 +25,6 @@ #define PGDIR_SIZE (1UL << PGDIR_SHIFT) #define PGDIR_MASK (~(PGDIR_SIZE-1)) -/* Bits to mask out from a PMD to get to the PTE page */ -/* PMDs point to PTE table fragments which are 4K aligned. */ -#define PMD_MASKED_BITS0xfff -/* Bits to mask out from a PGD/PUD to get to the PMD page */ -#define PUD_MASKED_BITS0x1ff - #define _PAGE_COMBO0x0002 /* this is a combo 4k page */ #define _PAGE_4K_PFN 0x0004 /* PFN is for a single 4k page */ @@ -44,6 +38,24 @@ * of addressable physical space, or 46 bits for the special 4k PFNs. */ #define PTE_RPN_SHIFT (30) +/* + * we support 8 fragments per PTE page of 64K size. + */ +#define PTE_FRAG_NR8 +/* + * We use a 2K PTE page fragment and another 4K for storing + * real_pte_t hash index. Rounding the entire thing to 8K + */ +#define PTE_FRAG_SIZE_SHIFT 13 +#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) + +/* + * Bits to mask out from a PMD to get to the PTE page + * PMDs point to PTE table fragments which are PTE_FRAG_SIZE aligned. 
+ */ +#define PMD_MASKED_BITS(PTE_FRAG_SIZE - 1) +/* Bits to mask out from a PGD/PUD to get to the PMD page */ +#define PUD_MASKED_BITS0x1ff #ifndef __ASSEMBLY__ @@ -112,8 +124,12 @@ static inline bool __rpte_sub_valid(real_pte_t rpte, unsigned long index) remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE,\ __pgprot(pgprot_val((prot)) | _PAGE_4K_PFN))) -#define PTE_TABLE_SIZE (sizeof(real_pte_t) << PTE_INDEX_SIZE) +#define PTE_TABLE_SIZE PTE_FRAG_SIZE +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +#define PMD_TABLE_SIZE ((sizeof(pmd_t) << PMD_INDEX_SIZE) + (sizeof(unsigned long) << PMD_INDEX_SIZE)) +#else #define PMD_TABLE_SIZE (sizeof(pmd_t) << PMD_INDEX_SIZE) +#endif #define PGD_TABLE_SIZE (sizeof(pgd_t) << PGD_INDEX_SIZE) #define pgd_pte(pgd) (pud_pte(((pud_t){ pgd }))) diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h index a44660d76096..1d8e26e8167b 100644 --- a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h +++ b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h @@ -9,8 +9,20 @@ #define PUD_INDEX_SIZE 0 #define PGD_INDEX_SIZE 12 +/* + * we support 8 fragments per PTE page of 64K size + */ +#define PTE_FRAG_NR8 +/* + * We use a 2K PTE page fragment and another 4K for storing + * real_pte_t hash index. Rounding the entire thing to 8K + */ +#define PTE_FRAG_SIZE_SHIFT 13 +#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) + + #ifndef __ASSEMBLY__ -#define PTE_TABLE_SIZE (sizeof(real_pte_t) << PTE_INDEX_SIZE) +#define PTE_TABLE_SIZE PTE_FRAG_SIZE #define PMD_TABLE_SIZE (sizeof(pmd_t) << PMD_INDEX_SIZE) #define PGD_TABLE_SIZE (sizeof(pgd_t) << PGD_INDEX_SIZE) #endif /* __ASSEMBLY__ */ @@ -32,9 +44,11 @@ #define PGDIR_SIZE (1UL << PGDIR_SHIFT) #define PGDIR_MASK (~(PGDIR_SIZE-1)) -/* Bits to mask out from a PMD to get to the PTE page */ -/* PMDs point to PTE table fragments which are 4K aligned. 
*/ -#define PMD_MASKED_BITS0xfff +/* + * Bits to mask out from a PMD to get to the PTE page + * PMDs point to PTE table fragments which are PTE_FRAG_SIZE aligned. + */ +#define PMD_MASKED_BITS(PTE_FRAG_SIZE - 1) /* Bits to mask out from a PGD/PUD to get to the PMD page */ #define PUD_MASKED_BITS0x1ff diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h index 4f1cc6c46728..69ef28a81733 100644 --- a/arch/powerpc/include/asm/pgalloc-64.h +++ b/arch/powerpc/include/asm/pgalloc-64.h @@ -163,16 +163,6 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table, } #else /* if CONFIG_PPC_64K_PAGES */ -/* - * we support 8 fragments per PTE page. - */ -#define PTE_FRAG_NR8 -/* - * We use a 2K PTE page fragment and another 4K for storing - * real_pte_t hash index. Rounding the entire thing to 8K - */ -#define PTE_FRAG_SIZE_SHIFT 13 -#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) extern pte_t *page_table_alloc(struct mm_struct *, unsigned long, int); extern
[RESEND PATCH] powerpc: rackmeter: Fix module autoload for OF platform driver
From: Luis de BethencourtThis platform driver has a OF device ID table but the OF module alias information is not created so module autoloading won't work. Signed-off-by: Luis de Bethencourt --- Hi, This is a resend of a patch sent September 17 [0] This patch adds the missing MODULE_DEVICE_TABLE() for OF to export that information so modules have the correct aliases built-in and autoloading works correctly. A longer explanation by Javier Canillas can be found here: https://lkml.org/lkml/2015/7/30/519 Thanks, Luis [0] https://lkml.org/lkml/2015/9/17/598 drivers/macintosh/rack-meter.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/macintosh/rack-meter.c b/drivers/macintosh/rack-meter.c index 048901a..caaec65 100644 --- a/drivers/macintosh/rack-meter.c +++ b/drivers/macintosh/rack-meter.c @@ -582,6 +582,7 @@ static struct of_device_id rackmeter_match[] = { { .name = "i2s" }, { } }; +MODULE_DEVICE_TABLE(of, rackmeter_match); static struct macio_driver rackmeter_driver = { .driver = { -- 2.5.3 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH tip/locking/core v4 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
On Tue, Oct 20, 2015 at 11:21:47AM +0200, Peter Zijlstra wrote: > On Tue, Oct 20, 2015 at 03:15:32PM +0800, Boqun Feng wrote: > > On Wed, Oct 14, 2015 at 01:19:17PM -0700, Paul E. McKenney wrote: > > > > > > Am I missing something here? If not, it seems to me that you need > > > the leading lwsync to instead be a sync. > > > > > > Of course, if I am not missing something, then this applies also to the > > > value-returning RMW atomic operations that you pulled this pattern from. > > > If so, it would seem that I didn't think through all the possibilities > > > back when PPC_ATOMIC_EXIT_BARRIER moved to sync... In fact, I believe > > > that I worried about the RMW atomic operation acting as a barrier, > > > but not as the load/store itself. :-/ > > > > > > > Paul, I know this may be difficult, but could you recall why the > > __futex_atomic_op() and futex_atomic_cmpxchg_inatomic() also got > > involved into the movement of PPC_ATOMIC_EXIT_BARRIER to "sync"? > > > > I did some search, but couldn't find the discussion of that patch. > > > > I ask this because I recall Peter once bought up a discussion: > > > > https://lkml.org/lkml/2015/8/26/596 > > > > Peter's conclusion seems to be that we could(though didn't want to) live > > with futex atomics not being full barriers. I have heard of user-level applications relying on unlock-lock being a full barrier. So paranoia would argue for the full barrier. > > Peter, just be clear, I'm not in favor of relaxing futex atomics. But if > > I make PPC_ATOMIC_ENTRY_BARRIER being "sync", it will also strengthen > > the futex atomics, just wonder whether such strengthen is a -fix- or > > not, considering that I want this patch to go to -stable tree. > > So Linus' argued that since we only need to order against user accesses > (true) and priv changes typically imply strong barriers (open) we might > want to allow archs to rely on those instead of mandating they have > explicit barriers in the futex primitives. 
> >
> > And I indeed forgot to follow up on that discussion.
> >
> > So; does PPC imply full barriers on user<->kernel boundaries? If so, its
> > not critical to the futex atomic implementations what extra barriers are
> > added.
> >
> > If not; then strengthening the futex ops is indeed (probably) a good
> > thing :-)

I am not seeing a sync there, but I really have to defer to the
maintainers on this one. I could easily have missed one.

							Thanx, Paul
[RFC PATCH 0/7] Remove 4k subpage tracking with hash 64K config
Hi, This patch series is on top of the series posted at https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135299.html "[PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64". In this series we remove 4k subpage tracking with 64K config. Instead we do a hash table lookup to get the slot information of 4k hash ptes. This also allow us to remove real_pte_t. Side effect of the change is that a specific 4k slot lookup can result in multiple H_READ hcalls. But that should only impact when we are using 4K subpages which should be rare. NOTE: I only tested this on systemsim. Wanted to get this out to get early feedback. Aneesh Kumar K.V (7): powerpc/mm: Don't hardcode page table size powerpc/mm: Don't hardcode the hash pte slot shift powerpc/nohash: Update 64K nohash config to have 32 pte fragement powerpc/mm: Don't track 4k subpage information with 64k linux page size powerpc/mm: update frag size powerpc/mm: Update pte_iterate_hashed_subpaes args powerpc/mm: getrid of real_pte_t arch/powerpc/include/asm/book3s/64/hash-64k.h| 75 +--- arch/powerpc/include/asm/book3s/64/pgtable.h | 25 +++- arch/powerpc/include/asm/machdep.h | 2 + arch/powerpc/include/asm/nohash/64/pgtable-64k.h | 21 +-- arch/powerpc/include/asm/nohash/64/pgtable.h | 24 +++- arch/powerpc/include/asm/page.h | 15 - arch/powerpc/include/asm/pgalloc-64.h| 10 arch/powerpc/include/asm/tlbflush.h | 4 +- arch/powerpc/mm/hash64_64k.c | 67 + arch/powerpc/mm/hash_native_64.c | 35 --- arch/powerpc/mm/hash_utils_64.c | 13 ++-- arch/powerpc/mm/init_64.c| 7 +-- arch/powerpc/mm/pgtable_64.c | 6 +- arch/powerpc/mm/tlb_hash64.c | 15 +++-- arch/powerpc/platforms/pseries/lpar.c| 23 ++-- 15 files changed, 175 insertions(+), 167 deletions(-) -- 2.5.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 2/7] powerpc/mm: Don't hardcode the hash pte slot shift
Use the #define instead of open-coding the same Signed-off-by: Aneesh Kumar K.V--- arch/powerpc/include/asm/book3s/64/hash-64k.h | 2 +- arch/powerpc/include/asm/book3s/64/pgtable.h | 2 +- arch/powerpc/include/asm/nohash/64/pgtable.h | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h index 565f9418c25f..681657cabbe4 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h @@ -71,7 +71,7 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index) { if ((pte_val(rpte.pte) & _PAGE_COMBO)) return (unsigned long) rpte.hidx[index] >> 4; - return (pte_val(rpte.pte) >> 12) & 0xf; + return (pte_val(rpte.pte) >> _PAGE_F_GIX_SHIFT) & 0xf; } static inline pte_t __rpte_to_pte(real_pte_t rpte) diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index 0b43ca60dcb9..64ef7316ff88 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -50,7 +50,7 @@ #define __real_pte(a,e,p) (e) #define __rpte_to_pte(r) (__pte(r)) #endif -#define __rpte_to_hidx(r,index)(pte_val(__rpte_to_pte(r)) >> 12) +#define __rpte_to_hidx(r,index)(pte_val(__rpte_to_pte(r)) >>_PAGE_F_GIX_SHIFT) #define pte_iterate_hashed_subpages(rpte, psize, va, index, shift) \ do { \ diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h index c4dff4d41c26..8969b4c93c4f 100644 --- a/arch/powerpc/include/asm/nohash/64/pgtable.h +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h @@ -121,7 +121,7 @@ #define __real_pte(a,e,p) (e) #define __rpte_to_pte(r) (__pte(r)) #endif -#define __rpte_to_hidx(r,index)(pte_val(__rpte_to_pte(r)) >> 12) +#define __rpte_to_hidx(r,index)(pte_val(__rpte_to_pte(r)) >> _PAGE_F_GIX_SHIFT) #define pte_iterate_hashed_subpages(rpte, psize, va, index, shift) \ do { 
\ -- 2.5.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RESEND PATCH 0/2] powerpc: Fix module autoload for OF platform drivers
Hi,

This is a resend of this patch series. It was posted on September 18 [0].

These patches add the missing MODULE_DEVICE_TABLE() for OF to export the
information so modules have the correct aliases built-in and autoloading
works correctly. A longer explanation by Javier Canillas can be found here:
https://lkml.org/lkml/2015/7/30/519

Thanks,
Luis

[0] https://lkml.org/lkml/2015/9/18/749
[1] https://lkml.org/lkml/2015/9/18/750
[2] https://lkml.org/lkml/2015/9/18/752

Luis de Bethencourt (2):
  powerpc: platforms: mpc52xx_lpbfifo: Fix module autoload for OF platform driver
  powerpc: axonram: Fix module autoload for OF platform driver

 arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c | 1 +
 arch/powerpc/sysdev/axonram.c                 | 1 +
 2 files changed, 2 insertions(+)

--
2.5.3
[RFC PATCH 6/7] powerpc/mm: Update pte_iterate_hashed_subpages args
Now that we don't really use real_pte_t drop them from iterator argument list. The follow up patch will remove real_pte_t completely Signed-off-by: Aneesh Kumar K.V--- arch/powerpc/include/asm/book3s/64/hash-64k.h | 5 +++-- arch/powerpc/include/asm/book3s/64/pgtable.h | 7 +++ arch/powerpc/include/asm/nohash/64/pgtable.h | 7 +++ arch/powerpc/mm/hash_native_64.c | 10 -- arch/powerpc/mm/hash_utils_64.c | 6 +++--- arch/powerpc/platforms/pseries/lpar.c | 4 ++-- 6 files changed, 18 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h index a28dbfe2baed..19e0afb36fa8 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h @@ -77,9 +77,10 @@ static inline pte_t __rpte_to_pte(real_pte_t rpte) * Trick: we set __end to va + 64k, which happens works for * a 16M page as well as we want only one iteration */ -#define pte_iterate_hashed_subpages(rpte, psize, vpn, index, shift)\ +#define pte_iterate_hashed_subpages(vpn, psize, shift) \ do {\ - unsigned long __end = vpn + (1UL << (PAGE_SHIFT - VPN_SHIFT)); \ + unsigned long index;\ + unsigned long __end = vpn + (1UL << (PAGE_SHIFT - VPN_SHIFT)); \ shift = mmu_psize_defs[psize].shift;\ for (index = 0; vpn < __end; index++, \ vpn += (1L << (shift - VPN_SHIFT))) { \ diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index 64ef7316ff88..79a90ca7b9f6 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -52,10 +52,9 @@ #endif #define __rpte_to_hidx(r,index)(pte_val(__rpte_to_pte(r)) >>_PAGE_F_GIX_SHIFT) -#define pte_iterate_hashed_subpages(rpte, psize, va, index, shift) \ - do { \ - index = 0; \ - shift = mmu_psize_defs[psize].shift; \ +#define pte_iterate_hashed_subpages(vpn, psize, shift) \ + do {\ + shift = mmu_psize_defs[psize].shift;\ #define pte_iterate_hashed_end() } while(0) diff 
--git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h index 8969b4c93c4f..37b5a62d18f4 100644 --- a/arch/powerpc/include/asm/nohash/64/pgtable.h +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h @@ -123,10 +123,9 @@ #endif #define __rpte_to_hidx(r,index)(pte_val(__rpte_to_pte(r)) >> _PAGE_F_GIX_SHIFT) -#define pte_iterate_hashed_subpages(rpte, psize, va, index, shift) \ - do { \ - index = 0; \ - shift = mmu_psize_defs[psize].shift; \ +#define pte_iterate_hashed_subpages(vpn, psize, shift) \ + do { \ + shift = mmu_psize_defs[psize].shift; \ #define pte_iterate_hashed_end() } while(0) diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c index ca747ae19c76..b035dafcdea0 100644 --- a/arch/powerpc/mm/hash_native_64.c +++ b/arch/powerpc/mm/hash_native_64.c @@ -646,7 +646,7 @@ static void native_hpte_clear(void) static void native_flush_hash_range(unsigned long number, int local) { unsigned long vpn; - unsigned long hash, index, hidx, shift, slot; + unsigned long hash, hidx, shift, slot; struct hash_pte *hptep; unsigned long hpte_v; unsigned long want_v; @@ -665,7 +665,7 @@ static void native_flush_hash_range(unsigned long number, int local) vpn = batch->vpn[i]; pte = batch->pte[i]; - pte_iterate_hashed_subpages(pte, psize, vpn, index, shift) { + pte_iterate_hashed_subpages(vpn, psize, shift) { hash = hpt_hash(vpn, shift, ssize); hidx = __rpte_to_hidx(pte, hash, vpn, ssize, _slot); if (!valid_slot) @@ -693,8 +693,7 @@ static void native_flush_hash_range(unsigned long number, int local) vpn = batch->vpn[i]; pte = batch->pte[i]; - pte_iterate_hashed_subpages(pte, psize, - vpn, index, shift) { + pte_iterate_hashed_subpages(vpn, psize, shift) {
[RESEND PATCH 2/2] powerpc: axonram: Fix module autoload for OF platform driver
From: Luis de Bethencourt

This platform driver has an OF device ID table but the OF module alias
information is not created, so module autoloading won't work.

Signed-off-by: Luis de Bethencourt
---
 arch/powerpc/sysdev/axonram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c
index d2b79bc..51b41c9 100644
--- a/arch/powerpc/sysdev/axonram.c
+++ b/arch/powerpc/sysdev/axonram.c
@@ -312,6 +312,7 @@ static const struct of_device_id axon_ram_device_id[] = {
 	},
 	{}
 };
+MODULE_DEVICE_TABLE(of, axon_ram_device_id);
 
 static struct platform_driver axon_ram_driver = {
 	.probe		= axon_ram_probe,
--
2.5.3
Re: [PATCH 0/5] drivers/tty: make more bool drivers explicitly non-modular
[Re: [PATCH 0/5] drivers/tty: make more bool drivers explicitly non-modular] On 20/10/2015 (Tue 17:10) Alexandre Belloni wrote: > On 18/10/2015 at 18:21:13 -0400, Paul Gortmaker wrote : > > The one common thread here for all the patches is that we also > > scrap the .remove functions which would only be used for module > > unload (impossible) and driver unbind. For the drivers here, there > > doesn't seem to be a sensible unbind use case (vs. e.g. a multiport > > PCI ethernet driver where one port is unbound and passed through to > > a kvm guest or similar). Hence we just explicitly disallow any > > driver unbind operations to help prevent root from doing something > > illogical to the machine that they could have done previously. > > > > We've already done this for drivers/tty/serial/mpsc.c previously. > > > > Build tested for allmodconfig on ARM64 and powerpc for tty/tty-testing. > > > > So, how does this actually build test atmel_serial? Not sure why this should be a surprise; I build test it exactly like this: paul@builder-02:~/git/linux-head$ echo $ARCH arm64 paul@builder-02:~/git/linux-head$ echo $CROSS_COMPILE aarch64-linux-gnu- paul@builder-02:~/git/linux-head$ make O=../arm-build/ drivers/tty/serial/atmel_serial.o make[1]: Entering directory '/home/paul/git/arm-build' arch/arm64/Makefile:25: LSE atomics not supported by binutils CHK include/config/kernel.release Using /home/paul/git/linux-head as source for kernel GEN ./Makefile CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h [...] HOSTCC scripts/sign-file HOSTCC scripts/extract-cert CC drivers/tty/serial/atmel_serial.o make[1]: Leaving directory '/home/paul/git/arm-build' paul@builder-02:~/git/linux-head$ It did build; no warning/error. Would you call it an invalid build test? > > A proper solution would be to actually make it a tristate and allow > building as a module. I think it currently fails because of > console_initcall() but that is certainly fixable. 
Well, as per other threads on this topic, if people want to extend the
functionality to support tristate, then great. But please do not confuse
that with existing functionality, which is clearly non-modular in this case.

Thanks,
Paul.

> --
> Alexandre Belloni, Free Electrons
> Embedded Linux, Kernel and Android engineering
> http://free-electrons.com
Re: [PATCH V6 1/6] powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BAR
On Tue, Oct 20, 2015 at 05:03:00PM +0800, Wei Yang wrote: >On PHB_IODA2, we enable SRIOV devices by mapping IOV BAR with M64 BARs. If ^ s/PHB_IODA2/PHB3 or s/PHB_IODA2/IODA2 PHB >a SRIOV device's IOV BAR is not 64bit-prefetchable, this is not assigned >from 64bit prefetchable window, which means M64 BAR can't work on it. > >The reason is PCI bridges support only 2 windows and the kernel code ^ It would be more accurate: "2 memory windows". >programs bridges in the way that one window is 32bit-nonprefetchable and >the other one is 64bit-prefetchable. So if devices' IOV BAR is 64bit and >non-prefetchable, it will be mapped into 32bit space and therefore M64 >cannot be used for it. > >This patch makes this explicit and truncate IOV resource in this case to >save MMIO space. > >Signed-off-by: Wei Yang>Reviewed-by: Gavin Shan >Acked-by: Alexey Kardashevskiy >--- > arch/powerpc/platforms/powernv/pci-ioda.c | 34 --- > 1 file changed, 18 insertions(+), 16 deletions(-) > >diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >b/arch/powerpc/platforms/powernv/pci-ioda.c >index 85cbc96..f042fed 100644 >--- a/arch/powerpc/platforms/powernv/pci-ioda.c >+++ b/arch/powerpc/platforms/powernv/pci-ioda.c >@@ -908,9 +908,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, >int offset) > if (!res->flags || !res->parent) > continue; > >- if (!pnv_pci_is_mem_pref_64(res->flags)) >- continue; >- > /* >* The actual IOV BAR range is determined by the start address >* and the actual size for num_vfs VFs BAR. 
This check is to >@@ -939,9 +936,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, >int offset) > if (!res->flags || !res->parent) > continue; > >- if (!pnv_pci_is_mem_pref_64(res->flags)) >- continue; >- > size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES); > res2 = *res; > res->start += size * offset; >@@ -1221,9 +1215,6 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, >u16 num_vfs) > if (!res->flags || !res->parent) > continue; > >- if (!pnv_pci_is_mem_pref_64(res->flags)) >- continue; >- > for (j = 0; j < vf_groups; j++) { > do { > win = > find_next_zero_bit(>ioda.m64_bar_alloc, >@@ -1510,6 +1501,12 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 >num_vfs) > pdn = pci_get_pdn(pdev); > > if (phb->type == PNV_PHB_IODA2) { >+ if (!pdn->vfs_expanded) { >+ dev_info(>dev, "don't support this SRIOV device" >+ " with non 64bit-prefetchable IOV BAR\n"); >+ return -ENOSPC; >+ } >+ > /* Calculate available PE for required VFs */ > mutex_lock(>ioda.pe_alloc_mutex); > pdn->offset = bitmap_find_next_zero_area( >@@ -2775,9 +2772,10 @@ static void pnv_pci_ioda_fixup_iov_resources(struct >pci_dev *pdev) > if (!res->flags || res->parent) > continue; > if (!pnv_pci_is_mem_pref_64(res->flags)) { >- dev_warn(>dev, " non M64 VF BAR%d: %pR\n", >+ dev_warn(>dev, "Don't support SR-IOV with" >+ " non M64 VF BAR%d: %pR. 
\n", >i, res); >- continue; >+ goto truncate_iov; > } > > size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES); >@@ -2796,11 +2794,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct >pci_dev *pdev) > res = >resource[i + PCI_IOV_RESOURCES]; > if (!res->flags || res->parent) > continue; >- if (!pnv_pci_is_mem_pref_64(res->flags)) { >- dev_warn(>dev, "Skipping expanding VF BAR%d: >%pR\n", >- i, res); >- continue; >- } > > dev_dbg(>dev, " Fixing VF BAR%d: %pR to\n", i, res); > size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES); >@@ -2810,6 +2803,15 @@ static void pnv_pci_ioda_fixup_iov_resources(struct >pci_dev *pdev) >i, res, mul); > } > pdn->vfs_expanded = mul; >+ >+ return; >+ >+truncate_iov: >+ /* To save MMIO space, IOV BAR is truncated. */ >+ for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { >+ res = >resource[i + PCI_IOV_RESOURCES]; >+ res->end = res->start - 1; >+ }
Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation
On Mon, Oct 19, 2015 at 09:17:18AM +0800, Boqun Feng wrote: > On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote: > > On Fri, Oct 09, 2015 at 10:31:38AM +0200, Peter Zijlstra wrote: > [snip] > > > > > > So lots of little confusions added up to complete fail :-{ > > > > > > Mostly I think it was the UNLOCK x + LOCK x are fully ordered (where I > > > forgot: but not against uninvolved CPUs) and RELEASE/ACQUIRE are > > > transitive (where I forgot: RELEASE/ACQUIRE _chains_ are transitive, but > > > again not against uninvolved CPUs). > > > > > > Which leads me to think I would like to suggest alternative rules for > > > RELEASE/ACQUIRE (to replace those Will suggested; as I think those are > > > partly responsible for my confusion). > > > > Yeah, sorry. I originally used the phrase "fully ordered" but changed it > > to "full barrier", which has stronger transitivity (newly understood > > definition) requirements that I didn't intend. > > > > RELEASE -> ACQUIRE should be used for message passing between two CPUs > > and not have ordering effects on other observers unless they're part of > > the RELEASE -> ACQUIRE chain. > > > > > - RELEASE -> ACQUIRE is fully ordered (but not a full barrier) when > > >they operate on the same variable and the ACQUIRE reads from the > > >RELEASE. Notable, RELEASE/ACQUIRE are RCpc and lack transitivity. > > > > Are we explicit about the difference between "fully ordered" and "full > > barrier" somewhere else, because this looks like it will confuse people. > > > > This is confusing me right now. ;-) > > Let's use a simple example for only one primitive, as I understand it, > if we say a primitive A is "fully ordered", we actually mean: > > 1.The memory operations preceding(in program order) A can't be > reordered after the memory operations following(in PO) A. > > and > > 2.The memory operation(s) in A can't be reordered before the > memory operations preceding(in PO) A and after the memory > operations following(in PO) A. 
>
> If we say A is a "full barrier", we actually mean:
>
> 1. The memory operations preceding (in program order) A can't be
>    reordered after the memory operations following (in PO) A.
>
> and
>
> 2. The memory ordering guarantee in #1 is visible globally.
>
> Is that correct? Or is "full barrier" stronger than I understand,
> i.e. is there a third property of "full barrier":
>
> 3. The memory operation(s) in A can't be reordered before the
>    memory operations preceding (in PO) A and after the memory
>    operations following (in PO) A.
>
> IOW, is "full barrier" a stronger version of "fully ordered" or not?

There is also the question of whether the barrier forces ordering of
unrelated stores, everything initially zero and all accesses READ_ONCE()
or WRITE_ONCE():

	P0		P1		P2		P3
	X = 1;		Y = 1;		r1 = X;		r3 = Y;
					some_barrier();	some_barrier();
					r2 = Y;		r4 = X;

P2's and P3's ordering could be globally visible without requiring P0's
and P1's independent stores to be ordered, for example, if you used
smp_rmb() for some_barrier(). In contrast, if we used smp_mb() for
some_barrier(), everyone would agree on the order of P0's and P1's stores.

There are actually a fair number of different combinations of aspects of
memory ordering. We will need to choose wisely. ;-)

My hope is that the store-ordering gets folded into the globally visible
transitive level, especially given that I have not (yet) seen any
algorithms used in production that relied on the ordering of independent
stores.

							Thanx, Paul

> Regards,
> Boqun
>
> > > - RELEASE -> ACQUIRE can be upgraded to a full barrier (including
> > >   transitivity) using smp_mb__release_acquire(), either before RELEASE
> > >   or after ACQUIRE (but consistently [*]).
> >
> > Hmm, but we don't actually need this for RELEASE -> ACQUIRE, afaict. This
> > is just needed for UNLOCK -> LOCK, and is exactly what RCU is currently
> > using (for PPC only).
> >
> > Stepping back a second, I believe that there are three cases:
> >
> >   RELEASE X -> ACQUIRE Y (same CPU)
> >     * Needs a barrier on TSO architectures for full ordering
> >
> >   UNLOCK X -> LOCK Y (same CPU)
> >     * Needs a barrier on PPC for full ordering
> >
> >   RELEASE X -> ACQUIRE X (different CPUs)
> >   UNLOCK X -> ACQUIRE X (different CPUs)
> >     * Fully ordered everywhere...
> >     * ... but needs a barrier on PPC to become a full barrier
[PATCH v3] powerpc/prom: Avoid reference to potentially freed memory
of_get_property() is used inside the loop, but the reference to the node is
dropped before dereferencing the prop pointer, which could by then point to
junk if the node has been freed. Instead, use of_property_read_u32() to
actually read the property value before dropping the reference.

Use of_get_next_parent() to simplify the code.

Signed-off-by: Christophe JAILLET
---
v2: Fix missing '{'
v3: Use of_get_next_parent to simplify code

*** COMPILE-TESTED ONLY ***
---
 arch/powerpc/kernel/prom.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index bef76c5..ba29c0d 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -783,17 +783,14 @@ void __init early_get_first_memblock_info(void *params, phys_addr_t *size)
 int of_get_ibm_chip_id(struct device_node *np)
 {
 	of_node_get(np);
-	while(np) {
-		struct device_node *old = np;
-		const __be32 *prop;
+	while (np) {
+		u32 chip_id;
 
-		prop = of_get_property(np, "ibm,chip-id", NULL);
-		if (prop) {
+		if (!of_property_read_u32(np, "ibm,chip-id", &chip_id)) {
 			of_node_put(np);
-			return be32_to_cpup(prop);
+			return chip_id;
 		}
-		np = of_get_parent(np);
-		of_node_put(old);
+		np = of_get_next_parent(np);
 	}
 	return -1;
 }
--
2.1.4
Re: [PATCH v6 22/22] of/platform: Defer probes of registered devices
On Mon, 2015-09-21 at 16:03 +0200, Tomeu Vizoso wrote:
> Instead of trying to match and probe platform and AMBA devices right
> after each is registered, delay their probes until device_initcall_sync.
>
> This means that devices will start probing once all built-in drivers
> have registered, and after all platform and AMBA devices from the DT
> have been registered already.
>
> This allows us to prevent deferred probes by probing dependencies on
> demand.
>
> Signed-off-by: Tomeu Vizoso
> ---
>
> Changes in v4:
> - Also defer probes of AMBA devices registered from the DT as they can
>   also request resources.
>
>  drivers/of/platform.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)

This breaks arch/powerpc/sysdev/fsl_pci.c. The PCI bus is an OF platform
device, and it must be probed before pcibios_init(), which is a
subsys_initcall(), or else the PCI bus never gets scanned.

-Scott
[PATCH] powerpc/msi: fix section mismatch warning
Building with CONFIG_DEBUG_SECTION_MISMATCH gives the following warning:

WARNING: vmlinux.o(.text+0x41fa8): Section mismatch in reference from the
function .msi_bitmap_alloc() to the function
.init.text:.memblock_virt_alloc_try_nid()
The function .msi_bitmap_alloc() references the function __init
.memblock_virt_alloc_try_nid(). This is often because .msi_bitmap_alloc
lacks a __init annotation or the annotation of .memblock_virt_alloc_try_nid
is wrong.

Signed-off-by: Denis Kirjanov
---
 arch/powerpc/include/asm/msi_bitmap.h | 2 +-
 arch/powerpc/sysdev/msi_bitmap.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/msi_bitmap.h b/arch/powerpc/include/asm/msi_bitmap.h
index 1ec7125..fbd3424 100644
--- a/arch/powerpc/include/asm/msi_bitmap.h
+++ b/arch/powerpc/include/asm/msi_bitmap.h
@@ -29,7 +29,7 @@ void msi_bitmap_reserve_hwirq(struct msi_bitmap *bmp, unsigned int hwirq);
 
 int msi_bitmap_reserve_dt_hwirqs(struct msi_bitmap *bmp);
 
-int msi_bitmap_alloc(struct msi_bitmap *bmp, unsigned int irq_count,
+int __init_refok msi_bitmap_alloc(struct msi_bitmap *bmp, unsigned int irq_count,
 		     struct device_node *of_node);
 void msi_bitmap_free(struct msi_bitmap *bmp);
 
diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c
index 1a826f3..ed5234e 100644
--- a/arch/powerpc/sysdev/msi_bitmap.c
+++ b/arch/powerpc/sysdev/msi_bitmap.c
@@ -112,7 +112,7 @@ int msi_bitmap_reserve_dt_hwirqs(struct msi_bitmap *bmp)
 	return 0;
 }
 
-int msi_bitmap_alloc(struct msi_bitmap *bmp, unsigned int irq_count,
+int __init_refok msi_bitmap_alloc(struct msi_bitmap *bmp, unsigned int irq_count,
 			 struct device_node *of_node)
 {
 	int size;
--
2.4.0