On Fri, Feb 09, 2018 at 12:07:41PM -0600, Bjorn Helgaas wrote:
> On Fri, Feb 09, 2018 at 05:23:58PM +1100, Alexey Kardashevskiy wrote:
> > Commit 59f47eff03a0 ("powerpc/pci: Use of_irq_parse_and_map_pci() helper")
> > replaced of_irq_parse_pci() + irq_create_of_mapping() with
> >
The number of high slices a process might use now depends on its
address space size and the allocation address it has requested.
This patch uses that limit throughout call chains where possible,
rather than using the fixed SLICE_NUM_HIGH for bitmap operations.
This saves some cost for processes
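A user-space sketch of the idea (the helper name and the 1TB-per-high-slice figure are assumptions for illustration): bitmap operations only need to scan up to the slice count implied by the task's address-space limit, not the fixed maximum.

```c
#include <assert.h>

/* Assumed geometry: each high slice covers 1TB (1UL << 40). Bitmap
 * operations then only need to scan up to the slice count implied by
 * the task's address-space limit, rather than a fixed SLICE_NUM_HIGH. */
#define SLICE_HIGH_SHIFT 40

static unsigned long high_slice_count(unsigned long addr_limit)
{
	return addr_limit >> SLICE_HIGH_SHIFT;
}
```

With a 512TB limit this gives 512 slices to scan; only tasks that opt in to the larger address space pay for more.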
For addresses above 512TB we allocate an additional mmu context. To keep it
simple, addresses above 512TB are handled with IR/DR=1 and with a stack frame setup.
We do the additional context allocation in the SLB miss handler. If the context is
not allocated, we enable interrupts and allocate the context and
Hi All,
The AmigaOne X1000 doesn’t boot anymore since the PCI updates. I have seen
that the PCI updates are different from the updates below. The code below works
but the latest does not. Is there a problem with the latest PCI updates currently?
Thanks,
Christian
Sent from my iPhone
On 2. Dec
The slice_mask cache was a basic conversion which copied the slice
mask into caller's structures, because that's how the original code
worked. In most cases the pointer can be used directly instead, saving
a copy and an on-stack structure.
This also converts the slice_mask bit operation helpers
Rather than build slice masks from a range then use that to check for
fit in a candidate mask, implement slice_check_range_fits that checks
if a range fits in a mask directly.
This allows several structures to be removed from stacks, and also we
don't expect a huge range in a lot of these cases,
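A minimal stand-alone sketch of the approach described above (a 64-bit word stands in for the kernel's slice bitmap; the signature is illustrative): instead of building a mask for the range and comparing masks, test the range's bits directly against the candidate mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Does every slice in [start, start + nr) have its bit set in
 * `available`? Checking directly avoids constructing an intermediate
 * range mask on the stack. */
static bool slice_check_range_fits(uint64_t available,
				   unsigned int start, unsigned int nr)
{
	for (unsigned int i = start; i < start + nr; i++)
		if (!(available & (1ULL << i)))
			return false;
	return true;
}
```

The early return also matches the observation that most requested ranges are small, so few bits need checking.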
Aneesh Kumar K.V writes:
> "Aneesh Kumar K.V" writes:
>
>> To support memory keys, we moved the hash pte slot information to the second
>> half of the page table. This was ok with PTE entries at level 4 and level 3.
>> We already
This patch series extends the max virtual address space value from 512TB
to 4PB with 64K page size. We do that by allocating one vsid context for
each 512TB range. More details are explained in patch 4.
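Since 512TB is 2^49, the 512TB region an address belongs to (and hence which vsid context it needs) can be read off the address's top bits. A hedged sketch, with a made-up helper name:

```c
#include <assert.h>

/* 512TB == 1UL << 49, so the 512TB region an effective address falls
 * in is just its top bits; with a 4PB limit there are
 * 4PB / 512TB == 8 such regions. */
static unsigned int ea_to_region(unsigned long ea)
{
	return (unsigned int)(ea >> 49);
}
```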
Aneesh Kumar K.V (5):
powerpc: Don't do runtime futex_cmpxchg test
futex_detect_cmpxchg() does a cmpxchg_futex_value_locked on a NULL user addr to
detect at runtime whether the architecture implements atomic cmpxchg for futex. POWER
does implement the feature, hence we can enable the config instead of depending
on runtime detection.
We could possibly enable this on
---
arch/powerpc/include/asm/book3s/64/hash-64k.h | 2 +-
arch/powerpc/include/asm/processor.h | 9 -
arch/powerpc/mm/init_64.c | 6 --
arch/powerpc/mm/pgtable_64.c | 5 -
4 files changed, 9 insertions(+), 13 deletions(-)
diff --git
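A rough user-space analogue of the probe being removed (illustrative only; the real kernel probe checks the error returned by cmpxchg_futex_value_locked on a NULL user address): an architecture whose atomic compare-exchange always works can declare the capability statically instead.

```c
#include <assert.h>
#include <stdatomic.h>

/* User-space stand-in: exercise an atomic compare-exchange once and
 * report success. On POWER the primitive always works, so the kernel
 * can select the feature in Kconfig instead of probing at boot. */
static int have_atomic_cmpxchg(void)
{
	atomic_int v = 0;
	int expected = 0;

	return atomic_compare_exchange_strong(&v, &expected, 1);
}
```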
This series intends to improve performance and reduce stack
consumption in the slice allocation code. It does it by keeping slice
masks in the mm_context rather than compute them for each allocation,
and by reducing bitmaps and slice_masks from stacks, using pointers
instead where possible.
Pass around const pointers to struct slice_mask where possible, rather
than copies of slice_mask, to reduce stack and call overhead.
checkstack.pl gives, before:
0x0de4 slice_get_unmapped_area [slice.o]: 656
0x1b4c is_hugepage_only_range [slice.o]: 512
0x075c
Calculating the slice mask can become a significant overhead for
get_unmapped_area. This patch adds a struct slice_mask for
each page size in the mm_context, and keeps these in sync with
the slices psize arrays and slb_addr_limit.
This saves about 30% kernel time on a single-page mmap/munmap
We will make code changes in the next patch. To make the review easier, split
the documentation update into a separate patch.
Signed-off-by: Aneesh Kumar K.V
---
arch/powerpc/mm/slice.c | 27 +++
1 file changed, 19 insertions(+), 8
This patch kills the potential_mask and compat_mask variables and instead uses tmp_mask
so that we can reduce the stack usage. This is required so that we can increase
the high_slices bitmap to a larger value.
The patch does result in extra computation in the final stage, where it ends up
recomputing the
On Sat, 10 Feb 2018 13:54:25 +0100 (CET)
Christophe Leroy wrote:
> bitmap_or() and bitmap_andnot() can work properly with dst identical
> to src1 or src2. There is no need of an intermediate result bitmap
> that is copied back to dst in a second step.
Everyone seems to
On 10/02/2018 at 15:43, Nicholas Piggin wrote:
On Sat, 10 Feb 2018 13:54:25 +0100 (CET)
Christophe Leroy wrote:
bitmap_or() and bitmap_andnot() can work properly with dst identical
to src1 or src2. There is no need of an intermediate result bitmap
that is copied
On 29/01/2018 at 07:23, Aneesh Kumar K.V wrote:
Christophe Leroy writes:
In preparation for the following patch which will fix an issue on
the 8xx by re-using the 'slices', this patch enhances the
'slices' implementation to support 32-bit CPUs.
On PPC32, the
On 29/01/2018 at 07:29, Aneesh Kumar K.V wrote:
Christophe Leroy writes:
While the implementation of the "slices" address space allows
a significant amount of high slices, it limits the number of
low slices to 16 due to the use of a single u64 low_slices_psize
In preparation for the following patch which will fix an issue on
the 8xx by re-using the 'slices', this patch enhances the
'slices' implementation to support 32-bit CPUs.
On PPC32, the address space is limited to 4Gbytes, hence only the low
slices will be used.
This patch moves "slices"
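Assuming the usual 256MB low-slice size (carried over from the 64-bit code), 16 low slices span exactly the 32-bit address space, which is why only the low slices are needed on PPC32. A tiny sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 256MB per low slice: a 32-bit address maps to one of
 * exactly 16 low slices, which together cover the full 4GB. */
static unsigned int addr_to_low_slice(uint32_t addr)
{
	return addr / (256U << 20);
}
```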
On the 8xx, the page size is set in the PMD entry and applies to
all pages of the page table pointed by the said PMD entry.
When an app has some regular pages allocated (e.g. see below) and tries
to mmap() a huge page at a hint address covered by the same PMD entry,
the kernel accepts the hint
On the 8xx, the minimum slice size is the size of the area
covered by a single PMD entry, ie 4M in 4K pages mode and 64M in
16K pages mode.
This patch increases the number of slices from 16 to 64 on the 8xx.
Signed-off-by: Christophe Leroy
---
v4: New
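The 8xx slice-size figures above follow from page-table geometry: the area covered by one PMD entry is the number of PTEs per page table times the page size. The PTE counts below are assumptions for illustration.

```c
#include <assert.h>

/* Area covered by one PMD entry = PTEs per page table * page size. */
static unsigned long pmd_area(unsigned long ptes_per_table,
			      unsigned long page_size)
{
	return ptes_per_table * page_size;
}
```

With 64M minimum slices, 4GB / 64M = 64, matching the new slice count.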
bitmap_or() and bitmap_andnot() can work properly with dst identical
to src1 or src2. There is no need of an intermediate result bitmap
that is copied back to dst in a second step.
Signed-off-by: Christophe Leroy
Reviewed-by: Aneesh Kumar K.V
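A one-word-at-a-time stand-in (not the kernel implementation) showing why the aliasing is safe: each destination word is computed purely from the corresponding source words, so dst may be src1 or src2.

```c
#include <assert.h>
#include <stdint.h>

/* Word-at-a-time OR over two bitmaps. dst[i] is computed only from
 * src1[i] and src2[i], so dst may alias either source safely. */
static void bitmap_or_words(uint64_t *dst, const uint64_t *src1,
			    const uint64_t *src2, unsigned int nwords)
{
	for (unsigned int i = 0; i < nwords; i++)
		dst[i] = src1[i] | src2[i];
}
```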
While the implementation of the "slices" address space allows
a significant amount of high slices, it limits the number of
low slices to 16 due to the use of a single u64 low_slices_psize
element in struct mm_context_t
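The 16-slice limit is arithmetic: with one 4-bit page-size field per slice (an assumption consistent with the description above), a single u64 holds at most 64 / 4 = 16 entries. A sketch of extracting one field:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: one 4-bit psize field per low slice packed in a
 * single u64, capping the number of low slices at 64 / 4 == 16. */
static unsigned int low_slice_psize(uint64_t low_slices_psize,
				    unsigned int slice)
{
	return (low_slices_psize >> (slice * 4)) & 0xF;
}
```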
On the 8xx, the minimum slice size is the size of the area
covered by a single
On 02/10/2018 10:20 PM, Ram Pai wrote:
On Sat, Feb 10, 2018 at 03:17:02PM +0530, Aneesh Kumar K.V wrote:
Aneesh Kumar K.V writes:
"Aneesh Kumar K.V" writes:
To support memory keys, we moved the hash pte slot information
On 02/10/2018 06:08 AM, Alexei Starovoitov wrote:
> On 2/9/18 8:54 AM, Naveen N. Rao wrote:
>> Naveen N. Rao wrote:
>>> Alexei Starovoitov wrote:
On 2/8/18 4:03 AM, Sandipan Das wrote:
> The imm field of a bpf_insn is a signed 32-bit integer. For
> JIT-ed bpf-to-bpf function calls,
On Sat, Feb 10, 2018 at 03:17:02PM +0530, Aneesh Kumar K.V wrote:
> Aneesh Kumar K.V writes:
>
> > "Aneesh Kumar K.V" writes:
> >
> >> To support memory keys, we moved the hash pte slot information to the
> >> second
> >> half
On Sat, Feb 10, 2018 at 09:05:40AM +0100, Christian Zigotzky wrote:
> Hi All,
>
> The AmigaOne X1000 doesn’t boot anymore since the PCI updates. I
> have seen, that the PCI updates are different to the updates below.
> The code below works but the latest not. Is there a problem with the
> latest
PCI fixes:
- fix POWER9/powernv INTx regression from the merge window (Alexey
Kardashevskiy)
The following changes since commit ab8c609356fbe8dbcd44df11e884ce8cddf3739e:
Merge branch 'pci/spdx' into next (2018-02-01 11:40:07 -0600)
are available in the Git repository at: