lling to go and benchmark both
> allocators to confirm Mel's observations and current users of SLAB
> can confirm their workloads do not regress either then let's just drop
> it.
>
Independent verification would be nice. Of particular interest would be
a real set of networking tests o
hsingularity.net
Fixes: e332f741a8dd ("mm, compaction: be selective about what pageblocks to clear skip hints")
Reported-by: Mikhail Gavrilov
Tested-by: Mikhail Gavrilov
Cc: Daniel Jordan
Cc: Qian Cai
Cc: Vlastimil Babka
Signed-off-by: Mel Gorman
---
mm/compaction.c | 27 +
ed-by: Mel Gorman
Cc: Daniel Jordan
Cc: Mikhail Gavrilov
Cc: Vlastimil Babka
Cc: Pavel Tatashin
Signed-off-by: Mel Gorman
---
mm/compaction.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index b4930bf93c8a..3319e0872d01 100644
--- a/mm/co
then please ignore this entirely
so the normal submission path is preserved. Otherwise, please either git
pull this or pick up the patches directly at your discretion.
Mel Gorman (1):
mm/compaction.c: correct zone boundary handling when resetting
pageblock skip hints
Qian Cai (1):
mm
Commit-ID: 0e9f02450da07fc7b1346c8c32c771555173e397
Gitweb: https://git.kernel.org/tip/0e9f02450da07fc7b1346c8c32c771555173e397
Author: Mel Gorman
AuthorDate: Tue, 19 Mar 2019 12:36:10 +
Committer: Ingo Molnar
CommitDate: Wed, 3 Apr 2019 09:50:22 +0200
sched/fair: Do not re-read
the Linux kernel and related features.
> test-url: http://linux-test-project.github.io/
>
>
> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>
> caused below changes (please refer to attached dmesg/kmsg for entire
> log/backtrace):
>
There are fixes queued in Andrew's tree that should cover this.
--
Mel Gorman
SUSE Labs
nline_page(block_pfn);
> > + if (!end_page)
> > + return false;
>
> Should not we check zone against page_zone() from both start and end page
> here.
The lower address has the max(block_pfn, zone->zone_start_pfn) and the
upper address has the min(block_pfn, zone_end_pfn(zone) - 1) check to
keep the PFN within the zone boundary.
--
Mel Gorman
SUSE Labs
selective about what pageblocks to
clear skip hints")
Reported-and-tested-by: Mikhail Gavrilov
Signed-off-by: Mel Gorman
---
mm/compaction.c | 27 +--
1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index f171a83707ce..b4
the motivation behind removing it or what we gain.
Yes, it's undocumented and it's unlikely that anyone will. Any potential
semantics are almost meaningless with mbind but there are two
possibilities. One, mbind is relaxed to allow migration within allowed
nodes and two, interleave could initially interleave but allow migration
to local node to get a mix of average performance at init and local
performance over time. No one tried taking that option so far but it
appears harmless to leave it alone too.
--
Mel Gorman
SUSE Labs
0
> kthread+0x32c/0x3f0
> ret_from_fork+0x35/0x40
>
> Fixes: dbe2d4e4f12e ("mm, compaction: round-robin the order while searching the free lists for a target")
> Signed-off-by: Qian Cai
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
no further oops after 10 days of testing.
As Peter pointed out, it is also necessary to use WRITE_ONCE to avoid any
potential problems with store tearing.
Fixes: 685207963be9 ("sched: Move h_load calculation to task_h_load()")
[pet...@infradead.org: Use WRITE_ONCE to protect against store tearing]
On Tue, Mar 19, 2019 at 01:06:09PM +0100, Peter Zijlstra wrote:
> > ---
> > kernel/sched/fair.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 310d0637fe4b..34aeb40e69d2 100644
> > --- a/kernel/sched/fair.c
> >
On Tue, Mar 19, 2019 at 11:38:25AM +, Valentin Schneider wrote:
> Hi,
>
> On 19/03/2019 09:35, Mel Gorman wrote:
> > A NULL pointer dereference bug was reported on a distribution kernel but
> > the same issue should be present on the mainline kernel. It occurred on s390
>
high overhead so this patch uses READ_ONCE to read h_load_next
only once and check for NULL before dereferencing. It was confirmed that
there were no further oops after 10 days of testing.
Signed-off-by: Mel Gorman
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
dif
> kthread+0x374/0x390
> ret_from_fork+0x10/0x18
>
> Fixes: 70b44595eafe ("mm, compaction: use free lists to quickly locate a migration source")
> Signed-off-by: Qian Cai
Acked-by: Mel Gorman
FWIW, I had seen the same message when trying to isolate potential
corrup
rth it; I hope Mel has no strong objection.
>
No objection, thanks!
--
Mel Gorman
SUSE Labs
> Acked-by: Johannes Weiner
> Acked-by: Vlastimil Babka
> Cc: Michal Hocko
> Cc: Rik van Riel
> Cc: Mel Gorman
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
Rik van Riel
> Cc: Johannes Weiner
> Cc: Michal Hocko
> Cc: Mel Gorman
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
> Acked-by: Vlastimil Babka
> Cc: Johannes Weiner
> Cc: Michal Hocko
> Cc: Rik van Riel
> Cc: Mel Gorman
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
c: Vlastimil Babka
> Cc: Mel Gorman
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
> | cpufreq_governor=performance |
> | nr_threads=100%              |
> | testtime=1s                  |
> +------------------------------+
>
Given full machine utilisation and a 1 second duration, it's a case
where saturating the local node early was sub-optimal and 1 second is
too short for load balancing or other factors to correct it.
Bottom line, the patch is a trade off but from a range of tests, I found
that on balance we benefit more from having tasks start local until
there is evidence that the kernel is justified to spread the load to
remote nodes.
--
Mel Gorman
SUSE Labs
On Fri, Feb 22, 2019 at 12:45:44PM +, Mel Gorman wrote:
> On Mon, Feb 18, 2019 at 09:49:10AM -0800, Linus Torvalds wrote:
> > On Mon, Feb 18, 2019 at 9:40 AM Peter Zijlstra wrote:
> > >
> > > However; whichever way around you turn this cookie; it is expensive and
&
if (vdiff > gran)
return 1;
}
I haven't tried debugging it yet.
--
Mel Gorman
SUSE Labs
s flaw back in 2010, see commit c01778001a4f
> ("ARM: 6379/1: Assume new page cache pages have dirty D-cache").
>
> My proposed fix moves the D-cache maintenance inside move_to_new_page
> to make it common for both cases.
>
> Signed-off-by: Lars Persson
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
talk slot.
--
Mel Gorman
SUSE Labs
On Wed, Feb 13, 2019 at 06:54:34PM +0100, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 05:47:56PM +0000, Mel Gorman wrote:
> > If there is a tangiable performance benefit from using contiguous regions
> > then I would suggest optimistically allocating them with appropriat
if
necessary. Don't stick it behind capabilities or restrict it to privileged
users. Only hugetlbfs provides restricted access and exposes an
interface to userspace for applications and even that can be
unprivileged.
--
Mel Gorman
SUSE Labs
orted-and-tested-by: Yury Norov
Tested-by: Will Deacon
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 12
1 file changed, 12 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d295c9bc01a8..bb1c7d843ebf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2170,6
On Wed, Feb 13, 2019 at 02:42:36PM +0100, Vlastimil Babka wrote:
> On 2/13/19 2:19 PM, Mel Gorman wrote:
> > Yury Norov reported that an arm64 KVM instance could not boot since after
> > v5.0-rc1 and could be addressed by reverting the patches
> >
> > 1c30844d2dfe272d58c
On Wed, Feb 13, 2019 at 02:51:15PM +0300, Yury Norov wrote:
> On Wed, Feb 13, 2019 at 11:14:09AM +0000, Mel Gorman wrote:
> > On Wed, Feb 13, 2019 at 11:25:40AM +0300, Yury Norov wrote:
> > > Hi Mel, all,
> > >
> > > My kernel on qemu/arm64 setu
fragmentation event occurs")
Reported-and-tested-by: Yury Norov
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d295c9bc01a8..ae7e4ba5b9f5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -
mpletely agree that it's worth pinning down both issues.
--
Mel Gorman
SUSE Labs
if only 1G is configured (log below). For the mm folks, it's
> probably worth pointing out that you're using 64k pages.
>
Thanks Will.
While I agree that going OOM early is a problem and would explain why
the boosting logic was hit at all, it's still the case that the boosting
should not divide by zero. Even if the booting is broken due to a lack
of memory, I'd still prefer not to crash due to 1c30844d2dfe272d58c.
--
Mel Gorman
SUSE Labs
ermark is very small. This
patch checks for the conditions and avoids boosting in those cases.
Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
Reported-by: Yury Norov
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 5 +
1
reproduced by running LTP tests on an arm64 server.
>
That was careless of me but the patch looks
correct. Andrew, this is a fix to the mmotm patch
mm-compaction-be-selective-about-what-pageblocks-to-clear-skip-hints.patch
Acked-by: Mel Gorman
Thanks Qian!
--
Mel Gorman
SUSE Labs
ugh the cracks.
Signed-off-by: Mel Gorman
---
drivers/gpu/drm/i915/i915_utils.h | 6 --
include/linux/list.h | 11 +++
mm/compaction.c | 10 ++
3 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_utils.h
b/driv
Vlastimil correctly pointed out that when a fast search fails and
cc->migrate_pfn is reinitialised to the lowest PFN found, the caller
does not use the updated PFN.
This is a fix for the mmotm patch
mm-compaction-use-free-lists-to-quickly-locate-a-migration-source.patch
Signed-off-by:
d here into cc->migrate_pfn will never get
> actually used, except when isolate_migratepages() returns with
> ISOLATED_ABORT.
> So maybe the infinite kcompactd loop is linked to ISOLATED_ABORT?
>
I'm not entirely sure it would fix the infinite loop. I suspect that is
going to be a boundary conditions where the two scanners are close but
do not meet if it still exists after the batch of fixes. However, you're
right that this code is problematic. I'll write a fix, test it and post
it if it's ok.
Well spotted!
--
Mel Gorman
SUSE Labs
the first page
then list_cut_before moves the entire list to sublist before splicing it
back so it's a pointless operation.
--
Mel Gorman
SUSE Labs
Vlastimil pointed out that a check for isolation is redundant in
__free_one_page as compaction_capture checks for it.
This is a fix for the mmotm patch
mm-compaction-capture-a-page-under-direct-compaction.patch
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 3 +--
1 file changed, 1 insertion
I would also be interested in discussing this topic. My activity is
mostly compaction-related but I believe it will evolve into something
that returns more sane data to the page allocator. That should make it a
bit easier to detect when local compaction fails and make it easier to
improve the page allocator workflow without throwing another workload
under a bus.
--
Mel Gorman
SUSE Labs
when the system is in normal use but kcompactd has not
pegged at 100%. At minimum, I'd like to see what the sources of high-order
allocations are and the likely causes of wakeups of kcompactd in case
there are any hints there. Your Kconfig is also potentially useful.
Thanks.
--
Mel Gorman
SUSE Labs
a more reasonable size. If not, reduce the sleep
time to gather a shorter interval.
2) Sample stack traces of kcompact while pegged at 100%
echo -n > /tmp/kcompactd-stack
for i in `seq 1 100`; do
	echo sample $i >> /tmp/kcompactd-stack
	cat /proc/`pidof kcompactd0`/stack >> /tmp/kcompactd-stack
done
gzip -f /tmp/kcompactd-stack
And mail me the resulting /tmp/kcompactd-stack.gz
Thanks.
--
Mel Gorman
SUSE Labs
ation scan/free
scanner meeting and exiting compaction. Again, a reproduction case of
some sort would be nice or an indication of how long it takes to
trigger. An update of the series is due which may or may not fix this
but if it doesn't, we'll need to start tracing this to see what's going
on at the point of failure.
--
Mel Gorman
SUSE Labs
> [<0>] remove_migration_ptes+0x69/0x70
> [<0>] migrate_pages+0xb6d/0xfd8
> [<0>] compact_zone+0xb70/0x1370
> [<0>] compact_zone_order+0xd8/0x120
> [<0>] try_to_compact_pages+0xe5/0x550
> [<0>] __alloc_pages_direct_compact+0x6d/0x1a0
> [<0>] __alloc_pages_slowpath+0x6c9/0x1640
> [<0>] __alloc_pages_nodemask+0x558/0x5b0
> [<0>] khugepaged+0x499/0x810
> [<0>] kthread+0x158/0x170
> [<0>] ret_from_fork+0x3a/0x50
> [<0>] 0x
>
> Looks like something has gone astray with compact_zone.
>
--
Mel Gorman
SUSE Labs
On Fri, Jan 18, 2019 at 05:51:14PM +, Mel Gorman wrote:
> This is a drop-in replacement for the series currently in Andrew's tree that
> incorporates static checking and compile warning fixes (Dan, YueHaibing)
> and extensive review feedback from Vlastimil. Big thanks to
scan rates are reduced as expected by 6% for the migration scanner
and 29% for the free scanner indicating that there is less redundant work.
Compaction migrate scanned    20815362    19573286
Compaction free scanned       16352612    11510663
Signed-off-by: Mel Gorman
---
include/linux/compact
in latency, success rates and scan rates. This is expected as clearing
the hints is not that common but doing a small amount of work out-of-band
to avoid a large amount of work in-band later is generally a good thing.
Signed-off-by: Mel Gorman
---
include/linux/mmzone.h | 2 +
mm/compaction.c
by the free scanner are almost full instead of
being properly packed. Previous testing had indicated that without this
patch there were occasional large spikes in the free scanner.
[dan.carpen...@oracle.com: Fix static checker warning]
Signed-off-by: Mel Gorman
Acked-by: Vlastimil
patches but it just makes the review slightly
harder.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 61 ++---
1 file changed, 23 insertions(+), 38 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index
recently.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 18 ++
1 file changed, 18 insertions(+)
diff --git a/mm/compaction.c b/mm/compaction.c
index 14bb66d48392..829540f6f3da 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1969,6 +1969,7
how pageblocks are treated as earlier iterations
of those patches hit corner cases where the restarts were punishing and
very visible.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 27 ++-
1 file changed, 10 insertions(+), 17 deletions(-)
diff --
scanned
by the migration scanner is greater than the free scanner due to the
increased search efficiency.
Signed-off-by: Mel Gorman
---
mm/compaction.c | 28 ++--
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index
( 0.00%)    14618.59 (  11.07%)
Amean     fault-both-30    17531.72 (   0.00%)    16650.96 (   5.02%)
Amean     fault-both-32    17101.96 (   0.00%)    17145.15 (  -0.25%)
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 23 ---
1 file changed, 4
are not materially different.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 16
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index b261c0bfac24..14bb66d48392 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
case where migration scan rates go through the roof due to a
dirty/writeback pageblock located at the boundary of the migration/free
scanners did not happen in this case. When it does happen, the scan
rates are multiplied by massive margins.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm
the primary justification for
this patch is that completing scanning of a pageblock is very important
for later patches.
[yuehaib...@huawei.com: Fix unused variable warning]
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 90
scanner are more dramatic which is a likely reflection that the
machine has more memory.
[dan.carpen...@oracle.com: Fix static checker warning]
[vba...@suse.cz: Correct number of pages scanned for lower orders]
Signed-off-by: Mel Gorman
---
mm/compaction.c | 218
are reduced by 38%.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 124
1 file changed, 99 insertions(+), 25 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 92d10eb3d1c7..7c4c9cce7907 100644
allocation
success rates. While not presented, there was a 31% reduction in migration
scanning and an 8% reduction in system CPU usage. A 2-socket machine showed
similar benefits.
[vba...@suse.cz: Migrate block that was found-fast, some optimisations]
Signed-off-by: Mel Gorman
---
mm/compaction.c | 176
was increased by less
than 1% which is marginal. However, detailed tracing indicated that
failure of migration due to a premature ENOMEM triggered by watermark
checks were eliminated.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1
The last_migrated_pfn field is a bit dubious as to whether it really helps
but either way, the information from it can be inferred without increasing
the size of compact_control so remove the field.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 25
It's non-obvious that high-order free pages are split into order-0 pages
from the function name. Fix it.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index
. The
change could be much deeper but this was enough to briefly clarify
the flow.
No functional change.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c | 54 ++
1 file changed, 26 insertions(+), 28 deletions(-)
diff --git
it is offset by future reductions
in scanning. Hence, the results are not presented this time due to a
misleading mix of gains/losses without any clear pattern. However, full
scanning of the pageblock is important for later patches.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/compaction.c
.00%)    21707.05 (   4.43%)
Amean     fault-both-32    21692.92 (   0.00%)    21968.16 (  -1.27%)
The 2-socket results are not materially different. Scan rates are similar
as expected.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/migrate.c | 2 +-
1 file changed, 1 insertion(+), 1 delet
The isolate and migrate scanners should never isolate more than a pageblock
of pages so unsigned int is sufficient saving 8 bytes on a 64-bit build.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/internal.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm
compact_control spans two cache lines with write-intensive lines on
both. Rearrange so the most write-intensive fields are in the same
cache line. This has a negligible impact on the overall performance of
compaction and is more a tidying exercise than anything.
Signed-off-by: Mel Gorman
Acked
This is a drop-in replacement for the series currently in Andrew's tree that
incorporates static checking and compile warning fixes (Dan, YueHaibing)
and extensive review feedback from Vlastimil. Big thanks to Vlastimil as
the review was extremely detailed and a number of issues were caught. Not
On Fri, Jan 18, 2019 at 02:51:00PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:50 PM, Mel Gorman wrote:
> > Remote compaction is expensive and possibly counter-productive. Locality
> > is expected to often have better performance characteristics than remote
> > high-o
On Fri, Jan 18, 2019 at 02:40:00PM +0100, Vlastimil Babka wrote:
> > Signed-off-by: Mel Gorman
>
> Great, you crossed off this old TODO item, and didn't need pageblock isolation
> to do that :D
>
The TODO is not just old, it's ancient! The idea of capture was first
floated
> *cc)
> > unsigned long isolate_start_pfn; /* exact pfn we start at */
> > unsigned long block_end_pfn;/* end of current pageblock */
> > unsigned long low_pfn; /* lowest pfn scanner is able to scan */
> > - unsigned long nr_isolated;
> > struct list_head *freelist = >freepages;
> > unsigned int stride;
> >
> > @@ -1374,6 +1453,8 @@ static void isolate_freepages(struct compact_control
> > *cc)
> > block_end_pfn = block_start_pfn,
> > block_start_pfn -= pageblock_nr_pages,
> > isolate_start_pfn = block_start_pfn) {
> > + unsigned long nr_isolated;
>
> Unrelated cleanup? Nevermind.
>
I'll move the hunks to "mm, compaction: Sample pageblocks for free
pages" where they belong
--
Mel Gorman
SUSE Labs
On Fri, Jan 18, 2019 at 11:38:38AM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:50 PM, Mel Gorman wrote:
> > Once fast searching finishes, there is a possibility that the linear
> > scanner is scanning full blocks found by the fast scanner earlier. This
> > patch uses an adap
On Thu, Jan 17, 2019 at 06:58:30PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:50 PM, Mel Gorman wrote:
> > The fast isolation of pages can move the scanner faster than is necessary
> > depending on the contents of the free list. This patch will only allow
> > the fast is
On Thu, Jan 17, 2019 at 06:33:37PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:50 PM, Mel Gorman wrote:
> > Scanning on large machines can take a considerable length of time and
> > eventually need to be rescheduled. This is treated as an abort event but
> > that's not appro
On Thu, Jan 17, 2019 at 06:17:28PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:50 PM, Mel Gorman wrote:
> > Migrate has separate cached PFNs for ASYNC and SYNC* migration on the
> > basis that some migrations will fail in ASYNC mode. However, if the cached
> > PFNs match at
On Thu, Jan 17, 2019 at 06:01:18PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:50 PM, Mel Gorman wrote:
> > When scanning for sources or targets, PageCompound is checked for huge
> > pages as they can be skipped quickly but it happens relatively late after
> > a lot
> >
> > Signed-off-by: Mel Gorman
>
> Acked-by: Vlastimil Babka
>
> Some comments below.
>
Thanks
> > @@ -538,18 +535,8 @@ static unsigned long isolate_freepages_block(struct
> > compact_control *cc,
> > * r
On Thu, Jan 17, 2019 at 04:16:54PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:50 PM, Mel Gorman wrote:
> > Pageblocks are marked for skip when no pages are isolated after a scan.
> > However, it's possible to hit corner cases where the migration scanner
> > gets stuck near
rse(freepage, freelist, lru) {
> > + unsigned long pfn;
> > +
> > + order_scanned++;
> > + nr_scanned++;
>
> Seems order_scanned is supposed to be reset to 0 for each new order? Otherwise
> it's equivalent to nr_scanned...
>
Yes, it was meant to be. Not sure at what point I broke that and failed
to spot it afterwards. As you note elsewhere, the code structure doesn't
make sense if it wasn't being set to 0. Instead of doing a shorter search
at each order, it would simply check one page for each lower order.
Thanks!
--
Mel Gorman
SUSE Labs
56 [inline]
>
> Mel's new code... but might be just a victim of e.g. bad struct page
> initialization?
>
The error looks like compaction found a !PageBuddy on the free lists
while the zone lock was held. That seems bad no matter what. I expect
there will be a respin of the entire series relatively soon but none of
the fixes so far would be for that level of damage.
--
Mel Gorman
SUSE Labs
On Wed, Jan 16, 2019 at 04:45:59PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:49 PM, Mel Gorman wrote:
> > Due to either a fast search of the free list or a linear scan, it is
> > possible for multiple compaction instances to pick the same pageblock
> > for migration.
On Wed, Jan 16, 2019 at 04:00:22PM +0100, Vlastimil Babka wrote:
> On 1/16/19 3:33 PM, Mel Gorman wrote:
> >>> + break;
> >>> + }
> >>> +
> >>> + /*
> >>>
On Wed, Jan 16, 2019 at 02:15:10PM +0100, Vlastimil Babka wrote:
> >
> > + if (free_pfn < high_pfn) {
> > + update_fast_start_pfn(cc, free_pfn);
> > +
> > + /*
> > +* Avoid if skipped recently. Move
On Tue, Jan 15, 2019 at 01:39:28PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:49 PM, Mel Gorman wrote:
> > release_pages() is a simpler version of free_unref_page_list() but it
> > tracks the highest PFN for caching the restart point of the compaction
> > free scanner.
On Tue, Jan 15, 2019 at 12:50:45PM +, Mel Gorman wrote:
> > AFAICS memory allocator is not the only user of PageReserved. There
> > seems to be some drivers as well, notably the DRM subsystem via
> > drm_pci_alloc(). There's an effort to clean those up [1] but until the
On Tue, Jan 15, 2019 at 01:10:57PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:49 PM, Mel Gorman wrote:
> > Reserved pages are set at boot time, tend to be clustered and almost never
> > become unreserved. When isolating pages for either migration sources or
> > target, skip
deal with some cases but I'm not sure it'll survive
long-term, particularly if HPC continues to report in the field that
reboots are necessary to reshuffle the lists (taken from your linked
documents). That workaround of running STREAM before a job starts and
rebooting the machine if the performance SLAs are not met is horrid.
--
Mel Gorman
SUSE Labs
On Wed, Jan 09, 2019 at 11:27:31AM -0800, Andrew Morton wrote:
> On Wed, 9 Jan 2019 11:13:44 +0000 Mel Gorman
> wrote:
>
> > Full compaction of a node passes in negative orders which can lead to array
> > boundary issues. While it could be addressed in the control flow of
er is signed. This is a fix to the mmotm patch
broken-out/mm-compaction-round-robin-the-order-while-searching-the-free-lists-for-a-target.patch
Signed-off-by: Mel Gorman
---
mm/internal.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/internal.h b/mm/internal.h
index d028abd8a8f3..
tion.patch
Signed-off-by: YueHaibing
Signed-off-by: Mel Gorman
---
mm/compaction.c | 5 -
1 file changed, 5 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 51da4691092b..ca8da58ce1cd 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1963,7 +1963,6 @@ static enum comp
-to-quickly-locate-a-migration-target.patch
Signed-off-by: Mel Gorman
---
mm/compaction.c | 4
1 file changed, 4 insertions(+)
diff --git a/mm/compaction.c b/mm/compaction.c
index 9438f0564ed5..167ad0f5c2fe 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1206,6 +1206,10
t;)
>
Dang. This is left-over debugging code that got accidentally merged
during a rebase. Andrew, can you pick this up as a fix to the mmotm
patch mm-compaction-finish-pageblock-scanning-on-contention.patch please?
Thanks YueHaibing.
--
Mel Gorman
SUSE Labs
lly a task waking 2+
wakees that temporarily stack on one CPU when nearby CPUs sharing LLC
remain idle. It's why the select idle sibling logic tried to take into
account a recently used CPU to wake such tasks if the recent CPU was
still idle.
--
Mel Gorman
SUSE Labs
On Mon, Jan 07, 2019 at 03:43:54PM -0800, Andrew Morton wrote:
> On Fri, 4 Jan 2019 12:49:46 +0000 Mel Gorman
> wrote:
>
> > This series reduces scan rates and success rates of compaction, primarily
> > by using the free lists to shorten scans, better controlling of
positive and negative effects,
it is best to avoid the possibility of remote compaction given the cost
relative to any potential benefit.
Signed-off-by: Mel Gorman
---
mm/compaction.c | 10 ++
1 file changed, 10 insertions(+)
diff --git a/mm/compaction.c b/mm/compaction.c
index
but will not be universally true.
Signed-off-by: Mel Gorman
---
mm/compaction.c | 33 ++---
mm/internal.h | 3 ++-
2 files changed, 32 insertions(+), 4 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 6c5552c6d8f9..652e249168b1 100644
--- a/mm/compaction.c
+++ b/mm
2590377.00 7986174.00
The impact on 2-socket is much larger albeit not presented. Under
a different workload that fragments heavily, the allocation latency
is reduced by 26% while the success rate goes from 63% to 80%
Signed-off-by: Mel Gorman
---
include/linux/compaction.h | 3 ++-
inc
in latency, success rates and scan rates. This is expected as clearing
the hints is not that common but doing a small amount of work out-of-band
to avoid a large amount of work in-band later is generally a good thing.
Signed-off-by: Mel Gorman
---
include/linux/mmzone.h | 2 +
mm/compaction.c
but the
free scan rate is reduced by 87% on a 1-socket machine and 92% on a
2-socket machine. It's also the first time in the series where the number
of pages scanned by the migration scanner is greater than the free scanner
due to the increased search efficiency.
Signed-off-by: Mel Gorman
---
mm