On Tue, Jun 10, 2014 at 11:40:09AM +0200, Vlastimil Babka wrote:
> commit d3132e4b83e6bd383c74d716f7281d7c3136089c upstream.
>
> Compaction caches pfn's for its migrate and free scanners to avoid
> scanning the whole zone each time. In compact_zone(), the cached values
> are read to set up initial values for the scanners. There are several
> situations when these cached pfn's are reset to the first and last pfn
> of the zone, respectively. One of these situations is when a compaction
> has been deferred for a zone and is now being restarted during a direct
> compaction, which is also done in compact_zone().
>
> However, compact_zone() currently reads the cached pfn's *before*
> resetting them. This means the reset doesn't affect the compaction that
> performs it, and with good chance also subsequent compactions, as
> update_pageblock_skip() is likely to be called and update the cached
> pfn's to those being processed. Another chance for a successful reset
> is when a direct compaction detects that migration and free scanners
> meet (which has its own problems addressed by another patch) and sets
> update_pageblock_skip flag which kswapd uses to do the reset because it
> goes to sleep.
>
> This is clearly a bug that results in non-deterministic behavior, so
> this patch moves the cached pfn reset to be performed *before* the
> values are read.
>
> Signed-off-by: Vlastimil Babka <[email protected]>
> Acked-by: Mel Gorman <[email protected]>
> Acked-by: Rik van Riel <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
> ---
> I have realized that this should have been CC'd stable as well. It fixes a bug
> that makes compaction nondeterministic and broken.
> Please apply on 3.10 and 3.12 (fixes 3.7+)
Thank you, I'm queuing the first 2 patches for the 3.11 kernel as well
(the last one was tagged for stable, so it was already included).
Cheers,
--
Luís
>
> mm/compaction.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index d2c6751..c2082d8 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -954,6 +954,14 @@ static int compact_zone(struct zone *zone, struct
> compact_control *cc)
> }
>
> /*
> + * Clear pageblock skip if there were failures recently and compaction
> + * is about to be retried after being deferred. kswapd does not do
> + * this reset as it'll reset the cached information when going to sleep.
> + */
> + if (compaction_restarting(zone, cc->order) && !current_is_kswapd())
> + __reset_isolation_suitable(zone);
> +
> + /*
> * Setup to move all movable pages to the end of the zone. Used cached
> * information on where the scanners should start but check that it
> * is initialised by ensuring the values are within zone boundaries.
> @@ -969,14 +977,6 @@ static int compact_zone(struct zone *zone, struct
> compact_control *cc)
> zone->compact_cached_migrate_pfn = cc->migrate_pfn;
> }
>
> - /*
> - * Clear pageblock skip if there were failures recently and compaction
> - * is about to be retried after being deferred. kswapd does not do
> - * this reset as it'll reset the cached information when going to sleep.
> - */
> - if (compaction_restarting(zone, cc->order) && !current_is_kswapd())
> - __reset_isolation_suitable(zone);
> -
> migrate_prep_local();
>
> while ((ret = compact_finished(zone, cc)) == COMPACT_CONTINUE) {
> --
> 1.8.4.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe stable" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html