-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/12/2015 03:30 PM, Michal Hocko wrote:
> On Thu 12-03-15 11:22:56, Eric B Munson wrote:
>> Currently, pages which are marked as unevictable are protected
>> from compaction, but not from other types of migration.  The
>> mlock desctription does not promise that all page faults will be
>> avoided, only major ones so this protection is not necessary.
>> This extra protection can cause problems for applications that
>> are using mlock to avoid swapping pages out, but require order >
>> 0 allocations to continue to succeed in a fragmented environment.
>> This patch adds a sysctl entry that will be used to allow root to
>> enable compaction of unevictable pages.
> 
> It would be appropriate to add a justification for the sysctl,
> because it is not obvious from the above description. mlock
> preventing from the swapout is not sufficient to justify it. It is
> the real time extension mentioned by Peter in the previous version
> which makes it worth a new user visible knob.
> 
> I would also argue that the knob should be enabled by default
> because the real time extension requires an additional changes
> anyway (rt-kernel at least) while general usage doesn't need such a
> strong requirement.

Thanks for the review, I will incorporate your suggestions into a V5.
 I agree that many users will want to set this to 1, but keeping the
default to 0 maintains the behavior of the kernel today.  I'd like to
have the real time folks say that they are okay with a default of 1
before I make that change.

> 
> You also should provide a knob description to 
> Documentation/sysctl/vm.txt

Will do.

> 
>> To illustrate this problem I wrote a quick test program that
>> mmaps a large number of 1MB files filled with random data.  These
>> maps are created locked and read only.  Then every other mmap is
>> unmapped and I attempt to allocate huge pages to the static huge
>> page pool.  When the compact_unevictable sysctl is 0, I cannot
>> allocate hugepages after fragmenting memory.  When the value is
>> set to 1, allocations succeed.
>> 
>> Signed-off-by: Eric B Munson <[email protected]> Cc: Vlastimil
>> Babka <[email protected]> Cc: Thomas Gleixner <[email protected]> 
>> Cc: Christoph Lameter <[email protected]> Cc: Peter Zijlstra
>> <[email protected]> Cc: Mel Gorman <[email protected]> Cc: David
>> Rientjes <[email protected]> Cc: Rik van Riel
>> <[email protected]> Cc: [email protected] Cc:
>> [email protected]
> 
> After the above things are fixed Acked-by: Michal Hocko
> <[email protected]>
> 
> One minor suggestion below
> 
>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c index
>> 88ea2d6..cc1a678 100644 --- a/kernel/sysctl.c +++
>> b/kernel/sysctl.c @@ -1313,6 +1313,13 @@ static struct ctl_table
>> vm_table[] = { .extra1               = &min_extfrag_threshold, .extra2       
>>         =
>> &max_extfrag_threshold, }, + { +             .procname       =
>> "compact_unevictable", +             .data           = 
>> &sysctl_compact_unevictable, +
>> .maxlen              = sizeof(int), +                .mode           = 0644, 
>> +               .proc_handler   =
>> proc_dointvec,
> 
> You can use .extra1 = &zero and .extra2 = &one to reduce the value 
> space.
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQIcBAEBAgAGBQJVAezjAAoJELbVsDOpoOa9MHwP/j4nvm3dFm00qjOX4RKxsalz
cjhQxuozKnRU9H+OPS3dXXoqDGdpLaLu6CsUsu8FGiJj3zLgUNxea+quJSnSmYVz
8fO5VqhgA3alu7R7zSF3MtOjLzyOoP5/+jDNiNUDLL8sUzg/3hKXLUgBO9R1VU4Q
yD0Yuhw5veNLOvF57xhMCk/quCydIvZV9kAJyTr+fgoY4b8wLyp+QAcqi2lGMCBj
4W9lXtO1abG+gu/m5zAhXLX7MS+ZRQtA070G+kmkY7Z95DtKePGitNjLN7+X9EI6
F1073D+GtiEOJhC+xNOc6Xzwpfl4vRghg4jj6aTkSSrb+sY5/byuKg06p8rMRfef
pJrqjprbBNqiAP95z7X9H6FWty31kx6ZVXtM8CA9/XDqabJGgGs0qmDwPVf264+M
8ySZy5wPRE85yNUKElpvDnx7+t1gka8vDy3bVO+zPsJV3ZqSwhgAiYTTL6u2f/Qe
QwMXWgu4PaAeq0Wltrd/OtA6Fu9H9A91rkk8t69ctPkTjZYCgN0UDGzaa0WpH9SB
H2mz2B3+AE2sdpuoBoVQ62SU4/7PiBIT/ILRuzgQnnNELZFStjstRZPrnVSAYRKI
E5ArRQHfMwbortIiz9KH8SoibTyS0QZiXuKua6LXPwGTnvYqsSN8Jz1XoytV3I5G
MRhLUI7k4dgaVHPTVUYb
=qBj6
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to