Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-12 Thread Luiz Capitulino
On Tue, 11 Feb 2014 19:59:40 -0800 (PST)
David Rientjes  wrote:

> On Tue, 11 Feb 2014, Luiz Capitulino wrote:
> 
> > > > HugeTLB command-line option hugepages= allows the user to specify how many
> > > > huge pages should be allocated at boot. On NUMA systems, this argument
> > > > automatically distributes huge pages allocation among nodes, which can
> > > > be undesirable.
> > > > 
> > > 
> > > And when hugepages can no longer be allocated on a node because it is too 
> > > small, the remaining hugepages are distributed over nodes with memory 
> > > available, correct?
> > 
> > No. hugepagesnid= tries to obey what was specified by the user as much as
> > possible.
> 
> I'm referring to what I quoted above, the hugepages= parameter. 

Oh, OK.

> I'm 
> saying that using existing functionality you can reserve an excess of 
> hugepages and then free unneeded hugepages at runtime to get the desired 
> amount allocated only on a specific node.

I got that part. I just don't think it is a good solution, as I explain
below.

> > > Strange, it would seem better to just reserve as many hugepages as you 
> > > want so that you get the desired number on each node and then free the 
> > > ones you don't need at runtime.
> > 
> > You mean, for example, if I have a 2 node system and want 2 1G huge pages
> > from node 1, then I have to allocate 4 1G huge pages and then free 2 pages
> > on node 0 after boot? That seems very cumbersome to me. Besides, what if
> > node0 needs this memory during boot?
> > 
> 
> All of this functionality, including the current hugepages= reservation at 
> boot, needs to show that it can't be done as late as when you could run an 
> initscript to do the reservation at runtime and fragmentation is at its 
> lowest level when userspace first becomes available.

It's not that it can't. The point is that for 1G huge pages it's more
reliable to allocate them as early as possible during the kernel boot
process. I'm all for having/improving 1G allocation support at run-time,
and volunteer to help with that effort, but that's something that can
(and IMO should) be done on top of this series.

> I don't see any justification given in the patchset that suggests you 
> can't simply do this in an initscript if it is possible to allocate 1GB 
> pages at runtime.  If it's too late because of oom, then your userspace is 
> going to oom anyway if you reserve the hugepages at boot; if it's too late 
> because of fragmentation, let's work on that issue (and justification why 
> things like movablecore= don't work for you).
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-12 Thread Luiz Capitulino
On Wed, 12 Feb 2014 03:37:11 +0100
Andi Kleen  wrote:

> > The real syntax is hugepagesnid=nid,nr-pages,size. Which looks straightforward
> > to me. I honestly can't think of anything better than that, but I'm open for
> > suggestions.
> 
> hugepages_node=nid:nr-pages:size,... ? 

Looks good, I'll consider using it for v2.
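
For illustration only, here is a minimal Python sketch of how the proposed
hugepages_node=nid:nr-pages:size,... value could be split into per-node
requests. The function name, the accepted K/M/G suffixes and the error
handling are assumptions made for this sketch; nothing here is taken from
the patches.

```python
def parse_hugepages_node(value):
    """Parse 'nid:nr-pages:size,...' into (nid, nr_pages, size_bytes) tuples.

    Hypothetical helper: it only models the syntax suggested in this thread.
    """
    # Assumed size suffixes; the real option may accept a different set.
    units = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
    requests = []
    for entry in value.split(","):
        fields = entry.split(":")
        if len(fields) != 3:
            raise ValueError(f"expected nid:nr-pages:size, got {entry!r}")
        nid, nr_pages, size = fields
        suffix = size[-1:].upper()
        if suffix not in units:
            raise ValueError(f"unknown size suffix in {size!r}")
        requests.append((int(nid), int(nr_pages), int(size[:-1]) * units[suffix]))
    return requests


if __name__ == "__main__":
    # Two 1G pages from node 0 and four 1G pages from node 1.
    print(parse_hugepages_node("0:2:1G,1:4:1G"))
```

A comma-separated list keeps every node request in a single parameter, which
is the main difference from the one-request-per-option hugepagesnid= form.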


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-12 Thread Mel Gorman
On Tue, Feb 11, 2014 at 06:15:57PM -0200, Marcelo Tosatti wrote:
> On Tue, Feb 11, 2014 at 05:10:35PM +, Mel Gorman wrote:
> > On Tue, Feb 11, 2014 at 01:26:29PM -0200, Marcelo Tosatti wrote:
> > > > Or take a stab at allocating 1G pages at runtime. It would require
> > > > finding properly aligned 1Gs worth of contiguous MAX_ORDER_NR_PAGES at
> > > > runtime. I would expect it would only work very early in the lifetime of
> > > > the system but if the user is willing to use kernel parameters to
> > > > allocate them then it should not be an issue.
> > > 
> > > Can be an improvement on top of the current patchset? Certain use-cases
> > > require allocation guarantees (even if that requires kernel parameters).
> > > 
> > 
> > Sure, they're not mutually exclusive. It would just avoid the need to
> > create a new kernel parameter and use the existing interfaces.
> 
> Yes, the problem is there is no guarantee is there?
> 

There is no guarantee anyway and early in the lifetime of the system there
is going to be very little difference in success rates. In case there is a
misunderstanding here, I'm not looking to NAK a series that adds another
kernel parameter. If it was me, I would have tried runtime allocation
first to avoid adding a new interface but it's a personal preference.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread David Rientjes
On Wed, 12 Feb 2014, Andi Kleen wrote:

> > The real syntax is hugepagesnid=nid,nr-pages,size. Which looks straightforward
> > to me. I honestly can't think of anything better than that, but I'm open for
> > suggestions.
> 
> hugepages_node=nid:nr-pages:size,... ? 
> 

I think that if we actually want this support that it should behave like 
hugepages= and hugepagesz=, i.e. you specify a hugepagesnode= and, if 
present, all remaining hugepages= and hugepagesz= parameters act only on 
that node unless overridden by another hugepagesnode=.


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread David Rientjes
On Tue, 11 Feb 2014, Luiz Capitulino wrote:

> > > HugeTLB command-line option hugepages= allows the user to specify how many
> > > huge pages should be allocated at boot. On NUMA systems, this argument
> > > automatically distributes huge pages allocation among nodes, which can
> > > be undesirable.
> > > 
> > 
> > And when hugepages can no longer be allocated on a node because it is too 
> > small, the remaining hugepages are distributed over nodes with memory 
> > available, correct?
> 
> No. hugepagesnid= tries to obey what was specified by the user as much as
> possible.

I'm referring to what I quoted above, the hugepages= parameter.  I'm 
saying that using existing functionality you can reserve an excess of 
hugepages and then free unneeded hugepages at runtime to get the desired 
amount allocated only on a specific node.

> > Strange, it would seem better to just reserve as many hugepages as you 
> > want so that you get the desired number on each node and then free the 
> > ones you don't need at runtime.
> 
> You mean, for example, if I have a 2 node system and want 2 1G huge pages
> from node 1, then I have to allocate 4 1G huge pages and then free 2 pages
> on node 0 after boot? That seems very cumbersome to me. Besides, what if
> node0 needs this memory during boot?
> 

All of this functionality, including the current hugepages= reservation at 
boot, needs to show that it can't be done as late as when you could run an 
initscript to do the reservation at runtime and fragmentation is at its 
lowest level when userspace first becomes available.

I don't see any justification given in the patchset that suggests you 
can't simply do this in an initscript if it is possible to allocate 1GB 
pages at runtime.  If it's too late because of oom, then your userspace is 
going to oom anyway if you reserve the hugepages at boot; if it's too late 
because of fragmentation, let's work on that issue (and justification why 
things like movablecore= don't work for you).


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Andi Kleen
> The real syntax is hugepagesnid=nid,nr-pages,size. Which looks straightforward
> to me. I honestly can't think of anything better than that, but I'm open for
> suggestions.

hugepages_node=nid:nr-pages:size,... ? 

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Luiz Capitulino
On Tue, 11 Feb 2014 22:17:32 +0100
Andi Kleen  wrote:

> On Mon, Feb 10, 2014 at 12:27:44PM -0500, Luiz Capitulino wrote:
> > HugeTLB command-line option hugepages= allows the user to specify how many
> > huge pages should be allocated at boot. On NUMA systems, this argument
> > automatically distributes huge pages allocation among nodes, which can
> > be undesirable.
> > 
> > The hugepagesnid= option introduced by this commit allows the user
> > to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> > pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> > from node 0 only. More details on patch 3/4 and patch 4/4.
> 
> The syntax seems very confusing. Can you make that more obvious?

I guess my poor description in this email may have contributed to making
it look confusing.

The real syntax is hugepagesnid=nid,nr-pages,size, which looks straightforward
to me. I honestly can't think of anything better than that, but I'm open to
suggestions.
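
As a rough sketch of the nid,nr-pages,size form, the following Python splits
one hugepagesnid= value into its three fields. The NodeRequest type, the
parse helpers and the accepted K/M/G suffixes are assumptions made for
illustration; the parsing in the actual patches may differ.

```python
from typing import NamedTuple

# Assumed size suffixes; the real option may accept a different set.
_SUFFIX = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


class NodeRequest(NamedTuple):
    nid: int         # NUMA node id
    nr_pages: int    # number of huge pages requested on that node
    size_bytes: int  # huge page size in bytes


def parse_size(text: str) -> int:
    """Convert '1G', '2M' or a plain byte count to bytes."""
    text = text.strip().upper()
    if text and text[-1] in _SUFFIX:
        return int(text[:-1]) * _SUFFIX[text[-1]]
    return int(text)


def parse_hugepagesnid(value: str) -> NodeRequest:
    """Parse the value of one hugepagesnid=nid,nr-pages,size option."""
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected nid,nr-pages,size, got {value!r}")
    request = NodeRequest(int(parts[0]), int(parts[1]), parse_size(parts[2]))
    if request.nid < 0 or request.nr_pages < 0 or request.size_bytes <= 0:
        raise ValueError(f"out-of-range field in {value!r}")
    return request


if __name__ == "__main__":
    # Two 1G huge pages from node 1.
    print(parse_hugepagesnid("1,2,1G"))
```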


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Andi Kleen
On Mon, Feb 10, 2014 at 12:27:44PM -0500, Luiz Capitulino wrote:
> HugeTLB command-line option hugepages= allows the user to specify how many
> huge pages should be allocated at boot. On NUMA systems, this argument
> automatically distributes huge pages allocation among nodes, which can
> be undesirable.
> 
> The hugepagesnid= option introduced by this commit allows the user
> to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> from node 0 only. More details on patch 3/4 and patch 4/4.

The syntax seems very confusing. Can you make that more obvious?

-Andi


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Marcelo Tosatti
On Tue, Feb 11, 2014 at 05:10:35PM +, Mel Gorman wrote:
> On Tue, Feb 11, 2014 at 01:26:29PM -0200, Marcelo Tosatti wrote:
> > > Or take a stab at allocating 1G pages at runtime. It would require
> > > finding properly aligned 1Gs worth of contiguous MAX_ORDER_NR_PAGES at
> > > runtime. I would expect it would only work very early in the lifetime of
> > > the system but if the user is willing to use kernel parameters to
> > > allocate them then it should not be an issue.
> > 
> > Can be an improvement on top of the current patchset? Certain use-cases
> > require allocation guarantees (even if that requires kernel parameters).
> > 
> 
> Sure, they're not mutually exclusive. It would just avoid the need to
> create a new kernel parameter and use the existing interfaces.

Yes, the problem is that there is no guarantee, is there?




Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Mel Gorman
On Tue, Feb 11, 2014 at 01:26:29PM -0200, Marcelo Tosatti wrote:
> > Or take a stab at allocating 1G pages at runtime. It would require
> > finding properly aligned 1Gs worth of contiguous MAX_ORDER_NR_PAGES at
> > runtime. I would expect it would only work very early in the lifetime of
> > the system but if the user is willing to use kernel parameters to
> > allocate them then it should not be an issue.
> 
> Can be an improvement on top of the current patchset? Certain use-cases
> require allocation guarantees (even if that requires kernel parameters).
> 

Sure, they're not mutually exclusive. It would just avoid the need to
create a new kernel parameter and use the existing interfaces.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Luiz Capitulino
On Mon, 10 Feb 2014 15:13:54 -0800
Andrew Morton  wrote:

> On Mon, 10 Feb 2014 12:27:44 -0500 Luiz Capitulino wrote:
> 
> > HugeTLB command-line option hugepages= allows the user to specify how many
> > huge pages should be allocated at boot. On NUMA systems, this argument
> > automatically distributes huge pages allocation among nodes, which can
> > be undesirable.
> 
> Grumble.  "can be undesirable" is the entire reason for the entire
> patchset.  We need far, far more detail than can be conveyed in three
> words, please!

Right, sorry for that. I'll improve this for v2, but a better introduction
for the series would be something like the following.

Today, HugeTLB provides support for controlling allocation of persistent
huge pages on a NUMA system through sysfs. So, for example, if a sysadmin
wants to allocate 300 2M huge pages on node 1, s/he can do:

 echo 300 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages

This works as long as you have enough contiguous pages, which may work
for 2M pages, but is harder for 1G huge pages. For those, it's better or even
required to reserve them at boot.
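
The same run-time request can be scripted. The sketch below is a hypothetical
Python helper, not part of the series: it builds the per-node sysfs path used
in the echo command above, writes the requested count, and reads the file
back, since the kernel may end up with fewer pages than requested when
contiguous memory is scarce. The helper names and the root parameter (there
only to make testing easier) are assumptions.

```python
from pathlib import Path


def node_hugepages_path(nid: int, page_size_kb: int, root: str = "/sys") -> Path:
    """Return the per-node nr_hugepages file for the given huge page size."""
    return (
        Path(root)
        / "devices/system/node"
        / f"node{nid}"
        / "hugepages"
        / f"hugepages-{page_size_kb}kB"
        / "nr_hugepages"
    )


def set_node_hugepages(nid: int, page_size_kb: int, count: int, root: str = "/sys") -> int:
    """Request `count` persistent huge pages on node `nid`; return what was obtained."""
    path = node_hugepages_path(nid, page_size_kb, root)
    path.write_text(f"{count}\n")
    # Read back: the value reported is the number actually allocated.
    return int(path.read_text())


if __name__ == "__main__":
    # Prints the same path as in the echo example (300 2M pages on node 1).
    print(node_hugepages_path(1, 2048))
```

Calling set_node_hugepages(1, 2048, 300) as root would be the equivalent of
the echo command; comparing the return value with 300 shows whether the
request was fully satisfied.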

To this end we have the hugepages= command-line option, which works but lacks
per-node control: it distributes huge pages evenly among nodes. However, we
have users who want more flexibility. They want to be able to specify
something like: allocate 2 1G huge pages from node0 and 4 1G huge pages from
node1. This is what this series implements.
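
To make the contrast concrete, here is a toy Python model of an even
distribution: pages are handed out one at a time, round-robin, and a node
that has no room left is skipped. This is an assumption-laden model for
discussion, not the kernel's allocator; the function name and the per-node
capacity list are invented for the sketch.

```python
def distribute_round_robin(total_pages, node_capacity):
    """Model hugepages=N: spread pages over nodes, skipping full nodes.

    node_capacity[i] is how many huge pages node i could hold (assumed known).
    Returns the number of pages each node ends up with.
    """
    allocated = [0] * len(node_capacity)
    remaining = total_pages
    node = 0
    # Stop when the request is satisfied or every node is full.
    while remaining > 0 and any(a < c for a, c in zip(allocated, node_capacity)):
        if allocated[node] < node_capacity[node]:
            allocated[node] += 1
            remaining -= 1
        node = (node + 1) % len(node_capacity)
    return allocated


if __name__ == "__main__":
    # hugepages=6 on two nodes with plenty of room: 3 pages on each node.
    print(distribute_round_robin(6, [8, 8]))
```

In this model there is no way to obtain 2 pages on node0 and 4 on node1 from
a single total, which is the gap the per-node option is meant to fill.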

It's basically per-node allocation control for 1G huge pages, but it's
important to note that this series is not intrusive. All it does is set
the initial per-node allocation. All the functions and data structures added
by this series are used only once at boot; after that they are discarded and
rest in oblivion.


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Luiz Capitulino
On Mon, 10 Feb 2014 18:54:20 -0800 (PST)
David Rientjes  wrote:

> On Mon, 10 Feb 2014, Luiz Capitulino wrote:
> 
> > HugeTLB command-line option hugepages= allows the user to specify how many
> > huge pages should be allocated at boot. On NUMA systems, this argument
> > automatically distributes huge pages allocation among nodes, which can
> > be undesirable.
> > 
> 
> And when hugepages can no longer be allocated on a node because it is too 
> small, the remaining hugepages are distributed over nodes with memory 
> available, correct?

No. hugepagesnid= tries to obey what was specified by the user as much as
possible. So, if you specify that 10 1G huge pages should be allocated from
node0 but only 7 1G pages can actually be allocated, then hugepagesnid= will
allocate just those 7 pages.

> > The hugepagesnid= option introduced by this commit allows the user
> > to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> > pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> > from node 0 only. More details on patch 3/4 and patch 4/4.
> > 
> 
> Strange, it would seem better to just reserve as many hugepages as you 
> want so that you get the desired number on each node and then free the 
> ones you don't need at runtime.

You mean, for example, if I have a 2 node system and want 2 1G huge pages
from node 1, then I have to allocate 4 1G huge pages and then free 2 pages
on node 0 after boot? That seems very cumbersome to me. Besides, what if
node0 needs this memory during boot?
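
The arithmetic behind that example can be written down as a small Python
sketch. It assumes hugepages= spreads pages evenly and that every node can
hold its share; the function name is invented for illustration.

```python
def overreserve_plan(wanted):
    """Plan the 'reserve extra, free later' workaround.

    wanted[i] is the number of huge pages actually desired on node i.
    Under an even split every node must receive max(wanted) pages, so the
    boot-time total is max(wanted) * number_of_nodes, and the surplus on
    each node has to be freed after boot.
    Returns (total_to_reserve, pages_to_free_per_node).
    """
    per_node = max(wanted, default=0)
    total = per_node * len(wanted)
    to_free = [per_node - w for w in wanted]
    return total, to_free


if __name__ == "__main__":
    # Two nodes, 2 pages wanted on node 1 only:
    # reserve 4 pages at boot, then free 2 on node 0.
    print(overreserve_plan([0, 2]))
```

With 1G pages, the surplus is a full gigabyte per page that the other nodes
cannot use until it is freed.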

> That probably doesn't work because we can't free very large hugepages that 
> are reserved at boot, would fixing that issue reduce the need for this 
> patchset?

I don't think so.


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Marcelo Tosatti
On Tue, Feb 11, 2014 at 09:25:14AM +, Mel Gorman wrote:
> On Mon, Feb 10, 2014 at 06:54:20PM -0800, David Rientjes wrote:
> > On Mon, 10 Feb 2014, Luiz Capitulino wrote:
> > 
> > > HugeTLB command-line option hugepages= allows the user to specify how many
> > > huge pages should be allocated at boot. On NUMA systems, this argument
> > > automatically distributes huge pages allocation among nodes, which can
> > > be undesirable.
> > > 
> > 
> > And when hugepages can no longer be allocated on a node because it is too 
> > small, the remaining hugepages are distributed over nodes with memory 
> > available, correct?
> > 
> > > The hugepagesnid= option introduced by this commit allows the user
> > > to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> > > pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> > > from node 0 only. More details on patch 3/4 and patch 4/4.
> > > 
> > 
> > Strange, it would seem better to just reserve as many hugepages as you 
> > want so that you get the desired number on each node and then free the 
> > ones you don't need at runtime.

You have to know the behaviour of the allocator, and rely on that 
to allocate the exact number of 1G hugepages on a particular node.

Is that desirable, in contrast with specifying the exact number, and
location, of hugepages to allocate?

> Or take a stab at allocating 1G pages at runtime. It would require
> finding properly aligned 1Gs worth of contiguous MAX_ORDER_NR_PAGES at
> runtime. I would expect it would only work very early in the lifetime of
> the system but if the user is willing to use kernel parameters to
> allocate them then it should not be an issue.

Can that be an improvement on top of the current patchset? Certain use-cases
require allocation guarantees (even if that requires kernel parameters).



Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Mel Gorman
On Mon, Feb 10, 2014 at 06:54:20PM -0800, David Rientjes wrote:
> On Mon, 10 Feb 2014, Luiz Capitulino wrote:
> 
> > HugeTLB command-line option hugepages= allows the user to specify how many
> > huge pages should be allocated at boot. On NUMA systems, this argument
> > automatically distributes huge pages allocation among nodes, which can
> > be undesirable.
> > 
> 
> And when hugepages can no longer be allocated on a node because it is too 
> small, the remaining hugepages are distributed over nodes with memory 
> available, correct?
> 
> > The hugepagesnid= option introduced by this commit allows the user
> > to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> > pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> > from node 0 only. More details on patch 3/4 and patch 4/4.
> > 
> 
> Strange, it would seem better to just reserve as many hugepages as you 
> want so that you get the desired number on each node and then free the 
> ones you don't need at runtime.
> 

Or take a stab at allocating 1G pages at runtime. It would require
finding properly aligned 1Gs worth of contiguous MAX_ORDER_NR_PAGES at
runtime. I would expect it would only work very early in the lifetime of
the system but if the user is willing to use kernel parameters to
allocate them then it should not be an issue.
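
As a rough user-space illustration of that search (not kernel code), the
Python sketch below scans a list of free/busy flags, one per
MAX_ORDER_NR_PAGES block, for 1G-aligned windows that are entirely free.
The constants assume 4 KiB base pages and 1024-page (4 MiB) blocks, and the
sketch assumes block 0 starts on a 1G boundary; all names are hypothetical.

```python
PAGE_SIZE = 4096           # assumption: 4 KiB base pages
MAX_ORDER_NR_PAGES = 1024  # assumption: largest buddy block is 1024 pages (4 MiB)
BLOCKS_PER_1G = (1 << 30) // (PAGE_SIZE * MAX_ORDER_NR_PAGES)


def find_aligned_free_runs(free_blocks, run_len=BLOCKS_PER_1G):
    """Return start indexes of aligned windows in which every block is free.

    free_blocks[i] is True when block i is completely free. Only windows
    starting at a multiple of run_len are considered, which models the
    alignment requirement of a 1G page.
    """
    starts = []
    for start in range(0, len(free_blocks) - run_len + 1, run_len):
        if all(free_blocks[start:start + run_len]):
            starts.append(start)
    return starts


if __name__ == "__main__":
    # Small demo with 4-block windows: a single busy block spoils window 4..7.
    blocks = [True] * 4 + [False] + [True] * 7
    print(find_aligned_free_runs(blocks, run_len=4))
```

Under these assumptions a 1G page needs 256 consecutive free blocks, so a
single long-lived allocation anywhere in a window rules that window out,
which is why the search is expected to succeed mainly early in the lifetime
of the system.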

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Luiz Capitulino
On Mon, 10 Feb 2014 18:54:20 -0800 (PST)
David Rientjes rient...@google.com wrote:

 On Mon, 10 Feb 2014, Luiz Capitulino wrote:
 
  HugeTLB command-line option hugepages= allows the user to specify how many
  huge pages should be allocated at boot. On NUMA systems, this argument
  automatically distributes huge pages allocation among nodes, which can
  be undesirable.
  
 
 And when hugepages can no longer be allocated on a node because it is too 
 small, the remaining hugepages are distributed over nodes with memory 
 available, correct?

No. hugepagesnid= tries to obey what was specified by the uses as much as
possible. So, if you specify that 10 1G huge pages should be allocated from
node0 but only 7 1G pages can actually be allocated, then hugepagesnid= will
do just that.

> > The hugepagesnid= option introduced by this commit allows the user
> > to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> > pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> > from node 0 only. More details on patch 3/4 and patch 4/4.
> >
>
> Strange, it would seem better to just reserve as many hugepages as you
> want so that you get the desired number on each node and then free the
> ones you don't need at runtime.

You mean, for example, if I have a 2 node system and want 2 1G huge pages
from node 1, then I have to allocate 4 1G huge pages and then free 2 pages
on node 0 after boot? That seems very cumbersome to me. Besides, what if
node0 needs this memory during boot?
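
The arithmetic behind that workaround, as a rough sketch. It assumes
hugepages= spreads pages strictly evenly across nodes, starting at node 0,
and that every node has room; both are simplifications, and the helper
names are hypothetical.

```c
#include <stddef.h>

/* Model an even, round-robin spread of `total` pages over `nr_nodes`. */
static void even_split(unsigned long total, size_t nr_nodes,
                       unsigned long *per_node)
{
    size_t i;
    unsigned long p;

    for (i = 0; i < nr_nodes; i++)
        per_node[i] = 0;
    if (nr_nodes == 0)
        return;
    for (p = 0; p < total; p++)
        per_node[p % nr_nodes]++;
}

/* Pages to request at boot so that one chosen node ends up with `want`. */
static unsigned long boot_total_for(unsigned long want, size_t nr_nodes)
{
    return want * nr_nodes;
}

/* Pages that then have to be freed on the other nodes after boot. */
static unsigned long pages_to_free(unsigned long want, size_t nr_nodes)
{
    return nr_nodes ? want * (nr_nodes - 1) : 0;
}
```

For the 2-node case above, wanting 2 pages on node 1 means asking for 4 at
boot and freeing 2 afterwards; the overhead grows with the node count.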

> That probably doesn't work because we can't free very large hugepages that
> are reserved at boot, would fixing that issue reduce the need for this
> patchset?

I don't think so.


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Luiz Capitulino
On Mon, 10 Feb 2014 15:13:54 -0800
Andrew Morton a...@linux-foundation.org wrote:

> On Mon, 10 Feb 2014 12:27:44 -0500 Luiz Capitulino lcapitul...@redhat.com
> wrote:
>
> > HugeTLB command-line option hugepages= allows the user to specify how many
> > huge pages should be allocated at boot. On NUMA systems, this argument
> > automatically distributes huge pages allocation among nodes, which can
> > be undesirable.
>
> Grumble.  "can be undesirable" is the entire reason for the entire
> patchset.  We need far, far more detail than can be conveyed in three
> words, please!

Right, sorry for that. I'll improve this for v2, but a better introduction
for the series would be something like the following.

Today, HugeTLB provides support for controlling allocation of persistent
huge pages on a NUMA system through sysfs. So, for example, if a sysadmin
wants to allocate 300 2M huge pages on node 1, s/he can do:

 echo 300 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages

This works as long as you have enough contiguous pages, which may work
for 2M pages, but is harder for 1G huge pages. For those, it's better or even
required to reserve them at boot.
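
The same runtime request from a C program, as a sketch. The helper names
are made up for illustration; the path layout is the one shown above, with
the page size given in kB (2048 for 2M pages). The kernel may end up
allocating fewer pages than requested, so a real tool would probably read
the file back afterwards.

```c
#include <stdio.h>

/* Build the per-node nr_hugepages path; returns 0 on success, -1 if
 * the buffer is too small. */
static int hugepages_sysfs_path(char *buf, size_t len, int nid,
                                unsigned long size_kb)
{
    int n = snprintf(buf, len,
                     "/sys/devices/system/node/node%d/hugepages/"
                     "hugepages-%lukB/nr_hugepages", nid, size_kb);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Ask for `count` persistent huge pages of `size_kb` on node `nid`.
 * Needs root; returns 0 if the write went through, -1 otherwise. */
static int set_node_hugepages(int nid, unsigned long size_kb,
                              unsigned long count)
{
    char path[128];
    FILE *f;
    int ok;

    if (hugepages_sysfs_path(path, sizeof(path), nid, size_kb) != 0)
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fprintf(f, "%lu\n", count) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

set_node_hugepages(1, 2048, 300) corresponds to the echo command above.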

To this end we have the hugepages= command-line option, which works but lacks
per-node control. This option evenly distributes huge pages among nodes.
However, we have users who want more flexibility. They want to be able to
specify something like: allocate 2 1G huge pages from node0 and 4 1G huge pages
from node1. This is what this series implements.

It's basically per node allocation control for 1G huge pages, but it's
important to note that this series is not intrusive. All it does is to set
the initial per-node allocation. All the functions and data structures added
by this series are only used once at boot, after that they are discarded and
rest in oblivion.


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Mel Gorman
On Tue, Feb 11, 2014 at 01:26:29PM -0200, Marcelo Tosatti wrote:
> > Or take a stab at allocating 1G pages at runtime. It would require
> > finding properly aligned 1Gs worth of contiguous MAX_ORDER_NR_PAGES at
> > runtime. I would expect it would only work very early in the lifetime of
> > the system but if the user is willing to use kernel parameters to
> > allocate them then it should not be an issue.
>
> Can be an improvement on top of the current patchset? Certain use-cases
> require allocation guarantees (even if that requires kernel parameters).
>

Sure, they're not mutually exclusive. It would just avoid the need to
create a new kernel parameter and use the existing interfaces.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Marcelo Tosatti
On Tue, Feb 11, 2014 at 05:10:35PM +, Mel Gorman wrote:
> On Tue, Feb 11, 2014 at 01:26:29PM -0200, Marcelo Tosatti wrote:
> > > Or take a stab at allocating 1G pages at runtime. It would require
> > > finding properly aligned 1Gs worth of contiguous MAX_ORDER_NR_PAGES at
> > > runtime. I would expect it would only work very early in the lifetime of
> > > the system but if the user is willing to use kernel parameters to
> > > allocate them then it should not be an issue.
> >
> > Can be an improvement on top of the current patchset? Certain use-cases
> > require allocation guarantees (even if that requires kernel parameters).
> >
>
> Sure, they're not mutually exclusive. It would just avoid the need to
> create a new kernel parameter and use the existing interfaces.

Yes, the problem is that there is no guarantee, is there?




Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Andi Kleen
On Mon, Feb 10, 2014 at 12:27:44PM -0500, Luiz Capitulino wrote:
> HugeTLB command-line option hugepages= allows the user to specify how many
> huge pages should be allocated at boot. On NUMA systems, this argument
> automatically distributes huge pages allocation among nodes, which can
> be undesirable.
>
> The hugepagesnid= option introduced by this commit allows the user
> to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> from node 0 only. More details on patch 3/4 and patch 4/4.

The syntax seems very confusing. Can you make that more obvious?

-Andi


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Luiz Capitulino
On Tue, 11 Feb 2014 22:17:32 +0100
Andi Kleen a...@firstfloor.org wrote:

> On Mon, Feb 10, 2014 at 12:27:44PM -0500, Luiz Capitulino wrote:
> > HugeTLB command-line option hugepages= allows the user to specify how many
> > huge pages should be allocated at boot. On NUMA systems, this argument
> > automatically distributes huge pages allocation among nodes, which can
> > be undesirable.
> >
> > The hugepagesnid= option introduced by this commit allows the user
> > to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> > pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> > from node 0 only. More details on patch 3/4 and patch 4/4.
>
> The syntax seems very confusing. Can you make that more obvious?

I guess that my bad description in this email may have contributed to make
it look confusing.

The real syntax is hugepagesnid=nid,nr-pages,size. Which looks straightforward
to me. I honestly can't think of anything better than that, but I'm open for
suggestions.
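
To make the three fields concrete, here is a userspace sketch of parsing
the value part of hugepagesnid=nid,nr-pages,size. It is not the parser
from the patches: the struct, the function names and the accepted K/M/G
suffixes are assumptions made for illustration, and overflow of the size
shift is not checked.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical result type for one hugepagesnid= value. */
struct nid_request {
    unsigned long nid;        /* NUMA node id */
    unsigned long nr_pages;   /* number of huge pages */
    unsigned long long size;  /* huge page size in bytes */
};

/* Parse an unsigned decimal number; reject signs and leading blanks. */
static int parse_ulong(const char *s, const char **end, unsigned long *out)
{
    char *e;

    if (!isdigit((unsigned char)*s))
        return -1;
    errno = 0;
    *out = strtoul(s, &e, 10);
    if (errno)
        return -1;
    *end = e;
    return 0;
}

/* Parse a size with an optional K, M or G suffix (powers of 1024). */
static int parse_size(const char *s, const char **end,
                      unsigned long long *out)
{
    char *e;
    unsigned long long v;

    if (!isdigit((unsigned char)*s))
        return -1;
    errno = 0;
    v = strtoull(s, &e, 10);
    if (errno)
        return -1;
    switch (*e) {
    case 'K': case 'k': v <<= 10; e++; break;
    case 'M': case 'm': v <<= 20; e++; break;
    case 'G': case 'g': v <<= 30; e++; break;
    default: break;
    }
    *end = e;
    *out = v;
    return 0;
}

/* Parse "nid,nr-pages,size", e.g. "0,2,1G". Returns 0 on success. */
static int parse_hugepagesnid(const char *arg, struct nid_request *req)
{
    const char *p = arg;

    if (parse_ulong(p, &p, &req->nid) || *p++ != ',')
        return -1;
    if (parse_ulong(p, &p, &req->nr_pages) || *p++ != ',')
        return -1;
    if (parse_size(p, &p, &req->size) || *p != '\0')
        return -1;
    return 0;
}
```

With this reading, "0,2,1G" means node 0, two pages, 1G each.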


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread Andi Kleen
> The real syntax is hugepagesnid=nid,nr-pages,size. Which looks straightforward
> to me. I honestly can't think of anything better than that, but I'm open for
> suggestions.

hugepages_node=nid:nr-pages:size,... ? 

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread David Rientjes
On Tue, 11 Feb 2014, Luiz Capitulino wrote:

> > > HugeTLB command-line option hugepages= allows the user to specify how many
> > > huge pages should be allocated at boot. On NUMA systems, this argument
> > > automatically distributes huge pages allocation among nodes, which can
> > > be undesirable.
> > >
> >
> > And when hugepages can no longer be allocated on a node because it is too
> > small, the remaining hugepages are distributed over nodes with memory
> > available, correct?
>
> No. hugepagesnid= tries to obey what was specified by the user as much as
> possible.

I'm referring to what I quoted above, the hugepages= parameter.  I'm 
saying that using existing functionality you can reserve an excess of 
hugepages and then free unneeded hugepages at runtime to get the desired 
amount allocated only on a specific node.

> > Strange, it would seem better to just reserve as many hugepages as you
> > want so that you get the desired number on each node and then free the
> > ones you don't need at runtime.
>
> You mean, for example, if I have a 2 node system and want 2 1G huge pages
> from node 1, then I have to allocate 4 1G huge pages and then free 2 pages
> on node 0 after boot? That seems very cumbersome to me. Besides, what if
> node0 needs this memory during boot?
>

All of this functionality, including the current hugepages= reservation at 
boot, needs to show that it can't be done as late as when you could run an 
initscript to do the reservation at runtime and fragmentation is at its 
lowest level when userspace first becomes available.

I don't see any justification given in the patchset that suggests you 
can't simply do this in an initscript if it is possible to allocate 1GB 
pages at runtime.  If it's too late because of oom, then your userspace is 
going to oom anyway if you reserve the hugepages at boot; if it's too late 
because of fragmentation, let's work on that issue (and justification why 
things like movablecore= don't work for you).


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-11 Thread David Rientjes
On Wed, 12 Feb 2014, Andi Kleen wrote:

> > The real syntax is hugepagesnid=nid,nr-pages,size. Which looks straightforward
> > to me. I honestly can't think of anything better than that, but I'm open for
> > suggestions.
>
> hugepages_node=nid:nr-pages:size,... ?
>

I think that if we actually want this support that it should behave like 
hugepages= and hugepagesz=, i.e. you specify a hugepagesnode= and, if 
present, all remaining hugepages= and hugepagesz= parameters act only on 
that node unless overridden by another hugepagesnode=.
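
One way to read that proposal, sketched in userspace C. Nothing here
exists in the kernel: hugepagesnode= is only the name suggested above, the
struct and function names are hypothetical, and the sketch assumes that a
hugepagesz= setting stays in effect across a later hugepagesnode=.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ANY_NODE (-1)   /* no hugepagesnode= seen yet */

/* Hypothetical record of one hugepages= request. */
struct hp_request {
    int nid;              /* node the request applies to, or ANY_NODE */
    char size[16];        /* page size string as given, or "default" */
    unsigned long count;  /* number of pages requested */
};

/* Return the value after `key` if `param` starts with it, else NULL. */
static const char *value_of(const char *param, const char *key)
{
    size_t klen = strlen(key);

    return strncmp(param, key, klen) == 0 ? param + klen : NULL;
}

/* Walk the parameters in order; hugepagesnode= and hugepagesz= update
 * the current state, and each hugepages= emits a request using it. */
static size_t collect_requests(const char *const *params, size_t nr_params,
                               struct hp_request *out, size_t max_out)
{
    int cur_nid = ANY_NODE;
    char cur_size[16] = "default";
    size_t i, n = 0;

    for (i = 0; i < nr_params; i++) {
        const char *v;

        if ((v = value_of(params[i], "hugepagesnode=")) != NULL) {
            cur_nid = atoi(v);
        } else if ((v = value_of(params[i], "hugepagesz=")) != NULL) {
            snprintf(cur_size, sizeof(cur_size), "%s", v);
        } else if ((v = value_of(params[i], "hugepages=")) != NULL &&
                   n < max_out) {
            out[n].nid = cur_nid;
            snprintf(out[n].size, sizeof(out[n].size), "%s", cur_size);
            out[n].count = strtoul(v, NULL, 10);
            n++;
        }
    }
    return n;
}
```

Under these assumptions, "hugepagesnode=0 hugepagesz=1G hugepages=2
hugepagesnode=1 hugepages=4" yields two 1G pages on node 0 and four on
node 1.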


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-10 Thread Davidlohr Bueso
On Mon, 2014-02-10 at 15:13 -0800, Andrew Morton wrote:
> On Mon, 10 Feb 2014 12:27:44 -0500 Luiz Capitulino  
> wrote:
> 
> > HugeTLB command-line option hugepages= allows the user to specify how many
> > huge pages should be allocated at boot. On NUMA systems, this argument
> > automatically distributes huge pages allocation among nodes, which can
> > be undesirable.
> 
> Grumble.  "can be undesirable" is the entire reason for the entire
> patchset.  We need far, far more detail than can be conveyed in three
> words, please!

One (not so real-world) scenario that comes right to mind which can
benefit from such a feature is the ability to study socket/node scaling
for hugepage aware applications. Yes, we do have numactl to bind
programs to resources, but I don't mind having a way of finer graining
hugetlb allocations, especially if it doesn't hurt anything.

Thanks,
Davidlohr



Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-10 Thread David Rientjes
On Mon, 10 Feb 2014, Luiz Capitulino wrote:

> HugeTLB command-line option hugepages= allows the user to specify how many
> huge pages should be allocated at boot. On NUMA systems, this argument
> automatically distributes huge pages allocation among nodes, which can
> be undesirable.
> 

And when hugepages can no longer be allocated on a node because it is too 
small, the remaining hugepages are distributed over nodes with memory 
available, correct?

> The hugepagesnid= option introduced by this commit allows the user
> to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> from node 0 only. More details on patch 3/4 and patch 4/4.
> 

Strange, it would seem better to just reserve as many hugepages as you 
want so that you get the desired number on each node and then free the 
ones you don't need at runtime.

That probably doesn't work because we can't free very large hugepages that 
are reserved at boot, would fixing that issue reduce the need for this 
patchset?


Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-10 Thread Andrew Morton
On Mon, 10 Feb 2014 12:27:44 -0500 Luiz Capitulino  
wrote:

> HugeTLB command-line option hugepages= allows the user to specify how many
> huge pages should be allocated at boot. On NUMA systems, this argument
> automatically distributes huge pages allocation among nodes, which can
> be undesirable.

Grumble.  "can be undesirable" is the entire reason for the entire
patchset.  We need far, far more detail than can be conveyed in three
words, please!

> The hugepagesnid= option introduced by this commit allows the user
> to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> from node 0 only. More details on patch 3/4 and patch 4/4.
> 



Re: [PATCH 0/4] hugetlb: add hugepagesnid= command-line option

2014-02-10 Thread Luiz Capitulino
On Mon, 10 Feb 2014 12:27:44 -0500
Luiz Capitulino  wrote:

> The hugepagesnid= option introduced by this commit allows the user
> to specify which NUMA nodes should be used to allocate boot-time HugeTLB
> pages. For example, hugepagesnid=0,2,2G will allocate two 2G huge pages
> from node 0 only. More details on patch 3/4 and patch 4/4.

s/2G/1G

I repeatedly made this mistake even when testing... For some reason my
brain insists on typing "2,2G" instead of "2,1G".

