multi-queue scheduler update

2001-01-18 Thread Mike Kravetz
1.661 1024 FRC196.425 6.166 2048 FRC FRC 23.291 4096 FRC FRC 47.117 *FRC = failed to reach confidence level -- Mike Kravetz [EMAIL PROTECTED] IBM Linux

Re: [Lse-tech] Re: multi-queue scheduler update

2001-01-18 Thread Mike Kravetz
On Fri, Jan 19, 2001 at 01:26:16AM +0100, Andrea Arcangeli wrote: On Thu, Jan 18, 2001 at 03:53:11PM -0800, Mike Kravetz wrote: Here are some very preliminary numbers from sched_test_yield (which was previously posted to this (lse-tech) list by Bill Hartner). Tests were run on a system

Re: multi-queue scheduler update

2001-01-18 Thread Mike Kravetz
queue lock at 57%. Now, I know nothing about this benchmark, but it will be interesting to see what happens after applying my patch. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center - To unsubscribe from this list: send the line "unsubscri

Re: [Lse-tech] Re: multi-queue scheduler update

2001-01-18 Thread Mike Kravetz
On Fri, Jan 19, 2001 at 02:30:41AM +0100, Andrea Arcangeli wrote: On Thu, Jan 18, 2001 at 04:52:25PM -0800, Mike Kravetz wrote: was less than the number of processors. I'll give the tests a try with a smaller number of threads. I'm also open to suggestions for OK! what benchmarks

Re: [Lse-tech] Re: multi-queue scheduler update

2001-01-19 Thread Mike Kravetz
of running tasks is less than the number of processors. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: [Lse-tech] Re: multi-queue scheduler update

2001-01-19 Thread Mike Kravetz
multi-queue scheduler, tasks on a remote queue must have high enough priority (to overcome this boost) before being moved to the local queue. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center 15450 SW Koll Parkway Beaverton, OR 97006-6063

Re: [Lse-tech] Re: multi-queue scheduler update

2001-01-19 Thread Mike Kravetz
On Thu, Jan 18, 2001 at 05:34:35PM -0800, Mike Kravetz wrote: On Fri, Jan 19, 2001 at 02:30:41AM +0100, Andrea Arcangeli wrote: On Thu, Jan 18, 2001 at 04:52:25PM -0800, Mike Kravetz wrote: was less than the number of processors. I'll give the tests a try with a smaller number

Re: [Lse-tech] Re: multi-queue scheduler update

2001-01-19 Thread Mike Kravetz
On Fri, Jan 19, 2001 at 12:49:21PM -0800, Mike Kravetz showed his lack of internet slang understanding and wrote: It was my intention to post IIRC numbers for small thread counts today. However, the benchmark (not the system) seems to hang on occasion. This occurs on both the unmodified

Re: [Lse-tech] Re: multi-queue scheduler update

2001-01-19 Thread Mike Kravetz
. That is why you need to put some type of synchronization in place. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: multi-queue scheduler update

2001-01-19 Thread Mike Kravetz
in the not too distant future. Until then, we'll be looking into optimizations to help out the multi-queue scheduler at low thread counts. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

more on scheduler benchmarks

2001-01-22 Thread Mike Kravetz
statement? If the above is accurate, then I am wondering what would be a good scheduler benchmark for these low task count situations. I could undo the optimizations in sys_sched_yield() (for testing purposes only!), and run the existing benchmarks. Can anyone suggest a better solution? Thanks, -- Mi

Re: [Lse-tech] multi-queue scheduler update

2001-01-23 Thread Mike Kravetz
contention. This was done at the expense of the normal case. I'm currently working on this situation and expect to have a new patch out in the not too distant future. I expect the numbers will get better. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

reschedule_idle changes in ac kernels

2001-06-04 Thread Mike Kravetz
, but we also need to be aware of performance in the non-realtime case. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

sys_sched_yield fast path

2001-03-09 Thread Mike Kravetz
? OR Is the reasoning that in these cases there is so much 'scheduling' activity that we should force the reschedule? -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: How to optimize routing performance

2001-03-15 Thread Mike Kravetz
are some big IFs. I know little about the networking stack or this workload. Just wanted to explain how this scheduling work 'could' be related to interrupt load. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: test12pre6: BUG in schedule (sched.c, 115)

2000-12-06 Thread Mike Kravetz
Ragnar, Are you sure that was line 115? Could it have been line 515? Also, do you have any Oops data? Thanks, -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center 15450 SW Koll Parkway Beaverton, OR 97006-6063 (503)578-3494 On Wed

Scheduling Scalability Update

2000-12-15 Thread Mike Kravetz
. The Scheduling Scalability page is at: http://lse.sourceforge.net/scheduling/ If you are interested in this work, please join the lse-tech mailing list at: http://sourceforge.net/projects/lse -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

test9: running tasks not in run-queue

2000-11-08 Thread Mike Kravetz
, -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center 15450 SW Koll Parkway Beaverton, OR 97006-6063 (503)578-3494

Scheduler Scalability CFP

2000-11-16 Thread Mike Kravetz
://sourceforge.net/projects/lse Thanks, -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center 15450 SW Koll Parkway Beaverton, OR 97006-6063 (503)578-3494

Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Mike Kravetz
On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote: I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a workload with Samba, and I wanted to get some feedback on results so far. Do you have any kernel profile or lock contention data? -- Mike Kravetz

Re: reschedule_idle changes in ac kernels

2001-06-05 Thread Mike Kravetz
value as opposed to 1. My guess is that the threshold value was changed from 0 to 1 in the 2.4 kernel for better performance with some workload. Anyone remember what that workload was/is? -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: Threads FAQ entry incomplete

2001-06-20 Thread Mike Kravetz
to the number of CPUs yet scheduler performance has gone downhill. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: wake_up vs. wake_up_sync

2001-06-27 Thread Mike Kravetz
on the runqueue, isn't it possible that another task has a higher goodness value than the task being awakened? In such a case, isn't it possible that the awakened task could sit on the runqueue (waiting for a CPU) while tasks with a lower goodness value are allowed to run? -- Mike Kravetz

Re: Strange thread behaviour on 8-way x86 machine

2001-07-03 Thread Mike Kravetz
utilizing 8 CPUs on 2.4.* kernels. This may seem obvious, but do you have more than 4 CPUs worth of work for the system to do? What is the runqueue length during this benchmark? -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: linux scheduler limitations?

2001-03-29 Thread Mike Kravetz
at: http://lse.sourceforge.net/scheduling/ I would be interested in your observations. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: a quest for a better scheduler

2001-04-03 Thread Mike Kravetz
' benchmark which attempts to address these issues (called reflex at the above site). However, I would really like to get a pointer to a community acceptable workload/benchmark for these low thread cases. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: a quest for a better scheduler

2001-04-03 Thread Mike Kravetz
that we have moved away from a 'realistic' low task count system load. lmbench's lat_ctx for example, and other tools in lmbench trigger various scheduler workloads as well. Thanks, I'll add these to our list. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology

Re: a quest for a better scheduler

2001-04-03 Thread Mike Kravetz
scheduler always attempts to make the same global scheduling decisions as the current scheduler. -- Mike Kravetz [EMAIL PROTECTED] IBM Linux Technology Center

Re: a quest for a better scheduler

2001-04-04 Thread Mike Kravetz
considerable effort to get working in a reasonable well performing manner. Could you make a port of your thing on recent kernels? There is a 2.4.2 patch on the web page. I'll put out a 2.4.3 patch as soon as I get some time. -- Mike Kravetz [EMAIL PROTECTED] IBM

RT scheduling: wakeup bug?

2007-10-01 Thread Mike Kravetz
I've been trying to track down some unexpected realtime latencies and believe one source is a bug in the wakeup code. Specifically, this is within the try_to_wake_up() routine. Within this routine there is the following code segment: /* * If a newly woken up RT task cannot

Re: -rt scheduling: wakeup bug?

2007-10-02 Thread Mike Kravetz
On Tue, Oct 02, 2007 at 07:06:32AM +0200, Ingo Molnar wrote: * Mike Kravetz [EMAIL PROTECTED] wrote: My observations/debugging/conclusions are based on an earlier version of the code. It appears the same code/issue still exists in the most recent version. But, I have not done any work

Re: -rt scheduling: wakeup bug?

2007-10-03 Thread Mike Kravetz
On Tue, Oct 02, 2007 at 07:06:32AM +0200, Ingo Molnar wrote: Index: linux-rt-rebase.q/kernel/sched.c === --- linux-rt-rebase.q.orig/kernel/sched.c +++ linux-rt-rebase.q/kernel/sched.c @@ -1819,6 +1819,13 @@ out_set_cpu:

-rt more realtime scheduling issues

2007-10-05 Thread Mike Kravetz
Hi Ingo, After applying the fix to try_to_wake_up() I was still seeing some large latencies for realtime tasks. Some debug code pointed out two additional causes of these latencies. I have put fixes into my 'old' kernel and the scheduler related latencies have gone away. I'm pretty confident

Re: -rt more realtime scheduling issues

2007-10-08 Thread Mike Kravetz
On Fri, Oct 05, 2007 at 07:15:48PM -0700, Mike Kravetz wrote: After applying the fix to try_to_wake_up() I was still seeing some large latencies for realtime tasks. I've been looking for places in the code where reschedule IPIs should be sent in the case of 'overload' to redistribute RealTime

Re: -rt more realtime scheduling issues

2007-10-09 Thread Mike Kravetz
On Mon, Oct 08, 2007 at 11:04:12PM -0400, Steven Rostedt wrote: On Mon, Oct 08, 2007 at 11:45:23AM -0700, Mike Kravetz wrote: Are these accurate statements? I'll start working on a reliable delivery mechanism for RealTime scheduling. But, I just want to make sure that is really necessary

Re: [PATCH RT] fix rt-task scheduling issue

2007-10-09 Thread Mike Kravetz
On Mon, Oct 08, 2007 at 10:46:21PM -0400, Steven Rostedt wrote: Mike, Can you attach your Signed-off-by to this patch, please. On Fri, Oct 05, 2007 at 07:15:48PM -0700, Mike Kravetz wrote: Hi Ingo, After applying the fix to try_to_wake_up() I was still seeing some large latencies

Re: [RFC PATCH RT] push waiting rt tasks to cpus with lower prios.

2007-10-09 Thread mike kravetz
On Tue, Oct 09, 2007 at 01:59:37PM -0400, Steven Rostedt wrote: This has been compile tested (and no more ;-) The idea here is when we find a situation that we just scheduled in an RT task and we either pushed a lesser RT task away or more than one RT task was scheduled on this CPU before

Re: [RFC PATCH RT] push waiting rt tasks to cpus with lower prios.

2007-10-09 Thread mike kravetz
On Tue, Oct 09, 2007 at 04:50:47PM -0400, Steven Rostedt wrote: I did something like this a while ago for another scheduling project. A couple 'possible' optimizations to think about are: 1) Only scan the remote runqueues once and keep a local copy of the remote priorities for

Re: [PATCH] RT: Fix special-case exception for preempting the local CPU

2007-10-10 Thread mike kravetz
On Wed, Oct 10, 2007 at 10:49:35AM -0400, Gregory Haskins wrote: diff --git a/kernel/sched.c b/kernel/sched.c index 3e75c62..b7f7a96 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -1869,7 +1869,8 @@ out_activate: * extra locking in this particular case, because

Re: -rt more realtime scheduling issues

2007-10-10 Thread Mike Kravetz
On Wed, Oct 10, 2007 at 07:50:52AM -0400, Steven Rostedt wrote: On Tue, Oct 09, 2007 at 11:49:53AM -0700, Mike Kravetz wrote: The more I try to understand the IPI handling the more confused I get. :( At first I was concerned about an IPI happening in the middle of the __schedule routine

[PATCH] ppc64 Kconfig memory models

2005-04-05 Thread Mike Kravetz
and FLAT for others. -- Signed-off-by: Mike Kravetz [EMAIL PROTECTED] diff -Naupr linux-2.6.12-rc2-mm1/arch/ppc64/Kconfig linux-2.6.12-rc2-mm1.work/arch/ppc64/Kconfig --- linux-2.6.12-rc2-mm1/arch/ppc64/Kconfig 2005-04-05 18:44:57.0 + +++ linux-2.6.12-rc2-mm1.work/arch/ppc64/Kconfig

Re: [RFC][PATCH] Sparse Memory Handling (hot-add foundation)

2005-02-17 Thread Mike Kravetz
On Thu, Feb 17, 2005 at 04:03:53PM -0800, Dave Hansen wrote: The attached patch Just tried to compile this and noticed that there is no definition of valid_section_nr(), referenced in sparse_init. -- Mike

[PATCH] PPC64 NUMA memory fixup (another try)

2005-03-16 Thread Mike Kravetz
and OpenPower 720. -- Signed-off-by: Mike Kravetz [EMAIL PROTECTED] diff -Naupr linux-2.6.11.4/arch/ppc64/mm/numa.c linux-2.6.11.4.work/arch/ppc64/mm/numa.c --- linux-2.6.11.4/arch/ppc64/mm/numa.c 2005-03-16 00:09:31.0 + +++ linux-2.6.11.4.work/arch/ppc64/mm/numa.c 2005-03-16 17:40

Re: [PATCH] ppc64: Add mem=X option, updated NUMA support

2005-03-23 Thread Mike kravetz
On Wed, Mar 23, 2005 at 11:11:10PM +1100, Michael Ellerman wrote: Can you test this on your 720 or whatever it was? And if anyone else has an interesting NUMA machine they can test it on I'd love to hear about it! I've tested this with various config options on my 720. Appears to work

Re: [PATCH] PPC64 NUMA memory fixup

2005-03-10 Thread mike kravetz
On Thu, Mar 10, 2005 at 02:36:13AM -0800, Andrew Morton wrote: This patch causes the non-numa G5 to oops very early in boot in smp_call_function(). OK - Let me take a look. -- Mike

Re: [PATCH] PPC64 NUMA memory fixup

2005-03-11 Thread mike kravetz
On Fri, Mar 11, 2005 at 07:51:38PM +1100, Paul Mackerras wrote: Anyway, the ultimate reason seems to be that the numa.c code is assuming that an address value and a size value occupy the same number of cells. On the G5 we have #address-cells = 2 but #size-cells = 1. Previously this didn't
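
The cell-count mismatch described in the snippet above can be illustrated with a small user-space sketch. This is not the numa.c code from the patch: read_n_cells(), struct mem_range and parse_reg_entry() are hypothetical names chosen for illustration, and the cells are assumed to be already converted to host byte order. The point is only that the address and the size must each be read with their own cell count (2 and 1 on the G5, per the message) rather than one shared count.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine n consecutive 32-bit cells into one 64-bit value and advance
 * the cursor past them. Hypothetical helper for illustration only;
 * cells are assumed to be in host byte order already.
 */
static uint64_t read_n_cells(unsigned int n, const uint32_t **cursor)
{
    uint64_t value = 0;

    assert(n >= 1 && n <= 2);   /* a 64-bit result holds at most two cells */
    while (n--) {
        value = (value << 32) | **cursor;
        (*cursor)++;
    }
    return value;
}

/* One (address, size) pair, as found in a memory "reg"-style property. */
struct mem_range {
    uint64_t base;
    uint64_t size;
};

/*
 * Parse one entry using separate cell counts for the address and the
 * size. Using addr_cells for both would misread the G5 layout, where
 * #address-cells = 2 but #size-cells = 1.
 */
static struct mem_range parse_reg_entry(const uint32_t **cursor,
                                        unsigned int addr_cells,
                                        unsigned int size_cells)
{
    struct mem_range r;

    r.base = read_n_cells(addr_cells, cursor);
    r.size = read_n_cells(size_cells, cursor);
    return r;
}
```

With a shared count of 2, the size of the first entry would swallow the first address cell of the next entry, which is the kind of misparse the thread attributes to the original code.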

Re: [PATCH] PPC64 NUMA memory fixup

2005-03-11 Thread mike kravetz
this on a machine known to break with the previous version (such as G5). -- Signed-off-by: Mike Kravetz [EMAIL PROTECTED] diff -Naupr linux-2.6.11/arch/ppc64/mm/numa.c linux-2.6.11.work/arch/ppc64/mm/numa.c --- linux-2.6.11/arch/ppc64/mm/numa.c 2005-03-02 07:38:38.0 + +++ linux-2.6.11

Re: [PATCH 1/4] create mm/Kconfig for arch-independent memory options

2005-04-04 Thread Mike Kravetz
On Mon, Apr 04, 2005 at 10:50:09AM -0700, Dave Hansen wrote: diff -puN mm/Kconfig~A6-mm-Kconfig mm/Kconfig --- memhotplug/mm/Kconfig~A6-mm-Kconfig 2005-04-04 09:04:48.0 -0700 +++ memhotplug-dave/mm/Kconfig 2005-04-04 10:15:23.0 -0700 @@ -0,0 +1,25 @@ +choice + prompt Memory

Re: NUMA policy interface

2005-08-04 Thread Mike Kravetz
On Thu, Aug 04, 2005 at 03:19:52PM -0700, Christoph Lameter wrote: This code already exist in the memory hotplug code base and Ray already had a working implementation for page migration. The migration code will also be necessary in order to relocate pages with ECC single bit failures that

Re: Bug: early_pfn_in_nid() called when not early

2006-12-13 Thread Mike Kravetz
On Wed, Dec 13, 2006 at 07:20:57PM +0100, Arnd Bergmann wrote: After a lot of debugging in spufs, I found that a crash that we encountered on Cell actually was caused by a change in the memory management. The patch that caused it is archived in http://lkml.org/lkml/2006/11/1/43, and this one

[PATCH V2 3/4] hugetlbfs: accept subpool min_size mount option and setup accordingly

2015-03-16 Thread Mike Kravetz
time an attempt is made to reserve min_size pages. If the reservation fails, the mount fails. At umount time, the reserved pages are released. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- fs/hugetlbfs/inode.c| 75 ++--- include/linux

[PATCH V2 0/4] hugetlbfs: add min_size filesystem mount option

2015-03-16 Thread Mike Kravetz
to specify minimum size. (David Rientjes) V1: Comments from RFC addressed/incorporated Mike Kravetz (4): hugetlbfs: add minimum size tracking fields to subpool structure hugetlbfs: add minimum size accounting to subpools hugetlbfs: accept subpool min_size mount option and setup accordingly

[PATCH V2 2/4] hugetlbfs: add minimum size accounting to subpools

2015-03-16 Thread Mike Kravetz
. The routines now return this global reserve count adjustment. This global adjusted reserve count is then passed to the global accounting routine hugetlb_acct_memory(). Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/hugetlb.c | 115 --- 1

[PATCH V2 4/4] hugetlbfs: document min_size mount option

2015-03-16 Thread Mike Kravetz
Update documentation for the hugetlbfs min_size mount option. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- Documentation/vm/hugetlbpage.txt | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm

[PATCH V2 1/4] hugetlbfs: add minimum size tracking fields to subpool structure

2015-03-16 Thread Mike Kravetz
to this minimum. An additional field (rsv_hpages) is used to track the number of pages reserved to meet this minimum size. The hstate pointer in the subpool is convenient to have when reserving and unreserving the pages. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- include/linux/hugetlb.h | 2

Re: [PATCH V2 4/4] hugetlbfs: document min_size mount option

2015-03-18 Thread Mike Kravetz
On 03/18/2015 02:41 PM, Andrew Morton wrote: On Mon, 16 Mar 2015 16:53:29 -0700 Mike Kravetz mike.krav...@oracle.com wrote: Update documentation for the hugetlbfs min_size mount option. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- Documentation/vm/hugetlbpage.txt | 21

Re: [PATCH V2 3/4] hugetlbfs: accept subpool min_size mount option and setup accordingly

2015-03-18 Thread Mike Kravetz
On 03/18/2015 02:40 PM, Andrew Morton wrote: On Mon, 16 Mar 2015 16:53:28 -0700 Mike Kravetz mike.krav...@oracle.com wrote: Make 'min_size=' be an option when mounting a hugetlbfs. This option takes the same value as the 'size' option. min_size can be specified without specifying size. If both

Re: [PATCH V2 4/4] hugetlbfs: document min_size mount option

2015-03-20 Thread Mike Kravetz
On 03/18/2015 07:23 PM, Andrew Morton wrote: On Wed, 18 Mar 2015 18:51:22 -0700 Mike Kravetz mike.krav...@oracle.com wrote: Nowhere here is the reader told the units of size. We should at least describe that, and maybe even rename the thing to min_bytes. Ok, I will add that the size

[PATCH V3 2/4] hugetlbfs: add minimum size accounting to subpools

2015-03-20 Thread Mike Kravetz
. The routines now return this global reserve count adjustment. This global reserve count adjustment is then passed to the global accounting routine hugetlb_acct_memory(). Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/hugetlb.c | 123

[PATCH V3 0/4] hugetlbfs: add min_size filesystem mount option

2015-03-20 Thread Mike Kravetz
to specify minimum size. Suggested by David Rientjes V1: Comments from RFC addressed/incorporated Mike Kravetz (4): hugetlbfs: add minimum size tracking fields to subpool structure hugetlbfs: add minimum size accounting to subpools hugetlbfs: accept subpool min_size mount option and setup

[PATCH V3 3/4] hugetlbfs: accept subpool min_size mount option and setup accordingly

2015-03-20 Thread Mike Kravetz
, then at mount time an attempt is made to reserve min_size pages. If the reservation fails, the mount fails. At umount time, the reserved pages are released. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- fs/hugetlbfs/inode.c| 90 ++--- include
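
The mount-time behaviour described in the snippet above (try to reserve min_size pages, fail the mount if that is not possible, release the reservation at umount) can be modelled with a toy user-space sketch. This is not the kernel implementation: struct toy_pool, struct toy_subpool, toy_mount() and toy_umount() are hypothetical, and only the field name rsv_hpages is borrowed from the patch descriptions in this listing.

```c
#include <stdbool.h>

/* Toy stand-in for the global huge page pool (hypothetical). */
struct toy_pool {
    long free_pages;      /* pages currently free in the pool */
    long reserved_pages;  /* free pages already promised to someone */
};

/* Toy stand-in for a per-mount subpool (hypothetical). */
struct toy_subpool {
    long min_hpages;      /* minimum size requested at mount time */
    long rsv_hpages;      /* pages reserved to meet that minimum */
};

/*
 * Mount: attempt to reserve min_hpages from the global pool.
 * Returns false (the mount fails) if the reservation cannot be met.
 */
static bool toy_mount(struct toy_pool *pool, struct toy_subpool *sp,
                      long min_hpages)
{
    long available = pool->free_pages - pool->reserved_pages;

    if (min_hpages < 0 || available < min_hpages)
        return false;

    pool->reserved_pages += min_hpages;
    sp->min_hpages = min_hpages;
    sp->rsv_hpages = min_hpages;
    return true;
}

/* Umount: hand the reserved pages back to the global pool. */
static void toy_umount(struct toy_pool *pool, struct toy_subpool *sp)
{
    pool->reserved_pages -= sp->rsv_hpages;
    sp->rsv_hpages = 0;
    sp->min_hpages = 0;
}
```

In this model a second mount asking for more pages than remain unreserved fails until the first filesystem is unmounted, which mirrors the "if the reservation fails, the mount fails" rule stated in the patch description.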

[PATCH V3 4/4] hugetlbfs: document min_size mount option and cleanup

2015-03-20 Thread Mike Kravetz
Add min_size mount option to the hugetlbfs documentation. Also, add the missing pagesize option and mention that size can be specified as bytes or a percentage of huge page pool. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- Documentation/vm/hugetlbpage.txt | 31

[PATCH V3 1/4] hugetlbfs: add minimum size tracking fields to subpool structure

2015-03-20 Thread Mike Kravetz
to this minimum. An additional field (rsv_hpages) is used to track the number of pages reserved to meet this minimum size. The hstate pointer in the subpool is convenient to have when reserving and unreserving the pages. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- include/linux/hugetlb.h | 8

Re: [RFC 0/3] hugetlbfs: optionally reserve all fs pages at mount time

2015-03-06 Thread Mike Kravetz
On 03/06/2015 07:10 AM, Michal Hocko wrote: On Mon 02-03-15 17:18:14, Mike Kravetz wrote: On 03/02/2015 03:10 PM, Andrew Morton wrote: On Fri, 27 Feb 2015 14:58:08 -0800 Mike Kravetz mike.krav...@oracle.com wrote: hugetlbfs allocates huge pages from the global pool as needed. Even

Re: [RFC 0/3] hugetlbfs: optionally reserve all fs pages at mount time

2015-03-06 Thread Mike Kravetz
On 03/06/2015 01:14 PM, David Rientjes wrote: On Fri, 6 Mar 2015, Mike Kravetz wrote: Thanks for the CONFIG_CGROUP_HUGETLB suggestion, however I do not believe this will be a satisfactory solution for my usecase. As you point out, cgroups could be set up (by a sysadmin) for every hugetlb user

Re: [RFC 1/3] hugetlbfs: add reserved mount fields to subpool structure

2015-03-02 Thread Mike Kravetz
On 03/02/2015 03:10 PM, Andrew Morton wrote: On Fri, 27 Feb 2015 14:58:10 -0800 Mike Kravetz mike.krav...@oracle.com wrote: Add a boolean to the subpool structure to indicate that the pages for subpool have been reserved. The hstate pointer in the subpool is convenient to have when it comes

Re: [RFC 0/3] hugetlbfs: optionally reserve all fs pages at mount time

2015-03-02 Thread Mike Kravetz
On 03/02/2015 03:10 PM, Andrew Morton wrote: On Fri, 27 Feb 2015 14:58:08 -0800 Mike Kravetz mike.krav...@oracle.com wrote: hugetlbfs allocates huge pages from the global pool as needed. Even if the global pool contains a sufficient number of pages for the filesystem size at mount time, those

Re: [RFC 3/3] hugetlbfs: accept subpool reserved option and setup accordingly

2015-03-02 Thread Mike Kravetz
On 03/02/2015 03:10 PM, Andrew Morton wrote: On Fri, 27 Feb 2015 14:58:13 -0800 Mike Kravetz mike.krav...@oracle.com wrote: Make reserved be an option when mounting a hugetlbfs. New mount option triggers a user documentation update. hugetlbfs isn't well documented, but Documentation/vm

Re: [RFC 2/3] hugetlbfs: coordinate global and subpool reserve accounting

2015-03-02 Thread Mike Kravetz
On 03/02/2015 03:10 PM, Andrew Morton wrote: On Fri, 27 Feb 2015 14:58:11 -0800 Mike Kravetz mike.krav...@oracle.com wrote: If the pages for a subpool are reserved, then the reservations have already been accounted for in the global pool. Therefore, when requesting a new reservation

Re: [PATCH 0/4] hugetlbfs: optionally reserve all fs pages at mount time

2015-03-04 Thread Mike Kravetz
On 03/03/2015 09:49 PM, David Rientjes wrote: On Tue, 3 Mar 2015, Mike Kravetz wrote: Add a new hugetlbfs mount option 'reserved' to specify that the number of pages associated with the size of the filesystem will be reserved. If there are insufficient pages, the mount will fail

[PATCH 2/4] hugetlbfs: coordinate global and subpool reserve accounting

2015-03-03 Thread Mike Kravetz
not adjust the global count. However, when actually allocating or freeing a hugepage be sure to adjust the global reserve count so that it corresponds with the global free count. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/hugetlb.c | 36 1 file

[PATCH 4/4] hugetlbfs: document reserved mount option

2015-03-03 Thread Mike Kravetz
Update documentation for the hugetlbfs reserved mount option. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- Documentation/vm/hugetlbpage.txt | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm

[PATCH 3/4] hugetlbfs: accept subpool reserved option and setup accordingly

2015-03-03 Thread Mike Kravetz
Make reserved be an option when mounting a hugetlbfs. reserved option is only possible if size option is also specified, otherwise the mount will fail. On mount, reserve size hugepages from the global pool and note in subpool. Unreserve hugepages when fs is unmounted. Signed-off-by: Mike

[PATCH 1/4] hugetlbfs: add reserved mount fields to subpool structure

2015-03-03 Thread Mike Kravetz
. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- include/linux/hugetlb.h | 6 ++ mm/hugetlb.c| 2 ++ 2 files changed, 8 insertions(+) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 431b7fc..12fbd5d 100644 --- a/include/linux/hugetlb.h +++ b/include

[PATCH 0/4] hugetlbfs: optionally reserve all fs pages at mount time

2015-03-03 Thread Mike Kravetz
are allocated and free'ed a sufficient number of pages remains reserved. Comments from RFC addressed/incorporated Mike Kravetz (4): hugetlbfs: add reserved mount fields to subpool structure hugetlbfs: coordinate global and subpool reserve accounting hugetlbfs: accept subpool reserved option

Re: [RFC 2/3] hugetlbfs: coordinate global and subpool reserve accounting

2015-02-28 Thread Mike Kravetz
correspondingly. Hillf Thanks Hillf. I'll also take a look at other comments in the area of 'accounting'. As I discovered, it is only a matter of adjusting the accounting to support reservation of pages for the entire filesystem. -- Mike Kravetz - ret = hugetlb_acct_memory(h, chg

[RFC 2/3] hugetlbfs: coordinate global and subpool reserve accounting

2015-02-27 Thread Mike Kravetz
decrement global reserve count to correspond with the decrement in global free pages. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/hugetlb.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index c6adf65..4ef8379 100644

[RFC 0/3] hugetlbfs: optionally reserve all fs pages at mount time

2015-02-27 Thread Mike Kravetz
and free'ed a sufficient number of pages remains reserved. Mike Kravetz (3): hugetlbfs: add reserved mount fields to subpool structure hugetlbfs: coordinate global and subpool reserve accounting hugetlbfs: accept subpool reserved option and setup accordingly fs/hugetlbfs/inode.c| 15

[RFC 1/3] hugetlbfs: add reserved mount fields to subpool structure

2015-02-27 Thread Mike Kravetz
-off-by: Mike Kravetz mike.krav...@oracle.com --- include/linux/hugetlb.h | 6 ++ mm/hugetlb.c| 2 ++ 2 files changed, 8 insertions(+) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 431b7fc..605c648 100644 --- a/include/linux/hugetlb.h +++ b/include/linux

[RFC 3/3] hugetlbfs: accept subpool reserved option and setup accordingly

2015-02-27 Thread Mike Kravetz
Make reserved be an option when mounting a hugetlbfs. reserved option is only possible if size option is also specified. On mount, reserve size hugepages and note in subpool. Unreserve pages when fs is unmounted. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- fs/hugetlbfs/inode.c

Re: [PATCH 0/4] hugetlbfs: optionally reserve all fs pages at mount time

2015-03-06 Thread Mike Kravetz
On 03/06/2015 02:13 PM, Andi Kleen wrote: Mike Kravetz mike.krav...@oracle.com writes: hugetlbfs allocates huge pages from the global pool as needed. Even if the global pool contains a sufficient number of pages for the filesystem size at mount time, those global pages could be grabbed for some

[RFC v2 PATCH 4/5] hugetlbfs: add hugetlbfs_fallocate()

2015-04-23 Thread Mike Kravetz
it is currently implemented using fallocate(). MADV_REMOVE lets madvise() remove pages from the middle of a hugetlbfs file, which wasn't possible before. hugetlbfs fallocate only operates on whole huge pages. Based-on code-by: Dave Hansen dave.han...@linux.intel.com Signed-off-by: Mike Kravetz

[RFC v2 PATCH 5/5] mm: madvise allow remove operation for hugetlbfs

2015-04-23 Thread Mike Kravetz
Now that we have hole punching support for hugetlbfs, we can also support the MADV_REMOVE interface to it. Signed-off-by: Dave Hansen dave.han...@linux.intel.com Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/madvise.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[RFC v2 PATCH 2/5] hugetlbfs: remove region_truncate() as region_del() can be used

2015-04-23 Thread Mike Kravetz
Now that region_del() exists, the region_truncate() routine can be removed. Callers of region_truncate are changed to call region_del instead with an ending value of -1. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/hugetlb.c | 37 + 1 file

[RFC v2 PATCH 3/5] hugetlbfs: New huge_add_to_page_cache helper routine

2015-04-23 Thread Mike Kravetz
Currently, there is only a single place where hugetlbfs pages are added to the page cache. The new fallocate code will be adding a second one, so break the functionality out into its own helper. Signed-off-by: Dave Hansen dave.han...@linux.intel.com Signed-off-by: Mike Kravetz mike.krav

[RFC v2 PATCH 1/5] hugetlbfs: truncate_hugepages() takes a range of pages

2015-04-23 Thread Mike Kravetz
. Based-on code-by: Dave Hansen dave.han...@linux.intel.com Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- fs/hugetlbfs/inode.c| 31 +++- include/linux/hugetlb.h | 3 +- mm/hugetlb.c| 76 +++-- 3 files changed, 100

[RFC v2 PATCH 0/5] hugetlbfs: add fallocate support

2015-04-23 Thread Mike Kravetz
noticed by Hillf Danton New region_del() routine for region tracking/resv_map of ranges Fixed several issues found during more extensive testing Error handling in region_del() when kmalloc() fails still needs to be addressed madvise remove support remains Mike Kravetz (5

[PATCH 2/2] mm/hugetlb: handle races in alloc_huge_page and hugetlb_reserve_pages

2015-05-18 Thread Mike Kravetz
. Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/hugetlb.c | 37 + 1 file changed, 33 insertions(+), 4 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 7f64034..63f6d43 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1374,13 +1374,16 @@ static

[PATCH 0/2] alloc_huge_page/hugetlb_reserve_pages race

2015-05-18 Thread Mike Kravetz
at the same time another task is mapping the same region without the MAP_NORESERVE flag. The patch set does not prevent the race from happening. Rather, it adds simple functionality to detect when the race has occurred. If a race is detected, then the incorrect counts are adjusted. Mike Kravetz (2

[PATCH 1/2] mm/hugetlb: compute/return the number of regions added by region_add()

2015-05-18 Thread Mike Kravetz
of region_add(). The special case return values of vma_needs_reservation() should also be taken into account when determining the return value of vma_commit_reservation(). Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/hugetlb.c | 19 +++ 1 file changed, 15 insertions(+), 4

hugetlbfs alignment requirements conflicting with documentation

2015-04-15 Thread Mike Kravetz
value). cc'ing some people from the recent hugetlb munmap alignment thread as I'm sure they will have an opinion here. -- Mike Kravetz

[RFC PATCH 0/4] hugetlbfs: add fallocate support

2015-04-16 Thread Mike Kravetz
, and ideally would like to release them back to the subpool or global pools for other uses. The fallocate() system call provides an interface for preallocation and hole punching within files. This patch set adds fallocate functionality to hugetlbfs. Mike Kravetz (4): hugetlbfs: truncate_hugepages() takes

[RFC PATCH 1/4] hugetlbfs: truncate_hugepages() takes a range of pages

2015-04-16 Thread Mike Kravetz
Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- fs/hugetlbfs/inode.c | 25 + 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index c274aca..d5b67fd 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c

[RFC PATCH 2/4] hugetlbfs: New huge_add_to_page_cache helper routine

2015-04-16 Thread Mike Kravetz
Currently, there is only a single place where hugetlbfs pages are added to the page cache. The new fallocate code will be adding a second one, so break the functionality out into its own helper. Signed-off-by: Dave Hansen dave.han...@linux.intel.com Signed-off-by: Mike Kravetz mike.krav

[RFC PATCH 3/4] hugetlbfs: add hugetlbfs_fallocate()

2015-04-16 Thread Mike Kravetz
it is currently implemented using fallocate(). MADV_REMOVE lets us remove data from the middle of a hugetlbfs file, which wasn't possible before. hugetlbfs fallocate only operates on whole huge pages. Based-on code-by: Dave Hansen dave.han...@linux.intel.com Signed-off-by: Mike Kravetz mike.krav
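
The note that hugetlbfs fallocate only operates on whole huge pages can be illustrated with a small user-space sketch. It is not taken from the patch: the helper name whole_hpage_range() and struct hpage_range are hypothetical, and the inward rounding (start rounded up, end rounded down) is an assumption about how a caller might restrict a hole-punch request to whole huge pages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type for illustration only; not part of the patch set. */
struct hpage_range {
    uint64_t start; /* byte offset of the first whole huge page */
    uint64_t end;   /* byte offset just past the last whole huge page */
};

/*
 * Shrink the byte range [offset, offset + len) to the whole huge pages
 * it fully contains. hpage_size must be a power of two.
 * Returns false if the arguments are invalid or no whole huge page
 * lies inside the range.
 */
static bool whole_hpage_range(uint64_t offset, uint64_t len,
                              uint64_t hpage_size, struct hpage_range *out)
{
    /* Reject zero and non-power-of-two huge page sizes. */
    if (hpage_size == 0 || (hpage_size & (hpage_size - 1)) != 0)
        return false;

    uint64_t mask = hpage_size - 1;

    /* Guard against overflow in offset + len and in the round-up. */
    if (len > UINT64_MAX - offset || offset > UINT64_MAX - mask)
        return false;

    uint64_t start = (offset + mask) & ~mask; /* round start up */
    uint64_t end = (offset + len) & ~mask;    /* round end down */

    if (start >= end)
        return false; /* range covers no whole huge page */

    out->start = start;
    out->end = end;
    return true;
}
```

With a 2 MiB huge page size, a request covering bytes 1 MiB to 5 MiB would shrink to the single huge page at 2 MiB to 4 MiB. A caller could then pass start and end - start to fallocate() with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE on an open hugetlbfs file.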

[RFC PATCH 4/4] mm: madvise allow remove operation for hugetlbfs

2015-04-16 Thread Mike Kravetz
Now that we have hole punching support for hugetlbfs, we can also support the MADV_REMOVE interface to it. Signed-off-by: Dave Hansen dave.han...@linux.intel.com Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/madvise.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

Re: [RFC PATCH 3/4] hugetlbfs: add hugetlbfs_fallocate()

2015-04-17 Thread Mike Kravetz
() +* unlock_page because locked by add_to_page_cache() +*/ + put_page(page); Still needed if EEXIST? Nope. Good catch. I'll fix this in the next version. -- Mike Kravetz

Re: [RFC PATCH 4/4] mm: madvise allow remove operation for hugetlbfs

2015-04-17 Thread Mike Kravetz
On 04/17/2015 12:10 AM, Hillf Danton wrote: Now that we have hole punching support for hugetlbfs, we can also support the MADV_REMOVE interface to it. Signed-off-by: Dave Hansen dave.han...@linux.intel.com Signed-off-by: Mike Kravetz mike.krav...@oracle.com --- mm/madvise.c | 2 +- 1 file
