following patch describes the current race in detail and adds the mutex
to prevent truncate/fault races.
Mike Kravetz (1):
hugetlbfs: introduce truncation/fault mutex to avoid races
fs/hugetlbfs/inode.c| 24
include/linux/hugetlb.h | 1 +
mm/hugetlb.c| 25
ation takes in write mode.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 24
include/linux/hugetlb.h | 1 +
mm/hugetlb.c| 25 +++--
mm/userfaultfd.c| 8 +++-
4 files changed, 47 insertions(+), 11 deletions(-)
diff
iated page.
This is how we end up with an elevated map count.
To solve, check the dst_pte entry for huge_pte_none. If !none, this
implies PMD sharing so do not copy.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 23 +++
1 file changed, 19 insertions(+), 4 deletions(-)
diff
On 11/5/18 1:30 PM, Andrew Morton wrote:
> On Mon, 5 Nov 2018 13:23:15 -0800 Mike Kravetz
> wrote:
>
>> This bug has been experienced several times by Oracle DB team.
>> The BUG is in the routine remove_inode_hugepages() as follows:
>> /*
>> * If
On 10/23/18 12:43 AM, Michal Hocko wrote:
> On Wed 17-10-18 21:10:22, Mike Kravetz wrote:
>> Some test systems were experiencing negative huge page reserve
>> counts and incorrect file block counts. This was traced to
>> /proc/sys/vm/drop_caches removing clean pages f
d in read mode after huge_pte_alloc, until the caller
is finished with the returned ptep.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 21 ++
mm/hugetlb.c | 65 +---
mm/rmap.c| 10 +++
mm/userfaultfd.c |
worse. This leads to bad
things such as incorrect page map/reference counts or invalid memory
references.
Fix this all by modifying the usage of i_mmap_rwsem to cover
fault/truncate races as well as handling of shared pmds.
Mike Kravetz (1):
hugetlbfs: use i_mmap_rwsem for pmd sharing and tru
On 10/18/18 4:08 PM, Andrew Morton wrote:
> On Wed, 17 Oct 2018 21:10:22 -0700 Mike Kravetz
> wrote:
>
>> Some test systems were experiencing negative huge page reserve
>> counts and incorrect file block counts. This was traced to
>> /proc/sys/vm/drop_caches removing
On 10/18/18 6:47 PM, Andrew Morton wrote:
> On Thu, 18 Oct 2018 20:46:21 -0400 Andrea Arcangeli
> wrote:
>
>> On Thu, Oct 18, 2018 at 04:16:40PM -0700, Mike Kravetz wrote:
>>> I was not sure about this, and expected someone could come up with
>>> something
workflow above. With the
suggested changes, I think this is OK for huge pages. However, it seems
that setting HWPoison on an in-use non-huge page could cause issues?
While looking at the code, I noticed this comment in __get_any_page()
/*
* When the target page is a free hugepage, just remove it
* from free hugepage list.
*/
Did that apply to some code that was removed? It does not seem to make
any sense in that routine.
--
Mike Kravetz
On 07/17/2018 06:28 PM, Naoya Horiguchi wrote:
> On Tue, Jul 17, 2018 at 01:10:39PM -0700, Mike Kravetz wrote:
>> It seems that soft_offline_free_page can be called for in use pages.
>> Certainly, that is the case in the first workflow above. With the
>> suggested changes, I
On 4/20/21 1:46 AM, Muchun Song wrote:
> On Tue, Apr 20, 2021 at 7:20 AM Mike Kravetz wrote:
>>
>> On 4/15/21 1:40 AM, Muchun Song wrote:
>>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>>> index 0abed7e766b8..6e970a7d3480 100644
>>&g
s may not be too bad in
the case of freeing a single page, but would become more complex when doing
bulk freeing. After a little thought, the workqueue approach may even end
up simpler. However, I would suggest a very simple workqueue implementation
with non-blocking allocations. If we can not quickly get vmemmap pages,
put the page back on the hugetlb free list and treat as a surplus page.
--
Mike Kravetz
caused a DOS scenario
as Michal suggested.
However, this is an 'opt in' feature. So, I would not expect anyone who
carefully plans the size of their hugetlb pool to enable such a feature.
If there is a use case where hugetlb pages are used in a non-essential
application, this might be of use.
--
Mike Kravetz
On 4/7/21 12:24 AM, Miaohe Lin wrote:
> Hi:
> On 2021/4/7 10:49, Mike Kravetz wrote:
>> On 4/2/21 2:32 AM, Miaohe Lin wrote:
>>> The resv_map could be NULL since this routine can be called in the evict
>>> inode path for all hugetlbfs inodes. So we could have chg = 0
On 4/6/21 8:09 PM, Miaohe Lin wrote:
> On 2021/4/7 10:37, Mike Kravetz wrote:
>> On 4/6/21 7:05 PM, Miaohe Lin wrote:
>>> Hi:
>>> On 2021/4/7 8:53, Mike Kravetz wrote:
>>>> On 4/2/21 2:32 AM, Miaohe Lin wrote:
>>>>> It's guarant
ing you suggest. Please do
not start until we get an Ack from Oscar as he will need to participate.
Remove patches for this series in your tree from Mike Kravetz:
- hugetlb: add lockdep_assert_held() calls for hugetlb_lock
- hugetlb: fix irq locking omissions
- hugetlb: make free_huge_page irq safe
-
On 4/7/21 7:44 PM, Miaohe Lin wrote:
> On 2021/4/8 5:23, Mike Kravetz wrote:
>> On 4/6/21 8:09 PM, Miaohe Lin wrote:
>>> On 2021/4/7 10:37, Mike Kravetz wrote:
>>>> On 4/6/21 7:05 PM, Miaohe Lin wrote:
>>>>> Hi:
>>>>> On 2021/4/7 8:53, Mi
On 4/7/21 8:26 PM, Miaohe Lin wrote:
> On 2021/4/8 11:24, Miaohe Lin wrote:
>> On 2021/4/8 4:53, Mike Kravetz wrote:
>>> On 4/7/21 12:24 AM, Miaohe Lin wrote:
>>>> Hi:
>>>> On 2021/4/7 10:49, Mike Kravetz wrote:
>>>>> On 4/2/21 2:32 AM,
e if (!rsv_adjust) {
> + reserved = true;
> }
> +
> + if (!reserved)
> + pr_warn("hugetlb: fix reserve count failed\n");
We should expand this warning message a bit to indicate what this may
mean to the user. Add something like:
"Huge Page Reserved count may go negative".
--
Mike Kravetz
On 4/8/21 8:01 PM, Miaohe Lin wrote:
> On 2021/4/9 6:53, Mike Kravetz wrote:
>>
>> Yes, add a comment to hugetlb_unreserve_pages saying that !resv_map
>> implies freed == 0.
>>
>
> Sounds good!
>
>> It would also be helpful to check for (
I've been trying to track down some unexpected realtime latencies and
believe one source is a bug in the wakeup code. Specifically, this is
within the try_to_wake_up() routine. Within this routine there is the
following code segment:
/*
* If a newly woken up RT task cannot preem
On 03/18/2015 07:23 PM, Andrew Morton wrote:
On Wed, 18 Mar 2015 18:51:22 -0700 Mike Kravetz wrote:
Nowhere here is the reader told the units of "size". We should at
least describe that, and maybe even rename the thing to min_bytes.
Ok, I will add that the size is in unit of
s specified, then at mount time an attempt is made to reserve
min_size pages. If the reservation fails, the mount fails. At umount
time, the reserved pages are released.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 90 ++---
include/linux/h
:
Added ability to specify minimum size. Suggested by David Rientjes
V1:
Comments from RFC addressed/incorporated
Mike Kravetz (4):
hugetlbfs: add minimum size tracking fields to subpool structure
hugetlbfs: add minimum size accounting to subpools
hugetlbfs: accept subpool min_size mo
routines now return
this global reserve count adjustment. This global reserve count
adjustment is then passed to the global accounting routine
hugetlb_acct_memory().
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 123 ---
1 file changed, 100
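The accounting described above (subpool routines returning an adjustment
that is then passed to hugetlb_acct_memory()) can be sketched roughly as
follows. This is a simplified userspace model, not the kernel code: the
struct name, the helper name and the -1 error return are assumptions made
for illustration, and locking is omitted entirely.

```c
#include <assert.h>

/* Simplified userspace model; names are illustrative assumptions. */
struct subpool_model {
    long max_hpages;   /* maximum size in huge pages, or -1 for no limit */
    long used_hpages;  /* pages currently charged to the subpool */
    long min_hpages;   /* minimum size in huge pages, or -1 for none */
    long rsv_hpages;   /* pages still held in reserve to meet the minimum */
};

/*
 * Charge delta pages to the subpool. Returns the number of pages that
 * still have to be charged to the global reserve count, or -1 if the
 * subpool maximum would be exceeded.
 */
static long subpool_get_pages(struct subpool_model *sp, long delta)
{
    long global_adjust = delta;

    assert(delta >= 0);

    /* Enforce the maximum size, if one was set. */
    if (sp->max_hpages != -1 && sp->used_hpages + delta > sp->max_hpages)
        return -1;
    sp->used_hpages += delta;

    /* Consume pages already reserved for the minimum size first. */
    if (sp->min_hpages != -1 && sp->rsv_hpages > 0) {
        if (delta > sp->rsv_hpages) {
            global_adjust = delta - sp->rsv_hpages;
            sp->rsv_hpages = 0;
        } else {
            global_adjust = 0;
            sp->rsv_hpages -= delta;
        }
    }
    return global_adjust;
}
```

In this model a caller would pass the returned value on to the global
accounting step only when it is greater than zero, since pages covered by
the subpool's own reserve were already accounted for at mount time.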
minimum. An additional
field (rsv_hpages) is used to track the number of pages reserved
to meet this minimum size. The hstate pointer in the subpool
is convenient to have when reserving and unreserving the pages.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 8 +++-
mm/hugetlb.c
Add min_size mount option to the hugetlbfs documentation. Also,
add the missing pagesize option and mention that size can be
specified as bytes or a percentage of huge page pool.
Signed-off-by: Mike Kravetz
---
Documentation/vm/hugetlbpage.txt | 31 ++-
1 file
noticed by Hillf Danton
New region_del() routine for region tracking/resv_map of ranges
Fixed several issues found during more extensive testing
Error handling in region_del() when kmalloc() fails still needs
to be addressed
madvise remove support remains
Mike Kravetz (5
it is currently
implemented using fallocate(). MADV_REMOVE lets madvise() remove
pages from the middle of a hugetlbfs file, which wasn't possible
before.
hugetlbfs fallocate only operates on whole huge pages.
Based-on code-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/in
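To illustrate "only operates on whole huge pages": one plausible reading
for hole punch is that a byte range is shrunk inward to huge page
boundaries, so partially covered huge pages are left alone. The sketch
below assumes that reading; the struct and helper names are hypothetical
and this is userspace arithmetic only, not the patch itself.

```c
#include <assert.h>

/* Half-open byte range [start, end); hypothetical type for illustration. */
struct hole {
    long long start;
    long long end;
};

/*
 * Shrink [offset, offset + len) inward so that it covers only whole
 * huge pages. An empty result (start == end) means no whole huge page
 * lies inside the requested range.
 */
static struct hole whole_huge_pages(long long offset, long long len,
                                    long long hpage_size)
{
    struct hole h;

    assert(offset >= 0 && len >= 0 && hpage_size > 0);

    /* Round the start up and the end down to a huge page boundary. */
    h.start = (offset + hpage_size - 1) / hpage_size * hpage_size;
    h.end = (offset + len) / hpage_size * hpage_size;
    if (h.end < h.start)
        h.end = h.start;
    return h;
}
```

With a 2MB huge page size, a request starting at 1MB with a length of 4MB
would cover only the single huge page from 2MB to 4MB under this model.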
Now that we have hole punching support for hugetlbfs, we can
also support the MADV_REMOVE interface to it.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index d551475
Now that region_del() exists, the region_truncate() routine can be
removed. Callers of region_truncate are changed to call region_del
instead with an ending value of -1.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 37 +
1 file changed, 1 insertion(+), 36
Currently, there is only a single place where hugetlbfs pages are
added to the page cache. The new fallocate code will be adding a second
one, so break the functionality out into its own helper.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 2 ++
mm
.
Based-on code-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 31 +++-
include/linux/hugetlb.h | 3 +-
mm/hugetlb.c| 76 +++--
3 files changed, 100 insertions(+), 10 deletions(-)
diff --git a/fs
ze
aligned value).
cc'ing some people from the recent hugetlb munmap alignment thread as
I'm sure they will have an opinion here.
--
Mike Kravetz
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.or
Modify truncate_hugepages() to take a range of pages (start, end)
instead of simply start. If the value of end is -1, this indicates
the end of the range is the end of the file. This functionality
will be used for fallocate hole punching.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
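The (start, end) convention described above, where an end of -1 means
"to the end of the file", might be modelled like this. It is a userspace
sketch with hypothetical names, working in page indexes, and it makes no
claim about how truncate_hugepages() is actually written.

```c
#include <assert.h>

/* Half-open range of page indexes [start, end); illustrative only. */
struct page_range {
    long start;
    long end;
};

/*
 * Resolve a caller-supplied (start, end) pair against a file that is
 * file_pages long. An end of -1 selects everything from start to the
 * end of the file, which is the truncate case.
 */
static struct page_range resolve_range(long start, long end, long file_pages)
{
    struct page_range r;

    assert(start >= 0 && file_pages >= 0 && end >= -1);

    /* -1 (or anything past EOF) is clamped to the end of the file. */
    r.end = (end == -1 || end > file_pages) ? file_pages : end;
    /* A start at or beyond the end yields an empty range. */
    r.start = start < r.end ? start : r.end;
    return r;
}
```

Using one routine with a sentinel end value lets truncate and hole punch
share the same page removal loop, which appears to be the motivation
given in the patch description.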
ideally would like to release them back to
the subpool or global pools for other uses. The fallocate() system
call provides an interface for preallocation and hole punching within
files. This patch set adds fallocate functionality to hugetlbfs.
Mike Kravetz (4):
hugetlbfs: truncate_hugepages() takes
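From userspace, the preallocation and hole punching interface mentioned
above is reached through fallocate(2). A minimal sketch follows; the two
wrapper names are hypothetical, the file descriptor is assumed to refer to
a file on a filesystem that supports these operations, and offset and len
would presumably need to be huge page aligned on hugetlbfs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/*
 * Release the blocks backing [offset, offset + len) without changing
 * the file size. Returns 0 on success or a negative errno value.
 */
static int punch_hole(int fd, off_t offset, off_t len)
{
    assert(len > 0);

    /* PUNCH_HOLE must be combined with KEEP_SIZE. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, len) != 0)
        return -errno;
    return 0;
}

/*
 * Preallocate backing for [offset, offset + len), extending the file
 * if needed. Returns 0 on success or a negative errno value.
 */
static int preallocate(int fd, off_t offset, off_t len)
{
    assert(len > 0);

    if (fallocate(fd, 0, offset, len) != 0)
        return -errno;
    return 0;
}
```

A caller would typically check for -EOPNOTSUPP, which is what a filesystem
without fallocate support for the requested mode is expected to return.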
Currently, there is only a single place where hugetlbfs pages are
added to the page cache. The new fallocate code will be adding a second
one, so break the functionality out into its own helper.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 2 ++
mm
it is currently
implemented using fallocate(). MADV_REMOVE lets us remove data
from the middle of a hugetlbfs file, which wasn't possible before.
hugetlbfs fallocate only operates on whole huge pages.
Based-on code-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c
Now that we have hole punching support for hugetlbfs, we can
also support the MADV_REMOVE interface to it.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index d551475
y fault/allocate any
huge pages. The result was the reservation (HugePages_Rsvd)
of sufficient huge pages to cover the mapping. When the program
exited, the reservations remained. If I remove (unlink) the
file the reservations will be removed.
--
Mike Kravetz
On 04/16/2015 11:44 PM, Christoph Hellwig wrote:
On Thu, Apr 16, 2015 at 04:02:58PM -0700, Mike Kravetz wrote:
Now that we have hole punching support for hugetlbfs, we can
also support the MADV_REMOVE interface to it.
Meh. Just use fallocate for any new code..
I don't have the com
On 04/17/2015 12:10 AM, Hillf Danton wrote:
Now that we have hole punching support for hugetlbfs, we can
also support the MADV_REMOVE interface to it.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
()
+* unlock_page because locked by add_to_page_cache()
+*/
+ put_page(page);
Still needed if EEXIST?
Nope. Good catch.
I'll fix this in the next version.
--
Mike Kravetz
On 04/17/2015 10:11 AM, Mike Kravetz wrote:
On 04/17/2015 12:10 AM, Hillf Danton wrote:
Now that we have hole punching support for hugetlbfs, we can
also support the MADV_REMOVE interface to it.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
mm/madvise.c | 2 +-
1 file
On 05/26/2015 04:09 PM, Andrew Morton wrote:
On Tue, 26 May 2015 14:27:10 -0700 Mike Kravetz wrote:
This is a documentation only patch and does not modify any code.
Descriptions of the routines used for reserve map/region tracking
are added.
Confused. This adds comments which are similar
region_add(). In the normal case, we want vma_commit_reservation
to return the same value as the preceding call to vma_needs_reservation.
Create a common __vma_reservation_common routine to help keep the
special case return values in sync.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 72
This is a documentation only patch and does not modify any code.
Descriptions of the routines used for reserve map/region tracking
are added.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 52 ++--
1 file changed, 50 insertions(+), 2 deletions
off parameter commit for easier reading
v2:
Added documentation for the region/reserve map routines
Created common routine for vma_needs_reservation and
vma_commit_reservation to help prevent them from drifting
apart in the future.
Mike Kravetz (3):
mm/hugetlb: document the reserve
.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 34 ++
1 file changed, 30 insertions(+), 4 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b3d3d59..038c84e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1544,7 +1544,7 @@ static struct page *alloc_huge_page
On 05/28/2015 07:01 AM, Davidlohr Bueso wrote:
On Wed, 2015-05-27 at 10:56 -0700, Mike Kravetz wrote:
alloc_huge_page and hugetlb_reserve_pages use region_chg to
calculate the number of pages which will be added to the reserve
map. Subpool and global reserve counts are adjusted based on
the
and do not need to deal with error handling. Future
callers of region_del() (such as fallocate hole punch) will need to
handle this error.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 88 ++--
1 file changed, 62 insertions(+), 26 deletions
t and error handling issues noticed by Hillf Danton
New region_del() routine for region tracking/resv_map of ranges
Fixed several issues found during more extensive testing
Error handling in region_del() when kmalloc() fails still needs
to be addressed
madvise remove support remains
Areas hole punched by fallocate will not have entries in the
region/reserve map. However, shared mappings with min_size subpool
reservations may still have reserved pages. alloc_huge_page needs
to handle this special case and do the proper accounting.
Signed-off-by: Mike Kravetz
---
mm
-by: Mike Kravetz
---
include/linux/hugetlb.h | 10 ++
mm/hugetlb.c| 20
2 files changed, 26 insertions(+), 4 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2050261..bbd072e 100644
--- a/include/linux/hugetlb.h
+++ b
Now that we have hole punching support for hugetlbfs, we can
also support the MADV_REMOVE interface to it.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index d215ea9
only operates on whole huge pages.
Based-on code-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 156 +++-
include/linux/hugetlb.h | 3 +
mm/hugetlb.c| 8 +--
3 files changed, 162 insertions(+), 5 dele
Currently, there is only a single place where hugetlbfs pages are
added to the page cache. The new fallocate code will be adding a second
one, so break the functionality out into its own helper.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 2 ++
mm
). vma_has_reserves is passed "chg" which indicates
whether or not a region/reserve map is present. Use this to determine
if reserves are actually present or were removed via hole punch.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 16 +---
1 file changed, 13 insertions(+), 3 deletion
callers to add 0 as end of range.
Since the routine will be used in hole punch as well as truncate
operations, it is more appropriately renamed to hugetlb_vmdelete_list().
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 25 ++---
1 file changed, 18 insertions(+), 7
() is also modified to take a range of pages.
hugetlb_unreserve_pages is modified to detect an error from
region_del and pass it back to the caller.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 93 +++--
include/linux/hugetlb.h | 4 ++-
mm
On 06/11/2015 03:46 PM, Davidlohr Bueso wrote:
On Thu, 2015-06-11 at 14:01 -0700, Mike Kravetz wrote:
/* Forward declaration */
static int hugetlb_acct_memory(struct hstate *h, long delta);
@@ -3324,7 +3324,8 @@ static u32 fault_mutex_hash(struct hstate *h, struct
mm_struct *mm
On 06/14/2015 11:34 PM, Naoya Horiguchi wrote:
> On Thu, Jun 11, 2015 at 02:01:37PM -0700, Mike Kravetz wrote:
>> Areas hole punched by fallocate will not have entries in the
>> region/reserve map. However, shared mappings with min_size subpool
>> reservations may stil
changes to
be more consistent with other global hugetlb symbols.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 5 +
mm/hugetlb.c| 20 ++--
2 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
only operates on whole huge pages.
Based-on code-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 158 +++-
include/linux/hugetlb.h | 3 +
mm/hugetlb.c| 2 +-
3 files changed, 161 insertions(+), 2 deletions(-)
() is also modified to take a range of pages.
hugetlb_unreserve_pages is modified to detect an error from
region_del and pass it back to the caller.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 98 -
include/linux/hugetlb.h | 4 +-
mm
). vma_has_reserves is passed "chg" which
indicates whether or not a region/reserve map is present. Use
this to determine if reserves are actually present or were removed
via hole punch.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 16 +---
1 file changed, 13 insertions(+), 3
found during more extensive testing
Error handling in region_del() when kmalloc() fails still needs
to be addressed
madvise remove support remains
Mike Kravetz (9):
mm/hugetlb: add region_del() to delete a specific range of entries
mm/hugetlb: expose hugetlb fault mutex for use by fall
and do not need to deal with error handling. Future
callers of region_del() (such as fallocate hole punch) will need to
handle this error.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 88 ++--
1 file changed, 62 insertions(+), 26 deletions
callers to add 0 as end of range.
Since the routine will be used in hole punch as well as truncate
operations, it is more appropriately renamed to hugetlb_vmdelete_list().
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 25 ++---
1 file changed, 18 insertions(+), 7
Now that we have hole punching support for hugetlbfs, we can
also support the MADV_REMOVE interface to it.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index d215ea9
Currently, there is only a single place where hugetlbfs pages are
added to the page cache. The new fallocate code will be adding a second
one, so break the functionality out into its own helper.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 2 ++
mm
Areas hole punched by fallocate will not have entries in the
region/reserve map. However, shared mappings with min_size subpool
reservations may still have reserved pages. alloc_huge_page needs
to handle this special case and do the proper accounting.
Signed-off-by: Mike Kravetz
---
mm
routine for vma_needs_reservation and
vma_commit_reservation to help prevent them from drifting
apart in the future.
Mike Kravetz (3):
mm/hugetlb: document the reserve map/region tracking routines
mm/hugetlb: compute/return the number of regions added by region_add()
mm/hugetlb
This is a documentation only patch and does not modify any code.
Descriptions of the routines used for reserve map/region tracking
are added.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 52 ++--
1 file changed, 50 insertions(+), 2 deletions
region_add(). In the normal case, we want vma_commit_reservation
to return the same value as the preceding call to vma_needs_reservation.
Create a common __vma_reservation_common routine to help keep the
special case return values in sync.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 72
.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 39 +++
1 file changed, 35 insertions(+), 4 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index cd3fc41..75c0eef 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1542,7 +1542,7 @@ static struct page
On 05/27/2015 10:56 AM, Mike Kravetz wrote:
alloc_huge_page and hugetlb_reserve_pages use region_chg to
calculate the number of pages which will be added to the reserve
map. Subpool and global reserve counts are adjusted based on
the output of region_chg. Before the pages are actually added
to
ages are not accounted for when
they are allocated as 'reserves'. It is not until these reserves are actually
used that accounting limits are checked. This 'seems' to align with general
allocation of huge pages within the pool. No accounting is done until they
are actually allocated to a mapping/file.
--
Mike Kravetz
On 05/22/2018 09:41 AM, Reinette Chatre wrote:
> On 5/21/2018 4:48 PM, Mike Kravetz wrote:
>> On 05/21/2018 01:54 AM, Vlastimil Babka wrote:
>>> On 05/04/2018 01:29 AM, Mike Kravetz wrote:
>>>> +/**
>>>> + * find_alloc_contig_pages() --
On 04/04/2018 04:36 AM, Anders Roxell wrote:
> On 14 March 2018 at 02:09, Mike Kravetz wrote:
>> On 03/13/2018 04:42 AM, Anders Roxell wrote:
>>> gcc warns about implicit declaration.
>>>
>>> gcc -D_FILE_OFFSET_BITS=64 -I../../../../include/uapi/
>>>
4.
>
> There is a regression on arm32 in libhugetlbfs/truncate_above_4GB-2M-32
> that also exists in 4.14 and mainline. We'll investigate the root cause
> and report upstream in mainline. I suspect the cause is "hugetlbfs:
> check for pgoff value overflow", but have no
On 03/28/2018 12:06 PM, Mike Kravetz wrote:
> On 03/28/2018 11:44 AM, Dan Rue wrote:
>> On Tue, Mar 27, 2018 at 06:26:40PM +0200, Greg Kroah-Hartman wrote:
>>> This is the start of the stable review cycle for the 4.15.14 release.
>>> There are 105 patches in this seri
han 4GB on 32 bit kernels.
The above is in the commit message. 63489f8e8211 has been sent upstream
and to stable, so cc'ing stable here as well.
I would appreciate some more eyes on this code. There have been several
fixes and we keep running into issues.
Mike Kravetz (1):
hugetlbfs: f
y: Dan Rue
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 22 +-
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index b9a254dcc0e7..8450a1d75dfa 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
pages that are not charged to a memcg. memcg charges in other
code paths seem to happen at huge page allocation time.
--
Mike Kravetz
>
> The page charged to memcg will finally be uncharged at free_huge_page.
>
> Modification of memcontrol.c is for updating of statistical information
On 04/21/2018 09:16 AM, Vlastimil Babka wrote:
> On 04/17/2018 04:09 AM, Mike Kravetz wrote:
>> find_alloc_contig_pages() is a new interface that attempts to locate
>> and allocate a contiguous range of pages. It is provided as a more
>> convenient interface than alloc_co
On 05/01/2018 11:54 PM, TSUKADA Koutaro wrote:
> On 2018/05/02 13:41, Mike Kravetz wrote:
>> What is the reason for not charging pages at allocation/reserve time? I am
>> not an expert in memcg accounting, but I would think the pages should be
>> charged at allocation tim
ssh_to_dbg # sudo ./test_mmap 4
mapping 4 huge pages
address 7f62bba0 read (-)
address 7f62bbc0 read (-)
Connection to dbg closed by remote host.
Connection to dbf closed.
OOM did kick in (lots of console/log output) and killed the shell
as well.
--
Mike Kravetz
libhugetlbfs tests for an unrelated
issue/change and, will do some analysis to see exactly what is happening.
Also, will take it upon myself to run libhugetlbfs test suite on a
regular (at least weekly) basis.
--
Mike Kravetz
to ~25.6%, the
> IPC (instruction per cycle) increased from 0.3 to 0.37, and the time
> spent in user space is reduced ~19.3%
Since this patch only addresses hugetlbfs huge pages, I would suggest
making that more explicit in the commit message. Other than that, the
changes look fine to me.
that another area to consider?
That gets back to Michal's question of a specific use case or generic
optimization. Unless code is simple (as in this patch), seems like we should
hold off on considering additional optimizations unless there is a specific
use case.
I'm still OK with this change.
--
Mike Kravetz
On 3/1/19 5:21 AM, Alexandre Ghiti wrote:
> On 03/01/2019 07:25 AM, Alex Ghiti wrote:
>> On 2/28/19 5:26 PM, Mike Kravetz wrote:
>>> On 2/28/19 12:23 PM, Dave Hansen wrote:
>>>> On 2/28/19 11:50 AM, Mike Kravetz wrote:
>>>>> On 2/28/19
in controls
who can have access to hugetlbfs, so I think adding code to the open
routine as in patch 2 of this series would seem to work.
However, I can imagine more special cases being added for other users. And,
once you have more than one special case then you may want to combine them.
For example, kvm and hugetlbfs together.
--
Mike Kravetz
On 3/12/19 11:00 PM, Peter Xu wrote:
> On Tue, Mar 12, 2019 at 12:59:34PM -0700, Mike Kravetz wrote:
>> On 3/11/19 2:36 AM, Peter Xu wrote:
>>>
>>> The "kvm" entry is a bit special here only to make sure that existing
>>> users like QEMU/KVM won'
tup process enable uffd for all users.
Correct?
This may be too simple, and I don't really like group access, but how about
just defining a uffd group? If you are in the group you can make uffd
system calls.
--
Mike Kravetz
On 3/13/19 4:55 PM, Andrea Arcangeli wrote:
> On Wed, Mar 13, 2019 at 01:01:40PM -0700, Mike Kravetz wrote:
>> On 3/13/19 11:52 AM, Andrea Arcangeli wrote:
>>> Unless somebody suggests a consistent way to make hugetlbfs "just
>>> work" (like we could achi
91.658122] do_mount+0x11f0/0x1640
> [ 91.658125] ksys_mount+0xc0/0xd0
> [ 91.658129] __arm64_sys_mount+0xcc/0xe4
> [ 91.658137] el0_svc_handler+0x28c/0x338
> [ 91.681740] el0_svc+0x8/0xc
>
> Fixes: 2284cf59cbce ("hugetlbfs: Convert to fs_context")
> Signed-off-by:
tructures are only needed for inodes which can have associated
page allocations. To fix the leak, only allocate resv_map for those
inodes which could possibly be associated with page allocations.
Reported-by: Yufen Yu
Suggested-by: Yufen Yu
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 2