We are going to do IO one huge page at a time. For x86-64, that's 512 pages, so
we need to double the current BIO_MAX_PAGES.
To be portable to other architectures we need a more generic solution.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/bio.h | 2 +-
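For illustration, the page-count arithmetic behind the 512-page figure can be sketched as follows; the `EX_`-prefixed constants are stand-ins for the kernel's PAGE_SIZE and HPAGE_PMD_SIZE on x86-64, not the real macros:

```c
#include <assert.h>

/* Stand-ins for the kernel's constants on x86-64 (illustration only). */
#define EX_PAGE_SIZE      (1UL << 12)   /* 4 KiB base page */
#define EX_HPAGE_PMD_SIZE (1UL << 21)   /* 2 MiB PMD-sized huge page */

/* Number of base pages a single bio must cover to submit one huge page. */
static unsigned long ex_pages_per_huge_page(void)
{
        return EX_HPAGE_PMD_SIZE / EX_PAGE_SIZE;
}
```

On other architectures PAGE_SIZE and the PMD size differ, which is why a hard-coded doubling of BIO_MAX_PAGES is not portable.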
-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
mm/truncate.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/mm/truncate.c b/mm/truncate.c
index a01cce450a26..ce904e4b1708 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -504,10 +504,21 @@ unsigne
From: Matthew Wilcox <wi...@infradead.org>
radix_tree_replace_clear_tags() can be called with NULL as the replacement
value; in this case we need to delete sibling entries which point to
the slot.
Signed-off-by: Matthew Wilcox <wi...@infradead.org>
Signed-off-by: Kirill A. Shutemov &
This reverts commit 356e1c23292a4f63cfdf1daf0e0ddada51f32de8.
After conversion of huge tmpfs to multi-order entries, we don't need
this anymore.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/radix-tree.h | 1 -
lib/radix-tree.c
As the function handles zeroing a range only within one block, the
required changes are trivial: just remove the assumption on page size.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/inode.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --gi
Introduce new helpers which return size/mask of the page:
HPAGE_PMD_SIZE/HPAGE_PMD_MASK if the page is PageTransHuge() and
PAGE_SIZE/PAGE_MASK otherwise.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/huge_mm.h | 16
1 file chang
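A standalone sketch of the two helpers described above; a plain flag stands in for the PageTransHuge() check, and the `EX_`-prefixed constants mirror x86-64 values rather than the real kernel macros:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustration-only stand-ins for the kernel constants on x86-64. */
#define EX_PAGE_SIZE       (1UL << 12)
#define EX_PAGE_MASK       (~(EX_PAGE_SIZE - 1))
#define EX_HPAGE_PMD_SIZE  (1UL << 21)
#define EX_HPAGE_PMD_MASK  (~(EX_HPAGE_PMD_SIZE - 1))

/* Returns the effective page size: 2 MiB for a THP, 4 KiB otherwise. */
static unsigned long ex_hpage_size(bool is_trans_huge)
{
        return is_trans_huge ? EX_HPAGE_PMD_SIZE : EX_PAGE_SIZE;
}

/* Returns the matching mask for rounding addresses/offsets. */
static unsigned long ex_hpage_mask(bool is_trans_huge)
{
        return is_trans_huge ? EX_HPAGE_PMD_MASK : EX_PAGE_MASK;
}
```

Callers can then round an offset down to the page boundary with `offset & ex_hpage_mask(...)` without branching on the page type themselves.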
The same four values as in the tmpfs case.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/ext4.h | 5 +
fs/ext4/inode.c | 26 +-
fs/ext4/super.c | 19 +++
3 files changed, 45 insertions(+), 5 deletions(-)
diff --gi
We write back a whole huge page at a time.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
mm/filemap.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/mm/filemap.c b/mm/filemap.c
index ad73b99c5ba7..3d46db277e73 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@
The approach is straightforward: for compound pages we read out the whole
huge page.
For a huge page we cannot have an array of buffer head pointers on the
stack -- it's 4096 pointers on x86-64 -- so 'arr' is allocated with
kmalloc() for huge pages.
Signed-off-by: Kirill A. Shutemov <kirill.sh
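The stack-vs-kmalloc trade-off above comes down to simple arithmetic. A hedged sketch, assuming the minimum ext4 block size of 512 bytes and x86-64 page sizes (the `EX_` names are illustration-only stand-ins):

```c
#include <assert.h>
#include <stddef.h>

#define EX_BLOCK_SIZE     512UL         /* smallest block size           */
#define EX_PAGE_SIZE      (1UL << 12)   /* 4 KiB base page               */
#define EX_HPAGE_PMD_SIZE (1UL << 21)   /* 2 MiB huge page               */

/* Upper bound on buffer heads needed to map one page of the given size. */
static size_t ex_max_bufs(unsigned long page_bytes)
{
        return page_bytes / EX_BLOCK_SIZE;
}
```

For a 4 KiB page the bound is 8 pointers (64 bytes on the stack, fine); for a 2 MiB huge page it is 4096 pointers (32 KiB of pointers on x86-64), far too large for a kernel stack, hence the kmalloc().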
Change ext4_writepage() and the underlying ext4_bio_write_page().
This basically removes the assumption on page size, inferring it from
struct page instead.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/inode.c | 10 +-
fs/ext4/page-io.c | 11 +--
2
It's more or less straightforward.
Most changes are around getting the offset/len within the page right and
zeroing out the desired part of the page.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/buffer.c | 53 +++--
1 file c
split_huge_page() is ready to handle file-backed huge pages; we only
need to remove one guarding VM_BUG_ON_PAGE().
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
mm/huge_memory.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
Call ext4_da_should_update_i_disksize() for the head page, with the
offset relative to the head page.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/inode.c | 7 +++
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
With memory-mapped IO we would lose holes in some cases when we have
THP in the page cache, since we cannot track access at the 4k level in
this case.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/buffer.c | 2 +-
mm/truncate.
?);
- check if memory reclaim process is adequate for huge pages with
backing storage (unnecessary split_huge_page() ?);
- handle shadow entries properly;
- encryption, 1k blocks, bigalloc, ...
Kirill A. Shutemov (27):
mm, shmem: switch huge tmpfs to multi-order radix-tree entries
Revert
Modify mpage_map_and_submit_buffers() to do writeback with huge pages.
This is somewhat unstable. I have a hard time seeing the full picture yet.
More work is required.
Not-yet-signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/inode.
With huge pages in page cache we see tail pages in more code paths.
This patch replaces direct access to struct page fields with macros
which can handle tail pages properly.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/buffer.c | 2 +-
fs/ext4/i
n call radix_tree_for_each_slot() and
radix_tree_replace_slot() in order to turn these retry entries into the
intended new entries. Tags are replicated from the original multiorder
entry into each new entry.
Signed-off-by: Matthew Wilcox <wi...@linux.intel.com>
Signed-off-by: Kirill
We write back a whole huge page at a time. Let's adjust the iteration accordingly.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/mm.h | 1 +
include/linux/pagemap.h | 1 +
mm/page-writeback.c | 17 -
3 files changed, 14 insertions
On Fri, Aug 12, 2016 at 04:34:40PM -0400, Theodore Ts'o wrote:
> On Fri, Aug 12, 2016 at 09:37:43PM +0300, Kirill A. Shutemov wrote:
> > Here's stabilized version of my patchset which intended to bring huge pages
> > to ext4.
>
> So this patch is more about mm level cha
-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/huge_mm.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index de2789b4402c..5c5466ba37df 100644
--- a/include/linux/huge_mm.h
+++ b/i
Adjust the check on whether part of the page is beyond file size, and
apply compound_head() and page_mapping() where appropriate.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/buffer.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/bu
lated code have to be updated.
Note that hugetlb_fault_mutex_hash() and reservation region handling are
still working with hugepage offset.
Signed-off-by: Naoya Horiguchi <n-horigu...@ah.jp.nec.com>
[kirill.shute...@linux.intel.com: reject fixed]
Signed-off-by: Kirill A. Shutemov <kirill.
Adjust how we find the relevant block within the page and how we clear
the required part of the page.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/move_extent.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/move_extent.
-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
mm/filemap.c | 320 +--
mm/huge_memory.c | 47 +---
mm/khugepaged.c | 26 ++---
mm/shmem.c | 36 ++-
4 files changed, 247 insertions(+), 182 deletions(-)
diff --gi
It simply matches changes to __block_write_begin_int().
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/inode.c | 24
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bee21f
e walk, but they will never see NULL for an
index which was populated before the join.
Signed-off-by: Matthew Wilcox <wi...@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/radix-tree.h| 2 +
lib/radix-tree.c
them, before attempting the split. And remove one guarding
VM_BUG_ON_PAGE().
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/buffer_head.h | 1 +
mm/huge_memory.c| 19 ++-
2 files changed, 19 insertions(+), 1 deletion(-)
diff
These flags are in use for filesystems with backing storage: PG_error,
PG_writeback and PG_readahead.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/page-flags.h | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/include
Let's add FileHugePages and FilePmdMapped fields to meminfo and smaps.
They indicate how many file THPs we have allocated and mapped.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
drivers/base/node.c| 6 ++
fs/proc/meminfo.c | 4
fs/proc/task
kernel include files. We also need the
real definition of gfpflags_allow_blocking() to persuade the radix tree
to actually use its preallocated nodes.
Signed-off-by: Matthew Wilcox <wi...@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
tools/
Most of the work happens on the head page. Only when we need to copy
data to userspace do we find the relevant subpage.
We are still limited by PAGE_SIZE per iteration. Lifting this limitation
would require some more work.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
mm/fil
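Locating the relevant subpage within a compound page is a matter of shifting the byte offset. A minimal sketch with plain numbers standing in for struct page arithmetic (the `EX_` names are illustration-only):

```c
#include <assert.h>

#define EX_PAGE_SHIFT 12   /* 4 KiB base pages, as on x86-64 */

/* Index of the subpage (relative to the head page) that contains the
 * given byte offset within the huge page. The kernel would then copy
 * from head_page + index. */
static unsigned long ex_subpage_index(unsigned long offset_in_hpage)
{
        return offset_in_hpage >> EX_PAGE_SHIFT;
}
```

The PAGE_SIZE-per-iteration limit mentioned above follows from copying through one subpage at a time.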
From: Matthew Wilcox <wi...@linux.intel.com>
The radix tree uses its own buggy WARN_ON_ONCE. Replace it with the
definition from asm-generic/bug.h
Signed-off-by: Matthew Wilcox <wi...@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
.
HACK.
That said, I don't think it should prevent huge page support from being
applied. The future will show whether lacking readahead is a big deal
with huge pages in the page cache.
Any suggestions are welcome.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
mm/readahead.
On Thu, Jan 26, 2017 at 07:44:39AM -0800, Matthew Wilcox wrote:
> On Thu, Jan 26, 2017 at 02:57:48PM +0300, Kirill A. Shutemov wrote:
> > For filesystems that wants to be write-notified (has mkwrite), we will
> > encount write-protection faults for huge PMDs in
ext4_find_unwritten_pgoff() needs a few tweaks to work with huge pages.
Mostly trivial page_mapping()/page_to_pgoff() changes and an adjustment
to how we find the relevant block.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/file.c | 18 ++
1 file chang
: "Kirill A. Shutemov" <kirill.shute...@linux.intel.com>
Date: Fri, 12 Aug 2016 19:44:30 +0300
Subject: [PATCH] Add few more configurations to test ext4 with huge pages
Four new configurations: huge_4k, huge_1k, huge_bigalloc, huge_encrypt.
Signed-off-by: Kirill A. Shutemov <kir
__ext4_block_zero_page_range() is adjusted to calculate the starting
iblock correctly for huge pages.
ext4_{collapse,insert}_range() requires page cache invalidation. We need
the invalidation to be aligned to the huge page boundary if huge pages
are possible in the page cache.
Signed-off-by: Kirill A. Shutemov
For filesystems that want to be write-notified (have mkwrite), we will
encounter write-protection faults for huge PMDs in shared mappings.
The easiest way to handle them is to clear the PMD and let it refault as
writable.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
Re
filesystem, but hopefully this change would be
enough to address the concern.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/ext4_jbd2.h | 16 +---
fs/ext4/inode.c | 34 +++---
2 files changed, 40 insertions(+), 10 del
We need to account huge pages according to their size to get background
writeback working properly.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/fs-writeback.c | 10 +++---
include/linux/backing-dev.h | 10 ++
include/linux/memcontrol.h
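The accounting point above boils down to crediting a huge page as many base pages rather than one. A hedged sketch (a flag stands in for the THP check; constants mirror x86-64):

```c
#include <assert.h>
#include <stdbool.h>

#define EX_PAGE_SHIFT      12
#define EX_HPAGE_PMD_SHIFT 21

/* How many base pages this page contributes to writeback/dirty
 * counters: 512 for a 2 MiB THP, 1 for a normal page. */
static unsigned long ex_nr_pages(bool is_trans_huge)
{
        return is_trans_huge ? (1UL << (EX_HPAGE_PMD_SHIFT - EX_PAGE_SHIFT)) : 1;
}
```

Without this, a dirtied huge page would be counted as a single page and background writeback thresholds would be off by a factor of 512.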
The same four values as in the tmpfs case.
Encryption code is not yet ready to handle huge pages, so we disable
huge page support if the inode has EXT4_INODE_ENCRYPT.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/ext4.h | 5 +
fs/ext4/inode.
For huge pages 'stop' must be within HPAGE_PMD_SIZE.
Let's use hpage_size() in the BUG_ON().
We also need to change how we calculate lblk for cluster deallocation.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/inode.c | 5 +++--
1 file changed, 3 inse
The write path allocates pages using pagecache_get_page(). We should be
able to allocate huge pages there, if it's allowed. As usual, fall back
to small pages on failure.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
mm/filemap.c | 17 +++--
1 file chang
to accumulate information from shadow entries to
return to the caller (average eviction time?).
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/fs.h | 5 ++
include/linux/pagemap.h | 21 ++-
mm/filemap.c
Trivial: remove the assumption on page size.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/inode.c | 13 +++--
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8d1b5e63cb15..a25be1cf4506 100644
--
We want mmap(NULL) to return a PMD-aligned address if the inode can
have huge pages in the page cache.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
mm/huge_memory.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_me
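The alignment in question is ordinary round-up arithmetic. A minimal sketch under x86-64 assumptions (the `EX_` constant is an illustration-only stand-in for HPAGE_PMD_SIZE):

```c
#include <assert.h>

#define EX_HPAGE_PMD_SIZE (1UL << 21)   /* 2 MiB PMD size on x86-64 */

/* Round a candidate mmap address up to the next PMD boundary so a huge
 * page can be mapped there without splitting. */
static unsigned long ex_pmd_align(unsigned long addr)
{
        return (addr + EX_HPAGE_PMD_SIZE - 1) & ~(EX_HPAGE_PMD_SIZE - 1);
}
```

An address that is already PMD-aligned is returned unchanged.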
to
HPAGE_PMD_NR);
This would provide balanced exposure of multi-order entries to the rest
of the kernel.
[1] find_get_pages(), pagecache_get_page(), pagevec_lookup(), etc.
[2] find_get_entry(), find_get_entries(), pagevec_lookup_entries(), etc.
Signed-off-by: Kirill A. Shutemov <kirill.sh
On Wed, Feb 08, 2017 at 08:01:13PM -0800, Matthew Wilcox wrote:
> On Thu, Jan 26, 2017 at 02:57:45PM +0300, Kirill A. Shutemov wrote:
> > These flags are in use for filesystems with backing storage: PG_error,
> > PG_writeback and PG_readahead.
>
> Oh ;-) Then I amend
On Thu, Feb 09, 2017 at 01:18:35PM -0800, Matthew Wilcox wrote:
> On Thu, Jan 26, 2017 at 02:57:49PM +0300, Kirill A. Shutemov wrote:
> > Later we can add logic to accumulate information from shadow entires to
> > return to caller (average eviction time?).
>
> I would sa
On Thu, Feb 09, 2017 at 01:55:05PM -0800, Matthew Wilcox wrote:
> On Thu, Jan 26, 2017 at 02:57:50PM +0300, Kirill A. Shutemov wrote:
> > +++ b/mm/filemap.c
> > @@ -1886,6 +1886,7 @@ static ssize_t do_generic_file_read(struct file
> > *filp, loff_t *ppos,
> >
On Wed, Feb 08, 2017 at 07:57:27PM -0800, Matthew Wilcox wrote:
> On Thu, Jan 26, 2017 at 02:57:43PM +0300, Kirill A. Shutemov wrote:
> > +++ b/include/linux/pagemap.h
> > @@ -332,6 +332,15 @@ static inline struct page
> > *grab_cache_page_nowait(struct a
On Thu, Feb 09, 2017 at 07:58:20PM +0300, Kirill A. Shutemov wrote:
> I'll look into it.
I ended up with this (I'll test it more later):
void filemap_map_pages(struct vm_fault *vmf,
pgoff_t start_pgoff, pgoff_t end_pgoff)
{
struct radix_tree_iter iter;
void **s
For huge pages we need to unmap the whole range covered by the huge page.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
mm/truncate.c | 27 +++
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/mm/truncate.c b/mm/truncate.c
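Widening the unmap range to huge page boundaries means rounding the start down and the end up. A hedged sketch (the `EX_` constants mirror x86-64; the kernel would use HPAGE_PMD_MASK):

```c
#include <assert.h>

#define EX_HPAGE_PMD_SIZE (1UL << 21)
#define EX_HPAGE_PMD_MASK (~(EX_HPAGE_PMD_SIZE - 1))

/* Expand [*start, *end) so it covers whole 2 MiB huge pages: start is
 * rounded down, end is rounded up to the next boundary. */
static void ex_widen_to_hpage(unsigned long *start, unsigned long *end)
{
        *start &= EX_HPAGE_PMD_MASK;
        *end = (*end + EX_HPAGE_PMD_SIZE - 1) & EX_HPAGE_PMD_MASK;
}
```

Any range that touches a huge page then unmaps the entire page, never a partial one.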
Modify mpage_map_and_submit_buffers() and mpage_release_unused_pages()
to deal with huge pages.
Mostly the result of trial and error. A critical review would be
appreciated.
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
fs/ext4/inode.
This patch modifies ext4_mpage_readpages() to deal with huge pages.
We read out 2M at once, so we have to allocate (HPAGE_PMD_NR *
blocks_per_page) sector_t entries for that. I'm not entirely happy with
kmalloc() in this codepath, but I don't see any other option.
Signed-off-by: Kirill A. Shutemov
r too few (checked by comparing nr_allocated before
and after the call to radix_tree_split()).
Signed-off-by: Matthew Wilcox <wi...@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
---
include/linux/radix-tree.h|