Re: [-mm PATCH 4/8] Memory controller memory accounting (v2)
YAMAMOTO Takashi wrote:
>> On 7/10/07, YAMAMOTO Takashi <[EMAIL PROTECTED]> wrote:
>>> hi,
>>>
>>>> diff -puN mm/memory.c~mem-control-accounting mm/memory.c
>>>> --- linux-2.6.22-rc6/mm/memory.c~mem-control-accounting 2007-07-05 13:45:18.0 -0700
>>>> +++ linux-2.6.22-rc6-balbir/mm/memory.c 2007-07-05 13:45:18.0 -0700
>>>> @@ -1731,6 +1736,9 @@ gotten:
>>>>  		cow_user_page(new_page, old_page, address, vma);
>>>>  	}
>>>>
>>>> +	if (mem_container_charge(new_page, mm))
>>>> +		goto oom;
>>>> +
>>>>  	/*
>>>>  	 * Re-check the pte - we dropped the lock
>>>>  	 */
>>>
>>> it seems that the page will be leaked on error.
>>
>> You mean meta_page right?
>
> no.  i meant 'new_page'.
>
> YAMAMOTO Takashi

Yes, I see. Thanks for clarifying.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 4/8] Memory controller memory accounting (v2)
> On 7/10/07, YAMAMOTO Takashi <[EMAIL PROTECTED]> wrote:
> > hi,
> >
> > > diff -puN mm/memory.c~mem-control-accounting mm/memory.c
> > > --- linux-2.6.22-rc6/mm/memory.c~mem-control-accounting 2007-07-05 13:45:18.0 -0700
> > > +++ linux-2.6.22-rc6-balbir/mm/memory.c 2007-07-05 13:45:18.0 -0700
> >
> > > @@ -1731,6 +1736,9 @@ gotten:
> > >  		cow_user_page(new_page, old_page, address, vma);
> > >  	}
> > >
> > > +	if (mem_container_charge(new_page, mm))
> > > +		goto oom;
> > > +
> > >  	/*
> > >  	 * Re-check the pte - we dropped the lock
> > >  	 */
> >
> > it seems that the page will be leaked on error.
>
> You mean meta_page right?

no.  i meant 'new_page'.

YAMAMOTO Takashi
Re: [-mm PATCH 4/8] Memory controller memory accounting (v2)
On 7/10/07, YAMAMOTO Takashi <[EMAIL PROTECTED]> wrote:
> hi,
>
> > diff -puN mm/memory.c~mem-control-accounting mm/memory.c
> > --- linux-2.6.22-rc6/mm/memory.c~mem-control-accounting 2007-07-05 13:45:18.0 -0700
> > +++ linux-2.6.22-rc6-balbir/mm/memory.c 2007-07-05 13:45:18.0 -0700
> > @@ -1731,6 +1736,9 @@ gotten:
> >  		cow_user_page(new_page, old_page, address, vma);
> >  	}
> >
> > +	if (mem_container_charge(new_page, mm))
> > +		goto oom;
> > +
> >  	/*
> >  	 * Re-check the pte - we dropped the lock
> >  	 */
>
> it seems that the page will be leaked on error.

You mean meta_page right?

> > @@ -2188,6 +2196,11 @@ static int do_swap_page(struct mm_struct
> >  	}
> >
> >  	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> > +	if (mem_container_charge(page, mm)) {
> > +		ret = VM_FAULT_OOM;
> > +		goto out;
> > +	}
> > +
> >  	mark_page_accessed(page);
> >  	lock_page(page);
> >
>
> ditto.
>
> > @@ -2264,6 +2278,9 @@ static int do_anonymous_page(struct mm_s
> >  		if (!page)
> >  			goto oom;
> >
> > +		if (mem_container_charge(page, mm))
> > +			goto oom;
> > +
> >  		entry = mk_pte(page, vma->vm_page_prot);
> >  		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> >
>
> ditto.
>
> can you check the rest of the patch by yourself?  thanks.

Excellent catch! I'll review the accounting framework and post the
updated version soon.

Balbir
Re: [-mm PATCH 4/8] Memory controller memory accounting (v2)
hi,

> diff -puN mm/memory.c~mem-control-accounting mm/memory.c
> --- linux-2.6.22-rc6/mm/memory.c~mem-control-accounting 2007-07-05 13:45:18.0 -0700
> +++ linux-2.6.22-rc6-balbir/mm/memory.c 2007-07-05 13:45:18.0 -0700
> @@ -1731,6 +1736,9 @@ gotten:
>  		cow_user_page(new_page, old_page, address, vma);
>  	}
>
> +	if (mem_container_charge(new_page, mm))
> +		goto oom;
> +
>  	/*
>  	 * Re-check the pte - we dropped the lock
>  	 */

it seems that the page will be leaked on error.

> @@ -2188,6 +2196,11 @@ static int do_swap_page(struct mm_struct
>  	}
>
>  	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> +	if (mem_container_charge(page, mm)) {
> +		ret = VM_FAULT_OOM;
> +		goto out;
> +	}
> +
>  	mark_page_accessed(page);
>  	lock_page(page);
>

ditto.

> @@ -2264,6 +2278,9 @@ static int do_anonymous_page(struct mm_s
>  		if (!page)
>  			goto oom;
>
> +		if (mem_container_charge(page, mm))
> +			goto oom;
> +
>  		entry = mk_pte(page, vma->vm_page_prot);
>  		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>

ditto.

can you check the rest of the patch by yourself?  thanks.

YAMAMOTO Takashi
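The leak pointed out in this review is easier to see in isolation. What follows is a minimal userspace sketch in C, not kernel code and not the fix that was later posted. Every name in it (demo_alloc_page, demo_release_page, demo_charge_page, fault_leaky, fault_fixed, live_pages, charge_fails) is a hypothetical stand-in invented for illustration; demo_charge_page only models the "0 on success, nonzero on failure" convention that mem_container_charge() has in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct page. */
struct demo_page {
	int refcount;
};

static int live_pages;		/* pages allocated and not yet released */
static bool charge_fails;	/* knob: force the charge step to fail */

static struct demo_page *demo_alloc_page(void)
{
	struct demo_page *page = malloc(sizeof(*page));

	if (page) {
		page->refcount = 1;
		live_pages++;
	}
	return page;
}

static void demo_release_page(struct demo_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount == 0) {
		free(page);
		live_pages--;
	}
}

/* Models the charge convention: 0 on success, nonzero on failure. */
static int demo_charge_page(struct demo_page *page)
{
	(void)page;
	return charge_fails ? -1 : 0;
}

/* Shape flagged in the review: jump to oom while still holding the page. */
static int fault_leaky(void)
{
	struct demo_page *page = demo_alloc_page();

	if (!page)
		goto oom;
	if (demo_charge_page(page))
		goto oom;		/* the reference on page is never dropped */
	demo_release_page(page);	/* stands in for the normal end of life */
	return 0;
oom:
	return -1;
}

/* Same path with a dedicated label that drops the page before failing. */
static int fault_fixed(void)
{
	struct demo_page *page = demo_alloc_page();

	if (!page)
		goto oom;
	if (demo_charge_page(page))
		goto oom_free_page;
	demo_release_page(page);
	return 0;
oom_free_page:
	demo_release_page(page);
oom:
	return -1;
}
```

With charge_fails set, fault_leaky() returns -1 and leaves live_pages one higher than before, while fault_fixed() returns -1 and leaves live_pages unchanged. The same shape applies to all three hunks marked "ditto" above, since each one charges a page that the fault handler already holds.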
[-mm PATCH 4/8] Memory controller memory accounting (v2)
Add the accounting hooks. The accounting is carried out for RSS and Page
Cache (unmapped) pages. There is now a common limit and accounting for both.
RSS is accounted at page_add_*_rmap() and page_remove_rmap() time. Page
cache is accounted at add_to_page_cache() and __remove_from_page_cache().
Swap cache is also accounted for.

Each page's meta_page is protected with a bit in page flags; this makes
handling of race conditions involving simultaneous mappings of a page easier.
A reference count is kept in the meta_page to deal with cases where a page
might be unmapped from the RSS of all tasks, but still lives in the page
cache.

Signed-off-by: Balbir Singh <[EMAIL PROTECTED]>
---

 fs/exec.c                  |    1
 include/linux/memcontrol.h |   11 +++
 include/linux/page-flags.h |    3 +
 mm/filemap.c               |    8 ++
 mm/memcontrol.c            |  132 -
 mm/memory.c                |   22 +++
 mm/migrate.c               |    6 ++
 mm/page_alloc.c            |    3 +
 mm/rmap.c                  |    2
 mm/swap_state.c            |    8 ++
 mm/swapfile.c              |   40 +++--
 11 files changed, 218 insertions(+), 18 deletions(-)

diff -puN fs/exec.c~mem-control-accounting fs/exec.c
--- linux-2.6.22-rc6/fs/exec.c~mem-control-accounting 2007-07-05 13:45:18.0 -0700
+++ linux-2.6.22-rc6-balbir/fs/exec.c 2007-07-05 13:45:18.0 -0700
@@ -51,6 +51,7 @@
 #include <linux/cn_proc.h>
 #include <linux/audit.h>
 #include <linux/signalfd.h>
+#include <linux/memcontrol.h>
 
 #include <asm/uaccess.h>
 #include <asm/mmu_context.h>
diff -puN include/linux/memcontrol.h~mem-control-accounting include/linux/memcontrol.h
--- linux-2.6.22-rc6/include/linux/memcontrol.h~mem-control-accounting 2007-07-05 13:45:18.0 -0700
+++ linux-2.6.22-rc6-balbir/include/linux/memcontrol.h 2007-07-05 18:27:26.0 -0700
@@ -24,6 +24,8 @@ extern void mm_init_container(struct mm_
 extern void mm_free_container(struct mm_struct *mm);
 extern void page_assign_meta_page(struct page *page, struct meta_page *mp);
 extern struct meta_page *page_get_meta_page(struct page *page);
+extern int mem_container_charge(struct page *page, struct mm_struct *mm);
+extern void mem_container_uncharge(struct meta_page *mp);
 
 #else /* CONFIG_CONTAINER_MEM_CONT */
 static inline void mm_init_container(struct mm_struct *mm,
@@ -45,6 +47,15 @@ static inline struct meta_page *page_get
 	return NULL;
 }
 
+static inline int mem_container_charge(struct page *page, struct mm_struct *mm)
+{
+	return 0;
+}
+
+static inline void mem_container_uncharge(struct meta_page *mp)
+{
+}
+
 #endif /* CONFIG_CONTAINER_MEM_CONT */
 
 #endif /* _LINUX_MEMCONTROL_H */
diff -puN include/linux/page-flags.h~mem-control-accounting include/linux/page-flags.h
--- linux-2.6.22-rc6/include/linux/page-flags.h~mem-control-accounting 2007-07-05 13:45:18.0 -0700
+++ linux-2.6.22-rc6-balbir/include/linux/page-flags.h 2007-07-05 13:45:18.0 -0700
@@ -98,6 +98,9 @@
 #define PG_checked		PG_owner_priv_1 /* Used by some filesystems */
 #define PG_pinned		PG_owner_priv_1 /* Xen pinned pagetable */
 
+#define PG_metapage		21	/* Used for checking if a meta_page */
+					/* is associated with a page	*/
+
 #if (BITS_PER_LONG > 32)
 /*
  * 64-bit-only flags build down from bit 31
diff -puN mm/filemap.c~mem-control-accounting mm/filemap.c
--- linux-2.6.22-rc6/mm/filemap.c~mem-control-accounting 2007-07-05 13:45:18.0 -0700
+++ linux-2.6.22-rc6-balbir/mm/filemap.c 2007-07-05 18:26:29.0 -0700
@@ -31,6 +31,7 @@
 #include <linux/syscalls.h>
 #include <linux/cpuset.h>
 #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
+#include <linux/memcontrol.h>
 #include "internal.h"
 
 /*
@@ -116,6 +117,7 @@ void __remove_from_page_cache(struct pag
 {
 	struct address_space *mapping = page->mapping;
 
+	mem_container_uncharge(page_get_meta_page(page));
 	radix_tree_delete(&mapping->page_tree, page->index);
 	page->mapping = NULL;
 	mapping->nrpages--;
@@ -442,6 +444,11 @@ int add_to_page_cache(struct page *page,
 	int error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
 
 	if (error == 0) {
+
+		error = mem_container_charge(page, current->mm);
+		if (error)
+			goto out;
+
 		write_lock_irq(&mapping->tree_lock);
 		error = radix_tree_insert(&mapping->page_tree, offset, page);
 		if (!error) {
@@ -455,6 +462,7 @@ int add_to_page_cache(struct page *page,
 		write_unlock_irq(&mapping->tree_lock);
 		radix_tree_preload_end();
 	}
+out:
 	return error;
 }
 EXPORT_SYMBOL(add_to_page_cache);
diff -puN mm/memcontrol.c~mem-control-accounting mm/memcontrol.c
--- linux-2.6.22-rc6/mm/memcontrol.c~mem-control-accounting 2007-07-05 13:45:18.0 -0700
+++ linux-2.6.22-rc6-balbir/mm/memcontrol.c 2007-07-05 18:27:29.0 -0700
@@ -16,6 +16,9 @@
 #include
 #include
 #include
+#include
+#include
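The patch description says a reference count is kept in the meta_page so that a page stays charged while it is still in the page cache, even after every task has unmapped it. The mm/memcontrol.c part of the patch is cut off in this archive, so the following is only a userspace sketch of that idea in C under stated assumptions, not the patch's implementation. All names (rc_meta_page, rc_page, rc_charge, rc_uncharge, rc_usage, rc_limit) are hypothetical. Locking is left out entirely, whereas the patch protects the meta_page with a page flag bit. The sketch's rc_uncharge takes the page for brevity, whereas mem_container_uncharge() in the patch takes the meta_page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real meta_page layout is not shown here. */
struct rc_meta_page {
	int ref_cnt;		/* users of the page: mappings, page cache */
};

struct rc_page {
	struct rc_meta_page *meta;	/* NULL while the page is uncharged */
};

static long rc_usage;		/* pages currently charged */
static long rc_limit = 2;	/* assumed container limit, in pages */

/* First user charges the container; later users only take a reference. */
static int rc_charge(struct rc_page *page)
{
	struct rc_meta_page *mp;

	if (page->meta) {
		page->meta->ref_cnt++;
		return 0;
	}
	if (rc_usage >= rc_limit)
		return -1;	/* over limit: caller must undo its own work */
	mp = malloc(sizeof(*mp));
	if (!mp)
		return -1;
	mp->ref_cnt = 1;
	page->meta = mp;
	rc_usage++;
	return 0;
}

/* Last user to leave uncharges the container and frees the meta_page. */
static void rc_uncharge(struct rc_page *page)
{
	struct rc_meta_page *mp = page->meta;

	if (!mp)
		return;
	assert(mp->ref_cnt > 0);
	if (--mp->ref_cnt == 0) {
		page->meta = NULL;
		free(mp);
		rc_usage--;
	}
}
```

Charging the same page twice (once for the page cache, once for a mapping) raises rc_usage only once, and the page is uncharged only when the second rc_uncharge() call drops the last reference. A failed rc_charge() leaves the page untouched, so the caller still owns the page and has to release it on its error path.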