[PATCH] Memory controller Add Documentation

2007-08-22 Thread Balbir Singh
Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> --- Documentation/memcontrol.txt | 193 +++ 1 file changed, 193 insertions(+) diff -puN /dev/null Documentation/memcontrol.txt --- /dev/null 2007-06-01 20:42:04.0 +0530 +++ linux-2.6.23-r

compile error: soundbus 2.6.23-rc3-mm1

2007-08-22 Thread Balbir Singh
parameter which was being filled by add_uevent_var() is now gone. -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordo

Re: [PATCH] Memory controller Add Documentation

2007-08-22 Thread Balbir Singh
Randy Dunlap wrote: > On Wed, 22 Aug 2007 18:36:12 +0530 Balbir Singh wrote: > >> Documentation/memcontrol.txt | 193 >> +++ > > Is there some sub-dir that is appropriate for this, such as > vm/ or accounting/ or containers/

Re: [BUG] 2.6.23-rc3-mm1 - kernel BUG at net/core/skbuff.c:95!

2007-08-22 Thread Balbir Singh
, &env->buflen); >> if (!cp) >> return -ENODEV; >> >> _ >> >> have done? > > Does replacing "&length" with "NULL" work? That's what's in the updated > patch. > Hi, Kay, replacing &length

Re: [PATCH] Memory controller Add Documentation

2007-08-22 Thread Balbir Singh
Paul Menage wrote: > On 8/22/07, Balbir Singh <[EMAIL PROTECTED]> wrote: >> >> Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> >> --- >> >> Documentation/memcontrol.txt | 193 >> +++ >> 1

Re: [BUG] 2.6.23-rc3-mm1 - kernel BUG at net/core/skbuff.c:95!

2007-08-22 Thread Balbir Singh
Kay Sievers wrote: > On Thu, 2007-08-23 at 00:34 +0530, Balbir Singh wrote: >> Kay Sievers wrote: >>>> gargh, sorry, that's probably due to my screwed up attempt to fix Kay's >>>> screwed up >>>> gregkh-driver-driver-core-change-add_uevent_

Re: [BUG] 2.6.23-rc3-mm1 oom-killer gets invoked

2007-08-22 Thread Balbir Singh
o 4k). The log shows that OOM occurred several times. Kamalesh, how much memory do you have on the system and what test were you running when you hit this problem? Is the problem reproducible? What is the configured swap size? -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL

Re: [PATCH] Memory controller Add Documentation

2007-08-23 Thread Balbir Singh
s into the documentation (I think it will help developers and users alike). > Writing all above may be too much :) > > I'm sorry if I say something pointless. > No.. not at all! Thank you for reading the documentation and commenting on it. > Thanks, > -Kame > > -

Re: [BUG] 2.6.23-rc3-mm1 - kernel BUG at net/core/skbuff.c:95!

2007-08-23 Thread Balbir Singh
Kay Sievers wrote: > On Thu, 2007-08-23 at 00:34 +0530, Balbir Singh wrote: >> Kay Sievers wrote: >>>> gargh, sorry, that's probably due to my screwed up attempt to fix Kay's >>>> screwed up >>>> gregkh-driver-driver-core-change-add_uevent_

[-mm PATCH 0/10] Memory controller introduction (v7)

2007-08-24 Thread Balbir Singh
t-of-memory mem-control-choose-rss-vs-rss-and-pagecache mem-control-per-container-page-referenced mem-control-documentation -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL

[-mm PATCH 1/10] Memory controller resource counters (v7)

2007-08-24 Thread Balbir Singh
PROTECTED]> Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> --- include/linux/res_counter.h | 102 + init/Kconfig|7 ++ kernel/Makefile |1 kernel/res_counter.c| 120 +++

[-mm PATCH 2/10] Memory controller containers setup (v7)

2007-08-24 Thread Balbir Singh
Changelog 1. use depends instead of select in init/Kconfig 2. Port to v11 3. Clean up the usage of names (container files) for v11 Setup the memory container and add basic hooks and controls to integrate and work with the container. Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> --- i

[-mm PATCH 3/10] Memory controller accounting setup (v7)

2007-08-24 Thread Balbir Singh
ad of using mem_container_from_cont() along with task_container. Basic setup routines, the mm_struct has a pointer to the container that it belongs to and the page has a page_container associated with it. Signed-off-by: Pavel Emelianov <[EMAIL PROTECTED]> Signed-off-by: Balbir Singh <[EMAIL PROTECTED]&

[-mm PATCH 4/10] Memory controller memory accounting (v7)

2007-08-24 Thread Balbir Singh
Srinivasan <[EMAIL PROTECTED]> Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> --- include/linux/memcontrol.h | 20 + mm/filemap.c | 12 ++- mm/memcontrol.c| 166 - mm/memory.c| 43 +++

[-mm PATCH 5/10] Memory controller task migration (v7)

2007-08-24 Thread Balbir Singh
Allow tasks to migrate from one container to the other. We migrate mm_struct's mem_container only when the thread group id migrates. Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> --- mm/memcontrol.c | 35 +++ 1 file changed, 35 insertions(+) d
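The migration rule described above (the mm's container follows only the thread group leader) can be modelled in userspace roughly as follows. This is a minimal sketch, not the patch's code: struct mm_model, struct task_model and migrate_task() are hypothetical names, and a container is reduced to an integer id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the patch. */
struct mm_model {
	int container_id;	/* container the address space is charged to */
};

struct task_model {
	int pid;		/* thread id */
	int tgid;		/* thread group id (pid of the group leader) */
	struct mm_model *mm;	/* address space shared by the thread group */
};

/* A task is the thread group leader when its pid equals its tgid. */
static bool is_group_leader(const struct task_model *t)
{
	return t->pid == t->tgid;
}

/*
 * Move a task to new_container. The shared mm is re-pointed only when the
 * migrating task is the group leader; other threads leave it untouched.
 * Returns true if the mm's container changed.
 */
static bool migrate_task(struct task_model *t, int new_container)
{
	assert(t != NULL && t->mm != NULL);
	if (!is_group_leader(t))
		return false;
	if (t->mm->container_id == new_container)
		return false;
	t->mm->container_id = new_container;
	return true;
}
```

Under this model, migrating a non-leader thread is a no-op for memory accounting, which is the policy a later reply in this thread describes as treating the mm as owned by the thread group leader.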

[-mm PATCH 6/10] Memory controller add per container LRU and reclaim (v7)

2007-08-24 Thread Balbir Singh
ned-off-by: Pavel Emelianov <[EMAIL PROTECTED]> Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> --- include/linux/memcontrol.h | 12 +++ include/linux/res_counter.h | 23 +++ include/linux/swap.h|3 mm/memcontrol.c | 135 +

[-mm PATCH 7/10] Memory controller OOM handling (v7)

2007-08-24 Thread Balbir Singh
ndling. Signed-off-by: Pavel Emelianov <[EMAIL PROTECTED]> Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> --- include/linux/memcontrol.h |1 + mm/memcontrol.c|1 + mm/oom_kill.c | 42 ++ 3 files changed, 4

[-mm PATCH 8/10] Memory controller add switch to control what type of pages to limit (v7)

2007-08-24 Thread Balbir Singh
Choose if we want cached pages to be accounted or not. By default both are accounted for. A new set of tunables are added. echo -n 1 > mem_control_type switches the accounting to account for only mapped pages echo -n 3 > mem_control_type switches the behaviour back Signed-off-by:
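One way to read the two values above is as a bitmask, with bit 0 selecting mapped pages and bit 1 selecting cached pages. That bit layout is an assumption: the message only states what the values 1 and 3 do. The sketch below decodes a mem_control_type value under that assumption; the enum and helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit layout; the source only gives the meaning of values 1 and 3. */
enum {
	ACCT_MAPPED = 1 << 0,			/* mapped (RSS) pages */
	ACCT_CACHED = 1 << 1,			/* page cache pages (assumption) */
	ACCT_ALL    = ACCT_MAPPED | ACCT_CACHED	/* 3, the default */
};

/* True if mapped pages are accounted for the given control type. */
static bool accounts_mapped(unsigned int control_type)
{
	assert((control_type & ~(unsigned int)ACCT_ALL) == 0);
	return (control_type & ACCT_MAPPED) != 0;
}

/* True if cached pages are accounted for the given control type. */
static bool accounts_cached(unsigned int control_type)
{
	assert((control_type & ~(unsigned int)ACCT_ALL) == 0);
	return (control_type & ACCT_CACHED) != 0;
}
```

With this reading, writing 1 leaves only mapped pages accounted and writing 3 restores accounting of both kinds, matching the two echo commands quoted above.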

[-mm PATCH 9/10] Memory controller make page_referenced() container aware (v7)

2007-08-24 Thread Balbir Singh
when they are not actively referenced from the container that brought them in Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> --- include/linux/memcontrol.h |6 ++ include/linux/rmap.h |5 +++-- mm/memcontrol.c|5 + mm/rmap.c

[-mm PATCH 10/10] Memory controller add documentation

2007-08-24 Thread Balbir Singh
Changelog since version 1 1. Wording and punctuation comments - Randy Dunlap 2. Differentiate between RSS and Page Cache - Paul Menage 3. Add detailed description of features - KAMEZAWA Hiroyuki 4. Fix a typo (drop_pages should be drop_caches) - YAMAMOTO Takashi Signed-off-by: Balbir Singh

Re: [PATCH] Add all thread stats for TASKSTATS_CMD_ATTR_TGID

2007-08-25 Thread Balbir Singh
Guillaume Chazarain wrote: > Le Mon, 20 Aug 2007 22:31:08 +0530, > Balbir Singh <[EMAIL PROTECTED]> a écrit : > >>> --- a/kernel/taskstats.cSat Aug 18 17:15:17 2007 -0700 >>> +++ b/kernel/taskstats.cSun Aug 19 17:20:15 2007 +0200 >>> @@ -246,6

Re: [-mm PATCH 5/10] Memory controller task migration (v7)

2007-08-27 Thread Balbir Singh
ently, we treat the mm as owned by the thread group leader. But this policy can be easily adapted to any other desired policy. Would you like to see it change to something else? -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL

Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled

2015-05-28 Thread Balbir Singh
HEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT) > ONE("schedstat", S_IRUGO, proc_pid_schedstat), > #endif > #ifdef CONFIG_LATENCYTOP > -- The change looks reasonable, from what I can understand you want these changes so that you can use /proc/<pid>/schedstat instead of the ne

Re: [RFC] virtio: Use DMA MAP API for devices without an IOMMU

2018-04-05 Thread Balbir Singh
On Thu, Apr 5, 2018 at 8:56 PM, Anshuman Khandual wrote: > There are certain platforms which would like to use SWIOTLB based DMA API > for bouncing purpose without actually requiring an IOMMU back end. But the > virtio core does not allow such mechanism. Right now DMA MAP API is only > selected fo

Re: [RFC][PATCH] memcg: Replace mm->owner with mm->memcg

2018-05-02 Thread Balbir Singh
> }
>
> +/**
> + * mm_update_memcg - Update the memory cgroup of a mm_struct
> + * @mm: mm struct
> + * @new: new memory cgroup value
> + *
> + * Called whenever mm->memcg needs to change. Consumes a reference
> + * to new (unless new is NULL). The reference to the old memory
> + * cgroup is decreased.
> + */
> +void mm_update_memcg(struct mm_struct *mm, struct mem_cgroup *new)
> +{
> +	/* This is the only place where mm->memcg is changed */
> +	struct mem_cgroup *old;
> +
> +	old = xchg(&mm->memcg, new);
> +	if (old)
> +		css_put(&old->css);
> +}
> +
> +static void task_update_memcg(struct task_struct *tsk, struct mem_cgroup *new)
> +{
> +	struct mm_struct *mm;
> +	task_lock(tsk);
> +	mm = tsk->mm;
> +	if (mm && !(tsk->flags & PF_KTHREAD))
> +		mm_update_memcg(mm, new);
> +	task_unlock(tsk);
> +}
> +
> +static void mem_cgroup_attach(struct cgroup_taskset *tset)
> +{
> +	struct cgroup_subsys_state *css;
> +	struct task_struct *tsk;
> +
> +	cgroup_taskset_for_each(tsk, css, tset) {
> +		struct mem_cgroup *new = mem_cgroup_from_css(css);
> +		css_get(css);
> +		task_update_memcg(tsk, new);

I'd have to go back and check, and I think your comment refers to this, but we don't expect non-tgid tasks to show up here? My concern is that I can't find the guarantee that task_update_memcg(tsk, new) is not
1. duplicated for each thread in the process or attached to the mm, and
2. updating mm->memcg to point to different places, so that the one that sticks is the one that updated things last.
Balbir Singh
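The concern raised above can be illustrated with a userspace model of the exchange-and-put pattern. This is a sketch under stated assumptions, not kernel code: struct group, struct addr_space and the helper names are hypothetical, C11 atomic_exchange() stands in for xchg(), and a plain integer stands in for the css reference count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for mem_cgroup and mm_struct. */
struct group {
	int refs;			/* simplified, non-atomic reference count */
};

struct addr_space {
	_Atomic(struct group *) group;	/* plays the role of mm->memcg */
};

static void group_get(struct group *g)
{
	g->refs++;
}

static void group_put(struct group *g)
{
	assert(g->refs > 0);
	g->refs--;
}

/* Swap in new_group, consuming the caller's reference, and drop the old one. */
static void update_group(struct addr_space *as, struct group *new_group)
{
	struct group *old = atomic_exchange(&as->group, new_group);

	if (old)
		group_put(old);
}

/* Attach each thread of one process in turn; every call takes a reference. */
static void attach_threads(struct addr_space *as, struct group **per_thread,
			   size_t n)
{
	for (size_t i = 0; i < n; i++) {
		group_get(per_thread[i]);
		update_group(as, per_thread[i]);
	}
}
```

If threads sharing one address space are attached to different groups, the references stay balanced in this model, but the group recorded in the address space is simply whichever update ran last, which is the second point in the list above.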

Re: [PATCH v3 0/9] klp-convert livepatch build tooling

2019-04-15 Thread Balbir Singh
.c. > > use-case 3: There is a relocation in the lp that cannot be automatically > resolved similarly as 2, but no annotation was provided in the > livepatch, triggering an error during compilation. Reproducible by > removing the KLP_MODULE_RELOC / KLP_SYMPOS annotation sections in > lib/

Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return

2019-02-05 Thread Balbir Singh
On Tue, Feb 5, 2019 at 10:24 PM Michael Ellerman wrote: > > Balbir Singh writes: > > On Sat, Feb 2, 2019 at 12:14 PM Balbir Singh wrote: > >> > >> On Tue, Jan 22, 2019 at 10:57:21AM -0500, Joe Lawrence wrote: > >> > From: Nicolai Stange > >>

Re: [PATCH] powerpc/powernv/npu: Remove redundant change_pte() hook

2019-02-05 Thread Balbir Singh
e looks good to me as well. > > Reviewed-by: Alistair Popple > I checked the three callers of set_pte_at_notify and the assumption seems correct Reviewed-by: Balbir Singh

Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return

2019-02-06 Thread Balbir Singh
On Wed, Feb 6, 2019 at 3:44 PM Michael Ellerman wrote: > > Balbir Singh writes: > > On Tue, Feb 5, 2019 at 10:24 PM Michael Ellerman > > wrote: > >> Balbir Singh writes: > >> > On Sat, Feb 2, 2019 at 12:14 PM Balbir Singh > >> > wrote: &g

Re: [PATCH 0/5] [v4] Allow persistent memory to be used like normal RAM

2019-01-28 Thread Balbir Singh
> this newly-added memory can be selected by its unique NUMA > node. NUMA is a distance-based topology; does HMAT solve these problems? How do we prevent fallback nodes of normal nodes being pmem nodes? On an unexpected crash/failure, is there a scrubbing mechanism or do we rely on the allocator to do the right thing prior to reallocating any memory? Will frequent zeroing hurt the lifetime of NVDIMM/pmem? Balbir Singh.

Re: [RFC PATCH v2] taskstats: add /proc/taskstats to fetch pid/tgid status

2021-02-10 Thread Balbir Singh
icts in ioctl numbers, inability to check the types of the parameters passed in and out makes it not so good. Not to mention versioning issues; with the genl interface we have the flexibility to version requests. I would really hate to have two ways to do the same thing. The overhead is there, but do you see an overhead of 20ms per 10,000 calls as significant? Does it affect your use case significantly? Balbir Singh

Re: [RFC PATCH v2] taskstats: add /proc/taskstats to fetch pid/tgid status

2021-02-04 Thread Balbir Singh
On Sun, Jan 31, 2021 at 05:16:47PM +0800, Weiping Zhang wrote: > On Wed, Jan 27, 2021 at 7:13 PM Balbir Singh wrote: > > > > On Fri, Jan 22, 2021 at 10:07:50PM +0800, Weiping Zhang wrote: > > > Hello Balbir Singh, > > > > > > Could you help review thi

Re: [RFC PATCH v2] taskstats: add /proc/taskstats to fetch pid/tgid status

2021-02-04 Thread Balbir Singh
On Thu, Feb 04, 2021 at 10:37:20PM +0800, Weiping Zhang wrote: > On Thu, Feb 4, 2021 at 6:20 PM Balbir Singh wrote: > > > > On Sun, Jan 31, 2021 at 05:16:47PM +0800, Weiping Zhang wrote: > > > On Wed, Jan 27, 2021 at 7:13 PM Balbir Singh > > > wrote: > >

[PATCH v4 0/5] Next revision of the L1D flush patches

2021-01-08 Thread Balbir Singh
at boot time, second by the application - Rename l1d_flush_out/L1D_FLUSH_OUT to l1d_flush/L1D_FLUSH - Implement other review recommendations Changelog v3: - Implement the SIGBUS mechanism - Update and fix the documentation Balbir Singh (5): x86/smp: Add a per-cpu view of SMT state x86/mm

[PATCH v4 1/5] x86/smp: Add a per-cpu view of SMT state

2021-01-08 Thread Balbir Singh
. Suggested-by: Thomas Gleixner Signed-off-by: Balbir Singh --- arch/x86/include/asm/processor.h | 2 ++ arch/x86/kernel/smpboot.c| 10 +- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index

[PATCH v4 5/5] Documentation: Add L1D flushing Documentation

2021-01-08 Thread Balbir Singh
Add documentation of l1d flushing, explain the need for the feature and how it can be used. Signed-off-by: Balbir Singh Signed-off-by: Thomas Gleixner --- Documentation/admin-guide/hw-vuln/index.rst | 1 + .../admin-guide/hw-vuln/l1d_flush.rst | 70 +++ .../admin

[PATCH v4 4/5] prctl: Hook L1D flushing in via prctl

2021-01-08 Thread Balbir Singh
ivery). There is also no seccomp integration for the feature. Suggested-by: Thomas Gleixner Signed-off-by: Balbir Singh Signed-off-by: Thomas Gleixner --- arch/Kconfig | 4 ++ arch/x86/Kconfig | 1 + arch/x86/include/asm/nospec-branch.h | 2 + arc

[PATCH v4 2/5] x86/mm: Refactor cond_ibpb() to support other use cases

2021-01-08 Thread Balbir Singh
: Balbir Singh Signed-off-by: Thomas Gleixner Link: https://lkml.kernel.org/r/20200510014803.12190-4-sbl...@amazon.com Link: https://lore.kernel.org/r/20200729001103.6450-3-sbl...@amazon.com --- arch/x86/include/asm/tlbflush.h | 2 +- arch/x86/mm/tlb.c | 53

[PATCH v4 3/5] x86/mm: Optionally flush L1D on context switch

2021-01-08 Thread Balbir Singh
called only when HW assisted flushing is available. Suggested-by: Thomas Gleixner Signed-off-by: Balbir Singh Signed-off-by: Thomas Gleixner Link: https://lore.kernel.org/r/20200729001103.6450-4-sbl...@amazon.com --- arch/x86/include/asm/cacheflush.h | 8 arch/x86/include/asm

Re: [PATCH v1 0/3] drivers/char: remove /dev/kmem for good

2021-03-24 Thread Balbir Singh
ycles. I guess not all code can be accumulated under a single hierarchy. May not be worth the effort, just thinking out loud. Balbir Singh

Re: [PATCH v4 01/25] mm: Introduce struct folio

2021-03-18 Thread Balbir Singh
he caller > guarantees that the pointer it is passing does not point to a tail page. > Is this a part of a larger use case or general cleanup/refactor where the split between page and folio simplifies programming? Balbir Singh.

Re: [PATCH v11 0/6] KASAN for powerpc64 radix

2021-03-19 Thread Balbir Singh
. Both 64k and 4k pages work. Running as a KVM host works, but > nothing in arch/powerpc/kvm is instrumented. It's also potentially a bit > fragile - if any real mode code paths call out to instrumented code, things > will go boom. > The last time I checked, the changes for real mode made the code hard to review/maintain. I am happy to see that we've decided to leave that off the table for now. Reviewing the series. Balbir Singh.

Re: [PATCH v11 1/6] kasan: allow an architecture to disable inline instrumentation

2021-03-19 Thread Balbir Singh
VE_ARCH_KASAN_HW_TAGS > config HAVE_ARCH_KASAN_VMALLOC > bool > > +config ARCH_DISABLE_KASAN_INLINE > + def_bool n > + Some comments on which architectures want to disable KASAN inline, and why, would be helpful. Balbir Singh.

Re: [PATCH v4 01/25] mm: Introduce struct folio

2021-03-19 Thread Balbir Singh
On Fri, Mar 19, 2021 at 01:25:27AM +, Matthew Wilcox wrote: > On Fri, Mar 19, 2021 at 10:56:45AM +1100, Balbir Singh wrote: > > On Fri, Mar 05, 2021 at 04:18:37AM +, Matthew Wilcox (Oracle) wrote: > > > A struct folio refers to an entire (possibly compound) page. A fu

Re: [PATCH v11 6/6] powerpc: Book3S 64-bit outline-only KASAN support

2021-03-20 Thread Balbir Singh
ny code that runs with translations off after > booting. Take this approach for now and require outline instrumentation. > > Previous attempts allowed inline instrumentation. However, they came with > some unfortunate restrictions: only physically contiguous memory could be > used and

Re: [PATCH v17 4/9] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page

2021-03-05 Thread Balbir Singh
On Thu, Feb 25, 2021 at 09:21:25PM +0800, Muchun Song wrote: > When we free a HugeTLB page to the buddy allocator, we should allocate > the vmemmap pages associated with it. But we may be unable to allocate vmemmap > pages when the system is under memory pressure, in this case, we just > refuse to free t

Re: [RFC PATCH 00/15] Use obj_cgroup APIs to charge the LRU pages

2021-03-30 Thread Balbir Singh
ake > them the liability of jobs in the system that DON'T share the same fs. > > But again, this is a useful discussion to have, but I don't quite see > why it's relevant to Muchun's patches. They're purely an optimization. > > So I'd like to clear that up first before going further. > I suspect a lot of the issue really is the lack of lockstepping between a page (unmapped page cache) and the corresponding memcgroups lifecycle. When we delete a memcgroup, we sort of lose accounting (depending on the inheriting parent) and ideally we want to bring back the accounting when the page is reused in a different cgroup (almost like first touch). I would like to look at the patches and see if they do solve the issue that leads to zombie cgroups hanging around. In my experience, the combination of namespaces and number of cgroups (several of which could be zombies), does not scale well. Balbir Singh.

Re: [PATCH v17 1/9] mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c

2021-03-03 Thread Balbir Singh
c > > @@ -0,0 +1,124 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* > > + * linux/mm/bootmem_info.c > > + * > > + * Copyright (C) > > Looks like incomplete > Not that my comment was complete; I should have said "The copyright looks very incomplete". Balbir Singh.

Re: [External] Re: [PATCH v17 0/9] Free some vmemmap pages of HugeTLB page

2021-03-03 Thread Balbir Singh
> > > [ASCII diagram garbled in the archive: a 2MB HugeTLB page and its vmemmap pages, of which boxes 4 to 7 are visible]
> > >
> > > When a HugeTLB is freed to the buddy system, we should allocate 6 pages for
> > > vmemmap pages and restore the previous mapping relationship.
> >
> > Can these 6 pages come from the hugeTLB page itself? When you say 6 pages,
> > I presume you mean 6 pages of PAGE_SIZE
>
> There was a decent discussion about this in a previous version of the
> series starting here:
>
> https://lore.kernel.org/linux-mm/20210126092942.GA10602@linux/
>
> In this thread various other options were suggested and discussed.
>
Thanks, Balbir Singh

Re: [PATCH v17 2/9] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP

2021-03-03 Thread Balbir Singh
> > Signed-off-by: Muchun Song > Reviewed-by: Oscar Salvador > Acked-by: Mike Kravetz > Reviewed-by: Miaohe Lin > --- Reviewed-by: Balbir Singh

Re: [PATCH v11 6/6] powerpc: Book3S 64-bit outline-only KASAN support

2021-03-21 Thread Balbir Singh
we start at c00e... > >> + */ > >> + > > > > assuming we have > > #define VMEMMAP_END R_VMEMMAP_END > > and ditto for hash we probably need > > > > BUILD_BUG_ON(VMEMMAP_END + KASAN_SHADOW_OFFSET != KASAN_SHADOW_END); > > Sorry, I'm not sure what this is supposed to be testing? In what > situation would this trigger? > I am a bit concerned that we have hard coded (IIRC) 0xa80e... in the config; any changes to VMEMMAP_END, KASAN_SHADOW_OFFSET/END should be guarded. Balbir Singh.

Re: [PATCH v2] kernel/resource: Fix locking in request_free_mem_region

2021-03-25 Thread Balbir Singh
!=
> +		if (__region_intersects(addr, size, 0, IORES_DESC_NONE) !=
> 				REGION_DISJOINT)
> 			continue;
>
> -		if (dev)
> -			res = devm_request_mem_region(dev, addr, size, name);
> -		else
> -			res = request_mem_region(addr, size, name);
> -		if (!res)
> -			return ERR_PTR(-ENOMEM);
> +		if (!request_region_locked(&iomem_resource, res, addr,
> +					   size, name, 0))
> +			break;
> +
> 		res->desc = IORES_DESC_DEVICE_PRIVATE_MEMORY;
> +		if (dev) {
> +			dr->parent = &iomem_resource;
> +			dr->start = addr;
> +			dr->n = size;
> +			devres_add(dev, dr);
> +		}
> +
> +		write_unlock(&resource_lock);
> 		return res;
> 	}
>
> +	write_unlock(&resource_lock);
> +	free_resource(res);
> +
> 	return ERR_PTR(-ERANGE);
> }
>
Balbir Singh.

Re: [PATCH v2] kernel/resource: Fix locking in request_free_mem_region

2021-03-28 Thread Balbir Singh
On Mon, Mar 29, 2021 at 12:55:15PM +1100, Alistair Popple wrote: > On Friday, 26 March 2021 4:15:36 PM AEDT Balbir Singh wrote: > > On Fri, Mar 26, 2021 at 12:20:35PM +1100, Alistair Popple wrote: > > > +static int __region_intersects(resource_size_t

Re: [PATCH] delayacct: clear right task's flag after blkio completes

2021-04-19 Thread Balbir Singh
ret = VM_FAULT_HWPOISON;
> -			delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> +			delayacct_clear_flag(current, DELAYACCT_PF_SWAPIN);
> 			goto out_release;
> 		}
>
> 	locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
>
> -	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> +	delayacct_clear_flag(current, DELAYACCT_PF_SWAPIN);
> 	if (!locked) {
> 		ret |= VM_FAULT_RETRY;
> 		goto out_release;

Acked-by: Balbir Singh
The changes seem reasonable to me. I don't maintain a git tree; Andrew, can we please queue them up in your tree?
Balbir Singh.
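The quoted hunks change the helper so that the task is passed explicitly instead of being implied. A minimal userspace sketch of why that matters is shown below. The struct, the helper names and the flag values are all hypothetical and only illustrate the idea of a per-task flag word; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values; the kernel's actual definitions may differ. */
#define DEMO_PF_SWAPIN 0x1u
#define DEMO_PF_BLKIO  0x2u

/* Hypothetical stand-in for the per-task delay accounting state. */
struct demo_task {
	unsigned int delay_flags;
};

/* Set a flag on the task that is about to wait. */
static void demo_set_flag(struct demo_task *t, unsigned int flag)
{
	assert(t != NULL);
	t->delay_flags |= flag;
}

/* Clear a flag on an explicitly named task, not on whoever is running. */
static void demo_clear_flag(struct demo_task *t, unsigned int flag)
{
	assert(t != NULL);
	t->delay_flags &= ~flag;
}

/* Non-zero if the flag is currently set on the task. */
static int demo_test_flag(const struct demo_task *t, unsigned int flag)
{
	assert(t != NULL);
	return (t->delay_flags & flag) != 0;
}
```

Because the caller names the task, code that finishes I/O on behalf of a waiter can clear the waiter's flag without touching the flags of the task that happens to be running.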

Re: [RESEND PATCH 1/2] delayacct: refactor the code to simplify the implementation

2021-04-19 Thread Balbir Singh
> > Signed-off-by: Chunguang Xu The approach seems to make sense, but the test robot has found a few issues, can you correct those as applicable please? Balbir Singh.

Re: [PATCH] bpf: Fix backport of "bpf: restrict unknown scalars of mixed signed bounds for unprivileged"

2021-04-19 Thread Balbir Singh
f: restrict unknown scalars of mixed signed bounds > for unprivileged") > Signed-off-by: Samuel Mendoza-Jonas > Reviewed-by: Frank van der Linden > Reviewed-by: Ethan Chen > --- Thanks for catching it :) Reviewed-by: Balbir Singh

[PATCH v1 0/3] Fixes to L1D flushing (on top of linux-next and

2020-11-17 Thread Balbir Singh
y the SIGBUS behaviour, there needs to be contention on the CPU where the task that opts into L1D flushing is running to see the SIGBUS being sent to it (the deterministic bit is that if there is scope of data leak the task will get killed) Balbir Singh (3): x86/mm: change l1d flush runtime

[PATCH v1 1/3] x86/mm: change l1d flush runtime prctl behaviour

2020-11-17 Thread Balbir Singh
Detection of task affinities at API opt-in time is not the best approach; the better approach is to kill the task if it runs on an SMT-enabled core. This is better than not flushing the L1D cache when the task switches from a non-SMT core to an SMT-enabled core. Signed-off-by: Balbir Singh --- To be
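For readers who want to inspect the per-task state from userspace, the sketch below queries the speculation control with prctl(). It is an illustration under assumptions: the constant name PR_SPEC_L1D_FLUSH and its value are taken from mainline headers as later merged and may not match this patch revision, and l1d_flush_query() is a hypothetical helper. It deliberately only queries and does not opt in, since a task that opts in may be killed as described above.

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values as in mainline linux/prctl.h. */
#ifndef PR_GET_SPECULATION_CTRL
#define PR_GET_SPECULATION_CTRL 52
#endif
#ifndef PR_SPEC_L1D_FLUSH
#define PR_SPEC_L1D_FLUSH 2	/* assumption relative to this patch revision */
#endif

/*
 * Query the L1D flush speculation control for the calling task.
 * Returns the PR_SPEC_* state bits (>= 0) on success, or a negative
 * errno value when the kernel or hardware does not support the control.
 */
static int l1d_flush_query(void)
{
	int ret = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, 0, 0, 0);

	if (ret < 0) {
		assert(errno > 0);
		return -errno;
	}
	return ret;
}
```

A negative return here is expected on kernels without the feature or on systems where it is not enabled, so callers should treat it as "not available" rather than as a fatal error.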

[PATCH v1 3/3] Documentation/l1d_flush: Fix up warning with labels

2020-11-17 Thread Balbir Singh
Add a label to spec_set_ctrl to remove the build warning. Signed-off-by: Balbir Singh --- To be applied on top of tip commit id 767d46ab566dd489733666efe48732d523c8c332 Documentation/admin-guide/hw-vuln/l1d_flush.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a

[PATCH v1 2/3] Documentation: Update the new SIGBUS behaviour for tasks

2020-11-17 Thread Balbir Singh
Update the documentation to mention that a SIGBUS will be sent to tasks that opt into L1D flushing and execute on SMT-enabled cores. Signed-off-by: Balbir Singh --- To be applied on top of tip commit id 767d46ab566dd489733666efe48732d523c8c332 Documentation/admin-guide/hw-vuln/l1d_flush.rst | 8

Re: [PATCH -tip 14/32] sched: migration changes for core scheduling

2020-11-30 Thread Balbir Singh
On Thu, Nov 26, 2020 at 05:26:31PM +0800, Li, Aubrey wrote: > On 2020/11/26 16:32, Balbir Singh wrote: > > On Thu, Nov 26, 2020 at 11:20:41AM +0800, Li, Aubrey wrote: > >> On 2020/11/26 6:57, Balbir Singh wrote: > >>> On Wed, Nov 25, 2020 at 11:12:53AM +0800, Li, Aubr

Re: [PATCH -tip 22/32] sched: Split the cookie and setup per-task cookie on fork

2020-11-30 Thread Balbir Singh
set cgroup tag to 0 when the loop is done below.
> 	 */
> 	while ((p = css_task_iter_next(&it))) {
> -		p->core_cookie = !!val ? (unsigned long)tg : 0UL;
> -
> -		if (sched_core_enqueued(p)) {
> -			sched_core_dequeue(task_rq(p), p);
> -			if (!p->core_cookie)
> -				continue;
> -		}
> -
> -		if (sched_core_enabled(task_rq(p)) &&
> -		    p->core_cookie && task_on_rq_queued(p))
> -			sched_core_enqueue(task_rq(p), p);
> +		unsigned long cookie = !!val ? (unsigned long)tg : 0UL;
>
> +		sched_core_tag_requeue(p, cookie, true /* group */);
> 	}
> 	css_task_iter_end(&it);
>
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index 60a922d3f46f..8c452b8010ad 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -1024,6 +1024,10 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
> 		__PS("clock-delta", t1-t0);
> 	}
>
> +#ifdef CONFIG_SCHED_CORE
> +	__PS("core_cookie", p->core_cookie);
> +#endif
> +
> 	sched_show_numa(p, m);
> }
>
Balbir Singh.

Re: [PATCH -tip 26/32] sched: Add a second-level tag for nested CGroup usecase

2020-11-30 Thread Balbir Singh
hat still live) in CDE? Have the most specific tag live. Same with > that thread stuff. > > All this API stuff here is a complete and utter trainwreck. Please just > delete the patches and start over. Hint: if you use stop_machine(), > you're doing it wrong. > > At best you now have the requirements sorted. +1, just remove this patch from the series so as to unblock the series. Balbir Singh.

Re: [PATCH -tip 32/32] sched: Debug bits...

2020-11-30 Thread Balbir Singh
than just trace_printk() Balbir Singh.

Re: [PATCH -tip 03/32] sched/fair: Fix pick_task_fair crashes due to empty rbtree

2020-11-23 Thread Balbir Singh
as we call > put_prev_task before calling pick_task_fair. But for coresched, we > call pick_task_fair on siblings while the task is running and would > not be able to call put_prev_task. So this refactor of the code fixes > the crash by explicitly passing curr. > > Hope this clarifies.. > Yes, it does! Thanks, Balbir Singh.

Re: [PATCH -tip 09/32] sched/fair: Snapshot the min_vruntime of CPUs on force idle

2020-11-23 Thread Balbir Singh
On Mon, Nov 23, 2020 at 07:31:31AM -0500, Vineeth Pillai wrote: > Hi Balbir, > > On 11/22/20 6:44 AM, Balbir Singh wrote: > > > > This seems cumbersome, is there no way to track the min_vruntime via > > rq->core->min_vruntime? > Do you mean to have a core w

Re: [PATCH -tip 13/32] sched: Trivial forced-newidle balancer

2020-11-23 Thread Balbir Singh
On Mon, Nov 23, 2020 at 11:07:27PM +0800, Li, Aubrey wrote: > On 2020/11/23 12:38, Balbir Singh wrote: > > On Tue, Nov 17, 2020 at 06:19:43PM -0500, Joel Fernandes (Google) wrote: > >> From: Peter Zijlstra > >> > >> When a sibling is forced-idle to match the c

Re: [PATCH v3 4/5] prctl: Hook L1D flushing in via prctl

2020-12-04 Thread Balbir Singh
On Fri, Dec 04, 2020 at 11:19:17PM +0100, Thomas Gleixner wrote: > > Balbir, > > On Fri, Nov 27 2020 at 17:59, Balbir Singh wrote: > > +enum l1d_flush_out_mitigations { > > + L1D_FLUSH_OUT_OFF, > > + L1D_FLUSH_OUT_ON, > > +}; > > +

Re: [PATCH -tip 13/32] sched: Trivial forced-newidle balancer

2020-11-25 Thread Balbir Singh
On Tue, Nov 24, 2020 at 08:32:01AM +0800, Li, Aubrey wrote: > On 2020/11/24 7:35, Balbir Singh wrote: > > On Mon, Nov 23, 2020 at 11:07:27PM +0800, Li, Aubrey wrote: > >> On 2020/11/23 12:38, Balbir Singh wrote: > >>> On Tue, Nov 17, 2020 at 06:19:43PM -0500,

Re: [PATCH -tip 14/32] sched: migration changes for core scheduling

2020-11-25 Thread Balbir Singh
On Wed, Nov 25, 2020 at 11:12:53AM +0800, Li, Aubrey wrote: > On 2020/11/24 23:42, Peter Zijlstra wrote: > > On Mon, Nov 23, 2020 at 12:36:10PM +0800, Li, Aubrey wrote: > +#ifdef CONFIG_SCHED_CORE > +/* > + * Skip this cpu if source task's cookie does

Re: [PATCH -tip 10/32] sched: Fix priority inversion of cookied task with sibling

2020-11-25 Thread Balbir Singh
On Tue, Nov 24, 2020 at 01:30:38PM -0500, Joel Fernandes wrote: > On Mon, Nov 23, 2020 at 09:41:23AM +1100, Balbir Singh wrote: > > On Tue, Nov 17, 2020 at 06:19:40PM -0500, Joel Fernandes (Google) wrote: > > > From: Peter Zijlstra > > > > > > The rationale

Re: [PATCH -tip 09/32] sched/fair: Snapshot the min_vruntime of CPUs on force idle

2020-11-25 Thread Balbir Singh
On Tue, Nov 24, 2020 at 10:09:55AM +0100, Peter Zijlstra wrote: > On Tue, Nov 24, 2020 at 10:31:49AM +1100, Balbir Singh wrote: > > On Mon, Nov 23, 2020 at 07:31:31AM -0500, Vineeth Pillai wrote: > > > Hi Balbir, > > > > > > On 11/22/20 6:44 AM, Balbir Sing

Re: [PATCH -tip 02/32] sched: Introduce sched_class::pick_task()

2020-11-25 Thread Balbir Singh
On Fri, Nov 20, 2020 at 11:58:54AM -0500, Joel Fernandes wrote: > On Fri, Nov 20, 2020 at 10:56:09AM +1100, Singh, Balbir wrote: > [..] > > > +#ifdef CONFIG_SMP > > > +static struct task_struct *pick_task_fair(struct rq *rq) > > > +{ > > > + struct cfs_rq *cfs_rq = &rq->cfs; > > > + struct sched_e

Re: [PATCH -tip 31/32] sched: Add a coresched command line option

2020-11-25 Thread Balbir Singh
able on SMT (provided you did that > CONFIG_ thing). Even on AMD systems RT tasks might want to claim the > core exclusively. Agreed, specifically if we need to have special cgroup tag/association to enable it. Balbir Singh.

Re: [PATCH -tip 04/32] sched: Core-wide rq->lock

2020-11-25 Thread Balbir Singh
On Tue, Nov 24, 2020 at 09:16:17AM +0100, Peter Zijlstra wrote: > On Sun, Nov 22, 2020 at 08:11:52PM +1100, Balbir Singh wrote: > > On Tue, Nov 17, 2020 at 06:19:34PM -0500, Joel Fernandes (Google) wrote: > > > From: Peter Zijlstra > > > > > > Introduce the

Re: [PATCH -tip 18/32] kernel/entry: Add support for core-wide protection of kernel-mode

2020-11-25 Thread Balbir Singh
et;
> +
> +	raw_spin_lock(rq_lockp(rq));
> +	/*
> +	 * Core-wide nesting counter can never be 0 because we are
> +	 * still in it on this CPU.
> +	 */
> +	nest = rq->core->core_unsafe_nest;
> +	WARN_ON_ONCE(!nest);
> +
> +	WRITE_ONCE(rq->core->core_unsafe_nest, nest - 1);
> +	/*
> +	 * The raw_spin_unlock release semantics pairs with the nest counter's
> +	 * smp_load_acquire() in sched_core_wait_till_safe().
> +	 */
> +	raw_spin_unlock(rq_lockp(rq));
> +ret:
> +	local_irq_restore(flags);
> +}
> +
> // XXX fairness/fwd progress conditions
> /*
>  * Returns
> @@ -5497,6 +5737,7 @@ static inline void sched_core_cpu_starting(unsigned int cpu)
> 		rq = cpu_rq(i);
> 		if (rq->core && rq->core == rq)
> 			core_rq = rq;
> +		init_sched_core_irq_work(rq);
> 	}
>
> 	if (!core_rq)
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 615092cb693c..be6691337bbb 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1074,6 +1074,8 @@ struct rq {
> 	unsigned int		core_enabled;
> 	unsigned int		core_sched_seq;
> 	struct rb_root		core_tree;
> +	struct irq_work		core_irq_work; /* To force HT into kernel */
> +	unsigned int		core_this_unsafe_nest;
>
> 	/* shared state */
> 	unsigned int		core_task_seq;
> @@ -1081,6 +1083,7 @@ struct rq {
> 	unsigned long		core_cookie;
> 	unsigned char		core_forceidle;
> 	unsigned int		core_forceidle_seq;
> +	unsigned int		core_unsafe_nest;
> #endif
> };
>
Balbir Singh.

Re: [PATCH -tip 14/32] sched: migration changes for core scheduling

2020-11-26 Thread Balbir Singh
On Thu, Nov 26, 2020 at 11:20:41AM +0800, Li, Aubrey wrote: > On 2020/11/26 6:57, Balbir Singh wrote: > > On Wed, Nov 25, 2020 at 11:12:53AM +0800, Li, Aubrey wrote: > >> On 2020/11/24 23:42, Peter Zijlstra wrote: > >>> On Mon, Nov 23, 2020 at 12:36:10PM +0800, Li,

Re: [PATCH v17 5/9] mm: hugetlb: set the PageHWPoison to the raw error page

2021-03-07 Thread Balbir Singh
On Thu, Feb 25, 2021 at 09:21:26PM +0800, Muchun Song wrote: > Because we reuse the first tail vmemmap page frame and remap it > with read-only, we cannot set the PageHWPoison on some tail pages. > So we can use the head[4].private (There are at least 128 struct > page structures associated with th

Re: [RFC PATCH v2] taskstats: add /proc/taskstats to fetch pid/tgid status

2021-02-07 Thread Balbir Singh
On Fri, Feb 05, 2021 at 10:43:02AM +0800, Weiping Zhang wrote: > On Fri, Feb 5, 2021 at 8:08 AM Balbir Singh wrote: > > > > On Thu, Feb 04, 2021 at 10:37:20PM +0800, Weiping Zhang wrote: > > > On Thu, Feb 4, 2021 at 6:20 PM Balbir Singh wrote: > > > > > &

Re: [RFC PATCH v2] taskstats: add /proc/taskstats to fetch pid/tgid status

2021-01-27 Thread Balbir Singh
On Mon, Dec 28, 2020 at 10:10:03PM +0800, Weiping Zhang wrote: > Hi David, > > Could you help review this patch ? > > thanks I've got it on my review list, thanks for the ping! You should hear back from me soon. Balbir Singh. > > On Fri, Dec 18, 2020 at 1:24

Re: [RFC PATCH v2] taskstats: add /proc/taskstats to fetch pid/tgid status

2021-01-27 Thread Balbir Singh
On Fri, Jan 22, 2021 at 10:07:50PM +0800, Weiping Zhang wrote: > Hello Balbir Singh, > > Could you help review this patch, thanks > > On Mon, Dec 28, 2020 at 10:10 PM Weiping Zhang wrote: > > > > Hi David, > > > > Could you help review this patch ? > &g

Re: [PATCH -tip 10/32] sched: Fix priority inversion of cookied task with sibling

2020-11-26 Thread Balbir Singh
On Thu, Nov 26, 2020 at 09:29:14AM +0100, Peter Zijlstra wrote: > On Thu, Nov 26, 2020 at 10:05:19AM +1100, Balbir Singh wrote: > > > @@ -5259,7 +5254,20 @@ pick_next_task(struct rq *rq, struct task_struct > > > *prev, struct rq_flags *rf) > > >

[PATCH v3 0/5] Next revision of the L1D flush patches

2020-11-26 Thread Balbir Singh
-data-sampling [3] https://lkml.org/lkml/2020/6/2/1150 [4] https://lore.kernel.org/lkml/20200729001103.6450-1-sbl...@amazon.com/ [5] https://lore.kernel.org/lkml/20201117234934.25985-2-sbl...@amazon.com/ Changelog v3: - Implement the SIGBUS mechanism - Update and fix the documentation Balbir Singh

[PATCH v3 1/5] x86/mm: change l1d flush runtime prctl behaviour

2020-11-26 Thread Balbir Singh
Detection of task affinities at API opt-in time is not the best approach; the approach taken here is to kill the task if it runs on an SMT-enabled core. This is better than not flushing the L1D cache when the task switches from a non-SMT core to an SMT-enabled core. Signed-off-by: Balbir Singh --- arch/x86

[PATCH v3 4/5] prctl: Hook L1D flushing in via prctl

2020-11-26 Thread Balbir Singh
ivery). There is also no seccomp integration for the feature. Suggested-by: Thomas Gleixner Signed-off-by: Balbir Singh Signed-off-by: Thomas Gleixner --- arch/Kconfig | 4 +++ arch/x86/Kconfig | 1 + arch/x86/kernel/cpu/bugs.c
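
For context, a task would opt in to L1D flushing through the speculation-control prctl. The sketch below is hedged: the constant names and values follow the prctl interface as I understand it from the mainline uapi headers, and the v3 series discussed here may have used different names. request_l1d_flush() and query_l1d_flush() are hypothetical helper names, not part of the patches.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values assumed from mainline uapi. */
#ifndef PR_GET_SPECULATION_CTRL
#define PR_GET_SPECULATION_CTRL 52
#endif
#ifndef PR_SET_SPECULATION_CTRL
#define PR_SET_SPECULATION_CTRL 53
#endif
#ifndef PR_SPEC_L1D_FLUSH
#define PR_SPEC_L1D_FLUSH 2
#endif
#ifndef PR_SPEC_ENABLE
#define PR_SPEC_ENABLE (1UL << 1)
#endif

/*
 * Ask the kernel to flush L1D when this task is switched out.
 * Returns 0 on success or a negative errno value. Note that, per the
 * series, an opted-in task that ends up on an SMT-enabled core is
 * killed rather than left unprotected.
 */
int request_l1d_flush(void)
{
	if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH,
		  PR_SPEC_ENABLE, 0, 0) == 0)
		return 0;
	return -errno;
}

/*
 * Query the current state without changing it. Returns the PR_SPEC_*
 * bitmask reported by the kernel, or a negative errno value when the
 * control is not supported.
 */
int query_l1d_flush(void)
{
	int state = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, 0, 0, 0);

	return state < 0 ? -errno : state;
}
```

Calling query_l1d_flush() first is the cautious choice: it has no side effects, while a successful request_l1d_flush() changes how the task is treated on SMT-enabled cores.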

[PATCH v3 2/5] x86/mm: Refactor cond_ibpb() to support other use cases

2020-11-26 Thread Balbir Singh
: Balbir Singh Signed-off-by: Thomas Gleixner Link: https://lkml.kernel.org/r/20200510014803.12190-4-sbl...@amazon.com Link: https://lore.kernel.org/r/20200729001103.6450-3-sbl...@amazon.com --- arch/x86/include/asm/tlbflush.h | 2 +- arch/x86/mm/tlb.c | 53

[PATCH v3 3/5] x86/mm: Optionally flush L1D on context switch

2020-11-26 Thread Balbir Singh
called only when HW assisted flushing is available. Suggested-by: Thomas Gleixner Signed-off-by: Balbir Singh Signed-off-by: Thomas Gleixner Link: https://lore.kernel.org/r/20200729001103.6450-4-sbl...@amazon.com --- arch/x86/include/asm/cacheflush.h | 8 arch/x86/include/asm

[PATCH v3 5/5] Documentation: Add L1D flushing Documentation

2020-11-26 Thread Balbir Singh
Add documentation of l1d flushing, explain the need for the feature and how it can be used. Signed-off-by: Balbir Singh Signed-off-by: Thomas Gleixner --- Documentation/admin-guide/hw-vuln/index.rst | 1 + .../admin-guide/hw-vuln/l1d_flush.rst | 69 +++ .../admin

Re: [PATCH v15 00/26] Control-flow Enforcement: Shadow Stack

2020-11-27 Thread Balbir Singh
way to run these patches for testing? Bochs emulation or anything else? I presume you've been testing against violations of CET in user space? Can you share your testing? Balbir Singh.

Re: [PATCH -tip 01/32] sched: Wrap rq::lock access

2020-11-22 Thread Balbir Singh
g is > dynamic based on whether core sched is enabled or not (both statically and > dynamically). > My point was that the word game does not do justice to the change, some details around how this abstraction helps based on the (re)definition of rq with coresched might help. Balbir Singh.

Re: [PATCH -tip 04/32] sched: Core-wide rq->lock

2020-11-22 Thread Balbir Singh
it possible to have some cores with core sched disabled? I don't see a strong use case for it, but I am wondering if the design will fall apart if that assumption is broken? Balbir Singh

Re: [PATCH -tip 08/32] sched/fair: Fix forced idle sibling starvation corner case

2020-11-22 Thread Balbir Singh
P */
>
> +#ifdef CONFIG_SCHED_CORE
> +static inline bool
> +__entity_slice_used(struct sched_entity *se, int min_nr_tasks)
> +{
> +	u64 slice = sched_slice(cfs_rq_of(se), se);

I wonder if the definition of sched_slice() should be revisited for core scheduling? Should we use sched_slice = sched_slice / cpumask_weight(smt_mask)? Would that resolve the issue you're seeing? Effectively, we need to answer whether two sched core siblings should be treated as executing one large slice.

Balbir Singh.
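
The arithmetic behind the suggestion above can be sketched in userspace as follows. This is only an illustration of the proposed division, not scheduler code: core_scaled_slice() is a hypothetical name, and smt_weight stands in for cpumask_weight(smt_mask).

```c
#include <stdint.h>

/*
 * Split a time slice evenly across the SMT siblings of one core, so
 * that the siblings together are treated as executing one large slice.
 * A weight of 0 is treated as "no scaling" to avoid dividing by zero.
 */
uint64_t core_scaled_slice(uint64_t slice_ns, unsigned int smt_weight)
{
	if (smt_weight == 0)
		return slice_ns;
	return slice_ns / smt_weight;
}
```

With two siblings, a 6 ms slice becomes 3 ms per sibling, so a forced-idle sibling would wait roughly half as long before the slice is considered used.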

Re: [PATCH -tip 09/32] sched/fair: Snapshot the min_vruntime of CPUs on force idle

2020-11-22 Thread Balbir Singh
easier. Further, it may make reverting the improvement easier in > case the improvement causes any regression. > This seems cumbersome; is there no way to track the min_vruntime via rq->core->min_vruntime? Balbir Singh.

Re: [PATCH -tip 10/32] sched: Fix priority inversion of cookied task with sibling

2020-11-22 Thread Balbir Singh
On Tue, Nov 17, 2020 at 06:19:40PM -0500, Joel Fernandes (Google) wrote: > From: Peter Zijlstra > > The rationale is as follows. In the core-wide pick logic, even if > need_sync == false, we need to go look at other CPUs (non-local CPUs) to > see if they could be running RT. > > Say the RQs in a

Re: [PATCH -tip 14/32] sched: migration changes for core scheduling

2020-11-22 Thread Balbir Singh
if core scheduler is not enabled on the CPU. */
> +	if (!sched_core_enabled(rq))
> +		return true;
> +
> +	for_each_cpu(cpu, cpu_smt_mask(cpu_of(rq))) {
> +		if (!available_idle_cpu(cpu)) {

I was looking at this snippet and comparing it to is_core_idle(); the major difference is the check for vcpu_is_preempted(). Do we want to treat the core as non-idle if any vcpu was preempted on this CPU?

> +			idle_core = false;
> +			break;
> +		}
> +	}
> +
> +	/*
> +	 * A CPU in an idle core is always the best choice for tasks with
> +	 * cookies.
> +	 */
> +	return idle_core || rq->core->core_cookie == p->core_cookie;
> +}
> +

Balbir Singh.
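
The decision in the quoted hunk, including the vcpu_is_preempted() difference raised in the reply, can be modelled in userspace like this. It is a simplified sketch: struct sibling_state, sibling_available_idle() and cookie_match_model() are hypothetical names, and the two boolean fields stand in for idle_cpu() and vcpu_is_preempted().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified state of one SMT sibling. */
struct sibling_state {
	bool idle;           /* models idle_cpu(cpu) */
	bool vcpu_preempted; /* models vcpu_is_preempted(cpu) */
};

/* Models available_idle_cpu(): idle and not a preempted vcpu. */
bool sibling_available_idle(const struct sibling_state *s)
{
	return s->idle && !s->vcpu_preempted;
}

/*
 * A core is acceptable for a task when core scheduling is off, when
 * every sibling is available and idle, or when the core's current
 * cookie equals the task's cookie.
 */
bool cookie_match_model(bool core_sched_enabled,
			const struct sibling_state *siblings, size_t nr,
			unsigned long core_cookie, unsigned long task_cookie)
{
	bool idle_core = true;
	size_t i;

	if (!core_sched_enabled)
		return true;

	for (i = 0; i < nr; i++) {
		if (!sibling_available_idle(&siblings[i])) {
			idle_core = false;
			break;
		}
	}

	return idle_core || core_cookie == task_cookie;
}
```

In this model, a sibling that is idle but whose vcpu was preempted makes the whole core count as non-idle, which is the behaviour the question in the reply is about.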

Re: [PATCH -tip 13/32] sched: Trivial forced-newidle balancer

2020-11-22 Thread Balbir Singh
presume we are looking at either one or two cpus to define the core_occupation and we expect to match it against the destination CPU. Balbir Singh.

Re: [PATCH -tip 17/32] arch/x86: Add a new TIF flag for untrusted tasks

2020-11-22 Thread Balbir Singh
d by the series to determine if waiting is > needed or not, during exit to user or guest mode. > > Tested-by: Julien Desfossez > Reviewed-by: Aubrey Li > Signed-off-by: Joel Fernandes (Google) > --- Acked-by: Balbir Singh

Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

2014-02-10 Thread Balbir Singh
On Mon, Feb 10, 2014 at 04:21:30PM +0530, Gautham R Shenoy wrote: > On Mon, Feb 10, 2014 at 02:45:55PM +0530, Srivatsa S. Bhat wrote: > > + cpuhp_lock_acquire_read(); > mutex_lock(&cpu_hotplug.lock); Don't you want to abstract cpuhp_lock_acquire_read and mutex_lock into a more useful pr

Re: [PATCH v2 0/2] perf probe fixes for ppc64le

2016-04-13 Thread Balbir Singh
ms in patch 1 and for symbol table in patch 2. > 3. perf probe failure with kretprobe when using kallsyms. This was > failing as we were specifying an offset. This is fixed in patch 1. > > A few examples demonstrating the issues and the fix: > Given the choices, I think this makes sense Acked-by: Balbir Singh

Re: [RFC 1/5] powerpc: Rename context.vdso_base to context.vdso

2016-05-01 Thread Balbir Singh
vmap field using arch_* operations? Not sure Balbir Singh
