[RFC PATCH 2/2] bfq/mq-deadline: remove redundant check for passthrough request

2021-04-14 Thread Lin Feng
der SAS controller and hdds under AHCI controller but obviously not covers all. Not sure if passthrough request can still escape into IO scheduler from blk_mq_sched_insert_requests, which is used by blk_mq_flush_plug_list and has lots of indirect callers.) Signed-off-by: Lin Feng --- block/

[PATCH 1/2] blk-mq: bypass IO scheduler's limit_depth for passthrough request

2021-04-14 Thread Lin Feng
troduce a new wrapper to make code not that ugly. Signed-off-by: Lin Feng --- block/blk-mq.c | 3 ++- include/linux/blkdev.h | 6 ++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index d4d7c1caa439..927189a55575 100644 --- a/block/blk-mq.c ++

Re: [PATCH] Revert "bfq: Fix computation of shallow depth"

2021-02-02 Thread Lin Feng
Hi all, On 2/2/21 22:20, Jens Axboe wrote: On 2/2/21 5:28 AM, Jan Kara wrote: Hello! On Fri 29-01-21 19:18:08, Lin Feng wrote: This reverts commit 6d4d273588378c65915acaf7b2ee74e9dd9c130a. bfq.limit_depth passes word_depths[] as shallow_depth down to sbitmap core sbitmap_get_shallow, which

Re: [PATCH] Revert "bfq: Fix computation of shallow depth"

2021-01-31 Thread Lin Feng
des for bfq's word_depths array are not necessary and one variable is enough. But IMHO async depth limitation for slow drivers is essential, which is what we always did in cfq age. On 1/29/21 19:18, Lin Feng wrote: This reverts commit 6d4d273588378c65915acaf7b2ee74e9dd9c130a. bfq.limit_de

[PATCH] x86/kaslr: try process e820 entries if can not get suitable regions from efi

2021-01-05 Thread Lin Feng
: Physical KASLR disabled: no suitable memory region! To enable physical KASLR with kexec, call process_e820_entries when there are no suitable regions in the EFI memmaps. Signed-off-by: Lin Feng --- I found a regular pattern in kernel code and data placement with kexec, which seems unsafe. The reason is shown above. I'm

[PATCH] sysctl.c: fix underflow value setting risk in vm_table

2020-12-23 Thread Lin Feng
vfs_cache_pressure and zone_reclaim_mode, -1 is clearly not a valid value, yet it can still be written to them, after which the kernel may crash. # echo -1 > /proc/sys/vm/vfs_cache_pressure Signed-off-by: Lin Feng --- kernel/sysctl.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --

[PATCH] [RFC] init/main: fix broken buffer_init when DEFERRED_STRUCT_PAGE_INIT set

2020-11-23 Thread Lin Feng
we used a half done value of zone->managed_pages before, or should we use a smaller factor(<10%) in previous formula. Signed-off-by: Lin Feng --- init/main.c | 2 -- mm/page_alloc.c | 3 +++ 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/init/main.c b/init/main.c index 2

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-19 Thread Lin Feng
On 9/19/19 11:49, Matthew Wilcox wrote: On Thu, Sep 19, 2019 at 10:33:10AM +0800, Lin Feng wrote: On 9/18/19 20:33, Michal Hocko wrote: I absolutely agree here. From you changelog it is also not clear what is the underlying problem. Both congestion_wait and wait_iff_congested should wake up

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
On 9/18/19 20:33, Michal Hocko wrote: +mm_reclaim_congestion_wait_jiffies +== + +This control is used to define how long kernel will wait/sleep while +system memory is under pressure and memroy reclaim is relatively active. +Lower values will decrease the kernel wait/sleep time. +

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
Hi, On 9/18/19 19:38, Matthew Wilcox wrote: On Wed, Sep 18, 2019 at 11:21:04AM +0800, Lin Feng wrote: Adding a new tunable is not the right solution. The right way is to make Linux auto-tune itself to avoid the problem. For example, bdi_writeback contains an estimated write bandwidth

Re: [PATCH] [RESEND] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
On 9/18/19 20:27, Michal Hocko wrote: Please do not post a new version with a minor compile fixes until there is a general agreement on the approach. Willy had comments which really need to be resolved first. Sorry, but thanks for pointing out. Also does this [...] Reported-by: kbuild

[PATCH] [RESEND] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
of this patch. Signed-off-by: Lin Feng Reported-by: kbuild test robot --- Documentation/admin-guide/sysctl/vm.rst | 17 + kernel/sysctl.c | 10 ++ mm/vmscan.c | 14 +++--- 3 files changed, 38 insertions(+), 3 deletions

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-17 Thread Lin Feng
On 9/17/19 20:06, Matthew Wilcox wrote: On Tue, Sep 17, 2019 at 07:58:24PM +0800, Lin Feng wrote: In direct and background(kswapd) pages reclaim paths both may fall into calling msleep(100) or congestion_wait(HZ/10) or wait_iff_congested(HZ/10) while under IO pressure, and the sleep length

[PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-17 Thread Lin Feng
%hi, 0.3%si, 0.0%st Cpu22 : 1.0%us, 1.0%sy, 0.0%ni, 98.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu23 : 0.7%us, 0.3%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.0%hi, 0.7%si, 0.0%st Signed-off-by: Lin Feng --- Documentation/admin-guide/sysctl/vm.rst | 17 + kernel/sysctl.c

[PATCH 2/2] kernel/latencytop.c: remove unnecessary checks for latencytop_enabled

2019-02-26 Thread Lin Feng
ion clear_global_latency_tracing. Notes: These changes are visible only to users who set CONFIG_LATENCYTOP and won't change the behavior of the userspace tool latencytop. Signed-off-by: Lin Feng --- kernel/latencytop.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/kernel/latencytop.c b/kernel/latencytop.c index 9e794b497

[PATCH 1/2] kernel/latencytop.c: rename clear_all_latency_tracing to clear_tsk_latency_tracing

2019-02-26 Thread Lin Feng
The name clear_all_latency_tracing is misleading: it in fact clears only the per-task latency_record[], and there is another function, clear_global_latency_tracing, which clears the global latency_record[] buffer. Signed-off-by: Lin Feng --- fs/proc/base.c | 2 +- include/linux

Re: [PATCH] ext4: mballoc.c: fix ac_g_ex and ac_f_ex misuse bug in EXT4_MB_HINT_TRY_GOAL path

2016-06-08 Thread Lin Feng
Hi Andreas, Thanks for your reply and review. On 06/08/2016 05:01 AM, Andreas Dilger wrote: On Jun 2, 2016, at 6:01 AM, Lin Feng <l...@chinanetcenter.com> wrote: Descriptions: ext4 block allocation core stack: ext4_mb_new_blocks ext4_mb_normalize_request ext4_mb_regular_all

Re: [PATCH] ext4: mballoc.c: fix ac_g_ex and ac_f_ex misuse bug in EXT4_MB_HINT_TRY_GOAL path

2016-06-08 Thread Lin Feng
Hi Andreas, Thanks for your reply and review. On 06/08/2016 05:01 AM, Andreas Dilger wrote: On Jun 2, 2016, at 6:01 AM, Lin Feng wrote: Descriptions: ext4 block allocation core stack: ext4_mb_new_blocks ext4_mb_normalize_request ext4_mb_regular_allocator ext4_mb_find_by_goal

[PATCH] ext4: mballoc.h typo fix: correct wrong comments about MB_DEFAULT_STREAM_THRESHOLD

2016-06-07 Thread Lin Feng
ation mode") and the comments for MB_DEFAULT_STREAM_THRESHOLD became stale. Signed-off-by: Lin Feng <l...@chinanetcenter.com> --- fs/ext4/mballoc.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h index 3ef1df6..2e64c0e 1006

[PATCH] ext4: mballoc.h typo fix: correct wrong comments about MB_DEFAULT_STREAM_THRESHOLD

2016-06-07 Thread Lin Feng
ation mode") and the comments for MB_DEFAULT_STREAM_THRESHOLD became stale. Signed-off-by: Lin Feng --- fs/ext4/mballoc.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h index 3ef1df6..2e64c0e 100644 --- a/fs/ext4/mballoc.h +++

Re: [PATCH] ext4: mballoc.c: fix ac_g_ex and ac_f_ex misuse bug in EXT4_MB_HINT_TRY_GOAL path

2016-06-05 Thread Lin Feng
HINT_MERGE is only tested once and nowhere teaches how to use it. IIUC it also should be folded into EXT4_MB_HINT_TRY_GOAL path or simply skip EXT4_MB_HINT_MERGE test at -L1871. thanks, linfeng On 06/02/2016 08:01 PM, Lin Feng wrote: Descriptions: ext4 block allocation core stack:

[PATCH] ext4: mballoc.c: fix ac_g_ex and ac_f_ex misuse bug in EXT4_MB_HINT_TRY_GOAL path

2016-06-02 Thread Lin Feng
le may get fragments even if the physical blocks in the hole is free, which is expected to be merged into a single extent. Signed-off-by: Lin Feng <l...@chinanetcenter.com> --- fs/ext4/mballoc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/ext4/mballoc

[PATCH] ext4: mballoc.c: fix ac_g_ex and ac_f_ex misuse bug in EXT4_MB_HINT_TRY_GOAL path

2016-06-02 Thread Lin Feng
le may get fragments even if the physical blocks in the hole is free, which is expected to be merged into a single extent. Signed-off-by: Lin Feng --- fs/ext4/mballoc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index c1ab3ec..e

Re: [PATCH 2/2] mm: vmemmap: arm64: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
Hi will, On 04/08/2013 06:55 PM, Will Deacon wrote: > Given that we don't have NUMA support or memory-hotplug on arm64 yet, I'm > not sure that this change makes much sense at the moment. early_pfn_to_nid > will always return 0 and we only ever have one node. > > To be honest, I'm not sure what

Re: [PATCH 0/2] mm: vmemmap: add vmemmap_verify check for hot-add node/memory case

2013-04-08 Thread Lin Feng
Hi Yinghai, On 04/09/2013 02:40 AM, Yinghai Lu wrote: > On Mon, Apr 8, 2013 at 2:56 AM, Lin Feng wrote: >> In hot add node(memory) case, vmemmap pages are always allocated from other >> node, > > that is broken, and should be fixed. > vmemmap should be on local no

Re: [PATCH 0/2] mm: vmemmap: add vmemmap_verify check for hot-add node/memory case

2013-04-08 Thread Lin Feng
Hi Andrew, On 04/09/2013 04:55 AM, Andrew Morton wrote: > On Mon, 8 Apr 2013 11:40:11 -0700 Yinghai Lu wrote: > >> On Mon, Apr 8, 2013 at 2:56 AM, Lin Feng wrote: >>> In hot add node(memory) case, vmemmap pages are always allocated from other >>> node, >>

Re: [PATCH 1/2] mm: vmemmap: x86: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
Hi all, On 04/08/2013 05:56 PM, Lin Feng wrote: > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c > index 474e28f..e2a7277 100644 > --- a/arch/x86/mm/init_64.c > +++ b/arch/x86/mm/init_64.c > @@ -1318,6 +1318,8 @@ vmemmap_populate(struct page *start_page, unsigned lo

[PATCH 0/2] mm: vmemmap: add vmemmap_verify check for hot-add node/memory case

2013-04-08 Thread Lin Feng
In hot add node(memory) case, vmemmap pages are always allocated from other node, but the current logic just skip vmemmap_verify check. So we should also issue "potential offnode page_structs" warning messages if we are the case Lin Feng (2): mm: vmemmap: x86: add vmemmap_verify che

[PATCH 2/2] mm: vmemmap: arm64: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
Deacon Cc: Arnd Bergmann Cc: Tony Lindgren Cc: Ben Hutchings Cc: Andrew Morton Reported-by: Yasuaki Ishimatsu Signed-off-by: Lin Feng --- arch/arm64/mm/mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 70b8cd4..9f1e

[PATCH 1/2] mm: vmemmap: x86: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
ar Cc: "H. Peter Anvin" Cc: Yinghai Lu Cc: Andrew Morton Reported-by: Yasuaki Ishimatsu Signed-off-by: Lin Feng --- arch/x86/mm/init_64.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 474e28f..e2a7277 1

[PATCH 1/2] mm: vmemmap: x86: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Cc: H. Peter Anvin h...@zytor.com Cc: Yinghai Lu ying...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Reported-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com Signed-off-by: Lin Feng linf...@cn.fujitsu.com --- arch/x86/mm/init_64.c | 6

[PATCH 2/2] mm: vmemmap: arm64: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
catalin.mari...@arm.com Cc: Will Deacon will.dea...@arm.com Cc: Arnd Bergmann a...@arndb.de Cc: Tony Lindgren t...@atomide.com Cc: Ben Hutchings b...@decadent.org.uk Cc: Andrew Morton a...@linux-foundation.org Reported-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com Signed-off-by: Lin Feng linf

[PATCH 0/2] mm: vmemmap: add vmemmap_verify check for hot-add node/memory case

2013-04-08 Thread Lin Feng
In hot add node(memory) case, vmemmap pages are always allocated from other node, but the current logic just skip vmemmap_verify check. So we should also issue "potential offnode page_structs" warning messages if we are the case Lin Feng (2): mm: vmemmap: x86: add vmemmap_verify check for hot

Re: [PATCH 1/2] mm: vmemmap: x86: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
Hi all, On 04/08/2013 05:56 PM, Lin Feng wrote: diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 474e28f..e2a7277 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1318,6 +1318,8 @@ vmemmap_populate(struct page *start_page, unsigned long size, int node

Re: [PATCH 0/2] mm: vmemmap: add vmemmap_verify check for hot-add node/memory case

2013-04-08 Thread Lin Feng
Hi Andrew, On 04/09/2013 04:55 AM, Andrew Morton wrote: On Mon, 8 Apr 2013 11:40:11 -0700 Yinghai Lu ying...@kernel.org wrote: On Mon, Apr 8, 2013 at 2:56 AM, Lin Feng linf...@cn.fujitsu.com wrote: In hot add node(memory) case, vmemmap pages are always allocated from other node

Re: [PATCH 0/2] mm: vmemmap: add vmemmap_verify check for hot-add node/memory case

2013-04-08 Thread Lin Feng
Hi Yinghai, On 04/09/2013 02:40 AM, Yinghai Lu wrote: On Mon, Apr 8, 2013 at 2:56 AM, Lin Feng linf...@cn.fujitsu.com wrote: In hot add node(memory) case, vmemmap pages are always allocated from other node, that is broken, and should be fixed. vmemmap should be on local node even for hot

Re: [PATCH 2/2] mm: vmemmap: arm64: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
Hi will, On 04/08/2013 06:55 PM, Will Deacon wrote: Given that we don't have NUMA support or memory-hotplug on arm64 yet, I'm not sure that this change makes much sense at the moment. early_pfn_to_nid will always return 0 and we only ever have one node. To be honest, I'm not sure what that

Re: [PATCH] x86: numa: mm: kill double initialization for NODE_DATA

2013-04-02 Thread Lin Feng
Hi Wanpeng, On 04/02/2013 06:57 PM, Wanpeng Li wrote: >> >PS. For clarifying calling chains are showed as follows: >> >setup_arch() >> > ... >> > initmem_init() >> >x86_numa_init() >> > numa_init() >> >numa_register_memblks() >> > setup_node_data() >> >

[PATCH] x86: numa: mm: kill double initialization for NODE_DATA

2013-04-02 Thread Lin Feng
_early(pgdat->node_id,...) ... zone_sizes_init() free_area_init_nodes() free_area_init_node() pgdat->node_id = nid; pgdat->node_start_pfn = node_start_pfn; calculate_node_totalpages(); pgdat->node_spanned_pages = totalpages; Signed-off-by: Lin Feng ---

[PATCH] x86: numa: mm: kill double initialization for NODE_DATA

2013-04-02 Thread Lin Feng
() free_area_init_nodes() free_area_init_node() pgdat->node_id = nid; pgdat->node_start_pfn = node_start_pfn; calculate_node_totalpages(); pgdat->node_spanned_pages = totalpages; Signed-off-by: Lin Feng linf...@cn.fujitsu.com --- arch/x86/mm/numa.c

Re: [PATCH] x86: numa: mm: kill double initialization for NODE_DATA

2013-04-02 Thread Lin Feng
Hi Wanpeng, On 04/02/2013 06:57 PM, Wanpeng Li wrote: PS. For clarifying calling chains are showed as follows: setup_arch() ... initmem_init() x86_numa_init() numa_init() numa_register_memblks() setup_node_data() NODE_DATA(nid)->node_id = nid;

Re: THP: AnonHugePages in /proc/[pid]/smaps is correct or not?

2013-04-01 Thread Lin Feng
Hi Zhouping, On 04/02/2013 11:09 AM, Zhouping Liu wrote: > I don't understand clearly the last sentence 'you'll probably only get 100% > hugepages only 1/512th of the time.' > could you please explain more details about 'only 1/512th of the time'? IIUC, thp size is 2M so it may be comprised of

Re: THP: AnonHugePages in /proc/[pid]/smaps is correct or not?

2013-04-01 Thread Lin Feng
Hi Zhouping, On 04/02/2013 11:09 AM, Zhouping Liu wrote: I don't understand clearly the last sentence 'you'll probably only get 100% hugepages only 1/512th of the time.' could you please explain more details about 'only 1/512th of the time'? IIUC, thp size is 2M so it may be comprised of 512
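The "512" in the reply above is plain page-count arithmetic, which a small sketch can confirm. It assumes a 4 KiB base page (the usual x86-64 value; the snippet itself only states the 2M huge page size), and the names are illustrative rather than kernel identifiers.

```python
# Illustrative constants; BASE_PAGE_SIZE is an assumption (4 KiB on x86-64).
HUGE_PAGE_SIZE = 2 * 1024 * 1024   # 2 MiB transparent huge page
BASE_PAGE_SIZE = 4 * 1024          # 4 KiB base page


def base_pages_per_huge_page(huge_size=HUGE_PAGE_SIZE, base_size=BASE_PAGE_SIZE):
    """Return how many base pages fit in one huge page."""
    return huge_size // base_size
```

With these assumed sizes, base_pages_per_huge_page() gives 512, matching the figure quoted in the thread.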

Re: [PATCH] kernel/range.c: subtract_range: fix the broken phrase issued by printk

2013-03-27 Thread Lin Feng
Hi Bjorn and others, On 03/28/2013 01:27 AM, Bjorn Helgaas wrote: >> - printk(KERN_ERR "run of slot in ranges\n"); >> > + pr_err("%s: run out of slot in ranges\n", >> > + __func__); >> >

Re: [PATCH] kernel/range.c: subtract_range: fix the broken phrase issued by printk

2013-03-27 Thread Lin Feng
Hi Bjorn and others, On 03/28/2013 01:27 AM, Bjorn Helgaas wrote: - printk(KERN_ERR "run of slot in ranges\n"); + pr_err("%s: run out of slot in ranges\n", + __func__); }

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-25 Thread Lin Feng
Hi, On 03/25/2013 05:00 PM, Lenky Gao wrote: > I have found a comment in function physflat_cpu_mask_to_apicid to explain why. > > static unsigned int physflat_cpu_mask_to_apicid(const struct cpumask *cpumask) > { > int cpu; > > /* >* We're using fixed IRQ delivery, can only

Re: [PATCH] x86: mm: add_pfn_range_mapped: use meaningful index to teach clean_sort_range()

2013-03-25 Thread Lin Feng
Hi Andrew, On 03/19/2013 02:52 AM, Yinghai Lu wrote: > On Mon, Mar 18, 2013 at 3:21 AM, Lin Feng wrote: >> Since add_range_with_merge() return the max none zero element of the array, >> it's >> suffice to use it to instruct clean_sort_range() to do the sort. Or the

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-25 Thread Lin Feng
On 03/25/2013 02:46 PM, Lenky Gao wrote: >> Do you mean on your old machine the irq will be distributed automatically >> among the cpus set by smp_affinity? >> > > Yes. My another machine's interrupts are as follows: And without irqbalance service? It sounds weird to me.. thanks, linfeng > >

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-25 Thread Lin Feng
On 03/25/2013 02:46 PM, Lenky Gao wrote: Do you mean on your old machine the irq will be distributed automatically among the cpus set by smp_affinity? Yes. My another machine's interrupts are as follows: And without irqbalance service? It sounds weird to me.. thanks, linfeng

Re: [PATCH] x86: mm: add_pfn_range_mapped: use meaningful index to teach clean_sort_range()

2013-03-25 Thread Lin Feng
Hi Andrew, On 03/19/2013 02:52 AM, Yinghai Lu wrote: On Mon, Mar 18, 2013 at 3:21 AM, Lin Feng linf...@cn.fujitsu.com wrote: Since add_range_with_merge() return the max none zero element of the array, it's suffice to use it to instruct clean_sort_range() to do the sort. Or the former

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-25 Thread Lin Feng
Hi, On 03/25/2013 05:00 PM, Lenky Gao wrote: I have found a comment in function physflat_cpu_mask_to_apicid to explain why. static unsigned int physflat_cpu_mask_to_apicid(const struct cpumask *cpumask) { int cpu; /* * We're using fixed IRQ delivery, can only return

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-24 Thread Lin Feng
Hi, On 03/25/2013 11:44 AM, Lenky Gao wrote: >> On 03/25/2013 11:18 AM, Lenky Gao wrote: >>> The irqbalance service has been stopped. >> So try start irqbalance to see what happen? >> It should help to give what you want ;-) > > Using the irqbalance service to dynamically change the IRQ-bound?

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-24 Thread Lin Feng
Hi, On 03/25/2013 11:18 AM, Lenky Gao wrote: > The irqbalance service has been stopped. So try start irqbalance to see what happen? It should help to give what you want ;-) thanks, linfeng

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-24 Thread Lin Feng
Hi Gao, On 03/25/2013 10:33 AM, Lenky Gao wrote: > [root@localhost ~]# echo 6 > /proc/irq/25/smp_affinity > [root@localhost ~]# cat /proc/irq/25/smp_affinity > 06 Seems you bind the nic irq to second and third cpu for the bit mask you set is 110, so now eth9's irq is working on the 3rd cpu.
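The bit-mask reading in the reply above (hex 6 is binary 110, so the second and third CPUs) can be reproduced with a short sketch. The helper name is hypothetical, and it assumes the usual smp_affinity convention of a hexadecimal mask, possibly split into comma-separated groups, where bit N stands for CPU N counted from zero.

```python
def cpus_in_affinity_mask(mask_hex):
    """Decode an smp_affinity-style hex mask into zero-based CPU numbers.

    Hypothetical helper for illustration; commas that separate 32-bit
    groups on larger machines are dropped before parsing.
    """
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]
```

For the value read back in the thread, cpus_in_affinity_mask("06") returns [1, 2]: CPU1 and CPU2 in zero-based numbering, which are the second and third CPUs.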

Re: [patch] mm: speedup in __early_pfn_to_nid

2013-03-24 Thread Lin Feng
On 03/24/2013 04:37 AM, Yinghai Lu wrote: > +#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP > +int __init_memblock memblock_search_pfn_nid(unsigned long pfn, > + unsigned long *start_pfn, unsigned long *end_pfn) > +{ > + struct memblock_type *type = &memblock.memory; > + int mid =

Re: [patch] mm: speedup in __early_pfn_to_nid

2013-03-24 Thread Lin Feng
On 03/24/2013 04:37 AM, Yinghai Lu wrote: +#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP +int __init_memblock memblock_search_pfn_nid(unsigned long pfn, + unsigned long *start_pfn, unsigned long *end_pfn) +{ + struct memblock_type *type = &memblock.memory; + int mid =

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-24 Thread Lin Feng
Hi Gao, On 03/25/2013 10:33 AM, Lenky Gao wrote: [root@localhost ~]# echo 6 > /proc/irq/25/smp_affinity [root@localhost ~]# cat /proc/irq/25/smp_affinity 06 Seems you bind the nic irq to second and third cpu for the bit mask you set is 110, so now eth9's irq is working on the 3rd cpu. Have

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-24 Thread Lin Feng
Hi, On 03/25/2013 11:18 AM, Lenky Gao wrote: The irqbalance service has been stopped. So try start irqbalance to see what happen? It should help to give what you want ;-) thanks, linfeng

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-24 Thread Lin Feng
Hi, On 03/25/2013 11:44 AM, Lenky Gao wrote: On 03/25/2013 11:18 AM, Lenky Gao wrote: The irqbalance service has been stopped. So try start irqbalance to see what happen? It should help to give what you want ;-) Using the irqbalance service to dynamically change the IRQ-bound? It's seems

[PATCH] kernel/range.c: subtract_range: fix the broken phrase issued by printk

2013-03-18 Thread Lin Feng
Also replace deprecated printk(KERN_ERR...) with pr_err() as suggested by Yinghai, attaching the function name to provide plenty info. Cc: Yinghai Lu Signed-off-by: Lin Feng --- kernel/range.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/range.c b/kernel/range.c

[PATCH] x86: mm: accurate the comments for STEP_SIZE_SHIFT macro

2013-03-18 Thread Lin Feng
For x86 PUD_SHIFT is 30 and PMD_SHIFT is 21, so the consequence of (PUD_SHIFT-PMD_SHIFT)/2 is 4. Update the comments to the code. Cc: Yinghai Lu Signed-off-by: Lin Feng --- arch/x86/mm/init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/init.c b/arch/x86/mm
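The arithmetic in this changelog relies on integer division, which a tiny sketch makes explicit. The function name is illustrative; only the two shift values and the formula come from the message above.

```python
# Shift values as stated in the changelog for x86.
PUD_SHIFT = 30
PMD_SHIFT = 21


def step_size_shift(pud_shift=PUD_SHIFT, pmd_shift=PMD_SHIFT):
    """Compute (PUD_SHIFT - PMD_SHIFT) / 2 with C-style truncating division."""
    return (pud_shift - pmd_shift) // 2
```

Here 30 - 21 is 9, and 9 divided by 2 truncates to 4, so step_size_shift() returns 4 as the changelog says.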

[PATCH] kernel/range.c: subtract_range: return instead of continue to save some loops

2013-03-18 Thread Lin Feng
If we fall into that branch it means that there is a range fully covering the subtract range, so it's suffice to return there if there isn't any other overlapping ranges. Also fix the broken phrase issued by printk. Cc: Yinghai Lu Signed-off-by: Lin Feng --- kernel/range.c | 4 ++-- 1 file
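The reasoning above is easier to follow with the overlap cases laid out. The following is a simplified sketch, not the kernel's subtract_range(): it works on a Python list of half-open (lo, hi) tuples that are assumed not to overlap each other, and the comments mark the "fully covering" branch the changelog refers to.

```python
def subtract_range(ranges, start, end):
    """Remove the half-open interval [start, end) from a list of (lo, hi) ranges.

    Simplified illustration; assumes the input ranges do not overlap each other.
    """
    result = []
    for lo, hi in ranges:
        if end <= lo or start >= hi:
            result.append((lo, hi))       # no overlap, keep unchanged
        elif start <= lo and end >= hi:
            continue                      # range lies entirely inside [start, end)
        elif start <= lo:
            result.append((end, hi))      # front of the range is cut off
        elif end >= hi:
            result.append((lo, start))    # tail of the range is cut off
        else:
            # The range fully covers [start, end): split it in two. If no
            # other range overlaps this one, nothing else can intersect
            # [start, end), which is why an early return is possible there.
            result.append((lo, start))
            result.append((end, hi))
    return result
```

For example, subtracting [3, 5) from [(0, 10)] yields [(0, 3), (5, 10)], and no other non-overlapping range could have been affected.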

[PATCH] x86: mm: add_pfn_range_mapped: use meaningful index to teach clean_sort_range()

2013-03-18 Thread Lin Feng
and it never depends on nr_pfn_mapped. Cc: Jacob Shin Cc: Yinghai Lu Signed-off-by: Lin Feng --- arch/x86/mm/init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index 59b7fc4..55ae904 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c

[PATCH] x86: mm: add_pfn_range_mapped: use meaningful index to teach clean_sort_range()

2013-03-18 Thread Lin Feng
and it never depends on nr_pfn_mapped. Cc: Jacob Shin jacob.s...@amd.com Cc: Yinghai Lu ying...@kernel.org Signed-off-by: Lin Feng linf...@cn.fujitsu.com --- arch/x86/mm/init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index 59b7fc4..55ae904

[PATCH] kernel/range.c: subtract_range: return instead of continue to save some loops

2013-03-18 Thread Lin Feng
If we fall into that branch it means that there is a range fully covering the subtract range, so it's suffice to return there if there isn't any other overlapping ranges. Also fix the broken phrase issued by printk. Cc: Yinghai Lu ying...@kernel.org Signed-off-by: Lin Feng linf...@cn.fujitsu.com

[PATCH] x86: mm: accurate the comments for STEP_SIZE_SHIFT macro

2013-03-18 Thread Lin Feng
For x86 PUD_SHIFT is 30 and PMD_SHIFT is 21, so the consequence of (PUD_SHIFT-PMD_SHIFT)/2 is 4. Update the comments to the code. Cc: Yinghai Lu ying...@kernel.org Signed-off-by: Lin Feng linf...@cn.fujitsu.com --- arch/x86/mm/init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff

[PATCH] kernel/range.c: subtract_range: fix the broken phrase issued by printk

2013-03-18 Thread Lin Feng
Also replace deprecated printk(KERN_ERR...) with pr_err() as suggested by Yinghai, attaching the function name to provide plenty info. Cc: Yinghai Lu ying...@kernel.org Signed-off-by: Lin Feng linf...@cn.fujitsu.com --- kernel/range.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff

Re: [PATCH V3 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-03-06 Thread Lin Feng
Hi Yasuaki, On 03/06/2013 03:48 PM, Yasuaki Ishimatsu wrote: > Hi Lin, > > IMHO, current implementation depends on luck. So even if system has > many non movable memory, get_user_pages_non_movable() may not allocate > non movable memory. Sorry, I'm not quite understand here, since the to be

Re: [RFC/PATCH 3/5] mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is set

2013-03-06 Thread Lin Feng
Hi Marek, On 03/05/2013 02:57 PM, Marek Szyprowski wrote: > Ensure that newly allocated pages, which are faulted in in FOLL_DURABLE > mode comes from non-movalbe pageblocks, to workaround migration failures > with Contiguous Memory Allocator. snip > @@ -2495,7 +2498,7 @@ static inline void

Re: [RFC/PATCH 3/5] mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is set

2013-03-06 Thread Lin Feng
Hi Marek, On 03/05/2013 02:57 PM, Marek Szyprowski wrote: > @@ -2495,7 +2498,7 @@ static inline void cow_user_page(struct page *dst, > struct page *src, unsigned lo > */ > static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, > unsigned long address, pte_t

Re: [RFC/PATCH 3/5] mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is set

2013-03-06 Thread Lin Feng
Hi Marek, On 03/05/2013 02:57 PM, Marek Szyprowski wrote: @@ -2495,7 +2498,7 @@ static inline void cow_user_page(struct page *dst, struct page *src, unsigned lo */ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, pte_t

Re: [RFC/PATCH 3/5] mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is set

2013-03-06 Thread Lin Feng
Hi Marek, On 03/05/2013 02:57 PM, Marek Szyprowski wrote: Ensure that newly allocated pages, which are faulted in in FOLL_DURABLE mode comes from non-movalbe pageblocks, to workaround migration failures with Contiguous Memory Allocator. snip @@ -2495,7 +2498,7 @@ static inline void

Re: [PATCH V3 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-03-06 Thread Lin Feng
Hi Yasuaki, On 03/06/2013 03:48 PM, Yasuaki Ishimatsu wrote: Hi Lin, IMHO, current implementation depends on luck. So even if system has many non movable memory, get_user_pages_non_movable() may not allocate non movable memory. Sorry, I'm not quite understand here, since the to be pinned

Re: [PATCH V3 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-26 Thread Lin Feng
Hi Andrew, Mel and other guys, How about this V3 patch, any comments? thanks, linfeng On 02/21/2013 07:01 PM, Lin Feng wrote: > get_user_pages() always tries to allocate pages from movable zone, which is > not > reliable to memory hotremove framework in some case. > > This p

Re: [PATCH V3 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-26 Thread Lin Feng
Hi Andrew, Mel and other guys, How about this V3 patch, any comments? thanks, linfeng On 02/21/2013 07:01 PM, Lin Feng wrote: get_user_pages() always tries to allocate pages from movable zone, which is not reliable to memory hotremove framework in some case. This patch introduces a new

[PATCH V3 0/2] mm: hotplug: implement non-movable version of get_user_pages() to kill long-time pin pages

2013-02-21 Thread Lin Feng
v1->v2: Patch1: - Fix the negative return value bug pointed out by Andrew and other suggestions pointed out by Andrew and Jeff. Patch2: - Kill the CONFIG_MEMORY_HOTREMOVE dependence suggested by Jeff. --- Lin Feng (2): mm: hotplug: implement non-movable version of get_user_pages()

[PATCH V3 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-21 Thread Lin Feng
Cc: Zach Brown Reviewed-by: Tang Chen Reviewed-by: Gu Zheng Signed-off-by: Lin Feng --- include/linux/mm.h | 14 ++ include/linux/mmzone.h |4 ++ mm/memory.c| 103 mm/page_isolation.c|8 4 files changed

[PATCH V3 2/2] fs/aio.c: use get_user_pages_non_movable() to pin ring pages when support memory hotremove

2013-02-21 Thread Lin Feng
Morton Cc: Jeff Moyer Cc: Minchan Kim Cc: Zach Brown Reviewed-by: Tang Chen Reviewed-by: Gu Zheng Signed-off-by: Lin Feng --- fs/aio.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 2512232..193e145 100644 --- a/fs/aio.c +++ b/fs/aio.c

[PATCH V3 2/2] fs/aio.c: use get_user_pages_non_movable() to pin ring pages when support memory hotremove

2013-02-21 Thread Lin Feng
Viro v...@zeniv.linux.org.uk Cc: Andrew Morton a...@linux-foundation.org Cc: Jeff Moyer jmo...@redhat.com Cc: Minchan Kim minc...@kernel.org Cc: Zach Brown z...@redhat.com Reviewed-by: Tang Chen tangc...@cn.fujitsu.com Reviewed-by: Gu Zheng guz.f...@cn.fujitsu.com Signed-off-by: Lin Feng linf

[PATCH V3 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-21 Thread Lin Feng
...@jp.fujitsu.com Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com Cc: Jeff Moyer jmo...@redhat.com Cc: Minchan Kim minc...@kernel.org Cc: Zach Brown z...@redhat.com Reviewed-by: Tang Chen tangc...@cn.fujitsu.com Reviewed-by: Gu Zheng guz.f...@cn.fujitsu.com Signed-off-by: Lin Feng linf

[PATCH V3 0/2] mm: hotplug: implement non-movable version of get_user_pages() to kill long-time pin pages

2013-02-21 Thread Lin Feng
-v2: Patch1: - Fix the negative return value bug pointed out by Andrew and other suggestions pointed out by Andrew and Jeff. Patch2: - Kill the CONFIG_MEMORY_HOTREMOVE dependence suggested by Jeff. --- Lin Feng (2): mm: hotplug: implement non-movable version of get_user_pages() called

Re: [PATCH V2 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-20 Thread Lin Feng
Hi Wanpeng, On 02/20/2013 07:37 PM, Wanpeng Li wrote: >> + * This function first calls get_user_pages() to get the candidate pages, and >> + * then check to ensure all pages are from non movable zone. Otherwise migrate > How about "Otherwise migrate candidate pages which have already

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-20 Thread Lin Feng
On 02/20/2013 07:31 PM, Simon Jeons wrote: > On 02/20/2013 06:23 PM, Lin Feng wrote: >> Hi Simon, >> >> On 02/20/2013 05:58 PM, Simon Jeons wrote: >>>> The other is that this almost certainly broken for transhuge page >>>> handling. gup returns the

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-20 Thread Lin Feng
Hi Simon, On 02/20/2013 05:58 PM, Simon Jeons wrote: > >> >> The other is that this almost certainly broken for transhuge page >> handling. gup returns the head and tail pages and ordinarily this is ok > > When need gup thp? in kvm case? gup just pins the wanted pages(for x86 is 4k size) of

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-20 Thread Lin Feng
Hi Simon, On 02/20/2013 05:58 PM, Simon Jeons wrote: The other is that this almost certainly broken for transhuge page handling. gup returns the head and tail pages and ordinarily this is ok When need gup thp? in kvm case? gup just pins the wanted pages(for x86 is 4k size) of user

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-20 Thread Lin Feng
On 02/20/2013 07:31 PM, Simon Jeons wrote: On 02/20/2013 06:23 PM, Lin Feng wrote: Hi Simon, On 02/20/2013 05:58 PM, Simon Jeons wrote: The other is that this almost certainly broken for transhuge page handling. gup returns the head and tail pages and ordinarily this is ok When need gup

Re: [PATCH V2 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-20 Thread Lin Feng
Hi Wanpeng, On 02/20/2013 07:37 PM, Wanpeng Li wrote: + * This function first calls get_user_pages() to get the candidate pages, and + * then check to ensure all pages are from non movable zone. Otherwise migrate How about Otherwise migrate candidate pages which have already been

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
Hi Wanpeng, On 02/20/2013 10:44 AM, Wanpeng Li wrote: >> Sorry, I misunderstood what "tail pages" means, stupid question, just ignore >> it. >> >flee... > According to the compound page, the first page of compound page is > called head page, other sub pages are called tail pages. > > Regards, >

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
On 02/19/2013 09:37 PM, Lin Feng wrote: >> > >> > The other is that this almost certainly broken for transhuge page >> > handling. gup returns the head and tail pages and ordinarily this is ok > I can't find codes doing such things :(, could you please point me

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
Hi Mel, On 02/05/2013 09:32 PM, Mel Gorman wrote: > On Tue, Feb 05, 2013 at 11:57:22AM +, Mel Gorman wrote: >> + migrate_pre_flag = 1; + } + + if (!isolate_lru_page(pages[i])) { +

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
Hi Mel, On 02/18/2013 11:17 PM, Mel Gorman wrote: >>> > > >>> > > >>> > > result. It's a little clumsy but the memory hot-remove failure message >>> > > could list what applications have pinned the pages that cannot be >>> > > removed >>> > > so the administrator has the option of force-killing

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
Hi Mel, On 02/18/2013 11:17 PM, Mel Gorman wrote: SNIP result. It's a little clumsy but the memory hot-remove failure message could list what applications have pinned the pages that cannot be removed so the administrator has the option of force-killing the application. It is

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
Hi Mel, On 02/05/2013 09:32 PM, Mel Gorman wrote: On Tue, Feb 05, 2013 at 11:57:22AM +, Mel Gorman wrote: + migrate_pre_flag = 1; + } + + if (!isolate_lru_page(pages[i])) { +

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
On 02/19/2013 09:37 PM, Lin Feng wrote: The other is that this almost certainly broken for transhuge page handling. gup returns the head and tail pages and ordinarily this is ok I can't find codes doing such things :(, could you please point me out? Sorry, I misunderstood what tail

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
Hi Wanpeng, On 02/20/2013 10:44 AM, Wanpeng Li wrote: Sorry, I misunderstood what tail pages means, stupid question, just ignore it. flee... According to the compound page, the first page of compound page is called head page, other sub pages are called tail pages. Regards, Wanpeng Li

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-18 Thread Lin Feng
Hi Mel, See below. On 02/05/2013 07:57 PM, Mel Gorman wrote: > On Mon, Feb 04, 2013 at 04:06:24PM -0800, Andrew Morton wrote: >> The ifdefs aren't really needed here and I encourage people to omit >> them. This keeps the header files looking neater and reduces the >> chances of things later

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-18 Thread Lin Feng
Hi Mel, See below. On 02/05/2013 07:57 PM, Mel Gorman wrote: On Mon, Feb 04, 2013 at 04:06:24PM -0800, Andrew Morton wrote: The ifdefs aren't really needed here and I encourage people to omit them. This keeps the header files looking neater and reduces the chances of things later breaking

[PATCH V2 2/2] fs/aio.c: use get_user_pages_non_movable() to pin ring pages when support memory hotremove

2013-02-05 Thread Lin Feng
Morton Cc: Jeff Moyer Cc: Minchan Kim Cc: Zach Brown Reviewed-by: Tang Chen Reviewed-by: Gu Zheng Signed-off-by: Lin Feng --- fs/aio.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 71f613c..f7a0d5c 100644 --- a/fs/aio.c +++ b/fs/aio.c

[PATCH V2 0/2] mm: hotplug: implement non-movable version of get_user_pages() to kill long-time pin pages

2013-02-05 Thread Lin Feng
Fix the negative return value bug pointed out by Andrew and other suggestions pointed out by Andrew and Jeff. Patch2: - Kill the CONFIG_MEMORY_HOTREMOVE dependence suggested by Jeff. --- Lin Feng (2): mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_mova
