Re: [PATCH v3] mm: memcontrol: Don't flood OOM messages with no eligible task.

2018-10-18 Thread Tetsuo Handa
On 2018/10/18 15:55, Michal Hocko wrote: > On Thu 18-10-18 11:46:50, Tetsuo Handa wrote: >> This is essentially a ratelimit approach, roughly equivalent with: >> >> static DEFINE_RATELIMIT_STATE(oom_no_victim_rs, 60 * HZ, 1); >> oom_no_victim_rs.flags |= RATELIMIT

Re: [PATCH v3] mm: memcontrol: Don't flood OOM messages with no eligible task.

2018-10-18 Thread Tetsuo Handa
On 2018/10/18 17:13, Sergey Senozhatsky wrote: > On (10/18/18 09:56), Michal Hocko wrote: >> On Thu 18-10-18 15:10:18, Sergey Senozhatsky wrote: >> [...] >>> and let's hear from MM people what they can suggest. >>> >>> Michal, Andrew, Johannes, any thoughts? >> >> I have already stated my position.

Re: [PATCH v3] mm: memcontrol: Don't flood OOM messages with no eligible task.

2018-10-18 Thread Tetsuo Handa
Petr Mladek wrote: > This looks very complex and I see even more problems: > > + You would need to update the rate limit intervals when > new console is attached. Note that the ratelimits might > get initialized early during boot. It might be solvable > but ... > > + You might nee

Re: [PATCH v3] mm: memcontrol: Don't flood OOM messages with no eligible task.

2018-10-19 Thread Tetsuo Handa
On 2018/10/19 8:54, Sergey Senozhatsky wrote: > On (10/18/18 20:58), Tetsuo Handa wrote: >>> >>> A knob might do. >>> As well as /proc/sys/kernel/printk tweaks, probably. One can even add >>> echo "a b c d" > /proc/sys/kernel/printk to .bash

[PATCH] mm,oom: Use timeout based back off.

2018-10-20 Thread Tetsuo Handa
not want to use timeout. I believe that timeout based back off is the only approach we can use for now. [1] https://marc.info/?i=20180910125513.311-1-mho...@kernel.org Signed-off-by: Tetsuo Handa --- include/linux/oom.h | 2 +- kernel/fork.c | 13 ++ mm/mmap.c | 4

Re: KASAN: use-after-free Read in task_is_descendant

2018-10-21 Thread Tetsuo Handa
On 2018/10/21 16:10, syzbot wrote: > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:188 > [inline] > BUG: KASAN: use-after-free in task_is_descendant.part.2+0x610/0x670 > security/yama/yama_lsm.c:295 > Read of size 8 at addr 8801c4666b20 by task syz-executor3/12722 >

Re: [RFC PATCH 1/2] mm, oom: marks all killed tasks as oom victims

2018-10-22 Thread Tetsuo Handa
Michal Hocko wrote: > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -898,6 +898,7 @@ static void __oom_kill_process(struct task_struct *victim) > if (unlikely(p->flags & PF_KTHREAD)) > continue; > do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, PIDTY

Re: [RFC v4 0/2] WhiteEgret LSM module

2018-10-22 Thread Tetsuo Handa
Steve Kemp wrote: > This is an interesting idea, and an evolution since the initial > approach which was submitted based upon xattr attributes. I still > find the idea of using attributes simpler to manage though, since > they're easy to add, and audit for. > > I suspect the biggest objection to

Re: [RFC PATCH 1/2] mm, oom: marks all killed tasks as oom victims

2018-10-22 Thread Tetsuo Handa
On 2018/10/22 17:48, Michal Hocko wrote: > On Mon 22-10-18 16:58:50, Tetsuo Handa wrote: >> Michal Hocko wrote: >>> --- a/mm/oom_kill.c >>> +++ b/mm/oom_kill.c >>> @@ -898,6 +898,7 @@ static void __oom_kill_process(struct task_struct >>> *victim) >

Re: KASAN: use-after-free Read in task_is_descendant

2018-10-22 Thread Tetsuo Handa
On 2018/10/22 18:54, Oleg Nesterov wrote: > On 10/21, Tetsuo Handa wrote: >> >> On 2018/10/21 16:10, syzbot wrote: >>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:188 >>> [inline] >>> BUG: KASAN: use-after-free in task_is_d

Re: [RFC PATCH 1/2] mm, oom: marks all killed tasks as oom victims

2018-10-22 Thread Tetsuo Handa
On 2018/10/22 19:43, Michal Hocko wrote: > On Mon 22-10-18 18:42:30, Tetsuo Handa wrote: >> On 2018/10/22 17:48, Michal Hocko wrote: >>> On Mon 22-10-18 16:58:50, Tetsuo Handa wrote: >>>> Michal Hocko wrote: >>>>> --- a/mm/oom_kill.c >>>>

Re: [RFC PATCH 2/2] memcg: do not report racy no-eligible OOM tasks

2018-10-22 Thread Tetsuo Handa
On 2018/10/22 16:13, Michal Hocko wrote: > From: Michal Hocko > > Tetsuo has reported [1] that a single process group memcg might easily > swamp the log with no-eligible oom victim reports due to race between > the memcg charge and oom_reaper > > Thread 1 Thread2

Re: [RFC PATCH 2/2] memcg: do not report racy no-eligible OOM tasks

2018-10-22 Thread Tetsuo Handa
On 2018/10/22 21:03, Michal Hocko wrote: > On Mon 22-10-18 20:45:17, Tetsuo Handa wrote: >> On 2018/10/22 16:13, Michal Hocko wrote: >>> From: Michal Hocko >>> >>> Tetsuo has reported [1] that a single process group memcg might easily >>> swamp the l

Re: [RFC PATCH 2/2] memcg: do not report racy no-eligible OOM tasks

2018-10-22 Thread Tetsuo Handa
On 2018/10/22 22:43, Michal Hocko wrote: > On Mon 22-10-18 22:20:36, Tetsuo Handa wrote: >> I mean: >> >> mm/memcontrol.c | 3 +- >> mm/oom_kill.c | 111 >> +--- >> 2 files changed, 12 insertions

Re: [RFC PATCH 2/2] memcg: do not report racy no-eligible OOM tasks

2018-10-22 Thread Tetsuo Handa
Michal Hocko wrote: > On Mon 22-10-18 20:45:17, Tetsuo Handa wrote: > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > > index e79cb59552d9..a9dfed29967b 100644 > > > --- a/mm/memcontrol.c > > > +++ b/mm/memcontrol.c > > > @@ -1380,10 +1380,2

Re: [PATCH v3] mm: memcontrol: Don't flood OOM messages with no eligible task.

2018-10-23 Thread Tetsuo Handa
On 2018/10/23 17:21, Petr Mladek wrote: > On Fri 2018-10-19 09:18:16, Tetsuo Handa wrote: >> I assumed we calculate the average dynamically, for the amount of >> messages printed by an OOM event is highly unstable (depends on >> hardware configuration such as number of n

Re: INFO: task hung in fsnotify_connector_destroy_workfn (2)

2018-09-15 Thread Tetsuo Handa
On 2018/09/15 11:33, syzbot wrote: > Hello, > > syzbot found the following crash on: > > HEAD commit:    11da3a7f84f1 Linux 4.19-rc3 > git tree:   upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=141ffbca40 > kernel config:  https://syzkaller.appspot.com/x/.config?x=99

Re: [PATCH 16/18] LSM: Allow arbitrary LSM ordering

2018-09-16 Thread Tetsuo Handa
On 2018/09/17 8:00, Kees Cook wrote: > On Sun, Sep 16, 2018 at 11:49 AM, Casey Schaufler > wrote: >> One solution is to leave security= as is, not affecting "minor" >> modules and only allowing specification of one major module, and adding > > I would much prefer this, yes. > > A question remain

Re: [patch -mm] mm, oom: remove oom_lock from exit_mmap

2018-07-16 Thread Tetsuo Handa
On 2018/07/16 16:44, Michal Hocko wrote: >> If setting MMF_OOM_SKIP is guarded by oom_lock, we can enforce >> last second allocation attempt like below. >> >> CPU 0 CPU 1 >> >> mutex_trylock(&oom_lock) in __alloc_pages_may_oom() succeeds. >> get_page_from_

Re: [PATCH] x86: Avoid pr_cont() in show_opcodes()

2018-07-16 Thread Tetsuo Handa
Ingo, is this patch acceptable? On 2018/07/07 22:54, Tetsuo Handa wrote: >From 61752cef56fad2a910f6bfd277e1b9b028aeab43 Mon Sep 17 00:00:00 2001 > From: Tetsuo Handa > Date: Sat, 7 Jul 2018 22:45:30 +0900 > Subject: [PATCH v2] x86: Avoid pr_cont() in show_opcodes() > > Since

Re: [PATCH v13 0/7] cgroup-aware OOM killer

2018-07-16 Thread Tetsuo Handa
Roman Gushchin wrote: > On Tue, Jul 17, 2018 at 06:13:47AM +0900, Tetsuo Handa wrote: > > No response from Roman and David... > > > > Andrew, will you once drop Roman's cgroup-aware OOM killer and David's > > patches? > > Roman's series has a bug

[PATCH] n_tty: Protect tty->disc_data using refcount.

2018-07-17 Thread Tetsuo Handa
>From e8360c16fd07985686bcfb388364103f35a6523a Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Tue, 17 Jul 2018 16:43:32 +0900 Subject: [PATCH] n_tty: Protect tty->disc_data using refcount. syzbot is reporting NULL pointer dereference at n_tty_set_termios() [1]. This is because

Re: [PATCH] x86: Avoid pr_cont() in show_opcodes()

2018-07-17 Thread Tetsuo Handa
well. >From 96d9d4d135994a081e54d33d23f5007c53d9b5dd Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Tue, 17 Jul 2018 22:47:11 +0900 Subject: [PATCH v3] x86: Avoid pr_cont() in show_opcodes() Since syzbot is confused by concurrent printk() messages [1], this patch changes show_opcodes() to use %*ph for

Re: [PATCH] x86: Avoid pr_cont() in show_opcodes()

2018-07-17 Thread Tetsuo Handa
Corrected Signed-off-by: addresses. >From 96d9d4d135994a081e54d33d23f5007c53d9b5dd Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Tue, 17 Jul 2018 22:47:11 +0900 Subject: [PATCH v4] x86: Avoid pr_cont() in show_opcodes() Since syzbot is confused by concurrent printk() messages [1], t

Re: [patch v3] mm, oom: fix unnecessary killing of additional processes

2018-07-17 Thread Tetsuo Handa
This patch should be dropped from linux-next because it is incorrectly using MMF_UNSTABLE. On 2018/06/22 6:35, David Rientjes wrote: > diff --git a/mm/mmap.c b/mm/mmap.c > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -3059,25 +3059,28 @@ void exit_mmap(struct mm_struct *mm) > if (unlikely(mm_is_oo

[PATCH v5] x86: Avoid pr_cont() in show_opcodes().

2018-07-18 Thread Tetsuo Handa
x=139d342c40 Signed-off-by: Tetsuo Handa Signed-off-by: Rasmus Villemoes Cc: Borislav Petkov Cc: Thomas Gleixner Cc: Peter Zijlstra Cc: Josh Poimboeuf Cc: Linus Torvalds Cc: Andy Lutomirski --- arch/x86/kernel/dumpstack.c | 29 + 1 file changed, 9 insert

Re: INFO: task hung in generic_file_write_iter

2018-07-18 Thread Tetsuo Handa
On 2018/07/18 17:58, syzbot wrote: > mmap: syz-executor7 (10902) uses deprecated remap_file_pages() syscall. See > Documentation/vm/remap_file_pages.rst. There are many reports which are stalling inside __getblk_gfp(). And there is horrible comment for __getblk_gfp(): /* * __getblk_gfp() wi

Re: INFO: task hung in grab_super

2018-07-18 Thread Tetsuo Handa
Dmitry, this is yet another example of stalling inside __bread_gfp(). Can you find all reports where NMI backtrace contains __bread_gfp ? I need to wget all reports if I try to do that on my side. If you can locally grep on your side, it will be nice. On 2018/07/18 19:38, syzbot wrote: > CPU: 1

Re: INFO: task hung in grab_super

2018-07-18 Thread Tetsuo Handa
On 2018/07/18 20:41, Dmitry Vyukov wrote: > This seems to be related to 9p. After rerunning the log I got: > > root@syzkaller:~# ps afxu | grep syz > root 18253 0.0 0.0 0 0 ttyS0Zl 10:16 0:00 \_ > [syz-executor] > root@syzkaller:~# cat /proc/18253/task/*/stack > [<0>] p9_c

Re: INFO: task hung in grab_super

2018-07-18 Thread Tetsuo Handa
On 2018/07/18 22:04, Dmitry Vyukov wrote: > On Wed, Jul 18, 2018 at 2:53 PM, Tetsuo Handa > wrote: >> On 2018/07/18 20:41, Dmitry Vyukov wrote: >>> This seems to be related to 9p. After rerunning the log I got: >>> >>> root@syzkaller:~# ps afxu | grep syz

Re: INFO: task hung in grab_super

2018-07-18 Thread Tetsuo Handa
On 2018/07/18 23:11, Dmitry Vyukov wrote: > On Wed, Jul 18, 2018 at 3:35 PM, Tetsuo Handa > wrote: >>>>> This seems to be related to 9p. After rerunning the log I got: >>>>> >>>>> root@syzkaller:~# ps afxu | grep syz >>>>>

Re: [syzbot] [net?] [virt?] upstream test error: KMSAN: uninit-value in receive_buf

2024-06-17 Thread Tetsuo Handa
Bisection reached commit f9dac92ba908 ("virtio_ring: enable premapped mode whatever use_dma_api"). On 2024/05/26 1:12, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit:56fb6f92854f Merge tag 'drm-next-2024-05-25' of https://gi.. > git tree: upstream > con

Re: [syzbot] [net?] [virt?] upstream test error: KMSAN: uninit-value in receive_buf

2024-06-17 Thread Tetsuo Handa
On 2024/06/18 10:27, Xuan Zhuo wrote: > Maybe this patch can fix this issue: > > > http://lore.kernel.org/all/20240606111345.93600-1-xuanz...@linux.alibaba.com Yes, thank you. #syz fix: virtio_ring: fix KMSAN error for premapped mode

[PATCH (repost)] sched/core: defer printk() while rq lock is held

2024-07-19 Thread Tetsuo Handa
call printk(), guard the whole section between raw_spin_rq_{lock,lock_nested,trylock}() and raw_spin_rq_unlock() using printk_deferred_{enter,exit}(). Reported-by: syzbot Closes: https://syzkaller.appspot.com/bug?extid=18cfb7f63482af8641df Signed-off-by: Tetsuo Handa --- This is a repost of

Re: possible deadlock in start_this_handle (2)

2021-02-15 Thread Tetsuo Handa
On 2021/02/15 21:45, Jan Kara wrote: > On Sat 13-02-21 23:26:37, Tetsuo Handa wrote: >> Excuse me, but it seems to me that nothing prevents >> ext4_xattr_set_handle() from reaching ext4_xattr_inode_lookup_create() >> without memalloc_nofs_save() when hitting ext4_get_nojourna

Re: [PATCH 05/13] tty: remove tty_warn()

2021-04-08 Thread Tetsuo Handa
On 2021/04/08 21:51, Greg Kroah-Hartman wrote: > Remove users of tty_warn() and replace them with calls to dev_warn() > which provides more information about the tty that has the error and > uses the standard formatting logic. Ouch. This series would be good for clean up, but this series might be

error from checkpatch.pl version 0.10

2007-09-20 Thread Tetsuo Handa
Hello. I checked my patch using checkpatch.pl version 0.10 and I got the following error. ERROR: need consistent spacing around '*' (ctx:WxV) #2334: FILE: security/tomoyo/common.c:2306: +static unsigned int tmy_poll(struct file *file, poll_table *wait)

Re: error from checkpatch.pl version 0.10

2007-09-20 Thread Tetsuo Handa
Hello. Satyam Sharma wrote: > Looks like a checkpatch.pl bug to me -- that was nothing to warn about. I see. I'll wait for next version. > struct poll_table { > poll_queue_proc qproc; > }; poll_table is defined in include/linux/poll.h . To change this, we have to do "sed

Re: [TOMOYO 14/15] Conditional permission support.

2007-08-25 Thread Tetsuo Handa
Hello. Pavel Machek wrote: > What is that? Language parser in kernel? Yes. This is a policy parser in kernel. TOMOYO Linux' policy is passed from/to the kernel as a plain text (i.e. ASCII printable) file via /proc/tomoyo interface. For example, to add a permission to allow /usr/sbin/sshd to exe

Re: [PATCH] TaskTracker : Simplified thread information tracker.

2015-01-04 Thread Tetsuo Handa
Hello. Richard Guy Briggs wrote: > > Richard Guy Briggs wrote: > > > On 14/09/28, Tetsuo Handa wrote: > > > > (Q2) Does auxiliary record work with only type=SYSCALL case? > > > > > > Auxiliary records don't work with AUDIT_LOGIN because that re

Re: [PATCH] sched/fair: Fix RCU stall upon ENOMEM at sched_create_group().

2015-01-06 Thread Tetsuo Handa
Peter Zijlstra wrote: > On Thu, Dec 25, 2014 at 10:10:45PM +0900, Tetsuo Handa wrote: > > >From 052595ab1a1d1c5668d9de61395c9cc17694597e Mon Sep 17 00:00:00 2001 > > From: Tetsuo Handa > > Date: Thu, 25 Dec 2014 15:51:21 +0900 > > Subject: [PATCH] sched/fair

Re: [PATCH] sched/fair: Fix RCU stall upon ENOMEM at sched_create_group().

2015-01-06 Thread Tetsuo Handa
Peter Zijlstra wrote: > On Thu, Dec 25, 2014 at 10:10:45PM +0900, Tetsuo Handa wrote: > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index ef2b104..586ee15 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -7756,8 +7756,12 @@

[PATCH] sched: Fix potential call to __ffs(0) in sched_show_task()

2014-12-05 Thread Tetsuo Handa
>From 8579fc072fda603c4ae472cc3f0390d61556ac01 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Fri, 5 Dec 2014 21:22:22 +0900 Subject: [PATCH] sched: Fix potential call to __ffs(0) in sched_show_task() "struct task_struct"->state is "volatile long" and __ffs() warns

[RequestForTesters] SystemTap-based memory allocation failure injection

2014-12-25 Thread Tetsuo Handa
Since it has been an unwritten rule that GFP_KERNEL allocations for low-order (<=PAGE_ALLOC_COSTLY_ORDER) never fail unless chosen by the OOM killer, there are a lot of code where allocation failure error paths are hardly tested. This is an update of memory allocation failure injection tester whic

[PATCH] sched/fair: Fix RCU stall upon ENOMEM at sched_create_group().

2014-12-25 Thread Tetsuo Handa
>From 052595ab1a1d1c5668d9de61395c9cc17694597e Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Thu, 25 Dec 2014 15:51:21 +0900 Subject: [PATCH] sched/fair: Fix RCU stall upon ENOMEM at sched_create_group(). When alloc_fair_sched_group() in sched_create_group() failed, free_sched_group()

Re: [PATCH] TaskTracker : Simplified thread information tracker.

2015-01-11 Thread Tetsuo Handa
unctions for date formatting, I split it as a new function. If it is acceptable, I'd like to make that function public and replace tomoyo_convert_time() in security/tomoyo/util.c with that function. Regards. >From 50d59b5640a7501b8d5f843fb57283fcb62b1118 Mon Sep 17 0

Re: [LKP] [mm] cc87317726f: WARNING: CPU: 0 PID: 1 at drivers/iommu/io-pgtable-arm.c:413 __arm_lpae_unmap+0x341/0x380()

2015-03-20 Thread Tetsuo Handa
Huang Ying wrote: > > > BTW: the test is run on 32 bit system. > > > > That sounds like the cause of your problem. The system might be out of > > address space available for the kernel (only 1GB if x86_32). You should > > try running tests on 64 bit systems. > > We run test on 32 bit and 64 bit s

Re: [LKP] [mm] cc87317726f: WARNING: CPU: 0 PID: 1 at drivers/iommu/io-pgtable-arm.c:413 __arm_lpae_unmap+0x341/0x380()

2015-03-20 Thread Tetsuo Handa
Michal Hocko wrote: > On Fri 20-03-15 22:34:21, Tetsuo Handa wrote: > > Huang Ying wrote: > > > > > BTW: the test is run on 32 bit system. > > > > > > > > That sounds like the cause of your problem. The system might be out of > > > >

Re: [LKP] [mm] cc87317726f: WARNING: CPU: 0 PID: 1 at drivers/iommu/io-pgtable-arm.c:413 __arm_lpae_unmap+0x341/0x380()

2015-03-20 Thread Tetsuo Handa
Michal Hocko wrote: > On Fri 20-03-15 23:02:09, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > On Fri 20-03-15 22:34:21, Tetsuo Handa wrote: > > > > Huang Ying wrote: > > > > > > > BTW: the test is run on 32 bit system. > > > >

Re: [PATCH 0/2] Move away from non-failing small allocations

2015-04-02 Thread Tetsuo Handa
Tetsuo Handa wrote: > Michal Hocko wrote: > > We are seeing issues with the fs code now because the test cases which > > led to the current discussion exercise FS code. The code which does > > lock(); kmalloc(GFP_KERNEL) is not reduced there though. I am pretty sure

Re: [PATCH 0/9] mm: improve OOM mechanism v2

2015-04-28 Thread Tetsuo Handa
Johannes Weiner wrote: > There is a possible deadlock scenario between the page allocator and > the OOM killer. Most allocations currently retry forever inside the > page allocator, but when the OOM killer is invoked the chosen victim > might try taking locks held by the allocating task. This ser

Re: [PATCH 0/9] mm: improve OOM mechanism v2

2015-04-28 Thread Tetsuo Handa
Michal Hocko wrote: > On Tue 28-04-15 19:34:47, Tetsuo Handa wrote: > [...] > > [PATCH 8/9] makes the speed of allocating __GFP_FS pages extremely slow (5 > > seconds / page) because out_of_memory() serialized by the oom_lock sleeps > > for > > 5 seconds before retu

Re: [PATCH 6/9] mm: oom_kill: simplify OOM killer locking

2015-04-28 Thread Tetsuo Handa
David Rientjes wrote: > It's not vital and somewhat unrelated to your patch, but if we can't grab > the mutex with the trylock in __alloc_pages_may_oom() then I think it > would be more correct to do schedule_timeout_killable() rather than > uninterruptible. I just mention it if you happen to g

Re: [PATCH 0/9] mm: improve OOM mechanism v2

2015-04-29 Thread Tetsuo Handa
Michal Hocko wrote: > On Wed 29-04-15 08:55:06, Johannes Weiner wrote: > > What we can do to mitigate this is tie the timeout to the setting of > > TIF_MEMDIE so that the wait is not 5s from the point of calling > > out_of_memory() but from the point of where TIF_MEMDIE was set. > > Subsequent allo

Re: [PATCH 0/9] mm: improve OOM mechanism v2

2015-04-30 Thread Tetsuo Handa
Michal Hocko wrote: > I mean we should eventually fail all the allocation types but GFP_NOFS > is coming from _carefully_ handled code paths which is an easier starting > point than a random code path in the kernel/drivers. So can we finally > move at least in this direction? I agree that all the

Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tetsuo Handa
Tejun Heo wrote: > * Implement netconsole retransmission support. Matching rx socket on > the source port is automatically created for extended targets and > the log receiver can request retransmission by sending response > packets. This is completely decoupled from the main write path and >

Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tetsuo Handa
Tejun Heo wrote: > Hello, David. > > On Fri, Apr 17, 2015 at 01:17:12PM -0400, David Miller wrote: > > If userland cannot run properly, it is almost certain that neither will > > your complex reliability layer logic. > > * The bulk of patches are to pipe extended log messages to console > drive

Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tetsuo Handa
Tejun Heo wrote: > > printk() cannot wait for ack. Trying to wait for ack would break something. > > How can you transmit subsequent kernel messages which failed to enqueue > > due to waiting for ack for previous kernel messages? > > Well, if log buffer overflows and the messages aren't at the log

Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tetsuo Handa
Tejun Heo wrote: > On Sat, Apr 18, 2015 at 03:03:46AM +0900, Tetsuo Handa wrote: > > packet will be sufficient for finding out whether the packets were lost > > and/or > > reordered in flight. > > > > printk("Hello"); > >=> netc

Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-18 Thread Tetsuo Handa
Tejun Heo wrote: > > If we can assume that scheduler is working, adding a kernel thread that > > does > > > > while (1) { > > read messages with metadata from /dev/kmsg > > send them using UDP network > > } > > > > might be easier than modifying netconsole module. > > But, I mean

Re: [patch 00/12] mm: page_alloc: improve OOM mechanism and policy

2015-04-11 Thread Tetsuo Handa
Johannes Weiner wrote: > The argument here was always that NOFS allocations are very limited in > their reclaim powers and will trigger OOM prematurely. However, the > way we limit dirty memory these days forces most cache to be clean at > all times, and direct reclaim in general hasn't been allow

Re: [RFC] panic_on_oom_timeout

2015-06-11 Thread Tetsuo Handa
Michal Hocko wrote: > > > The feature is implemented as a delayed work which is scheduled when > > > the OOM condition is declared for the first time (oom_victims is still > > > zero) in out_of_memory and it is canceled in exit_oom_victim after > > > the oom_victims count drops down to zero. For th

Re: [RFC] panic_on_oom_timeout

2015-06-11 Thread Tetsuo Handa
Michal Hocko wrote: > On Thu 11-06-15 22:12:40, Tetsuo Handa wrote: > > Michal Hocko wrote: > [...] > > > > The moom_work used by SysRq-f sometimes cannot be executed > > > > because some work which is processed before the moom_work is processed > > >

Re: [RFC] panic_on_oom_timeout

2015-06-12 Thread Tetsuo Handa
to use panic_on_oom > 0 by setting adequate values to these timeouts. >From e59b64683827151a35257384352c70bce61babdd Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Fri, 12 Jun 2015 23:56:18 +0900 Subject: [RFC] oom: im

Re: [RFC] panic_on_oom_timeout

2015-06-16 Thread Tetsuo Handa
Michal Hocko wrote: > > This patch implements system_memdie_panic_secs sysctl which configures > > a maximum timeout for the OOM killer to resolve the OOM situation. > > If the system is still under OOM (i.e. the OOM victim cannot release > > memory) after the timeout expires, it will panic the sys

Re: [RFC] panic_on_oom_timeout

2015-06-17 Thread Tetsuo Handa
Michal Hocko wrote a few minutes ago: > Subject: [RFC -v2] panic_on_oom_timeout Oops, we raced... Michal Hocko wrote: > On Tue 16-06-15 22:14:28, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > > This patch implements system_memdie_panic_secs sysctl which configures

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Thread Tetsuo Handa
Michal Hocko wrote: > Hi, > I was thinking about this and I am more and more convinced that we > shouldn't care about panic_on_oom=2 configuration for now and go with > the simplest solution first. I have revisited my original patch and > replaced delayed work by a timer based on the feedback from

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Thread Tetsuo Handa
Michal Hocko wrote: > > > + if (sysctl_panic_on_oom_timeout) { > > > + if (sysctl_panic_on_oom > 1) { > > > + pr_warn("panic_on_oom_timeout is ignored for > > > panic_on_oom=2\n"); > > > + } else { > > > + /* > > > + * Only schedule

Re: [PATCH] oom: always panic on OOM when panic_on_oom is configured

2015-06-05 Thread Tetsuo Handa
Michal Hocko wrote: > > > Let's move check_panic_on_oom up before the current task is > > > checked so that the knob value is . Do the same for the memcg in > > > mem_cgroup_out_of_memory. > > > > > > Reported-by: Tetsuo Handa > > >

Re: [PATCH] oom: always panic on OOM when panic_on_oom is configured

2015-06-08 Thread Tetsuo Handa
Michal Hocko wrote: > On Sat 06-06-15 15:51:35, Tetsuo Handa wrote: > > For me, !__GFP_FS allocations not calling out_of_memory() _forever_ is a > > violation of the user policy. > > Yes, the current behavior of GFP_NOFS is highly suboptimal, but this has > _nothing_ what

Re: [RFC 0/2] mapping_gfp_mask from the page fault path

2015-06-03 Thread Tetsuo Handa
Andrew Morton wrote: > On Mon, 1 Jun 2015 15:00:01 +0200 Michal Hocko wrote: > > > I somehow forgot about these patches. The previous version was > > posted here: http://marc.info/?l=linux-mm&m=142668784122763&w=2. The > > first attempt was broken but even when fixed it seems like ignoring > > m

Re: oom: How to handle !__GFP_FS exception?

2015-06-09 Thread Tetsuo Handa
David Rientjes wrote: > On Sat, 6 Jun 2015, Tetsuo Handa wrote: > > > For me, !__GFP_FS allocations not calling out_of_memory() _forever_ is a > > violation of the user policy. > > > > I agree that we need work in this area to prevent livelocks that rely on >

Re: [RFC] panic_on_oom_timeout

2015-06-10 Thread Tetsuo Handa
Michal Hocko wrote: > Hi, > during the last iteration of the timeout based oom killer discussion > (http://marc.info/?l=linux-mm&m=143351457601723) I've proposed to > introduce panic_on_oom_timeout as an extension to panic_on_oom rather > than oom timeout which would allow OOM killer to select anot

Re: [RFC -v2] panic_on_oom_timeout

2015-06-19 Thread Tetsuo Handa
Michal Hocko wrote: > On Wed 17-06-15 22:59:54, Tetsuo Handa wrote: > > Michal Hocko wrote: > [...] > > > But you have a point that we could have > > > - constrained OOM which elevates oom_victims > > > - global OOM killer strikes but wouldn't st

Re: [RFC -v2] panic_on_oom_timeout

2015-06-19 Thread Tetsuo Handa
Michal Hocko wrote: > Yes I was thinking about this as well because the primary assumption > of the OOM killer is that the victim will release some memory. And it > doesn't matter whether the OOM killer was constrained or the global > one. So the above looks good at first sight, I am just afraid it

Re: [RFC -v2] panic_on_oom_timeout

2015-06-20 Thread Tetsuo Handa
Tetsuo Handa wrote: > One case is that the system can not panic of threads are unable to call > out_of_memory() for some reason. ^ if > Well, if without analysis purpose, > > if (time_after(jiffies, oom_start + sysctl_panic_on_o

Re: INFO: task hung in blk_queue_enter

2018-05-22 Thread Tetsuo Handa
I checked counter values using debug printk() patch shown below, and found that q->q_usage_counter.count == 1 when this deadlock occurs. Since sum of percpu_count did not change after percpu_ref_kill(), this is not a race condition while folding percpu counter values into atomic counter value. Tha

Re: [PATCH] printk: inject caller information into the body of message

2018-05-23 Thread Tetsuo Handa
Sergey Senozhatsky wrote: > On (05/17/18 20:21), Sergey Senozhatsky wrote: > > Dunno... > > For instance, can we store context tracking info as a extended record > > data? We have that dict/dict_len thing. So may we can store tracking > > info there? Extended records will appear on the serial conso

Re: [lkp-robot] [printk] c162d5b433: BUG:KASAN:use-after-scope_in_c

2018-03-01 Thread Tetsuo Handa
> Forwarded by penguin-ker...@i-love.sakura.ne.jp --- Original Message --- From:Tetsuo Handa To: Petr Mladek Cc: kernel test robot , Cong Wang , Dave Hansen , Johannes Weiner , Mel Gorman , Michal Hocko , Vlastimil Babka , Peter Zijlstra ,

Re: KASAN: use-after-free Read in alloc_pid

2018-04-03 Thread Tetsuo Handa
On 2018/04/03 12:10, Eric Biggers wrote: > On Mon, Apr 02, 2018 at 06:00:57PM -0500, Eric W. Biederman wrote: >> syzbot writes: >> >>> Hello, >>> >>> syzbot hit the following crash on upstream commit >>> 9dd2326890d89a5179967c947dab2bab34d7ddee (Fri Mar 30 17:29:47 2018 +) >>> Merge tag 'ceph-

Re: WARNING in kill_block_super

2018-04-04 Thread Tetsuo Handa
Al and Michal, are you OK with this patch? >From bbc0d00935ebcb7e287403bae545fae9340830d9 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Wed, 4 Apr 2018 12:19:42 +0900 Subject: [PATCH] mm,vmscan: Allow preallocating memory for register_shrinker(). syzbot is catching so many bugs trigge

Re: INFO: rcu detected stall in bitmap_parselist

2018-04-04 Thread Tetsuo Handa
Yury, are you OK with this patch? >From 7f21827cdfe9780b4949b22bcd19efa721b463d2 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Wed, 4 Apr 2018 21:12:10 +0900 Subject: [PATCH] lib/bitmap: Rewrite __bitmap_parselist(). syzbot is catching stalls at __bitmap_parselist() [1]. The trigger

Re: INFO: task hung in lo_ioctl

2018-04-04 Thread Tetsuo Handa
This seems to be an AB-BA deadlock where the lockdep cannot report (due to use of nested lock?). When PID=6540 was (reported as hung) at mutex_lock_nested(&lo->lo_ctl_mutex, 1) (id=43ca8836), it was already holding down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING) (id=566d4c39). But when PI

Re: INFO: rcu detected stall in bitmap_parselist

2018-04-04 Thread Tetsuo Handa
Yury Norov wrote: > Hi Tetsuo, > > Thanks for the patch. > > On Wed, Apr 04, 2018 at 09:21:43PM +0900, Tetsuo Handa wrote: > > Yury, are you OK with this patch? > > > > > > >From 7f21827cdfe9780b4949b22bcd19efa721b463d2 Mon Sep 17 00:00:00 2001 >

[PATCH v2] locking/hung_task: Show all hung tasks before panic

2018-04-04 Thread Tetsuo Handa
sb->s_umount); */ When reporting an AB-BA deadlock like shown above, it would be nice if trace of PID=6541 is printed as well as trace of PID=6540 before calling panic(). Showing hung tasks up to /proc/sys/kernel/hung_task_warnings could delay calling pa

Re: KASAN: user-memory-access Write in n_tty_set_termios

2018-04-05 Thread Tetsuo Handa
Hello. I manually simplified the reproducer. Since the bug is timing dependent, this reproducer might fail to reproduce the bug. Anyway, I guess that there is a race condition between vfree(ldata); tty->disc_data = NULL; at n_tty_close() by something (ioctl(TIOCVHANGUP) ?) and

Re: kernel panic: n_tty: init_tty

2018-04-05 Thread Tetsuo Handa
ult bug. Why not to fix? >From 7051b364605c65d4266a71c52e5140ca5dbb4ea9 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Thu, 5 Apr 2018 09:42:43 +0900 Subject: [PATCH] tty: Don't call panic() at tty_ldisc_init() syzbot is reporting kernel panic [1] triggered by memory allocation failur

Re: WARNING in tty_set_ldisc

2018-04-05 Thread Tetsuo Handa
ster >> compiler: gcc (GCC) 7.1.1 20170620 >> .config is attached >> Raw console output is attached. > > Again, what am I supposed to do with this? > > thanks, > > greg k-h > >From 023cf07f799d0efd160ec1c1617d5b8902577765 Mon Sep 17 00:00:00 2001 From: Tet

Re: KASAN: global-out-of-bounds Write in string

2018-04-05 Thread Tetsuo Handa
On 2018/04/04 2:01, syzbot wrote: > BUG: KASAN: global-out-of-bounds in string+0x1cb/0x200 lib/vsprintf.c:598 > Write of size 1 at addr 89e166a0 by task syz-executor0/4522 > > CPU: 1 PID: 4522 Comm: syz-executor0 Not tainted 4.16.0+ #12 > Hardware name: Google Google Compute Engine/Google

Re: INFO: task hung in __blkdev_get

2018-04-05 Thread Tetsuo Handa
I tried the reproducer in my environment. The reproducer can trivially reproduce a hung up. If the bug I'm observing is what the syzbot is reporting (I ran the reproducer using init= kernel command line option), the reason __blkdev_get() is blocked waiting for bdev->bd_mutex is that an exiting thre

Re: [PATCH v2] lockdep: Show address of "struct lockdep_map" at print_lock().

2018-03-29 Thread Tetsuo Handa
>From 91c081c4c5f6a99402542951e7de661c38f928ab Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Tue, 27 Mar 2018 19:38:33 +0900 Subject: [PATCH v2] lockdep: Show address of "struct lockdep_map" at print_lock(). Since "struct lockdep_map" is embedded into lock obj

Re: general protection fault in fuse_ctl_remove_conn

2018-04-27 Thread Tetsuo Handa
>From 9f41081f8bd6762a6f629e5e23e6d07a62bba69c Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Sat, 28 Apr 2018 11:24:09 +0900 Subject: [PATCH] fuse: don't keep inode-less dentry at fuse_ctl_add_dentry(). syzbot is reporting NULL pointer dereference at fuse_ctl_remove_conn() [1].

Re: WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected

2018-04-27 Thread Tetsuo Handa
OK. Patch was sent to linux.git as 6c1e851c4edc13a4. #syz fix: random: fix possible sleeping allocation from irq context

Re: INFO: rcu detected stall in blkdev_ioctl

2018-04-28 Thread Tetsuo Handa
Like I noted in a patch at https://groups.google.com/d/msg/syzkaller-bugs/2Rw8-OM6IbM/PzdobV8kAgAJ loop module is not thread safe. Can we use more global lock?

Re: WARNING: kmalloc bug in bfs_fill_super

2018-05-01 Thread Tetsuo Handa
>From 247cae4da0490c2e285e0a99e630ef963fabb6d5 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Tue, 1 May 2018 14:15:19 +0900 Subject: [PATCH] bfs: add sanity check at bfs_fill_super(). syzbot is reporting too large memory allocation at bfs_fill_super() [1]. Since file system image

Re: KASAN: use-after-free Read in fuse_kill_sb_blk

2018-05-01 Thread Tetsuo Handa
>From 606d54cd24b5b00e7a7e3597aabbe89719defc56 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Tue, 1 May 2018 13:12:14 +0900 Subject: [PATCH] fuse: don't keep dead fuse_conn at fuse_fill_super(). syzbot is reporting use-after-free at fuse_kill_sb_blk() [1]. Since sb->s_fs_info f

Re: kernel BUG at include/linux/mm.h:LINE!

2018-05-01 Thread Tetsuo Handa
>From d54b2acf63191eba3d5052bf34fe6d26e3580ac2 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Tue, 1 May 2018 15:36:52 +0900 Subject: [PATCH] x86/kexec: avoid double free_page() upon do_kexec_load() failure. syzbot is reporting crashes after memory allocation failure inside do_kexec_l

Re: general protection fault in n_tty_set_termios

2018-05-01 Thread Tetsuo Handa
This will be essentially same with below one. ioctl(TIOCVHANGUP) versus ioctl(TCSETS) can race. #syz dup: KASAN: user-memory-access Write in n_tty_set_termios

Re: INFO: task hung in wb_shutdown (2)

2018-05-01 Thread Tetsuo Handa
Tejun, Jan, Jens, Can you review this patch? syzbot has hit this bug for nearly 4000 times but is still unable to find a reproducer. Therefore, the only way to test would be to apply this patch upstream and test whether the problem is solved. On 2018/04/24 21:19, Tetsuo Handa wrote: >

Re: INFO: task hung in wb_shutdown (2)

2018-05-01 Thread Tetsuo Handa
>From 1b90d7f71d60e743c69cdff3ba41edd1f9f86f93 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Wed, 2 May 2018 07:07:55 +0900 Subject: [PATCH v2] bdi: wake up concurrent wb_shutdown() callers. syzbot is reporting hung tasks at wait_on_bit(WB_shutting_down) in wb_shutdown() [1]. This seems
