On Thu, Oct 12, 2017 at 05:28:13PM +0800, Fengguang Wu wrote:
> Please try this:
>
> rm openwrt-trinity-i386.cgz
> wget https://github.com/fengguang/reproduce-kernel-bug/raw/master/openwrt/openwrt-trinity-i386.cgz
Yep, that makes it go. Thanks!
[ 35.721719] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
Well, I got the same result. The script and initrd image match my local
version. I'll dig into what goes wrong.
And it did d
On Thu, Oct 12, 2017 at 10:47:25AM +0200, Peter Zijlstra wrote:
On Tue, Oct 03, 2017 at 10:06:34PM +0800, Fengguang Wu wrote:
#!/bin/bash
kernel=$1
initrd=openwrt-trinity-i386.cgz
wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
kvm=(
qem
On Tue, Oct 03, 2017 at 10:06:34PM +0800, Fengguang Wu wrote:
> #!/bin/bash
>
> kernel=$1
> initrd=openwrt-trinity-i386.cgz
>
> wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
>
> kvm=(
> qemu-system-x86_64
> -enable-kvm
> -cpu
On Wed, Oct 11, 2017 at 09:56:05AM +0900, Byungchul Park wrote:
> Thank you very much for explaining it in detail.
>
> But let's shift a viewpoint. Precisely, I didn't want to work on locks
> but *waiters* because dependencies causing deadlocks only can be created
> by waiters - nevertheless I hav
On Tue, Oct 10, 2017 at 09:56:26AM -0700, Linus Torvalds wrote:
> On Tue, Oct 10, 2017 at 9:22 AM, Linus Torvalds wrote:
> >
> > I really would like to see the sites that do cross-thread lock/unlock
> > pairs themselves be annotated.
> >
> > So when you lock in one thread, and then unlock in ano
On Tue, Oct 10, 2017 at 08:14:09PM +0200, Peter Zijlstra wrote:
> On Tue, Oct 10, 2017 at 09:56:26AM -0700, Linus Torvalds wrote:
>
> > So I think the best model would be something like this:
> >
> > - T1:
> >     mutex_lock(&lock)
> >     ...
> >     mutex_transfer(&lock)
> >
> >
On Wed, Oct 11, 2017 at 09:56:05AM +0900, Byungchul Park wrote:
> Thank you very much for explaining it in detail.
>
> But let's shift a viewpoint. Precisely, I didn't want to work on locks
> but *waiters* because dependencies causing deadlocks only can be created
> by waiters - nevertheless I hav
On Tue, Oct 10, 2017 at 09:22:26AM -0700, Linus Torvalds wrote:
> On Mon, Oct 9, 2017 at 10:48 PM, Byungchul Park wrote:
> >>
> >> The place where the release is done should simply be special.
> >>
> >> Because we should *not* encourage the whole "acquire by one context,
> >> release by another
On Tue, Oct 10, 2017 at 11:14 AM, Peter Zijlstra wrote:
>
> Ah, but that's not at all what cross-release is about. Nobody really
> does wonky ownership transfer of mutexes like that (although there might
> be someone doing something with semaphores, I didn't check). It's to
> allow detecting this d
On Tue, Oct 10, 2017 at 09:56:26AM -0700, Linus Torvalds wrote:
> So I think the best model would be something like this:
>
> - T1:
>     mutex_lock(&lock)
>     ...
>     mutex_transfer(&lock)
>
> - T2:
>     mutex_receive(&lock);
>     ...
>     mutex_unlock(&lock);
>
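
For readers following the thread, the hand-off model quoted above can be sketched as a small state machine. This is only an illustrative user-space model, not kernel code and not anything posted in the thread: the names xfer_lock, xfer_transfer, xfer_receive and the integer context ids are hypothetical stand-ins for the mutex_transfer()/mutex_receive() idea, and the model does no blocking at all. It only checks that a release comes from the context that currently owns the lock.

```c
#include <assert.h>   /* for callers that want to assert on the results */
#include <stdbool.h>

/* Hypothetical model: who is allowed to release the lock right now? */
enum xfer_state { XFER_UNLOCKED, XFER_HELD, XFER_IN_TRANSIT };

struct xfer_lock {
    enum xfer_state state;
    int owner;               /* meaningful only while state == XFER_HELD */
};

static void xfer_init(struct xfer_lock *l)
{
    l->state = XFER_UNLOCKED;
    l->owner = -1;
}

/* Ordinary acquire by context ctx; fails if the lock is not free. */
static bool xfer_lock(struct xfer_lock *l, int ctx)
{
    if (l->state != XFER_UNLOCKED)
        return false;
    l->state = XFER_HELD;
    l->owner = ctx;
    return true;
}

/* T1 side: the current owner explicitly gives the lock away. */
static bool xfer_transfer(struct xfer_lock *l, int ctx)
{
    if (l->state != XFER_HELD || l->owner != ctx)
        return false;
    l->state = XFER_IN_TRANSIT;
    l->owner = -1;
    return true;
}

/* T2 side: another context explicitly takes over a lock in transit. */
static bool xfer_receive(struct xfer_lock *l, int ctx)
{
    if (l->state != XFER_IN_TRANSIT)
        return false;
    l->state = XFER_HELD;
    l->owner = ctx;
    return true;
}

/* Release; only the current owner may do this. */
static bool xfer_unlock(struct xfer_lock *l, int ctx)
{
    if (l->state != XFER_HELD || l->owner != ctx)
        return false;
    l->state = XFER_UNLOCKED;
    l->owner = -1;
    return true;
}

/* Walks the T1/T2 sequence; context ids 1 and 2 are arbitrary. */
static bool xfer_demo(void)
{
    struct xfer_lock l;

    xfer_init(&l);
    return xfer_lock(&l, 1)        /* T1: lock */
        && !xfer_unlock(&l, 2)     /* T2 cannot release without a hand-off */
        && xfer_transfer(&l, 1)    /* T1: transfer */
        && xfer_receive(&l, 2)     /* T2: receive */
        && xfer_unlock(&l, 2);     /* T2: unlock */
}
```

In this model an unannotated cross-context release is simply rejected, which mirrors the request in the thread that the release site itself be marked as special rather than having a flag set on the lock.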
On Tue, Oct 10, 2017 at 9:22 AM, Linus Torvalds wrote:
>
> I really would like to see the sites that do cross-thread lock/unlock
> pairs themselves be annotated.
>
> So when you lock in one thread, and then unlock in another, I'd
> actually prefer to see something like
>
> - T1:
> lock_mu
On Mon, Oct 9, 2017 at 10:48 PM, Byungchul Park wrote:
>>
>> The place where the release is done should simply be special.
>>
>> Because we should *not* encourage the whole "acquire by one context,
>> release by another" as being something normal and "just set the flag
>> to let lockdep know".
>
>
On Wed, Oct 04, 2017 at 10:34:30AM +0200, Peter Zijlstra wrote:
> Right, and print_circular_bug() uses @trace before it ever can be set,
> although I suspect the intention is that that only ever gets called from
> commit_xhlock() where we pass in an initialized @trace. A comment
> would've been goo
On Tue, Oct 03, 2017 at 09:57:02AM -0700, Linus Torvalds wrote:
> On Tue, Oct 3, 2017 at 9:54 AM, Linus Torvalds wrote:
> >
> > Can we consider just reverting the crossrelease thing?
> >
> > The apparent stack corruption really worries me [...]
>
> Side note: I also think the thing is just brok
On Tue, Oct 03, 2017 at 07:18:24PM +0200, Ingo Molnar wrote:
>
> * Linus Torvalds wrote:
>
> > On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote:
> > >
> > > This patch triggers a NULL-dereference bug at update_stack_state().
> > > Although its parent commit also has a NULL-dereference bug, ho
On Tue, Oct 03, 2017 at 09:54:31AM -0700, Linus Torvalds wrote:
> On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote:
> >
> > This patch triggers a NULL-dereference bug at update_stack_state().
> > Although its parent commit also has a NULL-dereference bug, however
> > the call stack looks rather
On Mon, Oct 09, 2017 at 05:44:46PM +0200, Peter Zijlstra wrote:
On Mon, Oct 09, 2017 at 11:41:30PM +0800, Fengguang Wu wrote:
> > [ 187.855027] init: plymouth-splash main process (418) terminated with status 1
> > [ 187.953296] init: networking main process (419) terminated with status 1
> >
On Mon, Oct 09, 2017 at 11:41:30PM +0800, Fengguang Wu wrote:
> > > [ 187.855027] init: plymouth-splash main process (418) terminated with status 1
> > > [ 187.953296] init: networking main process (419) terminated with status 1
> > > [ 191.697721] [ cut here ]-
[ 187.855027] init: plymouth-splash main process (418) terminated with status 1
[ 187.953296] init: networking main process (419) terminated with status 1
[ 191.697721] ------------[ cut here ]------------
[ 191.699318] WARNING: CPU: 0 PID: 424 at kernel/locking/lockdep.c:3928
check_flags+0x1
On Mon, Oct 09, 2017 at 10:17:06PM +0800, Fengguang Wu wrote:
> It works! I tried 500 boots and only found 1 occurrence of this error,
> which looks irrelevant to the current issue.
OK, I'll go write a Changelog for the lockdep patch.
>
> [ 187.855027] init: plymouth-splash main process (418) t
On Mon, Oct 09, 2017 at 08:26:05AM -0500, Josh Poimboeuf wrote:
On Mon, Oct 09, 2017 at 08:55:04PM +0800, Fengguang Wu wrote:
On Mon, Oct 09, 2017 at 08:21:13PM +0800, Fengguang Wu wrote:
> On Mon, Oct 09, 2017 at 12:50:55PM +0200, Peter Zijlstra wrote:
> > > Fengguang, if you're still listening
On Mon, Oct 09, 2017 at 08:55:04PM +0800, Fengguang Wu wrote:
> On Mon, Oct 09, 2017 at 08:21:13PM +0800, Fengguang Wu wrote:
> > On Mon, Oct 09, 2017 at 12:50:55PM +0200, Peter Zijlstra wrote:
> > > > Fengguang, if you're still listening, could you please rerun the tests
> > > > on top of ce07a941
On Mon, Oct 09, 2017 at 02:54:04PM +0200, Peter Zijlstra wrote:
> On Mon, Oct 09, 2017 at 08:21:13PM +0800, Fengguang Wu wrote:
> > > > From e7840ad76515f0b5061fcdd098b57b7c01b61482 Mon Sep 17 00:00:00 2001
> > > > Message-Id:
> > > >
> > > > From: Josh Poimboeuf
> > > > Date: Thu, 5 Oct 2017 09
On Mon, Oct 09, 2017 at 02:54:04PM +0200, Peter Zijlstra wrote:
On Mon, Oct 09, 2017 at 08:21:13PM +0800, Fengguang Wu wrote:
> > From e7840ad76515f0b5061fcdd098b57b7c01b61482 Mon Sep 17 00:00:00 2001
> > Message-Id:
> > From: Josh Poimboeuf
> > Date: Thu, 5 Oct 2017 09:43:59 -0500
> > Subjec
On Mon, Oct 09, 2017 at 08:21:13PM +0800, Fengguang Wu wrote:
On Mon, Oct 09, 2017 at 12:50:55PM +0200, Peter Zijlstra wrote:
Fengguang, if you're still listening, could you please rerun the tests
on top of ce07a9415f26, with the attached patches also applied?
Ping!? it would be very good to g
On Mon, Oct 09, 2017 at 08:21:13PM +0800, Fengguang Wu wrote:
> > > From e7840ad76515f0b5061fcdd098b57b7c01b61482 Mon Sep 17 00:00:00 2001
> > > Message-Id:
> > >
> > > From: Josh Poimboeuf
> > > Date: Thu, 5 Oct 2017 09:43:59 -0500
> > > Subject: [PATCH 1/2] unwinder fixes
> > >
> > > ---
> >
On Mon, Oct 09, 2017 at 12:50:55PM +0200, Peter Zijlstra wrote:
Fengguang, if you're still listening, could you please rerun the tests
on top of ce07a9415f26, with the attached patches also applied?
Ping!? it would be very good to get feedback on this asap.
Sorry for the delay!
From e7840ad
> Fengguang, if you're still listening, could you please rerun the tests
> on top of ce07a9415f26, with the attached patches also applied?
Ping!? it would be very good to get feedback on this asap.
> From e7840ad76515f0b5061fcdd098b57b7c01b61482 Mon Sep 17 00:00:00 2001
> Message-Id:
>
> From:
On Thu, Oct 05, 2017 at 08:01:46AM -0500, Josh Poimboeuf wrote:
> On Tue, Oct 03, 2017 at 09:54:31AM -0700, Linus Torvalds wrote:
> > On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote:
> > >
> > > This patch triggers a NULL-dereference bug at update_stack_state().
> > > Although its parent commit
On Thu, Oct 05, 2017 at 08:02:33PM +0900, Tetsuo Handa wrote:
> Josh Poimboeuf wrote:
> > On Wed, Oct 04, 2017 at 06:44:50AM +0900, Tetsuo Handa wrote:
> > > Josh Poimboeuf wrote:
> > > > On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote:
> > > > > There are two bugs:
> > > > >
> > >
On Tue, Oct 03, 2017 at 09:54:31AM -0700, Linus Torvalds wrote:
> On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote:
> >
> > This patch triggers a NULL-dereference bug at update_stack_state().
> > Although its parent commit also has a NULL-dereference bug, however
> > the call stack looks rather
Josh Poimboeuf wrote:
> On Wed, Oct 04, 2017 at 06:44:50AM +0900, Tetsuo Handa wrote:
> > Josh Poimboeuf wrote:
> > > On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote:
> > > > There are two bugs:
> > > >
> > > > 1) Somebody -- presumably lockdep -- is corrupting the stack. Need the
On Wed, Oct 04, 2017 at 06:44:50AM +0900, Tetsuo Handa wrote:
> Josh Poimboeuf wrote:
> > On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote:
> > > There are two bugs:
> > >
> > > 1) Somebody -- presumably lockdep -- is corrupting the stack. Need the
> > >    lockdep people to look at
On Wed, Oct 04, 2017 at 02:30:42PM -0700, Linus Torvalds wrote:
> On Wed, Oct 4, 2017 at 2:06 PM, Josh Poimboeuf wrote:
> >
> > I compiled the same kernel with a similar version of GCC. It turns out
> > that GCC *does* create unaligned stacks with frame pointers enabled:
>
> Christ. What a piece
On Wed, Oct 4, 2017 at 2:06 PM, Josh Poimboeuf wrote:
>
> I compiled the same kernel with a similar version of GCC. It turns out
> that GCC *does* create unaligned stacks with frame pointers enabled:
Christ. What a piece of crap.
It doesn't even seem to make any sense. Spill room for the "u16
i
On Wed, Oct 04, 2017 at 06:44:50AM +0900, Tetsuo Handa wrote:
> Josh Poimboeuf wrote:
> > On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote:
> > > There are two bugs:
> > >
> > > 1) Somebody -- presumably lockdep -- is corrupting the stack. Need the
> > >    lockdep people to look at
On Wed, Oct 04, 2017 at 11:20:52AM +0200, Peter Zijlstra wrote:
> On Tue, Oct 03, 2017 at 07:18:24PM +0200, Ingo Molnar wrote:
> > Yes, I'll do that tomorrow. I was always a bit unhappy about cross-release,
> > because it breaks the 'owner task owns the lock' model.
>
> Still, you can get real dea
* Peter Zijlstra wrote:
> On Tue, Oct 03, 2017 at 07:18:24PM +0200, Ingo Molnar wrote:
> > Yes, I'll do that tomorrow. I was always a bit unhappy about cross-release,
> > because it breaks the 'owner task owns the lock' model.
>
> Still, you can get real deadlocks with completions...
>
> > Plu
On Tue, Oct 03, 2017 at 07:18:24PM +0200, Ingo Molnar wrote:
> Yes, I'll do that tomorrow. I was always a bit unhappy about cross-release,
> because it breaks the 'owner task owns the lock' model.
Still, you can get real deadlocks with completions...
> Plus I don't think we found that many real b
On Tue, Oct 03, 2017 at 10:05:38AM -0500, Josh Poimboeuf wrote:
> I don't know the lockdep code, but one more comment from the peanut
> gallery. This code looks suspect to me:
>
>
>         /*
>          * Stop saving stack_trace if save_trace() was
>
Josh Poimboeuf wrote:
> On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote:
> > There are two bugs:
> >
> > 1) Somebody -- presumably lockdep -- is corrupting the stack. Need the
> >    lockdep people to look at that.
> >
> > 2) The 32-bit FP unwinder isn't handling the corrupt stack
On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote:
> There are two bugs:
>
> 1) Somebody -- presumably lockdep -- is corrupting the stack. Need the
>    lockdep people to look at that.
>
> 2) The 32-bit FP unwinder isn't handling the corrupt stack very well,
>    It's blindly derefe
* Linus Torvalds wrote:
> On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote:
> >
> > This patch triggers a NULL-dereference bug at update_stack_state().
> > Although its parent commit also has a NULL-dereference bug, however
> > the call stack looks rather different. Both dmesg files are attac
On Tue, Oct 3, 2017 at 9:54 AM, Linus Torvalds wrote:
>
> Can we consider just reverting the crossrelease thing?
>
> The apparent stack corruption really worries me [...]
Side note: I also think the thing is just broken.
Any actual cross-releaser should be way more annotated than just "set
cross
On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote:
>
> This patch triggers a NULL-dereference bug at update_stack_state().
> Although its parent commit also has a NULL-dereference bug, however
> the call stack looks rather different. Both dmesg files are attached.
>
> It also triggers this warnin
On Tue, Oct 03, 2017 at 10:05:38AM -0500, Josh Poimboeuf wrote:
> I don't know the lockdep code, but one more comment from the peanut
> gallery. This code looks suspect to me:
>
>
>         /*
>          * Stop saving stack_trace if save_trace() was
>
On Tue, Oct 03, 2017 at 09:41:36AM -0500, Josh Poimboeuf wrote:
> On Tue, Oct 03, 2017 at 09:31:47AM -0500, Josh Poimboeuf wrote:
> > On Tue, Oct 03, 2017 at 10:06:34PM +0800, Fengguang Wu wrote:
> > > Hi Byungchul,
> > >
> > > This patch triggers a NULL-dereference bug at update_stack_state().
>
On Tue, Oct 03, 2017 at 09:31:47AM -0500, Josh Poimboeuf wrote:
> On Tue, Oct 03, 2017 at 10:06:34PM +0800, Fengguang Wu wrote:
> > Hi Byungchul,
> >
> > This patch triggers a NULL-dereference bug at update_stack_state().
> > Although its parent commit also has a NULL-dereference bug, however
> >
On Tue, Oct 03, 2017 at 10:06:34PM +0800, Fengguang Wu wrote:
> Hi Byungchul,
>
> This patch triggers a NULL-dereference bug at update_stack_state().
> Although its parent commit also has a NULL-dereference bug, however
> the call stack looks rather different. Both dmesg files are attached.
>
> I