Re: possible deadlock in generic_file_write_iter (2)

2017-12-07 Thread Byungchul Park

On 12/8/2017 2:07 AM, Dmitry Vyukov wrote:
> On Wed, Dec 6, 2017 at 6:05 AM, Byungchul Park wrote:
>>>> On 12/4/2017 5:33 PM, Jan Kara wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> adding Peter and Byungchul to CC since the lockdep report just looks
>>>>> strange and cross-release seems to be involved. Guys, how did #5 get into
>>>>> the lock chain and what does put_ucounts() have to do with sb_writers
>>>>> there? Thanks!
>>>>
>>>> Hello Jan,
>>>>
>>>> In order to get full stack of #5, we have to pass a boot param,
>>>> "crossrelease_fullstack", to the kernel. Now that it only informs
>>>> put_ucounts() in the call trace, it's hard to find out what exactly
>>>> happened at that time, but I can tell #5 shows:
>>>>
>>>> When acquire(sb_writers) in put_ucounts(), it was on the way to
>>>> complete((completion)) of wait_for_completion() in
>>>> devtmpfs_create_node().
>>>>
>>>> If acquire(sb_writers) in put_ucounts() is stuck, then
>>>> wait_for_completion() in devtmpfs_create_node() would be also
>>>> stuck, since complete() being in the context of acquire(sb_writers)
>>>> cannot be called.
>>>>
>>>> This is why cross-release added the lock chain.
>>>
>>> Hi,
>>>
>>> What is cross-release? Is it something new? Should we always enable
>>> crossrelease_fullstack during testing?
>>
>> Hello Dmitry,
>>
>> Yes, it's new one making lockdep track wait_for_completion() as well.
>>
>> And we should enable crossrelease_fullstack if you don't care system
>> slowdown but testing.
>
> I've enabled CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK. It
> should have the same effect, right?

Sure.

--
Thanks,
Byungchul
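
A small sketch of the equivalence confirmed above, assuming a Linux
system: it reports whether full-stack recording was requested either
through the crossrelease_fullstack boot parameter or through
CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK=y. The helper name
fullstack_requested() is hypothetical, and the function only inspects
the text handed to it; it does not prove that the running kernel has
cross-release built in at all.

```python
def fullstack_requested(cmdline: str, kconfig_text: str = "") -> bool:
    """Return True if the boot parameter or the Kconfig default is present.

    Hypothetical helper for illustration; it parses plain text only.
    """
    # Boot parameter: a bare word on the kernel command line.
    if "crossrelease_fullstack" in cmdline.split():
        return True
    # Kconfig default: an enabled option line in the kernel config text.
    for line in kconfig_text.splitlines():
        if line.strip() == "CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK=y":
            return True
    return False
```

On a live system the first argument could be the contents of
/proc/cmdline and the second the text of the kernel's config file, if
one is available.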


Re: possible deadlock in generic_file_write_iter (2)

2017-12-07 Thread Dmitry Vyukov
On Wed, Dec 6, 2017 at 6:05 AM, Byungchul Park wrote:
>> > On 12/4/2017 5:33 PM, Jan Kara wrote:
>> >>
>> >> Hello,
>> >>
>> >> adding Peter and Byungchul to CC since the lockdep report just looks
>> >> strange and cross-release seems to be involved. Guys, how did #5 get into
>> >> the lock chain and what does put_ucounts() have to do with sb_writers
>> >> there? Thanks!
>> >
>> >
>> > Hello Jan,
>> >
>> > In order to get full stack of #5, we have to pass a boot param,
>> > "crossrelease_fullstack", to the kernel. Now that it only informs
>> > put_ucounts() in the call trace, it's hard to find out what exactly
>> > happened at that time, but I can tell #5 shows:
>> >
>> > When acquire(sb_writers) in put_ucounts(), it was on the way to
>> > complete((completion)) of wait_for_completion() in
>> > devtmpfs_create_node().
>> >
>> > If acquire(sb_writers) in put_ucounts() is stuck, then
>> > wait_for_completion() in devtmpfs_create_node() would be also
>> > stuck, since complete() being in the context of acquire(sb_writers)
>> > cannot be called.
>> >
>> > This is why cross-release added the lock chain.
>>
>> Hi,
>>
>> What is cross-release? Is it something new? Should we always enable
>> crossrelease_fullstack during testing?
>
> Hello Dmitry,
>
> Yes, it's new one making lockdep track wait_for_completion() as well.
>
> And we should enable crossrelease_fullstack if you don't care system
> slowdown but testing.

I've enabled CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK. It
should have the same effect, right?


Re: possible deadlock in generic_file_write_iter (2)

2017-12-06 Thread Andrew Morton
On Wed, 6 Dec 2017 14:05:47 +0900 Byungchul Park wrote:

> > What is cross-release? Is it something new? Should we always enable
> > crossrelease_fullstack during testing?
> 
> Hello Dmitry,
> 
> Yes, it's new one making lockdep track wait_for_completion() as well.
> 
> And we should enable crossrelease_fullstack if you don't care system
> slowdown but testing.

We should update Documentation/process/submit-checklist.rst section 12.
But that list doesn't even mention CONFIG_LOCKDEP so a bit of
maintenance work will be needed there..



Re: possible deadlock in generic_file_write_iter (2)

2017-12-05 Thread Byungchul Park
On Tue, Dec 05, 2017 at 10:41:50AM +0100, Jan Kara wrote:
> 
> Hello Byungchul,
> 
> On Tue 05-12-17 13:58:09, Byungchul Park wrote:
> > On 12/4/2017 5:33 PM, Jan Kara wrote:
> > >adding Peter and Byungchul to CC since the lockdep report just looks
> > >strange and cross-release seems to be involved. Guys, how did #5 get into
> > >the lock chain and what does put_ucounts() have to do with sb_writers
> > >there? Thanks!
> > 
> > Hello Jan,
> > 
> > In order to get full stack of #5, we have to pass a boot param,
> > "crossrelease_fullstack", to the kernel. Now that it only informs
> > put_ucounts() in the call trace, it's hard to find out what exactly
> > happened at that time, but I can tell #5 shows:
> 
> OK, thanks for the tip.
> 
> > When acquire(sb_writers) in put_ucounts(), it was on the way to
> > complete((completion)) of wait_for_completion() in
> > devtmpfs_create_node().
> > 
> > If acquire(sb_writers) in put_ucounts() is stuck, then
> > wait_for_completion() in devtmpfs_create_node() would be also
> > stuck, since complete() being in the context of acquire(sb_writers)
> > cannot be called.
> 
> But this is something I don't get: There aren't sb_writers anywhere near
> put_ucounts(). So why the heck did lockdep think that sb_writers are
> acquired by put_ucounts()?

I also think it looks weird. I just record _RET_IP_ or _THIS_IP_ at
acquire(sb_writers). Is it possible to get a wrong _RET_IP_ or _THIS_IP_
by any chance?

> 
>   Honza
> -- 
> Jan Kara 
> SUSE Labs, CR


Re: possible deadlock in generic_file_write_iter (2)

2017-12-05 Thread Byungchul Park
On Tue, Dec 05, 2017 at 10:19:07AM +0100, Dmitry Vyukov wrote:
> On Tue, Dec 5, 2017 at 5:58 AM, Byungchul Park  wrote:
> > On 12/4/2017 5:33 PM, Jan Kara wrote:
> >>
> >> Hello,
> >>
> >> adding Peter and Byungchul to CC since the lockdep report just looks
> >> strange and cross-release seems to be involved. Guys, how did #5 get into
> >> the lock chain and what does put_ucounts() have to do with sb_writers
> >> there? Thanks!
> >
> >
> > Hello Jan,
> >
> > In order to get full stack of #5, we have to pass a boot param,
> > "crossrelease_fullstack", to the kernel. Now that it only informs
> > put_ucounts() in the call trace, it's hard to find out what exactly
> > happened at that time, but I can tell #5 shows:
> >
> > When acquire(sb_writers) in put_ucounts(), it was on the way to
> > complete((completion)) of wait_for_completion() in
> > devtmpfs_create_node().
> >
> > If acquire(sb_writers) in put_ucounts() is stuck, then
> > wait_for_completion() in devtmpfs_create_node() would be also
> > stuck, since complete() being in the context of acquire(sb_writers)
> > cannot be called.
> >
> > This is why cross-release added the lock chain.
> 
> Hi,
> 
> What is cross-release? Is it something new? Should we always enable
> crossrelease_fullstack during testing?

Hello Dmitry,

Yes, it's a new feature that makes lockdep track wait_for_completion()
as well.

And you should enable crossrelease_fullstack if you care more about
testing than about system slowdown.

--
Thanks,
Byungchul


Re: possible deadlock in generic_file_write_iter (2)

2017-12-05 Thread Jan Kara

Hello Byungchul,

On Tue 05-12-17 13:58:09, Byungchul Park wrote:
> On 12/4/2017 5:33 PM, Jan Kara wrote:
> >adding Peter and Byungchul to CC since the lockdep report just looks
> >strange and cross-release seems to be involved. Guys, how did #5 get into
> >the lock chain and what does put_ucounts() have to do with sb_writers
> >there? Thanks!
> 
> Hello Jan,
> 
> In order to get full stack of #5, we have to pass a boot param,
> "crossrelease_fullstack", to the kernel. Now that it only informs
> put_ucounts() in the call trace, it's hard to find out what exactly
> happened at that time, but I can tell #5 shows:

OK, thanks for the tip.

> When acquire(sb_writers) in put_ucounts(), it was on the way to
> complete((completion)) of wait_for_completion() in
> devtmpfs_create_node().
> 
> If acquire(sb_writers) in put_ucounts() is stuck, then
> wait_for_completion() in devtmpfs_create_node() would be also
> stuck, since complete() being in the context of acquire(sb_writers)
> cannot be called.

But this is something I don't get: There aren't sb_writers anywhere near
put_ucounts(). So why the heck did lockdep think that sb_writers are
acquired by put_ucounts()?

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: possible deadlock in generic_file_write_iter (2)

2017-12-05 Thread Dmitry Vyukov
On Tue, Dec 5, 2017 at 5:58 AM, Byungchul Park wrote:
> On 12/4/2017 5:33 PM, Jan Kara wrote:
>>
>> Hello,
>>
>> adding Peter and Byungchul to CC since the lockdep report just looks
>> strange and cross-release seems to be involved. Guys, how did #5 get into
>> the lock chain and what does put_ucounts() have to do with sb_writers
>> there? Thanks!
>
>
> Hello Jan,
>
> In order to get full stack of #5, we have to pass a boot param,
> "crossrelease_fullstack", to the kernel. Now that it only informs
> put_ucounts() in the call trace, it's hard to find out what exactly
> happened at that time, but I can tell #5 shows:
>
> When acquire(sb_writers) in put_ucounts(), it was on the way to
> complete((completion)) of wait_for_completion() in
> devtmpfs_create_node().
>
> If acquire(sb_writers) in put_ucounts() is stuck, then
> wait_for_completion() in devtmpfs_create_node() would be also
> stuck, since complete() being in the context of acquire(sb_writers)
> cannot be called.
>
> This is why cross-release added the lock chain.

Hi,

What is cross-release? Is it something new? Should we always enable
crossrelease_fullstack during testing?

Thanks


Re: possible deadlock in generic_file_write_iter (2)

2017-12-04 Thread Byungchul Park

On 12/4/2017 5:33 PM, Jan Kara wrote:
> Hello,
>
> adding Peter and Byungchul to CC since the lockdep report just looks
> strange and cross-release seems to be involved. Guys, how did #5 get into
> the lock chain and what does put_ucounts() have to do with sb_writers
> there? Thanks!


Hello Jan,

In order to get the full stack of #5, we have to pass a boot param,
"crossrelease_fullstack", to the kernel. As it is, the call trace only
shows put_ucounts(), so it's hard to find out what exactly happened at
that time, but I can tell what #5 shows:

When sb_writers was acquired in put_ucounts(), that context was on its
way to the complete() that pairs with the wait_for_completion() in
devtmpfs_create_node().

If acquire(sb_writers) in put_ucounts() gets stuck, then
wait_for_completion() in devtmpfs_create_node() would be stuck as well,
since complete(), which runs in the same context after
acquire(sb_writers), can never be called.

This is why cross-release added the lock chain.

--
Thanks,
Byungchul
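
The reasoning in this message can be sketched as a toy model. The
Python below is a user-space illustration only, not lockdep's code:
DepGraph and build_example() are hypothetical names, and the second
edge (a path that waits for the completion while sb_writers is held)
is an assumption added to close the cycle. The thread itself only
establishes the first edge, completion -> sb_writers.

```python
from collections import defaultdict


class DepGraph:
    """Toy dependency graph.

    An edge a -> b means: b was acquired while a was held or, for a
    completion, while a was still waiting to be completed.
    """

    def __init__(self):
        self.edges = defaultdict(set)

    def add(self, pending, acquired):
        self.edges[pending].add(acquired)

    def find_cycle(self):
        """Return one cycle as a list of nodes, or None if there is none."""
        WHITE, GREY, BLACK = 0, 1, 2
        color = defaultdict(int)  # every node starts WHITE
        path = []

        def visit(node):
            color[node] = GREY
            path.append(node)
            for nxt in sorted(self.edges.get(node, ())):
                if color[nxt] == GREY:
                    # Back edge: nxt is already on the current path.
                    return path[path.index(nxt):] + [nxt]
                if color[nxt] == WHITE:
                    found = visit(nxt)
                    if found:
                        return found
            path.pop()
            color[node] = BLACK
            return None

        for start in sorted(self.edges):
            if color[start] == WHITE:
                found = visit(start)
                if found:
                    return found
        return None


def build_example():
    g = DepGraph()
    # From the thread: sb_writers was acquired on the way to complete(),
    # so the completion cannot finish unless sb_writers can be taken.
    g.add("completion", "sb_writers")
    # ASSUMPTION (not from the thread): some other path waits for the
    # same completion while holding sb_writers.
    g.add("sb_writers", "completion")
    return g
```

Calling build_example().find_cycle() returns a closed path through
both nodes, which is the kind of chain reported as a possible
deadlock; with only the first edge present it returns None.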

