Re: possible deadlock in generic_file_write_iter (2)
On 12/8/2017 2:07 AM, Dmitry Vyukov wrote:
> On Wed, Dec 6, 2017 at 6:05 AM, Byungchul Park wrote:
>>>> On 12/4/2017 5:33 PM, Jan Kara wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> adding Peter and Byungchul to CC since the lockdep report just looks
>>>>> strange and cross-release seems to be involved. Guys, how did #5 get into
>>>>> the lock chain and what does put_ucounts() have to do with sb_writers
>>>>> there? Thanks!
>>>>
>>>> Hello Jan,
>>>>
>>>> In order to get full stack of #5, we have to pass a boot param,
>>>> "crossrelease_fullstack", to the kernel. Now that it only informs
>>>> put_ucounts() in the call trace, it's hard to find out what exactly
>>>> happened at that time, but I can tell #5 shows:
>>>>
>>>> When acquire(sb_writers) in put_ucounts(), it was on the way to
>>>> complete((completion)) of wait_for_completion() in
>>>> devtmpfs_create_node().
>>>>
>>>> If acquire(sb_writers) in put_ucounts() is stuck, then
>>>> wait_for_completion() in devtmpfs_create_node() would be also
>>>> stuck, since complete() being in the context of acquire(sb_writers)
>>>> cannot be called.
>>>>
>>>> This is why cross-release added the lock chain.
>>>
>>> Hi,
>>>
>>> What is cross-release? Is it something new? Should we always enable
>>> crossrelease_fullstack during testing?
>>
>> Hello Dmitry,
>>
>> Yes, it's new one making lockdep track wait_for_completion() as well.
>>
>> And we should enable crossrelease_fullstack if you don't care system
>> slowdown but testing.
>
> I've enabled CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK. It should
> have the same effect, right?

Sure.

--
Thanks,
Byungchul
Re: possible deadlock in generic_file_write_iter (2)
On Wed, Dec 6, 2017 at 6:05 AM, Byungchul Park wrote:
>> > On 12/4/2017 5:33 PM, Jan Kara wrote:
>> >>
>> >> Hello,
>> >>
>> >> adding Peter and Byungchul to CC since the lockdep report just looks
>> >> strange and cross-release seems to be involved. Guys, how did #5 get into
>> >> the lock chain and what does put_ucounts() have to do with sb_writers
>> >> there? Thanks!
>> >
>> >
>> > Hello Jan,
>> >
>> > In order to get full stack of #5, we have to pass a boot param,
>> > "crossrelease_fullstack", to the kernel. Now that it only informs
>> > put_ucounts() in the call trace, it's hard to find out what exactly
>> > happened at that time, but I can tell #5 shows:
>> >
>> > When acquire(sb_writers) in put_ucounts(), it was on the way to
>> > complete((completion)) of wait_for_completion() in
>> > devtmpfs_create_node().
>> >
>> > If acquire(sb_writers) in put_ucounts() is stuck, then
>> > wait_for_completion() in devtmpfs_create_node() would be also
>> > stuck, since complete() being in the context of acquire(sb_writers)
>> > cannot be called.
>> >
>> > This is why cross-release added the lock chain.
>>
>> Hi,
>>
>> What is cross-release? Is it something new? Should we always enable
>> crossrelease_fullstack during testing?
>
> Hello Dmitry,
>
> Yes, it's new one making lockdep track wait_for_completion() as well.
>
> And we should enable crossrelease_fullstack if you don't care system
> slowdown but testing.

I've enabled CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK. It should
have the same effect, right?
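The two ways of requesting full stacks named in this thread (the "crossrelease_fullstack" boot parameter and CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK) can be checked from a test harness. The following is only a minimal sketch: the function name fullstack_requested is hypothetical, and it assumes that the parameter appears as a bare word on the kernel command line and that the option appears as "=y" in the kernel .config text. Neither assumption is confirmed by the thread.

```python
def fullstack_requested(config_text: str, cmdline: str) -> bool:
    """Return True if either knob mentioned in the thread appears to be set.

    config_text: contents of a kernel .config file (assumed format: NAME=y).
    cmdline:     kernel command line, e.g. the contents of /proc/cmdline.
    """
    # Assumption: an enabled boolean option is written as "NAME=y".
    config_on = any(
        line.strip() == "CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK=y"
        for line in config_text.splitlines()
    )
    # Assumption: the boot parameter is passed as a bare word, with no value.
    param_on = "crossrelease_fullstack" in cmdline.split()
    return config_on or param_on


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    print(fullstack_requested("", "ro crossrelease_fullstack quiet"))
```

A harness could feed it the real .config and /proc/cmdline contents and warn when neither is set, since reports without full stacks are harder to interpret, as the thread notes.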
Re: possible deadlock in generic_file_write_iter (2)
On Wed, 6 Dec 2017 14:05:47 +0900 Byungchul Park wrote:

> > What is cross-release? Is it something new? Should we always enable
> > crossrelease_fullstack during testing?
>
> Hello Dmitry,
>
> Yes, it's new one making lockdep track wait_for_completion() as well.
>
> And we should enable crossrelease_fullstack if you don't care system
> slowdown but testing.

We should update Documentation/process/submit-checklist.rst section 12.
But that list doesn't even mention CONFIG_LOCKDEP so a bit of maintenance
work will be needed there..
Re: possible deadlock in generic_file_write_iter (2)
On Tue, Dec 05, 2017 at 10:41:50AM +0100, Jan Kara wrote:
>
> Hello Byungchul,
>
> On Tue 05-12-17 13:58:09, Byungchul Park wrote:
> > On 12/4/2017 5:33 PM, Jan Kara wrote:
> > >adding Peter and Byungchul to CC since the lockdep report just looks
> > >strange and cross-release seems to be involved. Guys, how did #5 get into
> > >the lock chain and what does put_ucounts() have to do with sb_writers
> > >there? Thanks!
> >
> > Hello Jan,
> >
> > In order to get full stack of #5, we have to pass a boot param,
> > "crossrelease_fullstack", to the kernel. Now that it only informs
> > put_ucounts() in the call trace, it's hard to find out what exactly
> > happened at that time, but I can tell #5 shows:
>
> OK, thanks for the tip.
>
> > When acquire(sb_writers) in put_ucounts(), it was on the way to
> > complete((completion)) of wait_for_completion() in
> > devtmpfs_create_node().
> >
> > If acquire(sb_writers) in put_ucounts() is stuck, then
> > wait_for_completion() in devtmpfs_create_node() would be also
> > stuck, since complete() being in the context of acquire(sb_writers)
> > cannot be called.
>
> But this is something I don't get: There aren't sb_writers anywhere near
> put_ucounts(). So why the heck did lockdep think that sb_writers are
> acquired by put_ucounts()?

I also think it looks so weird. I just record _RET_IP_ or _THIS_IP_
when acquire(sb_writers). Is it possible to get wrong _RET_IP_ or
_THIS_IP_ by any chance?

>
> Honza
> --
> Jan Kara
> SUSE Labs, CR
Re: possible deadlock in generic_file_write_iter (2)
On Tue, Dec 05, 2017 at 10:19:07AM +0100, Dmitry Vyukov wrote:
> On Tue, Dec 5, 2017 at 5:58 AM, Byungchul Park wrote:
> > On 12/4/2017 5:33 PM, Jan Kara wrote:
> >>
> >> Hello,
> >>
> >> adding Peter and Byungchul to CC since the lockdep report just looks
> >> strange and cross-release seems to be involved. Guys, how did #5 get into
> >> the lock chain and what does put_ucounts() have to do with sb_writers
> >> there? Thanks!
> >
> >
> > Hello Jan,
> >
> > In order to get full stack of #5, we have to pass a boot param,
> > "crossrelease_fullstack", to the kernel. Now that it only informs
> > put_ucounts() in the call trace, it's hard to find out what exactly
> > happened at that time, but I can tell #5 shows:
> >
> > When acquire(sb_writers) in put_ucounts(), it was on the way to
> > complete((completion)) of wait_for_completion() in
> > devtmpfs_create_node().
> >
> > If acquire(sb_writers) in put_ucounts() is stuck, then
> > wait_for_completion() in devtmpfs_create_node() would be also
> > stuck, since complete() being in the context of acquire(sb_writers)
> > cannot be called.
> >
> > This is why cross-release added the lock chain.
>
> Hi,
>
> What is cross-release? Is it something new? Should we always enable
> crossrelease_fullstack during testing?

Hello Dmitry,

Yes, it's new one making lockdep track wait_for_completion() as well.

And we should enable crossrelease_fullstack if you don't care system
slowdown but testing.

--
Thanks,
Byungchul
Re: possible deadlock in generic_file_write_iter (2)
Hello Byungchul,

On Tue 05-12-17 13:58:09, Byungchul Park wrote:
> On 12/4/2017 5:33 PM, Jan Kara wrote:
> >adding Peter and Byungchul to CC since the lockdep report just looks
> >strange and cross-release seems to be involved. Guys, how did #5 get into
> >the lock chain and what does put_ucounts() have to do with sb_writers
> >there? Thanks!
>
> Hello Jan,
>
> In order to get full stack of #5, we have to pass a boot param,
> "crossrelease_fullstack", to the kernel. Now that it only informs
> put_ucounts() in the call trace, it's hard to find out what exactly
> happened at that time, but I can tell #5 shows:

OK, thanks for the tip.

> When acquire(sb_writers) in put_ucounts(), it was on the way to
> complete((completion)) of wait_for_completion() in
> devtmpfs_create_node().
>
> If acquire(sb_writers) in put_ucounts() is stuck, then
> wait_for_completion() in devtmpfs_create_node() would be also
> stuck, since complete() being in the context of acquire(sb_writers)
> cannot be called.

But this is something I don't get: There aren't sb_writers anywhere near
put_ucounts(). So why the heck did lockdep think that sb_writers are
acquired by put_ucounts()?

								Honza
--
Jan Kara
SUSE Labs, CR
Re: possible deadlock in generic_file_write_iter (2)
On Tue, Dec 5, 2017 at 5:58 AM, Byungchul Park wrote:
> On 12/4/2017 5:33 PM, Jan Kara wrote:
>>
>> Hello,
>>
>> adding Peter and Byungchul to CC since the lockdep report just looks
>> strange and cross-release seems to be involved. Guys, how did #5 get into
>> the lock chain and what does put_ucounts() have to do with sb_writers
>> there? Thanks!
>
>
> Hello Jan,
>
> In order to get full stack of #5, we have to pass a boot param,
> "crossrelease_fullstack", to the kernel. Now that it only informs
> put_ucounts() in the call trace, it's hard to find out what exactly
> happened at that time, but I can tell #5 shows:
>
> When acquire(sb_writers) in put_ucounts(), it was on the way to
> complete((completion)) of wait_for_completion() in
> devtmpfs_create_node().
>
> If acquire(sb_writers) in put_ucounts() is stuck, then
> wait_for_completion() in devtmpfs_create_node() would be also
> stuck, since complete() being in the context of acquire(sb_writers)
> cannot be called.
>
> This is why cross-release added the lock chain.

Hi,

What is cross-release? Is it something new? Should we always enable
crossrelease_fullstack during testing?

Thanks
Re: possible deadlock in generic_file_write_iter (2)
On 12/4/2017 5:33 PM, Jan Kara wrote:
> Hello,
>
> adding Peter and Byungchul to CC since the lockdep report just looks
> strange and cross-release seems to be involved. Guys, how did #5 get into
> the lock chain and what does put_ucounts() have to do with sb_writers
> there? Thanks!

Hello Jan,

In order to get full stack of #5, we have to pass a boot param,
"crossrelease_fullstack", to the kernel. Now that it only informs
put_ucounts() in the call trace, it's hard to find out what exactly
happened at that time, but I can tell #5 shows:

   When acquire(sb_writers) in put_ucounts(), it was on the way to
   complete((completion)) of wait_for_completion() in
   devtmpfs_create_node().

   If acquire(sb_writers) in put_ucounts() is stuck, then
   wait_for_completion() in devtmpfs_create_node() would be also
   stuck, since complete() being in the context of acquire(sb_writers)
   cannot be called.

This is why cross-release added the lock chain.

--
Thanks,
Byungchul
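The reasoning in this message can be illustrated with a toy dependency graph. This is a sketch only and not how lockdep or cross-release is implemented: the class DepGraph and its methods are hypothetical names, an edge A -> B here simply means "progress past A may require B", and the second edge (a path that holds sb_writers and then waits on the completion) is an assumed stand-in for the rest of the reported lock chain, which the thread does not spell out.

```python
from collections import defaultdict


class DepGraph:
    """Toy model: an edge A -> B means progress past A may require B."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_dependency(self, first, then):
        self.edges[first].add(then)

    def find_cycle(self, start):
        """Return a path start -> ... -> start if one exists, else None."""
        stack = [(start, [start])]
        seen = set()
        while stack:
            node, path = stack.pop()
            for nxt in self.edges.get(node, ()):
                if nxt == start:
                    return path + [start]
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, path + [nxt]))
        return None


graph = DepGraph()
# From the message: the context that will call complete() acquired
# sb_writers on its way there, so the waiter depends on sb_writers.
graph.add_dependency("completion", "sb_writers")
# Hypothetical: some other path holds sb_writers and then ends up
# waiting for the same completion (stands in for the rest of the chain).
graph.add_dependency("sb_writers", "completion")

cycle = graph.find_cycle("sb_writers")
print(cycle)  # ['sb_writers', 'completion', 'sb_writers']
```

Without the first edge, which is the kind of dependency the message attributes to cross-release, the graph has no cycle and nothing would be reported.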