Re: [BUG][BISECT] NFSv4 root failures after "fs/locks: allow a lock request to block other requests."
On Wed, Aug 22 2018, NeilBrown wrote:

>
> Oh dear.
> nfs4_alloc_lockdata contains:
> 	memcpy(&p->fl, fl, sizeof(p->fl));
>
> so any list_heads that are valid in fl will be invalid in p->fl.
>
> Maybe I should initialize the relevant list_heads at the start of wait
> functions.
> I should look more closely at what filesystems do with locks though.
>

I looked and it's complicated.
Some call posix_lock_file() (which doesn't block, I think).
Some call locks_lock_file_wait() (which can block, if FL_SLEEP is given).
Some call both.
Strangely, vfs_lock_file() either calls posix_lock_file(), which doesn't
block, or filp->f_op->lock() which, I think, can.

I'm confused.

However I think this version of the patch should be safer. When I make
time to test this, this will be part of what I test.

Thanks,
NeilBrown

From: NeilBrown
Date: Tue, 21 Aug 2018 15:09:06 +1000
Subject: [PATCH] fs/locks: always delete_block after waiting.

Now that requests can block other requests, we
need to be careful to always clean up those blocked
requests.
Any time that we wait for a request, we might have
other requests attached, and when we stop waiting,
we must clean them up.
If the lock was granted, the requests might have been
moved to the new lock, though when merged with a
pre-existing lock, this might not happen.
In all cases we don't want blocked locks to remain
attached, so we remove them to be safe.

Note that when these locking routines are called
without FL_SLEEP set, the list_head might not be
properly initialized. In that case it is neither
safe nor necessary to call locks_delete_block().

Signed-off-by: NeilBrown
---
 fs/locks.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index de38bafb7f7b..2af9c657f81f 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1276,12 +1276,11 @@ static int posix_lock_inode_wait(struct inode *inode, struct file_lock *fl)
 		if (error != FILE_LOCK_DEFERRED)
 			break;
 		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
-		if (!error)
-			continue;
-
-		locks_delete_block(fl);
-		break;
+		if (error)
+			break;
 	}
+	if (fl->fl_flags & FL_SLEEP)
+		locks_delete_block(fl);
 	return error;
 }
 
@@ -1971,12 +1970,11 @@ static int flock_lock_inode_wait(struct inode *inode, struct file_lock *fl)
 		if (error != FILE_LOCK_DEFERRED)
 			break;
 		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
-		if (!error)
-			continue;
-
-		locks_delete_block(fl);
-		break;
+		if (error)
+			break;
 	}
+	if (fl->fl_flags & FL_SLEEP)
+		locks_delete_block(fl);
 	return error;
 }
 
@@ -2250,12 +2248,11 @@ static int do_lock_file_wait(struct file *filp, unsigned int cmd,
 		if (error != FILE_LOCK_DEFERRED)
 			break;
 		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
-		if (!error)
-			continue;
-
-		locks_delete_block(fl);
-		break;
+		if (error)
+			break;
 	}
+	if (fl->fl_flags & FL_SLEEP)
+		locks_delete_block(fl);
 	return error;
 }
 
-- 
2.14.0.rc0.dirty
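The memcpy() concern raised in the message above can be reproduced outside the kernel. The following is only a minimal user-space sketch: `struct toy_lock`, `toy_copy_raw()` and `toy_copy_fixed()` are hypothetical names invented for illustration, and the small `list_head` helpers are stand-ins for the kernel's, not the real implementation. It shows that an empty list head copied byte-for-byte still points at the original, so the copy no longer looks empty. Re-initializing the head after the copy is one possible remedy, in the spirit of the "initialize the relevant list_heads" idea; the patch in this message instead skips the cleanup when FL_SLEEP is not set.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list head points at itself. */
static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Hypothetical stand-in for a lock request; NOT the real struct file_lock. */
struct toy_lock {
	struct list_head blocked;	/* plays the role of fl_blocked */
	int flags;
};

/* Copy a lock byte-for-byte, the way a plain memcpy() does. */
static void toy_copy_raw(struct toy_lock *dst, const struct toy_lock *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* Copy, then re-initialize the embedded list head so it is self-consistent. */
static void toy_copy_fixed(struct toy_lock *dst, const struct toy_lock *src)
{
	memcpy(dst, src, sizeof(*dst));
	init_list_head(&dst->blocked);
}
```

After `toy_copy_raw()`, the copy's `blocked.next` still holds the address of the source's head, so any emptiness check or list deletion on the copy operates on pointers that belong to a different object. That is the sense in which the copied list_heads are "invalid".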
Re: [BUG][BISECT] NFSv4 root failures after "fs/locks: allow a lock request to block other requests."
On Tue, Aug 21 2018, Jeff Layton wrote:

> On Tue, 2018-08-21 at 15:11 +1000, NeilBrown wrote:
>> On Thu, Aug 16 2018, NeilBrown wrote:
>>
>> > On Wed, Aug 15 2018, Jeff Layton wrote:
>> >
>> > > On Wed, 2018-08-15 at 14:28 +0200, Krzysztof Kozlowski wrote:
>> > > > Hi,
>> > > >
>> > > > Bisect pointed to commit ce3147990450a68b3f549088b30f087742a08b5d
>> > > > ("fs/locks: allow a lock request to block other requests.") as the
>> > > > cause of boot failures with NFSv4 root on several boards.
>> > > >
>> > > > Log is here:
>> > > > https://krzk.eu/#/builders/21/builds/836/steps/12/logs/serial0
>> > > >
>> > > > With several errors:
>> > > > kernel BUG at ../fs/locks.c:336!
>> > > > Unable to handle kernel NULL pointer dereference at virtual address
>> > > > 0004
>> > > >
>> > > > Configuration:
>> > > > 1. exynos_defconfig
>> > > > 2. Arch ARM Linux
>> > > > 3. Boards:
>> > > >    a. Odroid family (ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC)
>> > > >    b. Toradex Colibri VF50 (ARMv7, UP, Cortex-A5)
>> > > > 4. Systemd: v236, 238
>> > > > 5. All boards boot from TFTP with NFS root (NFSv4)
>> > > >
>> > > > On Colibri VF50 I got slightly different errors:
>> > > > [ 11.663204] Internal error: Oops - undefined instruction: 0 [#1] ARM
>> > > > [ 12.455273] Unable to handle kernel NULL pointer dereference at
>> > > > virtual address 0004
>> > > > and only with some specific GCC (v6.3) or with other conditions which
>> > > > I did not bisect yet. Maybe Colibri's failure is unrelated to that
>> > > > commit.
>> > > >
>> > > > Best regards,
>> > > > Krzysztof
>> >
>> > Thanks a lot for the report Krzysztof!!
>> >
>> > >
>> > > The BUG is due to a lock being freed when the fl_blocked list wasn't
>> > > empty (implying that there were still blocked locks waiting on it).
>> > >
>> > > There are a number of calls to locks_delete_lock_ctx in posix_lock_inode
>> > > and I don't think the fl_blocked list is being handled properly with all
>> > > of them. It only transplants the blocked locks to a new lock when there
>> > > are surviving locks on the list, and that may not be the case when the
>> > > whole file is being unlocked.
>> >
>> > locks_delete_lock_ctx() calls locks_unlink_lock_ctx() which calls
>> > locks_wake_up_block() which doesn't only wake up the blocks, but also
>> > detaches them. When that function completes, ->fl_blocked must be empty.
>> >
>> > The trace shows the locks_free_lock() call at the end of fcntl_setlk64()
>> > as the problematic call.
>> > This suggests that do_lock_file_wait() exited with ->fl_blocked
>> > non-empty, which it shouldn't.
>> >
>> > I think we need to insert a call to locks_wake_up_block() in
>> > do_lock_file_wait() before it returns.
>> > I cannot find a sequence that would make this necessary, but
>> > it isn't surprising that there might be one.
>> >
>> > I'll dig through the code a bit more later and make sure I understand
>> > what is happening.
>> >
>>
>> I think this problem is fixed by the following. It is probably
>> triggered when the owner already has a lock for part of the requested
>> range. After waiting for some other lock, the pending request gets
>> merged with the existing lock, and blocked requests aren't moved across
>> in that case.
>>
>> I still haven't done more testing, so this is just FYI, not a
>> submission.
>>
>> Thanks,
>> NeilBrown
>>
>> From: NeilBrown
>> Date: Tue, 21 Aug 2018 15:09:06 +1000
>> Subject: [PATCH] fs/locks: always delete_block after waiting.
>>
>> Now that requests can block other requests, we
>> need to be careful to always clean up those blocked
>> requests.
>> Any time that we wait for a request, we might have
>> other requests attached, and when we stop waiting,
>> we must clean them up.
>> If the lock was granted, the requests might have been
>> moved to the new lock, though when merged with a
>> pre-existing lock, this might not happen.
>> In all cases we don't want blocked locks to remain
>> attached, so we remove them to be safe.
>>
>> Signed-off-by: NeilBrown
>> ---
>>  fs/locks.c | 24 +++++++++---------------
>>  1 file changed, 9 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/locks.c b/fs/locks.c
>> index de38bafb7f7b..6b310112cf3b 100644
>> --- a/fs/locks.c
>> +++ b/fs/locks.c
>> @@ -1276,12 +1276,10 @@ static int posix_lock_inode_wait(struct inode
>> *inode, struct file_lock *fl)
>>  		if (error != FILE_LOCK_DEFERRED)
>>  			break;
>>  		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
>> -		if (!error)
>> -			continue;
>> -
>> -		locks_delete_block(fl);
>> -		break;
>> +		if (error)
>> +			break;
>>  	}
>> +	locks_delete_block(fl);
>>  	return error;
>>  }
>>
>> @@ -1971,12 +1969,10 @@ static int flock_lock_inode_wait(struct inode
>> *inode, struct file_lock *fl)
>>  		if (error != FILE_LOCK_DEFERRED)
>>  			break;
>>
Re: [BUG][BISECT] NFSv4 root failures after "fs/locks: allow a lock request to block other requests."
On Tue, 2018-08-21 at 15:11 +1000, NeilBrown wrote:
> On Thu, Aug 16 2018, NeilBrown wrote:
>
> > On Wed, Aug 15 2018, Jeff Layton wrote:
> >
> > > On Wed, 2018-08-15 at 14:28 +0200, Krzysztof Kozlowski wrote:
> > > > Hi,
> > > >
> > > > Bisect pointed to commit ce3147990450a68b3f549088b30f087742a08b5d
> > > > ("fs/locks: allow a lock request to block other requests.") as the
> > > > cause of boot failures with NFSv4 root on several boards.
> > > >
> > > > Log is here:
> > > > https://krzk.eu/#/builders/21/builds/836/steps/12/logs/serial0
> > > >
> > > > With several errors:
> > > > kernel BUG at ../fs/locks.c:336!
> > > > Unable to handle kernel NULL pointer dereference at virtual address
> > > > 0004
> > > >
> > > > Configuration:
> > > > 1. exynos_defconfig
> > > > 2. Arch ARM Linux
> > > > 3. Boards:
> > > >    a. Odroid family (ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC)
> > > >    b. Toradex Colibri VF50 (ARMv7, UP, Cortex-A5)
> > > > 4. Systemd: v236, 238
> > > > 5. All boards boot from TFTP with NFS root (NFSv4)
> > > >
> > > > On Colibri VF50 I got slightly different errors:
> > > > [ 11.663204] Internal error: Oops - undefined instruction: 0 [#1] ARM
> > > > [ 12.455273] Unable to handle kernel NULL pointer dereference at
> > > > virtual address 0004
> > > > and only with some specific GCC (v6.3) or with other conditions which
> > > > I did not bisect yet. Maybe Colibri's failure is unrelated to that
> > > > commit.
> > > >
> > > > Best regards,
> > > > Krzysztof
> >
> > Thanks a lot for the report Krzysztof!!
> >
> > >
> > > The BUG is due to a lock being freed when the fl_blocked list wasn't
> > > empty (implying that there were still blocked locks waiting on it).
> > >
> > > There are a number of calls to locks_delete_lock_ctx in posix_lock_inode
> > > and I don't think the fl_blocked list is being handled properly with all
> > > of them. It only transplants the blocked locks to a new lock when there
> > > are surviving locks on the list, and that may not be the case when the
> > > whole file is being unlocked.
> >
> > locks_delete_lock_ctx() calls locks_unlink_lock_ctx() which calls
> > locks_wake_up_block() which doesn't only wake up the blocks, but also
> > detaches them. When that function completes, ->fl_blocked must be empty.
> >
> > The trace shows the locks_free_lock() call at the end of fcntl_setlk64()
> > as the problematic call.
> > This suggests that do_lock_file_wait() exited with ->fl_blocked
> > non-empty, which it shouldn't.
> >
> > I think we need to insert a call to locks_wake_up_block() in
> > do_lock_file_wait() before it returns.
> > I cannot find a sequence that would make this necessary, but
> > it isn't surprising that there might be one.
> >
> > I'll dig through the code a bit more later and make sure I understand
> > what is happening.
> >
>
> I think this problem is fixed by the following. It is probably
> triggered when the owner already has a lock for part of the requested
> range. After waiting for some other lock, the pending request gets
> merged with the existing lock, and blocked requests aren't moved across
> in that case.
>
> I still haven't done more testing, so this is just FYI, not a
> submission.
>
> Thanks,
> NeilBrown
>
> From: NeilBrown
> Date: Tue, 21 Aug 2018 15:09:06 +1000
> Subject: [PATCH] fs/locks: always delete_block after waiting.
>
> Now that requests can block other requests, we
> need to be careful to always clean up those blocked
> requests.
> Any time that we wait for a request, we might have
> other requests attached, and when we stop waiting,
> we must clean them up.
> If the lock was granted, the requests might have been
> moved to the new lock, though when merged with a
> pre-existing lock, this might not happen.
> In all cases we don't want blocked locks to remain
> attached, so we remove them to be safe.
>
> Signed-off-by: NeilBrown
> ---
>  fs/locks.c | 24 +++++++++---------------
>  1 file changed, 9 insertions(+), 15 deletions(-)
>
> diff --git a/fs/locks.c b/fs/locks.c
> index de38bafb7f7b..6b310112cf3b 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1276,12 +1276,10 @@ static int posix_lock_inode_wait(struct inode *inode,
> struct file_lock *fl)
>  		if (error != FILE_LOCK_DEFERRED)
>  			break;
>  		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
> -		if (!error)
> -			continue;
> -
> -		locks_delete_block(fl);
> -		break;
> +		if (error)
> +			break;
>  	}
> +	locks_delete_block(fl);
>  	return error;
>  }
>
> @@ -1971,12 +1969,10 @@ static int flock_lock_inode_wait(struct inode *inode,
> struct file_lock *fl)
>  		if (error != FILE_LOCK_DEFERRED)
>  			break;
>  		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
> -		if (!error)
> -			continue;
> -
> -
Re: [BUG][BISECT] NFSv4 root failures after "fs/locks: allow a lock request to block other requests."
On Thu, Aug 16 2018, NeilBrown wrote:

> On Wed, Aug 15 2018, Jeff Layton wrote:
>
>> On Wed, 2018-08-15 at 14:28 +0200, Krzysztof Kozlowski wrote:
>>> Hi,
>>>
>>> Bisect pointed to commit ce3147990450a68b3f549088b30f087742a08b5d
>>> ("fs/locks: allow a lock request to block other requests.") as the
>>> cause of boot failures with NFSv4 root on several boards.
>>>
>>> Log is here:
>>> https://krzk.eu/#/builders/21/builds/836/steps/12/logs/serial0
>>>
>>> With several errors:
>>> kernel BUG at ../fs/locks.c:336!
>>> Unable to handle kernel NULL pointer dereference at virtual address 0004
>>>
>>> Configuration:
>>> 1. exynos_defconfig
>>> 2. Arch ARM Linux
>>> 3. Boards:
>>>    a. Odroid family (ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC)
>>>    b. Toradex Colibri VF50 (ARMv7, UP, Cortex-A5)
>>> 4. Systemd: v236, 238
>>> 5. All boards boot from TFTP with NFS root (NFSv4)
>>>
>>> On Colibri VF50 I got slightly different errors:
>>> [ 11.663204] Internal error: Oops - undefined instruction: 0 [#1] ARM
>>> [ 12.455273] Unable to handle kernel NULL pointer dereference at
>>> virtual address 0004
>>> and only with some specific GCC (v6.3) or with other conditions which
>>> I did not bisect yet. Maybe Colibri's failure is unrelated to that
>>> commit.
>>>
>>> Best regards,
>>> Krzysztof
>
> Thanks a lot for the report Krzysztof!!
>
>>
>> The BUG is due to a lock being freed when the fl_blocked list wasn't
>> empty (implying that there were still blocked locks waiting on it).
>>
>> There are a number of calls to locks_delete_lock_ctx in posix_lock_inode
>> and I don't think the fl_blocked list is being handled properly with all
>> of them. It only transplants the blocked locks to a new lock when there
>> are surviving locks on the list, and that may not be the case when the
>> whole file is being unlocked.
>
> locks_delete_lock_ctx() calls locks_unlink_lock_ctx() which calls
> locks_wake_up_block() which doesn't only wake up the blocks, but also
> detaches them. When that function completes, ->fl_blocked must be empty.
>
> The trace shows the locks_free_lock() call at the end of fcntl_setlk64()
> as the problematic call.
> This suggests that do_lock_file_wait() exited with ->fl_blocked
> non-empty, which it shouldn't.
>
> I think we need to insert a call to locks_wake_up_block() in
> do_lock_file_wait() before it returns.
> I cannot find a sequence that would make this necessary, but
> it isn't surprising that there might be one.
>
> I'll dig through the code a bit more later and make sure I understand
> what is happening.
>

I think this problem is fixed by the following. It is probably
triggered when the owner already has a lock for part of the requested
range. After waiting for some other lock, the pending request gets
merged with the existing lock, and blocked requests aren't moved across
in that case.

I still haven't done more testing, so this is just FYI, not a
submission.

Thanks,
NeilBrown

From: NeilBrown
Date: Tue, 21 Aug 2018 15:09:06 +1000
Subject: [PATCH] fs/locks: always delete_block after waiting.

Now that requests can block other requests, we
need to be careful to always clean up those blocked
requests.
Any time that we wait for a request, we might have
other requests attached, and when we stop waiting,
we must clean them up.
If the lock was granted, the requests might have been
moved to the new lock, though when merged with a
pre-existing lock, this might not happen.
In all cases we don't want blocked locks to remain
attached, so we remove them to be safe.

Signed-off-by: NeilBrown
---
 fs/locks.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index de38bafb7f7b..6b310112cf3b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1276,12 +1276,10 @@ static int posix_lock_inode_wait(struct inode *inode, struct file_lock *fl)
 		if (error != FILE_LOCK_DEFERRED)
 			break;
 		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
-		if (!error)
-			continue;
-
-		locks_delete_block(fl);
-		break;
+		if (error)
+			break;
 	}
+	locks_delete_block(fl);
 	return error;
 }
 
@@ -1971,12 +1969,10 @@ static int flock_lock_inode_wait(struct inode *inode, struct file_lock *fl)
 		if (error != FILE_LOCK_DEFERRED)
 			break;
 		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
-		if (!error)
-			continue;
-
-		locks_delete_block(fl);
-		break;
+		if (error)
+			break;
 	}
+	locks_delete_block(fl);
 	return error;
 }
 
@@ -2250,12 +2246,10 @@ static int do_lock_file_wait(struct file *filp, unsigned int cmd,
 		if (error != FILE_LOCK_DEFERRED)
 			break;
 		error =
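The control-flow change in the patch above can be sketched in user space. This is a toy model only, not kernel code: `struct toy_req`, `toy_try_lock()`, `toy_delete_block()`, `toy_wait_old()`, `toy_wait_new()` and `toy_safe_to_free()` are hypothetical names, `attached` stands in for a non-empty ->fl_blocked list, and the simulated grant deliberately leaves waiters attached to mimic the suspected merge case (an assumption taken from the explanation in this message, not something verified here). The old loop shape cleans up only when the wait is interrupted; the patched shape cleans up on every exit.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_DEFERRED 1		/* stands in for FILE_LOCK_DEFERRED */
#define TOY_EINTR    (-4)	/* stands in for an interrupted wait */

/* Hypothetical stand-in for a pending lock request. */
struct toy_req {
	int deferrals_left;	/* how many attempts are deferred before a grant */
	int wait_result;	/* what the simulated wait returns */
	int attached;		/* requests still blocked on this one */
};

/* Simulated lock attempt: deferred a few times, then granted. */
static int toy_try_lock(struct toy_req *r)
{
	if (r->deferrals_left > 0) {
		r->deferrals_left--;
		r->attached++;	/* another request queues up behind us */
		return TOY_DEFERRED;
	}
	return 0;		/* granted; attached waiters are left behind */
}

/* Plays the role of locks_delete_block(): detach everything. */
static void toy_delete_block(struct toy_req *r)
{
	r->attached = 0;
}

/* Pre-patch shape: cleanup happens only on the interrupted path. */
static int toy_wait_old(struct toy_req *r)
{
	int error;

	for (;;) {
		error = toy_try_lock(r);
		if (error != TOY_DEFERRED)
			break;
		error = r->wait_result;	/* simulated wait_event_interruptible() */
		if (!error)
			continue;
		toy_delete_block(r);
		break;
	}
	return error;
}

/* Patched shape: cleanup happens on every exit from the loop. */
static int toy_wait_new(struct toy_req *r)
{
	int error;

	for (;;) {
		error = toy_try_lock(r);
		if (error != TOY_DEFERRED)
			break;
		error = r->wait_result;	/* simulated wait_event_interruptible() */
		if (error)
			break;
	}
	toy_delete_block(r);
	return error;
}

/* Mirrors the condition whose violation was reported as a BUG on free. */
static bool toy_safe_to_free(const struct toy_req *r)
{
	return r->attached == 0;
}
```

In this model, a request that is deferred once and then granted returns 0 from both variants, but only `toy_wait_new()` leaves it in a state where `toy_safe_to_free()` holds. On the interrupted path the two variants behave the same.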
Re: [BUG][BISECT] NFSv4 root failures after "fs/locks: allow a lock request to block other requests."
On Thu, Aug 16 2018, NeilBrown wrote:

> On Wed, Aug 15 2018, Jeff Layton wrote:
>
>> On Wed, 2018-08-15 at 14:28 +0200, Krzysztof Kozlowski wrote:
>>> Hi,
>>>
>>> Bisect pointed commit ce3147990450a68b3f549088b30f087742a08b5d
>>> ("fs/locks: allow a lock request to block other requests.") to failure
>>> boot of NFSv4 with root on several boards.
>>>
>>> Log is here:
>>> https://krzk.eu/#/builders/21/builds/836/steps/12/logs/serial0
>>>
>>> With several errors:
>>> kernel BUG at ../fs/locks.c:336!
>>> Unable to handle kernel NULL pointer dereference at virtual address 0004
>>>
>>> Configuration:
>>> 1. exynos_defconfig
>>> 2. Arch ARM Linux
>>> 3. Boards:
>>>    a. Odroid family (ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC)
>>>    b. Toradex Colibri VF50 (ARMv7, UP, Cortex-A5)
>>> 4. Systemd: v236, 238
>>> 5. All boards boot from TFTP with NFS root (NFSv4)
>>>
>>> On Colibri VF50 I got slightly different errors:
>>> [ 11.663204] Internal error: Oops - undefined instruction: 0 [#1] ARM
>>> [ 12.455273] Unable to handle kernel NULL pointer dereference at
>>> virtual address 0004
>>> and only with some specific GCC (v6.3) or with other conditions which
>>> I did not bisect yet. Maybe Colibri's failure is unrelated to that
>>> commit.
>>>
>>> Best regards,
>>> Krzysztof
>
> Thanks a lot for the report Krzysztof!!
>
>>
>> The BUG is due to a lock being freed when the fl_blocked list wasn't
>> empty (implying that there were still blocked locks waiting on it).
>>
>> There are a number of calls to locks_delete_lock_ctx in posix_lock_inode
>> and I don't think the fl_blocked list is being handled properly with all
>> of them. It only transplants the blocked locks to a new lock when there
>> are surviving locks on the list, and that may not be the case when the
>> whole file is being unlocked.
>
> locks_delete_lock_ctx() calls locks_unlink_lock_ctx() which calls
> locks_wake_up_block() which doesn't only wake_up the blocks, but also
> detaches them. When that function completes, ->fl_blocked must be empty.
>
> The trace shows the locks_free_lock() call at the end of fcntl_setlk64()
> as the problematic call.
> This suggests that do_lock_file_wait() exited with ->fl_blocked
> non-empty, which it shouldn't.
>
> I think we need to insert a call to locks_wake_up_block() in
> do_lock_file_wait() before it returns.
> I cannot find a sequence that would make this necessary, but
> it isn't surprising that there might be one.
>
> I'll dig through the code a bit more later and make sure I understand
> what is happening.
>

I think this problem is fixed by the following. It is probably
triggered when the owner already has a lock for part of the requested
range. After waiting for some other lock, the pending request gets
merged with the existing lock, and blocked requests aren't moved across
in that case.

I still haven't done more testing, so this is just FYI, not a
submission.

Thanks,
NeilBrown

From: NeilBrown
Date: Tue, 21 Aug 2018 15:09:06 +1000
Subject: [PATCH] fs/locks: always delete_block after waiting.

Now that requests can block other requests, we need to be careful to
always clean up those blocked requests.
Any time that we wait for a request, we might have other requests
attached, and when we stop waiting, we must clean them up.
If the lock was granted, the requests might have been moved to the new
lock, though when merged with a pre-existing lock, this might not
happen.
In all cases we don't want blocked locks to remain attached, so we
remove them to be safe.

Signed-off-by: NeilBrown
---
 fs/locks.c | 24 +---
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index de38bafb7f7b..6b310112cf3b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1276,12 +1276,10 @@ static int posix_lock_inode_wait(struct inode *inode, struct file_lock *fl)
 		if (error != FILE_LOCK_DEFERRED)
 			break;
 		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
-		if (!error)
-			continue;
-
-		locks_delete_block(fl);
-		break;
+		if (error)
+			break;
 	}
+	locks_delete_block(fl);
 	return error;
 }
 
@@ -1971,12 +1969,10 @@ static int flock_lock_inode_wait(struct inode *inode, struct file_lock *fl)
 		if (error != FILE_LOCK_DEFERRED)
 			break;
 		error = wait_event_interruptible(fl->fl_wait, !fl->fl_blocker);
-		if (!error)
-			continue;
-
-		locks_delete_block(fl);
-		break;
+		if (error)
+			break;
 	}
+	locks_delete_block(fl);
 	return error;
 }
 
@@ -2250,12 +2246,10 @@ static int do_lock_file_wait(struct file *filp, unsigned int cmd,
 		if (error != FILE_LOCK_DEFERRED)
 			break;
 		error =
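For readers following along without the kernel tree, the control-flow change in the patch above can be sketched in user space. This is only an illustration under stated assumptions: every toy_* name, the return codes, and the request struct are hypothetical stand-ins, not kernel APIs. It shows the shape of the new loop: leave the loop on any non-deferred result or on an interrupted wait, then run the cleanup once on every exit path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes; they only mimic the roles of 0,
 * FILE_LOCK_DEFERRED and an interrupted wait in the patch. */
enum { TOY_OK = 0, TOY_DEFERRED = 1, TOY_EINTR = -4 };

/* Hypothetical request: just enough state to drive the loop. */
struct toy_request {
    int deferrals_left;  /* how many attempts return TOY_DEFERRED */
    int wait_result;     /* 0, or TOY_EINTR for an interrupted wait */
    bool has_blocked;    /* true while other requests may be attached */
    int cleanup_calls;   /* how often the cleanup ran */
};

/* Stand-in for the lock attempt made at the top of the loop. */
static int toy_try_lock(struct toy_request *r)
{
    if (r->deferrals_left > 0) {
        r->deferrals_left--;
        r->has_blocked = true;  /* while queued, others may attach to us */
        return TOY_DEFERRED;
    }
    return TOY_OK;
}

/* Stand-in for wait_event_interruptible(). */
static int toy_wait(struct toy_request *r)
{
    return r->wait_result;
}

/* Stand-in for locks_delete_block(): detach anything still attached. */
static void toy_delete_block(struct toy_request *r)
{
    r->has_blocked = false;
    r->cleanup_calls++;
}

/* Same shape as the patched wait functions: cleanup is unconditional. */
static int toy_lock_wait(struct toy_request *r)
{
    int error;

    for (;;) {
        error = toy_try_lock(r);
        if (error != TOY_DEFERRED)
            break;
        error = toy_wait(r);
        if (error)
            break;
    }
    toy_delete_block(r);
    assert(!r->has_blocked);  /* nothing may remain attached on return */
    return error;
}
```

With the pre-patch shape, the cleanup ran only when the wait was interrupted, so a request that was granted after waiting could return with has_blocked still set. That is the case the patch description is concerned with.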
Re: [BUG][BISECT] NFSv4 root failures after "fs/locks: allow a lock request to block other requests."
On Wed, Aug 15 2018, Jeff Layton wrote:

> On Wed, 2018-08-15 at 14:28 +0200, Krzysztof Kozlowski wrote:
>> Hi,
>>
>> Bisect pointed commit ce3147990450a68b3f549088b30f087742a08b5d
>> ("fs/locks: allow a lock request to block other requests.") to failure
>> boot of NFSv4 with root on several boards.
>>
>> Log is here:
>> https://krzk.eu/#/builders/21/builds/836/steps/12/logs/serial0
>>
>> With several errors:
>> kernel BUG at ../fs/locks.c:336!
>> Unable to handle kernel NULL pointer dereference at virtual address 0004
>>
>> Configuration:
>> 1. exynos_defconfig
>> 2. Arch ARM Linux
>> 3. Boards:
>>    a. Odroid family (ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC)
>>    b. Toradex Colibri VF50 (ARMv7, UP, Cortex-A5)
>> 4. Systemd: v236, 238
>> 5. All boards boot from TFTP with NFS root (NFSv4)
>>
>> On Colibri VF50 I got slightly different errors:
>> [ 11.663204] Internal error: Oops - undefined instruction: 0 [#1] ARM
>> [ 12.455273] Unable to handle kernel NULL pointer dereference at
>> virtual address 0004
>> and only with some specific GCC (v6.3) or with other conditions which
>> I did not bisect yet. Maybe Colibri's failure is unrelated to that
>> commit.
>>
>> Best regards,
>> Krzysztof

Thanks a lot for the report Krzysztof!!

>
> The BUG is due to a lock being freed when the fl_blocked list wasn't
> empty (implying that there were still blocked locks waiting on it).
>
> There are a number of calls to locks_delete_lock_ctx in posix_lock_inode
> and I don't think the fl_blocked list is being handled properly with all
> of them. It only transplants the blocked locks to a new lock when there
> are surviving locks on the list, and that may not be the case when the
> whole file is being unlocked.

locks_delete_lock_ctx() calls locks_unlink_lock_ctx() which calls
locks_wake_up_block() which doesn't only wake_up the blocks, but also
detaches them. When that function completes, ->fl_blocked must be empty.

The trace shows the locks_free_lock() call at the end of fcntl_setlk64()
as the problematic call.
This suggests that do_lock_file_wait() exited with ->fl_blocked
non-empty, which it shouldn't.

I think we need to insert a call to locks_wake_up_block() in
do_lock_file_wait() before it returns.
I cannot find a sequence that would make this necessary, but
it isn't surprising that there might be one.

I'll dig through the code a bit more later and make sure I understand
what is happening.

Thanks,
NeilBrown
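The "wake up and also detach" behaviour described in this message can be illustrated with a small user-space model. It is a sketch under assumptions, not the code in fs/locks.c: the list helpers only imitate the kernel's list_head, and toy_lock, toy_block_on and toy_wake_up_blocks are hypothetical names. The point is the postcondition: after the drain, the blocker's list is empty and no waiter still points at the blocker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, imitating the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
    return h->next == h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

static void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    list_init(entry);
}

/* Hypothetical stand-in for struct file_lock: only the fields needed here. */
struct toy_lock {
    struct list_head blocked;      /* requests waiting on this lock */
    struct list_head block_entry;  /* our link in a blocker's list */
    struct toy_lock *blocker;      /* lock we are waiting on, or NULL */
};

static void toy_lock_init(struct toy_lock *l)
{
    list_init(&l->blocked);
    list_init(&l->block_entry);
    l->blocker = NULL;
}

/* Queue 'waiter' behind 'blocker'. */
static void toy_block_on(struct toy_lock *blocker, struct toy_lock *waiter)
{
    waiter->blocker = blocker;
    list_add_tail(&waiter->block_entry, &blocker->blocked);
}

/* Detach every waiter from 'blocker'; returns how many were detached.
 * A real implementation would also wake each waiter's wait queue. */
static int toy_wake_up_blocks(struct toy_lock *blocker)
{
    int detached = 0;

    while (!list_empty(&blocker->blocked)) {
        struct list_head *e = blocker->blocked.next;
        struct toy_lock *w = (struct toy_lock *)
            ((char *)e - offsetof(struct toy_lock, block_entry));

        list_del_init(e);
        w->blocker = NULL;
        detached++;
    }
    assert(list_empty(&blocker->blocked));  /* the claimed postcondition */
    return detached;
}
```

If some path returns without running such a drain on a lock that acquired waiters, a later free would find the list non-empty, which matches the symptom in the trace.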
Re: [BUG][BISECT] NFSv4 root failures after "fs/locks: allow a lock request to block other requests."
On Wed, 2018-08-15 at 14:28 +0200, Krzysztof Kozlowski wrote:
> Hi,
>
> Bisect pointed commit ce3147990450a68b3f549088b30f087742a08b5d
> ("fs/locks: allow a lock request to block other requests.") to failure
> boot of NFSv4 with root on several boards.
>
> Log is here:
> https://krzk.eu/#/builders/21/builds/836/steps/12/logs/serial0
>
> With several errors:
> kernel BUG at ../fs/locks.c:336!
> Unable to handle kernel NULL pointer dereference at virtual address 0004
>
> Configuration:
> 1. exynos_defconfig
> 2. Arch ARM Linux
> 3. Boards:
>    a. Odroid family (ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC)
>    b. Toradex Colibri VF50 (ARMv7, UP, Cortex-A5)
> 4. Systemd: v236, 238
> 5. All boards boot from TFTP with NFS root (NFSv4)
>
> On Colibri VF50 I got slightly different errors:
> [ 11.663204] Internal error: Oops - undefined instruction: 0 [#1] ARM
> [ 12.455273] Unable to handle kernel NULL pointer dereference at
> virtual address 0004
> and only with some specific GCC (v6.3) or with other conditions which
> I did not bisect yet. Maybe Colibri's failure is unrelated to that
> commit.
>
> Best regards,
> Krzysztof

The BUG is due to a lock being freed when the fl_blocked list wasn't
empty (implying that there were still blocked locks waiting on it).

There are a number of calls to locks_delete_lock_ctx in posix_lock_inode
and I don't think the fl_blocked list is being handled properly with all
of them. It only transplants the blocked locks to a new lock when there
are surviving locks on the list, and that may not be the case when the
whole file is being unlocked.

-- 
Jeff Layton
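The scenario hypothesized in this message can be modelled in a few lines of user-space C. This is a sketch of the hypothesis only, not a statement about what fs/locks.c does: the toy_* types and functions are hypothetical, and the assert() in toy_free_lock() merely stands in for the kernel sanity check that produced the BUG. Waiters are transplanted only when a surviving lock exists; when none does, they stay attached and the lock is not safe to free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiter and lock types: a lock owns a singly linked
 * list of the requests blocked on it. */
struct toy_waiter {
    struct toy_waiter *next;
};

struct toy_lock {
    struct toy_waiter *blocked;
};

static void toy_add_waiter(struct toy_lock *l, struct toy_waiter *w)
{
    w->next = l->blocked;
    l->blocked = w;
}

/* Move every waiter from 'old' to 'survivor'. With no survivor
 * (modelling "the whole file is being unlocked") nothing is moved. */
static void toy_transplant(struct toy_lock *old, struct toy_lock *survivor)
{
    if (survivor == NULL)
        return;
    while (old->blocked != NULL) {
        struct toy_waiter *w = old->blocked;

        old->blocked = w->next;
        toy_add_waiter(survivor, w);
    }
}

/* A lock may only be freed once nothing is blocked on it. */
static bool toy_safe_to_free(const struct toy_lock *l)
{
    return l->blocked == NULL;
}

/* Stand-in for the free path; the assert plays the role of the BUG. */
static void toy_free_lock(struct toy_lock *l)
{
    assert(toy_safe_to_free(l));
    l->blocked = NULL;
}
```

Calling toy_free_lock() right after toy_transplant(&old, NULL) would trip the assert, which is the user-space analogue of the reported failure.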
Re: [BUG][BISECT] NFSv4 root failures after "fs/locks: allow a lock request to block other requests."
On Wed, 2018-08-15 at 14:28 +0200, Krzysztof Kozlowski wrote:
> Hi,
>
> Bisect pointed commit ce3147990450a68b3f549088b30f087742a08b5d
> ("fs/locks: allow a lock request to block other requests.") to failure
> boot of NFSv4 with root on several boards.
>
> Log is here:
> https://krzk.eu/#/builders/21/builds/836/steps/12/logs/serial0
>
> With several errors:
> kernel BUG at ../fs/locks.c:336!
> Unable to handle kernel NULL pointer dereference at virtual address 0004
>
> Configuration:
> 1. exynos_defconfig
> 2. Arch ARM Linux
> 3. Boards:
>    a. Odroid family (ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC)
>    b. Toradex Colibri VF50 (ARMv7, UP, Cortex-A5)
> 4. Systemd: v236, 238
> 5. All boards boot from TFTP with NFS root (NFSv4)
>
> On Colibri VF50 I got slightly different errors:
> [ 11.663204] Internal error: Oops - undefined instruction: 0 [#1] ARM
> [ 12.455273] Unable to handle kernel NULL pointer dereference at
> virtual address 0004
> and only with some specific GCC (v6.3) or with other conditions which
> I did not bisect yet. Maybe Colibri's failure is unrelated to that
> commit.
>
> Best regards,
> Krzysztof

Thanks Krzysztof,

I or Neil will see if this is reproducible and see about coming up with
a fix. For now, I'll take this out of -next.

Thanks,
-- 
Jeff Layton
[BUG][BISECT] NFSv4 root failures after "fs/locks: allow a lock request to block other requests."
Hi,

Bisect pointed commit ce3147990450a68b3f549088b30f087742a08b5d
("fs/locks: allow a lock request to block other requests.") to failure
boot of NFSv4 with root on several boards.

Log is here:
https://krzk.eu/#/builders/21/builds/836/steps/12/logs/serial0

With several errors:
kernel BUG at ../fs/locks.c:336!
Unable to handle kernel NULL pointer dereference at virtual address 0004

Configuration:
1. exynos_defconfig
2. Arch ARM Linux
3. Boards:
   a. Odroid family (ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC)
   b. Toradex Colibri VF50 (ARMv7, UP, Cortex-A5)
4. Systemd: v236, 238
5. All boards boot from TFTP with NFS root (NFSv4)

On Colibri VF50 I got slightly different errors:
[ 11.663204] Internal error: Oops - undefined instruction: 0 [#1] ARM
[ 12.455273] Unable to handle kernel NULL pointer dereference at
virtual address 0004
and only with some specific GCC (v6.3) or with other conditions which
I did not bisect yet. Maybe Colibri's failure is unrelated to that
commit.

Best regards,
Krzysztof