Re: in_compat_syscall() returns from kernel thread for X86_32.
On Wed 2018-10-24 10:32:37, Theodore Y. Ts'o wrote: > On Wed, Oct 24, 2018 at 09:15:34AM -0400, Theodore Y. Ts'o wrote: > > At least for ext4, the primary problem is that we want to use a 64-bit > > telldir/seekdir cookie if all 64-bits will make it to user space, and > > a 32-bit telldir cookie if only 32 bits will make it userspace. This > > impacts NFS as well because if there are people who are still using > > NFSv2, which has 32-bit directory offsets, we need to constrain the > > telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a > > 64-bit hash. > > Are there anyone still using NFSv2, BTW? One way of making the > problem *much* easier to sovle would be to drop NFSv2 support. :-) NFSv2 is now in "unexcpected" places such as U-Boot bootloader. I'm pretty sure someone still uses it... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
Re: in_compat_syscall() returns from kernel thread for X86_32.
> On Oct 24, 2018, at 8:46 PM, NeilBrown wrote: > >> On Wed, Oct 24 2018, Theodore Y. Ts'o wrote: >> >>> On Wed, Oct 24, 2018 at 12:47:57PM +1100, NeilBrown wrote: >>> >>> I doubt it was copied - more likely independent evolution. >>> But on reflection, I see that it is probably reasonable that it >>> shouldn't be used this way - or at all in this context. >>> I'll try to understand what the issues really are and see if I can >>> find a solution that doesn't depend on this interface. >>> Thanks for your help. >> >> At least for ext4, the primary problem is that we want to use a 64-bit >> telldir/seekdir cookie if all 64-bits will make it to user space, and >> a 32-bit telldir cookie if only 32 bits will make it userspace. This >> impacts NFS as well because if there are people who are still using >> NFSv2, which has 32-bit directory offsets, we need to constrain the >> telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a >> 64-bit hash. > > NFSd uses FMODE_32BITHASH or FMODE64BITHASH to allow ext4 to do the > right thing. FMODE_32BITHASH is set for NFSv2 only. > > Maybe sys_getdents needs to set FMODE_32BITHASH, and sys_getdent64 needs > to set FMODE_64BITHASH - or something like that. It’s possible for a 32-bit process and a 64-bit process to share a directory fd, so I don’t think it’s quite that simple. One option would be to add .llseek and .getdents flags or entire new compat operations to indicate that the caller expects 32-bit offsets. I wonder how overlayfs interacts with this whole mess.
Re: in_compat_syscall() returns from kernel thread for X86_32.
On Wed, Oct 24 2018, Theodore Y. Ts'o wrote: > On Wed, Oct 24, 2018 at 12:47:57PM +1100, NeilBrown wrote: >> >> I doubt it was copied - more likely independent evolution. >> But on reflection, I see that it is probably reasonable that it >> shouldn't be used this way - or at all in this context. >> I'll try to understand what the issues really are and see if I can >> find a solution that doesn't depend on this interface. >> Thanks for your help. > > At least for ext4, the primary problem is that we want to use a 64-bit > telldir/seekdir cookie if all 64-bits will make it to user space, and > a 32-bit telldir cookie if only 32 bits will make it userspace. This > impacts NFS as well because if there are people who are still using > NFSv2, which has 32-bit directory offsets, we need to constrain the > telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a > 64-bit hash. NFSd uses FMODE_32BITHASH or FMODE64BITHASH to allow ext4 to do the right thing. FMODE_32BITHASH is set for NFSv2 only. Maybe sys_getdents needs to set FMODE_32BITHASH, and sys_getdent64 needs to set FMODE_64BITHASH - or something like that. For lustre it is a bit more complex. The internal "inode number" is 128 bits and we (sort of) hash it to 32 or 64 bits. cp_compat_stat() just says -EOVERFLOW if we give a 64 bit number when 32 are expected, and there is no flag to say "this is a 32-bit 'stat' request". But I need to dig into exactly what that "sort-of" means - maybe there is an answer in there. NeilBrown signature.asc Description: PGP signature
Re: in_compat_syscall() returns from kernel thread for X86_32.
On Wed, Oct 24, 2018 at 09:15:34AM -0400, Theodore Y. Ts'o wrote: > At least for ext4, the primary problem is that we want to use a 64-bit > telldir/seekdir cookie if all 64-bits will make it to user space, and > a 32-bit telldir cookie if only 32 bits will make it userspace. This > impacts NFS as well because if there are people who are still using > NFSv2, which has 32-bit directory offsets, we need to constrain the > telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a > 64-bit hash. Are there anyone still using NFSv2, BTW? One way of making the problem *much* easier to sovle would be to drop NFSv2 support. :-) - Ted
Re: in_compat_syscall() returns from kernel thread for X86_32.
On Wed, Oct 24, 2018 at 12:47:57PM +1100, NeilBrown wrote: > > I doubt it was copied - more likely independent evolution. > But on reflection, I see that it is probably reasonable that it > shouldn't be used this way - or at all in this context. > I'll try to understand what the issues really are and see if I can > find a solution that doesn't depend on this interface. > Thanks for your help. At least for ext4, the primary problem is that we want to use a 64-bit telldir/seekdir cookie if all 64-bits will make it to user space, and a 32-bit telldir cookie if only 32 bits will make it userspace. This impacts NFS as well because if there are people who are still using NFSv2, which has 32-bit directory offsets, we need to constrain the telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a 64-bit hash. - Ted
Re: in_compat_syscall() returns from kernel thread for X86_32.
On Thu, Oct 18 2018, Andy Lutomirski wrote: > On Wed, Oct 17, 2018 at 9:36 PM NeilBrown wrote: >> >> On Wed, Oct 17 2018, Andy Lutomirski wrote: >> >> > On Wed, Oct 17, 2018 at 6:48 PM NeilBrown wrote: >> >> >> >> >> >> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to >> >> in_{ia32,x32}_syscall() >> >> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote: >> >> >> >> > Commit-ID: abfb9498ee1327f534df92a7ecaea81a85913bae >> >> > Gitweb: >> >> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae >> >> > Author: Dmitry Safonov >> >> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300 >> >> > Committer: Ingo Molnar >> >> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200 >> >> > >> >> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() >> >> > >> >> ... >> >> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void) >> >> > >> >> > static inline bool in_compat_syscall(void) >> >> > { >> >> > - return is_ia32_task() || is_x32_task(); >> >> > + return in_ia32_syscall() || in_x32_syscall(); >> >> > } >> >> >> >> Hi, >> >> I'm reply to this patch largely to make sure I get the right people >> >> . >> >> >> >> This test is always true when CONFIG_X86_32 is set, as that forces >> >> in_ia32_syscall() to true. >> >> However we might not be in a syscall at all - we might be running a >> >> kernel thread which is always in 64 mode. >> >> Every other implementation of in_compat_syscall() that I found is >> >> dependant on a thread flag or syscall register flag, and so returns >> >> "false" in a kernel thread. >> >> >> >> Might something like this be appropriate? >> >> >> >> diff --git a/arch/x86/include/asm/thread_info.h >> >> b/arch/x86/include/asm/thread_info.h >> >> index 2ff2a30a264f..c265b40a78f2 100644 >> >> --- a/arch/x86/include/asm/thread_info.h >> >> +++ b/arch/x86/include/asm/thread_info.h >> >> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void >> >> * const stack, >> >> #ifndef __ASSEMBLY__ >> >> >> >> #ifdef CONFIG_X86_32 >> >> -#define in_ia32_syscall() true >> >> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD)) >> >> #else >> >> #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \ >> >>current_thread_info()->status & TS_COMPAT) >> >> >> >> This came up in the (no out-of-tree) lustre filesystem where some code >> >> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel >> >> threads. >> >> >> > >> > I could get on board with: >> > >> > ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true}) >> > >> > The point of these accessors is to be used *in a syscall*. >> > >> > What on Earth is Lustre doing that makes it have this problem? >> >> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and >> ->rdev are appropriately sized. This isn't very different from the >> usage in ext4 to ensure the seek offset for directories is suitable. >> >> These interfaces can be used both from systemcalls and from kernel >> threads, such as via nfsd. >> >> I don't *know* if nfsd is the particular kthread that causes problems >> for lustre. All I know is that ->getattr returns 32bit squashed inode >> numbers in kthread context where 64 bit numbers would be expected. >> > > Well, that looks like Lustre is copying an ext4 bug. I doubt it was copied - more likely independent evolution. But on reflection, I see that it is probably reasonable that it shouldn't be used this way - or at all in this context. I'll try to understand what the issues really are and see if I can find a solution that doesn't depend on this interface. Thanks for your help. NeilBrown > > Hi ext4 people- > > ext4's is_32bit_api() function is bogus. You can't use > in_compat_syscall() unless you know you're in a syscall > > The buggy code was introduced in: > > commit d1f5273e9adb40724a85272f248f210dc4ce919a > Author: Fan Yong > Date: Sun Mar 18 22:44:40 2012 -0400 > > ext4: return 32/64-bit dir name hash according to usage type > > I don't know what the right solution is. Al, is it legit at all for > fops->llseek to care about the caller's bitness? If what ext4 is > doing is legit, then ISTM the VFS needs to gain a new API to tell > ->llseek what to do. But I'm wondering why FMODE_64BITHASH by itself > isn't sufficient, > > I'm quite tempted to add a warning to the x86 arch code to try to > catch this type of bug. Fortunately, a bit of grepping suggests that > ext4 is the only filesystem with this problem. > > --Andy signature.asc Description: PGP signature
Re: in_compat_syscall() returns from kernel thread for X86_32.
> On Oct 19, 2018, at 11:02 PM, Andreas Dilger wrote: > >> On Oct 18, 2018, at 11:26 AM, Andy Lutomirski wrote: >> >>> On Wed, Oct 17, 2018 at 9:36 PM NeilBrown wrote: >>> On Wed, Oct 17 2018, Andy Lutomirski wrote: > On Wed, Oct 17, 2018 at 6:48 PM NeilBrown wrote: > > > Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to > in_{ia32,x32}_syscall() >> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote: >> >> Commit-ID: abfb9498ee1327f534df92a7ecaea81a85913bae >> Gitweb: >> http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae >> Author: Dmitry Safonov >> AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300 >> Committer: Ingo Molnar >> CommitDate: Tue, 19 Apr 2016 10:44:52 +0200 >> >> x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() >> > ... >> @@ -318,7 +318,7 @@ static inline bool is_x32_task(void) >> >> static inline bool in_compat_syscall(void) >> { >> - return is_ia32_task() || is_x32_task(); >> + return in_ia32_syscall() || in_x32_syscall(); >> } > > Hi, > I'm reply to this patch largely to make sure I get the right people > . > > This test is always true when CONFIG_X86_32 is set, as that forces > in_ia32_syscall() to true. > However we might not be in a syscall at all - we might be running a > kernel thread which is always in 64 mode. > Every other implementation of in_compat_syscall() that I found is > dependant on a thread flag or syscall register flag, and so returns > "false" in a kernel thread. > > Might something like this be appropriate? > > diff --git a/arch/x86/include/asm/thread_info.h > b/arch/x86/include/asm/thread_info.h > index 2ff2a30a264f..c265b40a78f2 100644 > --- a/arch/x86/include/asm/thread_info.h > +++ b/arch/x86/include/asm/thread_info.h > @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void > * const stack, > #ifndef __ASSEMBLY__ > > #ifdef CONFIG_X86_32 > -#define in_ia32_syscall() true > +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD)) > #else > #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \ > current_thread_info()->status & TS_COMPAT) > > This came up in the (no out-of-tree) lustre filesystem where some code > needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel > threads. > I could get on board with: ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true}) The point of these accessors is to be used *in a syscall*. What on Earth is Lustre doing that makes it have this problem? >>> >>> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and >>> ->rdev are appropriately sized. This isn't very different from the >>> usage in ext4 to ensure the seek offset for directories is suitable. >>> >>> These interfaces can be used both from systemcalls and from kernel >>> threads, such as via nfsd. >>> >>> I don't *know* if nfsd is the particular kthread that causes problems >>> for lustre. All I know is that ->getattr returns 32bit squashed inode >>> numbers in kthread context where 64 bit numbers would be expected. >>> >> >> Well, that looks like Lustre is copying an ext4 bug. >> >> Hi ext4 people- >> >> ext4's is_32bit_api() function is bogus. You can't use >> in_compat_syscall() unless you know you're in a syscall >> >> The buggy code was introduced in: >> >> commit d1f5273e9adb40724a85272f248f210dc4ce919a >> Author: Fan Yong >> Date: Sun Mar 18 22:44:40 2012 -0400 >> >> ext4: return 32/64-bit dir name hash according to usage type >> >> I don't know what the right solution is. Al, is it legit at all for >> fops->llseek to care about the caller's bitness? If what ext4 is >> doing is legit, then ISTM the VFS needs to gain a new API to tell >> ->llseek what to do. But I'm wondering why FMODE_64BITHASH by itself >> isn't sufficient, >> >> I'm quite tempted to add a warning to the x86 arch code to try to >> catch this type of bug. Fortunately, a bit of grepping suggests that >> ext4 is the only filesystem with this problem. > > We need to know whether the readdir cookie returned to userspace > should be a 32-bit cookie or a 64-bit cookie. Trying to return > a 64-bit value will result in -EOVERFLOW for a 32-bit application, > but is preferable (if possible) because it reduces the chance of > hash collisions causing readdir to have problems. > > Let’s rope Al in. Sorry, I thought he was already on cc. The concept seems reasonable, but the implementation is problematic. For example, the behavior of calling vfs_llseek() is basically undefined. Is some VFS change needed to fix this? Maybe a .compat_llseek or some other explicit indication of whether a 64-bit hash is okay?
Re: in_compat_syscall() returns from kernel thread for X86_32.
On Oct 18, 2018, at 11:26 AM, Andy Lutomirski wrote: > > On Wed, Oct 17, 2018 at 9:36 PM NeilBrown wrote: >> >> On Wed, Oct 17 2018, Andy Lutomirski wrote: >> >>> On Wed, Oct 17, 2018 at 6:48 PM NeilBrown wrote: Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote: > Commit-ID: abfb9498ee1327f534df92a7ecaea81a85913bae > Gitweb: > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae > Author: Dmitry Safonov > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300 > Committer: Ingo Molnar > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200 > > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() > ... > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void) > > static inline bool in_compat_syscall(void) > { > - return is_ia32_task() || is_x32_task(); > + return in_ia32_syscall() || in_x32_syscall(); > } Hi, I'm reply to this patch largely to make sure I get the right people . This test is always true when CONFIG_X86_32 is set, as that forces in_ia32_syscall() to true. However we might not be in a syscall at all - we might be running a kernel thread which is always in 64 mode. Every other implementation of in_compat_syscall() that I found is dependant on a thread flag or syscall register flag, and so returns "false" in a kernel thread. Might something like this be appropriate? diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 2ff2a30a264f..c265b40a78f2 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * const stack, #ifndef __ASSEMBLY__ #ifdef CONFIG_X86_32 -#define in_ia32_syscall() true +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD)) #else #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \ current_thread_info()->status & TS_COMPAT) This came up in the (no out-of-tree) lustre filesystem where some code needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel threads. >>> >>> I could get on board with: >>> >>> ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true}) >>> >>> The point of these accessors is to be used *in a syscall*. >>> >>> What on Earth is Lustre doing that makes it have this problem? >> >> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and >> ->rdev are appropriately sized. This isn't very different from the >> usage in ext4 to ensure the seek offset for directories is suitable. >> >> These interfaces can be used both from systemcalls and from kernel >> threads, such as via nfsd. >> >> I don't *know* if nfsd is the particular kthread that causes problems >> for lustre. All I know is that ->getattr returns 32bit squashed inode >> numbers in kthread context where 64 bit numbers would be expected. >> > > Well, that looks like Lustre is copying an ext4 bug. > > Hi ext4 people- > > ext4's is_32bit_api() function is bogus. You can't use > in_compat_syscall() unless you know you're in a syscall > > The buggy code was introduced in: > > commit d1f5273e9adb40724a85272f248f210dc4ce919a > Author: Fan Yong > Date: Sun Mar 18 22:44:40 2012 -0400 > >ext4: return 32/64-bit dir name hash according to usage type > > I don't know what the right solution is. Al, is it legit at all for > fops->llseek to care about the caller's bitness? If what ext4 is > doing is legit, then ISTM the VFS needs to gain a new API to tell > ->llseek what to do. But I'm wondering why FMODE_64BITHASH by itself > isn't sufficient, > > I'm quite tempted to add a warning to the x86 arch code to try to > catch this type of bug. Fortunately, a bit of grepping suggests that > ext4 is the only filesystem with this problem. We need to know whether the readdir cookie returned to userspace should be a 32-bit cookie or a 64-bit cookie. Trying to return a 64-bit value will result in -EOVERFLOW for a 32-bit application, but is preferable (if possible) because it reduces the chance of hash collisions causing readdir to have problems. Cheers, Andreas signature.asc Description: Message signed with OpenPGP
Re: in_compat_syscall() returns from kernel thread for X86_32.
On Wed, Oct 17, 2018 at 9:36 PM NeilBrown wrote: > > On Wed, Oct 17 2018, Andy Lutomirski wrote: > > > On Wed, Oct 17, 2018 at 6:48 PM NeilBrown wrote: > >> > >> > >> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to > >> in_{ia32,x32}_syscall() > >> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote: > >> > >> > Commit-ID: abfb9498ee1327f534df92a7ecaea81a85913bae > >> > Gitweb: > >> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae > >> > Author: Dmitry Safonov > >> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300 > >> > Committer: Ingo Molnar > >> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200 > >> > > >> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() > >> > > >> ... > >> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void) > >> > > >> > static inline bool in_compat_syscall(void) > >> > { > >> > - return is_ia32_task() || is_x32_task(); > >> > + return in_ia32_syscall() || in_x32_syscall(); > >> > } > >> > >> Hi, > >> I'm reply to this patch largely to make sure I get the right people > >> . > >> > >> This test is always true when CONFIG_X86_32 is set, as that forces > >> in_ia32_syscall() to true. > >> However we might not be in a syscall at all - we might be running a > >> kernel thread which is always in 64 mode. > >> Every other implementation of in_compat_syscall() that I found is > >> dependant on a thread flag or syscall register flag, and so returns > >> "false" in a kernel thread. > >> > >> Might something like this be appropriate? > >> > >> diff --git a/arch/x86/include/asm/thread_info.h > >> b/arch/x86/include/asm/thread_info.h > >> index 2ff2a30a264f..c265b40a78f2 100644 > >> --- a/arch/x86/include/asm/thread_info.h > >> +++ b/arch/x86/include/asm/thread_info.h > >> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void > >> * const stack, > >> #ifndef __ASSEMBLY__ > >> > >> #ifdef CONFIG_X86_32 > >> -#define in_ia32_syscall() true > >> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD)) > >> #else > >> #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \ > >>current_thread_info()->status & TS_COMPAT) > >> > >> This came up in the (no out-of-tree) lustre filesystem where some code > >> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel > >> threads. > >> > > > > I could get on board with: > > > > ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true}) > > > > The point of these accessors is to be used *in a syscall*. > > > > What on Earth is Lustre doing that makes it have this problem? > > Lustre uses it in the ->getattr method to make sure ->ino, ->dev and > ->rdev are appropriately sized. This isn't very different from the > usage in ext4 to ensure the seek offset for directories is suitable. > > These interfaces can be used both from systemcalls and from kernel > threads, such as via nfsd. > > I don't *know* if nfsd is the particular kthread that causes problems > for lustre. All I know is that ->getattr returns 32bit squashed inode > numbers in kthread context where 64 bit numbers would be expected. > Well, that looks like Lustre is copying an ext4 bug. Hi ext4 people- ext4's is_32bit_api() function is bogus. You can't use in_compat_syscall() unless you know you're in a syscall The buggy code was introduced in: commit d1f5273e9adb40724a85272f248f210dc4ce919a Author: Fan Yong Date: Sun Mar 18 22:44:40 2012 -0400 ext4: return 32/64-bit dir name hash according to usage type I don't know what the right solution is. Al, is it legit at all for fops->llseek to care about the caller's bitness? If what ext4 is doing is legit, then ISTM the VFS needs to gain a new API to tell ->llseek what to do. But I'm wondering why FMODE_64BITHASH by itself isn't sufficient, I'm quite tempted to add a warning to the x86 arch code to try to catch this type of bug. Fortunately, a bit of grepping suggests that ext4 is the only filesystem with this problem. --Andy
Re: in_compat_syscall() returns from kernel thread for X86_32.
On Wed, Oct 17 2018, Andy Lutomirski wrote: > On Wed, Oct 17, 2018 at 6:48 PM NeilBrown wrote: >> >> >> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to >> in_{ia32,x32}_syscall() >> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote: >> >> > Commit-ID: abfb9498ee1327f534df92a7ecaea81a85913bae >> > Gitweb: >> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae >> > Author: Dmitry Safonov >> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300 >> > Committer: Ingo Molnar >> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200 >> > >> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() >> > >> ... >> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void) >> > >> > static inline bool in_compat_syscall(void) >> > { >> > - return is_ia32_task() || is_x32_task(); >> > + return in_ia32_syscall() || in_x32_syscall(); >> > } >> >> Hi, >> I'm reply to this patch largely to make sure I get the right people >> . >> >> This test is always true when CONFIG_X86_32 is set, as that forces >> in_ia32_syscall() to true. >> However we might not be in a syscall at all - we might be running a >> kernel thread which is always in 64 mode. >> Every other implementation of in_compat_syscall() that I found is >> dependant on a thread flag or syscall register flag, and so returns >> "false" in a kernel thread. >> >> Might something like this be appropriate? >> >> diff --git a/arch/x86/include/asm/thread_info.h >> b/arch/x86/include/asm/thread_info.h >> index 2ff2a30a264f..c265b40a78f2 100644 >> --- a/arch/x86/include/asm/thread_info.h >> +++ b/arch/x86/include/asm/thread_info.h >> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * >> const stack, >> #ifndef __ASSEMBLY__ >> >> #ifdef CONFIG_X86_32 >> -#define in_ia32_syscall() true >> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD)) >> #else >> #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \ >>current_thread_info()->status & TS_COMPAT) >> >> This came up in the (no out-of-tree) lustre filesystem where some code >> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel >> threads. >> > > I could get on board with: > > ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true}) > > The point of these accessors is to be used *in a syscall*. > > What on Earth is Lustre doing that makes it have this problem? Lustre uses it in the ->getattr method to make sure ->ino, ->dev and ->rdev are appropriately sized. This isn't very different from the usage in ext4 to ensure the seek offset for directories is suitable. These interfaces can be used both from systemcalls and from kernel threads, such as via nfsd. I don't *know* if nfsd is the particular kthread that causes problems for lustre. All I know is that ->getattr returns 32bit squashed inode numbers in kthread context where 64 bit numbers would be expected. Thanks, NeilBrown signature.asc Description: PGP signature
Re: in_compat_syscall() returns from kernel thread for X86_32.
On Wed, Oct 17, 2018 at 07:37:42PM -0700, Andy Lutomirski wrote: > I could get on board with: > > ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true}) > > The point of these accessors is to be used *in a syscall*. > > What on Earth is Lustre doing that makes it have this problem? Plays with timestamps from a kernel thread, perhaps? See the old .su joke about adenoidectomy with rectal access for possible reasons for doing it that way...
Re: in_compat_syscall() returns from kernel thread for X86_32.
On Wed, Oct 17, 2018 at 6:48 PM NeilBrown wrote: > > > Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to > in_{ia32,x32}_syscall() > On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote: > > > Commit-ID: abfb9498ee1327f534df92a7ecaea81a85913bae > > Gitweb: > > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae > > Author: Dmitry Safonov > > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300 > > Committer: Ingo Molnar > > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200 > > > > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() > > > ... > > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void) > > > > static inline bool in_compat_syscall(void) > > { > > - return is_ia32_task() || is_x32_task(); > > + return in_ia32_syscall() || in_x32_syscall(); > > } > > Hi, > I'm reply to this patch largely to make sure I get the right people > . > > This test is always true when CONFIG_X86_32 is set, as that forces > in_ia32_syscall() to true. > However we might not be in a syscall at all - we might be running a > kernel thread which is always in 64 mode. > Every other implementation of in_compat_syscall() that I found is > dependant on a thread flag or syscall register flag, and so returns > "false" in a kernel thread. > > Might something like this be appropriate? > > diff --git a/arch/x86/include/asm/thread_info.h > b/arch/x86/include/asm/thread_info.h > index 2ff2a30a264f..c265b40a78f2 100644 > --- a/arch/x86/include/asm/thread_info.h > +++ b/arch/x86/include/asm/thread_info.h > @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * > const stack, > #ifndef __ASSEMBLY__ > > #ifdef CONFIG_X86_32 > -#define in_ia32_syscall() true > +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD)) > #else > #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \ >current_thread_info()->status & TS_COMPAT) > > This came up in the (no out-of-tree) lustre filesystem where some code > needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel > threads. > I could get on board with: ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true}) The point of these accessors is to be used *in a syscall*. What on Earth is Lustre doing that makes it have this problem?
in_compat_syscall() returns from kernel thread for X86_32.
Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote: > Commit-ID: abfb9498ee1327f534df92a7ecaea81a85913bae > Gitweb: http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae > Author: Dmitry Safonov > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300 > Committer: Ingo Molnar > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200 > > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() > ... > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void) > > static inline bool in_compat_syscall(void) > { > - return is_ia32_task() || is_x32_task(); > + return in_ia32_syscall() || in_x32_syscall(); > } Hi, I'm reply to this patch largely to make sure I get the right people . This test is always true when CONFIG_X86_32 is set, as that forces in_ia32_syscall() to true. However we might not be in a syscall at all - we might be running a kernel thread which is always in 64 mode. Every other implementation of in_compat_syscall() that I found is dependant on a thread flag or syscall register flag, and so returns "false" in a kernel thread. Might something like this be appropriate? diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 2ff2a30a264f..c265b40a78f2 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * const stack, #ifndef __ASSEMBLY__ #ifdef CONFIG_X86_32 -#define in_ia32_syscall() true +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD)) #else #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \ current_thread_info()->status & TS_COMPAT) This came up in the (no out-of-tree) lustre filesystem where some code needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel threads. Thanks, NeilBrown signature.asc Description: PGP signature