Re: [PATCH v2 13/29] arch: add split IPC system calls where needed
On Fri, Jan 18, 2019 at 05:18:19PM +0100, Arnd Bergmann wrote:
> The IPC system call handling is highly inconsistent across architectures,
> some use sys_ipc, some use separate calls, and some use both. We also
> have some architectures that require passing IPC_64 in the flags, and
> others that set it implicitly.
>
> For the additon of a y2083 safe semtimedop() system call, I chose to only

It's not critical, but there are two typos in that line:
additon -> addition
2083 -> 2038

Gabriel

> support the separate entry points, but that requires first supporting
> the regular ones with their own syscall numbers.
>
> The IPC_64 is now implied by the new semctl/shmctl/msgctl system
> calls even on the architectures that require passing it with the ipc()
> multiplexer.
>
> I'm not adding the new semtimedop() or semop() on 32-bit architectures,
> those will get implemented using the new semtimedop_time64() version
> that gets added along with the other time64 calls.
> Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop().
>
> Signed-off-by: Arnd Bergmann
Re: [PATCH v2 29/29] y2038: add 64-bit time_t syscalls to all 32-bit architectures
On Fri, Jan 18, 2019 at 8:53 PM Andy Lutomirski wrote:
> I think we have two issues if we reuse those numbers for new syscalls.
> First, I'd really like to see new syscalls be numbered consistently
> everywhere, or at least on all x86 variants, and we can't on x32
> because they mean something else. Perhaps more importantly, due to
> what is arguably a rather severe bug, issuing a native x86_64 syscall
> (x32 bit clear) with nr in the range 512..547 does *not* return
> -ENOSYS on a kernel with x32 enabled. Instead it does something that
> is somewhat arbitrary. With my patch applied, it will return -ENOSYS,
> but old kernels will still exist, and this will break syscall probing.
>
> Can we perhaps just start the consistent numbers above 547 or maybe
> block out 512..547 in the new regime?

I'm definitely fine with not reusing them ever, and jumping from 511 to
548 when we get there on all architectures, if you think that helps.

While we could also jump to 548 *now*, I think that would be a bit
wasteful. Syscall numbers are fairly cheap, but not entirely free,
especially when you consider architectures like mips that have an
upper bound of 1000 syscalls before they have to get inventive.

Arnd
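
[Editor's illustration, not part of the original message.] The numbering rule agreed on above (keep counting up normally, but go from 511 straight to 548) can be sketched as a small helper. This is only a sketch of the arithmetic: next_common_syscall_nr() and the two X32_NR_* macros are hypothetical names invented here, and nothing like this function exists in the kernel tree, where the numbers are assigned by hand in the syscall tables.

```c
#include <assert.h>

/* Number range that the x32 ABI already occupies on x86 (512..547). */
#define X32_NR_FIRST 512
#define X32_NR_LAST  547

/*
 * Hypothetical helper: given the last assigned common syscall number,
 * return the next one to hand out, never landing inside the x32 range.
 */
static int next_common_syscall_nr(int last)
{
	int next;

	assert(last >= 0);
	next = last + 1;
	if (next >= X32_NR_FIRST && next <= X32_NR_LAST)
		next = X32_NR_LAST + 1;	/* jump from 511 to 548 */
	return next;
}
```

With this rule, 402 is followed by 403 as usual, and only the step after 511 is special. 548 itself is, as stated elsewhere in the thread, "just a normal number".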
Re: [PATCH v2 29/29] y2038: add 64-bit time_t syscalls to all 32-bit architectures
On Fri, Jan 18, 2019 at 11:33 AM Arnd Bergmann wrote:
>
> On Fri, Jan 18, 2019 at 7:50 PM Andy Lutomirski wrote:
> > On Fri, Jan 18, 2019 at 8:25 AM Arnd Bergmann wrote:
> > > - Once we get to 512, we clash with the x32 numbers (unless
> > >   we remove x32 support first), and probably have to skip
> > >   a few more. I also considered using the 512..547 space
> > >   for 32-bit-only calls (which never clash with x32), but
> > >   that also seems to add a bit of complexity.
> >
> > I have a patch that I'll send soon to make x32 use its own table. As
> > far as I'm concerned, 547 is *it*. 548 is just a normal number and is
> > not special. But let's please not reuse 512..547 for other purposes
> > on x86 variants -- that way lies even more confusion, IMO.
>
> Fair enough, the space for those numbers is cheap enough here.
> I take it you mean we also should not reuse that number space if
> we were to decide to remove x32 soon, but you are not worried
> about clashing with arch/alpha when everything else uses consistent
> numbers?
>

I think we have two issues if we reuse those numbers for new syscalls.
First, I'd really like to see new syscalls be numbered consistently
everywhere, or at least on all x86 variants, and we can't on x32
because they mean something else. Perhaps more importantly, due to
what is arguably a rather severe bug, issuing a native x86_64 syscall
(x32 bit clear) with nr in the range 512..547 does *not* return
-ENOSYS on a kernel with x32 enabled. Instead it does something that
is somewhat arbitrary. With my patch applied, it will return -ENOSYS,
but old kernels will still exist, and this will break syscall probing.

Can we perhaps just start the consistent numbers above 547 or maybe
block out 512..547 in the new regime?

--Andy
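
[Editor's illustration, not part of the original message.] "Syscall probing" here means that user space issues a call and treats an ENOSYS error as "this kernel does not have it". A minimal sketch of that pattern follows. syscall_is_present() is a hypothetical name, and passing all-zero arguments is an assumption that is only safe for calls that are harmless with such arguments. On the old x32-enabled kernels described above, a native call in the 512..547 range would not fail with ENOSYS, so a check of this kind would give the wrong answer there.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical probe: returns 0 if the kernel reports ENOSYS for
 * syscall number nr, and 1 for any other outcome (success or a
 * different error). The call is made with all-zero arguments, so only
 * use it for numbers where that is harmless.
 */
static int syscall_is_present(long nr)
{
	long ret;

	assert(nr >= 0);
	errno = 0;
	ret = syscall(nr, 0L, 0L, 0L, 0L, 0L, 0L);
	return !(ret == -1 && errno == ENOSYS);
}
```

For example, syscall_is_present(SYS_getpid) reports 1 on any Linux kernel, since getpid() takes no arguments and always succeeds.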
Re: [PATCH v2 29/29] y2038: add 64-bit time_t syscalls to all 32-bit architectures
On Fri, Jan 18, 2019 at 7:50 PM Andy Lutomirski wrote:
> On Fri, Jan 18, 2019 at 8:25 AM Arnd Bergmann wrote:
> > - Once we get to 512, we clash with the x32 numbers (unless
> >   we remove x32 support first), and probably have to skip
> >   a few more. I also considered using the 512..547 space
> >   for 32-bit-only calls (which never clash with x32), but
> >   that also seems to add a bit of complexity.
>
> I have a patch that I'll send soon to make x32 use its own table. As
> far as I'm concerned, 547 is *it*. 548 is just a normal number and is
> not special. But let's please not reuse 512..547 for other purposes
> on x86 variants -- that way lies even more confusion, IMO.

Fair enough, the space for those numbers is cheap enough here.
I take it you mean we also should not reuse that number space if
we were to decide to remove x32 soon, but you are not worried
about clashing with arch/alpha when everything else uses consistent
numbers?

Arnd
Re: [PATCH v2 13/29] arch: add split IPC system calls where needed
On Fri, Jan 18, 2019 at 6:20 PM Gabriel Paubert wrote:
>
> On Fri, Jan 18, 2019 at 05:18:19PM +0100, Arnd Bergmann wrote:
> > The IPC system call handling is highly inconsistent across architectures,
> > some use sys_ipc, some use separate calls, and some use both. We also
> > have some architectures that require passing IPC_64 in the flags, and
> > others that set it implicitly.
> >
> > For the additon of a y2083 safe semtimedop() system call, I chose to only
>
> It's not critical, but there are two typos in that line:
> additon -> addition
> 2083 -> 2038

Fixed both, thanks!

Arnd
Re: [PATCH v2 29/29] y2038: add 64-bit time_t syscalls to all 32-bit architectures
On Fri, Jan 18, 2019 at 8:25 AM Arnd Bergmann wrote:
>
> This adds 21 new system calls on each ABI that has 32-bit time_t
> today. All of these have the exact same semantics as their existing
> counterparts, and the new ones all have macro names that end in 'time64'
> for clarification.
>
> This gets us to the point of being able to safely use a C library
> that has 64-bit time_t in user space. There are still a couple of
> loose ends to tie up in various areas of the code, but this is the
> big one, and should be entirely uncontroversial at this point.
>
> In particular, there are four system calls (getitimer, setitimer,
> waitid, and getrusage) that don't have a 64-bit counterpart yet,
> but these can all be safely implemented in the C library by wrapping
> around the existing system calls because the 32-bit time_t they
> pass only counts elapsed time, not time since the epoch. They
> will be dealt with later.
>
> Signed-off-by: Arnd Bergmann
> ---
> The one point that still needs to be agreed on is the actual
> number assignment. Following the earlier patch that added
> the sysv IPC calls with common numbers where possible, I also
> tried the same here, using consistent numbers on all 32-bit
> architectures.
>
> There are a couple of minor issues with this:
>
> - On asm-generic, we now leave the numbers from 295 to 402
>   unassigned, which wastes a small amount of kernel .data
>   segment. Originally I had asm-generic start at 300 and
>   everyone else start at 400 here, which was also not
>   perfect, and we have gone beyond 400 already, so I ended
>   up just using the same numbers as the rest here.
>
> - Once we get to 512, we clash with the x32 numbers (unless
>   we remove x32 support first), and probably have to skip
>   a few more. I also considered using the 512..547 space
>   for 32-bit-only calls (which never clash with x32), but
>   that also seems to add a bit of complexity.

I have a patch that I'll send soon to make x32 use its own table. As
far as I'm concerned, 547 is *it*. 548 is just a normal number and is
not special. But let's please not reuse 512..547 for other purposes
on x86 variants -- that way lies even more confusion, IMO.

--Andy
Re: [PATCH v2 00/29] y2038: add time64 syscalls
On Fri, 2019-01-18 at 11:57 -0500, Dennis Clarke wrote:
> On 1/18/19 11:18 AM, Arnd Bergmann wrote:
> > This is a minor update of the patches I posted last week, I
> > would like to add this into linux-next now, but would still do
> > changes if there are concerns about the contents. The first
> > version did not see a lot of replies, which could mean that
> > either everyone is happy with it, or that it was largely ignored.
> >
> > See also the article at https://lwn.net/Articles/776435/.
>
> I would be happy to read "Approaching the kernel year-2038 end game"
> however it is behind a pay wall. Perhaps it may be best to just
> host interesting articles about open source idea elsewhere.

Hey, this is an unfair characterization: lwn.net operates an early
access subscription model, so you can't read the above for 14 days
after publication without paying for an lwn.net subscription, but by
the time these patches are upstream there will be no paywall because it
will expire on 24 January and that URL will then be readable by all.
That makes LWN.net a nice, reliable resource for us while still
supporting some business model to keep it going.

James
Re: [PATCH v2 00/29] y2038: add time64 syscalls
On 1/18/19 12:14 PM, Arnd Bergmann wrote:
> On Fri, Jan 18, 2019 at 5:57 PM Dennis Clarke wrote:
>> On 1/18/19 11:18 AM, Arnd Bergmann wrote:
>>> This is a minor update of the patches I posted last week, I
>>> would like to add this into linux-next now, but would still do
>>> changes if there are concerns about the contents. The first
>>> version did not see a lot of replies, which could mean that
>>> either everyone is happy with it, or that it was largely ignored.
>>>
>>> See also the article at https://lwn.net/Articles/776435/.
>>
>> I would be happy to read "Approaching the kernel year-2038 end game"
>> however it is behind a pay wall. Perhaps it may be best to just
>> host interesting articles about open source idea elsewhere.
>
> It's a short summary of the current state.

Oh, I pay. Also to FSF and other places however I was merely ranting
very very quietly that so much open source is becoming commercialized
in so many ways. Sort of expected really.

Pardon my little rant .. I will go back to hacking OpenSSL 1.1.1a and
trying to get Apache httpd 2.4.38 release running cleanly.

Dennis
Re: [PATCH v2 00/29] y2038: add time64 syscalls
On Fri, Jan 18, 2019 at 5:57 PM Dennis Clarke wrote:
>
> On 1/18/19 11:18 AM, Arnd Bergmann wrote:
> > This is a minor update of the patches I posted last week, I
> > would like to add this into linux-next now, but would still do
> > changes if there are concerns about the contents. The first
> > version did not see a lot of replies, which could mean that
> > either everyone is happy with it, or that it was largely ignored.
> >
> > See also the article at https://lwn.net/Articles/776435/.
>
> I would be happy to read "Approaching the kernel year-2038 end game"
> however it is behind a pay wall. Perhaps it may be best to just
> host interesting articles about open source idea elsewhere.

It's a short summary of the current state. You can also find a
video and slides from my ELC presentation online for a little
more context.

Generally speaking, I'd recommend paying for the subscription to lwn.net
to anyone interested in the kernel, but it should become visible to
everyone with the next day (a week after the initial publication).
In the meantime, you can find the article at
https://lwn.net/SubscriberLink/776435/a59d93d01d1addfc/.

Finally, I've made a list of the remaining work that Deepa and
I are planning to still continue (this should be mostly complete
but may be missing a few things):

syscalls
- merge big series for 5.1, to allow time64 syscalls
- waitid/wait4/getrusage should get a replacement based on __kernel_timespec
- getitimer/setitimer should probably follow getrusage
- vdso, waiting for consolidation series from Vincenzo Frascino before
  adding time64 entry points

file systems
- range checks on timestamps
- xfs
- NFS
- hfs/hfsplus
- coda
- hostfs
- relatime_need_update

drivers
- media
- alsa
- sockets
- af_packet
- ppp ioctl
- rtc ioctl
- omap3isp

core kernel
- fix ELF core files (elfcore.h)
- syscall Audit code (kernel/audit.c, kernel/auditsc.c)
- make all time32 code conditional
- remove include/linux/timekeeping32.h
- remove compat_time* from time32.h
- remove timeval
- remove timespec
- remove time_t

Arnd
Re: [PATCH v2 00/29] y2038: add time64 syscalls
On 1/18/19 11:18 AM, Arnd Bergmann wrote:
> This is a minor update of the patches I posted last week, I
> would like to add this into linux-next now, but would still do
> changes if there are concerns about the contents. The first
> version did not see a lot of replies, which could mean that
> either everyone is happy with it, or that it was largely ignored.
>
> See also the article at https://lwn.net/Articles/776435/.

I would be happy to read "Approaching the kernel year-2038 end game"
however it is behind a pay wall. Perhaps it may be best to just
host interesting articles about open source idea elsewhere.

Dennis Clarke
[PATCH v2 06/29] ARM: add migrate_pages() system call
The migrate_pages system call has an assigned number on all architectures except ARM. When it got added initially in commit d80ade7b3231 ("ARM: Fix warning: #warning syscall migrate_pages not implemented"), it was intentionally left out based on the observation that there are no 32-bit ARM NUMA systems. However, there are now arm64 NUMA machines that can in theory run 32-bit kernels (actually enabling NUMA there would require additional work) as well as 32-bit user space on 64-bit kernels, so that argument is no longer very strong. Assigning the number lets us use the system call on 64-bit kernels as well as providing a more consistent set of syscalls across architectures. Signed-off-by: Arnd Bergmann --- arch/arm/include/asm/unistd.h | 1 - arch/arm/tools/syscall.tbl| 1 + arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 2 ++ 4 files changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h index 88ef2ce1f69a..d713587dfcf4 100644 --- a/arch/arm/include/asm/unistd.h +++ b/arch/arm/include/asm/unistd.h @@ -45,7 +45,6 @@ * Unimplemented (or alternatively implemented) syscalls */ #define __IGNORE_fadvise64_64 -#define __IGNORE_migrate_pages #ifdef __ARM_EABI__ /* diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index 8edf93b4490f..86de9eb34296 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -414,3 +414,4 @@ 397common statx sys_statx 398common rseqsys_rseq 399common io_pgetevents sys_io_pgetevents +400common migrate_pages sys_migrate_pages diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index a7b1fc58ffdf..261216c3336e 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -44,7 +44,7 @@ #define __ARM_NR_compat_set_tls(__ARM_NR_COMPAT_BASE + 5) #define __ARM_NR_COMPAT_END(__ARM_NR_COMPAT_BASE + 0x800) -#define __NR_compat_syscalls 400 +#define __NR_compat_syscalls 401 #endif 
#define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index 04ee190b90fe..f15bcbacb8f6 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -821,6 +821,8 @@ __SYSCALL(__NR_statx, sys_statx) __SYSCALL(__NR_rseq, sys_rseq) #define __NR_io_pgetevents 399 __SYSCALL(__NR_io_pgetevents, compat_sys_io_pgetevents) +#define __NR_migrate_pages 400 +__SYSCALL(__NR_migrate_pages, compat_sys_migrate_pages) /* * Please add new compat syscalls above this comment and update -- 2.20.0
[PATCH v2 28/29] y2038: rename old time and utime syscalls
The time, stime, utime, utimes, and futimesat system calls are only used on older architectures, and we do not provide y2038 safe variants of them, as they are replaced by clock_gettime64, clock_settime64, and utimensat_time64. However, for consistency it seems better to have the 32-bit architectures that still use them call the "time32" entry points (leaving the traditional handlers for the 64-bit architectures), like we do for system calls that now require two versions. Note: We used to always define __ARCH_WANT_SYS_TIME and __ARCH_WANT_SYS_UTIME and only set __ARCH_WANT_COMPAT_SYS_TIME and __ARCH_WANT_SYS_UTIME32 for compat mode on 64-bit kernels. Now this is reversed: only 64-bit architectures set __ARCH_WANT_SYS_TIME/UTIME, while we need __ARCH_WANT_SYS_TIME32/UTIME32 for 32-bit architectures and compat mode. The resulting asm/unistd.h changes look a bit counterintuitive. This is only a cleanup patch and it should not change any behavior. Signed-off-by: Arnd Bergmann --- arch/arm/include/asm/unistd.h | 4 ++-- arch/arm/tools/syscall.tbl | 10 +- arch/m68k/include/asm/unistd.h | 4 ++-- arch/m68k/kernel/syscalls/syscall.tbl | 10 +- arch/microblaze/include/asm/unistd.h| 4 ++-- arch/microblaze/kernel/syscalls/syscall.tbl | 10 +- arch/mips/include/asm/unistd.h | 4 ++-- arch/mips/kernel/syscalls/syscall_o32.tbl | 10 +- arch/parisc/include/asm/unistd.h| 9 ++--- arch/parisc/kernel/syscalls/syscall.tbl | 15 ++- arch/powerpc/include/asm/unistd.h | 8 arch/powerpc/kernel/syscalls/syscall.tbl| 19 ++- arch/s390/include/asm/unistd.h | 2 +- arch/sh/include/asm/unistd.h| 4 ++-- arch/sh/kernel/syscalls/syscall.tbl | 10 +- arch/sparc/include/asm/unistd.h | 8 arch/sparc/kernel/syscalls/syscall.tbl | 14 +- arch/x86/entry/syscalls/syscall_32.tbl | 10 +- arch/x86/include/asm/unistd.h | 8 arch/xtensa/include/asm/unistd.h| 2 +- arch/xtensa/kernel/syscalls/syscall.tbl | 6 +++--- kernel/time/time.c | 4 ++-- 22 files changed, 98 insertions(+), 77 deletions(-) diff --git 
a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h index d713587dfcf4..7a39e77984ef 100644 --- a/arch/arm/include/asm/unistd.h +++ b/arch/arm/include/asm/unistd.h @@ -26,10 +26,10 @@ #define __ARCH_WANT_SYS_SIGPROCMASK #define __ARCH_WANT_SYS_OLD_MMAP #define __ARCH_WANT_SYS_OLD_SELECT -#define __ARCH_WANT_SYS_UTIME +#define __ARCH_WANT_SYS_UTIME32 #if !defined(CONFIG_AEABI) || defined(CONFIG_OABI_COMPAT) -#define __ARCH_WANT_SYS_TIME +#define __ARCH_WANT_SYS_TIME32 #define __ARCH_WANT_SYS_IPC #define __ARCH_WANT_SYS_OLDUMOUNT #define __ARCH_WANT_SYS_ALARM diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index 200f4b878a46..a96d9b5ee04e 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -24,7 +24,7 @@ 10 common unlink sys_unlink 11 common execve sys_execve 12 common chdir sys_chdir -13 oabitimesys_time +13 oabitimesys_time32 14 common mknod sys_mknod 15 common chmod sys_chmod 16 common lchown sys_lchown16 @@ -36,12 +36,12 @@ 22 oabiumount sys_oldumount 23 common setuid sys_setuid16 24 common getuid sys_getuid16 -25 oabistime sys_stime +25 oabistime sys_stime32 26 common ptrace sys_ptrace 27 oabialarm sys_alarm # 28 was sys_fstat 29 common pause sys_pause -30 oabiutime sys_utime +30 oabiutime sys_utime32 # 31 was sys_stty # 32 was sys_gtty 33 common access sys_access @@ -283,7 +283,7 @@ 266common statfs64sys_statfs64_wrapper 267common fstatfs64 sys_fstatfs64_wrapper 268common tgkill sys_tgkill -269common utimes sys_utimes +269common utimes sys_utimes_time32 270common arm_fadvise64_64sys_arm_fadvise64_64 271common pciconfig_iobasesys_pciconfig_iobase 272common pciconfig_read sys_pciconfig_read @@ -340,7 +340,7 @@ 323common mkdirat sys_mkdirat 324common mknodat sys_mknodat 325common fchownatsys_fchownat -326common futimesat sys_futimesat +326common futimesat sys_futimesat_time32 327common fstatat64
[PATCH v2 21/29] sparc64: add custom adjtimex/clock_adjtime functions
sparc64 is the only architecture on Linux that has a 'timeval'
definition with a 32-bit tv_usec but a 64-bit tv_sec. This causes
problems for sparc32 compat mode when we convert it to use the
new __kernel_timex type that has the same layout as all other
64-bit architectures.

To avoid adding sparc64 specific code into the generic adjtimex
implementation, this adds a wrapper in the sparc64 system call
handling that converts the sparc64 'timex' into the new
'__kernel_timex'.

At this point, the two structures are defined to be identical,
but that will change in the next step once we convert sparc32.

Signed-off-by: Arnd Bergmann
---
 arch/sparc/kernel/sys_sparc_64.c       | 59 +-
 arch/sparc/kernel/syscalls/syscall.tbl |  6 ++-
 include/linux/timex.h                  |  2 +
 kernel/time/posix-timers.c             | 24 +--
 4 files changed, 76 insertions(+), 15 deletions(-)

diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 1c079e7bab09..37de18a11207 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -28,8 +28,9 @@
 #include
 #include
 #include
-
+#include
 #include
+
 #include
 #include
 
@@ -544,6 +545,62 @@ SYSCALL_DEFINE2(getdomainname, char __user *, name, int, len)
 	return err;
 }
 
+SYSCALL_DEFINE1(sparc_adjtimex, struct timex __user *, txc_p)
+{
+	struct timex txc;		/* Local copy of parameter */
+	struct timex *kt = (void *)&txc;
+	int ret;
+
+	/* Copy the user data space into the kernel copy
+	 * structure. But bear in mind that the structures
+	 * may change
+	 */
+	if (copy_from_user(&txc, txc_p, sizeof(struct timex)))
+		return -EFAULT;
+
+	/*
+	 * override for sparc64 specific timeval type: tv_usec
+	 * is 32 bit wide instead of 64-bit in __kernel_timex
+	 */
+	kt->time.tv_usec = txc.time.tv_usec;
+	ret = do_adjtimex(kt);
+	txc.time.tv_usec = kt->time.tv_usec;
+
+	return copy_to_user(txc_p, &txc, sizeof(struct timex)) ? -EFAULT : ret;
+}
+
+SYSCALL_DEFINE2(sparc_clock_adjtime, const clockid_t, which_clock,struct timex __user *, txc_p)
+{
+	struct timex txc;		/* Local copy of parameter */
+	struct timex *kt = (void *)&txc;
+	int ret;
+
+	if (!IS_ENABLED(CONFIG_POSIX_TIMERS)) {
+		pr_err_once("process %d (%s) attempted a POSIX timer syscall "
+			    "while CONFIG_POSIX_TIMERS is not set\n",
+			    current->pid, current->comm);
+
+		return -ENOSYS;
+	}
+
+	/* Copy the user data space into the kernel copy
+	 * structure. But bear in mind that the structures
+	 * may change
+	 */
+	if (copy_from_user(&txc, txc_p, sizeof(struct timex)))
+		return -EFAULT;
+
+	/*
+	 * override for sparc64 specific timeval type: tv_usec
+	 * is 32 bit wide instead of 64-bit in __kernel_timex
+	 */
+	kt->time.tv_usec = txc.time.tv_usec;
+	ret = do_clock_adjtime(which_clock, kt);
+	txc.time.tv_usec = kt->time.tv_usec;
+
+	return copy_to_user(txc_p, &txc, sizeof(struct timex)) ? -EFAULT : ret;
+}
+
 SYSCALL_DEFINE5(utrap_install, utrap_entry_t, type,
 		utrap_handler_t, new_p, utrap_handler_t, new_d,
 		utrap_handler_t __user *, old_p,
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 24ebef675184..e70110375399 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -258,7 +258,8 @@
 216  64      sigreturn          sys_nis_syscall
 217  common  clone              sys_clone
 218  common  ioprio_get         sys_ioprio_get
-219  common  adjtimex           sys_adjtimex             compat_sys_adjtimex
+219  32      adjtimex           sys_adjtimex             compat_sys_adjtimex
+219  64      adjtimex           sys_sparc_adjtimex
 220  32      sigprocmask        sys_sigprocmask          compat_sys_sigprocmask
 220  64      sigprocmask        sys_nis_syscall
 221  common  create_module      sys_ni_syscall
@@ -377,7 +378,8 @@
 331  common  prlimit64          sys_prlimit64
 332  common  name_to_handle_at  sys_name_to_handle_at
 333  common  open_by_handle_at  sys_open_by_handle_at    compat_sys_open_by_handle_at
-334  common  clock_adjtime      sys_clock_adjtime        compat_sys_clock_adjtime
+334  32      clock_adjtime      sys_clock_adjtime        compat_sys_clock_adjtime
+334  64      clock_adjtime      sys_sparc_clock_adjtime
 335  common  syncfs             sys_syncfs
 336  common  sendmmsg           sys_sendmmsg             compat_sys_sendmmsg
 337  common  setns              sys_setns
diff --git a/include/linux/timex.h
[PATCH v2 23/29] timex: change syscalls to use struct __kernel_timex
From: Deepa Dinamani

struct timex is not y2038 safe.
Switch all the syscall apis to use y2038 safe __kernel_timex.

Note that sys_adjtimex() does not have a y2038 safe solution.
C libraries can implement it by calling clock_adjtime(CLOCK_REALTIME, ...).

Signed-off-by: Deepa Dinamani
Signed-off-by: Arnd Bergmann
---
 include/linux/syscalls.h   | 6 +++---
 kernel/time/posix-timers.c | 2 +-
 kernel/time/time.c         | 4 +++-
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index baa4b70b02d3..09330d5bda0c 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -54,7 +54,7 @@ struct __sysctl_args;
 struct sysinfo;
 struct timespec;
 struct timeval;
-struct timex;
+struct __kernel_timex;
 struct timezone;
 struct tms;
 struct utimbuf;
@@ -695,7 +695,7 @@ asmlinkage long sys_gettimeofday(struct timeval __user *tv,
 				struct timezone __user *tz);
 asmlinkage long sys_settimeofday(struct timeval __user *tv,
 				struct timezone __user *tz);
-asmlinkage long sys_adjtimex(struct timex __user *txc_p);
+asmlinkage long sys_adjtimex(struct __kernel_timex __user *txc_p);
 
 /* kernel/timer.c */
 asmlinkage long sys_getpid(void);
@@ -870,7 +870,7 @@ asmlinkage long sys_open_by_handle_at(int mountdirfd,
 				      struct file_handle __user *handle,
 				      int flags);
 asmlinkage long sys_clock_adjtime(clockid_t which_clock,
-				struct timex __user *tx);
+				struct __kernel_timex __user *tx);
 asmlinkage long sys_syncfs(int fd);
 asmlinkage long sys_setns(int fd, int nstype);
 asmlinkage long sys_sendmmsg(int fd, struct mmsghdr __user *msg,
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 2d84b3db1ade..de79f85ae14f 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1060,7 +1060,7 @@ int do_clock_adjtime(const clockid_t which_clock, struct __kernel_timex * ktx)
 }
 
 SYSCALL_DEFINE2(clock_adjtime, const clockid_t, which_clock,
-		struct timex __user *, utx)
+		struct __kernel_timex __user *, utx)
 {
 	struct __kernel_timex ktx;
 	int err;
diff --git a/kernel/time/time.c b/kernel/time/time.c
index d179d33f639a..78b5c8f1495a 100644
--- a/kernel/time/time.c
+++ b/kernel/time/time.c
@@ -263,7 +263,8 @@ COMPAT_SYSCALL_DEFINE2(settimeofday, struct old_timeval32 __user *, tv,
 }
 #endif
 
-SYSCALL_DEFINE1(adjtimex, struct timex __user *, txc_p)
+#if !defined(CONFIG_64BIT_TIME) || defined(CONFIG_64BIT)
+SYSCALL_DEFINE1(adjtimex, struct __kernel_timex __user *, txc_p)
 {
 	struct __kernel_timex txc;		/* Local copy of parameter */
 	int ret;
@@ -277,6 +278,7 @@ SYSCALL_DEFINE1(adjtimex, struct timex __user *, txc_p)
 	ret = do_adjtimex(&txc);
 	return copy_to_user(txc_p, &txc, sizeof(struct __kernel_timex)) ? -EFAULT : ret;
 }
+#endif
 
 #ifdef CONFIG_COMPAT_32BIT_TIME
 int get_old_timex32(struct __kernel_timex *txc, const struct old_timex32 __user *utp)
-- 
2.20.0
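
[Editor's illustration, not part of the original patch.] The commit message says a C library can provide adjtimex() on top of clock_adjtime(CLOCK_REALTIME, ...). A user-space sketch of that wrapper is shown below. adjtimex_via_clock_adjtime() is a hypothetical name, and the sketch assumes a libc that declares clock_adjtime() in <sys/timex.h> (glibc does when _GNU_SOURCE is defined). It is not how any particular C library actually implements adjtimex().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <time.h>
#include <sys/timex.h>

/*
 * Hypothetical libc-style wrapper: adjtimex() expressed through
 * clock_adjtime() on CLOCK_REALTIME, as the commit message suggests.
 * Returns the clock state on success, or -1 with errno set.
 */
static int adjtimex_via_clock_adjtime(struct timex *tx)
{
	assert(tx != NULL);
	return clock_adjtime(CLOCK_REALTIME, tx);
}
```

Calling it with a zeroed struct timex (modes == 0) only reads the current clock parameters and needs no privileges, although a sandbox may still reject the call.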
[PATCH v2 02/29] ia64: add statx and io_pgetevents syscalls
All architectures should implement these two, so assign numbers
and hook them up on ia64.

Signed-off-by: Arnd Bergmann
---
 arch/ia64/kernel/syscalls/syscall.tbl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index e97caf51be42..52585281205b 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -335,3 +335,5 @@
 323  common  copy_file_range  sys_copy_file_range
 324  common  preadv2          sys_preadv2
 325  common  pwritev2         sys_pwritev2
+326  common  statx            sys_statx
+327  common  io_pgetevents    sys_io_pgetevents
-- 
2.20.0
[PATCH v2 03/29] ia64: assign syscall numbers for perf and seccomp
Most architectures have assigned numbers for both seccomp and
perf_event_open, even when they do not implement either.

ia64 is an exception here, so for consistency let's add numbers for both
of them. Unless CONFIG_PERF_EVENTS and CONFIG_SECCOMP are implemented,
the system calls just return -ENOSYS.

Signed-off-by: Arnd Bergmann
---
 arch/ia64/kernel/syscalls/syscall.tbl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 52585281205b..2e93dbdcdb80 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -337,3 +337,5 @@
 325  common  pwritev2         sys_pwritev2
 326  common  statx            sys_statx
 327  common  io_pgetevents    sys_io_pgetevents
+328  common  perf_event_open  sys_perf_event_open
+329  common  seccomp          sys_seccomp
-- 
2.20.0
[PATCH v2 29/29] y2038: add 64-bit time_t syscalls to all 32-bit architectures
This adds 21 new system calls on each ABI that has 32-bit time_t today. All of these have the exact same semantics as their existing counterparts, and the new ones all have macro names that end in 'time64' for clarification. This gets us to the point of being able to safely use a C library that has 64-bit time_t in user space. There are still a couple of loose ends to tie up in various areas of the code, but this is the big one, and should be entirely uncontroversial at this point. In particular, there are four system calls (getitimer, setitimer, waitid, and getrusage) that don't have a 64-bit counterpart yet, but these can all be safely implemented in the C library by wrapping around the existing system calls because the 32-bit time_t they pass only counts elapsed time, not time since the epoch. They will be dealt with later. Signed-off-by: Arnd Bergmann --- The one point that still needs to be agreed on is the actual number assignment. Following the earlier patch that added the sysv IPC calls with common numbers where possible, I also tried the same here, using consistent numbers on all 32-bit architectures. There are a couple of minor issues with this: - On asm-generic, we now leave the numbers from 295 to 402 unassigned, which wastes a small amount of kernel .data segment. Originally I had asm-generic start at 300 and everyone else start at 400 here, which was also not perfect, and we have gone beyond 400 already, so I ended up just using the same numbers as the rest here. - Once we get to 512, we clash with the x32 numbers (unless we remove x32 support first), and probably have to skip a few more. I also considered using the 512..547 space for 32-bit-only calls (which never clash with x32), but that also seems to add a bit of complexity. - On alpha, we have already used up the space up to 527 (with a small hole between 261 and 299). We could sync up with that as well, but my feeling was that alpha syscalls are already special enough that I don't care. 
Let me know if you have other ideas. --- arch/alpha/kernel/syscalls/syscall.tbl | 2 + arch/arm/tools/syscall.tbl | 21 ++ arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 41 +++ arch/ia64/kernel/syscalls/syscall.tbl | 1 + arch/m68k/kernel/syscalls/syscall.tbl | 20 + arch/microblaze/kernel/syscalls/syscall.tbl | 21 ++ arch/mips/kernel/syscalls/syscall_n32.tbl | 21 ++ arch/mips/kernel/syscalls/syscall_n64.tbl | 1 + arch/mips/kernel/syscalls/syscall_o32.tbl | 20 + arch/parisc/kernel/syscalls/syscall.tbl | 21 ++ arch/powerpc/kernel/syscalls/syscall.tbl| 20 + arch/s390/kernel/syscalls/syscall.tbl | 20 + arch/sh/kernel/syscalls/syscall.tbl | 20 + arch/sparc/kernel/syscalls/syscall.tbl | 20 + arch/x86/entry/syscalls/syscall_32.tbl | 20 + arch/xtensa/kernel/syscalls/syscall.tbl | 21 ++ include/uapi/asm-generic/unistd.h | 45 - scripts/checksyscalls.sh| 40 ++ 19 files changed, 375 insertions(+), 2 deletions(-) diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl index 337b8108771a..936a33fae3c9 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -461,3 +461,5 @@ 530common getegid sys_getegid 531common geteuid sys_geteuid 532common getppid sys_getppid +# all other architectures have common numbers for new syscall, alpha +# is the exception. 
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index a96d9b5ee04e..286afdc43283 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -416,3 +416,24 @@ 399common io_pgetevents sys_io_pgetevents_time32 400common migrate_pages sys_migrate_pages 401common kexec_file_load sys_kexec_file_load +# 402 is unused +403common clock_gettime64 sys_clock_gettime +404common clock_settime64 sys_clock_settime +405common clock_adjtime64 sys_clock_adjtime +406common clock_getres_time64 sys_clock_getres +407common clock_nanosleep_time64 sys_clock_nanosleep +408common timer_gettime64 sys_timer_gettime +409common timer_settime64 sys_timer_settime +410common timerfd_gettime64 sys_timerfd_gettime +411common timerfd_settime64 sys_timerfd_settime +412common utimensat_time64sys_utimensat +413common pselect6_time64 sys_pselect6 +414common ppoll_time64sys_ppoll +416common io_pgetevents_time64
[PATCH v2 04/29] alpha: wire up io_pgetevents system call
The io_pgetevents system call was added in linux-4.18 but has no entry
for alpha:

warning: #warning syscall io_pgetevents not implemented [-Wcpp]

Assign the next system call number here.

Cc: sta...@vger.kernel.org
Signed-off-by: Arnd Bergmann
---
 arch/alpha/kernel/syscalls/syscall.tbl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 7b56a53be5e3..e09558edae73 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -451,3 +451,4 @@
 520  common  preadv2        sys_preadv2
 521  common  pwritev2       sys_pwritev2
 522  common  statx          sys_statx
+523  common  io_pgetevents  sys_io_pgetevents
-- 
2.20.0
[PATCH v2 07/29] ARM: add kexec_file_load system call number
A couple of architectures including arm64 already implement the
kexec_file_load system call; on many others we have assigned a system
call number for it but not implemented it yet.

Adding the number in arch/arm/ lets us use the system call on arm64
systems in compat mode, and also reduces the number of differences
between architectures. If we want to implement kexec_file_load on ARM
in the future, the number assignment means that kexec tools can already
be built with the now current set of kernel headers.

Signed-off-by: Arnd Bergmann
---
 arch/arm/tools/syscall.tbl        | 1 +
 arch/arm64/include/asm/unistd.h   | 2 +-
 arch/arm64/include/asm/unistd32.h | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 86de9eb34296..20ed7e026723 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -415,3 +415,4 @@
 398  common  rseq             sys_rseq
 399  common  io_pgetevents    sys_io_pgetevents
 400  common  migrate_pages    sys_migrate_pages
+401  common  kexec_file_load  sys_kexec_file_load
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 261216c3336e..2c30e6f145ff 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -44,7 +44,7 @@
 #define __ARM_NR_compat_set_tls  (__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END      (__ARM_NR_COMPAT_BASE + 0x800)

-#define __NR_compat_syscalls     401
+#define __NR_compat_syscalls     402
 #endif

 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index f15bcbacb8f6..8ca1d4c304f4 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -823,6 +823,8 @@ __SYSCALL(__NR_rseq, sys_rseq)
 __SYSCALL(__NR_io_pgetevents, compat_sys_io_pgetevents)
 #define __NR_migrate_pages 400
 __SYSCALL(__NR_migrate_pages, compat_sys_migrate_pages)
+#define __NR_kexec_file_load 401
+__SYSCALL(__NR_kexec_file_load, sys_kexec_file_load)

 /*
  * Please add new compat syscalls above this comment and update
-- 
2.20.0
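The change of __NR_compat_syscalls from 401 to 402 in the patch above is consistent with a table that is indexed by syscall number and therefore needs one more slot than the highest assigned number. A minimal sketch of that relation follows; struct sketch_entry, sketch_compat_tail and sketch_table_size are hypothetical names made up for illustration and are not kernel code.

```c
#include <stddef.h>

/* Hypothetical miniature of a syscall table entry: number plus name. */
struct sketch_entry {
	int nr;
	const char *name;
};

/* The last compat entries after the patch, as listed in the unistd32.h hunk. */
static const struct sketch_entry sketch_compat_tail[] = {
	{ 399, "io_pgetevents" },
	{ 400, "migrate_pages" },
	{ 401, "kexec_file_load" },
};

/* A table indexed by syscall number needs (highest number + 1) slots. */
static int sketch_table_size(const struct sketch_entry *e, size_t n)
{
	int highest = -1;
	size_t i;

	for (i = 0; i < n; i++)
		if (e[i].nr > highest)
			highest = e[i].nr;
	return highest + 1;
}
```

Dropping the last entry from the array gives the old value, which is the state before this patch.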
[PATCH v2 10/29] sh: add statx system call
statx is available on almost all other architectures but got missed
on sh, so add it now.

Signed-off-by: Arnd Bergmann
---
 arch/sh/kernel/syscalls/syscall.tbl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 21ec75288562..a70db013dbc7 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -390,3 +390,4 @@
 380  common  copy_file_range  sys_copy_file_range
 381  common  preadv2          sys_preadv2
 382  common  pwritev2         sys_pwritev2
+383  common  statx            sys_statx
-- 
2.20.0
[PATCH v2 01/29] ia64: add __NR_umount2 definition
Other architectures commonly use __NR_umount2 for sys_umount; only ia64
and alpha use __NR_umount here. In order to synchronize the generated
tables, use umount2 like everyone else, and add back the old name from
asm/unistd.h for compatibility.

The __IGNORE_* lines are now all obsolete and can be removed as a
side-effect.

Signed-off-by: Arnd Bergmann
---
 arch/ia64/include/asm/unistd.h        | 14 --
 arch/ia64/include/uapi/asm/unistd.h   | 2 ++
 arch/ia64/kernel/syscalls/syscall.tbl | 2 +-
 3 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/arch/ia64/include/asm/unistd.h b/arch/ia64/include/asm/unistd.h
index 0b08ebd2dfde..9ba6110b10b9 100644
--- a/arch/ia64/include/asm/unistd.h
+++ b/arch/ia64/include/asm/unistd.h
@@ -12,20 +12,6 @@

 #define NR_syscalls  __NR_syscalls /* length of syscall table */

-/*
- * The following defines stop scripts/checksyscalls.sh from complaining about
- * unimplemented system calls. Glibc provides for each of these by using
- * more modern equivalent system calls.
- */
-#define __IGNORE_fork     /* clone() */
-#define __IGNORE_time     /* gettimeofday() */
-#define __IGNORE_alarm    /* setitimer(ITIMER_REAL, ... */
-#define __IGNORE_pause    /* rt_sigprocmask(), rt_sigsuspend() */
-#define __IGNORE_utime    /* utimes() */
-#define __IGNORE_getpgrp  /* getpgid() */
-#define __IGNORE_vfork    /* clone() */
-#define __IGNORE_umount2  /* umount() */
-
 #define __ARCH_WANT_NEW_STAT
 #define __ARCH_WANT_SYS_UTIME

diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index b2513922dcb5..013e0bcacc39 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -15,6 +15,8 @@

 #define __NR_Linux 1024

+#define __NR_umount __NR_umount2
+
 #include

 #endif /* _UAPI_ASM_IA64_UNISTD_H */
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index b22203b40bfe..e97caf51be42 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -29,7 +29,7 @@
 17  common  getpid   sys_getpid
 18  common  getppid  sys_getppid
 19  common  mount    sys_mount
-20  common  umount   sys_umount
+20  common  umount2  sys_umount
 21  common  setuid   sys_setuid
 22  common  getuid   sys_getuid
 23  common  geteuid  sys_geteuid
-- 
2.20.0
[PATCH v2 08/29] m68k: assign syscall number for seccomp
Most architectures have assigned a number for the seccomp syscall
even when they do not implement it. m68k is an exception here, so for
consistency let's add the number. Unless CONFIG_SECCOMP is implemented,
the system call just returns -ENOSYS.

Signed-off-by: Arnd Bergmann
---
 arch/m68k/kernel/syscalls/syscall.tbl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 1a95c4a1bc0d..85779d6ef935 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -387,3 +387,4 @@
 377  common  preadv2   sys_preadv2
 378  common  pwritev2  sys_pwritev2
 379  common  statx     sys_statx
+380  common  seccomp   sys_seccomp
-- 
2.20.0
[PATCH v2 14/29] arch: add pkey and rseq syscall numbers everywhere
Most architectures define system call numbers for the rseq and pkey system calls, even when they don't support the features, and perhaps never will. Only a few architectures are missing these, so just define them anyway for consistency. If we decide to add them later to one of these, the system call numbers won't get out of sync then. Signed-off-by: Arnd Bergmann --- arch/alpha/include/asm/unistd.h | 4 arch/alpha/kernel/syscalls/syscall.tbl | 4 arch/ia64/kernel/syscalls/syscall.tbl | 4 arch/m68k/kernel/syscalls/syscall.tbl | 4 arch/parisc/include/asm/unistd.h| 3 --- arch/parisc/kernel/syscalls/syscall.tbl | 4 arch/s390/include/asm/unistd.h | 3 --- arch/s390/kernel/syscalls/syscall.tbl | 3 +++ arch/sh/kernel/syscalls/syscall.tbl | 4 arch/sparc/include/asm/unistd.h | 5 - arch/sparc/kernel/syscalls/syscall.tbl | 4 arch/xtensa/kernel/syscalls/syscall.tbl | 1 + 12 files changed, 28 insertions(+), 15 deletions(-) diff --git a/arch/alpha/include/asm/unistd.h b/arch/alpha/include/asm/unistd.h index 564ba87bdc38..31ad350b58a0 100644 --- a/arch/alpha/include/asm/unistd.h +++ b/arch/alpha/include/asm/unistd.h @@ -29,9 +29,5 @@ #define __IGNORE_getppid #define __IGNORE_getuid -/* Alpha doesn't have protection keys. 
*/ -#define __IGNORE_pkey_mprotect -#define __IGNORE_pkey_alloc -#define __IGNORE_pkey_free #endif /* _ALPHA_UNISTD_H */ diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl index b0e247287908..25b4a7e76943 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -452,3 +452,7 @@ 521common pwritev2sys_pwritev2 522common statx sys_statx 523common io_pgetevents sys_io_pgetevents +524common pkey_alloc sys_pkey_alloc +525common pkey_free sys_pkey_free +526common pkey_mprotect sys_pkey_mprotect +527common rseqsys_rseq diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl index 2e93dbdcdb80..84e03de00177 100644 --- a/arch/ia64/kernel/syscalls/syscall.tbl +++ b/arch/ia64/kernel/syscalls/syscall.tbl @@ -339,3 +339,7 @@ 327common io_pgetevents sys_io_pgetevents 328common perf_event_open sys_perf_event_open 329common seccomp sys_seccomp +330common pkey_alloc sys_pkey_alloc +331common pkey_free sys_pkey_free +332common pkey_mprotect sys_pkey_mprotect +333common rseqsys_rseq diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl index 5354ba02eed2..ae88b85d068e 100644 --- a/arch/m68k/kernel/syscalls/syscall.tbl +++ b/arch/m68k/kernel/syscalls/syscall.tbl @@ -388,6 +388,10 @@ 378common pwritev2sys_pwritev2 379common statx sys_statx 380common seccomp sys_seccomp +381common pkey_alloc sys_pkey_alloc +382common pkey_free sys_pkey_free +383common pkey_mprotect sys_pkey_mprotect +384common rseqsys_rseq # room for arch specific calls 393common semget sys_semget 394common semctl sys_semctl diff --git a/arch/parisc/include/asm/unistd.h b/arch/parisc/include/asm/unistd.h index c2c2afb28941..9ec1026af877 100644 --- a/arch/parisc/include/asm/unistd.h +++ b/arch/parisc/include/asm/unistd.h @@ -12,9 +12,6 @@ #define __IGNORE_select/* newselect */ #define __IGNORE_fadvise64 /* fadvise64_64 */ -#define __IGNORE_pkey_mprotect -#define 
__IGNORE_pkey_alloc -#define __IGNORE_pkey_free #ifndef ASM_LINE_SEP # define ASM_LINE_SEP ; diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl index 9bbd2f9f56c8..e07231de3597 100644 --- a/arch/parisc/kernel/syscalls/syscall.tbl +++ b/arch/parisc/kernel/syscalls/syscall.tbl @@ -367,3 +367,7 @@ 348common pwritev2sys_pwritev2 compat_sys_pwritev2 349common statx sys_statx 350common io_pgetevents sys_io_pgetevents compat_sys_io_pgetevents +351common pkey_alloc sys_pkey_alloc +352common pkey_free sys_pkey_free +353common pkey_mprotect sys_pkey_mprotect +354common rseqsys_rseq diff --git a/arch/s390/include/asm/unistd.h b/arch/s390/include/asm/unistd.h index a1fbf15d53aa..ed08f114ee91 100644 --- a/arch/s390/include/asm/unistd.h +++
Re: [PATCH v4 1/3] fs: hoist EFSCORRUPTED definition into uapi header
On Fri, Jan 18, 2019 at 5:15 PM Jann Horn wrote:
>
> Multiple filesystems can already return EFSCORRUPTED errors to userspace;
> however, so far, definitions of EFSCORRUPTED were in filesystem-private
> headers.
>
> I wanted to use EUCLEAN to indicate data corruption in the VFS layer;
> Dave Chinner says that I should instead hoist the definitions of
> EFSCORRUPTED into the UAPI header and then use EFSCORRUPTED.
>
> This patch is marked for stable backport because it is a prerequisite for
> the following patch.
>
> Cc: sta...@vger.kernel.org
> Suggested-by: Dave Chinner
> Signed-off-by: Jann Horn
> ---
>  fs/ext2/ext2.h                   | 1 -
>  fs/ext4/ext4.h                   | 1 -
>  fs/xfs/xfs_linux.h               | 1 -
>  include/linux/jbd2.h             | 1 -
>  include/uapi/asm-generic/errno.h | 1 +
>  5 files changed, 1 insertion(+), 4 deletions(-)

For asm-generic:

Acked-by: Arnd Bergmann
[PATCH v2 13/29] arch: add split IPC system calls where needed
The IPC system call handling is highly inconsistent across architectures:
some use sys_ipc, some use separate calls, and some use both. We also
have some architectures that require passing IPC_64 in the flags, and
others that set it implicitly.

For the addition of a y2038 safe semtimedop() system call, I chose to only
support the separate entry points, but that requires first supporting
the regular ones with their own syscall numbers.

The IPC_64 is now implied by the new semctl/shmctl/msgctl system
calls even on the architectures that require passing it with the ipc()
multiplexer.

I'm not adding the new semtimedop() or semop() on 32-bit architectures;
those will get implemented using the new semtimedop_time64() version
that gets added along with the other time64 calls.
Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop().

Signed-off-by: Arnd Bergmann
---
One aspect here that might be a bit controversial is the use of
the same system call numbers across all architectures, synchronizing
all of them with the x86-32 numbers. With the new syscall.tbl
files, I hope we can just keep doing that in the future, and no
longer require the architecture maintainers to assign a number.

This is mainly useful for implementers of the C libraries: if
we can add future system calls everywhere at the same time, using
a particular version of the kernel headers also guarantees that
the system call number macro is visible.
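To make the difference between the ipc() multiplexer and the separate entry points concrete, here is a small sketch. It assumes the multiplexer call numbers and the IPC_64 value found in include/uapi/linux/ipc.h (copied here under SKETCH_ names); the table and both helper functions are hypothetical illustrations, not kernel or libc interfaces.

```c
#include <stddef.h>
#include <string.h>

/* Local copy of the IPC_64 flag value from include/uapi/linux/ipc.h. */
#define SKETCH_IPC_64 0x0100

/* Maps the name of a separate system call to the ipc() "call" argument. */
struct sketch_ipc_route {
	const char *name;
	int mux_call;
};

/* Only the ten calls that the patch wires up as separate entry points. */
static const struct sketch_ipc_route sketch_ipc_routes[] = {
	{ "semget", 2 },  { "semctl", 3 },
	{ "msgsnd", 11 }, { "msgrcv", 12 }, { "msgget", 13 }, { "msgctl", 14 },
	{ "shmat", 21 },  { "shmdt", 22 },  { "shmget", 23 }, { "shmctl", 24 },
};

/* Returns the multiplexer call number for a split syscall name, or -1. */
static int sketch_ipc_mux_call(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_ipc_routes) / sizeof(sketch_ipc_routes[0]); i++)
		if (strcmp(sketch_ipc_routes[i].name, name) == 0)
			return sketch_ipc_routes[i].mux_call;
	return -1;
}

/*
 * cmd value for semctl/shmctl/msgctl: on an architecture that requires the
 * flag with ipc(), the caller ORs in IPC_64; the new split calls imply it,
 * so the plain command is passed.
 */
static int sketch_ctl_cmd(int cmd, int via_multiplexer)
{
	return via_multiplexer ? (cmd | SKETCH_IPC_64) : cmd;
}
```

With this split, a C library only has to special-case the flag on the multiplexer path, which matches the intent described above.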
--- arch/m68k/kernel/syscalls/syscall.tbl | 11 +++ arch/mips/kernel/syscalls/syscall_o32.tbl | 11 +++ arch/powerpc/kernel/syscalls/syscall.tbl | 13 + arch/s390/kernel/syscalls/syscall.tbl | 12 arch/sh/kernel/syscalls/syscall.tbl | 11 +++ arch/sparc/kernel/syscalls/syscall.tbl| 12 arch/x86/entry/syscalls/syscall_32.tbl| 11 +++ 7 files changed, 81 insertions(+) diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl index 85779d6ef935..5354ba02eed2 100644 --- a/arch/m68k/kernel/syscalls/syscall.tbl +++ b/arch/m68k/kernel/syscalls/syscall.tbl @@ -388,3 +388,14 @@ 378common pwritev2sys_pwritev2 379common statx sys_statx 380common seccomp sys_seccomp +# room for arch specific calls +393common semget sys_semget +394common semctl sys_semctl +395common shmget sys_shmget +396common shmctl sys_shmctl +397common shmat sys_shmat +398common shmdt sys_shmdt +399common msgget sys_msgget +400common msgsnd sys_msgsnd +401common msgrcv sys_msgrcv +402common msgctl sys_msgctl diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl index 3d5a47b80d2b..fa47ea8cc6ef 100644 --- a/arch/mips/kernel/syscalls/syscall_o32.tbl +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -380,3 +380,14 @@ 366o32 statx sys_statx 367o32 rseqsys_rseq 368o32 io_pgetevents sys_io_pgetevents compat_sys_io_pgetevents +# room for arch specific calls +393o32 semget sys_semget +394o32 semctl sys_semctl compat_sys_semctl +395o32 shmget sys_shmget +396o32 shmctl sys_shmctl compat_sys_shmctl +397o32 shmat sys_shmat compat_sys_shmat +398o32 shmdt sys_shmdt +399o32 msgget sys_msgget +400o32 msgsnd sys_msgsnd compat_sys_msgsnd +401o32 msgrcv sys_msgrcv compat_sys_msgrcv +402o32 msgctl sys_msgctl compat_sys_msgctl diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl index db3bbb8744af..7555874ce39c 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ 
-414,6 +414,7 @@ 363spu switch_endian sys_ni_syscall 364common userfaultfd sys_userfaultfd 365common membarrier sys_membarrier +# 366-377 originally left for IPC, now unused 378nospu mlock2 sys_mlock2 379nospu copy_file_range sys_copy_file_range 380common preadv2 sys_preadv2
[PATCH v2 15/29] alpha: add standard statfs64/fstatfs64 syscalls
As Joseph Myers points out, alpha has never had a standard statfs64
interface and instead returns only 32-bit numbers here.

While there is an old osf_statfs64 system call that returns additional
data, this has some other quirks and does not get used in glibc.

I considered making the stat64 structure layout compatible with the
one used by the kernel on most other 64 bit architectures that implement
it (ia64, parisc, powerpc, and sparc), but in the end decided to stay
with the one that was traditionally defined in the alpha headers but
not used, since this is also what glibc exposes to user space.

Signed-off-by: Arnd Bergmann
---
 arch/alpha/kernel/syscalls/syscall.tbl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 25b4a7e76943..0ebd59fdcb8b 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -456,3 +456,5 @@
 525  common  pkey_free      sys_pkey_free
 526  common  pkey_mprotect  sys_pkey_mprotect
 527  common  rseq           sys_rseq
+528  common  statfs64       sys_statfs64
+529  common  fstatfs64      sys_fstatfs64
-- 
2.20.0
Re: [PATCH v3 1/2] fs: don't let getdents return bogus names
On Tue, Jan 15, 2019 at 1:01 AM Dave Chinner wrote:
> On Mon, Jan 14, 2019 at 07:23:17PM +0100, Jann Horn wrote:
> > When you e.g. run `find` on a directory for which getdents returns
> > "filenames" that contain slashes, `find` passes those "filenames" back to
> > the kernel, which then interprets them as paths. That could conceivably
> > cause userspace to do something bad when accessing something like an
> > untrusted USB stick, but I'm not aware of any specific example.
> >
> > Instead of returning bogus filenames to userspace, return -EUCLEAN.
>
> Please don't use EUCLEAN directly to indicate filesystem corruption
> directly. If we want to indicate that the filesystem is corrupted,
> please hoist the multiple XFS/ext4 definitions of:
>
> #define EFSCORRUPTED EUCLEAN
>
> up into include/uapi/asm-generic/errno.h and then use EFSCORRUPTED

Alright, I've added a patch that moves EFSCORRUPTED into the uapi
header in front of my series and changed the following patches to use
EFSCORRUPTED instead of EUCLEAN; see the v4 version I just sent out.
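For readers following along, the kind of check being discussed can be sketched in user space roughly as below. This is only an illustration under stated assumptions: sketch_check_dirent_name is a hypothetical helper, not the function from the patch series, it assumes a Linux errno.h that provides EUCLEAN, and it only looks for slashes, which is the case described in the thread.

```c
#include <errno.h>
#include <string.h>

/* The definition quoted in the thread, used if the headers lack it. */
#ifndef EFSCORRUPTED
#define EFSCORRUPTED EUCLEAN
#endif

/*
 * Hypothetical helper: reject a directory entry name that contains a
 * slash, since such a name would later be interpreted as a path.
 * Returns 0 if the name looks sane, -EFSCORRUPTED otherwise.
 */
static int sketch_check_dirent_name(const char *name, size_t len)
{
	if (memchr(name, '/', len))
		return -EFSCORRUPTED;
	return 0;
}
```

A caller would run this on each entry before copying it out and stop with the error instead of handing the bogus name to userspace.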
[PATCH v2 09/29] sh: remove duplicate unistd_32.h file
When I merged this patch, the file was accidentally left intact instead of being removed, which means any changes to syscall.tbl have no effect. Fixes: 2b3c5a99d5f3 ("sh: generate uapi header and syscall table header files") Signed-off-by: Arnd Bergmann --- arch/sh/include/uapi/asm/unistd_32.h | 403 --- 1 file changed, 403 deletions(-) delete mode 100644 arch/sh/include/uapi/asm/unistd_32.h diff --git a/arch/sh/include/uapi/asm/unistd_32.h b/arch/sh/include/uapi/asm/unistd_32.h deleted file mode 100644 index 31c85aa251ab.. --- a/arch/sh/include/uapi/asm/unistd_32.h +++ /dev/null @@ -1,403 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ -#ifndef __ASM_SH_UNISTD_32_H -#define __ASM_SH_UNISTD_32_H - -/* - * Copyright (C) 1999 Niibe Yutaka - */ - -/* - * This file contains the system call numbers. - */ - -#define __NR_restart_syscall 0 -#define __NR_exit1 -#define __NR_fork2 -#define __NR_read3 -#define __NR_write 4 -#define __NR_open5 -#define __NR_close 6 -#define __NR_waitpid 7 -#define __NR_creat 8 -#define __NR_link9 -#define __NR_unlink 10 -#define __NR_execve 11 -#define __NR_chdir 12 -#define __NR_time 13 -#define __NR_mknod 14 -#define __NR_chmod 15 -#define __NR_lchown 16 -/* 17 was sys_break */ -#define __NR_oldstat18 -#define __NR_lseek 19 -#define __NR_getpid 20 -#define __NR_mount 21 -#define __NR_umount 22 -#define __NR_setuid 23 -#define __NR_getuid 24 -#define __NR_stime 25 -#define __NR_ptrace 26 -#define __NR_alarm 27 -#define __NR_oldfstat 28 -#define __NR_pause 29 -#define __NR_utime 30 -/* 31 was sys_stty */ -/* 32 was sys_gtty */ -#define __NR_access 33 -#define __NR_nice 34 -/* 35 was sys_ftime */ -#define __NR_sync 36 -#define __NR_kill 37 -#define __NR_rename 38 -#define __NR_mkdir 39 -#define __NR_rmdir 40 -#define __NR_dup41 -#define __NR_pipe 42 -#define __NR_times 43 -/* 44 was sys_prof */ -#define __NR_brk45 -#define __NR_setgid 46 -#define __NR_getgid 47 -#define __NR_signal 48 -#define __NR_geteuid49 -#define 
__NR_getegid50 -#define __NR_acct 51 -#define __NR_umount252 -/* 53 was sys_lock */ -#define __NR_ioctl 54 -#define __NR_fcntl 55 -/* 56 was sys_mpx */ -#define __NR_setpgid57 -/* 58 was sys_ulimit */ -/* 59 was sys_olduname */ -#define __NR_umask 60 -#define __NR_chroot 61 -#define __NR_ustat 62 -#define __NR_dup2 63 -#define __NR_getppid64 -#define __NR_getpgrp65 -#define __NR_setsid 66 -#define __NR_sigaction 67 -#define __NR_sgetmask 68 -#define __NR_ssetmask 69 -#define __NR_setreuid 70 -#define __NR_setregid 71 -#define __NR_sigsuspend 72 -#define __NR_sigpending 73 -#define __NR_sethostname74 -#define __NR_setrlimit 75 -#define __NR_getrlimit 76 /* Back compatible 2Gig limited rlimit */ -#define __NR_getrusage 77 -#define __NR_gettimeofday 78 -#define __NR_settimeofday 79 -#define __NR_getgroups 80 -#define __NR_setgroups 81 -/* 82 was sys_oldselect */ -#define __NR_symlink83 -#define __NR_oldlstat 84 -#define __NR_readlink 85 -#define __NR_uselib 86 -#define __NR_swapon 87 -#define __NR_reboot 88 -#define __NR_readdir89 -#define __NR_mmap 90 -#define __NR_munmap 91 -#define __NR_truncate 92 -#define __NR_ftruncate 93 -#define __NR_fchmod 94 -#define __NR_fchown 95 -#define __NR_getpriority96 -#define __NR_setpriority97 -/* 98 was sys_profil */ -#define __NR_statfs 99 -#define __NR_fstatfs 100 - /* 101 was sys_ioperm */ -#define __NR_socketcall102 -#define __NR_syslog103 -#define __NR_setitimer 104 -#define __NR_getitimer
[PATCH v2 18/29] time: make adjtime compat handling available for 32 bit
We want to reuse the compat_timex handling on 32-bit architectures the same way we are using the compat handling for timespec when moving to 64-bit time_t. Move all definitions related to compat_timex out of the compat code into the normal timekeeping code, along with a rename to old_timex32, corresponding to the timespec/timeval structures, and make it controlled by CONFIG_COMPAT_32BIT_TIME, which 32-bit architectures will then select. Signed-off-by: Arnd Bergmann --- include/linux/compat.h | 35 ++- include/linux/time32.h | 32 - kernel/compat.c| 64 -- kernel/time/posix-timers.c | 14 ++-- kernel/time/time.c | 70 +++--- 5 files changed, 102 insertions(+), 113 deletions(-) diff --git a/include/linux/compat.h b/include/linux/compat.h index 056be0d03722..657ca6abd855 100644 --- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -132,37 +132,6 @@ struct compat_tms { compat_clock_t tms_cstime; }; -struct compat_timex { - compat_uint_t modes; - compat_long_t offset; - compat_long_t freq; - compat_long_t maxerror; - compat_long_t esterror; - compat_int_t status; - compat_long_t constant; - compat_long_t precision; - compat_long_t tolerance; - struct old_timeval32 time; - compat_long_t tick; - compat_long_t ppsfreq; - compat_long_t jitter; - compat_int_t shift; - compat_long_t stabil; - compat_long_t jitcnt; - compat_long_t calcnt; - compat_long_t errcnt; - compat_long_t stbcnt; - compat_int_t tai; - - compat_int_t:32; compat_int_t:32; compat_int_t:32; compat_int_t:32; - compat_int_t:32; compat_int_t:32; compat_int_t:32; compat_int_t:32; - compat_int_t:32; compat_int_t:32; compat_int_t:32; -}; - -struct timex; -int compat_get_timex(struct timex *, const struct compat_timex __user *); -int compat_put_timex(struct compat_timex __user *, const struct timex *); - #define _COMPAT_NSIG_WORDS (_COMPAT_NSIG / _COMPAT_NSIG_BPW) typedef struct { @@ -808,7 +777,7 @@ asmlinkage long compat_sys_gettimeofday(struct old_timeval32 __user *tv, struct timezone __user *tz); asmlinkage 
long compat_sys_settimeofday(struct old_timeval32 __user *tv, struct timezone __user *tz); -asmlinkage long compat_sys_adjtimex(struct compat_timex __user *utp); +asmlinkage long compat_sys_adjtimex(struct old_timex32 __user *utp); /* kernel/timer.c */ asmlinkage long compat_sys_sysinfo(struct compat_sysinfo __user *info); @@ -911,7 +880,7 @@ asmlinkage long compat_sys_open_by_handle_at(int mountdirfd, struct file_handle __user *handle, int flags); asmlinkage long compat_sys_clock_adjtime(clockid_t which_clock, -struct compat_timex __user *tp); +struct old_timex32 __user *tp); asmlinkage long compat_sys_sendmmsg(int fd, struct compat_mmsghdr __user *mmsg, unsigned vlen, unsigned int flags); asmlinkage ssize_t compat_sys_process_vm_readv(compat_pid_t pid, diff --git a/include/linux/time32.h b/include/linux/time32.h index 118b9977080c..820a22e2b98b 100644 --- a/include/linux/time32.h +++ b/include/linux/time32.h @@ -10,6 +10,7 @@ */ #include +#include #define TIME_T_MAX (time_t)((1UL << ((sizeof(time_t) << 3) - 1)) - 1) @@ -35,13 +36,42 @@ struct old_utimbuf32 { old_time32_tmodtime; }; +struct old_timex32 { + u32 modes; + s32 offset; + s32 freq; + s32 maxerror; + s32 esterror; + s32 status; + s32 constant; + s32 precision; + s32 tolerance; + struct old_timeval32 time; + s32 tick; + s32 ppsfreq; + s32 jitter; + s32 shift; + s32 stabil; + s32 jitcnt; + s32 calcnt; + s32 errcnt; + s32 stbcnt; + s32 tai; + + s32:32; s32:32; s32:32; s32:32; + s32:32; s32:32; s32:32; s32:32; + s32:32; s32:32; s32:32; +}; + extern int get_old_timespec32(struct timespec64 *, const void __user *); extern int put_old_timespec32(const struct timespec64 *, void __user *); extern int get_old_itimerspec32(struct itimerspec64 *its, const struct old_itimerspec32 __user *uits); extern int put_old_itimerspec32(const struct itimerspec64 *its, struct old_itimerspec32 __user *uits); - +struct timex; +int get_old_timex32(struct timex *, const struct old_timex32 __user *); +int put_old_timex32(struct 
old_timex32 __user *, const struct timex *); #if __BITS_PER_LONG == 64 diff --git a/kernel/compat.c b/kernel/compat.c index f01affa17e22..d8a36c6ad7c9 100644 --- a/kernel/compat.c +++ b/kernel/compat.c @@ -20,7 +20,6 @@ #include
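The idea behind old_timex32 is that every field has a fixed 32-bit width, so the kernel has to widen the values on the way in. The following is a reduced sketch of that step for the embedded time value only; the sketch_ structs and helpers are hypothetical stand-ins, and whether the real get_old_timex32()/put_old_timex32() helpers perform any range check is not visible in the quoted diff.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the 32-bit and 64-bit time value layouts. */
struct sketch_timeval32 {
	int32_t tv_sec;
	int32_t tv_usec;
};

struct sketch_timeval64 {
	int64_t tv_sec;
	int64_t tv_usec;
};

/* Widening is lossless: the signed 32-bit values are sign-extended. */
static void sketch_widen_timeval(struct sketch_timeval64 *dst,
				 const struct sketch_timeval32 *src)
{
	dst->tv_sec = src->tv_sec;
	dst->tv_usec = src->tv_usec;
}

/* Returns 1 if a 64-bit seconds value still fits the old 32-bit field. */
static int sketch_fits_time32(int64_t sec)
{
	return sec >= INT32_MIN && sec <= INT32_MAX;
}
```

The second helper marks the y2038 boundary: 2147483647 seconds is the last value the old layout can carry.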
[PATCH v2 17/29] syscalls: remove obsolete __IGNORE_ macros
These are all for ignoring the lack of obsolete system calls,
which have been marked the same way in scripts/checksyscalls.sh,
so these can be removed.

Signed-off-by: Arnd Bergmann
---
 arch/mips/include/asm/unistd.h   | 16
 arch/parisc/include/asm/unistd.h | 3 ---
 arch/s390/include/asm/unistd.h   | 2 --
 arch/xtensa/include/asm/unistd.h | 12
 4 files changed, 33 deletions(-)

diff --git a/arch/mips/include/asm/unistd.h b/arch/mips/include/asm/unistd.h
index b23d74a601b3..5e9eeb83d8d4 100644
--- a/arch/mips/include/asm/unistd.h
+++ b/arch/mips/include/asm/unistd.h
@@ -53,22 +53,6 @@
 #define __ARCH_WANT_SYS_FORK
 #define __ARCH_WANT_SYS_CLONE

-/* whitelists for checksyscalls */
-#define __IGNORE_select
-#define __IGNORE_vfork
-#define __IGNORE_time
-#define __IGNORE_uselib
-#define __IGNORE_fadvise64_64
-#define __IGNORE_getdents64
-#if _MIPS_SIM == _MIPS_SIM_NABI32
-#define __IGNORE_truncate64
-#define __IGNORE_ftruncate64
-#define __IGNORE_stat64
-#define __IGNORE_lstat64
-#define __IGNORE_fstat64
-#define __IGNORE_fstatat64
-#endif
-
 #endif /* !__ASSEMBLY__ */

 #endif /* _ASM_UNISTD_H */
diff --git a/arch/parisc/include/asm/unistd.h b/arch/parisc/include/asm/unistd.h
index 9ec1026af877..385eae49ed02 100644
--- a/arch/parisc/include/asm/unistd.h
+++ b/arch/parisc/include/asm/unistd.h
@@ -10,9 +10,6 @@

 #define SYS_ify(syscall_name) __NR_##syscall_name

-#define __IGNORE_select     /* newselect */
-#define __IGNORE_fadvise64  /* fadvise64_64 */
-
 #ifndef ASM_LINE_SEP
 # define ASM_LINE_SEP ;
 #endif
diff --git a/arch/s390/include/asm/unistd.h b/arch/s390/include/asm/unistd.h
index ed08f114ee91..59202ceea1f6 100644
--- a/arch/s390/include/asm/unistd.h
+++ b/arch/s390/include/asm/unistd.h
@@ -10,8 +10,6 @@
 #include
 #include

-#define __IGNORE_time
-
 #define __ARCH_WANT_NEW_STAT
 #define __ARCH_WANT_OLD_READDIR
 #define __ARCH_WANT_SYS_ALARM
diff --git a/arch/xtensa/include/asm/unistd.h b/arch/xtensa/include/asm/unistd.h
index 0d34629dafc5..81cc52ea1bd5 100644
--- a/arch/xtensa/include/asm/unistd.h
+++ b/arch/xtensa/include/asm/unistd.h
@@ -10,18 +10,6 @@
 #define __ARCH_WANT_SYS_UTIME
 #define __ARCH_WANT_SYS_GETPGRP

-/*
- * Ignore legacy system calls in the checksyscalls.sh script
- */
-
-#define __IGNORE_fork       /* use clone */
-#define __IGNORE_time
-#define __IGNORE_alarm      /* use setitimer */
-#define __IGNORE_pause
-#define __IGNORE_mmap       /* use mmap2 */
-#define __IGNORE_vfork      /* use clone */
-#define __IGNORE_fadvise64  /* use fadvise64_64 */
-
 #define NR_syscalls __NR_syscalls

 #endif /* _XTENSA_UNISTD_H */
-- 
2.20.0
[PATCH v2 26/29] y2038: use time32 syscall names on 32-bit
This is the big flip, where all 32-bit architectures set COMPAT_32BIT_TIME
and use the _time32 system calls from the former compat layer instead
of the system calls that take __kernel_timespec and similar arguments.

The temporary redirects for __kernel_timespec, __kernel_itimerspec
and __kernel_timex can get removed with this.

It would be easy to split this commit by architecture, but with the new
generated system call tables, it's easy enough to do it all at once,
which makes it a little easier to check that the changes are the same
in each table.

Signed-off-by: Arnd Bergmann
---
 arch/Kconfig                                | 2 +-
 arch/arm/kernel/sys_oabi-compat.c           | 8 +-
 arch/arm/tools/syscall.tbl                  | 46 ++--
 arch/m68k/kernel/syscalls/syscall.tbl       | 42 +--
 arch/microblaze/kernel/syscalls/syscall.tbl | 46 ++--
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 44 +--
 arch/parisc/kernel/syscalls/syscall.tbl     | 69 +++--
 arch/powerpc/kernel/syscalls/syscall.tbl    | 82 +++--
 arch/sh/kernel/syscalls/syscall.tbl         | 42 +--
 arch/sparc/kernel/syscalls/syscall.tbl      | 64 ++--
 arch/x86/entry/syscalls/syscall_32.tbl      | 44 +--
 arch/xtensa/kernel/syscalls/syscall.tbl     | 44 +--
 include/uapi/asm-generic/unistd.h           | 56 +++---
 13 files changed, 335 insertions(+), 254 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 4cfb6de48f79..46db715a7f42 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -759,7 +759,7 @@ config 64BIT_TIME
 	  handling.

 config COMPAT_32BIT_TIME
-	def_bool (!64BIT && 64BIT_TIME) || COMPAT
+	def_bool !64BIT || COMPAT
 	help
 	  This enables 32 bit time_t support in addition to 64 bit time_t support.
This is relevant on all 32-bit architectures, and 64-bit architectures diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c index 92ab36f38795..acd054a42ba2 100644 --- a/arch/arm/kernel/sys_oabi-compat.c +++ b/arch/arm/kernel/sys_oabi-compat.c @@ -317,10 +317,10 @@ struct oabi_sembuf { asmlinkage long sys_oabi_semtimedop(int semid, struct oabi_sembuf __user *tsops, unsigned nsops, - const struct timespec __user *timeout) + const struct old_timespec32 __user *timeout) { struct sembuf *sops; - struct timespec local_timeout; + struct old_timespec32 local_timeout; long err; int i; @@ -350,7 +350,7 @@ asmlinkage long sys_oabi_semtimedop(int semid, } else { mm_segment_t fs = get_fs(); set_fs(KERNEL_DS); - err = sys_semtimedop(semid, sops, nsops, timeout); + err = sys_semtimedop_time32(semid, sops, nsops, timeout); set_fs(fs); } kfree(sops); @@ -375,7 +375,7 @@ asmlinkage int sys_oabi_ipc(uint call, int first, int second, int third, return sys_oabi_semtimedop(first, (struct oabi_sembuf __user *)ptr, second, - (const struct timespec __user *)fifth); + (const struct old_timespec32 __user *)fifth); default: return sys_ipc(call, first, second, third, ptr, fifth); } diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index b54b7f2bc24a..200f4b878a46 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -137,7 +137,7 @@ 121common setdomainname sys_setdomainname 122common uname sys_newuname # 123 was sys_modify_ldt -124common adjtimexsys_adjtimex +124common adjtimexsys_adjtimex_time32 125common mprotectsys_mprotect 126common sigprocmask sys_sigprocmask # 127 was sys_create_module @@ -174,8 +174,8 @@ 158common sched_yield sys_sched_yield 159common sched_get_priority_max sys_sched_get_priority_max 160common sched_get_priority_min sys_sched_get_priority_min -161common sched_rr_get_interval sys_sched_rr_get_interval -162common nanosleep sys_nanosleep +161common sched_rr_get_interval sys_sched_rr_get_interval_time32 
+162common nanosleep sys_nanosleep_time32 163common mremap sys_mremap 164common setresuid sys_setresuid16 165common getresuid sys_getresuid16 @@ -190,7 +190,7 @@ 174common rt_sigactionsys_rt_sigaction 175common rt_sigprocmask sys_rt_sigprocmask 176common rt_sigpending sys_rt_sigpending -177common
[PATCH v2 19/29] time: Add struct __kernel_timex
From: Deepa Dinamani

struct timex uses struct timeval internally.
struct timeval is not y2038 safe.
Introduce a new UAPI type struct __kernel_timex
that is y2038 safe.

struct __kernel_timex uses a timeval type that is
similar to struct __kernel_timespec which preserves the
same structure size across 32 bit and 64 bit ABIs.
struct __kernel_timex also restructures other members of the
structure to make the structure the same on 64 bit and 32 bit
architectures.
Note that struct __kernel_timex is the same as struct timex on a
64 bit architecture.

The above solution is similar to other new y2038 syscalls
that are being introduced: both 32 bit and 64 bit ABIs
have a common entry, and the compat entry supports the old 32 bit
syscall interface.

Alternatives considered were:
1. Add new time type to struct timex that makes use of padded
   bits. This time type could be based on the struct
   __kernel_timespec.
   modes will use a flag to notify which time structure should be
   used internally.
   This needs some application level changes on both 64 bit and
   32 bit architectures. Although 64 bit machines could continue
   to use the older timeval structure without any changes.

2. Add a new u8 type to struct timex that makes use of padded
   bits. This can be used to save higher order tv_sec bits.
   modes will use a flag to notify presence of such a type.
   This will need some application level changes on 32 bit
   architectures.

3. Add a new compat_timex structure that differs in only the
   size of the time type; keep rest of struct timex the same.
   This requires extra syscalls to manage all 3 cases on 64 bit
   architectures. This will not need any application level
   changes but will add more complexity from kernel side.

Signed-off-by: Deepa Dinamani
---
 include/linux/timex.h      |  7 +++
 include/uapi/linux/timex.h | 41 ++
 2 files changed, 48 insertions(+)

diff --git a/include/linux/timex.h b/include/linux/timex.h
index 39c25dbebfe8..7f40e9e42ecc 100644
--- a/include/linux/timex.h
+++ b/include/linux/timex.h
@@ -53,6 +53,13 @@
 #ifndef _LINUX_TIMEX_H
 #define _LINUX_TIMEX_H
 
+/* CONFIG_64BIT_TIME enables new 64 bit time_t syscalls in the compat path
+ * and 32-bit emulation.
+ */
+#ifndef CONFIG_64BIT_TIME
+#define __kernel_timex timex
+#endif
+
 #include
 
 #define ADJ_ADJTIME		0x8000	/* switch between adjtime/adjtimex modes */
diff --git a/include/uapi/linux/timex.h b/include/uapi/linux/timex.h
index 92685d826444..a1c6b73016a5 100644
--- a/include/uapi/linux/timex.h
+++ b/include/uapi/linux/timex.h
@@ -92,6 +92,47 @@ struct timex {
 	int  :32; int  :32; int  :32;
 };
 
+struct __kernel_timex_timeval {
+	__kernel_time64_t       tv_sec;
+	long long		tv_usec;
+};
+
+#ifndef __kernel_timex
+struct __kernel_timex {
+	unsigned int modes;	/* mode selector */
+	int :32;		/* pad */
+	long long offset;	/* time offset (usec) */
+	long long freq;	/* frequency offset (scaled ppm) */
+	long long maxerror;/* maximum error (usec) */
+	long long esterror;/* estimated error (usec) */
+	int status;		/* clock command/status */
+	int :32;		/* pad */
+	long long constant;/* pll time constant */
+	long long precision;/* clock precision (usec) (read only) */
+	long long tolerance;/* clock frequency tolerance (ppm)
+				   * (read only)
+				   */
+	struct __kernel_timex_timeval time;	/* (read only, except for ADJ_SETOFFSET) */
+	long long tick;	/* (modified) usecs between clock ticks */
+
+	long long ppsfreq;/* pps frequency (scaled ppm) (ro) */
+	long long jitter; /* pps jitter (us) (ro) */
+	int shift;		/* interval duration (s) (shift) (ro) */
+	int :32;		/* pad */
+	long long stabil;	/* pps stability (scaled ppm) (ro) */
+	long long jitcnt; /* jitter limit exceeded (ro) */
+	long long calcnt; /* calibration intervals (ro) */
+	long long errcnt; /* calibration errors (ro) */
+	long long stbcnt; /* stability limit exceeded (ro) */
+
+	int tai;		/* TAI offset (ro) */
+
+	int  :32; int  :32; int  :32; int  :32;
+	int  :32; int  :32; int  :32; int  :32;
+	int  :32; int  :32; int  :32;
+};
+#endif
+
 /*
  * Mode codes (timex.mode)
  */
-- 
2.20.0
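The layout argument in the patch above can be checked in userspace. The following is only a sketch: struct demo_timex and demo_timex_size() are hypothetical names that mirror the field order of the posted struct __kernel_timex using fixed-width types, and the expected size of 208 bytes is derived here by adding up the fields, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the posted layout; the demo_ names
 * are illustrative and are not kernel identifiers. */
struct demo_timex_timeval {
	int64_t tv_sec;
	int64_t tv_usec;
};

struct demo_timex {
	unsigned int modes;
	int :32;			/* explicit pad keeps 'offset' 8-byte aligned */
	int64_t offset;
	int64_t freq;
	int64_t maxerror;
	int64_t esterror;
	int status;
	int :32;			/* pad */
	int64_t constant;
	int64_t precision;
	int64_t tolerance;
	struct demo_timex_timeval time;
	int64_t tick;

	int64_t ppsfreq;
	int64_t jitter;
	int shift;
	int :32;			/* pad */
	int64_t stabil;
	int64_t jitcnt;
	int64_t calcnt;
	int64_t errcnt;
	int64_t stbcnt;

	int tai;

	int :32; int :32; int :32; int :32;
	int :32; int :32; int :32; int :32;
	int :32; int :32; int :32;
};

/* Because every 64-bit member sits at an offset that is a multiple of 8,
 * the compiler has no reason to insert ABI-specific padding. */
_Static_assert(sizeof(struct demo_timex) == 208,
	       "layout should not depend on the word size");
_Static_assert(offsetof(struct demo_timex, time) == 72,
	       "time member should sit at the same offset everywhere");

size_t demo_timex_size(void)
{
	return sizeof(struct demo_timex);
}
```

Building this once with -m32 and once with -m64 (where a multilib toolchain is available) is one way to confirm that both static assertions hold on both ABIs.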
[PATCH v2 12/29] ipc: rename old-style shmctl/semctl/msgctl syscalls
The behavior of these system calls is slightly different between
architectures, as determined by the CONFIG_ARCH_WANT_IPC_PARSE_VERSION
symbol. Most architectures that implement the split IPC syscalls don't set
that symbol and only get the modern version, but alpha, arm, microblaze,
mips-n32, mips-n64 and xtensa expect the caller to pass the IPC_64 flag.

For the architectures that so far only implement sys_ipc(), i.e. m68k,
mips-o32, powerpc, s390, sh, sparc, and x86-32, we want the new behavior
when adding the split syscalls, so we need to distinguish between the
two groups of architectures.

The method I picked for this distinction is to have a separate system call
entry point: sys_old_*ctl() now uses ipc_parse_version, while sys_*ctl()
does not. The system call tables of the five architectures are changed
accordingly.

As an additional benefit, we no longer need the configuration specific
definition for ipc_parse_version(), it always does the same thing now,
but simply won't get called on architectures with the modern interface.

A small downside is that on architectures that do set
ARCH_WANT_IPC_PARSE_VERSION, we now have an extra set of entry points
that are never called. They only add a few bytes of bloat, so it seems
better to keep them compared to adding yet another Kconfig symbol.
I considered adding new syscall numbers for the IPC_64 variants for
consistency, but decided against that for now.

Signed-off-by: Arnd Bergmann --- arch/alpha/kernel/syscalls/syscall.tbl | 6 ++-- arch/arm/tools/syscall.tbl | 6 ++-- arch/arm64/include/asm/unistd32.h | 6 ++-- arch/microblaze/kernel/syscalls/syscall.tbl | 6 ++-- arch/mips/kernel/syscalls/syscall_n32.tbl | 6 ++-- arch/mips/kernel/syscalls/syscall_n64.tbl | 6 ++-- arch/xtensa/kernel/syscalls/syscall.tbl | 6 ++-- include/linux/syscalls.h| 3 ++ ipc/msg.c | 39 ipc/sem.c | 39 ipc/shm.c | 40 + ipc/syscall.c | 12 +++ ipc/util.h | 21 --- kernel/sys_ni.c | 3 ++ 14 files changed, 137 insertions(+), 62 deletions(-) diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl index f920b65e8c49..b0e247287908 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -174,17 +174,17 @@ 187common osf_alt_sigpending sys_ni_syscall 188common osf_alt_setsid sys_ni_syscall 199common osf_swapon sys_swapon -200common msgctl sys_msgctl +200common msgctl sys_old_msgctl 201common msgget sys_msgget 202common msgrcv sys_msgrcv 203common msgsnd sys_msgsnd -204common semctl sys_semctl +204common semctl sys_old_semctl 205common semget sys_semget 206common semop sys_semop 207common osf_utsname sys_osf_utsname 208common lchown sys_lchown 209common shmat sys_shmat -210common shmctl sys_shmctl +210common shmctl sys_old_shmctl 211common shmdt sys_shmdt 212common shmget sys_shmget 213common osf_mvalid sys_ni_syscall diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index 20ed7e026723..b54b7f2bc24a 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -314,15 +314,15 @@ 297common recvmsg sys_recvmsg 298common semop sys_semop sys_oabi_semop 299common semget sys_semget -300common semctl sys_semctl +300common semctl sys_old_semctl 301common msgsnd sys_msgsnd 302common msgrcv sys_msgrcv 303common msgget sys_msgget -304common msgctl sys_msgctl +304common msgctl sys_old_msgctl 305common shmat sys_shmat 306common shmdt sys_shmdt 307common 
shmget sys_shmget -308common shmctl sys_shmctl +308common shmctl sys_old_shmctl 309common add_key sys_add_key 310common request_key sys_request_key 311common keyctl sys_keyctl diff --git
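To illustrate the split between the two kinds of entry points described in this commit message, here is a minimal userspace sketch. All demo_ names are hypothetical; the flag values (0 for the old layout, 0x0100 for IPC_64) reflect my understanding of the UAPI header and should be treated as an assumption, and the real sys_old_*ctl()/sys_*ctl() functions of course do much more than report a version.

```c
#include <assert.h>

#define DEMO_IPC_OLD 0		/* legacy structure layout (assumed value) */
#define DEMO_IPC_64  0x0100	/* flag or-ed into cmd for the modern layout (assumed value) */

/* Strip the version flag from *cmd and report which layout was requested. */
int demo_ipc_parse_version(int *cmd)
{
	if (*cmd & DEMO_IPC_64) {
		*cmd &= ~DEMO_IPC_64;
		return DEMO_IPC_64;
	}
	return DEMO_IPC_OLD;
}

/* Old-style entry point: the caller decides by passing the flag or not. */
int demo_old_ctl_version(int cmd)
{
	return demo_ipc_parse_version(&cmd);
}

/* New-style entry point: the modern layout is implied, cmd is not inspected. */
int demo_ctl_version(int cmd)
{
	(void)cmd;
	return DEMO_IPC_64;
}
```

With this split, a table entry that points at the old-style function keeps the existing ABI of that architecture, while any newly added entry gets the modern behavior without a configuration-dependent helper.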
[PATCH v2 05/29] alpha: update syscall macro definitions
Other architectures commonly use __NR_umount2 for sys_umount,
only ia64 and alpha use __NR_umount here. In order to synchronize
the generated tables, use umount2 like everyone else, and add back
the old name from asm/unistd.h for compatibility.

For shmat, alpha uses the osf_shmat name, we can do the same thing
here, which means we don't have to add an entry in the __IGNORE
list now that shmat is mandatory everywhere.

alarm, creat, pause, time, and utime are optional everywhere
these days, no need to list them here any more.

I considered also adding the regular versions of the get*id system
calls that have different names and calling conventions on alpha,
which would further help unify the syscall ABI, but for now
I decided against that.

Signed-off-by: Arnd Bergmann
---
 arch/alpha/include/asm/unistd.h        | 6 --
 arch/alpha/include/uapi/asm/unistd.h   | 5 +
 arch/alpha/kernel/syscalls/syscall.tbl | 4 ++--
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/alpha/include/asm/unistd.h b/arch/alpha/include/asm/unistd.h
index 21b706a5b772..564ba87bdc38 100644
--- a/arch/alpha/include/asm/unistd.h
+++ b/arch/alpha/include/asm/unistd.h
@@ -22,18 +22,12 @@
 /*
  * Ignore legacy syscalls that we don't use.
  */
-#define __IGNORE_alarm
-#define __IGNORE_creat
 #define __IGNORE_getegid
 #define __IGNORE_geteuid
 #define __IGNORE_getgid
 #define __IGNORE_getpid
 #define __IGNORE_getppid
 #define __IGNORE_getuid
-#define __IGNORE_pause
-#define __IGNORE_time
-#define __IGNORE_utime
-#define __IGNORE_umount2
 
 /* Alpha doesn't have protection keys. */
 #define __IGNORE_pkey_mprotect
diff --git a/arch/alpha/include/uapi/asm/unistd.h b/arch/alpha/include/uapi/asm/unistd.h
index 9ba724f116f1..4507071f995f 100644
--- a/arch/alpha/include/uapi/asm/unistd.h
+++ b/arch/alpha/include/uapi/asm/unistd.h
@@ -2,6 +2,11 @@
 #ifndef _UAPI_ALPHA_UNISTD_H
 #define _UAPI_ALPHA_UNISTD_H
 
+/* These are traditionally the names linux-alpha uses for
+ * the two otherwise generic system calls */
+#define __NR_umount	__NR_umount2
+#define __NR_osf_shmat	__NR_shmat
+
 #include
 
 #endif /* _UAPI_ALPHA_UNISTD_H */
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index e09558edae73..f920b65e8c49 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -29,7 +29,7 @@
 19    common  lseek                   sys_lseek
 20    common  getxpid                 sys_getxpid
 21    common  osf_mount               sys_osf_mount
-22    common  umount                  sys_umount
+22    common  umount2                 sys_umount
 23    common  setuid                  sys_setuid
 24    common  getxuid                 sys_getxuid
 25    common  exec_with_loader        sys_ni_syscall
@@ -183,7 +183,7 @@
 206   common  semop                   sys_semop
 207   common  osf_utsname             sys_osf_utsname
 208   common  lchown                  sys_lchown
-209   common  osf_shmat               sys_shmat
+209   common  shmat                   sys_shmat
 210   common  shmctl                  sys_shmctl
 211   common  shmdt                   sys_shmdt
 212   common  shmget                  sys_shmget
-- 
2.20.0
[PATCH v2 20/29] time: fix sys_timer_settime prototype
A small typo has crept into the y2038 conversion of the timer_settime
system call. So far this was completely harmless, but once we start
using the new version, this has to be fixed.

Fixes: 6ff847350702 ("time: Change types to new y2038 safe __kernel_itimerspec")
Signed-off-by: Arnd Bergmann
---
 include/linux/syscalls.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 938d8908b9e0..baa4b70b02d3 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -591,7 +591,7 @@ asmlinkage long sys_timer_gettime(timer_t timer_id,
 asmlinkage long sys_timer_getoverrun(timer_t timer_id);
 asmlinkage long sys_timer_settime(timer_t timer_id, int flags,
 				const struct __kernel_itimerspec __user *new_setting,
-				struct itimerspec __user *old_setting);
+				struct __kernel_itimerspec __user *old_setting);
 asmlinkage long sys_timer_delete(timer_t timer_id);
 asmlinkage long sys_clock_settime(clockid_t which_clock,
 				const struct __kernel_timespec __user *tp);
-- 
2.20.0
[PATCH v2 22/29] timex: use __kernel_timex internally
From: Deepa Dinamani

struct timex is not y2038 safe.
Replace all uses of timex with y2038 safe __kernel_timex.

Note that struct __kernel_timex is an ABI interface definition.
We could define a new structure based on __kernel_timex that
is only available internally instead. Right now, there isn't
a strong motivation for this as the structure is isolated to
a few defined struct timex interfaces and such a structure would
be exactly the same as struct timex.

The patch was generated by the following coccinelle script:

virtual patch

@depends on patch forall@
identifier ts;
expression e;
@@
(
- struct timex ts;
+ struct __kernel_timex ts;
|
- struct timex ts = {};
+ struct __kernel_timex ts = {};
|
- struct timex ts = e;
+ struct __kernel_timex ts = e;
|
- struct timex *ts;
+ struct __kernel_timex *ts;
|
(memset \| copy_from_user \| copy_to_user \)(...,
- sizeof(struct timex))
+ sizeof(struct __kernel_timex))
)

@depends on patch forall@
identifier ts;
identifier fn;
@@
fn(...,
- struct timex *ts,
+ struct __kernel_timex *ts,
...) {
...
}

@depends on patch forall@
identifier ts;
identifier fn;
@@
fn(...,
- struct timex *ts) {
+ struct __kernel_timex *ts) {
...
}

Signed-off-by: Deepa Dinamani
Cc: linux-alpha@vger.kernel.org
Cc: net...@vger.kernel.org
---
 arch/alpha/kernel/osf_sys.c      |  5 +++--
 arch/sparc/kernel/sys_sparc_64.c |  4 ++--
 drivers/ptp/ptp_clock.c          |  2 +-
 include/linux/posix-clock.h      |  2 +-
 include/linux/time32.h           |  6 +++---
 include/linux/timex.h            |  4 ++--
 kernel/time/ntp.c                | 18 ++
 kernel/time/ntp_internal.h       |  2 +-
 kernel/time/posix-clock.c        |  2 +-
 kernel/time/posix-timers.c       |  8
 kernel/time/posix-timers.h       |  2 +-
 kernel/time/time.c               | 14 +++---
 kernel/time/timekeeping.c        |  4 ++--
 13 files changed, 38 insertions(+), 35 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 792586038808..bf497b8b0ec6 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1253,7 +1253,7 @@ struct timex32 {
 
 SYSCALL_DEFINE1(old_adjtimex, struct timex32 __user *, txc_p)
 {
-	struct timex txc;
+	struct __kernel_timex txc;
 	int ret;
 
 	/* copy relevant bits of struct timex. */
@@ -1270,7 +1270,8 @@ SYSCALL_DEFINE1(old_adjtimex, struct timex32 __user *, txc_p)
 	if (copy_to_user(txc_p, &txc, offsetof(struct timex32, time)) ||
 	    (copy_to_user(&txc_p->tick, &txc.tick, sizeof(struct timex32) -
 			  offsetof(struct timex32, tick))) ||
-	    (put_tv_to_tv32(&txc_p->time, &txc.time)))
+	    (put_user(txc.time.tv_sec, &txc_p->time.tv_sec)) ||
+	    (put_user(txc.time.tv_usec, &txc_p->time.tv_usec)))
 		return -EFAULT;
 
 	return ret;
diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 37de18a11207..9825ca6a6020 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -548,7 +548,7 @@ SYSCALL_DEFINE2(getdomainname, char __user *, name, int, len)
 SYSCALL_DEFINE1(sparc_adjtimex, struct timex __user *, txc_p)
 {
 	struct timex txc;		/* Local copy of parameter */
-	struct timex *kt = (void *)&txc;
+	struct __kernel_timex *kt = (void *)&txc;
 	int ret;
 
 	/* Copy the user data space into the kernel copy
@@ -572,7 +572,7 @@ SYSCALL_DEFINE1(sparc_adjtimex, struct timex __user *, txc_p)
 SYSCALL_DEFINE2(sparc_clock_adjtime, const clockid_t, which_clock,struct timex __user *, txc_p)
 {
 	struct timex txc;		/* Local copy of parameter */
-	struct timex *kt = (void *)&txc;
+	struct __kernel_timex *kt = (void *)&txc;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_POSIX_TIMERS)) {
diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c
index 48f3594a7458..79bd102c9bbc 100644
--- a/drivers/ptp/ptp_clock.c
+++ b/drivers/ptp/ptp_clock.c
@@ -124,7 +124,7 @@ static int ptp_clock_gettime(struct posix_clock *pc, struct timespec64 *tp)
 	return err;
 }
 
-static int ptp_clock_adjtime(struct posix_clock *pc, struct timex *tx)
+static int ptp_clock_adjtime(struct posix_clock *pc, struct __kernel_timex *tx)
 {
 	struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock);
 	struct ptp_clock_info *ops;
diff --git a/include/linux/posix-clock.h b/include/linux/posix-clock.h
index 3a3bc71017d5..18674d7d5b1c 100644
--- a/include/linux/posix-clock.h
+++ b/include/linux/posix-clock.h
@@ -51,7 +51,7 @@ struct posix_clock;
 struct posix_clock_operations {
 	struct module *owner;
 
-	int  (*clock_adjtime)(struct posix_clock *pc, struct timex *tx);
+	int  (*clock_adjtime)(struct posix_clock *pc, struct __kernel_timex *tx);
 
 	int  (*clock_gettime)(struct posix_clock *pc, struct timespec64 *ts);
 
diff --git a/include/linux/time32.h b/include/linux/time32.h
index 820a22e2b98b..0a1f302a1753 100644
---
[PATCH v2 00/29] y2038: add time64 syscalls
This is a minor update of the patches I posted last week. I would
like to add this into linux-next now, but would still make changes if
there are concerns about the contents. The first version did not see
a lot of replies, which could mean that either everyone is happy
with it, or that it was largely ignored.

See also the article at https://lwn.net/Articles/776435/.

Changes since v1:

- posting as a combined series for simplicity
- dropped one mips patch that was merged as a 5.0 fix
- reworked s390 compat syscall handling (posted separately)
  and rebased on top of that series
- minor fixes for arm64 and powerpc
- added alpha statfs64 interfaces
- added alpha get{eg,eu,g,p,u,pp}id()

      Arnd

v1 description for cleanup:

The system call tables have diverged a bit over the years, and a number
of the recent additions never made it into all architectures, for one
reason or another.

This is an attempt to clean it up as far as we can without breaking
compatibility, doing a number of steps:

- Add system calls that have not yet been integrated into all
  architectures but that we definitely want there.

- Add the separate ipc syscalls on all architectures that
  traditionally only had sys_ipc(). This version is done without
  the IPC_OLD support that we have in sys_ipc. The new
  semtimedop_time64 syscall will only be added here, not in sys_ipc.

- Add syscall numbers for a couple of syscalls that we probably
  don't need everywhere, in particular pkey_* and rseq,
  for the purpose of symmetry: if it's in asm-generic/unistd.h,
  it makes sense to have it everywhere.

- Prepare for having the same system call numbers for any future
  calls. In combination with the generated tables, this hopefully
  makes it easier to add new calls across all architectures
  together.

Most of the contents of this series are unrelated to the actual
y2038 work, but for the moment, that second series is based on
this one. If there are any concerns about changes here, I
can drop or rewrite any individual patch in this series.

My plan is to merge any patches in this series that are found
to be good together with the y2038 patches for linux-5.1, so
please review and provide Acks for merging through my tree,
or pick them up for 5.0 if they seem urgent enough.

v1 description for y2038 patches:

This series finally gets us to the point of having system calls with
64-bit time_t on all architectures, after a long time of incremental
preparation patches.

There was actually one conversion that I missed during the summer,
i.e. Deepa's timex series, which I have now updated based on the
5.0-rc1 changes and review comments.

I hope that the actual conversion should be uncontroversial by now,
even if some of the patches are rather large.

The one area that may need a little discussion is the system call
numbers assigned in the final patch: Can we get consensus on whether
the idea of using the same numbers on all architectures, as well as
my choice of numbers, makes sense here?

So far, I have done a lot of build testing across most architectures,
which has found a number of bugs. I have also done an LTP run on arm32
with existing user space, but not on the other architectures.
I did LTP tests with a modified musl libc[2] last summer on an older
version of this series to make sure that the new 64-bit time_t
interfaces work. The version there will need updates for testing
with this new kernel patch series; I plan to do that next.

For testing, the series plus the preparatory patches is available
at [3]. Once there is a general agreement on this series and I have
done more tests for the new system calls, I plan to add this to
linux-next through my asm-generic tree or Thomas' timers tree.

Please review and test!

      Arnd

[1] https://lore.kernel.org/lkml/20190110162435.309262-1-a...@arndb.de/T/
[2] https://git.linaro.org/people/arnd/musl-y2038.git/
[3] https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git y2038-5.0-rc1

Arnd Bergmann (26):
  ia64: add __NR_umount2 definition
  ia64: add statx and io_pgetevents syscalls
  ia64: assign syscall numbers for perf and seccomp
  alpha: wire up io_pgetevents system call
  alpha: update syscall macro definitions
  ARM: add migrate_pages() system call
  ARM: add kexec_file_load system call number
  m68k: assign syscall number for seccomp
  sh: remove duplicate unistd_32.h file
  sh: add statx system call
  sparc64: fix sparc_ipc type conversion
  ipc: rename old-style shmctl/semctl/msgctl syscalls
  arch: add split IPC system calls where needed
  arch: add pkey and rseq syscall numbers everywhere
  alpha: add standard statfs64/fstatfs64 syscalls
  alpha: add generic get{eg,eu,g,p,u,pp}id() syscalls
  syscalls: remove obsolete __IGNORE_ macros
  time: make adjtime compat handling available for 32 bit
  time: fix sys_timer_settime prototype
  sparc64: add custom adjtimex/clock_adjtime functions
  x86/x32: use time64 versions of sigtimedwait and recvmmsg
  y2038: syscalls: rename
[PATCH v2 24/29] x86/x32: use time64 versions of sigtimedwait and recvmmsg
x32 has always followed the time64 calling conventions of these
syscalls, which required a special hack in compat_get_timespec
aka get_old_timespec32 to continue working.

Since we now have the time64 syscalls, use those explicitly.

Signed-off-by: Arnd Bergmann
---
 arch/x86/entry/syscalls/syscall_64.tbl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index f0b1709a5ffb..43a622aec07e 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -361,7 +361,7 @@
 520  x32  execve              __x32_compat_sys_execve/ptregs
 521  x32  ptrace              __x32_compat_sys_ptrace
 522  x32  rt_sigpending       __x32_compat_sys_rt_sigpending
-523  x32  rt_sigtimedwait     __x32_compat_sys_rt_sigtimedwait
+523  x32  rt_sigtimedwait     __x32_compat_sys_rt_sigtimedwait_time64
 524  x32  rt_sigqueueinfo     __x32_compat_sys_rt_sigqueueinfo
 525  x32  sigaltstack         __x32_compat_sys_sigaltstack
 526  x32  timer_create        __x32_compat_sys_timer_create
@@ -375,7 +375,7 @@
 534  x32  preadv              __x32_compat_sys_preadv64
 535  x32  pwritev             __x32_compat_sys_pwritev64
 536  x32  rt_tgsigqueueinfo   __x32_compat_sys_rt_tgsigqueueinfo
-537  x32  recvmmsg            __x32_compat_sys_recvmmsg
+537  x32  recvmmsg            __x32_compat_sys_recvmmsg_time64
 538  x32  sendmmsg            __x32_compat_sys_sendmmsg
 539  x32  process_vm_readv    __x32_compat_sys_process_vm_readv
 540  x32  process_vm_writev   __x32_compat_sys_process_vm_writev
-- 
2.20.0
[PATCH v2 11/29] sparc64: fix sparc_ipc type conversion
__kernel_timespec and timespec are currently the same type, but once
they are different, the type cast has to be changed here.

Signed-off-by: Arnd Bergmann
---
 arch/sparc/kernel/sys_sparc_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 274ed0b9b3e0..1c079e7bab09 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -344,7 +344,7 @@ SYSCALL_DEFINE6(sparc_ipc, unsigned int, call, int, first, unsigned long, second
 			goto out;
 		case SEMTIMEDOP:
 			err = sys_semtimedop(first, ptr, (unsigned int)second,
-				(const struct timespec __user *)
+				(const struct __kernel_timespec __user *)
 					     (unsigned long) fifth);
 			goto out;
 		case SEMGET:
-- 
2.20.0
[PATCH v4 1/3] fs: hoist EFSCORRUPTED definition into uapi header
Multiple filesystems can already return EFSCORRUPTED errors to userspace;
however, so far, definitions of EFSCORRUPTED were in filesystem-private
headers.

I wanted to use EUCLEAN to indicate data corruption in the VFS layer;
Dave Chinner says that I should instead hoist the definitions of
EFSCORRUPTED into the UAPI header and then use EFSCORRUPTED.

This patch is marked for stable backport because it is a prerequisite for
the following patch.

Cc: sta...@vger.kernel.org
Suggested-by: Dave Chinner
Signed-off-by: Jann Horn
---
 fs/ext2/ext2.h                   | 1 -
 fs/ext4/ext4.h                   | 1 -
 fs/xfs/xfs_linux.h               | 1 -
 include/linux/jbd2.h             | 1 -
 include/uapi/asm-generic/errno.h | 1 +
 5 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h
index e770cd100a6a..7fafc19e5aa0 100644
--- a/fs/ext2/ext2.h
+++ b/fs/ext2/ext2.h
@@ -369,7 +369,6 @@ struct ext2_inode {
  */
 #define	EXT2_VALID_FS			0x0001	/* Unmounted cleanly */
 #define	EXT2_ERROR_FS			0x0002	/* Errors detected */
-#define	EFSCORRUPTED			EUCLEAN	/* Filesystem is corrupted */
 
 /*
  * Mount flags
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 185a05d3257e..9397e97fc15b 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3247,6 +3247,5 @@ extern const struct iomap_ops ext4_iomap_ops;
 #endif	/* __KERNEL__ */
 
 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
-#define EFSCORRUPTED	EUCLEAN		/* Filesystem is corrupted */
 
 #endif	/* _EXT4_H */
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index edbd5a210df2..36e5c6549f15 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -125,7 +125,6 @@ typedef __u32 xfs_nlink_t;
 
 #define ENOATTR		ENODATA		/* Attribute not found */
 #define EWRONGFS	EINVAL		/* Mount with wrong filesystem type */
-#define EFSCORRUPTED	EUCLEAN		/* Filesystem is corrupted */
 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
 
 #define SYNCHRONIZE()	barrier()
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 0f919d5fe84f..1d0da9c78216 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1640,6 +1640,5 @@ static inline tid_t jbd2_get_latest_transaction(journal_t *journal)
 #endif	/* __KERNEL__ */
 
 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
-#define EFSCORRUPTED	EUCLEAN		/* Filesystem is corrupted */
 
 #endif /* _LINUX_JBD2_H */
diff --git a/include/uapi/asm-generic/errno.h b/include/uapi/asm-generic/errno.h
index cf9c51ac49f9..5ddebc1bf951 100644
--- a/include/uapi/asm-generic/errno.h
+++ b/include/uapi/asm-generic/errno.h
@@ -98,6 +98,7 @@
 #define	EINPROGRESS	115	/* Operation now in progress */
 #define	ESTALE		116	/* Stale file handle */
 #define	EUCLEAN		117	/* Structure needs cleaning */
+#define	EFSCORRUPTED	EUCLEAN	/* Filesystem is corrupted */
 #define	ENOTNAM		118	/* Not a XENIX named type file */
 #define	ENAVAIL		119	/* No XENIX semaphores available */
 #define	EISNAM		120	/* Is a named type file */
-- 
2.20.1.321.g9e740568ce-goog
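For userspace, the practical effect of the patch above is that EFSCORRUPTED becomes an alias for EUCLEAN in the exported header. A small sketch of how a program might cope with both old and new headers follows; demo_is_fs_corruption() is a hypothetical helper, and the fallback define is an assumption for toolchains whose headers predate the alias. EUCLEAN itself is Linux-specific.

```c
#include <assert.h>
#include <errno.h>

/* Older installed headers may not provide the alias yet (assumption);
 * fall back to the value the patch assigns to it. */
#ifndef EFSCORRUPTED
#define EFSCORRUPTED EUCLEAN
#endif

/* Takes a positive errno value, as found in 'errno' after a failed call. */
int demo_is_fs_corruption(int err)
{
	return err == EFSCORRUPTED;
}
```

Since the two names share one number, code that already checks for EUCLEAN keeps working unchanged.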
[PATCH v4 2/3] fs: don't let getdents return bogus names
When you e.g. run `find` on a directory for which getdents returns "filenames" that contain slashes, `find` passes those "filenames" back to the kernel, which then interprets them as paths. That could conceivably cause userspace to do something bad when accessing something like an untrusted USB stick, but I'm not aware of any specific example. Instead of returning bogus filenames to userspace, return -EUCLEAN. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: sta...@vger.kernel.org Signed-off-by: Jann Horn --- I ordered this fix before the refactoring one so that it can easily be backported. changed in v2: - move bogus_dirent_name() out of the #ifdef (kbuild test robot) changed in v3: - change calling convention (Al Viro) - comment fix changed in v4: - use EFSCORRUPTED instead of EUCLEAN (Dave Chinner) arch/alpha/kernel/osf_sys.c | 4 fs/readdir.c| 35 +++ include/linux/fs.h | 2 ++ 3 files changed, 41 insertions(+) diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c index 792586038808..db1c2144d477 100644 --- a/arch/alpha/kernel/osf_sys.c +++ b/arch/alpha/kernel/osf_sys.c @@ -40,6 +40,7 @@ #include #include #include +#include #include #include @@ -117,6 +118,9 @@ osf_filldir(struct dir_context *ctx, const char *name, int namlen, unsigned int reclen = ALIGN(NAME_OFFSET + namlen + 1, sizeof(u32)); unsigned int d_ino; + buf->error = check_dirent_name(name, namlen); + if (unlikely(buf->error)) + return -EFSCORRUPTED; buf->error = -EINVAL; /* only used if we fail */ if (reclen > buf->count) return -EINVAL; diff --git a/fs/readdir.c b/fs/readdir.c index 2f6a4534e0df..58088510bb9c 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -64,6 +64,26 @@ int iterate_dir(struct file *file, struct dir_context *ctx) } EXPORT_SYMBOL(iterate_dir); +/* + * Most filesystems don't filter out bogus directory entry names, and userspace + * can get very confused by such names. Behave as if a filesystem error had + * happened while reading directory entries. 
+ */ +int check_dirent_name(const char *name, int namlen) +{ + if (namlen == 0) { + pr_err_once("%s: filesystem returned bogus empty name\n", + __func__); + return -EFSCORRUPTED; + } + if (memchr(name, '/', namlen)) { + pr_err_once("%s: filesystem returned bogus name '%*pEhp' (contains slash)\n", + __func__, namlen, name); + return -EFSCORRUPTED; + } + return 0; +} + /* * Traditional linux readdir() handling.. * @@ -98,6 +118,9 @@ static int fillonedir(struct dir_context *ctx, const char *name, int namlen, if (buf->result) return -EINVAL; + buf->result = check_dirent_name(name, namlen); + if (unlikely(buf->result)) + return -EFSCORRUPTED; d_ino = ino; if (sizeof(d_ino) < sizeof(ino) && d_ino != ino) { buf->result = -EOVERFLOW; @@ -173,6 +196,9 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen, int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2, sizeof(long)); + buf->error = check_dirent_name(name, namlen); + if (unlikely(buf->error)) + return -EFSCORRUPTED; buf->error = -EINVAL; /* only used if we fail.. */ if (reclen > buf->count) return -EINVAL; @@ -259,6 +285,9 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1, sizeof(u64)); + buf->error = check_dirent_name(name, namlen); + if (unlikely(buf->error)) + return -EFSCORRUPTED; buf->error = -EINVAL; /* only used if we fail.. 
*/ if (reclen > buf->count) return -EINVAL; @@ -358,6 +387,9 @@ static int compat_fillonedir(struct dir_context *ctx, const char *name, if (buf->result) return -EINVAL; + buf->result = check_dirent_name(name, namlen); + if (unlikely(buf->result)) + return -EFSCORRUPTED; d_ino = ino; if (sizeof(d_ino) < sizeof(ino) && d_ino != ino) { buf->result = -EOVERFLOW; @@ -427,6 +459,9 @@ static int compat_filldir(struct dir_context *ctx, const char *name, int namlen, int reclen = ALIGN(offsetof(struct compat_linux_dirent, d_name) + namlen + 2, sizeof(compat_long_t)); + buf->error = check_dirent_name(name, namlen); + if (unlikely(buf->error)) + return -EFSCORRUPTED; buf->error = -EINVAL; /* only used if we fail.. */ if (reclen > buf->count) return -EINVAL; diff --git a/include/linux/fs.h b/include/linux/fs.h index
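The name check added to fs/readdir.c can be mirrored in userspace for experimentation. This is a sketch only: demo_check_dirent_name() is a hypothetical stand-in for the patch's check_dirent_name(), it omits the pr_err_once() logging, and unlike the posted code it also rejects a negative length so that the cast to size_t stays safe.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#ifndef EFSCORRUPTED
#define EFSCORRUPTED EUCLEAN	/* same alias the series hoists into the UAPI header */
#endif

/* Returns 0 for a plausible directory entry name, -EFSCORRUPTED for an
 * empty name or a name containing a slash. 'name' need not be
 * NUL-terminated; only the first 'namlen' bytes are inspected. */
int demo_check_dirent_name(const char *name, int namlen)
{
	if (namlen <= 0)
		return -EFSCORRUPTED;
	if (memchr(name, '/', (size_t)namlen))
		return -EFSCORRUPTED;
	return 0;
}
```

Using memchr() with an explicit length rather than strchr() matters here because names handed to the filldir callbacks come with a length and are not guaranteed to be NUL-terminated.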
[PATCH v4 3/3] fs: let filldir_t return bool instead of an error code
As Al Viro pointed out, many filldir_t functions return error codes, but all
callers of filldir_t functions just check whether the return value is non-zero
(to determine whether to continue reading the directory); more precise errors
have to be signalled via struct dir_context.
Change all filldir_t functions to return bool instead of int.

Suggested-by: Al Viro
Signed-off-by: Jann Horn
---
 arch/alpha/kernel/osf_sys.c | 12 +++
 fs/afs/dir.c                | 30 +
 fs/ecryptfs/file.c          | 13
 fs/exportfs/expfs.c         |  8 ++---
 fs/fat/dir.c                |  8 ++---
 fs/gfs2/export.c            |  6 ++--
 fs/nfsd/nfs4recover.c       |  8 ++---
 fs/nfsd/vfs.c               |  6 ++--
 fs/ocfs2/dir.c              | 10 +++---
 fs/ocfs2/journal.c          | 14
 fs/overlayfs/readdir.c      | 24 +++---
 fs/readdir.c                | 64 ++---
 fs/reiserfs/xattr.c         | 20 ++--
 fs/xfs/scrub/dir.c          |  8 ++---
 fs/xfs/scrub/parent.c       |  4 +--
 include/linux/fs.h          | 10 +++---
 16 files changed, 125 insertions(+), 120 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index db1c2144d477..14e5ae0dac50 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -108,7 +108,7 @@ struct osf_dirent_callback {
 	int error;
 };
 
-static int
+static bool
 osf_filldir(struct dir_context *ctx, const char *name, int namlen,
 	    loff_t offset, u64 ino, unsigned int d_type)
 {
@@ -120,14 +120,14 @@ osf_filldir(struct dir_context *ctx, const char *name, int namlen,
 
 	buf->error = check_dirent_name(name, namlen);
 	if (unlikely(buf->error))
-		return -EFSCORRUPTED;
+		return false;
 	buf->error = -EINVAL;	/* only used if we fail */
 	if (reclen > buf->count)
-		return -EINVAL;
+		return false;
 	d_ino = ino;
 	if (sizeof(d_ino) < sizeof(ino) && d_ino != ino) {
 		buf->error = -EOVERFLOW;
-		return -EOVERFLOW;
+		return false;
 	}
 	if (buf->basep) {
 		if (put_user(offset, buf->basep))
@@ -144,10 +144,10 @@ osf_filldir(struct dir_context *ctx, const char *name, int namlen,
 	dirent = (void __user *)dirent + reclen;
 	buf->dirent = dirent;
 	buf->count -= reclen;
-	return 0;
+	return true;
 Efault:
 	buf->error = -EFAULT;
-	return -EFAULT;
+	return false;
 }
 
 SYSCALL_DEFINE4(osf_getdirentries, unsigned int, fd,
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 8a2562e3a316..84d74cc25127 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -26,10 +26,12 @@ static int afs_dir_open(struct inode *inode, struct file *file);
 static int afs_readdir(struct file *file, struct dir_context *ctx);
 static int afs_d_revalidate(struct dentry *dentry, unsigned int flags);
 static int afs_d_delete(const struct dentry *dentry);
-static int afs_lookup_one_filldir(struct dir_context *ctx, const char *name, int nlen,
-				  loff_t fpos, u64 ino, unsigned dtype);
-static int afs_lookup_filldir(struct dir_context *ctx, const char *name, int nlen,
-			      loff_t fpos, u64 ino, unsigned dtype);
+static bool afs_lookup_one_filldir(struct dir_context *ctx, const char *name,
+				   int nlen, loff_t fpos, u64 ino,
+				   unsigned int dtype);
+static bool afs_lookup_filldir(struct dir_context *ctx, const char *name,
+			       int nlen, loff_t fpos, u64 ino,
+			       unsigned int dtype);
 static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 		      bool excl);
 static int afs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode);
@@ -493,7 +495,7 @@ static int afs_readdir(struct file *file, struct dir_context *ctx)
  * - if afs_dir_iterate_block() spots this function, it'll pass the FID
  *   uniquifier through dtype
  */
-static int afs_lookup_one_filldir(struct dir_context *ctx, const char *name,
+static bool afs_lookup_one_filldir(struct dir_context *ctx, const char *name,
 				  int nlen, loff_t fpos, u64 ino, unsigned dtype)
 {
 	struct afs_lookup_one_cookie *cookie =
@@ -509,16 +511,16 @@ static int afs_lookup_one_filldir(struct dir_context *ctx, const char *name,
 
 	if (cookie->name.len != nlen ||
 	    memcmp(cookie->name.name, name, nlen) != 0) {
-		_leave(" = 0 [no]");
-		return 0;
+		_leave(" = true [no]");
+		return true;
 	}
 
 	cookie->fid.vnode = ino;
 	cookie->fid.unique = dtype;
 	cookie->found = 1;
 
-	_leave(" = -1 [found]");
-	return -1;
+	_leave(" = false [found]");
+	return false;
 }
 
 /*
@@ -561,12 +563,12 @@ static int
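
The convention this patch introduces can be illustrated outside the kernel. What follows is a minimal userspace sketch, not code from the patch: struct demo_ctx, demo_filldir_t, demo_filldir() and demo_iterate() are hypothetical names, and the rule that the iterator reports the number of accepted entries if there were any (and the stored error otherwise) is an assumption made for the example. It shows a callback that returns true to continue and false to stop, while the precise error travels in the context structure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct dir_context plus buffer state. */
struct demo_ctx {
	size_t space_left;	/* bytes remaining in the caller's buffer */
	int error;		/* precise error, set when the callback stops */
	int emitted;		/* number of entries accepted so far */
};

/* New-style callback type: true = keep iterating, false = stop. */
typedef bool (*demo_filldir_t)(struct demo_ctx *ctx, const char *name);

/* Accept an entry if it fits; otherwise record why and ask the caller to stop. */
static bool demo_filldir(struct demo_ctx *ctx, const char *name)
{
	size_t reclen = strlen(name) + 1;

	if (reclen > ctx->space_left) {
		ctx->error = -EINVAL;	/* detail goes into the context ... */
		return false;		/* ... the return value only says "stop" */
	}
	ctx->space_left -= reclen;
	ctx->emitted++;
	return true;
}

/* Caller: only tests the bool, then reads the error from the context. */
static int demo_iterate(struct demo_ctx *ctx, demo_filldir_t actor,
			const char *const *names, size_t n)
{
	assert(actor != NULL);
	for (size_t i = 0; i < n; i++)
		if (!actor(ctx, names[i]))
			break;
	/* Assumed policy for this sketch: report progress if any was made. */
	return ctx->emitted ? ctx->emitted : ctx->error;
}
```

With an 8-byte budget and the names "a", "bb" and "toolongname", the sketch accepts the first two entries, stops at the third, and leaves -EINVAL in ctx.error for the caller to inspect.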
Re: [PATCH 21/21] memblock: drop memblock_alloc_*_nopanic() variants
On Wed, Jan 16, 2019 at 03:44:21PM +0200, Mike Rapoport wrote:
> As all the memblock allocation functions return NULL in case of error
> rather than panic(), the duplicates with _nopanic suffix can be removed.
>
> Signed-off-by: Mike Rapoport
> ---
>  arch/arc/kernel/unwind.c       |  3 +--
>  arch/sh/mm/init.c              |  2 +-
>  arch/x86/kernel/setup_percpu.c | 10 +-
>  arch/x86/mm/kasan_init_64.c    | 14 --
>  drivers/firmware/memmap.c      |  2 +-
>  drivers/usb/early/xhci-dbc.c   |  2 +-
>  include/linux/memblock.h       | 35 ---
>  kernel/dma/swiotlb.c           |  2 +-
>  kernel/printk/printk.c         | 17 +++--
>  mm/memblock.c                  | 35 ---
>  mm/page_alloc.c                | 10 +-
>  mm/page_ext.c                  |  2 +-
>  mm/percpu.c                    | 11 ---
>  mm/sparse.c                    |  6 ++
>  14 files changed, 37 insertions(+), 114 deletions(-)

Acked-by: Greg Kroah-Hartman
Re: [PATCH 19/21] treewide: add checks for the return value of memblock_alloc*()
On Wed, Jan 16, 2019 at 03:44:19PM +0200, Mike Rapoport wrote:
> Add check for the return value of memblock_alloc*() functions and call
> panic() in case of error.
> The panic message repeats the one used by panicking memblock allocators with
> adjustment of parameters to include only relevant ones.
>
> The replacement was mostly automated with semantic patches like the one
> below with manual massaging of format strings.
>
> @@
> expression ptr, size, align;
> @@
> ptr = memblock_alloc(size, align);
> + if (!ptr)
> + 	panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__,
> size, align);
>
> Signed-off-by: Mike Rapoport

...

> diff --git a/arch/s390/numa/toptree.c b/arch/s390/numa/toptree.c
> index 71a608c..0118c77 100644
> --- a/arch/s390/numa/toptree.c
> +++ b/arch/s390/numa/toptree.c
> @@ -31,10 +31,14 @@ struct toptree __ref *toptree_alloc(int level, int id)
>  {
>  	struct toptree *res;
>  
> -	if (slab_is_available())
> +	if (slab_is_available()) {
>  		res = kzalloc(sizeof(*res), GFP_KERNEL);
> -	else
> +	} else {
>  		res = memblock_alloc(sizeof(*res), 8);
> +		if (!res)
> +			panic("%s: Failed to allocate %zu bytes align=0x%x\n",
> +			      __func__, sizeof(*res), 8);
> +	}
>  	if (!res)
>  		return res;

Please remove this hunk, since the code _should_ be able to handle allocation
failures anyway (see end of quoted code).

Otherwise for the s390 bits:

Acked-by: Heiko Carstens
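
The objection above can be sketched in userspace. This is an illustration under stated assumptions, not the s390 code: struct demo_toptree, demo_early_alloc(), demo_toptree_alloc() and the two test knobs are hypothetical, calloc() stands in for both kzalloc() and memblock_alloc(), and the field initialisation after the NULL check is assumed because the quoted hunk ends before it. The sketch shows that a single NULL check after the if/else already covers both allocators, so a panic() in the early-boot branch is not needed for the function to fail cleanly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct toptree; only two fields are modelled. */
struct demo_toptree {
	int level;
	int id;
};

/* Test knobs (assumptions): emulate slab_is_available() and a boot-time failure. */
static bool demo_slab_available;
static bool demo_early_alloc_fails;

/* Stand-in for memblock_alloc(): returns zeroed memory or NULL, never panics. */
static void *demo_early_alloc(size_t size)
{
	assert(size > 0);
	return demo_early_alloc_fails ? NULL : calloc(1, size);
}

static struct demo_toptree *demo_toptree_alloc(int level, int id)
{
	struct demo_toptree *res;

	if (demo_slab_available)
		res = calloc(1, sizeof(*res));	/* stands in for kzalloc() */
	else
		res = demo_early_alloc(sizeof(*res));
	if (!res)
		return res;	/* one failure path covers both allocators */

	res->level = level;	/* assumed initialisation, not in the quoted hunk */
	res->id = id;
	return res;
}
```

Callers of such a function only need to handle a NULL return, which is the behaviour the reply asks to keep for toptree_alloc().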