Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-11 Thread Palmer Dabbelt
On Tue, 11 Jul 2017 06:55:28 PDT (-0700), h...@infradead.org wrote:
> On Tue, Jul 11, 2017 at 02:22:15PM +0100, Will Deacon wrote:
>> The problem is that by supporting these hypothetical designs that can't do
>> atomics, you hurt sensible designs that *can* do the atomics because you
>> force them to take an additional indirection that could otherwise be
>> avoided.
>
> Agreed.  But the new patchset seems to remove it already, so I guess
> we're fine on the kernel side.  Now we just need to make sure the
> glibc API doesn't use any indirections.
>
> Note that it might make sense to emit these for very low end nommu
> designs.  Maybe even running Linux, but in that case they'll just need
> a special non-standard ABI for very limited use cases.

glibc has never used these calls on machines with the A extension.  They're
only used in one specific header file to emulate cmpxchg, and they're guarded
by something like "#ifdef riscv_atomic".  Here's the glibc code (from a
slightly older version, glibc is also in submission so everything's a bit of a
mess there too) for reference

  /* If the A (atomic) extension is not present, we need help from the
 kernel to do atomic accesses.  Linux provides two system calls for
 this purpose.  RISCV_ATOMIC_CMPXCHG will perform an atomic compare
 and exchange operation for a 32-bit value.  RISCV_ATOMIC_CMPXCHG64
 will do the same for a 64-bit value. */

  #include 
  #include 

  #define __HAVE_64B_ATOMICS (__riscv_xlen >= 64)
  #define USE_ATOMIC_COMPILER_BUILTINS 0

  #define __arch_compare_and_exchange_val_8_acq(mem, newval, oldval) \
(abort (), (__typeof (*mem)) 0)

  #define __arch_compare_and_exchange_val_16_acq(mem, newval, oldval) \
(abort (), (__typeof (*mem)) 0)

  /* The only basic operation needed is compare and exchange.  */
  #define __arch_compare_and_exchange_val_32_acq(mem, newval, oldval) \
({\
  INTERNAL_SYSCALL_DECL (__err);  \
  (__typeof (*mem)) INTERNAL_SYSCALL (sysriscv, __err, 4, \
  RISCV_ATOMIC_CMPXCHG, mem, oldval, newval); \
})

  #define __arch_compare_and_exchange_val_64_acq(mem, newval, oldval) \
({\
  INTERNAL_SYSCALL_DECL (__err);  \
  (__typeof (*mem)) INTERNAL_SYSCALL (sysriscv, __err, 4, \
  RISCV_ATOMIC_CMPXCHG64, mem, oldval, newval);   \
})

We originally had these as a special Kconfig option, but then it was pointed
out that user binaries built on non-A systems wouldn't run on A systems.  That
seemed like a bad idea, so we just enabled it everywhere.

I think we should just table this discussion for now: we can always add the
system calls back in if people build non-A Linux systems.  We'll mark our glibc
port as requiring the A extension and delete the dead code there so nothing
knows about the syscalls.

Thanks!


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-11 Thread Palmer Dabbelt
On Tue, 11 Jul 2017 06:22:15 PDT (-0700), will.dea...@arm.com wrote:
> On Mon, Jul 10, 2017 at 01:00:29PM -0700, Palmer Dabbelt wrote:
>> On Thu, 06 Jul 2017 08:45:13 PDT (-0700), will.dea...@arm.com wrote:
>> > On Thu, Jul 06, 2017 at 08:34:27AM -0700, Christoph Hellwig wrote:
>> >> On Thu, Jul 06, 2017 at 09:55:03AM +0100, Will Deacon wrote:
>> >> > Agreed on the indirection; it feels like this is something that should 
>> >> > be in
>> >> > the vDSO, which could use the cmpxchg instruction if it's available, or
>> >> > otherwise just uses plain loads and stores.
>>
>> These are already in the vDSO, and use the corresponding atomic instructions 
>> on
>> systems with the A extension.  The vDSO routines call the system calls in 
>> non-A
>> systems.  As far as I can tell that's necessary to preserve atomicity, which 
>> we
>> currently do by disabling scheduling.  If there's a way to do this without
>> entering the kernel then I'd be happy to support it, but I'm not sure how we
>> could maintain atomicity using only regular loads and stores.
>
> Take a look at the ARM code I mentioned. You can do away with the syscall if
> you notice that you preempt a thread inside the critical section of the
> vDSO, and, in that case you resume execution at a known "restart" address.
>
>> >> Even that seems like a lot of indirection for something that is in
>> >> the critical fast path for synchronization.  I really can't understand
>> >> how a new ISA / ABI could even come up with an idea as stupid as making
>> >> essential synchronization primitives optional.
>> >
>> > No disagreement there!
>>
>> The default set of multilibs on Linux are:
>>
>>  * rv32imac: 32-bit; Multiply, Atomic, and Compressed extensions
>>  * rv32imafdc: like above, but with single+double float
>>  * rv64imac: 64-bit, Multiply, Atomic and Compressed
>>  * rv64imafdc: like above, but with single+double float
>>
>> all of which support the A extension.  We certainly don't plan on building 
>> any
>> systems that support Linux without the A extension at SiFive, so I'm fine
>> removing the system call -- this was originally added by a user, so there was
>> at least enough interest for someone to add the system call.
>>
>> We've found people are retrofitting other cores to run RISC-V, and I could
>> certainly imagine an older design that lacks a beefy enough memory system to
>> support our atomics (which are LR/SC based) being a design that might arise.
>> There's a lot of systems where people don't seem to care that much about the
>> performance and just want something to work -- if they're on such a tiny 
>> system
>> they can't implement the A extension then they're probably not going to be
>> doing a lot of atomics anyway, so maybe it doesn't matter if atomics are 
>> slow.
>> As the cost for supporting these A-less systems seems fairly small, it seemed
>> like the right thing to do -- one of the points of making RISC-V have many
>> optional extensions was to let people pick the ones they view as important.
>> Since I don't know the performance constraints of their systems or the cost 
>> of
>> implementing the A extension in their design, I'm not really qualified to 
>> tell
>> them a cmpxchg syscall is a bad idea.
>
> The problem is that by supporting these hypothetical designs that can't do
> atomics, you hurt sensible designs that *can* do the atomics because you
> force them to take an additional indirection that could otherwise be
> avoided.

I just went ahead and removed the system calls from the port -- they're not
going to get called on any systems SiFive is building, so if someone complains
then we'll just sort it out later.


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-11 Thread Christoph Hellwig
On Tue, Jul 11, 2017 at 02:22:15PM +0100, Will Deacon wrote:
> The problem is that by supporting these hypothetical designs that can't do
> atomics, you hurt sensible designs that *can* do the atomics because you
> force them to take an additional indirection that could otherwise be
> avoided.

Agreed.  But the new patchset seems to remove it already, so I guess
we're fine on the kernel side.  Now we just need to make sure the
glibc API doesn't use any indirections.

Note that it might make sense to emit these for very low end nommu
designs.  Maybe even running Linux, but in that case they'll just need
a special non-standard ABI for very limited use cases.


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-11 Thread Will Deacon
On Mon, Jul 10, 2017 at 01:00:29PM -0700, Palmer Dabbelt wrote:
> On Thu, 06 Jul 2017 08:45:13 PDT (-0700), will.dea...@arm.com wrote:
> > On Thu, Jul 06, 2017 at 08:34:27AM -0700, Christoph Hellwig wrote:
> >> On Thu, Jul 06, 2017 at 09:55:03AM +0100, Will Deacon wrote:
> >> > Agreed on the indirection; it feels like this is something that should 
> >> > be in
> >> > the vDSO, which could use the cmpxchg instruction if it's available, or
> >> > otherwise just uses plain loads and stores.
> 
> These are already in the vDSO, and use the corresponding atomic instructions 
> on
> systems with the A extension.  The vDSO routines call the system calls in 
> non-A
> systems.  As far as I can tell that's necessary to preserve atomicity, which 
> we
> currently do by disabling scheduling.  If there's a way to do this without
> entering the kernel then I'd be happy to support it, but I'm not sure how we
> could maintain atomicity using only regular loads and stores.

Take a look at the ARM code I mentioned. You can do away with the syscall if
you notice that you preempt a thread inside the critical section of the
vDSO, and, in that case you resume execution at a known "restart" address.

> >> Even that seems like a lot of indirection for something that is in
> >> the critical fast path for synchronization.  I really can't understand
> >> how a new ISA / ABI could even come up with an idea as stupid as making
> >> essential synchronization primitives optional.
> >
> > No disagreement there!
> 
> The default set of multilibs on Linux are:
> 
>  * rv32imac: 32-bit; Multiply, Atomic, and Compressed extensions
>  * rv32imafdc: like above, but with single+double float
>  * rv64imac: 64-bit, Multiply, Atomic and Compressed
>  * rv64imafdc: like above, but with single+double float
> 
> all of which support the A extension.  We certainly don't plan on building any
> systems that support Linux without the A extension at SiFive, so I'm fine
> removing the system call -- this was originally added by a user, so there was
> at least enough interest for someone to add the system call.
> 
> We've found people are retrofitting other cores to run RISC-V, and I could
> certainly imagine an older design that lacks a beefy enough memory system to
> support our atomics (which are LR/SC based) being a design that might arise.
> There's a lot of systems where people don't seem to care that much about the
> performance and just want something to work -- if they're on such a tiny 
> system
> they can't implement the A extension then they're probably not going to be
> doing a lot of atomics anyway, so maybe it doesn't matter if atomics are slow.
> As the cost for supporting these A-less systems seems fairly small, it seemed
> like the right thing to do -- one of the points of making RISC-V have many
> optional extensions was to let people pick the ones they view as important.
> Since I don't know the performance constraints of their systems or the cost of
> implementing the A extension in their design, I'm not really qualified to tell
> them a cmpxchg syscall is a bad idea.

The problem is that by supporting these hypothetical designs that can't do
atomics, you hurt sensible designs that *can* do the atomics because you
force them to take an additional indirection that could otherwise be
avoided.

Will


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-10 Thread Palmer Dabbelt
On Mon, 10 Jul 2017 13:00:29 PDT (-0700), Palmer Dabbelt wrote:
> On Thu, 06 Jul 2017 08:45:13 PDT (-0700), will.dea...@arm.com wrote:
>> On Thu, Jul 06, 2017 at 08:34:27AM -0700, Christoph Hellwig wrote:
>>> On Thu, Jul 06, 2017 at 09:55:03AM +0100, Will Deacon wrote:
>>> > Agreed on the indirection; it feels like this is something that should be 
>>> > in
>>> > the vDSO, which could use the cmpxchg instruction if it's available, or
>>> > otherwise just uses plain loads and stores.
>
> These are already in the vDSO, and use the corresponding atomic instructions 
> on
> systems with the A extension.  The vDSO routines call the system calls in 
> non-A
> systems.  As far as I can tell that's necessary to preserve atomicity, which 
> we
> currently do by disabling scheduling.  If there's a way to do this without
> entering the kernel then I'd be happy to support it, but I'm not sure how we
> could maintain atomicity using only regular loads and stores.
>
>>> Even that seems like a lot of indirection for something that is in
>>> the critical fast path for synchronization.  I really can't understand
>>> how a new ISA / ABI could even come up with an idea as stupid as making
>>> essential synchronization primitives optional.
>>
>> No disagreement there!
>
> The default set of multilibs on Linux are:
>
>  * rv32imac: 32-bit; Multiply, Atomic, and Compressed extensions
>  * rv32imafdc: like above, but with single+double float
>  * rv64imac: 64-bit, Multiply, Atomic and Compressed
>  * rv64imafdc: like above, but with single+double float
>
> all of which support the A extension.  We certainly don't plan on building any
> systems that support Linux without the A extension at SiFive, so I'm fine
> removing the system call -- this was originally added by a user, so there was
> at least enough interest for someone to add the system call.
>
> We've found people are retrofitting other cores to run RISC-V, and I could
> certainly imagine an older design that lacks a beefy enough memory system to
> support our atomics (which are LR/SC based) being a design that might arise.
> There's a lot of systems where people don't seem to care that much about the
> performance and just want something to work -- if they're on such a tiny 
> system
> they can't implement the A extension then they're probably not going to be
> doing a lot of atomics anyway, so maybe it doesn't matter if atomics are slow.
> As the cost for supporting these A-less systems seems fairly small, it seemed
> like the right thing to do -- one of the points of making RISC-V have many
> optional extensions was to let people pick the ones they view as important.
> Since I don't know the performance constraints of their systems or the cost of
> implementing the A extension in their design, I'm not really qualified to tell
> them a cmpxchg syscall is a bad idea.
>
> I'm fine either way here: if someone's core can't support the A extension they
> can always just buy one that does (ideally from us :)).  If it was up to be 
> I'd
> leave the calls in there, as I generally don't like to tell users we won't
> support their use case, but since you guys seem to know a lot more about this
> than I do I'll just leave the decision up to you.
>
> If you want the system call (and the corresponding vDSO entry, which will be
> unnecessary if we mandate A) gone then I'll remove it for our v5.  Just give 
> me
> a heads up.
>
> Thanks, and sorry for wasting your time!

I mangled this message when sending it so I'm trying again.


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-06 Thread Will Deacon
On Thu, Jul 06, 2017 at 08:34:27AM -0700, Christoph Hellwig wrote:
> On Thu, Jul 06, 2017 at 09:55:03AM +0100, Will Deacon wrote:
> > Agreed on the indirection; it feels like this is something that should be in
> > the vDSO, which could use the cmpxchg instruction if it's available, or
> > otherwise just uses plain loads and stores.
> 
> Even that seems like a lot of indirection for something that is in
> the critical fast path for synchronization.  I really can't understand
> how a new ISA / ABI could even come up with an idea as stupid as making
> essential synchronization primitives optional. 

No disagreement there!

Will


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-06 Thread Christoph Hellwig
On Thu, Jul 06, 2017 at 09:55:03AM +0100, Will Deacon wrote:
> Agreed on the indirection; it feels like this is something that should be in
> the vDSO, which could use the cmpxchg instruction if it's available, or
> otherwise just uses plain loads and stores.

Even that seems like a lot of indirection for something that is in
the critical fast path for synchronization.  I really can't understand
how a new ISA / ABI could even come up with an idea as stupid as making
essential synchronization primitives optional. 


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-06 Thread Dave P Martin
On Wed, Jul 05, 2017 at 09:49:36AM -0700, Palmer Dabbelt wrote:
> On Mon, 03 Jul 2017 16:06:39 PDT (-0700), james.ho...@imgtec.com wrote:
> > On Thu, Jun 29, 2017 at 02:42:38PM -0700, Palmer Dabbelt wrote:
> >> On Wed, 28 Jun 2017 15:42:37 PDT (-0700), james.ho...@imgtec.com wrote:
> >> > On Wed, Jun 28, 2017 at 11:55:37AM -0700, Palmer Dabbelt wrote:
> >> >> diff --git a/arch/riscv/include/uapi/asm/ucontext.h 
> >> >> b/arch/riscv/include/uapi/asm/ucontext.h
> >> >> new file mode 100644
> >> >> index ..52eff9febcfd
> >> >> --- /dev/null
> >> >> +++ b/arch/riscv/include/uapi/asm/ucontext.h
> >> > ...
> >> >> +struct ucontext {
> >> >> +   unsigned long uc_flags;
> >> >> +   struct ucontext  *uc_link;
> >> >> +   stack_t   uc_stack;
> >> >> +   sigset_t  uc_sigmask;
> >> >> +   /* glibc uses a 1024-bit sigset_t */
> >> >> +   __u8  __unused[1024 / 8 - sizeof(sigset_t)];
> >> >> +   /* last for future expansion */
> >> >> +   struct sigcontext uc_mcontext;
> >> >> +};
> >> >
> >> > Any particular reason not to use the asm-generic ucontext?
> >>
> >> In the generic ucontext, 'uc_sigmask' is at the end of the structure so it 
> >> can
> >> be expanded.  Since we want our mcontext to be expandable as well, we
> >> pre-allocate some expandable space for sigmask and then put mcontext at the
> >> end.
> >>
> >> We stole this idea from arm64.
> >
> > Curious. __unused seems like overkill to be honest given that expanding
> > the number of signals up to 128 causes other issues (as discovered on
> > MIPS e.g. the waitpid() status, with stopsig not fitting below the exit
> > code (shift 8) and core dump flag (bit 7)), but perhaps it could be
> > carefully expanded by splitting the stopsig field.
> 
> Sorry, I don't understand the intricacies of this in the slightest.  In 
> general
> we try to avoid surprises in software land in RISC-V, so whenever we do
> something we go look at the most popular architectures (Intel and ARM) and try
> to ensure we don't paint ourselves into any corners that they didn't.

I think Catalin was concerned that putting uc_sigmask at the end breaks
extensibility of sigcontext.

[Catalin, please comment if I'm misquoting you ;) ]


Generic ucontext seems broken in any case, when kernel/user sigset_t
sizes differ:

sigset_t oldmask, newmask;

void handler(int n, siginfo_t *si, void *uc_)
{
ucontext_t *uc = uc_;

oldmask = uc->uc_sigmask; uc->uc_sigmask = newmask;
}

With generic ucontext, this can overrun and corrupt memory if the user/
kernel sigset_t sizes differ.  The only fix is to reserve space in
ucontext for the larger of the two sigset sizes, which generic ucontext
does not do.

There's also the problem you comment on where only 7 bits of signal
number are available in wait() status values: with 0x7f being magic and
0 not a valid signal number, that probably allows up to 126 signals.
An arch could possibly have its own definitions to get beyond this, but
it's all done by magic numbers and open-coding in core code today.

i.e., the "extensibility" in generic ucontext may be bogus.


So, you can commit to a sane maximum number of signals (say, 64) in
your ABI, but this means that your libc sigset_t size probably needs to
match and you can never grow beyond that without an ABI break.

Or you can reserve enough space in ucontext for the userspace sigset_t.
Using generic signal.h and ucontext.h effectively commits you to max 64
signals AFAICT.  The extra space may be permanently wasted, but that's
preferable to memory corruption.


(Note, I don't know myself where the "1024" comes from.  Are there any
POSIXish systems implementing anywhere near that number of signals?  Is
there a real usecase for it?  Maybe it's just overzealous
futureproofing?)

> > Looks harmless here I suppose so I defer to others. If it is the
> > preferred approach does it make sense to make it the "default" for new
> > architectures at some point?
> 
> Again, this isn't really my thing, but we chose this because we thought it was
> the sane way to do it.  Unless we're doing something silly, I don't see why it
> wouldn't be a reasonable default.  This is predicated on having expandable
> architectural state, otherwise putting sigmask at the end seems sane.

Note, the contents of sigcontext are nonportable but are nonetheless
ABI, and some userspace software does expect to be able to poke about in
there, modify the signal return state, etc.

Additionally the whole signal frame cannot safely exceed MINSIGSTKSZ in
size (which looks like 2K if you're relying on generic signal.h -- not
a big number when compared against upcoming vector architectures).

Going beyond MINSIGSTKSZ is a user ABI break, as is any non-probeable
change to the contents of struct sigcontext, including changes to its
size.


We are burned by this on arm64 with SVE: arm64 has its own MINSIGSTKSZ
(5K), which is not enough for the biggest possible SVE implemetati

Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-06 Thread Will Deacon
On Wed, Jul 05, 2017 at 07:01:41PM -0700, Christoph Hellwig wrote:
> I'm a bit concerned about these cmpxchg syscalls, and I'd like to
> understand if my concerns are justified.
> 
> For a new instruction set that starts out in the 201x years we really
> should have cmpxchg as a mandatory instruction - if not in the CPU
> it should be in the Linux ABI so that we don't have to deal with a mess
> where programs will either need an indirection for CPUs that have
> cmpxchg capabilities vs those that don't, and we don't need to worry
> if given binaries work on all CPUs.

Agreed on the indirection; it feels like this is something that should be in
the vDSO, which could use the cmpxchg instruction if it's available, or
otherwise just uses plain loads and stores. In the latter case, the kernel
can then detect if you're preempted in the vdso critical region and restart
the cmpxchg (we do something similar for arch/arm/, but it's slightly
simpler with the vectors page (see entry-armv.S).

Alternatively, we could revisit the restartable sequences work from Mathieu,
but I think that's stalled pending real-world performance data.

> What keeps from from declaring that the RISV-A extension is mandatory
> for Linux?

Or that!

Will


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-05 Thread Christoph Hellwig
I'm a bit concerned about these cmpxchg syscalls, and I'd like to
understand if my concerns are justified.

For a new instruction set that starts out in the 201x years we really
should have cmpxchg as a mandatory instruction - if not in the CPU
it should be in the Linux ABI so that we don't have to deal with a mess
where programs will either need an indirection for CPUs that have
cmpxchg capabilities vs those that don't, and we don't need to worry
if given binaries work on all CPUs.

What keeps from from declaring that the RISV-A extension is mandatory
for Linux?


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-05 Thread Palmer Dabbelt
On Mon, 03 Jul 2017 16:06:39 PDT (-0700), james.ho...@imgtec.com wrote:
> On Thu, Jun 29, 2017 at 02:42:38PM -0700, Palmer Dabbelt wrote:
>> On Wed, 28 Jun 2017 15:42:37 PDT (-0700), james.ho...@imgtec.com wrote:
>> > On Wed, Jun 28, 2017 at 11:55:37AM -0700, Palmer Dabbelt wrote:
>> >> diff --git a/arch/riscv/include/uapi/asm/ucontext.h 
>> >> b/arch/riscv/include/uapi/asm/ucontext.h
>> >> new file mode 100644
>> >> index ..52eff9febcfd
>> >> --- /dev/null
>> >> +++ b/arch/riscv/include/uapi/asm/ucontext.h
>> > ...
>> >> +struct ucontext {
>> >> + unsigned long uc_flags;
>> >> + struct ucontext  *uc_link;
>> >> + stack_t   uc_stack;
>> >> + sigset_t  uc_sigmask;
>> >> + /* glibc uses a 1024-bit sigset_t */
>> >> + __u8  __unused[1024 / 8 - sizeof(sigset_t)];
>> >> + /* last for future expansion */
>> >> + struct sigcontext uc_mcontext;
>> >> +};
>> >
>> > Any particular reason not to use the asm-generic ucontext?
>>
>> In the generic ucontext, 'uc_sigmask' is at the end of the structure so it 
>> can
>> be expanded.  Since we want our mcontext to be expandable as well, we
>> pre-allocate some expandable space for sigmask and then put mcontext at the
>> end.
>>
>> We stole this idea from arm64.
>
> Curious. __unused seems like overkill to be honest given that expanding
> the number of signals up to 128 causes other issues (as discovered on
> MIPS e.g. the waitpid() status, with stopsig not fitting below the exit
> code (shift 8) and core dump flag (bit 7)), but perhaps it could be
> carefully expanded by splitting the stopsig field.

Sorry, I don't understand the intricacies of this in the slightest.  In general
we try to avoid surprises in software land in RISC-V, so whenever we do
something we go look at the most popular architectures (Intel and ARM) and try
to ensure we don't paint ourselves into any corners that they didn't.

> Looks harmless here I suppose so I defer to others. If it is the
> preferred approach does it make sense to make it the "default" for new
> architectures at some point?

Again, this isn't really my thing, but we chose this because we thought it was
the sane way to do it.  Unless we're doing something silly, I don't see why it
wouldn't be a reasonable default.  This is predicated on having expandable
architectural state, otherwise putting sigmask at the end seems sane.


Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-05 Thread James Hogan
On Tue, Jul 04, 2017 at 12:51:01PM -0700, Palmer Dabbelt wrote:
> diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
> new file mode 100644
> index ..2720d5e97354
> --- /dev/null
> +++ b/arch/riscv/kernel/ptrace.c
> @@ -0,0 +1,138 @@

> +/* Put registers back to task. */
> +static void putregs(struct task_struct *child, struct pt_regs *uregs)
> +{
> + struct pt_regs *regs = task_pt_regs(child);
> + *regs = *uregs;
> +}
> +
> +static int riscv_gpr_get(struct task_struct *target,
> +  const struct user_regset *regset,
> +  unsigned int pos, unsigned int count,
> +  void *kbuf, void __user *ubuf)
> +{
> + struct pt_regs *regs;
> +
> + regs = task_pt_regs(target);
> + return user_regset_copyout(&pos, &count, &kbuf, &ubuf, regs, 0,
> +sizeof(*regs));

sizeof(struct pt_regs) > sizeof(struct user_regs_struct), which allows
supervisor registers to be copied too. I think you should be using
sizeof(struct user_regs_struct) instead.

> +}
> +
> +static int riscv_gpr_set(struct task_struct *target,
> +  const struct user_regset *regset,
> +  unsigned int pos, unsigned int count,
> +  const void *kbuf, const void __user *ubuf)
> +{
> + int ret;
> + struct pt_regs regs;
> +
> + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, ®s, 0,
> +  sizeof(regs));

same

> + if (ret)
> + return ret;
> +
> + putregs(target, ®s);

you're still copying via the stack without initialising the non-written
fields. If userland does a short PTRACE_SETREGSET the remaining fields
will be copied from the uninitialised kernel stack and accessible to
userland via PTRACE_GETREGSET.

Even if the user does a full sizeof(struct user_regs_struct) (not
pt_regs) PTRACE_SETREGSET the supervisor registers will be overwritten
with uninitialised stack content.

> diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
> new file mode 100644
> index ..4419604ff46c
> --- /dev/null
> +++ b/arch/riscv/kernel/sys_riscv.c

> +SYSCALL_DEFINE3(sysriscv_cmpxchg32, u32 __user *, ptr, u32, new, u32, old)
> +{
> + u32 prev;
> + unsigned int err;
> +
> + if (!access_ok(VERIFY_WRITE, ptr, sizeof(*ptr)))
> + return -EFAULT;
> +
> +#ifdef CONFIG_ISA_A
> + err = 0;
> + prev = cmpxchg32(ptr, old, new);

I think this needs a special version of cmpxchg (or for cmpxchg to be
modified) with fixup protection to return -EFAULT in case the page isn't
mapped or is paged out/read only.

> +#else
> + preempt_disable();
> + err = __get_user(prev, ptr);
> + if (likely(!err && prev == old))
> + err = __put_user(new, ptr);
> + preempt_enable();
> +#endif
> +
> + return unlikely(err) ? err : prev;
> +}
> +
> +SYSCALL_DEFINE3(sysriscv_cmpxchg64, u64 __user *, ptr, u64, new, u64, old)
> +{
> +#ifdef CONFIG_64BIT
> + u64 prev;
> + unsigned int err;
> +
> + if (!access_ok(VERIFY_WRITE, ptr, sizeof(*ptr)))
> + return -EFAULT;
> +
> +#ifdef CONFIG_ISA_A
> + err = 0;
> + prev = cmpxchg64(ptr, old, new);

Likewise

> +#else
> + preempt_disable();
> + err = __get_user(prev, ptr);
> + if (likely(!err && prev == old))
> + err = __put_user(new, ptr);
> + preempt_enable();
> +#endif
> + return unlikely(err) ? err : prev;
> +#else
> + return -ENOTSUPP;

I think -ENOSYS is more standard for missing/unimplemented system calls.

A better way IMO would be to #ifdef the definitions in unistd.h, then
the __NR_* definitions could also be more accurately extracted from the
kernel headers, and you could just ifdef CONFIG_64BIT the whole syscall
implementation, i.e.:

> diff --git a/arch/riscv/include/uapi/asm/unistd.h 
> b/arch/riscv/include/uapi/asm/unistd.h
> new file mode 100644
> index ..37a5429cc896
> --- /dev/null
> +++ b/arch/riscv/include/uapi/asm/unistd.h
> @@ -0,0 +1,23 @@

> +/*
> + * These system calls add support for AMOs on RISC-V systems without support
> + * for the A extension.
> + */
> +#define __NR_sysriscv_cmpxchg32  (__NR_arch_specific_syscall + 0)
> +__SYSCALL(__NR_sysriscv_cmpxchg32, sys_sysriscv_cmpxchg32)

+ifdef WHATEVER_BUILTIN_GCC_DEFINES_FOR_64BIT_RISCV_ABI

In case its helpful the following should list builtin preprocessor
defines for given CFLAGS:
${CROSS_COMPILE}gcc ${CFLAGS} -dM -E - +#define __NR_sysriscv_cmpxchg64  (__NR_arch_specific_syscall + 1)
> +__SYSCALL(__NR_sysriscv_cmpxchg64, sys_sysriscv_cmpxchg64)

+endif

Cheers
James


signature.asc
Description: Digital signature


[PATCH 8/9] RISC-V: User-facing API

2017-07-04 Thread Palmer Dabbelt
This patch contains code that is in some way visible to the user:
including via system calls, the VDSO, module loading and signal
handling.  It also contains some generic code that is ABI visible.

Signed-off-by: Palmer Dabbelt 
---
 arch/riscv/include/asm/elf.h  |   3 +-
 arch/riscv/include/asm/mmu.h  |  26 +++
 arch/riscv/include/asm/ptrace.h   | 116 
 arch/riscv/include/asm/syscall.h  |  90 ++
 arch/riscv/include/asm/syscalls.h |  25 +++
 arch/riscv/include/asm/unistd.h   |  16 ++
 arch/riscv/include/asm/vdso.h |  32 
 arch/riscv/include/uapi/asm/auxvec.h  |  24 +++
 arch/riscv/include/uapi/asm/bitsperlong.h |  25 +++
 arch/riscv/include/uapi/asm/byteorder.h   |  23 +++
 arch/riscv/include/uapi/asm/elf.h |  83 +
 arch/riscv/include/uapi/asm/hwcap.h   |  36 
 arch/riscv/include/uapi/asm/ptrace.h  |  90 ++
 arch/riscv/include/uapi/asm/sigcontext.h  |  30 
 arch/riscv/include/uapi/asm/siginfo.h |  24 +++
 arch/riscv/include/uapi/asm/ucontext.h|  35 
 arch/riscv/include/uapi/asm/unistd.h  |  23 +++
 arch/riscv/kernel/cpufeature.c|  61 +++
 arch/riscv/kernel/module.c| 217 ++
 arch/riscv/kernel/ptrace.c| 138 ++
 arch/riscv/kernel/riscv_ksyms.c   |  15 ++
 arch/riscv/kernel/signal.c| 289 ++
 arch/riscv/kernel/sys_riscv.c |  90 ++
 arch/riscv/kernel/syscall_table.c |  25 +++
 arch/riscv/kernel/vdso/.gitignore |   1 +
 arch/riscv/kernel/vdso/cmpxchg32.S|  40 +
 arch/riscv/kernel/vdso/cmpxchg64.S|  40 +
 arch/riscv/kernel/vdso/sigreturn.S|  24 +++
 arch/riscv/kernel/vdso/vdso.S |  27 +++
 29 files changed, 1667 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/include/asm/mmu.h
 create mode 100644 arch/riscv/include/asm/ptrace.h
 create mode 100644 arch/riscv/include/asm/syscall.h
 create mode 100644 arch/riscv/include/asm/syscalls.h
 create mode 100644 arch/riscv/include/asm/unistd.h
 create mode 100644 arch/riscv/include/asm/vdso.h
 create mode 100644 arch/riscv/include/uapi/asm/auxvec.h
 create mode 100644 arch/riscv/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/riscv/include/uapi/asm/byteorder.h
 create mode 100644 arch/riscv/include/uapi/asm/elf.h
 create mode 100644 arch/riscv/include/uapi/asm/hwcap.h
 create mode 100644 arch/riscv/include/uapi/asm/ptrace.h
 create mode 100644 arch/riscv/include/uapi/asm/sigcontext.h
 create mode 100644 arch/riscv/include/uapi/asm/siginfo.h
 create mode 100644 arch/riscv/include/uapi/asm/ucontext.h
 create mode 100644 arch/riscv/include/uapi/asm/unistd.h
 create mode 100644 arch/riscv/kernel/cpufeature.c
 create mode 100644 arch/riscv/kernel/module.c
 create mode 100644 arch/riscv/kernel/ptrace.c
 create mode 100644 arch/riscv/kernel/riscv_ksyms.c
 create mode 100644 arch/riscv/kernel/signal.c
 create mode 100644 arch/riscv/kernel/sys_riscv.c
 create mode 100644 arch/riscv/kernel/syscall_table.c
 create mode 100644 arch/riscv/kernel/vdso/.gitignore
 create mode 100644 arch/riscv/kernel/vdso/cmpxchg32.S
 create mode 100644 arch/riscv/kernel/vdso/cmpxchg64.S
 create mode 100644 arch/riscv/kernel/vdso/sigreturn.S
 create mode 100644 arch/riscv/kernel/vdso/vdso.S

diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
index 5ded3f6f83ea..a1ef503d616e 100644
--- a/arch/riscv/include/asm/elf.h
+++ b/arch/riscv/include/asm/elf.h
@@ -59,7 +59,8 @@
  * instruction set this CPU supports.  This could be done in user space,
  * but it's not easy, and we've already done it here.
  */
-#define ELF_HWCAP  (0)
+#define ELF_HWCAP  (elf_hwcap)
+extern unsigned long elf_hwcap;
 
 /*
  * This yields a string that ld.so will use to load implementation
diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
new file mode 100644
index ..66805cba9a27
--- /dev/null
+++ b/arch/riscv/include/asm/mmu.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ *   This program is free software; you can redistribute it and/or
+ *   modify it under the terms of the GNU General Public License
+ *   as published by the Free Software Foundation, version 2.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ */
+
+
+#ifndef _ASM_RISCV_MMU_H
+#define _ASM_RISCV_MMU_H
+
+#ifndef __ASSEMBLY__
+
+typedef struct {
+   void *vdso;
+} mm_context_t;
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_MMU_H */
diff --git a/arch/riscv/include/asm/ptrace.h b/arch/riscv/include/asm/ptrace.h
new file mode 100644
index ..340201868842
-

Re: [PATCH 8/9] RISC-V: User-facing API

2017-07-03 Thread James Hogan
On Thu, Jun 29, 2017 at 02:42:38PM -0700, Palmer Dabbelt wrote:
> On Wed, 28 Jun 2017 15:42:37 PDT (-0700), james.ho...@imgtec.com wrote:
> > On Wed, Jun 28, 2017 at 11:55:37AM -0700, Palmer Dabbelt wrote:
> >> diff --git a/arch/riscv/include/uapi/asm/ucontext.h 
> >> b/arch/riscv/include/uapi/asm/ucontext.h
> >> new file mode 100644
> >> index ..52eff9febcfd
> >> --- /dev/null
> >> +++ b/arch/riscv/include/uapi/asm/ucontext.h
> > ...
> >> +struct ucontext {
> >> +  unsigned long uc_flags;
> >> +  struct ucontext  *uc_link;
> >> +  stack_t   uc_stack;
> >> +  sigset_t  uc_sigmask;
> >> +  /* glibc uses a 1024-bit sigset_t */
> >> +  __u8  __unused[1024 / 8 - sizeof(sigset_t)];
> >> +  /* last for future expansion */
> >> +  struct sigcontext uc_mcontext;
> >> +};
> >
> > Any particular reason not to use the asm-generic ucontext?
> 
> In the generic ucontext, 'uc_sigmask' is at the end of the structure so it can
> be expanded.  Since we want our mcontext to be expandable as well, we
> pre-allocate some expandable space for sigmask and then put mcontext at the
> end.
> 
> We stole this idea from arm64.

Curious. __unused seems like overkill to be honest given that expanding
the number of signals up to 128 causes other issues (as discovered on
MIPS e.g. the waitpid() status, with stopsig not fitting below the exit
code (shift 8) and core dump flag (bit 7)), but perhaps it could be
carefully expanded by splitting the stopsig field.

Looks harmless here I suppose so I defer to others. If it is the
preferred approach does it make sense to make it the "default" for new
architectures at some point?

Cheers
James


signature.asc
Description: Digital signature


Re: [PATCH 8/9] RISC-V: User-facing API

2017-06-29 Thread Palmer Dabbelt
On Wed, 28 Jun 2017 15:42:37 PDT (-0700), james.ho...@imgtec.com wrote:
> Hi Palmer,
>
> On Wed, Jun 28, 2017 at 11:55:37AM -0700, Palmer Dabbelt wrote:
>> diff --git a/arch/riscv/include/asm/syscalls.h 
>> b/arch/riscv/include/asm/syscalls.h
>> new file mode 100644
>> index ..d85267c4f7ea
>> --- /dev/null
>> +++ b/arch/riscv/include/asm/syscalls.h
>> @@ -0,0 +1,25 @@
> ...
>> +/* kernel/sys_riscv.c */
>> +asmlinkage long sys_sysriscv(unsigned long, unsigned long,
>> +unsigned long, unsigned long);
>
> You suggested in the cover letter this wasn't muxed any longer, maybe
> you should have a prototype for each of the cmpxchg syscalls instead?

Sorry, I just missed that.  I'll fix it for the v4

  diff --git a/arch/riscv/include/asm/syscalls.h 
b/arch/riscv/include/asm/syscalls.h
  index d85267c4f7ea..6490274fbb76 100644
  --- a/arch/riscv/include/asm/syscalls.h
  +++ b/arch/riscv/include/asm/syscalls.h
  @@ -19,7 +19,7 @@
   #include 

   /* kernel/sys_riscv.c */
  -asmlinkage long sys_sysriscv(unsigned long, unsigned long,
  -   unsigned long, unsigned long);
  +asmlinkage long sys_sysriscv_cmpxchg32(u32 __user * ptr, u32 new, u32 old);
  +asmlinkage long sys_sysriscv_cmpxchg64(u64 __user * ptr, u64 new, u64 old);

   #endif /* _ASM_RISCV_SYSCALLS_H */

>> diff --git a/arch/riscv/include/uapi/asm/ptrace.h 
>> b/arch/riscv/include/uapi/asm/ptrace.h
>> new file mode 100644
>> index ..01aee1654eae
>> --- /dev/null
>> +++ b/arch/riscv/include/uapi/asm/ptrace.h
> ...
>> +struct __riscv_f_ext_state {
>> +__u32 f[32];
>> +__u32 fcsr;
>> +};
>> +
>> +struct __riscv_d_ext_state {
>> +__u64 f[32];
>> +__u32 fcsr;
>> +};
>> +
>> +struct __riscv_q_ext_state {
>> +__u64 f[64] __attribute__((aligned(16)));
>> +__u32 fcsr;
>> +/* Reserved for expansion of sigcontext structure.  Currently zeroed
>> + * upon signal, and must be zero upon sigreturn.  */
>> +__u32 reserved[3];
>> +};
>> +
>> +union __riscv_fp_state {
>> +struct __riscv_f_ext_state f;
>> +struct __riscv_d_ext_state d;
>> +struct __riscv_q_ext_state q;
>> +};
>
> Out of interest, how does one tell which fp format is in use?

We might need another tag here -- I'll talk to Andrew (who did the glibc side
of this) and make sure we can handle something like running F user code on a D
kernel.

>> diff --git a/arch/riscv/include/uapi/asm/ucontext.h 
>> b/arch/riscv/include/uapi/asm/ucontext.h
>> new file mode 100644
>> index ..52eff9febcfd
>> --- /dev/null
>> +++ b/arch/riscv/include/uapi/asm/ucontext.h
> ...
>> +struct ucontext {
>> +unsigned long uc_flags;
>> +struct ucontext  *uc_link;
>> +stack_t   uc_stack;
>> +sigset_t  uc_sigmask;
>> +/* glibc uses a 1024-bit sigset_t */
>> +__u8  __unused[1024 / 8 - sizeof(sigset_t)];
>> +/* last for future expansion */
>> +struct sigcontext uc_mcontext;
>> +};
>
> Any particular reason not to use the asm-generic ucontext?

In the generic ucontext, 'uc_sigmask' is at the end of the structure so it can
be expanded.  Since we want our mcontext to be expandable as well, we
pre-allocate some expandable space for sigmask and then put mcontext at the
end.

We stole this idea from arm64.

>> diff --git a/arch/riscv/include/uapi/asm/unistd.h 
>> b/arch/riscv/include/uapi/asm/unistd.h
>> new file mode 100644
>> index ..7e3909ac3c18
>> --- /dev/null
>> +++ b/arch/riscv/include/uapi/asm/unistd.h
> ...
>> +/* FIXME: This exists for now in order to maintain compatibility with our
>> + * pre-upstream glibc, and will be removed for our real Linux submission.
>> + */
>> +#define __ARCH_WANT_RENAMEAT
>> +
>
> Don't forget ;-)
>
> Have you seen the patches floating around for dropping
> getrlimit/setrlimit (in favour of prlimit64) and fstatat64/fstat64 (in
> favour of statx)? I guess its no big deal.

Yes, but we're trying to make this glibc release so we decided to hold off on
them.  If we can't make it then we might reconsider, but they seem like fairly
small issues.

>> +#include 
>> +
>> +/*
>> + * These system calls add support for AMOs on RISC-V systems without support
>> + * for the A extension.
>> + */
>> +#define __NR_sysriscv_cmpxchg32 (__NR_arch_specific_syscall + 0)
>> +#define __NR_sysriscv_cmpxchg64 (__NR_arch_specific_syscall + 1)
>
> I think you need the magic __SYSCALL invocations here like in
> include/uapi/asm/unistd.h, otherwise they won't get included in your
> syscall table.

OK, I've added those.

  diff --git a/arch/riscv/include/uapi/asm/unistd.h 
b/arch/riscv/include/uapi/asm/unistd.h
  index 7e3909ac3c18..3cdb32912ac7 100644
  --- a/arch/riscv/include/uapi/asm/unistd.h
  +++ b/arch/riscv/include/uapi/asm/unistd.h
  @@ -23,4 +23,6 @@
* for the A extension.
*/
   #define __NR_sysriscv_cmpxchg32(__NR_arch_specific_syscall + 
0)
  +__SYSCALL(__NR_sysriscv_cmpxchg32, sys_sysriscv_cmpxchg32)
   #define __NR_sys

Re: [PATCH 8/9] RISC-V: User-facing API

2017-06-29 Thread Palmer Dabbelt
On Wed, 28 Jun 2017 14:49:44 PDT (-0700), t...@linutronix.de wrote:
> On Wed, 28 Jun 2017, Palmer Dabbelt wrote:
>> +
>> +SYSCALL_DEFINE3(sysriscv_cmpxchg32, unsigned long, arg1, unsigned long, 
>> arg2,
>> +unsigned long, arg3)
>> +{
>> +unsigned long flags;
>> +unsigned long prev;
>> +unsigned int *ptr;
>> +unsigned int err;
>> +
>> +ptr = (unsigned int *)arg1;
>
> Errm. Why isn't arg1 a proper pointer type and the arguments arg2/3 u32?
>
> And please give the arguments a proper name, so it's obvious what is what.
>
> SYSCALL_DEFINE3(sysriscv_cmpxchg32, u32 __user *, ptr, u32 new, u32 old)
>
> Hmm?

Sorry about that -- this used to be a multiplexed system call, and I guess I
was just being stupid when demultiplexing it.  That's much better, I've
converted these over.

>> +if (!access_ok(VERIFY_WRITE, ptr, sizeof(unsigned int)))
>> +return -EFAULT;
>> +
>> +preempt_disable();
>> +raw_local_irq_save(flags);
>
> Why do you want to disable interrupts here? This is thread context and
> accessing user space memory, so the only protection this needs is against
> preemption.

OK, that makes sense.

>> +err = __get_user(prev, ptr);
>> +if (likely(!err && prev == arg2))
>> +err = __put_user(arg3, ptr);
>> +raw_local_irq_restore(flags);
>> +preempt_enable();
>> +
>> +return unlikely(err) ? err : prev;
>> +}
>> +
>> +SYSCALL_DEFINE3(sysriscv_cmpxchg64, unsigned long, arg1, unsigned long, 
>> arg2,
>> +unsigned long, arg3)
>
> This one is even worse. How does this implement cmpxchg64 on a 32bit machine?
>
> Answer: Not at all, because arg2 and 3 are 32bit 

Thanks for catching that -- this was just a bit of copy-and-paste gone wrong.

>> +{
>> +unsigned long flags;
>> +unsigned long prev;
>> +unsigned int *ptr;
>> +unsigned int err;
>> +
>> +ptr = (unsigned int *)arg1;
>
> Type casting to random pointer types makes the code more obvious
> and safe, right? What the heck has a int pointer to do with u64?
>
>> +if (!access_ok(VERIFY_WRITE, ptr, sizeof(unsigned long)))
>> +return -EFAULT;
>> +
>> +preempt_disable();
>> +raw_local_irq_save(flags);
>
> Same as above.
>
>> +err = __get_user(prev, ptr);
>
> Sigh. Type safety is overrated, right?

Again, this was due to the multiplexing that has been removed.  I've gone ahead
and cleaned up this system call here

  
https://github.com/riscv/riscv-linux/commit/1af46852b968db5af044ec3a0329a73116b3e6ec

We'll include this in our v4 patch set.

Thanks!


Re: [PATCH 8/9] RISC-V: User-facing API

2017-06-28 Thread James Hogan
Hi Palmer,

On Wed, Jun 28, 2017 at 11:55:37AM -0700, Palmer Dabbelt wrote:
> diff --git a/arch/riscv/include/asm/syscalls.h 
> b/arch/riscv/include/asm/syscalls.h
> new file mode 100644
> index ..d85267c4f7ea
> --- /dev/null
> +++ b/arch/riscv/include/asm/syscalls.h
> @@ -0,0 +1,25 @@
...
> +/* kernel/sys_riscv.c */
> +asmlinkage long sys_sysriscv(unsigned long, unsigned long,
> + unsigned long, unsigned long);

You suggested in the cover letter this wasn't muxed any longer, maybe
you should have a prototype for each of the cmpxchg syscalls instead?

> diff --git a/arch/riscv/include/uapi/asm/ptrace.h 
> b/arch/riscv/include/uapi/asm/ptrace.h
> new file mode 100644
> index ..01aee1654eae
> --- /dev/null
> +++ b/arch/riscv/include/uapi/asm/ptrace.h
...
> +struct __riscv_f_ext_state {
> + __u32 f[32];
> + __u32 fcsr;
> +};
> +
> +struct __riscv_d_ext_state {
> + __u64 f[32];
> + __u32 fcsr;
> +};
> +
> +struct __riscv_q_ext_state {
> + __u64 f[64] __attribute__((aligned(16)));
> + __u32 fcsr;
> + /* Reserved for expansion of sigcontext structure.  Currently zeroed
> +  * upon signal, and must be zero upon sigreturn.  */
> + __u32 reserved[3];
> +};
> +
> +union __riscv_fp_state {
> + struct __riscv_f_ext_state f;
> + struct __riscv_d_ext_state d;
> + struct __riscv_q_ext_state q;
> +};

Out of interest, how does one tell which fp format is in use?

> diff --git a/arch/riscv/include/uapi/asm/ucontext.h 
> b/arch/riscv/include/uapi/asm/ucontext.h
> new file mode 100644
> index ..52eff9febcfd
> --- /dev/null
> +++ b/arch/riscv/include/uapi/asm/ucontext.h
...
> +struct ucontext {
> + unsigned long uc_flags;
> + struct ucontext  *uc_link;
> + stack_t   uc_stack;
> + sigset_t  uc_sigmask;
> + /* glibc uses a 1024-bit sigset_t */
> + __u8  __unused[1024 / 8 - sizeof(sigset_t)];
> + /* last for future expansion */
> + struct sigcontext uc_mcontext;
> +};

Any particular reason not to use the asm-generic ucontext?

> diff --git a/arch/riscv/include/uapi/asm/unistd.h 
> b/arch/riscv/include/uapi/asm/unistd.h
> new file mode 100644
> index ..7e3909ac3c18
> --- /dev/null
> +++ b/arch/riscv/include/uapi/asm/unistd.h
...
> +/* FIXME: This exists for now in order to maintain compatibility with our
> + * pre-upstream glibc, and will be removed for our real Linux submission.
> + */
> +#define __ARCH_WANT_RENAMEAT
> +

Don't forget ;-)

Have you seen the patches floating around for dropping
getrlimit/setrlimit (in favour of prlimit64) and fstatat64/fstat64 (in
favour of statx)? I guess its no big deal.

> +#include 
> +
> +/*
> + * These system calls add support for AMOs on RISC-V systems without support
> + * for the A extension.
> + */
> +#define __NR_sysriscv_cmpxchg32  (__NR_arch_specific_syscall + 0)
> +#define __NR_sysriscv_cmpxchg64  (__NR_arch_specific_syscall + 1)

I think you need the magic __SYSCALL invocations here like in
include/uapi/asm/unistd.h, otherwise they won't get included in your
syscall table.

> diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
> new file mode 100644
> index ..69b3b2d10664
> --- /dev/null
> +++ b/arch/riscv/kernel/ptrace.c
...
> +enum riscv_regset {
> + REGSET_X,
> +};
> +
> +/*
> + * Get registers from task and ready the result for userspace.
> + */
> +static char *getregs(struct task_struct *child, struct pt_regs *uregs)
> +{
> + *uregs = *task_pt_regs(child);
> + return (char *)uregs;
> +}
> +
> +/* Put registers back to task. */
> +static void putregs(struct task_struct *child, struct pt_regs *uregs)
> +{
> + struct pt_regs *regs = task_pt_regs(child);
> + *regs = *uregs;
> +}
> +
> +static int riscv_gpr_get(struct task_struct *target,
> +  const struct user_regset *regset,
> +  unsigned int pos, unsigned int count,
> +  void *kbuf, void __user *ubuf)
> +{
> + struct pt_regs regs;
> +
> + getregs(target, ®s);
> +
> + return user_regset_copyout(&pos, &count, &kbuf, &ubuf, ®s, 0,
> +sizeof(regs));

Shouldn't this be limited to sizeof(struct user_regs_struct)?

Why not copy straight out of task_pt_regs(target) instead of bouncing
via the stack?

> +}
> +
> +static int riscv_gpr_set(struct task_struct *target,
> +  const struct user_regset *regset,
> +  unsigned int pos, unsigned int count,
> +  const void *kbuf, const void __user *ubuf)
> +{
> + int ret;
> + struct pt_regs regs;
> +
> + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, ®s, 0,
> +  sizeof(regs));

likewise.

In fact if userland supplies insufficient data then this looks
vulnerable to a kernel stack data leak, since regs will remain partially
uninitialised and then get written to 

Re: [PATCH 8/9] RISC-V: User-facing API

2017-06-28 Thread Thomas Gleixner
On Wed, 28 Jun 2017, Thomas Gleixner wrote:
> On Wed, 28 Jun 2017, Palmer Dabbelt wrote:
> > +   err = __get_user(prev, ptr);
> 
> Sigh. Type safety is overrated, right?

But the comment above your __get_user() implementation says:

+ * @ptr must have pointer-to-simple-variable type, and the result of
+ * dereferencing @ptr must be assignable to @x without a cast.



Re: [PATCH 8/9] RISC-V: User-facing API

2017-06-28 Thread Thomas Gleixner
On Wed, 28 Jun 2017, Palmer Dabbelt wrote:
> +
> +SYSCALL_DEFINE3(sysriscv_cmpxchg32, unsigned long, arg1, unsigned long, arg2,
> + unsigned long, arg3)
> +{
> + unsigned long flags;
> + unsigned long prev;
> + unsigned int *ptr;
> + unsigned int err;
> +
> + ptr = (unsigned int *)arg1;

Errm. Why isn't arg1 a proper pointer type and the arguments arg2/3 u32?

And please give the arguments a proper name, so it's obvious what is what.

SYSCALL_DEFINE3(sysriscv_cmpxchg32, u32 __user *, ptr, u32 new, u32 old)

Hmm?

> + if (!access_ok(VERIFY_WRITE, ptr, sizeof(unsigned int)))
> + return -EFAULT;
> +
> + preempt_disable();
> + raw_local_irq_save(flags);

Why do you want to disable interrupts here? This is thread context and
accessing user space memory, so the only protection this needs is against
preemption.

> + err = __get_user(prev, ptr);
> + if (likely(!err && prev == arg2))
> + err = __put_user(arg3, ptr);
> + raw_local_irq_restore(flags);
> + preempt_enable();
> +
> + return unlikely(err) ? err : prev;
> +}
> +
> +SYSCALL_DEFINE3(sysriscv_cmpxchg64, unsigned long, arg1, unsigned long, arg2,
> + unsigned long, arg3)

This one is even worse. How does this implement cmpxchg64 on a 32bit machine?

Answer: Not at all, because arg2 and 3 are 32bit 

> +{
> + unsigned long flags;
> + unsigned long prev;
> + unsigned int *ptr;
> + unsigned int err;
> +
> + ptr = (unsigned int *)arg1;

Type casting to random pointer types makes the code more obvious
and safe, right? What the heck has a int pointer to do with u64?

> + if (!access_ok(VERIFY_WRITE, ptr, sizeof(unsigned long)))
> + return -EFAULT;
> +
> + preempt_disable();
> + raw_local_irq_save(flags);

Same as above.

> + err = __get_user(prev, ptr);

Sigh. Type safety is overrated, right?

Thanks,

tglx


[PATCH 8/9] RISC-V: User-facing API

2017-06-28 Thread Palmer Dabbelt
This patch contains code that is in some way visible to the user:
including via system calls, the VDSO, module loading and signal
handling.  It also contains some generic code that is ABI visible.

Signed-off-by: Palmer Dabbelt 
---
 arch/riscv/include/asm/mmu.h  |  26 +++
 arch/riscv/include/asm/ptrace.h   | 116 
 arch/riscv/include/asm/syscall.h  |  90 ++
 arch/riscv/include/asm/syscalls.h |  25 +++
 arch/riscv/include/asm/unistd.h   |  16 ++
 arch/riscv/include/asm/vdso.h |  32 
 arch/riscv/include/uapi/asm/auxvec.h  |  24 +++
 arch/riscv/include/uapi/asm/bitsperlong.h |  25 +++
 arch/riscv/include/uapi/asm/byteorder.h   |  23 +++
 arch/riscv/include/uapi/asm/elf.h |  83 +
 arch/riscv/include/uapi/asm/ptrace.h  |  87 +
 arch/riscv/include/uapi/asm/sigcontext.h  |  29 +++
 arch/riscv/include/uapi/asm/siginfo.h |  24 +++
 arch/riscv/include/uapi/asm/ucontext.h|  35 
 arch/riscv/include/uapi/asm/unistd.h  |  26 +++
 arch/riscv/kernel/module.c| 215 ++
 arch/riscv/kernel/ptrace.c| 147 +++
 arch/riscv/kernel/riscv_ksyms.c   |  15 ++
 arch/riscv/kernel/signal.c| 288 ++
 arch/riscv/kernel/sys_riscv.c |  87 +
 arch/riscv/kernel/syscall_table.c |  25 +++
 arch/riscv/kernel/vdso/.gitignore |   1 +
 arch/riscv/kernel/vdso/cmpxchg32.S|  40 +
 arch/riscv/kernel/vdso/cmpxchg64.S|  40 +
 arch/riscv/kernel/vdso/sigreturn.S|  24 +++
 arch/riscv/kernel/vdso/vdso.S |  27 +++
 26 files changed, 1570 insertions(+)
 create mode 100644 arch/riscv/include/asm/mmu.h
 create mode 100644 arch/riscv/include/asm/ptrace.h
 create mode 100644 arch/riscv/include/asm/syscall.h
 create mode 100644 arch/riscv/include/asm/syscalls.h
 create mode 100644 arch/riscv/include/asm/unistd.h
 create mode 100644 arch/riscv/include/asm/vdso.h
 create mode 100644 arch/riscv/include/uapi/asm/auxvec.h
 create mode 100644 arch/riscv/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/riscv/include/uapi/asm/byteorder.h
 create mode 100644 arch/riscv/include/uapi/asm/elf.h
 create mode 100644 arch/riscv/include/uapi/asm/ptrace.h
 create mode 100644 arch/riscv/include/uapi/asm/sigcontext.h
 create mode 100644 arch/riscv/include/uapi/asm/siginfo.h
 create mode 100644 arch/riscv/include/uapi/asm/ucontext.h
 create mode 100644 arch/riscv/include/uapi/asm/unistd.h
 create mode 100644 arch/riscv/kernel/module.c
 create mode 100644 arch/riscv/kernel/ptrace.c
 create mode 100644 arch/riscv/kernel/riscv_ksyms.c
 create mode 100644 arch/riscv/kernel/signal.c
 create mode 100644 arch/riscv/kernel/sys_riscv.c
 create mode 100644 arch/riscv/kernel/syscall_table.c
 create mode 100644 arch/riscv/kernel/vdso/.gitignore
 create mode 100644 arch/riscv/kernel/vdso/cmpxchg32.S
 create mode 100644 arch/riscv/kernel/vdso/cmpxchg64.S
 create mode 100644 arch/riscv/kernel/vdso/sigreturn.S
 create mode 100644 arch/riscv/kernel/vdso/vdso.S

diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
new file mode 100644
index ..66805cba9a27
--- /dev/null
+++ b/arch/riscv/include/asm/mmu.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ *   This program is free software; you can redistribute it and/or
+ *   modify it under the terms of the GNU General Public License
+ *   as published by the Free Software Foundation, version 2.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ */
+
+
+#ifndef _ASM_RISCV_MMU_H
+#define _ASM_RISCV_MMU_H
+
+#ifndef __ASSEMBLY__
+
+typedef struct {
+   void *vdso;
+} mm_context_t;
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_MMU_H */
diff --git a/arch/riscv/include/asm/ptrace.h b/arch/riscv/include/asm/ptrace.h
new file mode 100644
index ..340201868842
--- /dev/null
+++ b/arch/riscv/include/asm/ptrace.h
@@ -0,0 +1,116 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ *   This program is free software; you can redistribute it and/or
+ *   modify it under the terms of the GNU General Public License
+ *   as published by the Free Software Foundation, version 2.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_PTRACE_H
+#define _ASM_RISCV_PTRACE_H
+
+#include 
+#include 
+
+#ifndef __ASSEMBLY__
+
+struct pt_regs {
+   unsigned long sepc;
+   unsigned long ra;
+