[PATCH 2/2] powerpc/compat_sys: swap hi/lo parts of 64-bit syscall args in LE mode

2021-01-02 Thread Will Springer
Swap upper/lower 32 bits for 64-bit compat syscalls, conditioned on
endianness. This is modeled after the same functionality in
arch/mips/kernel/linux32.c.

This fixes compat_sys on ppc64le, when called by 32-bit little-endian
processes.

Tested with `file /bin/bash` (pread64) and `truncate -s 5G test`
(ftruncate64).

Signed-off-by: Will Springer 
---
 arch/powerpc/kernel/sys_ppc32.c | 49 +++--
 1 file changed, 28 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index d36c6391eaf5..16ff0399a257 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -59,57 +59,64 @@ unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
 /* 
  * long long munging:
  * The 32 bit ABI passes long longs in an odd even register pair.
+ * High and low parts are swapped depending on endian mode,
+ * so define a macro (similar to mips linux32) to handle that.
  */
+#ifdef __LITTLE_ENDIAN__
+#define merge_64(low, high) (((u64)(high) << 32) | (low))
+#else
+#define merge_64(high, low) (((u64)(high) << 32) | (low))
+#endif
 
 compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count,
-u32 reg6, u32 poshi, u32 poslo)
+u32 reg6, u32 pos1, u32 pos2)
 {
-   return ksys_pread64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
+   return ksys_pread64(fd, ubuf, count, merge_64(pos1, pos2));
 }
 
 compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count,
- u32 reg6, u32 poshi, u32 poslo)
+ u32 reg6, u32 pos1, u32 pos2)
 {
-   return ksys_pwrite64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
+   return ksys_pwrite64(fd, ubuf, count, merge_64(pos1, pos2));
 }
 
-compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offhi, u32 offlo, u32 count)
+compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u32 count)
 {
-   return ksys_readahead(fd, ((loff_t)offhi << 32) | offlo, count);
+   return ksys_readahead(fd, merge_64(offset1, offset2), count);
 }
 
 asmlinkage int compat_sys_truncate64(const char __user * path, u32 reg4,
-   unsigned long high, unsigned long low)
+   unsigned long len1, unsigned long len2)
 {
-   return ksys_truncate(path, (high << 32) | low);
+   return ksys_truncate(path, merge_64(len1, len2));
 }
 
-asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offhi, u32 offlo,
-u32 lenhi, u32 lenlo)
+asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2,
+u32 len1, u32 len2)
 {
-   return ksys_fallocate(fd, mode, ((loff_t)offhi << 32) | offlo,
-((loff_t)lenhi << 32) | lenlo);
+   return ksys_fallocate(fd, mode, merge_64(offset1, offset2),
+merge_64(len1, len2));
 }
 
-asmlinkage int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long high,
-unsigned long low)
+asmlinkage int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1,
+unsigned long len2)
 {
-   return ksys_ftruncate(fd, (high << 32) | low);
+   return ksys_ftruncate(fd, merge_64(len1, len2));
 }
 
-long ppc32_fadvise64(int fd, u32 unused, u32 offset_high, u32 offset_low,
+long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2,
 size_t len, int advice)
 {
-   return ksys_fadvise64_64(fd, (u64)offset_high << 32 | offset_low, len,
+   return ksys_fadvise64_64(fd, merge_64(offset1, offset2), len,
 advice);
 }
 
 asmlinkage long compat_sys_sync_file_range2(int fd, unsigned int flags,
-  unsigned offset_hi, unsigned offset_lo,
-  unsigned nbytes_hi, unsigned nbytes_lo)
+  unsigned offset1, unsigned offset2,
+  unsigned nbytes1, unsigned nbytes2)
 {
-   loff_t offset = ((loff_t)offset_hi << 32) | offset_lo;
-   loff_t nbytes = ((loff_t)nbytes_hi << 32) | nbytes_lo;
+   loff_t offset = merge_64(offset1, offset2);
+   loff_t nbytes = merge_64(nbytes1, nbytes2);
 
return ksys_sync_file_range(fd, offset, nbytes, flags);
 }
-- 
2.29.2







[PATCH 1/2] powerpc: use kernel endianness in MSR in 32-bit signal handler

2021-01-02 Thread Will Springer
From: Joseph J Allen 

This mirrors the behavior in handle_rt_signal32, to obey kernel endianness
rather than assume a 32-bit process is big-endian. Without this change,
any 32-bit little-endian process will SIGILL immediately upon handling a
signal.

Signed-off-by: Joseph J Allen 
Signed-off-by: Will Springer 
---
 arch/powerpc/kernel/signal_32.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index 934cbdf6dd10..75ee918a120a 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -929,8 +929,9 @@ int handle_signal32(struct ksignal *ksig, sigset_t *oldset,
regs->gpr[3] = ksig->sig;
regs->gpr[4] = (unsigned long) sc;
regs->nip = (unsigned long)ksig->ka.sa.sa_handler;
-   /* enter the signal handler in big-endian mode */
+   /* enter the signal handler in native-endian mode */
regs->msr &= ~MSR_LE;
+   regs->msr |= (MSR_KERNEL & MSR_LE);
return 0;
 
 failed:
-- 
2.29.2







[PATCH 0/2] powerpc: fixes for 32-bit little-endian processes

2021-01-02 Thread Will Springer
These are a couple small fixes that enable 32-bit little endian ("ppcle")
processes to run on a ppc64le kernel. Currently this is of interest for
the purposes of emulating ia32 programs with native userland assistance
via box86[1] (see PR#279 for initial ppc support), but a standalone
userland is functional, and may be used to complement a future ppcle
kernel port. We (those of us working on the userland effort in the
void-ppc project[2]) hope to come up with an ABI proposal to submit to
the libc projects as a new port.

Cheers to Christophe Leroy and Michael Ellerman for converting the ppc
vDSO to C, and Michael in particular for tracking down a small issue
with it on ppcle, meaning the 32-bit LE vDSO gets to be functional
instead of half-broken with the old asm. (Sorry it took a minute to push
these patches, protonmail would not cooperate with git-send-email and then
I took off for the holidays.)

Cheers,
Will Springer [she/her]

[1]: https://github.com/ptitSeb/box86
[2]: https://voidlinux-ppc.org/

Joseph J Allen (1):
  powerpc: use kernel endianness in MSR in 32-bit signal handler

Will Springer (1):
  powerpc/compat_sys: swap hi/lo parts of 64-bit syscall args in LE mode

 arch/powerpc/kernel/signal_32.c |  3 +-
 arch/powerpc/kernel/sys_ppc32.c | 49 +++--
 2 files changed, 30 insertions(+), 22 deletions(-)

-- 
2.29.2







Re: CONFIG_PPC_VAS depends on 64k pages...?

2020-12-02 Thread Will Springer
On Tuesday, December 1, 2020 5:16:51 AM PST Bulent Abali wrote:
> I don't know anything about VAS page size requirements in the kernel.  I
> checked the user compression library and saw that we do a sysconf to
> get the page size; so the library should be immune to page size by
> design. But it wouldn't surprise me if a 64KB constant is inadvertently
> hardcoded somewhere else in the library.  Giving heads up to Tulio and
> Raphael who are owners of the github repo.
> 
> https://github.com/libnxz/power-gzip/blob/master/lib/nx_zlib.c#L922
> 
> If we got this wrong in the library it might manifest itself as an error
> message of the sort "excessive page faults".  The library must touch
> pages ahead to make them present in the memory; occasional page faults
> is acceptable. It will retry.

Hm, good to know. As I said I haven't noticed any problems so far, over a 
few different days of testing. My change is now in the Void Linux kernel 
package, and is working for others as well (including the Void maintainer 
Daniel/q66 who I CC'd initially).
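
For reference, the page-size-agnostic pattern Bulent describes (query the page size at run time, then touch pages ahead of use) might look roughly like the sketch below. The helper names are made up for illustration and this is not the libnxz code; it only assumes POSIX sysconf().

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Ask the system for the page size instead of hardcoding 4k or 64k. */
static size_t runtime_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	assert(sz > 0);
	return (size_t)sz;
}

/*
 * Hypothetical helper: touch buf[0..len) at page-size strides so the
 * pages are present before the buffer is handed to an accelerator.
 * The last byte is touched as well, in case buf is not page-aligned
 * and the range spills into one more page.
 * Returns the number of stride touches, i.e. ceil(len / page size).
 */
static size_t touch_pages(volatile unsigned char *buf, size_t len)
{
	size_t page = runtime_page_size();
	size_t touched = 0;

	for (size_t off = 0; off < len; off += page) {
		buf[off] = buf[off];	/* read + write faults the page in */
		touched++;
	}
	if (len > 0)
		buf[len - 1] = buf[len - 1];
	return touched;
}
```

Nothing here depends on the page being 4k or 64k, which is the property the library would need in order to be unaffected by the kernel page size.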

> 
> Bulent
> 
> 
> 
> 
> From:    "Sukadev Bhattiprolu"
> To:      "Christophe Leroy"
> Cc:      "Will Springer", linuxppc-dev@lists.ozlabs.org,
>          dan...@octaforge.org, Bulent Abali/Watson/IBM@IBM,
>          ha...@linux.ibm.com
> Date:    12/01/2020 12:53 AM
> Subject: Re: CONFIG_PPC_VAS depends on 64k pages...?
> 
> Christophe Leroy [christophe.le...@csgroup.eu] wrote:
> > Hi,
> > 
> > Le 19/11/2020 à 11:58, Will Springer a écrit :
> > > I learned about the POWER9 gzip accelerator a few months ago when the
> > > support hit upstream Linux 5.8. However, for some reason the Kconfig
> > > dictates that VAS depends on a 64k page size, which is problematic as I
> > > run Void Linux, which uses a 4k-page kernel.
> > > 
> > > Some early poking by others indicated there wasn't an obvious page size
> > > dependency in the code, and suggested I try modifying the config to
> > > switch it on. I did so, but was stopped by a minor complaint of an
> > > "unexpected DT configuration" by the VAS code. I wasn't equipped to
> > > figure out exactly what this meant, even after finding the
> > > offending condition, so after writing a very drawn-out forum post
> > > asking for help, I dropped the subject.
> > > 
> > > Fast forward to today, when I was reminded of the whole thing again,
> > > and decided to debug a bit further. Apparently the VAS platform
> > > device (derived from the DT node) has 5 resources on my 4k kernel,
> > > instead of 4 (which evidently works for others who have had success
> > > on 64k kernels). I have no idea what this means in practice (I
> > > don't know how to introspect it), but after making a tiny patch[1],
> > > everything came up smoothly and I was doing blazing-fast gzip
> > > (de)compression in no time.
> > > 
> > > Everything seems to work fine on 4k pages. So, what's up? Are there
> > > pitfalls lurking around that I've yet to stumble over? More reasonably,
> > > I'm curious as to why the feature supposedly depends on 64k pages,
> > > or if there's anything else I should be concerned about.
> 
> Will,
> 
> The reason I put in that config check is because we were only able to
> test 64K pages at that point.
> 
> It is interesting that it is working for you. Following code in skiboot
> https://github.com/open-power/skiboot/blob/master/hw/vas.c should
> restrict it to 64K pages. IIRC there is also a corresponding change in
> some NX registers that should also be configured to allow 4K pages. 

Huh, that is interesting indeed. As far as the kernel code goes, the only 
thing specific to 64k pages I could find was in [1], where 
VAS_XLATE_LPCR_PAGE_SIZE is set. There is also NX_PAGE_SIZE in 
drivers/crypto/nx/nx.h, which is set to 4096, but I don't know if that's 
related to kernel page size at all. Without a better idea of the code base, 
I didn't examine it more thoroughly.

[1]: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/powerpc/platforms/powernv/vas-window.c#n293

> static int init_north_ctl(struct proc_chip *chip)
> {
>  uint64_t val = 0ULL;
> 
>  val = SETFIELD(VAS_64K_MODE_MASK, val, true);
>  val = SETFIELD(VAS_ACCEPT_PASTE_MASK, val, true);
>  val = SETFIELD(VAS_ENABLE_WC_MMIO_BAR, val, true);
>  val = SETFIELD(VAS_ENABLE_UWC_MMIO_BAR, val, true);
>  val =

CONFIG_PPC_VAS depends on 64k pages...?

2020-11-19 Thread Will Springer
I learned about the POWER9 gzip accelerator a few months ago when the 
support hit upstream Linux 5.8. However, for some reason the Kconfig 
dictates that VAS depends on a 64k page size, which is problematic as I 
run Void Linux, which uses a 4k-page kernel.

Some early poking by others indicated there wasn't an obvious page size 
dependency in the code, and suggested I try modifying the config to switch 
it on. I did so, but was stopped by a minor complaint of an "unexpected DT 
configuration" by the VAS code. I wasn't equipped to figure out exactly what 
this meant, even after finding the offending condition, so after writing a 
very drawn-out forum post asking for help, I dropped the subject.

Fast forward to today, when I was reminded of the whole thing again, and 
decided to debug a bit further. Apparently the VAS platform device 
(derived from the DT node) has 5 resources on my 4k kernel, instead of 4 
(which evidently works for others who have had success on 64k kernels). I 
have no idea what this means in practice (I don't know how to introspect 
it), but after making a tiny patch[1], everything came up smoothly and I 
was doing blazing-fast gzip (de)compression in no time.

Everything seems to work fine on 4k pages. So, what's up? Are there 
pitfalls lurking around that I've yet to stumble over? More reasonably, 
I'm curious as to why the feature supposedly depends on 64k pages, or if 
there's anything else I should be concerned about.

I do have to say I'm quite satisfied with the results of the NX 
accelerator, though. Being able to shuffle data to a RaptorCS box over gigE 
and get compressed data back faster than most software gzip could ever
hope to achieve is no small feat, let alone the instantaneous results locally.
:)

Cheers,
Will Springer [she/her]

[1]: 
https://github.com/Skirmisher/void-packages/blob/vas-4k-pages/srcpkgs/linux5.9/patches/ppc-vas-on-4k.patch





Re: [musl] ppc64le and 32-bit LE userland compatibility

2020-06-09 Thread Will Springer
On Saturday, May 30, 2020 3:56:47 PM PDT you wrote:
> On Friday, May 29, 2020 12:24:27 PM PDT Rich Felker wrote:
> > The argument passing for pread/pwrite is historically a mess and
> > differs between archs. musl has a dedicated macro that archs can
> > define to override it. But it looks like it should match regardless of
> > BE vs LE, and musl already defines it for powerpc with the default
> > definition, adding a zero arg to start on an even arg-slot index,
> > which is an odd register (since ppc32 args start with an odd one, r3).
> > 
> > > [6]:
> > > https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c
> > 
> > I don't think this is correct, but I'm confused about where it's
> > getting messed up because it looks like it should already be right.
> 
> Hmm, interesting. Will have to go back to it I guess...
> 
> > > This was enough to fix up the `file` bug. I'm no seasoned kernel
> > > hacker, though, and there is still concern over the right way to
> > > approach this, whether it should live in the kernel or libc, etc.
> > > Frankly, I don't know the ABI structure enough to understand why the
> > > register padding has to be different in this case, or what
> > > lower-level component is responsible for it. For comparison, I had
> > > a look at the mips tree, since it's bi-endian and has a similar 32/64
> > > situation. There is a macro conditional upon endianness that is
> > > responsible for munging long longs; it uses __MIPSEB__ and
> > > __MIPSEL__ instead of an if/else on the generic __LITTLE_ENDIAN__.
> > > Not sure what to make of that. (It also simply swaps registers for
> > > LE, unlike what I did for ppc.)
> > 
> > Indeed the problem is probably that you need to swap registers for LE,
> > not remove the padding slot. Did you check what happens if you pass a
> > value larger than 32 bits?
> > 
> > If so, the right way to fix this on the kernel side would be to
> > construct the value as a union rather than by bitwise ops so it's
> > endian-agnostic:
> > 
> >   (union { u32 parts[2]; u64 val; }){{ arg1, arg2 }}.val
> > 
> > But the kernel folks might prefer endian ifdefs for some odd reason...
> 
> You are right, this does seem odd considering what the other archs do.
> It's quite possible I made a silly mistake, of course...
> 
> I haven't tested with values outside the 32-bit range yet; again, this
> is new territory for me, so I haven't exactly done exhaustive tests on
> everything. I'll give it a closer look.

I took some cues from the mips linux32 syscall setup, and drafted a new 
patch defining a macro to compose the hi/lo parts within the function, 
instead of swapping the args at the function definition. `file /bin/bash` 
and `truncate -s 5G test` both work correctly now. This appears to be the 
correct solution, so I'm not sure what silly mistake I made before, but 
apologies for the confusion. I've updated my gist with the new patch [1].
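
To make the hi/lo ordering concrete, here is a small userspace sketch of the composition, checked against a value outside the 32-bit range (5G, as in the truncate test). The function names and the explicit little_endian flag are purely illustrative; the kernel macro picks the order at compile time.

```c
#include <assert.h>
#include <stdint.h>

/*
 * What a 32-bit caller places in the register pair for a 64-bit value:
 * r_first is the lower-numbered register, r_second the next one.
 * The halves follow memory order, so LE puts the low half first.
 */
static void split_pair(uint64_t val, int little_endian,
		       uint32_t *r_first, uint32_t *r_second)
{
	uint32_t hi = (uint32_t)(val >> 32);
	uint32_t lo = (uint32_t)val;

	assert(r_first && r_second);
	*r_first = little_endian ? lo : hi;
	*r_second = little_endian ? hi : lo;
}

/* Recompose the 64-bit value on the receiving side. */
static uint64_t merge_pair(uint32_t r_first, uint32_t r_second,
			   int little_endian)
{
	uint64_t hi = little_endian ? r_second : r_first;
	uint64_t lo = little_endian ? r_first : r_second;

	return (hi << 32) | lo;
}
```

For 5G (0x140000000) an LE caller passes 0x40000000 and then 0x1. Merging with the LE order gives 5G back; merging with the BE order gives 0x4000000000000001, which is the kind of garbage `pos` that pread64 was rejecting.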

> > > Also worth noting is the one other outstanding bug, where the
> > > time-related syscalls in the 32-bit vDSO seem to return garbage. It
> > > doesn't look like an endian bug to me, and it doesn't affect
> > > standard syscalls (which is why if you run `date` on musl it prints
> > > the correct time, unlike on glibc). The vDSO time functions are
> > > implemented in ppc asm (arch/powerpc/kernel/vdso32/gettimeofday.S),
> > > and I've never touched the stuff, so if anyone has a clue I'm all
> > > ears.
> > 
> > Not sure about this. Worst-case, just leave it disabled until someone
> > finds a fix.
> 
> Apparently these asm implementations are being replaced by the generic C
> ones [1], so it may be this fixes itself on its own.
> 
> Thanks,
> Will [she/her]
> 
> [1]:
> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=173231

As I mentioned in Christophe's thread the other day, his patchset does 
solve the vdso32 issues, though it introduced problems in vdso64 in my 
testing. With that solved and the syscall situation established, I think 
the kernel state is stable enough to start looking at solidifying 
libc/compiler stuff. I'll try to get a larger userland built in the near 
future to try to catch any remaining problems (before rebuilding it all 
when libc/ABI support becomes explicit).

Cheers,
Will [she/her]

[1]: https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c






Re: ppc64le and 32-bit LE userland compatibility

2020-06-05 Thread Will Springer
On Saturday, May 30, 2020 3:17:24 PM PDT Will Springer wrote:
> On Saturday, May 30, 2020 8:37:43 AM PDT Christophe Leroy wrote:
> > There is a series at
> > https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=173231
> > to switch powerpc to the Generic C VDSO.
> > 
> > Can you try and see whether it fixes your issue ?
> > 
> > Christophe
> 
> Sure thing, I spotted that after making the initial post. Will report
> back with results.
> 
> Will [she/her]

Sorry for the wait, I just sat down to work on this again yesterday.

Tested this series on top of stable/linux-5.7.y (5.7.0 at the time of 
writing), plus the one-line signal handler patch. Had to rewind to the 
state of powerpc/merge at the time of the mail before the patch would 
apply, then cherry-picked to 5.6 until I realized the patchset used some 
functionality that didn't land until 5.7, so I moved it there.

Good news is that `date` now works correctly with the vdso call in 32-bit 
LE. Bad news is it seems to have broken things on the 64-bit side—in my 
testing, Void kicks off runit but hangs after starting eudev, and in a 
Debian Stretch system, systemd doesn't get to the point of printing 
anything whatsoever. (I had to `init=/bin/sh` to confirm the date worked 
in ppcle, although in ppc64le running `date` also hung the system when it 
made the vdso call...) Not sure how to approach debugging that, so I'd 
appreciate any pointers.

Will [she/her]





Re: [musl] ppc64le and 32-bit LE userland compatibility

2020-05-30 Thread Will Springer
On Friday, May 29, 2020 12:24:27 PM PDT Rich Felker wrote:
> The argument passing for pread/pwrite is historically a mess and
> differs between archs. musl has a dedicated macro that archs can
> define to override it. But it looks like it should match regardless of
> BE vs LE, and musl already defines it for powerpc with the default
> definition, adding a zero arg to start on an even arg-slot index,
> which is an odd register (since ppc32 args start with an odd one, r3).
> 
> > [6]:
> > https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c
> I don't think this is correct, but I'm confused about where it's
> getting messed up because it looks like it should already be right.

Hmm, interesting. Will have to go back to it I guess...

> > This was enough to fix up the `file` bug. I'm no seasoned kernel
> > hacker, though, and there is still concern over the right way to
> > approach this, whether it should live in the kernel or libc, etc.
> > Frankly, I don't know the ABI structure enough to understand why the
> > register padding has to be different in this case, or what
> > lower-level component is responsible for it. For comparison, I had a
> > look at the mips tree, since it's bi-endian and has a similar 32/64
> > situation. There is a macro conditional upon endianness that is
> > responsible for munging long longs; it uses __MIPSEB__ and __MIPSEL__
> > instead of an if/else on the generic __LITTLE_ENDIAN__. Not sure what
> > to make of that. (It also simply swaps registers for LE, unlike what
> > I did for ppc.)
> Indeed the problem is probably that you need to swap registers for LE,
> not remove the padding slot. Did you check what happens if you pass a
> value larger than 32 bits?
> 
> If so, the right way to fix this on the kernel side would be to
> construct the value as a union rather than by bitwise ops so it's
> endian-agnostic:
> 
>   (union { u32 parts[2]; u64 val; }){{ arg1, arg2 }}.val
> 
> But the kernel folks might prefer endian ifdefs for some odd reason...

You are right, this does seem odd considering what the other archs do. 
It's quite possible I made a silly mistake, of course...

I haven't tested with values outside the 32-bit range yet; again, this is 
new territory for me, so I haven't exactly done exhaustive tests on 
everything. I'll give it a closer look.
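
For my own notes, the union construction Rich suggests can be tried out in userspace with something like the sketch below. The function names are mine, and it assumes the caller and the code doing the merge share the same endianness (as a 32-bit LE process on a ppc64le kernel would).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t),
	      "the register pair must cover the 64-bit value exactly");

/*
 * Split a 64-bit value into two 32-bit parts in memory order, which is
 * how the ABI is said to lay a long long across a register pair.
 */
static void split_memory_order(uint64_t val, uint32_t parts[2])
{
	memcpy(parts, &val, sizeof(val));
}

/*
 * Rebuild the value by overlaying the two parts on a u64. No shifts
 * and no endian ifdefs: the union puts first/second back in memory
 * order, whatever that order is on this machine.
 */
static uint64_t merge_union(uint32_t first, uint32_t second)
{
	union { uint32_t parts[2]; uint64_t val; } u = { { first, second } };

	return u.val;
}
```

Round-tripping a value above 4G through split_memory_order() and merge_union() should give the same value back on both BE and LE hosts, which is the check I still need to do for the real syscall path.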

> > Also worth noting is the one other outstanding bug, where the
> > time-related syscalls in the 32-bit vDSO seem to return garbage. It
> > doesn't look like an endian bug to me, and it doesn't affect standard
> > syscalls (which is why if you run `date` on musl it prints the
> > correct time, unlike on glibc). The vDSO time functions are
> > implemented in ppc asm (arch/powerpc/kernel/vdso32/gettimeofday.S),
> > and I've never touched the stuff, so if anyone has a clue I'm all
> > ears.
> Not sure about this. Worst-case, just leave it disabled until someone
> finds a fix.

Apparently these asm implementations are being replaced by the generic C 
ones [1], so it may be this fixes itself on its own.

Thanks,
Will [she/her]

[1]: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=173231








Re: ppc64le and 32-bit LE userland compatibility

2020-05-30 Thread Will Springer
On Saturday, May 30, 2020 12:22:12 PM PDT Segher Boessenkool wrote:
> The original sysv PowerPC supplement
> http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf
> supports LE as well, and most powerpcle ports use that.  But, the
> big-endian Linux ABI differs in quite a few places, and it of course
> makes a lot better sense if powerpcle-linux follows that.

Right, I should have clarified I was talking about Linux ABIs 
specifically.

> What patches did you need?  I regularly build >30 cross compilers (on
> both BE and LE hosts; I haven't used 32-bit hosts for a long time, but
> in the past those worked fine as well).  I also cross-built
> powerpcle-linux-gcc quite a few times (from powerpc64le, from powerpc64,
> from various x86).

There was just an assumption that LE == powerpc64le in libgo, spotted by 
q66 (daniel@ on the CC). I just pushed the patch to [1].

> Almost no project that used 32-bit PowerPC in LE mode has sent patches
> to the upstreams.

Right, but I have heard concerns from at least one person familiar with 
the ppc kernel about breaking existing users of this arch-endianness 
combo, if any. It seems likely that none of those use upstream, though ^^;

> The ABI says long longs are passed in the same order in registers as it
> would be in memory; so the high part and the low part are swapped
> between BE and LE.  Which registers make up a pair is exactly the same
> between the two.  (You can verify this with an existing powerpcle-*
> compiler, too; I did, and we implement it correctly as far as I can
> see).

I'll give it a closer look. This is my first time poking at this sort of 
thing in depth, so excuse my unfamiliarity!

> A huge factor in having good GCC support for powerpcle-linux (or
> anything else) is someone needs to regularly test it, and share test
> results with us (via gcc-testresults@).  Hint hint hint :-)
> 
> That way we know it is in good shape, know when we are regressing it,
> know there is interest in it.

Once I have more of a bootstrapped userland than a barely-functional 
cross chroot, I'll get back to you on that :)
 
> gl;hf,
> 
> 
> Segher

Thanks,
Will [she/her]

[1]: 
https://github.com/Skirmisher/void-packages/blob/master/srcpkgs/gcc/patches/libgo-ppcle.patch






Re: ppc64le and 32-bit LE userland compatibility

2020-05-30 Thread Will Springer
On Saturday, May 30, 2020 8:37:43 AM PDT Christophe Leroy wrote:
> There is a series at
> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=173231 to
> switch powerpc to the Generic C VDSO.
> 
> Can you try and see whether it fixes your issue ?
> 
> Christophe

Sure thing, I spotted that after making the initial post. Will report back 
with results.

Will [she/her]






ppc64le and 32-bit LE userland compatibility

2020-05-29 Thread Will Springer
Hey all, a couple of us over in #talos-workstation on freenode have been
working on an effort to bring up a Linux PowerPC userland that runs in 32-bit
little-endian mode, aka ppcle. As far as we can tell, no ABI has ever been
designated for this (unless you count the patchset from a decade ago [1]), so
it's pretty much uncharted territory as far as Linux is concerned. We want to
sync up with libc and the relevant kernel folks to establish the best path
forward.

The practical application that drove these early developments (as you might
expect) is x86 emulation. The box86 project [2] implements a translation layer
for ia32 library calls to native architecture ones; this way, emulation
overhead is significantly reduced by relying on native libraries where
possible (libc, libGL, etc.) instead of emulating an entire x86 userspace.
box86 is primarily targeted at ARM, but it can be adapted to other
architectures—so long as they match ia32's 32-bit, little-endian nature. Hence
the need for a ppcle userland; modern POWER brought ppc64le as a supported
configuration, but without a 32-bit equivalent there is no option for a 32/64
multilib environment, as seen with ppc/ppc64 and arm/aarch64.

Surprisingly, beyond minor patching of gcc to get crosscompile going,
bootstrapping the initial userland was not much of a problem. The work has
been done on top of the Void Linux PowerPC project [3], and much of that is
now present in its source package tree [4].

The first issue with running the userland came from the ppc32 signal handler 
forcing BE in the MSR, causing any 32LE process receiving a signal (such as a 
shell receiving SIGCHLD) to terminate with SIGILL. This was trivially patched, 
along with enabling the 32-bit vDSO on ppc64le kernels [5]. (Given that this 
behavior has been in place since 2006, I don't think anyone has been using the 
kernel in this state to run ppcle userlands.)
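
As a rough illustration of what the signal handler change amounts to, here is the bit manipulation in isolation. MSR_LE is the low bit of the MSR; the two MSR_KERNEL_* values are stand-ins I made up for the sketch (the real MSR_KERNEL mask carries other bits too), and handler_msr is not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_LE		0x1ULL	/* little-endian mode bit */
#define MSR_KERNEL_BE	0x0ULL	/* stand-in: kernel mask on a BE kernel */
#define MSR_KERNEL_LE	MSR_LE	/* stand-in: kernel mask on an LE kernel */

static_assert(MSR_LE == 1ULL, "MSR_LE is the least significant MSR bit");

/* Compute the MSR used when entering a 32-bit signal handler. */
static uint64_t handler_msr(uint64_t msr, uint64_t msr_kernel)
{
	msr &= ~MSR_LE;			/* the old code stopped here: always BE */
	msr |= (msr_kernel & MSR_LE);	/* follow the kernel's endianness */
	return msr;
}
```

With the BE stand-in the result always has MSR_LE clear, as before; with the LE stand-in the bit is set again, so the handler's first instruction is fetched in the right byte order instead of raising SIGILL.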

The next problem concerns the ABI more directly. The failure mode was `file`
surfacing EINVAL from pread64 when invoked on an ELF; pread64 was passed a
garbage value for `pos`, which didn't appear to be caused by anything in 
`file`. Initially it seemed as though the 32-bit components of the arg were
getting swapped, and we made hacky fixes to glibc and musl to put them in the
"right order"; however, we weren't sure if that was the correct approach, or
if there were knock-on effects we didn't know about. So we found the relevant
compat code path in the kernel, at arch/powerpc/kernel/sys_ppc32.c, where
there exists this comment:

> /*
>  * long long munging:
>  * The 32 bit ABI passes long longs in an odd even register pair.
>  */

It seems that the opposite is true in LE mode, and something is expecting long
longs to start on an even register. I realized this after I tried swapping
hi/lo `u32`s here and didn't see an improvement. I whipped up a patch [6] that
switches which syscalls use padding arguments depending on endianness, while
hopefully remaining tidy enough to be unobtrusive. (I took some liberties with
variable names/types so that the macro could be consistent.)

This was enough to fix up the `file` bug. I'm no seasoned kernel hacker,
though, and there is still concern over the right way to approach this,
whether it should live in the kernel or libc, etc. Frankly, I don't know the
ABI structure enough to understand why the register padding has to be
different in this case, or what lower-level component is responsible for it. 
For comparison, I had a look at the mips tree, since it's bi-endian and has a 
similar 32/64 situation. There is a macro conditional upon endianness that is 
responsible for munging long longs; it uses __MIPSEB__ and __MIPSEL__ instead 
of an if/else on the generic __LITTLE_ENDIAN__. Not sure what to make of that. 
(It also simply swaps registers for LE, unlike what I did for ppc.)

Also worth noting is the one other outstanding bug, where the time-related
syscalls in the 32-bit vDSO seem to return garbage. It doesn't look like an
endian bug to me, and it doesn't affect standard syscalls (which is why if you
run `date` on musl it prints the correct time, unlike on glibc). The vDSO time
functions are implemented in ppc asm
(arch/powerpc/kernel/vdso32/gettimeofday.S), and I've never touched the stuff,
so if anyone has a clue I'm all ears.

Again, I'd appreciate feedback on the approach to take here, in order to 
touch/special-case only the minimum necessary, while keeping the kernel/libc 
folks happy.

Cheers,
Will [she/her]

(p.s. there is ancillary interest in a ppcle-native kernel as well; that's a 
good deal more work and not the focus of this message at all, but it is a 
topic of interest)

[1]: https://lwn.net/Articles/408845/
[2]: https://github.com/ptitSeb/box86
[3]: https://voidlinux-ppc.org/
[4]: https://github.com/void-ppc/void-packages
[5]: https://gist.github.com/eerykitty/01707dc6bca2be32b4c5e30d15d15dcf
[6]: https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c