On Wed, Aug 20, 2014 at 8:30 PM, Matt Thomas <[email protected]> wrote:

>
> Recently, I started making NetBSD support OpenRisc.
>
>
Great!


> I'm using binutils from the top of the tree and GCC 4.9 for my toolchain.
>
> I looked at using llvm-openrisc but NetBSD's LLVM is 3.6 while llvm-openrisc
> is 3.1.  Since my expertise with toolchains is more gcc centric, I went that
> way.
>
>
https://github.com/openrisc/llvm-or1k is actually at 3.5, and these guys have
even more recent versions:
http://compilergroup-srv.elet.polimi.it/pulp/git/pulp-public
(Their git server seems a bit unreliable though; I have never been able to
pull anything from it.)


> So I'm wondering what ISA features I can count on.  Are ORBIS32 II
> instructions widely implemented? floating point?
>
>
Strictly, you can't count on any of the ORBIS32 II instructions being
implemented, but in practice, 'all' implementations support mul, div and
ff1/fl1.
The FPU is not that commonly implemented or used, but I doubt there is code
in the kernel that depends on the FPU?
I also doubt there is much user space FPU code written in asm?


> I was deciding on whether to just support the no-delay version of the
> ISA.  I found that PIC code and -mno-delay seem to be incompatible at the
> moment.
>

Delay-slot implementations are still dominant, especially if you want
an MMU.
I have a long-term plan to do a delay-slot-less version of
mor1kx-cappuccino (https://github.com/openrisc/mor1kx),
and there's also this: https://github.com/pgavin/carpe
I would suggest making the code delay-slot agnostic instead of
choosing one path.


> The problem is computing the GOT pointer doesn't take into account
> -mno-delay
> or -mcompat-delay.  It's always emitted as:
>
>        l.jal           8
>        l.movhi         r16,gotpchi(_GLOBAL_OFFSET_TABLE_-4)
>        l.ori           r16,r16,gotpclo(_GLOBAL_OFFSET_TABLE_+0)
>        l.add           r16,r16,r9
>
> The problem is for no-delay the l.jal should have an argument of 4 or the
> l.movhi will never be executed since it was branched over.  I think for
> -mno-delay or -mcompat-delay it should be:
>
>        l.jal           4
>        l.movhi         r16,gotpchi(_GLOBAL_OFFSET_TABLE_+0)
>        l.ori           r16,r16,gotpclo(_GLOBAL_OFFSET_TABLE_+4)
>        l.add           r16,r16,r9
>
>
Yes, this is a deficiency in the gcc implementation (llvm actually
handles this correctly).
That said, you will never be able to make the GOT pointer acquisition
work with -mcompat-delay, since the
l.jal           4
will be wrong for implementations that feature a delay slot.
IOW, PIC code will always need to choose between delay and no-delay.
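
To make the arithmetic concrete, here is a toy Python model of the four
instructions above. It is only a sketch, not a real OR1K simulator. It
assumes that l.jal links to the instruction after the delay slot on
delay-slot cores and to the next instruction on no-delay cores, that the
l.jal operand is a byte offset from the l.jal itself, and that
gotpchi/gotpclo are computed relative to the instruction that carries them
(inferred from the addends in the listings). The addresses GOT and BASE and
the helper got_pointer are made up for the example.

```python
# Toy model of the GOT-pointer sequence; all addresses are hypothetical.
GOT = 0x00042000    # assumed address of _GLOBAL_OFFSET_TABLE_
BASE = 0x00001000   # assumed address of the l.jal instruction
MASK = 0xFFFFFFFF


def got_pointer(jal_offset, hi_addend, lo_addend, delay_slot):
    """Return the final r16, or None if l.movhi is branched over."""
    movhi_pc = BASE + 4
    ori_pc = BASE + 8
    target = BASE + jal_offset

    # Assumed link semantics: skip the delay slot if the core has one.
    r9 = BASE + 8 if delay_slot else BASE + 4

    # l.movhi runs as the delay slot, or if the jump lands on it.
    movhi_runs = delay_slot or target <= movhi_pc
    if not movhi_runs:
        return None

    # pc-relative relocations, each relative to its own instruction.
    hi = ((GOT + hi_addend - movhi_pc) & MASK) >> 16   # l.movhi immediate
    lo = (GOT + lo_addend - ori_pc) & 0xFFFF           # l.ori immediate
    r16 = (hi << 16) | lo                              # l.movhi + l.ori
    return (r16 + r9) & MASK                           # l.add r16,r16,r9


# Current gcc output on a delay-slot core: correct.
print(got_pointer(8, -4, 0, delay_slot=True) == GOT)
# Current gcc output on a no-delay core: l.movhi is skipped.
print(got_pointer(8, -4, 0, delay_slot=False))
# Proposed no-delay sequence on a no-delay core: correct.
print(got_pointer(4, 0, 4, delay_slot=False) == GOT)
# Proposed sequence on a delay-slot core: off by 4.
print(hex(got_pointer(4, 0, 4, delay_slot=True) - GOT))
```

Under these assumptions the last case is the -mcompat-delay problem: the
same bytes cannot yield the right r9 on both kinds of core.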


> I notice that r16 is being used as the GOT pointer and r10 as the thread
> pointer though they aren't documented as such in the OpenRISC 1.1 Architecture.
>
>
Yes, we should update the arch spec ABI section with this...


> I was surprised to see that patterns for ffssi2, ctzsi2, and clzsi2 aren't
> present for gcc given the l.ff1 and l.fl1 instructions.
>
>
True, l.ff1 and l.fl1 are optional, so I guess that's why those patterns
haven't been implemented.
I might take a look at adding support for that at some point, if someone
doesn't beat me to it.
llvm can make use of these, though.
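
As a rough illustration of why the mapping is straightforward, here is a
small Python sketch of how ffs, ctz and clz could be expressed in terms of
l.ff1 and l.fl1. It assumes the semantics in the architecture manual: both
instructions return a 1-based bit position counted from the LSB, and 0 when
no bit is set. The function names are just for this example and do not
correspond to anything in gcc.

```python
MASK = 0xFFFFFFFF


def ff1(x):
    """Model of l.ff1: 1-based position of the lowest set bit, 0 if none."""
    x &= MASK
    return (x & -x).bit_length()


def fl1(x):
    """Model of l.fl1: 1-based position of the highest set bit, 0 if none."""
    return (x & MASK).bit_length()


def ffs(x):
    # ffs has the same 1-based convention as l.ff1, including ffs(0) == 0.
    return ff1(x)


def ctz(x):
    # Only meaningful for x != 0, like __builtin_ctz.
    return ff1(x) - 1


def clz(x):
    # Gives 32 for x == 0; __builtin_clz leaves that case undefined.
    return 32 - fl1(x)


print(ffs(8), ctz(8), clz(1))
```

So each pattern would need at most one extra add or subtract after the
l.ff1/l.fl1 itself.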


> Looking at the emitted gcc code, I see:
>
>        l.addi  r1,r1,16
>        l.lwz           r9,-4(r1)        # SI load
>        l.lwz           r1,-16(r1)       # SI load
>
> The load of r1 after the l.addi serves no useful purpose.
>
>
It's a known issue... and it's on my todo list to fix as soon as
I'm done with what I'm currently working on.
The reason that code is emitted is to work around a dwarf2 issue that
Christian Svensson noticed.
If you want to take a look at it yourself, this is the code that makes it
happen:
https://github.com/openrisc/or1k-gcc/blob/or1k/gcc/config/or1k/or1k.c#L132-L136

Stefan
_______________________________________________
OpenRISC mailing list
[email protected]
http://lists.openrisc.net/listinfo/openrisc