On Sat, Apr 21, 2012 at 9:42 PM, Jeremy Bennett
<[email protected]> wrote:
> On Sat, 2012-04-21 at 17:28 +0100, Julius Baxter wrote:
>> On Tue, Aug 23, 2011 at 2:24 PM, Jeremy Bennett
>> <[email protected]> wrote:
>> > On Tue, 2011-08-23 at 14:01 +0100, Julius Baxter wrote:
>> >> Just to comment on some instructions.
>> >>
>> >> > For reference, the ORBIS32 class II instructions are:
>> >> > l.cmov
>> >>
>> >> Is this ever used by anyone? It's not emitted by the compiler as far
>> >> as I can tell.
>> >>
>> >> > l.csync
>> >> > l.cust1-8
>> >> > l.div*
>> >> > l.ext[bhw]*
>> >> > l.ff1
>> >> > l.fl1
>> >> > l.mac*,l.msb
>> >> > l.mul*
>> >> > l.psync
>> >> > l.ror,l.rori
>> >> > l.sf*i (l.sfeqi,l.sfgesi etc.)
>> >>
>> >> These should probably not be class II. They are emitted by default
>> >> from the compiler I believe.
>> >>
>> >> > l.trap
>> >
>> > Absolutely. GDB won't work without it.
>> >
>>
>> Better late than never.
>>
>> A proposed updated scheme is outlined here:
>>
>> http://opencores.org/or1k/Architecture_Specification#Instruction_Classes
>>
>> Taken from the page:
>>
>> Proposed ORBIS Classifications
>>
>> Class I should remain mandatory to implement.
>>
>> A new classification is proposed:
>> * Class II - Optional Maths: l.div*, l.mul*
>> * Class III - Optional Bit Manipulation: l.ext[bwh]*, l.ff1, l.fl1,
>> l.ror, l.rori
>> * Class IV - MAC Instructions - l.mac*, l.msb
>> * Class V - Remaining Optional Instructions: l.cmov, l.csync, l.msync,
>> l.psync, l.cust1-8, l.trap
>
> Hi Julius,
>
> It's a logical structure, except l.trap must be in class I for GDB.

I guess so, but what about a tiny OR1K implementation without a debug
unit (so no ability to debug software, one of the major uses of the
trap instruction) and without an operating system making use of
l.trap? I'd like to say this is a valid configuration. GCC won't emit
l.trap either.

If we could tie instructions to their units (like l.mac instructions
to the MAC unit, floating point to FPU etc.) I'd say l.trap is likely
to be needed really only when you have a debug unit. Perhaps this is
wrong in the case of an OR1K CPU running Linux (which I believe uses
l.trap instructions for userspace debugging?) but you're more than
likely to have a debug unit in that implementation anyway.

It's a very marginal gain in terms of implementation size, so maybe
it's not worth it, but I still think it should be an optional
instruction, basically depending on whether you have the debug unit
implemented, and so should be class V.
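
To make the "tie instructions to their units" idea concrete, here is a
rough sketch in Python. The unit names and the table are hypothetical
(nothing like this exists in the spec or the toolchain); the
instruction groups are just the ones mentioned in this thread.

```python
# Hypothetical mapping from optional hardware units to the instructions
# that only make sense when that unit is present. The unit names are
# illustrative; the instruction groups are the ones discussed in this thread.
UNIT_INSTRUCTIONS = {
    "debug_unit": ["l.trap"],
    "mac_unit": ["l.mac*", "l.msb"],
}


def optional_instructions(units):
    """List the unit-tied instructions for a set of implemented units."""
    insns = []
    for unit in sorted(units):  # sorted so the result order is stable
        insns.extend(UNIT_INSTRUCTIONS.get(unit, []))
    return insns


# A tiny core with no optional units needs none of these instructions.
tiny_core = optional_instructions(set())
# A larger core with both units picks up l.trap and the MAC group.
bigger_core = optional_instructions({"debug_unit", "mac_unit"})
```

Under that assumption, the tiny configuration described above simply
ends up with an empty list, and l.trap only appears once a debug unit
is configured.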
>
> But it is a MULTILIB nightmare. Not something we have ever really sorted
> out properly for OpenRISC, but we will have to. With 5 classes, as a
> baseline there will be 5 versions of each plain library and 5 versions
> of each debug library. There is no point having a separate class if you
> can't compile for it.
>
> Then you'll want the versions with and without the FPU. Now you have 20
> versions of the libraries.

Class IV and all of class V are not likely to be present in the
standard libraries we'd want to multilib, right? None of those are
emitted by GCC, and I think it's not too much for us to say they
cannot be used in any hand-coded parts of GCC. That leaves us with
just 3 classes to support (I, II and III), and I would think we can
cope with that.
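
To put rough numbers on it, here is a small Python sketch of the
counting. It assumes each class level is one cumulative build target
(I, I+II, and so on), which is what the "5 classes, 5 versions" figure
quoted above implies. The function name and flavour labels are made up
for illustration, and profiling builds are left out.

```python
from itertools import product


def multilib_variants(class_levels, with_debug=True, with_fpu=True):
    """Return every (class level, debug flavour, FPU flavour) combination.

    Assumption: class levels are cumulative build targets, so N classes
    give N variants rather than 2**N independent on/off combinations.
    """
    debug = ["plain", "debug"] if with_debug else ["plain"]
    fpu = ["no-fpu", "fpu"] if with_fpu else ["no-fpu"]
    return list(product(class_levels, debug, fpu))


# All five proposed classes, each in plain/debug and with/without FPU.
all_five = multilib_variants(["I", "II", "III", "IV", "V"])
# Only the classes the compiler would actually target.
compiler_only = multilib_variants(["I", "II", "III"])

print(len(all_five))       # 20
print(len(compiler_only))  # 12
```

So restricting multilib to classes I to III takes the 20 library
versions down to 12 under the same counting.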

>
> And then there are the profiling versions...
>
> GCC 4.7 does introduce some more fine-grained control of MULTILIB, but
> you are still going to have to decide which ones you want.
>
> I can think of some simplifications. Class V doesn't affect the compiler
> - at most you're going to do these through built-ins or hand-coded
> assembler. I suspect class III is a tiny bit of logic compared to mul and
> div, so I suggest merging class II and III. That then gets you down to 3
> classes that affect the compiler (if you have MAC instructions, you want
> the compiler to use them).

Serial multipliers and dividers are not so huge, and those rotate
instructions can have large logic overheads: each potentially infers
most of a barrel shifter. I do see what you're saying, though. It's
an extension of why I've lumped integer multiply and divide support
together: if you're going to have one, you're clearly not on a tight
area budget, and having the other is not likely to push you over the
edge. I agree that, to an extent, the same goes for class III with
class II. Maybe we should do this...

I don't think we want the compiler to emit the MAC instructions, do
we? That appears to be something for hand-coded DSP algorithms alone.

Cheers

Julius
_______________________________________________
OpenRISC mailing list
[email protected]
http://lists.openrisc.net/listinfo/openrisc
