Speaking of which, this is something that annoys me about the OR1K ISA, and I think it should be changed for OR2K (if it hasn't been already). It's pretty much impossible to implement the current ISA with a stock 32-bit multiplier, so you have to use a 64-bit multiplier if you don't want to hand-code one.
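As an aside on the point above, here is a minimal C sketch of one reason a multiplier that returns only the low 32 bits can fall short: deciding whether a signed 32x32 multiply overflowed needs the upper half of the product. The helper name `mul32_signed` is hypothetical and not taken from or1200, mor1kx, or or1ksim, and the sketch does not claim to reproduce the exact flag semantics of the ISA.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical behavioural model, not code from any OpenRISC core.
 * Multiplies two signed 32-bit operands, stores the low 32 bits of the
 * product in *low, and returns true if the full product does not fit
 * in a signed 32-bit result. */
static bool mul32_signed(int32_t a, int32_t b, uint32_t *low)
{
    /* The full product needs 64 bits; this is the part a multiplier
     * with only a 32-bit result cannot provide. */
    int64_t full = (int64_t)a * (int64_t)b;

    /* Conversion to an unsigned type is modulo 2^32, so this keeps
     * exactly the low 32 bits. */
    *low = (uint32_t)full;

    /* Overflow check relies on the upper bits of the product. */
    return full < INT32_MIN || full > INT32_MAX;
}
```

In hardware the same check amounts to comparing the upper 32 bits of the product against the sign extension of bit 31, which is why the wide result matters even when only 32 bits are written back.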
And the MAC truncates to 32 bits *before* the add, is that right? But it keeps the whole 64 bits of the accumulation? Is there a good reason for that? I don't understand the justification for this in the ISA, but it certainly seems odd and difficult to implement, if you ask me. It seems to me the best thing would just be to have a single 64-bit MAC that takes 32-bit operands and can be used for both the MAC and the regular multiply.

-Pete

On Fri, Sep 28, 2012 at 10:27 PM, Stefan Kristiansson <[email protected]> wrote:
> On Fri, Sep 28, 2012 at 09:11:19PM +0200, R. Diez wrote:
>> Hi all OpenRISC gurus:
>>
>> I'd like to implement the multiply and divide ORBIS instructions on
>> my OR10 CPU.
>>
>> I did a quick research, and this area is not straightforward, or
>> maybe I haven't found the right website yet. I could invest more
>> time and study the existing implementations, but I'm hoping somebody
>> here can help me save a lot of time and effort.
>>
>> I guess implementing a generic multiplicator in pure Verilog will
>> end up taking a lot of resources, so I looked at the Xilinx
>> primitives / IP generator. The first thing I noticed is that there
>> are no carry or overflow signals. As discussed in this list before,
>> the existing or1200 implementation does not generate the same carry
>> and overflow results as or1ksim. Can anybody point me to a good
>> website where I can copy a correct Verilog carry/overflow
>> implementation from? Can I still use the Xilinx primitives to save
>> FPGA area and calculate carry/overflow separately?
>>
>> Should I go for a pipelined version, in order to let the rest of the
>> CPU run at a higher speed when not multiplying? I think or1200 has a
>> configuration option to disable carry and overflow, should I go
>> ahead with Xilinx's IP cores and leave carry and overflow out? Or
>> does GCC need them?
>>
>> Can I reuse a multiplicator somehow for signed and unsigned
>> integers, or do I have to implement one for each kind?
>>
>> Is there a Xilinx IP core to divide integers, or can I resort to
>> some other trick with other components? Or do I have to resort to a
>> pure Verilog implementation?
>>
>
> For the division, I'd suggest you use a simple serial divider.
> Feel free to take inspiration from the one I did in mor1kx:
> https://github.com/openrisc/mor1kx/blob/master/rtl/verilog/mor1kx_execute_alu.v#L310
>
> Stefan
> _______________________________________________
> OpenRISC mailing list
> [email protected]
> http://lists.openrisc.net/listinfo/openrisc
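
For readers unfamiliar with the serial divider suggested in the quoted reply, the following is a minimal C model of the general technique (restoring division, one quotient bit per step). It is a sketch under assumptions, not a transcription of the mor1kx code linked above: the function name `serial_divu` is hypothetical, only the unsigned case is shown, and division by zero is left to the caller.

```c
#include <stdint.h>

/* Hypothetical behavioural model of a bit-serial restoring divider.
 * Each loop iteration corresponds to what a serial hardware divider
 * would do in one clock cycle: shift in one dividend bit, try to
 * subtract the divisor, and emit one quotient bit.
 * The caller must ensure d != 0. */
static uint32_t serial_divu(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint64_t r = 0;   /* partial remainder; 33 bits are enough */
    uint32_t q = 0;   /* quotient, built MSB first */

    for (int i = 31; i >= 0; i--) {
        /* Shift the next dividend bit into the partial remainder. */
        r = (r << 1) | ((n >> i) & 1u);
        q <<= 1;
        /* If the divisor fits, subtract it and set the quotient bit. */
        if (r >= d) {
            r -= d;
            q |= 1u;
        }
    }

    *rem = (uint32_t)r;
    return q;
}
```

A 32-bit divide takes 32 steps this way, which trades latency for a small area footprint: one subtractor and a pair of shift registers rather than a full combinational divider.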
