On Sat, 23 Apr 2016, Sean Conner wrote:

> > > One major problem with adding a faster CPU to an SGI is the MIPS chip
> > > itself---code compiled for one MIPS CPU (say, the R3000) won't run on
> > > another MIPS CPU (say, the R4400) due to the differences in the pipeline.
> > > MIPS compilers were specific for a chip because such details were not
> > > hidden in the CPU itself, but left to the compiler to deal with.
> >
> > Having written a bunch of R3000 and R4000/4200/4300/4400/4600 assembly
> > code in the 1990s, my (possibly faulty) recollection disagrees with
> > you. There are differences in supervisor-mode programming, but I don't
> > recall any issues with running 32-bit user-mode R3000 code on any
> > R4xxx. The programmer-visible pipelline behavior (e.g., branch delay
> > slots) were the same.
>
> Hmm ... I might have been misremembering. I just checked the book I have
> on the MIPS, and yes, the supervisor stuff is different between the R2000,
> R3000, R4000 and R6000. Also, the R2000, R3000 and R6000 have a five stage
> pipeline, and the R4000 has an eight stage pipeline.

 Pipeline restrictions were gradually relaxed by adding more and more
interlocks as the architecture evolved.  So while user-mode code compiled
for a higher ISA might not necessarily work with an older one, even if it
only used instructions defined in the older ISA, there was no issue the
other way round: old code was forward compatible with newer hardware (or,
depending on how you look at it, new hardware was backward compatible with
older code).  The timeline was roughly:

- MIPS II -- removed load delay slots -- for memory read instructions
  targeting both general purpose and coprocessor registers,

- MIPS IV -- removed coprocessor transfer and condition code delay slots
  -- for instructions used to move data between general purpose and
  coprocessor registers as well as ones setting or reading coprocessor
  condition codes.

 The original MIPS I ISA only had an interlock on multiply-divide unit
(MDU) accumulator accesses, so all the other pipeline hazards had to be
handled in software, by inserting the right number of instructions between
the producer and the consumer of data; NOPs were used where no useful
instructions could be scheduled.

 Some operations continued to require manual resolution of pipeline
hazards even in the MIPS IV ISA, like moves to the MDU accumulator, as
well as many privileged operations (TLB writes, mode switches, etc.).  For
these the SSNOP (superscalar NOP) instruction was introduced, which was
guaranteed not to be nullified with superscalar pipelines.  The encoding
was chosen such that it was backwards compatible, using one of the already
existing ways to express an operation with no visible effects other than
incrementing the PC, of which, given the design of the MIPS instruction
set, there has always been a plethora.  Consequently SSNOP was executed as
an ordinary NOP by older ISA implementations.

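
 To make the two mechanisms above concrete, here is a minimal sketch in
Python.  It is an illustration only and not taken from any real assembler
or compiler: the names Insn, fill_load_delay_slots and encode_sll are made
up, the hazard pass handles nothing but the load delay slot in
straight-line code (no branches, no coprocessor or MDU hazards), and the
SSNOP encoding used, "sll $0, $0, 1", is the one given in the published
MIPS32 manuals rather than anything stated in this post.

```python
from typing import List, NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    """Hypothetical, heavily simplified instruction record."""
    text: str                 # assembly text, for display only
    dest: Optional[str]       # register written, if any
    srcs: Tuple[str, ...]     # registers read
    is_load: bool = False     # True for memory read instructions


NOP = Insn("nop", None, ())


def fill_load_delay_slots(code: List[Insn],
                          has_load_interlock: bool) -> List[Insn]:
    """Insert a NOP after a load whose result is used by the very next
    instruction.  With a hardware interlock (MIPS II and up, per the
    timeline above) nothing needs inserting; the CPU stalls instead.
    Assumption: straight-line code, so the next list entry is the next
    instruction executed."""
    if has_load_interlock:
        return list(code)
    out: List[Insn] = []
    for i, insn in enumerate(code):
        out.append(insn)
        nxt = code[i + 1] if i + 1 < len(code) else None
        if insn.is_load and nxt is not None and insn.dest in nxt.srcs:
            out.append(NOP)
    return out


def encode_sll(rd: int, rt: int, sa: int) -> int:
    """Encode 'sll rd, rt, sa': SPECIAL opcode 0, function code 0."""
    return (rt << 16) | (rd << 11) | (sa << 6)


program = [
    Insn("lw $t0, 0($a0)", "$t0", ("$a0",), is_load=True),
    Insn("addu $v0, $t0, $t1", "$v0", ("$t0", "$t1")),
]

mips1 = [i.text for i in fill_load_delay_slots(program, False)]
mips2 = [i.text for i in fill_load_delay_slots(program, True)]
print(mips1)  # ['lw $t0, 0($a0)', 'nop', 'addu $v0, $t0, $t1']
print(mips2)  # ['lw $t0, 0($a0)', 'addu $v0, $t0, $t1']

# NOP and SSNOP are both shifts into register $0, so neither has any
# visible effect; an older CPU sees SSNOP as just another no-op.
NOP_WORD = encode_sll(0, 0, 0)
SSNOP_WORD = encode_sll(0, 0, 1)
print(hex(NOP_WORD), hex(SSNOP_WORD))  # 0x0 0x40
```

 The padded MIPS I sequence still runs correctly on interlocked hardware,
only with a wasted slot, which is the forward compatibility described
above; the unpadded sequence is the one that cannot safely go the other
way.
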
 NB despite the hardware interlocks it has always been preferable to
avoid the pipeline stalls they trigger, by scheduling the right minimum
number of instructions between data producers and the respective consumers
anyway, and compilers have had options to adapt here to specific processor
implementations.  The addition of hardware interlocks made the life of
compiler (and handcoded assembly) writers a little bit easier, as a missed
optimisation no longer resulted in broken code.  Also more compact code
could be produced, as no NOP had to be inserted where there was no useful
instruction to schedule to satisfy a pipeline hazard.

 I won't dive into the details of the further evolution with modern MIPS
ISAs here, for obvious reasons.

  Maciej