Yes, you might say that the 6600 was the first hyperthreaded machine: 10 
hardware threads in one processor.  It's a fascinating design, especially if 
you dig deep: the PPU state rotates around a 10-stage shift register (the 
"barrel").  The general descriptions show it as a storage element with the 
logic living at one stage, sort of in a "gap" in the circumference.  Reality 
was more complicated; for example, the memory read is issued from a spot 6 
stages around the barrel, i.e., 6 minor cycles before the instruction is due 
to be executed.  And, not surprisingly, there's decoding ahead of execution.
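
To make that concrete, here's a minimal sketch of the barrel idea in C++.  
The names, register widths, and stub helpers are all my invention, and the 
decode-ahead stage is omitted; the point is just the shape: one copy of the 
execute logic, ten rotating state slots, and the memory read launched six 
minor cycles ahead of execution.

#include <array>
#include <cstdint>

// One PPU's architectural state -- registers only, no logic.  (The register
// set here is simplified and the names are mine, not the 6600's.)
struct PPUState {
    std::uint16_t p  = 0;   // program counter
    std::uint32_t a  = 0;   // accumulator (18 bits on the real machine)
    std::uint16_t ir = 0;   // instruction staged for execution
};

constexpr int NUM_SLOTS  = 10;  // ten PPU state slots around the barrel
constexpr int READ_AHEAD = 6;   // memory read issued 6 minor cycles early

std::array<PPUState, NUM_SLOTS> barrel;

// Stubs standing in for the real fetch and execute machinery.
void issue_memory_read(PPUState&)       { /* start a core-memory fetch */ }
void execute_one_instruction(PPUState&) { /* decode and execute one op */ }

// One minor cycle: the single copy of the logic acts on the PPU whose state
// is currently in the "gap", while a memory read is launched on behalf of
// the PPU that will reach the execute position READ_AHEAD cycles from now.
void minor_cycle(int slot) {
    issue_memory_read(barrel[(slot + READ_AHEAD) % NUM_SLOTS]);
    execute_one_instruction(barrel[slot]);
}

int main() {
    for (int t = 0; ; ++t)          // strict round-robin, forever
        minor_cycle(t % NUM_SLOTS);
}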

I can imagine that name-space collisions would be painful.  If the code were 
C++ and each machine a class, that could be handled nicely (roughly as 
sketched below), though it's probably too invasive a change for a 
not-very-common scenario.  There were other multiprocessors around, though; I 
have seen some documentation of a PDP-8 front end for an Electrologica X8 
timesharing system.  Unfortunately, the PDP-8 code was preserved but the X8 
code was not.
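
For what it's worth, the kind of encapsulation I have in mind would look 
something like this toy sketch.  Every name in it is invented (it isn't 
SIMH's actual structure), and the memory sizes are placeholders; the idea is 
just that what used to be file-scope globals become members, so two CPUs can 
coexist in one executable:

#include <cstdint>
#include <vector>

// Each simulated machine is a class.  What would otherwise be file-scope
// globals (PC, registers, memory) become members, so both CPUs can live in
// one executable without symbol clashes.
class Machine {
public:
    virtual ~Machine() = default;
    virtual void step() = 0;       // execute one instruction
};

class PDP15 : public Machine {
    std::uint32_t pc = 0;          // would collide if it were a global "PC"
    std::vector<std::uint32_t> mem = std::vector<std::uint32_t>(32768);
public:
    void step() override { /* fetch/decode/execute one PDP-15 op */ }
};

class UC15 : public Machine {      // the stripped-down PDP11
    std::uint16_t pc = 0;          // same name as above -- no clash
    std::vector<std::uint16_t> mem = std::vector<std::uint16_t>(28672);
public:
    void step() override { /* fetch/decode/execute one PDP-11 op */ }
};

// Timeslice both machines from a single thread, round-robin, the way
// DtCyber (and the barrel) do it.
int main() {
    PDP15 host;
    UC15  io;
    Machine* machines[] = { &host, &io };
    for (;;)
        for (Machine* m : machines)
            m->step();
}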

        paul

> On May 24, 2018, at 8:39 PM, Bob Supnik <b...@supnik.org> wrote:
> 
> That's how I built the SiCortex simulator - six instances of the MIPS CPU 
> were executed, round-robin, from within a single thread to assure the 
> lock-step behavior that was required.
> 
> Tom's implementation accurately represents how the CDC machines were built - 
> or at least, how the original 6600 was built. There was only one set of logic 
> for the 10? 12? peripheral processors, and it was time-sliced in strict 
> round-robin form. One of Chuck Thacker's classic designs at Xerox operated 
> the same way: the Alto, perhaps?
> 
> I looked fairly carefully at a software threaded model but concluded that the 
> numerous name-space collisions between the PDP15 and PDP11 simulators would 
> make merging them into a single executable too invasive. With the 
> multicore/multisimulator approach, the changes to both simulators are very 
> modest.
> 
> /Bob
> 
> On 5/24/2018 8:04 PM, Paul Koning wrote:
>> 
>>> On May 18, 2018, at 2:16 PM, Bob Supnik <b...@supnik.org> wrote:
>>> 
>>> At long last, I've finished testing the PDP15/UC15 combination, and it 
>>> works well enough to run a full XVM/DOS-15 sysgen. I've sent an "RC1" 
>>> package to Mark P. for trial integration.
>>> 
>>> The configuration consists of two separate simulators - the PDP15 and a 
>>> stripped down PDP11 called the UC15. It uses the "shared memory" facility 
>>> that Mark P. created for both the shared memory between the PDP15 and the 
>>> PDP11 and the control link state. Getting decent performance requires 
>>> multiple cores and tight polling, so this initial implementation has a 
>>> number of restrictions: ...
>> Impressive!
>> 
>> I wonder if there might be some inspiration in Tom Hunter's DtCyber 
>> emulator.  That is also a multi-processor simulation with tightly controlled 
>> timing and shared memory.  Tom's implementation supports CDC Cyber 
>> configurations with 10 or 20 peripheral processors plus one central 
>> processor.  The central processor is actually not all that time critical, 
>> and I have extended his code (in a fork) with dual-CPU support using a 
>> separate thread for the other processor.  That required no special 
>> considerations to get the timing right.
>> 
>> But it turns out that near-lockstep operation of the PPUs is critical.  At 
>> one point I tried splitting those into separate threads, but then deadstart 
>> (system boot) fails miserably.  Tom's answer is straightforward: the 
>> simulator is single-threaded, timeslicing among the individual emulated 
>> processors a few cycles at a time.  It actually does one PPU cycle for each 
>> PPU, then N CPU cycles (for configurable N -- 8 or so is typical to mimic 
>> the real hardware performance ratio).  It's likely that it would still work 
>> with M > 1 PPU cycles per iteration, but that hasn't been tried as far as I 
>> know.
>> 
>> This structure of course means that entry to and exit from each 
>> processor-cycle emulation are frequent, which puts a premium on 
>> low-overhead entry/exit to the CPU cycle action.  But it works quite well 
>> without requiring multiple 
>> processes with tight sync between multiple host CPU cores.
>> 
>> DtCyber doesn't have idling (it isn't part of the hardware architecture), 
>> though it's conceivable something could be constructed that would work on 
>> the standard Cyber OS.  There isn't a whole lot of push for that.  I made a 
>> stab at it but the initial attempt wasn't successful and I set it aside for 
>> lack of strong need.
>> 
>> Anyway... it's open source, and might be worth a look.
>> 
>>      paul
>> 
>> 
> 
