Re: [m5-dev] locked memory accesses

Gabriel Michael Black Thu, 05 Mar 2009 19:27:00 -0800

Quoting Steve Reinhardt <ste...@gmail.com>:

> On Thu, Mar 5, 2009 at 6:33 PM, Gabriel Michael Black
> <gbl...@eecs.umich.edu> wrote:
>> Quoting Steve Reinhardt <ste...@gmail.com>:
>>
>>> On Wed, Mar 4, 2009 at 7:03 AM, Steve Reinhardt <ste...@gmail.com> wrote:
>>>> I think there are two possible solutions:
>>>> 1. Add a "retry" response code for atomic requests (along the lines of
>>>> the error codes we already have in packet.hh) and then make sure that
>>>> all the places where we issue atomic requests can deal with them
>>>> appropriately.  Oddly enough it's reminiscent of the LL/SC solution,
>>>> though this is different since it only applies in atomic mode.
>>>> 2. Force any cpu or device that wants to do locked accesses in atomic
>>>> mode to do both the lock and unlock accesses back-to-back within the
>>>> same event (e.g., in the same call to tick()).
>>>>
>>>> Neither of these sounds particularly attractive.  I like #2 better [...]
>>>
>>> Another advantage of #2 is that atomic-mode atomicity comes "for free"
>>> without touching the memory system at all.  This is nice since it
>>> gives you a baseline that will work on any memory system (e.g., Ruby).
>>>
>>> There's also some possibility that we could avoid implementing
>>> timing-mode locking in main memory with this approach, by making the
>>> reasonable restriction that if you want to run in timing mode then you
>>> have to use caches.  That may not hold if we have to deal with locked
>>> uncached accesses... I know these exist in real life, but I'm hoping
>>> that that's one of the "features" we can avoid by only running modern
>>> 64-bit software.  Gabe, do you know off hand if there are locked
>>> uncached accesses in any of the code you've run so far?
>>>
>>> Steve
>>> _______________________________________________
>>> m5-dev mailing list
>>> m5-dev@m5sim.org
>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>
>>
>> I haven't reached a conclusion on what to do, but I have a few
>> thoughts. First, I don't know if we'd be able to use #2 to get away
>> with not implementing locking in the main memory for timing accesses.
>> It would be impossible to do a read and write of a location in the
>> same event because you'd have to wait around for each to complete.
>> You'd have to force all other accesses to wait around, which
>> might be a big performance hit. Also things might get really sticky
>> with speculative accesses, although I think that may be true in general.
>
> Yea, you're right, but that wasn't my point... what I was trying to
> say was that if you (1) use option #2 to make atomicity "just work" in
> atomic mode from the memory system's point of view and (2) require the
> use of caches in timing mode so that you can rely on the cache
> coherence protocol to handle atomicity in timing mode (which probably
> isn't that hard, since you already have mechanisms for deferring
> invalidations), then the main memory object would never have to worry
> about implementing atomicity.  (In contrast to the current situation,
> where main memory has its own independent implementation of LL/SC
> support just to deal with cacheless configs.)
>
>> I like the idea of having lock and unlock flags, although I think it
>> would be a good idea to not allow intervening operations even from the
>> same source. With that restriction, locks could set up a condition
>> where you'd only allow an access to happen if it was an unlock. If
>> everything obeyed the rule of a lock and then an unlock that would
>> guarantee atomicity and has the nice property that you don't have to
>> keep track of -who- the access was originally from, just where the
>> access was to. It may be next to impossible to actually know who has a
>> lock if you have, for instance, a shared cache intervening.
>
> Yes, that's exactly what I was thinking already.
>
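
As a rough illustration of the lock/unlock rule quoted above, here is a
minimal sketch. Everything in it (the LockedMemory class, the access()
signature, the 64-byte block size) is hypothetical and is not M5 code. It
only shows that tracking locks by block address, with no record of who
took the lock, is enough to reject intervening accesses.

```python
class LockedMemory:
    """Toy memory that tracks locks by block address only (hypothetical)."""

    def __init__(self, block_size=64):
        self.block_size = block_size  # assumed block size, not from M5
        self.data = {}                # address -> stored value
        self.locked = set()           # block addresses currently locked

    def _block(self, addr):
        # Round the address down to the start of its block.
        return addr - (addr % self.block_size)

    def access(self, addr, write=None, lock=False, unlock=False):
        """Return (ok, old_value); ok is False if the access must retry."""
        block = self._block(addr)
        # While a block is locked, only an unlocking access may proceed,
        # regardless of which requester it comes from.
        if block in self.locked and not unlock:
            return False, None
        value = self.data.get(addr, 0)
        if write is not None:
            self.data[addr] = write
        if lock:
            self.locked.add(block)
        if unlock:
            self.locked.discard(block)
        return True, value


mem = LockedMemory()
ok, old = mem.access(0x100, lock=True)              # locked read
blocked, _ = mem.access(0x108, write=5)             # same block: rejected
mem.access(0x100, write=old + 1, unlock=True)       # unlocking write
```

Note that the sketch accepts an unlock from any source, which is exactly
the simplification being discussed: it relies on every requester obeying
the lock-then-unlock rule.
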
>> All of the accesses
>> I mentioned before and all of the CPU maintained data structure
>> updates can happen with a single load-op-store, although I could
>> believe there's some other ISA that needs something more elaborate.
>
> I'm not sure that it matters, as long as the CPU doesn't do any
> intervening unlocked memory accesses to the same block.
>
>> As far as how to make atomic mode atomic (which -is- weird), I don't
>> really like either of the options you mentioned. The first one
>> seems more naturally analogous to timing mode, but then why do we need
>> atomic mode? The second one limits the effects of supporting this sort
>> of instruction/operation, but it would probably really complicate the
>> CPU and I'd bet have a lot of unintended consequences. I've been
>> thinking about this in the background and I haven't been able to come
>> up with a better idea though. I'll keep trying.
>
> What if for AtomicSimpleCPU we execute a macroinstruction on each tick
> instead of a microop?  We're not really caring that much about timing
> anyway, and all of the other CPU models only work in timing mode so
> they won't be affected.
>
> The only place where this gets really ridiculous is for REP prefixes,
> but if it's not too hard we could only execute one iteration of a REP
> instead of the whole macroinstruction in just that case.


There are other cases where there are lots of microops inside one
macroop (INT, IRET, far JMP, far CALL, etc.), although it's a lot more
bounded than REP. One problem is that a REP prefix doesn't necessarily
mean "repeat"; in some cases it's only used to differentiate opcodes. The
macroops themselves don't know the difference since they just have
loops inside them. There are actually relatively few macroops that can
be atomic, so it might be better to just do those all in one swoop.
They could be marked with a generic "atomic" flag themselves so the
CPU would know to use the special behavior. It might be obnoxious to
work that into the x86 ISA description system since it's pretty
complex and relatively constrained, but it likely could be done. I'm
still not excited about how it'd complicate that CPU, though it might
not be that bad and we might not have a choice.
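
To make the "atomic" flag idea a bit more concrete, here is a minimal
sketch under stated assumptions. The Macroop and ToyAtomicCPU classes, the
atomic field, and the microop names are all hypothetical and do not
correspond to the real AtomicSimpleCPU or the x86 ISA description. The CPU
normally executes one microop per tick, but runs every microop of a
flagged macroop inside a single tick.

```python
from dataclasses import dataclass


@dataclass
class Macroop:
    name: str
    microops: list          # assumed non-empty list of microop names
    atomic: bool = False    # hypothetical "atomic" flag on the macroop


class ToyAtomicCPU:
    def __init__(self, program):
        self.program = list(program)
        self.macro_idx = 0  # index of the current macroop
        self.micro_idx = 0  # index of the next microop within it
        self.ticks = 0
        self.trace = []     # (tick, macroop name, microop name)

    def done(self):
        return self.macro_idx >= len(self.program)

    def tick(self):
        """One event: a single microop, or a whole flagged macroop."""
        if self.done():
            return
        macro = self.program[self.macro_idx]
        while True:
            microop = macro.microops[self.micro_idx]
            self.trace.append((self.ticks, macro.name, microop))
            self.micro_idx += 1
            finished = self.micro_idx >= len(macro.microops)
            if finished:
                self.macro_idx += 1
                self.micro_idx = 0
            # Keep going within this tick only for atomic macroops.
            if finished or not macro.atomic:
                break
        self.ticks += 1


cpu = ToyAtomicCPU([
    Macroop("ADD", ["ld", "add", "st"]),
    Macroop("LOCK_ADD", ["ldl", "add", "stul"], atomic=True),
])
while not cpu.done():
    cpu.tick()
print(cpu.ticks)  # 4: three ticks for ADD, one for LOCK_ADD
```

Only the flagged macroop gets the special treatment, so unbounded cases
like REP loops still advance one microop per tick.
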

Gabe
