nathan binkert wrote:
>>    I'm trying to factor out PAL mode from O3, and I found the following
>> line
>>
>> http://repo.m5sim.org/m5/file/fc12f4d657f0/src/cpu/o3/fetch_impl.hh#l579
>>
>>    } else if (interruptPending && !(fetch_PC & 0x3)) {
>>
>>
>> which is basically making fetch stop fetching once it sees that an
>> interrupt is pending and it's no longer fetching PAL mode code. Why is
>> that there? If interrupts are blocked while in PAL mode, why doesn't O3
>> just rely on commit waiting until committed state is out of PAL mode and
>> then start handling the interrupt?
>>     
> Interrupts cannot be taken during palmode.  I'm not sure how this
> would be handled from commit.  What needs to happen is that fetch
> cannot be redirected to the interrupt handler until the CPU exits PAL
> (i.e. the pal routine needs to finish without being interrupted at
> all).
>   

I think what happens is that commit asks the Interrupts object whether
there's an interrupt for it to service. If so, commit gets it and calls
its invoke method, which sets state through the ThreadContext; that
flushes out the rest of the machine and restarts execution. Fetch is
just along for the ride, and the Interrupts object is already deciding
when it's appropriate to take an interrupt.
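
To make that concrete, here's a rough sketch of the flow I have in mind.
The names (ToyThreadContext, ToyInterrupts, commitCheckInterrupts) and
the handler address are made up for illustration and are not the real
M5 classes, and I'm assuming PAL mode is encoded in the low bit of the
committed PC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for committed architectural state.
struct ToyThreadContext {
    std::uint64_t pc;
    bool squashed;  // set when younger in-flight work gets flushed

    // Writing the PC through the context flushes the pipeline,
    // which is what ends up redirecting fetch.
    void setPC(std::uint64_t newPC) { pc = newPC; squashed = true; }
};

// Hypothetical Interrupts object: it alone decides when an
// interrupt may be taken, based on committed state.
struct ToyInterrupts {
    bool pending;
    std::uint64_t handlerPC;  // assumed handler entry point

    // Assumption: the low PC bit set means "in PAL mode".
    bool checkInterrupts(const ToyThreadContext &tc) const {
        return pending && !(tc.pc & 0x1);
    }

    void invoke(ToyThreadContext &tc) {
        assert(pending);
        pending = false;
        tc.setPC(handlerPC);
    }
};

// One commit-stage check; returns true if an interrupt was taken.
inline bool commitCheckInterrupts(ToyInterrupts &intr, ToyThreadContext &tc) {
    if (!intr.checkInterrupts(tc))
        return false;
    intr.invoke(tc);
    return true;
}
```

Nothing in this sketch needs fetch to look at the PC itself; the
redirect falls out of the state change made at commit.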

>   
>> This behavior would be very annoying
>> and perhaps impossible to preserve in an ISA agnostic way and I'd really
>> like to just get rid of it, but if there's a good reason for it I'll
>> have to be more creative.
>>     
> We could have a more generic ISA specific function
> areInterruptsEnabled() which would be inlined to return true in most
> ISAs.
>   

Interrupts aren't going to be universally enabled in most ISAs, and the
state that decides whether they're enabled is usually going to be
something other than the PC. A function like that would need to read
that state through the thread context, which reflects committed state,
and the Interrupts object is already doing that in the commit stage.

>   
>> It innocuously sits there inert for SPARC and
>> MIPS, but for ARM in thumb mode with 16 bit instructions and especially
>> X86 where the PC can be anything with no special meaning this will cause
>> a serious problem. This is something I'd like to have straightened out
>> in the short to medium term.
>>     
> Seems perfectly reasonable and not too difficult to me.
>
>   
>> Longer term I see a problem with the way PAL mode is communicated to the
>> ITLB as well. First, the PC is passed as part of a request as a single
>> address. This would make sense if the PC was the actual memory address
>> used for the fetch, but that's already in the vaddr field. Generalized
>> PCs have more structure than that. Second, this is the mangled PC which
>> is only there to signal PAL mode in Alpha and has no current use in any
>> other ISA. Even in Alpha only one bit actually matters. Carting this
>> value around for this one (admittedly important) use is a waste. This is
>> slightly a lie because the stride prefetcher also uses the PC field, but
>> it looks like it could (and realistically would) use the paddr or vaddr
>> instead. Third, its use depends on the fact that the magical property
>> communicated by the PC carries over to all the instructions being
>> fetched as part of that access, or in other words effectively the cache
>> line. This is something that happens to work for Alpha but won't work
>> generally. Fourth, if the PAL mode bit is pulled out of the fetch address
>> used by the CPUs so that they truly deal with memory addresses and not
>> architectural PC values, the architectural PC, and effectively the
>> PAL-ness of the Alpha PC, won't be accessible to the ITLB anymore. It
>> looks like we can either have the PC based PAL mode bit in Alpha or more
>> complete CPU model abstraction universally but not both.
>>     
> I'm not clear on why Alpha couldn't keep the PAL mode bit in the PC.
> Seems that any ISA independent code (i.e. the CPU models) when asking
> for the PC should get the masked value.  The ISA dependent code (i.e.
> stuff in arch/alpha) should get the unfiltered one.  Functions like
> areInterruptsEnabled would be machine dependent and would have access
> to the ExecContext and the PC structure to grab the pal mode bit.
>   

The ExecContext is only for instructions when they execute (although I'm
not 100% sure that's all it could be used for). I thought of this
approach, but it doesn't work because the thing that puts together the
fetch request would have to know the actual PC. If it gets the filtered
one like you're proposing, then the PAL bit will be masked out and the
TLB will never see it. From experience, this makes the ALPHA_FS
regressions just sit there and miss in the ITLB over and over. If there
were some other way to communicate PAL-ness to the TLB, then yes, I
think that scheme would likely work.
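
Here's a toy version of the problem. Everything in it is hypothetical
(filteredPC, ToyFetchRequest, buildRequest, itlbHit), and it assumes the
PAL bit is bit 0 of the PC and that PAL-mode fetches bypass translation
while everything else misses in an empty ITLB.

```cpp
#include <cstdint>

using Addr = std::uint64_t;

const Addr PalBit = 0x1;  // assumption: PAL mode lives in PC bit 0

// What the proposal would hand to ISA-independent code.
inline Addr filteredPC(Addr archPC) { return archPC & ~PalBit; }

// Hypothetical fetch request: the TLB only sees what the CPU put here.
struct ToyFetchRequest {
    Addr vaddr;  // the actual memory address being fetched
    Addr pc;     // the PC value carried along with the request
};

// The CPU builds the request from whatever PC value it was given.
inline ToyFetchRequest buildRequest(Addr pcSeenByCpu) {
    ToyFetchRequest req;
    req.vaddr = pcSeenByCpu & ~Addr(0x3);
    req.pc = pcSeenByCpu;
    return req;
}

// Toy ITLB with no entries: PAL-mode fetches bypass translation,
// anything else misses.
inline bool itlbHit(const ToyFetchRequest &req) {
    return (req.pc & PalBit) != 0;
}
```

With a PAL-mode PC of 0x4001, the request built from the raw PC hits,
but the one built from filteredPC(0x4001) misses even though both carry
the same vaddr.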

>   
>> I don't have a better idea although I'm trying to think of one, but this
>> definitely seems like something that can and should be improved on for
>> all these reasons. One possible solution would be to store the actual
>> PCState object in the request since that would at least be more general
>> and allow getting the PAL bit out of the fetch address, but then we'd be
>> carting around a handful of 64 bit values instead of just 1 when we
>> still only care about a single bit only some of the time.
>>     
> Perhaps the general PCState object should be broken up into sub
> objects.  I.e. a single PC object for a single PC that can encapsulate
> information about that PC.  The PCState object could be built from
> those.
>   

I'm not sure I follow. The PCState object for, say, Alpha is two Addrs,
the PC and the NPC. For ARM it's the PC, NPC, microPC, nextMicroPC,
flags, and nextFlags; for SPARC it's the PC, NPC, microPC, and
nextMicroPC. I don't see how that can be broken up generically and not
just undo the fact that there's a PCState object. I also don't see how
that would address any of the problems above.
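
Roughly, the layouts look like this. The struct names, field names and
field types here are my own shorthand for illustration, not the actual
definitions, and instAddr is a hypothetical helper.

```cpp
#include <cstdint>

using Addr = std::uint64_t;

// Alpha: just the PC and the next PC.
struct AlphaPCState {
    Addr pc;
    Addr npc;
};

// SPARC: PC and NPC plus the micro PCs.
struct SparcPCState {
    Addr pc;
    Addr npc;
    Addr upc;
    Addr nupc;
};

// ARM: everything SPARC has plus current and next flags.
// The 8-bit flag width is an assumption made for this sketch.
struct ArmPCState {
    Addr pc;
    Addr npc;
    Addr upc;
    Addr nupc;
    std::uint8_t flags;
    std::uint8_t nextFlags;
};

// About the only thing generic code can ask of all three.
template <typename PCStateT>
Addr instAddr(const PCStateT &state) {
    return state.pc;
}
```

The common part is small, and the ISA-specific fields are exactly the
ones a per-PC sub object would have to carry anyway.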

Gabe