Quoting Steve Reinhardt <[email protected]>:

> (Redirecting this one to m5-dev too.  To keep these threads more
> manageable I'm just responding to the parts related to the original
> topic of splitting up decoder.cc.)
>
> On Mon, Aug 24, 2009 at 12:39 AM, Gabe Black<[email protected]> wrote:
>> Steve Reinhardt wrote:
>>>
>>> My thought was
>>> that a key problem is that, because the decoder is monolithic, all of
>>> the instruction class declarations have to be visible at the point
>>> where the decoder is compiled, which means that there's a lower bound
>>> on the extent to which you can divide the code into independent
>>> chunks.  If you could split the decoder into independent, separately
>>> compiled C++ functions, then the instruction class declarations would
>>> only need to be visible to the subcomponent of the decoder where they
>>> are used.  So splitting the decoder is a means to full independent
>>> compilation of ISA subsets, not an end in itself.
>>>
>>> If you did it right, then you might be able to (more or less) map each
>>> .isa file to an independently compilable .cc file.
>>>
>>
>> That I think is at least partially true. You would still have to include
>> all the .hh files, but I'm not convinced that that adds significantly to
>> the compile time. I think it's all the things that spur the generation
>> of actual code that make it take significantly longer. I may be totally
>> wrong.
>
> Actually I was hoping that you wouldn't have to include all the .hh
> files.  If the main decoder in main_decoder.cc calls out to a
> subdecoder for x87 ops in x87_decoder.cc, then a header that declares
> the instruction objects for x87 instructions would only need to be
> included in the latter .cc file.

That's true, but then we may have function call overhead at those
points, since I don't know whether gcc does inlining across object
modules. If the header files prove to be a major contributor to compile
time, then that may still be worth it.

>
>>>> What I was thinking about earlier and what would be harder to implement
>>>> is to define a first class instruction abstraction for the ISA parser.
>>>> The instructions would be defined outside of the decoder and could
>>>> potentially be guided into different files that contain a reasonable
>>>> grouping of instructions. For instance, all the adds and subtracts could
>>>> go together. Part of what makes things tricky right now is that
>>>> instructions are a second level concept. They show up in the decoder as
>>>> a side effect of processing formats at leaves, but they don't have their
>>>> own identity and can't be managed easily and automatically by the parser.
>>>>
>>>
>>> This sounds doable, but again I'm not sure how much you'd buy by
>>> separating instructions into separate files if the declarations all
>>> have to be read in for the monolithic decode function.  Note that the
>>> execute methods are already segregated into separate .cc files by cpu
>>> model.
>>>
>>
>> That's true, but in the case of x86 there are actually very few execute
>> methods since those are only defined for the microops. There are many,
>> many more instances of the functions that are defined for each macroop,
>> namely constructors and generateDisassembly. If a change only affects
>> the execute methods, recompiling is usually very fast. If a change
>> involves changing the composition of macroops, it's very slow.
>
> Are these constructors and generateDisassembly methods still defined
> inline in the class declaration?

I don't think they are.

> If we don't care that they're
> actually inlined (and certainly for the generateDisassembly methods at
> least we shouldn't), then it should be trivial to define them outside
> the class declaration and dump them in a separately compiled .cc file.  We
> could even do something very basic like call a method for outputting
> these definitions, and have that method transparently track the number
> of lines it's generated, then just start a new file whenever that
> number gets above some threshold.  So we'd end up with
> decoder_methods1.cc, decoder_methods2.cc, etc.  You'd still probably
> have to include all the declarations in a single .hh that gets
> included everywhere, since there are going to be inheritance
> relationships that we won't be tracking, but if your theory that the
> declarations are not the problem is true, that won't matter.

That would mostly work, I think. You'd have to be careful not to split
a definition in half, and also make sure that any static functions are
available in every file. I don't remember offhand whether I used any of
those, so it might be a moot point.
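
To make the line-counting idea concrete, here is a minimal sketch of what such an output method might look like on the parser side. Everything in it is hypothetical: the class name ChunkedEmitter, the max_lines threshold, and the preamble argument are not part of the existing ISA parser. It takes each definition as one unit so nothing gets split in half, and it rolls over to a new file just before the threshold would be exceeded rather than just after. The shared preamble is one possible place for the common .hh include and any static helpers that need to be visible in every file.

```python
class ChunkedEmitter:
    """Collect generated definitions and spread them over numbered files.

    Hypothetical helper, not part of the real ISA parser.
    """

    def __init__(self, prefix="decoder_methods", max_lines=1000, preamble=""):
        self.prefix = prefix
        self.max_lines = max_lines
        self.preamble = preamble  # text repeated at the top of every file
        self.chunks = [[]]        # one list of definitions per output file
        self.line_count = 0       # lines emitted into the current file

    def emit(self, definition):
        """Add one complete definition; it is never split across files."""
        n = len(definition.splitlines())
        # Roll over only between definitions, and never leave a file empty.
        if self.chunks[-1] and self.line_count + n > self.max_lines:
            self.chunks.append([])
            self.line_count = 0
        self.chunks[-1].append(definition)
        self.line_count += n

    def files(self):
        """Return {filename: contents}, numbering files from 1."""
        out = {}
        for i, defs in enumerate(self.chunks, start=1):
            body = "\n".join(defs)
            out["%s%d.cc" % (self.prefix, i)] = self.preamble + body + "\n"
        return out
```

The number of entries returned by files() is also the N that the build would need to know about, so the same object could in principle write out a small manifest for scons to read.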

>
> It might be a little tricky to get scons to figure out how many
> decoder_methodsN.cc files there are to compile, though.  I'm sure Nate
> can figure something out :-).
>
> Seems like the first step is to verify where the compilation
> bottleneck really is so that we know which of these approaches might
> actually solve the problem...

Yes. I'm not sure how to do that.
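
One rough way to start might be to time a parse-only pass against a full compile of the same file, on the assumption that the difference approximates the cost of generating code as opposed to reading declarations. The sketch below does that with gcc's -fsyntax-only option. The function names and the result keys are made up for illustration, the include paths and defines have to be passed in through flags, and the split is only an estimate since a syntax-only pass still does semantic analysis.

```python
import os
import subprocess
import time


def time_command(argv):
    """Run one command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def compare_phases(source, compiler="g++", flags=()):
    """Time a parse-only pass against a full compile of one source file."""
    base = [compiler, *flags]
    # Parse and check the file, but generate no code.
    parse_only = time_command(base + ["-fsyntax-only", source])
    # Full compile; the object file is discarded.
    full = time_command(base + ["-c", source, "-o", os.devnull])
    return {"parse_only": parse_only,
            "full": full,
            "codegen_estimate": full - parse_only}
```

Calling compare_phases on the generated decoder.cc with the same flags scons uses would give a first indication of whether the headers or the method bodies dominate. gcc's -ftime-report option prints a more detailed per-phase breakdown if this turns out to be too coarse.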

Gabe
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
