Quoting nathan binkert <n...@binkert.org>:

> > That doesn't really fit with how the ISA files work. They get broken into an
> > AST, but that gets consumed as it goes,
>
> Does it have to be?

Making it not work that way would likely be very painful. The parser part is finicky (like they all are in any language) and we have lots and lots of very intricate code built on top of it in the form of the descriptions themselves.

> > and it has a lot of anonymous python
> > in it that just gets executed somehow. I want to move more into the python,
> > so the AST will be less and less useful.
>
> Does the AST not contain enough information to know what files are
> being generated?  The anonymous python itself creates files?  That
> sounds crazy.

This may not be quite right, but offhand here is a summary of the isa parser's inputs and outputs. Going in, the parser starts with main.isa. There are ##includes (there are two #s on purpose) which bring in other .isa files. The parser reads in all those files by following the ##includes, stitches them together into one huge string, and then crunches through it all. As that's being processed, the description can read in other files that have, for instance, microcode in them. This already hits basically the problem we're talking about, since the files involved are determined by execution, not static landmarks like ##include. I "solve" this problem by manually listing all microcode files in the SConscript. It's a nasty hack, but it avoids not rebuilding when microcode changes, which is even more annoying.
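As a rough illustration of the ##include step described above, here is a minimal sketch, not the real parser code. It assumes the directive takes a double-quoted file name, and the names INCLUDE_RE, expand_includes and FILES are made up for the example. The in-memory dict stands in for files on disk, and there is no handling of relative paths or include cycles.

```python
import re

# Assumed directive form: ##include "some/file.isa" on a line of its own.
INCLUDE_RE = re.compile(r'^\s*##include\s+"([^"]*)"[^\n]*$', re.MULTILINE)


def expand_includes(name, read, seen=None):
    """Return (text, files): the stitched text and every file visited, in order."""
    if seen is None:
        seen = []
    seen.append(name)

    def replace(match):
        # Recursively splice the included file in place of the directive.
        included_text, _ = expand_includes(match.group(1), read, seen)
        return included_text

    return INCLUDE_RE.sub(replace, read(name)), seen


# Hypothetical file contents, keyed by name, standing in for the disk.
FILES = {
    "main.isa": '##include "a.isa"\nmain body\n',
    "a.isa": '##include "b.isa"\na body\n',
    "b.isa": "b body\n",
}

text, deps = expand_includes("main.isa", FILES.__getitem__)
```

The deps list is the kind of thing a build scanner can compute without running any of the description. Files that the description opens while it executes, like the microcode, never show up in it, which is why they end up listed by hand.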

On the output side, the parser generates two files which implement the decoder, decoder.hh and decoder.cc. It also outputs one file for each CPU model involved that implements the exec (and related) functions. These are called something_something_exec.cc I think.

The problem is that for x86 for sure, but also now for ARM and likely for any other ISA with a lot of complexity and/or fidelity, those output files get to be very, very large. It's easy to run out of RAM, especially if scons tries to build more than one at a time or if you're on a smaller machine. Then the build grinds to a halt, as does everything else. Often the only solutions are to wait until it finishes or you die (whichever comes first), or to reboot the machine and try again with more conservative settings.

This mechanism would let you put different portions of the output into different files which would be compiled independently. Then scons compiling three things at once is equivalent to compiling three normal files at once, not a million lines of code all at once.

To do that, you have to decide how to split things up so they still build. You could try hacking things up in an automated way, but that would likely be overly restrictive, ineffective, incorrect, or all three. My plan is to expose the idea of different files to the ISA description author so that they can choose to put all the, say, floating point loads and stores together along with their utility functions. These may not all be defined in the same place or even in the same directory since there are ordering constraints in python as well as in the resulting C++. It might be that you define output files in a fixed place (def output floatMem, for instance) and then refer to them later when it comes time to put C++ someplace. That might make the most sense. It could also be that you have batches of similarly named output clusters (these will likely involve more than one file at a time, like a .cc and a .hh) and you'd want to generate them all programmatically. I'm not sure exactly what it would look like to select an output file either. You might want to just put down markers that say, essentially, "henceforth output goes in floatMem". Or pass an output cluster name into the outputting function (whatever that looks like).
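To make the two selection styles above concrete, here is a small sketch of what the bookkeeping might look like. Nothing here exists in the parser today. OutputRouter and its methods are hypothetical names, and the assumption that each cluster is exactly one .hh plus one .cc is only for illustration.

```python
class OutputRouter:
    """Collects generated C++ into named output clusters (hypothetical design)."""

    def __init__(self):
        self.clusters = {}   # cluster name -> {"hh": [chunks], "cc": [chunks]}
        self.current = None  # cluster chosen by the last select() call

    def define(self, name):
        # Declare a cluster up front, in the spirit of "def output floatMem".
        self.clusters.setdefault(name, {"hh": [], "cc": []})

    def select(self, name):
        # Marker style: "henceforth output goes in <name>".
        if name not in self.clusters:
            raise KeyError("undefined output cluster: " + name)
        self.current = name

    def emit(self, kind, code, cluster=None):
        # Argument style: an explicit cluster overrides the current marker.
        target = cluster if cluster is not None else self.current
        if target is None:
            raise ValueError("no output cluster selected")
        self.clusters[target][kind].append(code)

    def render(self):
        # Map each output file name to its accumulated text.
        return {
            name + "." + kind: "\n".join(chunks)
            for name, parts in self.clusters.items()
            for kind, chunks in parts.items()
        }


router = OutputRouter()
router.define("floatMem")
router.define("intAlu")
router.select("floatMem")
router.emit("hh", "class FloatLoad;")
router.emit("cc", "// FloatLoad::execute goes here")
router.emit("cc", "// IntAdd::execute goes here", cluster="intAlu")
files = router.render()
```

Both styles can sit on top of the same bookkeeping, so picking between the marker and the explicit argument is mostly a question of what reads better in the descriptions.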

It might work best in the near to mid term to put in static, ISA language defined declarations of output files which would be feasible to scan. In the mid to long term, though, I want to move away from having a custom language and move towards having the same machinery (extended and parameterized more) exposed as a module or something inside regular python scripts. Maybe something à la scons's SConscripts, which are regular python that run in an armature, sort of.
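A scanner for static declarations could be about this simple. This is only a sketch that assumes the "def output floatMem" spelling floated earlier, which is not existing ISA language syntax, and that each declared name produces a .hh and a .cc. OUTPUT_DECL_RE and declared_outputs are made-up names.

```python
import re

# Hypothetical declaration syntax: "def output <name>" at the start of a line.
OUTPUT_DECL_RE = re.compile(r"^\s*def\s+output\s+(\w+)", re.MULTILINE)


def declared_outputs(isa_text, suffixes=(".hh", ".cc")):
    """List the files a description statically declares, without executing it."""
    names = OUTPUT_DECL_RE.findall(isa_text)
    return [name + suffix for name in names for suffix in suffixes]


sample = """
def output floatMem
def output intAlu
// everything else in the description is ignored by the scanner
"""

targets = declared_outputs(sample)
```

A scan like this only works while the declarations are static text. Once the descriptions become regular python, the list of outputs would presumably have to come from running them instead.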

I would be hesitant to make the ISA descriptions open and write to files themselves directly, but primarily because that would be cumbersome and error prone. I don't think we should design it out, though, unless it's just too evil to support.

Gabe
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev
