Quoting Steve Reinhardt <[email protected]>:

> (Is there a reason we're not having this conversation on m5-dev?  I
> moved it back there for posterity, and in case anyone else wants to
> chime in.  Note that we're discussing better processing of the C++
> code snippets in the ISA description language.)
>
> On Tue, Aug 25, 2009 at 1:08 AM, Gabe Black<[email protected]> wrote:
>> nathan binkert wrote:
>>>> One other possible answer is to try and build a reasonable approximate
>>>> C++ parser that's good enough for the snippets that we use.  That
>>>> would be non-trivial, but since (1) we don't have to generate code, so
>>>> we only have to be more accurate than the current regex system and (2)
>>>> it's not the end of the world if there are tricky things the parser
>>>> can't handle and we're forced to rewrite a few code snippets, I don't
>>>> see it as an impossible task.  I think there are multiple
>>>> public-domain C++ grammars we could start with too.
>>>>
>>>
>>> I have a couple of comments here.  First, C++ can't be parsed with
>>> ply.  I'm not sure which parts of the language are the problem, but
>>> the language is ambiguous and not context free.  That said, what
>>> exactly are the problems with what we have?  I can try to see if I can
>>> improve things (or teach gabe enough about ply so he can do it.)
>>>
>>>   Nate
>>>
>>
>> I actually try to avoid the problem areas so I can't list them
>> exhaustively, but basically the way things work follows this basic rule
>> (I think). If the name of the operand appears in the text of the code
>> with or without an optional type modifier, it's an operand. If it's in
>> front of an equal sign, it's a destination, if not, it's a source. Even
>> though that's pretty simple it works remarkably well. Unfortunately it's
>> confused by things like pass by reference function arguments, using it
>> as a temporary without actually meaning to access its original value
>> (i.e. reading it to compute flag bits), setting it conditionally, and
>> maybe a few other things. It would be really hard to get those things
>> right without understanding the syntax of C++, and even then, without
>> knowing how functions are defined, etc., perfectly parsing the C++ won't
>> give you all the information you might need. That's what makes making
>> g++ figure it out attractive since it necessarily figures out all those
>> things at some point. The hard/impossible part is tricking it into using
>> that information to set up the operand index arrays in the static inst,
>> set up the reading and writing code, etc. I think templates kind of,
>> sort of might do the trick, but I just don't think you can get it to
>> automatically fill in the members of a class at construction time based
>> on the code in its member functions.
>
> Yes, doing a full parse is impossible for a number of reasons, not
> just the fact that C++ is context sensitive, but that in the case of
> the code snippets you don't even have all the context (and I think
> trying to generate the full context as Gabe is suggesting is probably
> impractical, as that would require sucking in lots of header files for
> each snippet and only lengthen compile times even further).  That's
> why I said "approximate".

I was thinking that might happen as part of the C++ compile phase,  
independent of the ISA parser. It would be pretty hard or maybe even  
impossible to get C++ to manage things for us, though.

>
> My (half-baked) thought was to build a parser that at least understood
> the basics of C++ expression syntax and could parse the snippets by
> making some charitable assumptions about what was a type and what was
> not (or perhaps we could require the use of typename declarations...
> I'd hope not too much, but it could be a fallback for resolving
> ambiguities).  Note that we already effectively restrict these
> snippets to a subset of C++ to avoid confusing the regexes, so I'm
> sure whatever we do would enable a larger subset than what's currently
> supported.
>
> I think this would solve most of Gabe's issues, since it could tell
> when the only read of an operand occurs after a write, not get
> confused by operand mentions in comments, robustly distinguish RHS
> from LHS of assignments, etc. More importantly, it would solve the
> biggest problem with the status quo, which is that right now there's
> no indication that the regex scan is getting confused because you've
> strayed out of the supported subset and encountered any of Gabe's
> issues; you have to look at the instruction object definition and
> notice that the operand list is not what you expected.  A key
> potential capability of a real parser would be for it to robustly
> determine when it can't figure out what's going on, so at least we
> could avoid these silent errors.

That would be great.
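
To make the failure modes concrete, here is a rough Python sketch of the
name-matching rule described earlier in the thread. It is not the actual
ISA parser code: the function name classify_operands, the operand names
Ra/Rb/Rd, and the dotted type-modifier syntax (Rd.uw) are assumptions
made purely for illustration.

```python
import re


def classify_operands(snippet, operand_names):
    """Naive rule of thumb (illustrative only, not the real parser).

    A mention of an operand that is directly followed by '=' (but not
    '==') counts as a destination; every other mention counts as a
    source. Comments and statement order are ignored on purpose, to
    mirror the weaknesses discussed in the thread.
    """
    sources, dests = set(), set()
    for name in operand_names:
        # Optional ".ext" suffix stands in for a type modifier (assumed syntax).
        pattern = r'\b' + re.escape(name) + r'(?:\.\w+)?\b'
        for match in re.finditer(pattern, snippet):
            rest = snippet[match.end():]
            if re.match(r'\s*=(?!=)', rest):
                dests.add(name)
            else:
                sources.add(name)
    return sources, dests


# A well-behaved snippet is classified as expected.
print(classify_operands("Rd = Ra + Rb;", ["Ra", "Rb", "Rd"]))

# An operand that only appears in a comment is still reported as a source.
print(classify_operands("Rd = Ra; // Rb is unused", ["Ra", "Rb", "Rd"]))

# Rd is reported as both source and destination, although its original
# value is never read.
print(classify_operands("Rd = 0; Rd = Rd | Ra;", ["Ra", "Rd"]))
```

In the last two cases the result is wrong and nothing signals it, which
is the silent-error problem described above.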

>
> Note that some of Gabe's issues aren't related to the parser and are
> more fundamental.  In particular:
> - It's not clear what to do about conditional updates.  They can't
> really be handled properly in hardware in the face of register renaming,
> so my inclination is that if the parser could recognize situations
> where an update only occurs on one branch of an if statement then it
> should flag the snippet as an error.  I'm not sure what Gabe has in
> mind.  There's no support in any of our models for indicating a
> conditional output anyway.

That seems like a reasonable thing to do.
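
As a minimal sketch of what such a check might look like, the Python
below flags operands that are assigned only inside a braced block. It is
an approximation under stated assumptions, not a proposal for the real
implementation: the function name conditionally_written is hypothetical,
only braced blocks are recognized (an unbraced "if (c) Rd = Ra;" is
missed), and an operand written on every branch of an if/else is flagged
too, so it errs on the conservative side.

```python
import re


def conditionally_written(snippet, operand_names):
    """Return operands that are assigned only inside a braced block.

    Illustrative approximation: brace depth stands in for "inside a
    conditional", and plain '=' (not '==', '<=', '>=', '!=') stands in
    for an assignment.
    """
    depth = 0
    top_level, nested = set(), set()
    token = re.compile(r'[{}]|\b(\w+)\s*=(?!=)')
    for match in token.finditer(snippet):
        text = match.group(0)
        if text == '{':
            depth += 1
        elif text == '}':
            depth -= 1
        elif match.group(1) in operand_names:
            # Record whether this assignment is nested or at top level.
            (nested if depth > 0 else top_level).add(match.group(1))
    # An unconditional top-level write makes the operand a plain destination.
    return nested - top_level


# Rd is written only when cond holds, so it is flagged.
print(conditionally_written("if (cond) { Rd = Ra; }", {"Ra", "Rd"}))

# Rd is also written unconditionally, so nothing is flagged.
print(conditionally_written("Rd = Rb; if (cond) { Rd = Ra; }", {"Ra", "Rb", "Rd"}))
```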

> - Pass by reference operands should also just be flagged as errors,
> since there's no way to know if the operand is read, written, or both.

It's more difficult than that, though, since it's not possible as far
as I know to tell when an operand is passed by reference without a
prototype.
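
One conservative option, sketched below in Python, would be to treat
every operand that appears as a bare call argument as ambiguous, since
the snippet alone does not say whether the callee takes it by value or
by reference. The function name possibly_passed_by_reference and the
keyword list are hypothetical, only innermost (non-nested) argument
lists are examined, and function-style casts are flagged as well.

```python
import re

# Names that are followed by '(' but are not function calls (assumed list).
CONTROL_KEYWORDS = {"if", "while", "for", "switch", "return", "sizeof"}


def possibly_passed_by_reference(snippet, operand_names):
    """Return operands used as a bare argument of something call-like.

    Without the callee's prototype there is no way to tell whether such
    an operand is read, written, or both, so it is only flagged.
    """
    flagged = set()
    for match in re.finditer(r'\b(\w+)\s*\(([^()]*)\)', snippet):
        if match.group(1) in CONTROL_KEYWORDS:
            continue
        for arg in match.group(2).split(','):
            if arg.strip() in operand_names:
                flagged.add(arg.strip())
    return flagged


# Rd is a bare argument and is flagged; "Ra + 1" is an expression and is not.
print(possibly_passed_by_reference("updateFlags(Rd, Ra + 1);", {"Ra", "Rd"}))

# A condition in an if statement is not treated as a call.
print(possibly_passed_by_reference("if (Rd) Rb = 1;", {"Rb", "Rd"}))
```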

Gabe
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
