> Laurence: Thanks for extensive explanation! It's really great! :)

You're welcome.
>
> So to summarize, I have two choices:
>
> 1. An extensive type checking in the actions, i.e., for my case (JIT
> mode, native x86 instructions will be generated by the parser) emit
> the right arithmetic operation by testing the type of the identifier
> in *every* relevant action. If it's an int then use the general
> registers else use the floating point unit registers (st) to
> load/compute/store the operands.

It's not that extensive. The First Law of Computing states that
"there is no free lunch". If an item may have more than one type (but
only one at a time), one either has to test to see what type it has or
store this information somewhere. The latter option will _usually_
(but not necessarily always) be better, because it avoids the use of
run-time type information.

A `struct' or (equivalently) a `class' is, in my opinion, better than a
bare union for this purpose, because a union by itself doesn't provide
a way of recording at run-time which member is the valid one. If you
access a member of a union, you are expected to know which one you
want. There may be some special trick I don't know about, so this is
just to the best of my knowledge. On the other hand, my intuition
about C and C++ is usually pretty reliable (though not infallible).

> 2. Drive bison while doing its parsing by pushing dummy/special tokens
> that would lead the parser to the specialized (per type) rules that
> would emit those specialized native instructions directly w/o checking
> the type (since we already hit the specialized grammar rule).

Sort of, except that you have to test the type of the object referred
to by `variable' or (in your example) `identifier' in order to
determine what token to push onto the stack. A corollary to the First
Law of Computing states that "you can't get something for nothing".

I forgot to mention that semantic values, if any, can be pushed onto
the stack along with the tokens themselves.
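
To make the two options concrete, here is a minimal sketch in C++. It
is only an illustration and is not taken from my parser: the names
`Value', `Type_Tag', `Pending_Token', `token_for', `push_variable' and
`select_add_instruction', as well as the token codes, are all
hypothetical. It assumes the type is stored next to the value, so that
no RTTI is needed, and that the token codes would normally come from
`%token' declarations in the grammar.

```cpp
#include <cassert>
#include <stack>
#include <string>

// Hypothetical type tag, stored alongside the value (no RTTI involved).
enum Type_Tag { INT_TYPE, REAL_TYPE };

// A struct that records which member of the union is currently valid.
struct Value {
    Type_Tag type;
    union {
        long   i;   // valid when type == INT_TYPE
        double d;   // valid when type == REAL_TYPE
    };
};

Value make_int(long i)    { Value v; v.type = INT_TYPE;  v.i = i; return v; }
Value make_real(double d) { Value v; v.type = REAL_TYPE; v.d = d; return v; }

// Hypothetical token codes; a real grammar would declare these with %token.
enum Token_Code { INT_VARIABLE = 258, REAL_VARIABLE = 259 };

// A token together with its semantic value, as kept on the stack.
struct Pending_Token {
    Token_Code code;
    Value      semantic_value;
};

// Option 2: the type still has to be tested, but only once, when
// deciding which token to hand to the parser.
Token_Code token_for(const Value& v) {
    return v.type == INT_TYPE ? INT_VARIABLE : REAL_VARIABLE;
}

void push_variable(std::stack<Pending_Token>& pending, const Value& v) {
    pending.push(Pending_Token{token_for(v), v});
}

// Option 1: test the stored tag inside the action itself. The strings
// are x86 mnemonics used purely as placeholders for emitted code.
std::string select_add_instruction(const Value& v) {
    return v.type == INT_TYPE ? "add" : "fadd";
}
```

Either way the tag is examined exactly once per operand; the only
difference is whether that happens in the action or at the point where
the token is chosen.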

I haven't looked at my own code for this for a while (a sign that it
works), so I don't remember all the details. The stack is a C++
`stack' (from the Standard Template Library) and a data member of an
object of type `Scanner_Type', which is what is passed to `yyparse' and
`yylex' as a parameter (in the form of a `void*'). There's one for
each thread and there's a thread for each pair of calls to `yyparse'
and `yylex'.

In short, if a symbol can have a value of more than one type, you will
have to test for the type somewhere. In my opinion, the easiest and
best way to do this is to store the type somewhere. I strongly
recommend against using RTTI.

> Are these the *only* solutions to solve the issue?

Probably not, but I can't think of any others off the top of my head.
However, the principles will always apply. For a general discussion
of this issue, you might want to try the newsgroup `comp.compilers'.
I haven't posted there in a long time, but you will find people there
who understand the theory of compilers far better than I do. I'm
happy with my solution so I haven't really been moved to look into the
problem more deeply. I'm an applications programmer rather than a
computer scientist and I'm more preoccupied with finding the
intersection figures of conic sections and quadratic curves.

One thing you might want to consider is that C, wonderful language
though it is, has some flaws and ambiguities. For example, the way
signal handlers are declared is not really a thing of beauty. There
can also be some question whether a construction is a declaration or a
function call. I believe I read about this in the GNU Libtool manual.
If you want to write a compiler, there might be a better choice for a
language on which to base the grammar you're defining. Personally, I
find Donald Knuth's MMIX very interesting.

Laurence

_______________________________________________
[email protected]
http://lists.gnu.org/mailman/listinfo/help-bison
