On Sunday, 3 March 2013 at 03:06:15 UTC, Daniel Murphy wrote:
Every single one of these would have to be special-cased. If you had a domain-specific language you could keep track of whether you were mid-declaration, mid-statement, or mid-string-literal. Half the stuff you special-case could probably be applied to other C++ projects as well.

If this works, the benefits are just enormous. In fact, I would actually like to "waste" my time trying to make this work, but I'm going to need to ask a lot of questions because my current programming skills are nowhere near the average level of posters at this forum.

I would like a C++ lexer (with whitespace) to start with. Then a discussion of parsers and emitters. Then a ton of questions just on learning GitHub and other basics.

I would also like the sanction of some of the more experienced people here, saying it's at least worth a go, even if other strategies are simultaneously pursued.

Something like this https://github.com/yebblies/magicport2 ?

Since you're obviously way ahead of me on this, I'm going to go ahead and say everything I've been thinking about this issue.

My approach to translating the source would be more-or-less naive. That is, I would be trying to do simple pattern-matching and replacement as much as possible. I would try to go as far as I could without the scanner knowing any context-sensitive information. When I added a piece of context-sensitive information, I would do so by observing the failures of the naive output, and adding pieces one by one, searching for the most bang for my context-sensitive buck. It would be nice to see upwards of 50 percent of the code conquered by just a few such carefully selected context-sensitive bucks.
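A minimal sketch of that naive first pass might look like the following. Python is used purely for illustration, and the rule table, `NAIVE_RULES`, and `translate_line` are hypothetical names, not part of any existing tool. The three rules rely only on well-known C++/D differences (`->` and `::` both become `.`, and `NULL` becomes `null`).

```python
import re

# Hypothetical rule table: (pattern, replacement) pairs applied in order,
# with no knowledge of the surrounding context.
NAIVE_RULES = [
    (re.compile(r"->"), "."),           # pointer member access
    (re.compile(r"::"), "."),           # scope resolution
    (re.compile(r"\bNULL\b"), "null"),  # null pointer constant
]

def translate_line(line):
    """Apply every naive rule to a single line of C++ source."""
    for pattern, replacement in NAIVE_RULES:
        line = pattern.sub(replacement, line)
    return line

def translate(source):
    """Translate a whole source text line by line."""
    return "\n".join(translate_line(line) for line in source.split("\n"))
```

Because the rules know nothing about context, they also rewrite text inside string literals: `translate_line('printf("a->b");')` yields `printf("a.b");`. That is the kind of failure one would observe in the naive output before deciding which piece of context-sensitive information to add first.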

Eventually these simple additions would reach the point of diminishing returns. At that point it would be useful to have a language whose gains came not from directly improving its ability to transform dmd code, but from the ease and flexibility with which one could add increasingly obscure and detailed special cases to it. I don't know how to set up that language or its data structures, but I can tell you what I'd like to be able to do with it.

I would like to be able to query which function I am in, which class I am assembling, etc. I would like to be able to take a given piece of text and say exactly what text should replace it, so that complex macros could be rewritten to their equivalent static pure D functions. In other words, when push comes to shove, I want to be able to brute-force a particularly hard substitution with direct access to the context-sensitive data structure. For example, suppose I know that some strange macro peculiarities of a function add an extra '}' brace which is not read by C++ but is picked up by the naive nesting '{}' tracker, which botches up its 'nestedBraceLevel' variable. It would be necessary to be able to say:

if (currentFunction == "oneIKnowToBeMessedUp" &&
    currentLine >= funcList.oneIKnowToBeMessedUp.startingLine + 50)
{
    --nestedBraceLevel;
}
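One way such an override could hang off the context-sensitive data structure is sketched below, in Python for illustration only. Everything here is an assumption: the `Context` class, the crude `FUNC_HEADER` pattern, and the override hook are hypothetical. The sample also uses a stray '{' inside a string literal, rather than a macro, as the thing that fools the naive tracker.

```python
import re

# Crude, hypothetical guess at a function header: "name(args)" at the end of
# a line, optionally followed by "{". Only consulted at brace level 0.
FUNC_HEADER = re.compile(r"(\w+)\s*\([^;{}]*\)\s*\{?\s*$")

class Context:
    """Naive context tracker with hooks for hand-written special cases."""

    def __init__(self):
        self.nested_brace_level = 0
        self.current_function = None
        self.function_start_line = None
        self.overrides = []  # callables taking (context, line_number)

    def feed(self, line_number, line):
        if self.nested_brace_level == 0:
            match = FUNC_HEADER.search(line)
            if match:
                self.current_function = match.group(1)
                self.function_start_line = line_number
        # Naive counting: braces in strings and comments are counted too.
        self.nested_brace_level += line.count("{") - line.count("}")
        for override in self.overrides:
            override(self, line_number)
        if self.nested_brace_level == 0 and "}" in line:
            self.current_function = None  # left the function body

def fix_messed_up(ctx, line_number):
    """Special case: undo the brace miscounted two lines into one function."""
    if (ctx.current_function == "oneIKnowToBeMessedUp"
            and line_number == ctx.function_start_line + 2):
        ctx.nested_brace_level -= 1

def brace_levels(source, overrides=()):
    """Return the tracked brace level after each line of the source."""
    ctx = Context()
    ctx.overrides.extend(overrides)
    levels = []
    for number, line in enumerate(source.split("\n"), start=1):
        ctx.feed(number, line)
        levels.append(ctx.nested_brace_level)
    return levels

SAMPLE = 'void oneIKnowToBeMessedUp()\n{\n    printf("{");\n    return;\n}\nint next;'
```

With no overrides, `brace_levels(SAMPLE)` never returns to level 0 after the function ends. Passing `[fix_messed_up]` corrects the count from the offending line onward. Keeping overrides as plain callables means each new special case is a few lines that read much like the condition above.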

My founding principle is Keep It Simple, Stupid. I don't know if it's the best way to start, but barring expert advice steering me away from it, it would be the best for someone like me who has no experience and needs to learn from the ground up what works and what doesn't.

Another advantage of the domain-specific language as described above would be its reusability for whatever transformations are common in C++, say transforming 'strcmp(a,b)' -> 'a == b', and its possible use for adding special cases when translating from one language to another generally. I don't know the difference between what I'm describing and a basic macro text processing language - they might be the same.
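As a rough sketch of one such reusable rule, the `strcmp` rewrite could be expressed as a pair of regular expressions. This is Python for illustration only; `rewrite_strcmp` and the patterns are hypothetical. It assumes both arguments are simple names or member chains, and that they have already become D strings, since `==` on raw `char*` values in D compares pointers rather than contents.

```python
import re

# A simple argument: an identifier, optionally followed by .member / ->member.
ARG = r"([A-Za-z_]\w*(?:(?:\.|->)[A-Za-z_]\w*)*)"
CALL = r"\bstrcmp\(\s*" + ARG + r"\s*,\s*" + ARG + r"\s*\)"

EQ_ZERO = re.compile(CALL + r"\s*(==|!=)\s*0\b")  # strcmp(a, b) == 0 / != 0
NEGATED = re.compile(r"!\s*" + CALL)              # !strcmp(a, b)

def rewrite_strcmp(line):
    """Rewrite equality-style strcmp tests; leave ordering tests untouched."""
    line = EQ_ZERO.sub(lambda m: f"{m.group(1)} {m.group(3)} {m.group(2)}", line)
    line = NEGATED.sub(lambda m: f"{m.group(1)} == {m.group(2)}", line)
    return line
```

For instance, `rewrite_strcmp("if (strcmp(a, b) == 0) return;")` gives `if (a == b) return;`, while `strcmp(a, b) < 0` is left alone because it has no direct `==` equivalent. The negated form is emitted without parentheses, so an unusual expression such as `!strcmp(a, b) == flag` would need its own special case.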

My last thought is probably well-trodden ground, but the translation program should have import dependency charts for its target program, and automate imports on a per-symbol basis, so it lays out the total file in two steps, producing selective imports such as:

import std.array : front, array;
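A hedged sketch of that two-step layout, in Python for illustration: translate the body first, then look up which charted symbols it uses and emit one selective import per module. The chart `SYMBOL_MODULES` and both function names are hypothetical. A real chart would have to be built from the target program, and matching on bare identifiers will pick up false positives such as a local variable that happens to be named `array`.

```python
import re

# Hypothetical import dependency chart: symbol -> module that provides it.
SYMBOL_MODULES = {
    "front": "std.array",
    "array": "std.array",
    "writeln": "std.stdio",
}

IDENTIFIER = re.compile(r"[A-Za-z_]\w*")

def selective_imports(body, symbol_modules=SYMBOL_MODULES):
    """Step 2: derive per-symbol import lines from an already translated body."""
    used = set(IDENTIFIER.findall(body))
    by_module = {}
    for symbol, module in symbol_modules.items():
        if symbol in used:
            by_module.setdefault(module, []).append(symbol)
    return [
        f"import {module} : {', '.join(sorted(symbols))};"
        for module, symbols in sorted(by_module.items())
    ]

def lay_out_file(body):
    """Put the generated imports above the translated body."""
    imports = selective_imports(body)
    if not imports:
        return body
    return "\n".join(imports) + "\n\n" + body
```

Given the body `auto xs = array(r);` followed by `writeln(xs.front);`, this produces `import std.array : array, front;` and `import std.stdio : writeln;`. Symbols are sorted so the output is stable from run to run.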

One thing I'm specifically avoiding in this proposal is a sophisticated awareness of the C++ grammar. I'm hoping special cases cover whatever ground might be more perfectly trod by a totally grammar-aware conversion mechanism.

Now you're as up-to-date as I am on what I'm thinking.
