On 26.08.2011 03:08, dsimcha wrote:
> I'm working on a parallel array ops implementation for
> std.parallel_algorithm. (For the latest work in progress see
> https://github.com/dsimcha/parallel_algorithm/blob/master/parallel_algorithm.d
> ).
>
> To make it (somewhat) pretty, I need to be able to tokenize a single
> statement worth of D source code at compile time. Right now, the syntax
> requires manual tokenization:
>
> mixin(parallelArrayOp(
> "lhs[]", "=", "op1[]", "*", "op2[]", "/", "op3[]"
> ));
>
> where lhs, op1, op2, op3 are arrays.
>
> I'd like it to be something like:
>
> mixin(parallelArrayOp(
> "lhs[] = op1[] * op2[] / op3[]"
> ));
>
> Does anyone have/is there any easy way to write a compile-time D tokenizer?

The lexer used by Visual D is also CTFE capable:

http://www.dsource.org/projects/visuald/browser/trunk/vdc/lexer.d

As Timon pointed out, it splits the input into individual D tokens, not the coarser combined elements (such as "op1[]") that you pass in your array.

Here's my small CTFE test:

///////////////////////////////////////////////////////////////////////
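// Returns the token IDs of all tokens in s, skipping comments and whitespace.
int[] ctfeLexer(string s)
{
        Lexer lex;
        int state;      // scanner state carried from one scan() call to the next
        uint pos;       // current position in s, advanced by scan()

        int[] ids;
        while(pos < s.length)
        {
                uint prevpos = pos;
                int id;
                int type = lex.scan(state, s, pos, id);
                assert(prevpos < pos);  // every scan must consume input
                // s[prevpos .. pos] is the text of the token just scanned
                if(!Lexer.isCommentOrSpace(type, s[prevpos .. pos]))
                        ids ~= id;
        }
        return ids;
}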

unittest
{
        static assert(ctfeLexer(q{int /* comment to skip */ a;}) ==
                [ TOK_int, TOK_Identifier, TOK_semicolon ]);
}

If you want the tokens as strings rather than just the token IDs, you can collect "s[prevpos .. pos]" instead of "id" into a string[] array.
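
If you would rather get the coarser elements directly ("op1[]" as a single piece) without pulling in a full lexer, a small hand-written scanner might be enough. The following is only a minimal sketch, not taken from Visual D or std.parallel_algorithm. It assumes that operands are plain identifiers optionally followed by "[]", and that any other run of non-blank characters is an operator. The name splitArrayOp is hypothetical.

```d
// Hypothetical helper: splits an array-op statement into operands and
// operators at compile time. Assumes operands are identifiers optionally
// followed by "[]"; this is not a real D tokenizer.
string[] splitArrayOp(string s)
{
    // True for characters that may appear in an identifier.
    static bool isIdentChar(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9') || c == '_';
    }

    string[] parts;
    size_t pos = 0;
    while (pos < s.length)
    {
        if (s[pos] == ' ' || s[pos] == '\t')
        {
            ++pos;      // skip blanks between elements
            continue;
        }

        immutable start = pos;
        if (isIdentChar(s[pos]))
        {
            while (pos < s.length && isIdentChar(s[pos]))
                ++pos;
            // Glue a directly following "[]" onto the operand.
            if (pos + 1 < s.length && s[pos] == '[' && s[pos + 1] == ']')
                pos += 2;
        }
        else
        {
            // Everything else is treated as a run of operator characters.
            while (pos < s.length && s[pos] != ' ' && s[pos] != '\t'
                   && !isIdentChar(s[pos]))
                ++pos;
        }
        parts ~= s[start .. pos];
    }
    return parts;
}

unittest
{
    // Evaluated by CTFE, so the result could be fed to a string mixin.
    static assert(splitArrayOp("lhs[] = op1[] * op2[] / op3[]") ==
        [ "lhs[]", "=", "op1[]", "*", "op2[]", "/", "op3[]" ]);
}
```

Note that adjacent operator characters are glued together (so "*-" comes out as one element), and comments or string literals are not handled at all. For anything beyond simple expressions the real lexer is the safer choice.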
