On 26.08.2011 03:08, dsimcha wrote:
> I'm working on a parallel array ops implementation for
> std.parallel_algorithm. (For the latest work in progress see
> https://github.com/dsimcha/parallel_algorithm/blob/master/parallel_algorithm.d
> ).
> To make it (somewhat) pretty, I need to be able to tokenize a single
> statement worth of D source code at compile time. Right now, the syntax
> requires manual tokenization:
>
>     mixin(parallelArrayOp(
>         "lhs[]", "=", "op1[]", "*", "op2[]", "/", "op3[]"
>     ));
>
> where lhs, op1, op2, op3 are arrays.
> I'd like it to be something like:
>
>     mixin(parallelArrayOp(
>         "lhs[] = op1[] * op2[] / op3[]"
>     ));
>
> Does anyone have/is there any easy way to write a compile-time D tokenizer?

The lexer used by Visual D is also CTFE capable:
http://www.dsource.org/projects/visuald/browser/trunk/vdc/lexer.d
As Timon pointed out, it splits the input into individual D tokens, so
"lhs[]" comes out as three tokens (lhs, [ and ]) rather than as one of
the combined elements in your array.
Here's my small CTFE test:
///////////////////////////////////////////////////////////////////////
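// Requires the Lexer type and the TOK_* constants from vdc/lexer.d.
int[] ctfeLexer(string s)
{
    Lexer lex;
    int state;   // lexer state carried across scan() calls
    uint pos;    // current offset into s
    int[] ids;
    while(pos < s.length)
    {
        uint prevpos = pos;
        int id;
        // scan() moves pos past the next token and stores its ID in id
        int type = lex.scan(state, s, pos, id);
        assert(prevpos < pos); // every call must consume some input
        if(!Lexer.isCommentOrSpace(type, s[prevpos .. pos]))
            ids ~= id;
    }
    return ids;
}

unittest
{
    static assert(ctfeLexer(q{int /* comment to skip */ a;}) ==
                  [ TOK_int, TOK_Identifier, TOK_semicolon ]);
}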
If you want the tokens as strings rather than just the token IDs, you can
collect "s[prevpos .. pos]" instead of "id", with the result array
declared as string[] instead of int[].
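
If what you actually need are the combined elements from your array
("lhs[]" as one piece), a full D lexer may be more than necessary. Below is
a minimal sketch of a hand-rolled splitter that runs in CTFE. The names
splitArrayOp and isOperatorChar are made up for this example and are not
part of Visual D or std.parallel_algorithm. It assumes operands contain no
whitespace or operator characters and that every operator is a single
character, so it will not cope with things like "+=" or "a[i + 1]".

```d
// Hypothetical helper: true for the single-character operators handled here.
bool isOperatorChar(char c)
{
    return c == '=' || c == '+' || c == '-' || c == '*' || c == '/';
}

// Hypothetical helper: splits a simple array-op statement into operands
// and single-character operators. Usable at compile time (CTFE).
string[] splitArrayOp(string s)
{
    string[] parts;
    size_t i = 0;
    while (i < s.length)
    {
        immutable c = s[i];
        if (c == ' ' || c == '\t')
        {
            ++i; // skip whitespace between elements
        }
        else if (isOperatorChar(c))
        {
            parts ~= s[i .. i + 1]; // each operator is its own element
            ++i;
        }
        else
        {
            // An operand runs until whitespace or an operator,
            // so "op1[]" stays in one piece.
            immutable start = i;
            while (i < s.length && s[i] != ' ' && s[i] != '\t'
                   && !isOperatorChar(s[i]))
                ++i;
            parts ~= s[start .. i];
        }
    }
    return parts;
}

// Evaluated by the compiler, which shows the function works in CTFE.
static assert(splitArrayOp("lhs[] = op1[] * op2[] / op3[]") ==
    [ "lhs[]", "=", "op1[]", "*", "op2[]", "/", "op3[]" ]);
```

The result has the same shape as the manually tokenized argument list, so
in principle it could be fed to whatever currently consumes that list. For
anything beyond such simple statements, the real lexer is the safer choice.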