On Thursday, 16 June 2016 at 17:20:39 UTC, cy wrote:
On Wednesday, 15 June 2016 at 07:16:31 UTC, Basile B. wrote:
You're right, it's not so simple, and you're also right about
"everything": my "everything" is not used adequately...
Sorry, I don't mean to complain. Actually the work has already
all been done, rather elegantly in fact. If libdparse can get
through a significant subset of D2 code, I have to say I'm
pretty impressed with the project, and can't praise it enough.
https://github.com/Hackerpilot/libdparse // disclaimer: this
link not endorsed by the hackerpilot org ltd
It already has a D formatter in it, which dumps (prettified!) D
code to any sort of output range, and there's a case in it for
every single kind of node in the AST.
Yes, libdparse is the reference, and anyone who has to parse D
code really should use it. Among all the D libraries it's the
one I know best. I use it to build CE's symbol list (it's
an AST visitor) and to detect the "TODO comments" ;)
But sometimes it's too much (I speak for myself here), for example
if you need to parse only simple constructs. In CE the **only**
constructs that are parsed directly in the IDE (the two other
cases mentioned previously are done in external tools) are
ModuleDeclaration and VersionCondition. For them libdparse is not
mandatory; they can be detected by hand in the token list.
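To give an idea of what "by hand in the token list" can look like, here is a rough sketch. It is written in Python for brevity and uses a made-up toy lexer, not libdparse's lexer and not the code used in CE; the names `tokenize`, `module_name` and `version_conditions` are invented for the illustration.

```python
import re

# Toy lexer (an assumption for this sketch): identifiers, numbers, single
# punctuation characters, and // or /* */ comments. String literals and
# nested /+ +/ comments are NOT handled.
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)   # line and block comments
  | (?P<ident>[A-Za-z_]\w*)           # identifiers and keywords
  | (?P<number>\d+)                   # integer literals
  | (?P<punct>[^\s\w])                # any other single character
""", re.VERBOSE | re.DOTALL)


def tokenize(src):
    """Return (kind, text) pairs with the comments already dropped."""
    return [(m.lastgroup, m.group())
            for m in TOKEN_RE.finditer(src)
            if m.lastgroup != "comment"]


def module_name(tokens):
    """Return the dotted name from `module a.b.c;`, or None if absent."""
    for i, (kind, text) in enumerate(tokens):
        if (kind, text) == ("ident", "module"):
            parts = []
            for _, following in tokens[i + 1:]:
                if following == ";":
                    return "".join(parts)
                parts.append(following)
    return None


def version_conditions(tokens):
    """Return X for every `version ( X )` run of four tokens."""
    texts = [text for _, text in tokens]
    return [texts[i + 2]
            for i in range(len(texts) - 3)
            if texts[i] == "version"
            and texts[i + 1] == "("
            and texts[i + 3] == ")"]
```

Because the comments are dropped before the scan, a commented-out `module fake;` is not picked up, and `version = Foo;` is skipped because it has no opening parenthesis.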
(speaking of which, when are we getting static switch
statements?)
What I meant by "D is not simple" isn't that I'm up a creek,
without a paddle, but that the paddle is really complex, and
I'd have no hope of tackling it if it wasn't already done. The
complexity of D's syntax is not so much a problem here, as a
spectacle.
It depends on the grammatical construct you want to parse. But
it's already much simpler once the comments are removed
from the lexical token list.
I suppose. What's complicated is the shoving of expressions
everywhere, since those spider out to all possible forms of
construct. That means the difficulty of parsing does NOT depend
on the grammatical construct you want to parse, except for a
few, very minor constructs, only the ones that don't even
*potentially* include expressions in their grammar.
So, regardless of what you're doing, you pretty much have to
handle every single kind of construct,
No, simple constructs can be detected in a token list. But if I
understand correctly you started the topic because you wished
to detect functionDeclaration, right?
Obviously here you need the AST. Function declarations can be
disabled via a versionCondition or enabled by a static if,
injected by a mixin template, injected by a string... They cannot
be accurately detected by picking 4 or 5 tokens in a list.
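A small sketch of why token picking falls short here. It is in Python, works on a hand-written list of token strings, and the scanner `naive_function_names` is a deliberately naive helper invented for this example, not something from libdparse.

```python
def naive_function_names(tokens):
    """Report NAME for every `IDENT NAME (` triple, with no context at all."""
    return [tokens[i + 1]
            for i in range(len(tokens) - 2)
            if tokens[i].isidentifier()
            and tokens[i + 1].isidentifier()
            and tokens[i + 2] == "("]


# Hand-written tokens for this D source:
#   version (none) { void disabled() {} }
#   mixin("void injected() {}");
#   void real() {}
TOKENS = [
    "version", "(", "none", ")", "{",
    "void", "disabled", "(", ")", "{", "}", "}",
    "mixin", "(", '"void injected() {}"', ")", ";",
    "void", "real", "(", ")", "{", "}",
]

found = naive_function_names(TOKENS)
# found == ["disabled", "real"]
```

The scan reports `disabled`, which is compiled out by `version (none)`, and never sees `injected`, which only exists once the string mixin is expanded. Getting both cases right needs the surrounding structure, that is, the AST and some semantic knowledge.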
but if "handle" means "transform, then output" and you can
separate those two steps, then if someone does all the output
for you, the "transform" step can be very simple and specific.
Not because you can remove the comment nodes, but because you
can ignore ALL nodes that you're not interested in transforming.
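The "transform, then output" split can be sketched roughly like this. It is a toy in Python with an invented `Node` type and an invented `Formatter` class standing in for a ready-made formatter; it does not reproduce libdparse's actual formatter or AST.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


class Formatter:
    """Stand-in for a formatter that already knows how to output every node."""

    def format(self, node):
        # Dispatch on the node kind, falling back to the generic output.
        handler = getattr(self, "format_" + node.kind, self.format_default)
        return handler(node)

    def format_default(self, node):
        return node.text + "".join(self.format(c) for c in node.children)


class PrefixFunctionNames(Formatter):
    """The whole 'transform' step: one override, all other nodes ignored."""

    def format_function_name(self, node):
        return "my_" + node.text


tree = Node("function", children=[
    Node("type", "void "),
    Node("function_name", "foo"),
    Node("params", "()"),
    Node("body", " { bar(); }"),
])

plain = Formatter().format(tree)               # "void foo() { bar(); }"
renamed = PrefixFunctionNames().format(tree)   # "void my_foo() { bar(); }"
```

The subclass never mentions types, parameters or bodies; they fall through to the inherited output code, which is the point made above about ignoring every node you are not transforming.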