Re: Parsing D Maybe Not Such a Good Idea <_<;

Basile B. via Digitalmars-d Fri, 17 Jun 2016 01:11:41 -0700

On Thursday, 16 June 2016 at 17:20:39 UTC, cy wrote:

On Wednesday, 15 June 2016 at 07:16:31 UTC, Basile B. wrote:
You're right it's not so simple and you're also right about "everything", my "everything" is not used adequately...
Sorry, I don't mean to complain. Actually the work has already all been done, rather elegantly in fact. If libdparse can get through a significant subset of D2 code, I have to say I'm pretty impressed with the project, and can't praise it enough.
https://github.com/Hackerpilot/libdparse // disclaimer: this link not endorsed by the hackerpilot org ltd
It already has a D formatter in it, which dumps (prettified!) D code to any sort of output range, and there's a case in it for every single kind of node in the AST.

Yes, libdparse is the reference, and anyone who has to parse D code really should use it. Among all the D libraries it's the one I know best. I use it to build CE's symbol list (it's an AST visitor) and to detect the "TODO comments" ;)

But sometimes it's too much (I speak for myself here), for example if you need to parse only simple constructs. In CE the **only** constructs that are parsed directly in the IDE (the two other cases mentioned previously are done in external tools) are ModuleDeclaration and VersionCondition. For them libdparse is not mandatory; they can be detected by hand in the token list.
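
To make "detected by hand in the token list" concrete, here is a minimal sketch. It is not CE's actual code: `simpleTokens`, `moduleName` and `versionConditions` are hypothetical helpers, the tokenizer is a toy stand-in for a real D lexer, and it assumes comments and string literals have already been stripped from the input.

```d
import std.ascii : isAlphaNum, isWhite;

/// Toy tokenizer: a run of [A-Za-z0-9_] is one token, every other
/// non-blank character is a one-character token.
/// Assumption: comments and string literals were removed beforehand.
string[] simpleTokens(string src)
{
    string[] toks;
    size_t i = 0;
    while (i < src.length)
    {
        immutable char c = src[i];
        if (isWhite(c)) { ++i; continue; }
        if (isAlphaNum(c) || c == '_')
        {
            size_t j = i;
            while (j < src.length && (isAlphaNum(src[j]) || src[j] == '_'))
                ++j;
            toks ~= src[i .. j];
            i = j;
        }
        else
        {
            toks ~= src[i .. i + 1];
            ++i;
        }
    }
    return toks;
}

/// Returns the qualified name from `module a.b.c;`, or null if the
/// token list contains no module declaration.
string moduleName(string[] toks)
{
    foreach (i, t; toks)
    {
        if (t != "module") continue;
        string name;
        foreach (part; toks[i + 1 .. $])
        {
            if (part == ";") return name;
            name ~= part;
        }
    }
    return null;
}

/// Collects X from every `version ( X )` sequence. A version
/// specification such as `version = X;` has no parenthesis and is skipped.
string[] versionConditions(string[] toks)
{
    string[] found;
    foreach (i; 0 .. toks.length)
    {
        if (i + 3 < toks.length && toks[i] == "version"
            && toks[i + 1] == "(" && toks[i + 3] == ")")
            found ~= toks[i + 2];
    }
    return found;
}

unittest
{
    auto toks = simpleTokens(
        "module app.core; version(Windows) int a; else version (linux) int b;");
    assert(moduleName(toks) == "app.core");
    assert(versionConditions(toks) == ["Windows", "linux"]);
}
```

Both scans only look at a fixed, short window of tokens, which is why these two constructs are cheap to pick out without building an AST.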

(speaking of which, when are we getting static switch statements?)
What I meant by "D is not simple" isn't that I'm up a creek, without a paddle, but that the paddle is really complex, and I'd have no hope of tackling it if it wasn't already done. The complexity of D's syntax is not so much a problem here, as a spectacle.
It depends on the grammatical construct you want to parse. But it's already much more simple when the comments are removed from the lexical token list.
I suppose. What's complicated is the shoving of expressions everywhere, since those spider out to all possible forms of construct. That means the difficulty of parsing does NOT depend on the grammatical construct you want to parse, except for a few, very minor constructs, only the ones that don't even *potentially* include expressions in their grammar.
So, regardless of what you're doing, you pretty much have to handle every single kind of construct,

No, simple constructs can be detected in a token list. But if I understand correctly you started the topic because you wished to detect functionDeclaration, right? Obviously here you need the AST. Function declarations can be disabled via a versionCondition or enabled by a static if, injected by a mixin template, injected by a string... They cannot be accurately detected by picking 4 or 5 tokens in a list.
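
A small illustration of the cases listed above. All names are made up for the example, and it assumes the module is compiled without `-version=Legacy`. A scan for "type, identifier, parenthesis" would report `legacy` although it is not compiled in, and would only see `fromString` as the contents of a string literal.

```d
// Compiled in only with -version=Legacy; the tokens "int legacy ( )"
// are present in the source either way.
version (Legacy)
    int legacy() { return 1; }

// Enabled by a compile-time condition.
enum hasFast = true;
static if (hasFast)
    int fast() { return 2; }

// Injected by a string mixin: the declaration exists only as a string
// literal until the compiler expands it.
mixin("int fromString() { return 3; }");

// Injected by a mixin template: the declaration is written once and
// appears wherever the template is mixed in.
mixin template Getter(int n)
{
    int getter() { return n; }
}
mixin Getter!4;

unittest
{
    // Holds as long as -version=Legacy is not passed to the compiler.
    static assert(!__traits(compiles, legacy()));
    assert(fast() == 2);
    assert(fromString() == 3);
    assert(getter() == 4);
}
```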

but if "handle" means "transform, then output" and you can separate those two steps, then if someone does all the output for you, the "transform" step can be very simple and specific. Not because you can remove the comment nodes, but because you can ignore ALL nodes that you're not interested in transforming.
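
The split described above can be sketched with a miniature visitor. This is not libdparse's API: `Node`, `Ident`, `Call`, `Block`, `Visitor`, `Renamer` and `Printer` are hypothetical stand-ins for a real AST, a transform pass, and a formatter. The point is that the base visitor walks every node by default, so the transform only overrides the one node kind it cares about, while the output step is written once and handles everything.

```d
// Hypothetical miniature AST, standing in for a real parser's node types.
abstract class Node
{
    abstract void accept(Visitor v);
}

final class Ident : Node
{
    string name;
    this(string name) { this.name = name; }
    override void accept(Visitor v) { v.visit(this); }
}

final class Call : Node
{
    Ident callee;
    Node[] args;
    this(Ident callee, Node[] args...)
    {
        this.callee = callee;
        this.args = args.dup; // variadic slices may live on the stack
    }
    override void accept(Visitor v) { v.visit(this); }
}

final class Block : Node
{
    Node[] stmts;
    this(Node[] stmts...) { this.stmts = stmts.dup; }
    override void accept(Visitor v) { v.visit(this); }
}

/// Base visitor: the default for every node kind is "walk the children".
class Visitor
{
    void visit(Ident n) {}
    void visit(Call n)
    {
        n.callee.accept(this);
        foreach (a; n.args) a.accept(this);
    }
    void visit(Block n) { foreach (s; n.stmts) s.accept(this); }
}

/// Transform step: overrides a single node kind, ignores all the others.
final class Renamer : Visitor
{
    private string from, to;
    this(string from, string to) { this.from = from; this.to = to; }
    alias visit = Visitor.visit; // keep the inherited overloads visible
    override void visit(Ident n) { if (n.name == from) n.name = to; }
}

/// Output step: has a case for every node kind, written once.
final class Printer : Visitor
{
    string text;
    override void visit(Ident n) { text ~= n.name; }
    override void visit(Call n)
    {
        n.callee.accept(this);
        text ~= "(";
        foreach (i, a; n.args)
        {
            if (i) text ~= ", ";
            a.accept(this);
        }
        text ~= ")";
    }
    override void visit(Block n)
    {
        foreach (s; n.stmts) { s.accept(this); text ~= ";\n"; }
    }
}

unittest
{
    auto tree = new Block(
        new Call(new Ident("foo"), new Ident("x"), new Ident("foo")));
    tree.accept(new Renamer("foo", "bar"));
    auto printer = new Printer;
    tree.accept(printer);
    assert(printer.text == "bar(x, bar);\n");
}
```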
