Currently we use the same API to parse and compile packages as we do for the IDE. This is proving quite heavy. I propose that we add a separate API for IDE integration that is fine-grained and metadata driven. I believe Eclipse has extensive metadata capabilities, which it stores on disk -- we should try to leverage this, although the core API should remain independent so that other IDEs can use it too.

1) Split the main document up into package, imports, globals, functions and rules -- but do not parse contents.

a. Maybe we will build up the metadata for imports and globals at this stage, as I imagine that's easy.

b. Record, as metadata, the start and end line numbers, i.e. the range, for all sections.
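To make step 1 concrete, here is a minimal sketch (in modern Java, purely illustrative -- the class and field names are invented, not an existing Drools API) of a coarse first pass that records section ranges as metadata without parsing section contents:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of step 1: split the document into top-level sections
// by keyword, recording only start/end line ranges -- contents stay unparsed.
public class SectionIndexer {
    public static class Section {
        public final String kind;   // "package", "import", "global", "function", "rule"
        public final int startLine; // 1-based, inclusive
        public int endLine;         // 1-based, inclusive
        Section(String kind, int startLine) { this.kind = kind; this.startLine = startLine; }
    }

    static final String[] KEYWORDS = {"package", "import", "global", "function", "rule"};

    public static List<Section> index(String[] lines) {
        List<Section> sections = new ArrayList<>();
        Section current = null;
        for (int i = 0; i < lines.length; i++) {
            String trimmed = lines[i].trim();
            for (String kw : KEYWORDS) {
                if (trimmed.startsWith(kw + " ")) {
                    // previous section runs up to the line above this keyword
                    if (current != null) current.endLine = i;
                    current = new Section(kw, i + 1);
                    sections.add(current);
                    break;
                }
            }
        }
        if (current != null) current.endLine = lines.length;
        return sections;
    }
}
```

A real pass would of course be driven by the grammar rather than `startsWith`, but the shape of the metadata -- kind plus line range -- is the point.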

2) Tokenise all expressions and blocks -- do not lose line numbers though, so we will need to pad.

a. Record, as metadata, the start and end line/col numbers, i.e. the range, for all expressions and blocks.
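A sketch of the step 2 idea -- again illustrative names only; a real implementation would ride on the ANTLR token stream rather than hand-splitting on whitespace:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of step 2: a trivial tokeniser that keeps 1-based line/column
// ranges as metadata, so source positions survive tokenisation.
public class RangedTokenizer {
    public static class Token {
        public final String text;
        public final int line, startCol, endCol; // all 1-based, inclusive
        Token(String text, int line, int startCol, int endCol) {
            this.text = text; this.line = line;
            this.startCol = startCol; this.endCol = endCol;
        }
    }

    public static List<Token> tokenize(String[] lines) {
        List<Token> tokens = new ArrayList<>();
        for (int ln = 0; ln < lines.length; ln++) {
            String line = lines[ln];
            int i = 0;
            while (i < line.length()) {
                if (Character.isWhitespace(line.charAt(i))) { i++; continue; }
                int start = i;
                while (i < line.length() && !Character.isWhitespace(line.charAt(i))) i++;
                tokens.add(new Token(line.substring(start, i), ln + 1, start + 1, i));
            }
        }
        return tokens;
    }
}
```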

3) Parse each Rule

a. Validate Conditional Element and Field Constraint structure.

b. Validate Columns and Fields -- record this as metadata, so we know the dependent classes and fields for this rule.

c. Record Column and Field bindings as metadata -- as used by expressions.

d. Validate operators and the RHS value type, checking that it is valid for the LHS and operator.
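Step 3d might look something like the sketch below. The field types would come from the Column metadata; the class and method names are invented for illustration, not an existing Drools API:

```java
// Sketch of step 3d: check an RHS literal against the LHS field type and
// the operator. Deliberately tiny -- just enough to show the shape.
public class ConstraintValidator {
    static Class<?> boxed(Class<?> t) {
        if (t == int.class) return Integer.class;
        if (t == long.class) return Long.class;
        if (t == double.class) return Double.class;
        if (t == boolean.class) return Boolean.class;
        return t;
    }

    public static boolean isValid(Class<?> fieldType, String operator, Object rhs) {
        switch (operator) {
            case "==":
            case "!=":
                return rhs == null || boxed(fieldType).isInstance(rhs);
            case "<":
            case ">":
            case "<=":
            case ">=":
                // relational operators require a numeric field and a numeric literal
                return Number.class.isAssignableFrom(boxed(fieldType)) && rhs instanceof Number;
            default:
                return false; // unknown operator, as far as this sketch is concerned
        }
    }
}
```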

4) Determine the required declarations for each expression and record them as metadata.

a. If we can, it might be nice to also determine dependent classes from the imports and globals, beyond declarations, and record them as metadata.
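A minimal sketch of step 4, assuming the bindings are already known from step 3c -- a real version would walk the ANTLR token stream rather than using a regex:

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of step 4: scan an expression for identifiers and intersect them
// with the known bindings, yielding the declarations the expression needs.
public class DeclarationScanner {
    private static final Pattern IDENT = Pattern.compile("[$A-Za-z_][$A-Za-z0-9_]*");

    public static Set<String> requiredDeclarations(String expression, Set<String> knownBindings) {
        Set<String> required = new LinkedHashSet<>();
        Matcher m = IDENT.matcher(expression);
        while (m.find()) {
            if (knownBindings.contains(m.group())) {
                required.add(m.group());
            }
        }
        return required;
    }
}
```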

5) Compile each expression and block using a helper util, record the errors as metadata, and then forget the compiled .class.
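As a sketch of the compile-and-forget helper in step 5, the JDK's `javax.tools` API can stand in for whatever compiler helper we actually wire in -- the point is only that we keep the diagnostics and discard the classes:

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import javax.tools.*;

// Sketch of step 5: compile a snippet purely to harvest error messages,
// then throw the compiled classes away. javax.tools is illustrative here,
// not a statement about which compiler the real helper should use.
public class ThrowawayCompiler {
    static class StringSource extends SimpleJavaFileObject {
        final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    /** Returns the diagnostics only; the compiled .class output is discarded. */
    public static List<Diagnostic<? extends JavaFileObject>> check(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        // send .class output to the temp dir so nothing of it is kept around
        List<String> options = Arrays.asList("-d", System.getProperty("java.io.tmpdir"));
        compiler.getTask(null, null, diagnostics, options, null,
                Arrays.asList(new StringSource(className, code))).call();
        return diagnostics.getDiagnostics();
    }
}
```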

6) Develop intelligent balanced-text recovery. When scanning in steps 1) and 2) we need to check for balanced text; if we detect incorrect balancing we mark that section as invalid and find the start of the next valid section -- nothing inside the invalid section will be parsed.

a. i.e. if we have an invalid expression, say an incorrect number of brackets on the LHS of the rule, we try to recover to the next valid area -- ideally the next valid conditional element, but that may be hard, so it could be the RHS of the rule. Start simple and coarse; intelligent fine-grained recovery can be added later.
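The coarse version of step 6 can be sketched as a bracket-balance check plus a skip-to-next-rule recovery -- again, illustrative names only:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of step 6: if a section's brackets don't balance, mark it invalid
// and skip ahead to the next line that opens a rule -- nothing inside the
// invalid section gets parsed.
public class BalanceScanner {
    public static boolean isBalanced(String text) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '(': case '[': case '{': stack.push(c); break;
                case ')': if (stack.isEmpty() || stack.pop() != '(') return false; break;
                case ']': if (stack.isEmpty() || stack.pop() != '[') return false; break;
                case '}': if (stack.isEmpty() || stack.pop() != '{') return false; break;
            }
        }
        return stack.isEmpty();
    }

    /** Index of the next line starting a rule at or after 'from', else lines.length. */
    public static int recoverTo(String[] lines, int from) {
        for (int i = from; i < lines.length; i++) {
            if (lines[i].trim().startsWith("rule ")) return i;
        }
        return lines.length;
    }
}
```

Note this deliberately ignores brackets inside string literals and comments -- a real scanner would have to track those too.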

7) Intelligent re-parsing for project-wide changes.

a. We have class dependencies in the metadata for the various sections -- and also the fields used in constraints. So we can determine errors from the metadata, without having to reparse.

b. In the case of expressions we can use the metadata to avoid re-compiling. If the changes are too dramatic then we can recompile the expression/consequence. We have the ranges in metadata for the dependent sections, so we can rescan to pick up the expression without having to parse the entire document.

c. When editing a document we only recompile an expression/consequence if the user edits it -- again, we should know that we are in an expression; since we know where it starts, we scan from the start of the expression/block to the end and compile, avoiding a reparse of the entire document or even the rule. If bindings are changed we can also determine the dependent expressions and recompile them to get errors.
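The dependency side of the step 7 metadata can be sketched as a simple inverted index (class names and rule names are illustrative):

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of step 7: each rule records the classes it depends on, so a
// change to one class yields the exact set of rules to re-check, without
// reparsing anything else.
public class DependencyIndex {
    private final Map<String, Set<String>> classToRules = new HashMap<>();

    public void record(String ruleName, Collection<String> usedClasses) {
        for (String cls : usedClasses) {
            classToRules.computeIfAbsent(cls, k -> new HashSet<>()).add(ruleName);
        }
    }

    public Set<String> rulesAffectedBy(String changedClass) {
        return classToRules.getOrDefault(changedClass, Collections.emptySet());
    }
}
```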

I'm sure there is a lot more complexity to this, and stages I have missed. But it should be enough to show you the direction I want to take. We are extensively using metadata to minimise re-parsing and re-compiling, and also using metadata to localise the areas that we do need to re-parse after changes. The key to this is always knowing exactly what we are editing. We may have to extend the Descr, or create a new structure, to handle metadata-driven parsing. None of this replaces the existing rule parser/Descr/PackageBuilder implementations, which are still required for compiling and deploying real rule bases -- although hopefully we can leverage parts from both to avoid duplication.

It may be worth taking my previous email and this one and putting them into JIRA.
Mark

------------------------------------------------------------------------

*From:* Mark Proctor
*Sent:* 10 May 2006 14:46
*To:* '[EMAIL PROTECTED]'; 'Kris Verlaenen'; Michael Neale
*Subject:* RE: going great with antlr 3

I've been thinking more about the parser and I think there is a lot more we can do with regards to iterative building and intelligent compiling. Initially we decided to push as much as possible onto the rule builder -- I now believe the reverse is true. In fact I don't think the Eclipse parser should use the Rule Builder API at all -- instead it should build its own metadata. This might mean we need two parsers -- one for building the Descr for PackageBuilder, and another for IDE integration.

- The ANTLR grammar knows the full Descr structure, and we can ensure correct AST trees (except the contents of expressions and blocks) using ANTLR -- without having to build the Descr structure.

- The parser maintains its own list of valid import entries; it can reuse the PackageBuilder TypeResolver here.

- We can identify when we are dealing with Columns and their fields, giving the parser the ability to know valid classes and fields -- as mentioned previously.

- We could probably even have a helper class to identify valid operators for fields.

- Cache the line starts of important parts -- package, rule, attributes, LHS and RHS -- so we don't have to parse the entire rule, just from the current parseable section to the end.

- Cache metadata for rules -- I think this should probably just be the classes they use, to help when recompiling entire Eclipse projects, so we only re-parse dependent packages/rules.

The above should allow us to deal with the bulk of user DRL editing without having to create a complete package Descr or do any compiling. If we introduce functions only for code blocks, that too can be handled with the ANTLR grammar. Beyond this we still need to handle expressions and blocks -- for any language. For the IDE I don't believe using PackageBuilder is efficient -- we don't need the entire compiled structure; each expression and block is fully independent and only needs its required declarations. If we implement the above we should know which expressions or blocks we are currently editing. Further, the ANTLR grammar and the helper TypeResolver can resolve bindings and cache them in the metadata. Finally, we can still use ANTLR -- this should be pluggable, to support other languages -- to examine an expression and determine its required declarations, cached in metadata. We can use this to compile the expression and extract error messages/validity. This compilation is a one-off, via a helper utility, just to feed the error messages back to the parser -- once it's compiled it can be forgotten.

This also needs to be combined with a way to make expressions and blocks foolproof -- we might need a pre-processor, where we tokenise expressions and blocks, if ANTLR cannot be made to handle this.

Mark
