Currently we use the same API to parse and compile packages as we do for
the IDE. This is proving quite heavy. I propose that we make an
additional API for IDE integration that is fine-grained and
metadata-driven. I believe that Eclipse has extensive metadata
capabilities, which it stores on disk -- we should really try to
leverage this, although the core API should stay independent so that
other IDEs can leverage it too.
1) Split the main document up into package, imports, globals,
functions and rules -- but do not parse contents.
a. Maybe we will build up the metadata for imports and globals at
this stage, as I imagine that's easy.
b. Record, as metadata, the start and end line numbers, i.e. the
range, for all sections.
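To make 1) concrete, here is a rough sketch of what a coarse section splitter with range metadata could look like -- the class and method names are mine, not existing Drools API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical metadata record for one top-level section of a DRL file.
class SectionDescr {
    enum Kind { PACKAGE, IMPORT, GLOBAL, FUNCTION, RULE }
    final Kind kind;
    final int startLine; // 1-based, inclusive
    final int endLine;   // 1-based, inclusive
    SectionDescr(Kind kind, int startLine, int endLine) {
        this.kind = kind; this.startLine = startLine; this.endLine = endLine;
    }
}

// Coarse splitter: finds section keywords at the start of a line and
// records line ranges without parsing the section contents at all.
class SectionSplitter {
    static List<SectionDescr> split(String[] lines) {
        List<SectionDescr> sections = new ArrayList<>();
        int start = -1;
        SectionDescr.Kind kind = null;
        for (int i = 0; i < lines.length; i++) {
            SectionDescr.Kind k = keywordAt(lines[i]);
            if (k != null) {
                if (kind != null) sections.add(new SectionDescr(kind, start, i));
                kind = k;
                start = i + 1; // convert to 1-based
            }
        }
        if (kind != null) sections.add(new SectionDescr(kind, start, lines.length));
        return sections;
    }
    private static SectionDescr.Kind keywordAt(String line) {
        String t = line.trim();
        if (t.startsWith("package ")) return SectionDescr.Kind.PACKAGE;
        if (t.startsWith("import ")) return SectionDescr.Kind.IMPORT;
        if (t.startsWith("global ")) return SectionDescr.Kind.GLOBAL;
        if (t.startsWith("function ")) return SectionDescr.Kind.FUNCTION;
        if (t.startsWith("rule ")) return SectionDescr.Kind.RULE;
        return null;
    }
}
```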
2) Tokenise all expressions and blocks -- do not lose line
numbers though, so we will need to pad.
a. Record, as metadata, the start and end line/col numbers, i.e.
the range, for all expressions and blocks.
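A minimal shape for the per-token metadata in 2a) might be something like this (hypothetical names):

```java
// Hypothetical token record carrying its source range, so positions
// survive even if we pad or re-scan the padded text later.
class TokenMeta {
    final String text;
    final int startLine, startCol; // 1-based, inclusive
    final int endLine, endCol;     // 1-based, inclusive
    TokenMeta(String text, int startLine, int startCol, int endLine, int endCol) {
        this.text = text;
        this.startLine = startLine; this.startCol = startCol;
        this.endLine = endLine; this.endCol = endCol;
    }
    // True if the given caret position falls inside this token's range.
    boolean contains(int line, int col) {
        if (line < startLine || line > endLine) return false;
        if (line == startLine && col < startCol) return false;
        if (line == endLine && col > endCol) return false;
        return true;
    }
}
```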
3) Parse each Rule
a. Validate Conditional Element and Field Constraint structure
b. Validate Columns and Fields -- record this in metadata, so we
know the dependent classes and fields for this rule.
c. Record Column and Field bindings as metadata -- as used for
expressions.
d. Validate operators and the RHS value type, checking it's valid
with the LHS and operator.
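For 3b), one cheap way to validate a field against its Column's class, without any compiling, is plain reflection over the bean's getters -- a sketch, not the real builder API:

```java
import java.lang.reflect.Method;

// Sketch: check that a constraint's field name resolves to a getter on
// the Column's class, so we can flag unknown fields immediately.
class FieldValidator {
    static boolean hasField(Class<?> type, String field) {
        String suffix = Character.toUpperCase(field.charAt(0)) + field.substring(1);
        for (Method m : type.getMethods()) {
            if (m.getParameterCount() == 0
                    && (m.getName().equals("get" + suffix)
                        || m.getName().equals("is" + suffix))) {
                return true;
            }
        }
        return false;
    }
}
```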
4) Determine required declarations for each expression and record
them as metadata.
a. If we can, it might be nice to also determine dependent classes
from the Imports and Globals, beyond declarations, and record them as
metadata.
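Step 4) could start as simply as intersecting the expression's identifiers with the known binding names -- a hypothetical sketch (a real version would walk the tokens from 2) rather than re-matching text):

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: determine which bound declarations an expression actually
// uses, by intersecting its identifiers with the known binding names.
class DeclarationScanner {
    // DRL-style identifiers, allowing a leading '$' for bindings.
    private static final Pattern IDENT =
            Pattern.compile("[$A-Za-z_][$A-Za-z0-9_]*");

    static Set<String> requiredDeclarations(String expression, Set<String> bindings) {
        Set<String> required = new LinkedHashSet<>();
        Matcher m = IDENT.matcher(expression);
        while (m.find()) {
            if (bindings.contains(m.group())) required.add(m.group());
        }
        return required;
    }
}
```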
5) Compile each expression and block, using a helper util, record
the errors as metadata and then discard the compiled .class.
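To make the compile-and-discard helper in 5) concrete, here is a rough sketch using the JDK's in-memory compiler API (javax.tools, which postdates this email -- the class names here are mine, and it illustrates the idea rather than the actual helper):

```java
import javax.tools.*;
import java.net.URI;
import java.util.Arrays;
import java.util.List;

// Sketch for 5): compile a generated source in memory purely to harvest
// diagnostics, then discard the result -- we only want the errors.
class ThrowawayCompiler {
    // Wraps a String as a compilable source "file".
    static class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + ".java"),
                  Kind.SOURCE);
            this.code = code;
        }
        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Returns the diagnostics; the compiled classes are never loaded.
    static List<Diagnostic<? extends JavaFileObject>> check(String className,
                                                            String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, diagnostics, Arrays.asList("-proc:none"), null,
                Arrays.asList(new StringSource(className, source)));
        task.call();
        return diagnostics.getDiagnostics();
    }
}
```

The parser would feed each expression/block (wrapped in a synthetic class with its required declarations as fields) through this, attach the diagnostics as metadata, and throw the result away.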
6) Develop intelligent balanced-text recovery. When scanning for 1)
and 2) we need to check for balanced text; if we detect incorrect
balancing we mark that section as invalid and find the start of the
next valid section -- nothing inside those invalid sections will be parsed.
a. i.e. if we have an invalid expression, say an incorrect number of
brackets on the LHS of the rule, we try to recover to the next valid
area -- ideally this would be the next valid conditional element, but
that may be hard, so it could be the RHS of the rule. Start simple, make
it coarse; intelligent fine-grained recovery can be added later.
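The balance check in 6) can start as crude as a stack over the delimiter pairs -- a sketch (a real version would also skip delimiters inside strings and comments):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Crude balanced-text check for 6): verifies (), [] and {} nest
// correctly. On failure the caller marks the section invalid and
// skips ahead to the next recognisable section keyword.
class BalanceScanner {
    static boolean isBalanced(String text) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '(': case '[': case '{':
                    stack.push(c);
                    break;
                case ')':
                    if (stack.isEmpty() || stack.pop() != '(') return false;
                    break;
                case ']':
                    if (stack.isEmpty() || stack.pop() != '[') return false;
                    break;
                case '}':
                    if (stack.isEmpty() || stack.pop() != '{') return false;
                    break;
            }
        }
        return stack.isEmpty();
    }
}
```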
7) Intelligent re-parsing for project-wide changes.
a. We have class dependencies in metadata for the various
sections -- and also fields in constraints. So we can determine errors
from the metadata, without having to reparse.
b. In the case of expressions we can use the metadata to avoid
re-compiling. If the changes are too dramatic then we can recompile
the expression/consequence. We have the ranges in metadata for the
dependent sections, so we can rescan to suck up the expression
without having to parse the entire document.
c. In document editing we only recompile an expression/consequence
if the user edits it -- again we should know that we are in an
expression; we know the start, so we scan from the start of the
expression/block to the end and compile, avoiding re-parsing the entire
document or even the rule. If bindings are changed we can also determine
the dependent expressions and recompile them to get errors.
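The dependency-driven invalidation in 7a) amounts to an inverted index from class name to the rules whose metadata mentions it -- a sketch with hypothetical names:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Sketch for 7a): inverted index from a class name to the rules whose
// metadata lists it as a dependency. When a class changes, only these
// rules need re-validating; everything else keeps its cached results.
class DependencyIndex {
    private final Map<String, Set<String>> classToRules = new HashMap<>();

    // Called as each rule's metadata is built in step 3).
    void record(String ruleName, Collection<String> dependentClasses) {
        for (String cls : dependentClasses) {
            classToRules.computeIfAbsent(cls, k -> new LinkedHashSet<>()).add(ruleName);
        }
    }

    // Called when a class in the project changes.
    Set<String> rulesToRevalidate(String changedClass) {
        return classToRules.getOrDefault(changedClass, Collections.emptySet());
    }
}
```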
I'm sure there is a lot more complexity to this, and stages I have
missed, but it should be enough to show you the direction I want to
take. We are extensively using metadata to minimise the re-parsing and
re-compiling, and also using metadata to localise the areas that we do
need to re-parse after changes. The key to this is always being able
to know exactly what we are editing. We may have to extend the Descr, or
create a new structure, to handle metadata-driven parsing. None of this
replaces the existing rule parser/Descr/PackageBuilder implementations,
which are still required for compiling and deploying real rule bases --
although hopefully we can leverage parts from both, to avoid duplication.
It may be worth taking my previous email and this one and putting them
into JIRA.
Mark
------------------------------------------------------------------------
*From:* Mark Proctor
*Sent:* 10 May 2006 14:46
*To:* '[EMAIL PROTECTED]'; 'Kris Verlaenen'; Michael Neale
*Subject:* RE: going great with antlr 3
I've been thinking more about the parser and I think there is a lot more
we can do with regards to iterative building and intelligent compiling.
Initially we decided to push as much as possible onto the rule builder --
I now believe the reverse is true. In fact I don't think the Eclipse
parser should use the Rule Builder API at all -- instead it should build
its own metadata. This might mean we need two parsers -- one for building
Descr for PackageBuilder and another for IDE integration.
-The ANTLR grammar knows the full Descr structure, and we can ensure
correct AST trees (except the contents of expressions and blocks) using
ANTLR -- without having to build the Descr structure.
-The parser maintains its own valid import entries; it can reuse the
PackageBuilder TypeResolver here.
-We can identify when we are dealing with Columns and their fields,
giving the parser the ability to know valid classes and fields -- as
mentioned previously.
-We could probably even have a helper class to identify valid operators
on fields.
-Cache the line starts of important parts -- Package, Rule, attributes,
LHS and RHS -- so we don't have to parse the entire rule, just from the
current parseable section to the end.
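With those cached line starts, finding which part the caret is in becomes a single floor lookup -- a sketch using a TreeMap (names are mine):

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch: given the cached start lines of the important parts, find
// which part a caret line falls in with one floor lookup -- no parsing.
class PartLocator {
    private final NavigableMap<Integer, String> startLines = new TreeMap<>();

    void cacheStart(int line, String part) {
        startLines.put(line, part);
    }

    // The part whose start line is the greatest one <= caretLine.
    String partAt(int caretLine) {
        Map.Entry<Integer, String> e = startLines.floorEntry(caretLine);
        return e == null ? null : e.getValue();
    }
}
```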
-Cache metadata of Rules -- I think this should probably just be the
Classes used, to help when recompiling entire Eclipse projects -- so we
only re-parse dependent packages/rules.
The above should allow us to deal with the bulk of user DRL editing,
without having to create a complete package Descr or do any compiling.
If we introduce functions only for code blocks, that too can be
handled with the ANTLR grammar.
Further to this we still need to handle expressions and blocks -- for
any language. For the IDE I don't believe using PackageBuilder is
efficient -- we don't need the entire compiled structure -- each
expression and block is fully independent; it only needs its required
declarations. If we implement the above we should know which expressions
or blocks we are currently editing. Further to this, the ANTLR grammar
and the helper TypeResolver can resolve bindings and cache them in the
metadata. Finally we can still use ANTLR (this should be pluggable, to
support other languages) to examine an expression and determine its
required declarations, cached in metadata -- we can use this to compile
the expression to extract error messages/validity. This compilation is a
one-off, via a helper utility, just to feed the error messages back to
the parser -- once it's compiled it can forget it.
This also needs to be combined with a way to make expressions and blocks
foolproof -- we might need to combine it with a pre-processor, where we
tokenise expressions and blocks, if ANTLR cannot be made to handle this.
Mark