Hello,
Various stakeholders in Apache TinkerPop have been wondering whether mm-ADT can
be leveraged in TinkerPop3. While I originally planned for mm-ADT to form the
foundation of TinkerPop4, there is a subset of features in mm-ADT that could
really help TP3 moving forward. Here is a preliminary outline of the mm-ADT
features that could push the TP3 roadmap.
1. Type system: mm-ADT has a nominal type system for the built-in types
and a structural type system for all derived types. Bytecode instructions that
CRUD on database data can be statically typed and reasoned about at compile time.
2. Strategies: mm-ADT has a completely different approach to query
optimization than TP3. While there are compile-time strategies for manipulating
a query into a semantically equivalent, though computationally more efficient
form, the concept of “provider strategies” (indices) goes out the window in
favor of reference graphs. The primary benefit of the mm-ADT model is that the
implementation for providers will be much simpler and less error prone, won’t
require custom instructions, and will naturally capitalize on other
internal provider optimizations such as schemas, denormalizations, views, etc.
3. Instruction Set: mm-ADT’s instruction set is less ad hoc than TP3’s.
Relational operators are polymorphic. Math operators are polymorphic. Container
(collection) operators are polymorphic. Unlike TP3, a “vertex” is just a map
like any other map. Thus, has(), value(), where(), select(), etc. operate
across all such derivations. Moreover, mm-ADT’s instruction set greatly reduces
the number of ways in which an expression can be represented, relying primarily
on reference graphs (see #2 above) as the means of optimization. This should
help limit the degrees of freedom in the Gremlin language and reduce its
apparent complexity to newcomers.
4. References: mm-ADT introduces references (pointers) as first-class
citizens. References form one of the primary data types in mm-ADT with numerous
usages including:
* Query planning. (providers exposing secondary data access
paths via reference graphs -- see #2 above)
* Modeling complex objects. (will not come into play given
TP3’s central focus on the property graph data type).
* Bytecode arguments. (nested bytecode are dynamic references
and every instruction’s arguments can take references (even the opcode
itself!)).
* Remote proxies. (TP3 detached vertices are awkward and
limiting in comparison to mm-ADT proxy references).
* Schemas. (will probably not come into play, but “person”
vertices are possible in mm-ADT. Thus, if TP3 wants to introduce graph schemas,
mm-ADT provides the functionality).
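To make #1 and #3 a bit more concrete, here is a rough Python sketch of the idea that a “vertex” is just a map, that a derived type such as “person” can be checked structurally, and that a has()-style filter can work over any map. This is not mm-ADT or TP3 syntax. The names PERSON, matches, and has are made up purely for illustration, and the real type system is certainly richer than this.

```python
# Hypothetical sketch only: none of these names come from mm-ADT or TinkerPop.

# A structural type expressed as a pattern: field name -> required Python type.
PERSON = {"id": int, "label": str, "name": str}


def matches(value, pattern):
    """Return True if `value` is a map carrying every field in `pattern`
    with a value of the required type (extra fields are allowed)."""
    return isinstance(value, dict) and all(
        key in value and isinstance(value[key], expected)
        for key, expected in pattern.items()
    )


def has(stream, key, expected):
    """A has()-like filter that treats every element as a plain map,
    whether or not it happens to model a vertex."""
    return [m for m in stream if isinstance(m, dict) and m.get(key) == expected]


vertex = {"id": 1, "label": "person", "name": "marko"}  # a "vertex" is just a map
row = {"name": "marko", "age": 29}                      # so is an arbitrary record

print(matches(vertex, PERSON))             # structural check against "person"
print(matches(row, PERSON))                # missing fields, so not a "person"
print(has([vertex, row], "name", "marko"))  # same filter over both maps
```

The point of the sketch is only that one filter and one type check cover both the vertex and the non-vertex map, with no vertex-specific code path.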
I’ll leave it at that for now. Any questions, please ask.
Take care,
Marko.
http://rredux.com