Re: [DISCUSS] Primitive Types, Complex Types, and their Entailments in TP4

Marko Rodriguez Tue, 16 Apr 2019 12:29:21 -0700

Hi,

I just saw Stephen reply to a guy on gremlin-users@ about String manipulation 
operations in Gremlin3.


If this email thread’s direction proves correct, then TP4 will have a static 
set of primitive types. We must ensure that each primitive type has a 
corresponding set of VM instructions that can “fully” manipulate the primitive.

        TString will force us to provide string manipulation bytecode 
instructions.
        TLong/TInteger/TDouble/TFloat will force us to provide convenient math 
instructions.
        TMap and TList will force us to have corresponding get(), put(), 
size(), containsKey(), etc.-type instructions.
        TBoolean will force us to provide boolean operators — perhaps part of 
the math instruction subset.

This is great. This requirement gives us a hard and fast rule for creating 
primitive instructions.

———

However — here is the kicker — think about complex types. Given that this is an 
unbounded set that is uncontrolled by TinkerPop, we have to think about what 
instructions (English-semantically) express operations on all potential data 
structures! This is where we really need a general theory of form. As it 
currently stands in TP3, these are our “complex type” instructions.

        is: generally useful for object filtering based on object feature.
        has: generally useful for filtering maps based on a key/value pair 
feature.
        property: generally useful for adding a key/value pair to maps.
        value: generally useful for getting a key's associated value from a map.

Already I think this is all wrong. Why is has() just for maps? What about 
looking for objects in a list? It would be nice not to need a different 
instruction, as has() is English-valid for List.contains(). Why is property() just for maps? 
What about inserting into a list? In this case, property() is a bad word for 
list.add(). Why is value() only for maps? What about getting an object from a 
list?

Here are some thoughts:

        1. Long, Integer, Float, Double, and Boolean do not have any internal 
structure.
        2. A List is an ordered set of key/value pairs where the keys are 
integers. list(‘a’,’b’,’c’) == map(1,’a’,2,’b’,3,’c’)
        3. A Map is an ordered set of key/value pairs where the keys are 
arbitrary objects.
        4. A String is a List of characters. “abc” == list(“a”,”b”,”c”) == 
map(1,”a”,2,”b”,3,”c”)
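
The four points above amount to viewing every collection as an ordered sequence of key/value pairs. Here is a minimal Python sketch of that view. The helper name as_pairs is hypothetical (it is not a TinkerPop API), and the 1-based integer keys are an assumption taken from the list(...) == map(...) examples above.

```python
def as_pairs(obj):
    """Return obj as an ordered list of (key, value) pairs.

    Hypothetical helper for illustration only; not a TinkerPop API.
    Lists and strings get 1-based integer keys, as in the examples above.
    """
    if isinstance(obj, dict):
        # A map already is an ordered set of key/value pairs.
        return list(obj.items())
    if isinstance(obj, (list, str)):
        # A list (or a string, as a list of characters) is keyed by position.
        return [(i, v) for i, v in enumerate(obj, start=1)]
    # Numbers and booleans have no internal structure.
    return []


print(as_pairs(['a', 'b', 'c']) == as_pairs({1: 'a', 2: 'b', 3: 'c'}))  # True
print(as_pairs("abc") == as_pairs(['a', 'b', 'c']))                     # True
```

Under this view, an instruction written once against key/value pairs would apply unchanged to strings, lists, and maps.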

With some twiddling, I came up with this:

        is(filter): generally used for filtering an object based on a feature 
of that object (as a whole).
        has(filter): generally used for filtering an object based on a feature 
of the values within it. (valueFilter)
        has(filter, filter): generally used for filtering an object based on 
the features of the keys and values within it. (keyFilter, valueFilter)
        get(filter): generally used for getting values within an object based 
on key features. (keyFilter)
        get(filter, filter): generally used for getting values within an object 
based on key/value features. (keyFilter, valueFilter)
        add(flatmap): generally used for adding objects to the tail of an 
object. (values)
        add(filter, flatmap): generally used for adding values to an object at 
a particular key. (keyFilter, values)
        delete(): generally used for deleting an object (as a whole).
        delete(filter): generally used for deleting values in an object based 
on a key feature. (keyFilter)
        delete(filter, filter): generally used for deleting values in an object 
based on key/value features. (keyFilter, valueFilter)
        
        ** TP4 Pop is a key-filter function.
                - Pop.key(predicate)
                - Pop.key(object) == Pop.key(eq(object))
                - Pop.index(n) == Pop.key(eq(n)) // the keys of a list are 
integers
                - Pop.last() == Pop.key(unfold().tail(1))
                - Pop.first() == Pop.key(unfold().limit(1))
                - Pop.all()  == Pop.key(identity())
        ** If the Pop selects more than one element, then the result is a 
collection; otherwise it's a singleton.
        ** Pop.last() is the default if no Pop is provided.
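
The Pop rules above can be read as functions from a collection's ordered keys to the keys that get selected. The following Python sketch illustrates that reading for lists only. All function names are hypothetical, the 1-based keys are an assumption, and returning an empty list when nothing is selected is a choice the text above does not specify.

```python
def pop_key(predicate):
    """Pop.key(predicate): select every key that satisfies the predicate."""
    return lambda keys: [k for k in keys if predicate(k)]

def pop_index(n):
    """Pop.index(n): the key equal to n."""
    return pop_key(lambda k: k == n)

def pop_first():
    return lambda keys: keys[:1]

def pop_last():
    return lambda keys: keys[-1:]

def pop_all():
    return lambda keys: list(keys)

def select(values, pop=None):
    """Apply a pop to a list whose keys are 1..len(values).

    Returns the single value if exactly one key is selected,
    otherwise a list (assumption: an empty selection yields []).
    """
    pop = pop or pop_last()              # Pop.last() is the default
    keys = list(range(1, len(values) + 1))
    picked = [values[k - 1] for k in pop(keys)]
    return picked[0] if len(picked) == 1 else picked


print(select([10, 20, 30]))                            # 30
print(select([10, 20, 30], pop_index(2)))              # 20
print(select([10, 20, 30], pop_key(lambda k: k > 1)))  # [20, 30]
```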
                        

        TList
        is(list(1,2,3)): List.equals(List.of(1,2,3)) // equivalent to 
is(eq(list(1,2,3)))
        is(within(list(1,2,3))): List is a sublist of List.of(1,2,3)
        is(not(within(list(1,2,3)))): List is not a sublist of List.of(1,2,3)
        is(type(list)): The incoming object is a list
        is(type(list).count(local).is(gt(3))): The incoming object is a list 
whose size is > 3
        has(‘name’): List.contains(‘name’) // equivalent to has(eq(‘name’))
        has(lt(3)): List.contains() an object less than 3
        has(regex("n*”)): List.contains() a string that matches regex.
        has(has(regex("n*”))): List.contains() a list that contains a string 
that matches regex.
        has(type(string)): List.contains() a string object.
        get(3): List.get(3) // equivalent to get(index(3), identity())
        get(is(3)): The object 3 if it's in the list // equivalent to 
get(last,is(eq(3)))
        get(all,gt(3)): A list containing all list objects greater than 3 // 
equivalent to get(all,is(gt(3)))
        get(first,type(string)): The first string of the list
        get(all,type(string)): A list of all the strings in the list
        get(first,has(regex(“n*”))): The first list in the list that contains a 
string that matches the regex
        get(regex(“n*”)): The last string object in the list that matches the 
predicate // equivalent to get(last,is(regex))
        get(index(within(1,2,4))): A list containing the objects at indices 1, 
2, and 4 of the original list
        get(index(gt(2))): List.sublist(2,size()-1)
        get(first,either(‘a’,1,true)): The first object in the list that is 
equal to a, 1, or true.
        add(‘marko’): List.add(“marko”) // equivalent to add(last,’marko’)
        add(3,’marko’): List.add(3,“marko”) // equivalent to 
add(index(3),’marko')
        add(3,select(‘a’).out().value(‘name’)): Add the names of the adjacent 
vertices of ‘a’ to the list starting at index 3
        add(3,select(‘a’).out().value(‘name’).limit(1)): Add the first name of 
the adjacent vertices of ‘a’ to the list at index 3
        add(3,select(‘a’).out().value(‘name’).fold()): Add the names of the 
adjacent vertices of ‘a’ as a list to the list starting at index 3
        add(index(either(1,5)),’marko’): List.add(1,“marko”); 
List.add(5,”marko”)
        delete(): List.clear() // equivalent to delete(all, identity())
        delete(all, ‘marko’): List.removeAll(“marko”) // equivalent to 
delete(all, is(eq(‘marko’)))
        delete(index(3)): List.remove(3)
        delete(index(gt(3))): Remove all objects after the third index
        delete(first, “marko”): Remove the first “marko” in the list
        delete(“marko”): Remove the last marko in the list // equivalent to 
delete(last,is(eq(marko)))
        delete(all,regex(“*n”)): Remove all strings in the list that match the 
regex.
        delete(all,type(string).count(current).gt(3)): Remove all the strings 
in the list whose String.size() is > 3.
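
To make the TList semantics above concrete, here is a small Python sketch of has(filter), get(all, filter), delete(all, filter), and delete(first, filter), with predicates standing in for filters. The function names are hypothetical and the functions return new lists rather than mutating; neither detail comes from TP4.

```python
def has(values, value_filter):
    """has(filter): true if any value in the list passes the filter."""
    return any(value_filter(v) for v in values)

def get_all(values, value_filter):
    """get(all, filter): every value that passes the filter, in order."""
    return [v for v in values if value_filter(v)]

def delete_all(values, value_filter):
    """delete(all, filter): the list without any value that passes the filter."""
    return [v for v in values if not value_filter(v)]

def delete_first(values, value_filter):
    """delete(first, filter): the list without the first passing value."""
    for i, v in enumerate(values):
        if value_filter(v):
            return values[:i] + values[i + 1:]
    return list(values)


is_string = lambda v: isinstance(v, str)
is_marko = lambda v: v == 'marko'
data = [1, 'marko', 5, 'josh', 'marko']

print(has(data, is_marko))            # True
print(get_all(data, is_string))       # ['marko', 'josh', 'marko']
print(delete_all(data, is_marko))     # [1, 5, 'josh']
print(delete_first(data, is_marko))   # [1, 5, 'josh', 'marko']
```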

        TString // A string is just a list of characters so TList method 
semantics map over nearly one-to-one
        is(“marko”): String.equals(“marko”) // equivalent to is(eq(“marko”))
        is(regex(“n*”)): String.matches(“n*”)
        has(“abc”): String.contains(“abc”)
        get(3): String.charAt(3)
        get(all,’a'): A string containing all the ‘a’ characters
        get(first,’b’): A string that is either empty or is equal to ‘b’
        get(all, “abc”): A string containing all the “abc” sequences
        add(“a”): String.concat(“a”)
        delete(): String = “"
        delete(all, “abc”): String.removeAll(‘abc’)
        delete(first, “abc”): Remove first abc sequence
        delete(index(3)): Remove the third character

        TMap // a map is just a list whose indices are arbitrary objects, not 
integers.
        is(map(a,1,b,2)): Map.equals(Map.of(a,1,b,2))
        is(within(map(a,1,b,2))): Map is a submap of Map.of(a,1,b,2)
        is(not(within(map(a,1,b,2)))): Map is not a submap of Map.of(a,1,b,2)
        is(type(map)): The incoming object is a map
        is(type(map).count(local).gt(3)): The incoming object is a map whose 
size is > 3
        has(“marko”): Map.values().contains(“marko”)
        has(regex("n*”)): Map.values() has a string that matches regex.
        has(has(regex("n*”))): Map.values() contains a list which contains a 
string that matches regex.
        has(type(string)): Map has a string value
        has(type(string),”marko”): Map has a string key whose value is “marko”
        has(“name”,”marko”): Map.get(“name”).equals(“marko”)
        get(’name'): Map.get(’name’) // equivalent to 
get(key(is(eq(name))),identity())
        get(all, is(regex(“n*"))): Map.submap() for the values that match n*.
        get(is(regex(“n*”))): Map.submap() for the keys that match n*.
        get(within(‘a’,’b’,’c')): A map containing the key/value pairs for keys 
a, b, and c
        get(first,type(string)): The first string value of the Map.
        get(all,type(string)): A Map of all the key/value pairs with string 
values
        get(key(type(string))): A map of all key/value pairs with string keys
        get(first,is(regex(“n*”))): The first key/value pair in the map that 
contains a string key that matches the regex
        get(first,either(‘a’,1,true)): The first key/value pair in the map 
whose key is equal to a, 1, or true.
        add(’name',’marko’): Map.put(“name",“marko”) 
        add(’name',select(‘a’).out().value(‘name’).limit(1)): Add the first 
name of the adjacent vertices of ‘a’ to the name-value of the map.
        add(’name',select(‘a’).out().value(‘name’).fold()): Add the names of 
the adjacent vertices of ‘a’ as a list to the name-value of the map.
        add(either(1,5),’marko’): Map.put(1,“marko”); Map.put(5,”marko”)
        delete(): Map.clear() // equivalent to delete(all)
        delete(all, ’marko’): Removes all the key/value pairs whose value is 
marko
        delete(“name"): Map.remove(“name")
        delete(regex(“*n”)): Remove all key/value pairs where the key matches 
the regex.
        delete(type(string).count(current).gt(3)): Remove all key/value pairs 
where the keys are strings and whose size is > 3. 
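
The TMap entries above follow the same pattern as TList, with a key filter in front of the value filter. Here is a hedged Python sketch of has(keyFilter, valueFilter), get(keyFilter), and delete(keyFilter). All names are hypothetical. For simplicity get always returns a sub-map, whereas the examples above return the bare value when a single key is matched.

```python
def eq(x):
    """Predicate factory standing in for Gremlin's eq()."""
    return lambda v: v == x

def has(m, key_filter, value_filter):
    """has(filter, filter): true if some entry passes both filters."""
    return any(key_filter(k) and value_filter(v) for k, v in m.items())

def get(m, key_filter):
    """get(filter): the sub-map of entries whose key passes the filter."""
    return {k: v for k, v in m.items() if key_filter(k)}

def delete(m, key_filter):
    """delete(filter): the map without entries whose key passes the filter."""
    return {k: v for k, v in m.items() if not key_filter(k)}


person = {'name': 'marko', 'age': 29}

print(has(person, eq('name'), eq('marko')))   # True
print(get(person, eq('name')))                # {'name': 'marko'}
print(delete(person, eq('name')))             # {'age': 29}
```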
        

Now that the instructions above are generally applicable to collections, we can 
see whether complex types can leverage them:

        Property graph vertices: 
                - g.V(1).has(’marko’) // vertex.values().contains(“marko”)
                - g.V(1).has(‘name’,’marko’) // 
vertex.get(“name”).equals(“marko”)
                - g.V(1).get(‘name’) 
                - g.V(1).add(‘name’,’josh’) // put(‘name’,’josh’)
                - g.V(1).using(‘y’).is(within(V().using(‘x’))) // checks if 
vertex 1 in graph ‘y' is contained in graph ‘x’.
                - g.V(1).delete() // deletes the vertex
                - g.V(1).delete(‘name’) // deletes the vertex’s name property
                - g.V(1).delete(all, ‘marko’) // deletes the vertex properties 
with a marko value
                - g.V(1).delete(all, type(int).is(lt(3))) // deletes the vertex 
properties with values that are integers less than 3
                - g.V(1).delete(“age", type(int).is(lt(3))) // deletes the 
vertex age properties with values that are integers less than 3
                - g.V(1).out() // vertex.get(“outE”).unfold().get(“inV”) // 
crazy thought
        
        RDF graph vertices:
                g.V(uri:1).outE(‘foaf:knows’).has(‘ng’,uri:2) // would determine 
if the triple is in the named graph uri:2.
                g.V(uri:1).out(‘foaf:name’).id() // would return 
marko^^xsd:string
                g.V(uri:1).delete() // DELETE uri:1 ?x ?y && ?x ?y uri:1
        
        Relational table rows:
                g.R(‘people’).has(‘name’,’marko’) // should filter out those 
rows that don’t have a name/marko entry.
                g.R(‘people’).get(‘name’) // would emit the value of the name 
column of each row.
                g.R(‘people’).is(within(map)) // would check if the row’s 
key/value pairs are in the map argument.
                g.R(‘people’).count(local) // would return the number of columns 
in the row.
                g.R(‘people’).toMap() // would turn the complex row object into 
the primitive TMap. // toMap() replaces valueMap().
                g.R(‘people’).join(g.R(‘addresses’)).by(‘ssn’) // join will be 
added to TP4 instruction set
                g.R(‘people’).has(‘age’,lt(10)).delete() // this deletes all 
rows from the people table whose age is < 10
                g.R(‘people’).has(‘age’,lt(10)).toMap().delete() // this clears 
the map, leaving the database row unchanged.
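
The last two relational examples differ only in whether delete() acts on the row or on a detached map. Here is a minimal Python sketch of that distinction. The Table class is a hypothetical in-memory stand-in and has nothing to do with any real TP4 or database API.

```python
class Table:
    """Hypothetical in-memory stand-in for a relational table."""

    def __init__(self, rows):
        self.rows = [dict(r) for r in rows]

    def delete_where(self, predicate):
        # Analogous to has(...).delete(): removes matching rows from the table.
        self.rows = [r for r in self.rows if not predicate(r)]

    def to_maps(self, predicate):
        # Analogous to has(...).toMap(): detached copies of matching rows.
        return [dict(r) for r in self.rows if predicate(r)]


people = Table([{'name': 'marko', 'age': 29}, {'name': 'kid', 'age': 7}])
younger_than_10 = lambda r: r['age'] < 10

copies = people.to_maps(younger_than_10)
for m in copies:
    m.clear()                          # toMap().delete(): clears only the copy
print(len(people.rows))                # 2, the table is unchanged

people.delete_where(younger_than_10)   # has('age', lt(10)).delete()
print(len(people.rows))                # 1
```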
                
        Document database:
                g.D(‘uuid:1’).has(‘name’,’marko’) // should filter out those 
documents that don’t have a key/value of name/marko.
                g.D(‘uuid:1’).get(‘name’) // will emit the value of the name 
key.
                g.D(‘uuid:1’).delete() // deletes the document from the 
database.
                g.D(‘uuid:1’).delete(‘name’) // delete the name key/value from 
the document (and subsequently, from the database)

For the most part, property graph vertices, relational database rows, and 
documentdb documents are just generalized maps…maps are just generalized lists… 
lists are just generalized strings…and strings are just generalized singletons.

Bye,
Marko.

http://rredux.com <http://rredux.com/>




> On Apr 15, 2019, at 1:07 PM, Marko Rodriguez <[email protected]> wrote:
> 
> Hello,
> 
>> I think this does satisfy your requirements, though I don't think I
>> understand all aspects of the approach, especially the need for
>> TinkerPop-specific types *for basic scalar values* like booleans, strings,
>> and numbers. Since we are committed to the native data types supported by
>> the JVM.
> 
> TinkerPop4 will have VM implementations on various language-platforms. For 
> sure, Apache’s distribution will have a JVM and .NET implementation. The 
> purpose of TinkerPop-specific types (and not JVM, Mono, Python, etc. types) 
> is that we know it’s the same type across all VMs.
> 
>> To my mind, your approach is headed in the direction of a
>> TinkerPop-specific notion of a *type*, in general, which captures the
>> structure and constraints of a logical data type
>> <https://www.slideshare.net/joshsh/a-graph-is-a-graph-is-a-graph-equivalence-transformation-and-composition-of-graph-data-models-129403012/42
>>  
>> <https://www.slideshare.net/joshsh/a-graph-is-a-graph-is-a-graph-equivalence-transformation-and-composition-of-graph-data-models-129403012/42>>,
>> and which can be used for query planning and optimization. These include
>> both scalar types as well as vertex, edge, and property types, as well as
>> more generic constructs such as optionals, lists, records.
> 
> Yes — I’d like to be able to use some type of formal data type specification. 
> You have those skills. I don’t. My rudimentary (non-categorical) 
> representation is just “common useful data structures” — map, list, bool, 
> string, etc. 
> 
>> Can a TList really only contain primitives? A list of vertices or edges
>> would definitely be unusual, and typical PG implementations may not choose
>> to support them, but language-agnostic VM possibly should. They would
>> nicely capture RDF lists, in which list nodes typically do not have any
>> properties (edges) other than rdf:first and rdf:rest.
> 
> A TList only supports primitives. However, a TRDFList could be a complex type 
> for dealing with RDF lists and would be contained within the TP4-VM. Adding 
> complex types is okay — it doesn’t break anything.
> 
> As a related concept — realize that TDocument has a TDocumentArray not a 
> TList. This is because TDocuments can have “lists” that contain primitives, 
> documents, and lists.
> 
> 
>> For hypergraphs, an inV and outV which may produce more than one vertex, is
>> one way to go, but a labeled hypergraph should really have other projections
>> <https://www.slideshare.net/joshsh/a-graph-is-a-graph-is-a-graph-equivalence-transformation-and-composition-of-graph-data-models-129403012/49
>>  
>> <https://www.slideshare.net/joshsh/a-graph-is-a-graph-is-a-graph-equivalence-transformation-and-composition-of-graph-data-models-129403012/49>>
>> in addition to inV, outV. That suggests a more generic step than inV or
>> outV, which takes as an argument the name of the projection as well as the
>> in/out element. E.g. project("in", v1), project("out", v1),
>> project("subject", v1).
> 
> Hm. Yea, I’m not too strong with hypergraph thinking.
> 
>       g.V(1) // vertex
>       g.V(1).outE(‘family’)  // hyperedges
>       g.V(1).outE(‘family’).inV(‘father’) // ? perhaps inV/outV/bothV can 
> take a String… label?
> 
> We should talk to the GRAKN.AI guys and see what they think.
>       https://grakn.ai/ <https://grakn.ai/>
>       https://dev.grakn.ai/docs/general/quickstart 
> <https://dev.grakn.ai/docs/general/quickstart>
>       
>> For undirected graphs, we might as well just allow both in() and out()
>> rather than throwing exceptions. You can think of an undirected edge as a
>> pair of directed edges.
> 
> Okay.
> 
>> Agreed that provider-specific structures (types) are OK, and should not be
>> discouraged. Not only do different providers have their own data models,
>> but specific applications have their own schemas. A structure like a
>> metaproperty may be allowed in certain contexts and not others, and the
>> same goes for instances of conventional structures like edges of a certain
>> label.
> 
> Yes. I want to make sure we naturally/natively support property graphs, RDF 
> graphs, hypergraphs, tables, documents, etc. Property graphs (as specified by 
> Neo4j) are not “special” in TP4. Like Gremlin for languages, property graphs 
> sit side-by-side w/ other data structures. If we do this right, we will be 
> heroes!
> 
> 
>> For multi-properties, there is a distinction to be made between multiple
>> properties with the same key and element, and single collection-valued
>> properties. This is something the PG Working Group has been grappling with.
>> I think both should be allowed.
> 
> Agreed. This all gets back to a way to specify what the data structure is:
> 
>       JanusGraph: a single-labeled property graph with multi/meta-properties.
>       Neo4j: a multi-labeled property graph with singleton properties (w/ 
> list values supported).
>       RDF: an unlabeled 1-property graph (named graph property?) with 
> vertex-based literals.
>       … ?.
> 
> Like Graph.Features in TP3.
> 
>> IMO it's OK if URIs, in an RDF context, become Strings in a TP context. You
>> can think of URI as a constraint on String, which should be enforced at the
>> appropriate time, but does not require a vendor-specific class. Can you
>> concatenate two URIs? Sure... just concatenate the Strings, but also be
>> aware that the result is not a URI.
> 
> Cool.
> 
> Thanks for reading and providing good ideas.
> 
> Marko.
> 
> http://rredux.com <http://rredux.com/>
> 
> 
> 
>> On Mon, Apr 15, 2019 at 5:06 AM Marko Rodriguez <[email protected] 
>> <mailto:[email protected]>>
>> wrote:
>> 
>>> Hello,
>>> 
>>> I have a consolidated approach to handling data structures in TP4. I would
>>> appreciate any feedback you may have.
>>> 
>>>        1. Every object processed by TinkerPop has a TinkerPop-specific
>>> type.
>>>                - TLong, TInteger, TString, TMap, TVertex, TEdge, TPath,
>>> TList, …
>>>                - BENEFIT #1: A universal type system will protect us from
>>> language platform peculiarities (e.g. Python long vs Java long).
>>>                - BENEFIT #2: The serialization format is constrained and
>>> consistent across all language platforms. (no more coming across a
>>> MySpecialClass).
>>>        2. All primitive T-type data can be directly accessed via get().
>>>                - TBoolean.get() -> java.lang.Boolean | System.Boolean |
>>> ...
>>>                - TLong.get() -> java.lang.Long | System.Int64 | ...
>>>                - TString.get() -> java.lang.String | System.String | …
>>>                - TList.get() -> java.util.ArrayList | .. // can only
>>> contain primitives
>>>                - TMap.get() -> java.util.LinkedHashMap | .. // can only
>>> contain primitives
>>>                - ...
>>>        3. All complex T-types have no methods! (except those afforded by
>>> Object)
>>>                - TVertex: no accessible methods.
>>>                - TEdge: no accessible methods.
>>>                - TRow: no accessible methods.
>>>                - TDocument: no accessible methods.
>>>                - TDocumentArray: no accessible methods. // a document
>>> list field that can contain complex objects
>>>                - ...
>>> 
>>> REQUIREMENT #1: We need to be able to support multiple graphdbs in the
>>> same query.
>>>                - e.g., read from JanusGraph and write to Neo4j.
>>> REQUIREMENT #2: We need to make sure complex objects can not be queried
>>> client-side for properties/edges/etc. data.
>>>                - e.g., vertices are universally assumed to be “detached."
>>> REQUIREMENT #3: We no longer want to maintain a structure test suite.
>>> Operational semantics should be verified via Bytecode ->
>>> Processor/Structure.
>>>                - i.e., the only way to read/write vertices is via
>>> Bytecode as complex T-types don’t have APIs.
>>> REQUIREMENT #4: We should support other database data structures besides
>>> graph.
>>>                - e.g., reading from MySQL and writing to JanusGraph.
>>> 
>>> ———
>>> 
>>> Assume the following TraversalSource:
>>> 
>>> g.withStructure(JanusGraphStructure.class, config1).
>>>  withStructure(Neo4jStructure.class, config2)
>>> 
>>> Now, assume the following traversal fragment:
>>> 
>>>        outE(’knows’).has(’stars’,5).inV()
>>> 
>>> This would initially be written to Bytecode as:
>>> 
>>>        [[outE,knows],[has,stars,5],[inV]]
>>> 
>>> A decoration strategy realizes that there are two structures registered in
>>> the Bytecode source instructions and would rewrite the above as:
>>> 
>>>        [choose,[[type,TVertex]],[[outE,knows],[has,stars,5],[inV]]]
>>> 
>>> A JanusGraph strategy would rewrite this as:
>>> 
>>> 
>>> [choose,[[type,TVertex]],[[outE,knows],[has,stars,5],[inV]],[[type,JanusVertex]],[[jg:vertexCentric,out,knows,stars,5]]]
>>> 
>>> A Neo4j strategy would rewrite this as:
>>> 
>>> 
>>> [choose,[[type,TVertex]],[[outE,knows],[has,stars,5],[inV]],[[type,JanusVertex]],[[jg:vertexCentric,out,knows,stars,5]],[[type,Neo4jVertex]],[[neo:outE,knows],[neo:has,stars,5],[neo:inV]]]
>>> 
>>> A finalization strategy would rewrite this as:
>>> 
>>> 
>>> [choose,[[type,JanusVertex]],[[jg:vertexCentric,out,knows,stars,5]],[[type,Neo4jVertex]],[[neo:outE,knows],[neo:has,stars,5],[neo:inV]]]
>>> 
>>> Now, when a TVertex gets to this CFunction, it will check its type: if it’s
>>> a JanusVertex, it goes down the JanusGraph-specific instruction branch. If
>>> the type is Neo4jVertex, it goes down the Neo4j-specific instruction branch.
>>> 
>>>        REQUIREMENT #1 SOLVED
>>> 
>>> The last instruction of the root bytecode can not return a complex object.
>>> If so, an exception is thrown. g.V() is illegal. g.V().id() is legal.
>>> Complex objects do not exist outside the TP4-VM. Only primitives can leave
>>> the VM-client barrier. If you want vertex property data (e.g.), you have to
>>> access it and return it within the traversal — e.g., g.V().valueMap().
>>>        BENEFIT #1: Language variant implementations are simple. Just
>>> primitives.
>>>        BENEFIT #2: The serialization specification is simple. Just
>>> primitives. (also, note that Bytecode is just a TList of primitives! —
>>> though TBytecode will exist.)
>>>        BENEFIT #3: The concept of a “DetachedVertex” is universally
>>> assumed.
>>> 
>>>        REQUIREMENT #2 SOLVED
>>> 
>>> It is completely up to the structure provider to use structure-specific
>>> instructions for dealing with their particular TVertex. They will have to
>>> provide CFunction implementations for out, in, both, has, outE, inE, bothE,
>>> drop, property, value, id, label … (seems like a lot, but out/in/both could
>>> be one parameterized CFunction).
>>>        BENEFIT #1: No more structure/ API and structure/ test suite.
>>>        BENEFIT #2: The structure provider has full control of where the
>>> vertex data is stored (cached in memory or fetched from the db or a cut
>>> vertex or …). No assumptions are made by the TP4-VM.
>>>        BENEFIT #3: The structure provider can safely assume their
>>> vertices will not be accessed outside the TP4-VM (outside the processor).
>>> 
>>>        REQUIREMENT #3 SOLVED
>>> 
>>> We can support TRow for relational databases. A TRow’s data is accessible
>>> via the instructions has, hasKey, value, property, id, ... The location of
>>> the data in TRow is completely up to the structure provider and its
>>> strategy analysis (if only ’name’ is accessed, then SELECT ’name’ FROM...).
>>> We can easily support TDocument for document databases. A TDocument’s data
>>> is accessible via the instructions has, hasKey, value, property, id, … A
>>> value() could return yet another TDocument (or a TDocumentArray containing
>>> TDocuments).
>>> 
>>> Supporting a new complex type is simply a function of asking:
>>> 
>>>        “Does the TP4 VM instruction set have the requisite
>>> instruction-types (semantically) to manipulate this structure?"
>>> 
>>> We are no longer playing the language-specific object API game. We are
>>> playing the language-agnostic VM instruction game. The TP4-VM instruction
>>> set is the sole determiner of what complex objects can be processed. (i.e.
>>> what data structures can be processed without impedance mismatch).
>>> 
>>>        REQUIREMENT #4 SOLVED
>>> 
>>> ———
>>> 
>>> The TP4-VM (and, in turn, Gremlin) can naturally support:
>>> 
>>>        1. Property graphs: as currently supported in TP3.
>>>        2. RDF graphs: id() is a URI | Literal. g.V(1).value(‘foaf:name’)
>>> returns multi/meta-properties *or* g.V(1).out(‘foaf:name’) returns vertices
>>> whose id()s are xsd:string literals.
>>>        3. Hypergraphs: inV() can return more than one vertex.
>>>        4. Undirected graphs: in() and out() throw exceptions. Only both()
>>> works.
>>>        5. Meta-properties: value(‘name’) can return a TVertexProperty  (a
>>> special complex object that is structure provider specific — and that is
>>> okay!).
>>>        6. Multi-properties: value(‘name’) can return a TPropertyArray of
>>> TVertexProperty objects.
>>> 
>>> This means that the same instruction can behave differently for different
>>> structures. This is okay as there can be property graph, RDF, hypergraph,
>>> etc. test suites.
>>> 
>>> Since complex objects don’t leave the TP4-VM barrier, providers can create
>>> any complex objects they want — they just have to have corresponding
>>> strategies to create provider-unique bytecode instructions (and thus,
>>> CFunctions) for those complex objects.
>>> 
>>> Finally, there are a few problems to work out:
>>>        - There is no way to yield a “v[1]” or “e[3][v[1]-knows->v[2]]”
>>> representation. Is that bad? Perhaps not.
>>>        - What is the nature of a TPath? It’s complex, but we want to
>>> return it.
>>>        - g.V().id() on an RDF graph can return a URI. Is a URI “simple”?
>>> No, the set of simple types should never grow!…. thus, URI => String. Is
>>> that wack?
>>>        - Do we add g.R() and g.D() to Gremlin to type-support TRow and
>>> TDocument objects. g.V() would be weird :( … Hmmmm?
>>>                - However, there are only so many data structures……. or
>>> are there? TMatrix, TXML, …. whoa.
>>> 
>>> Thanks for reading,
>>> Marko.
>>> 
>>> http://rredux.com <http://rredux.com/> <http://rredux.com/ 
>>> <http://rredux.com/>>
>
