Hi,
Here is a toy class I just made that converts Bytecode to “GraphSON 2.0.” I
believe it covers everything, except that I didn’t flesh out the Number section:
https://gist.github.com/okram/908b73b24e8db48f1006124942a900b1
The following code:
Traversal<?, ?> traversal = __.V()
    .has("age", gt(10).and(lt(30)))
    .out("knows")
    .repeat(out().hasLabel("person"))
    .times(2)
    .groupCount().by(label);
GraphSONWriter.build().create().writeObject(System.out, GraphSONConverter.convert(traversal));
Outputs:
{"bytecode":[["V"],["has","age",{"predicate":"and","@type":"P","value":[{"predicate":"gt","@type":"P","value":{"@type":"int32","value":10}},{"predicate":"lt","@type":"P","value":{"@type":"int32","value":30}}]}],["out","knows"],["repeat",{"bytecode":[["out"],["has","~label",{"predicate":"eq","@type":"P","value":"person"}]],"@type":"Traversal"}],["times",{"@type":"int32","value":2}],["groupCount"],["by",{"@type":"T","value":"label"}]],"@type":"Traversal"}
Or in pretty print:
{
  "bytecode": [
    [
      "V"
    ],
    [
      "has",
      "age",
      {
        "predicate": "and",
        "@type": "P",
        "value": [
          {
            "predicate": "gt",
            "@type": "P",
            "value": {
              "@type": "int32",
              "value": 10
            }
          },
          {
            "predicate": "lt",
            "@type": "P",
            "value": {
              "@type": "int32",
              "value": 30
            }
          }
        ]
      }
    ],
    [
      "out",
      "knows"
    ],
    [
      "repeat",
      {
        "bytecode": [
          [
            "out"
          ],
          [
            "has",
            "~label",
            {
              "predicate": "eq",
              "@type": "P",
              "value": "person"
            }
          ]
        ],
        "@type": "Traversal"
      }
    ],
    [
      "times",
      {
        "@type": "int32",
        "value": 2
      }
    ],
    [
      "groupCount"
    ],
    [
      "by",
      {
        "@type": "T",
        "value": "label"
      }
    ]
  ],
  "@type": "Traversal"
}
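To make the typed-number part of that output concrete, here is a rough sketch of how a value could be wrapped as {"@type": ..., "value": ...} and restored again from a name-to-class map. This is not the code from the gist. The class TypedValueSketch and the methods wrap and unwrap are hypothetical names, and the "float" and "double" type names are assumptions, since only int32 appears in the output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TypedValueSketch {

    // Hypothetical registry: type name -> Java class.
    // "int32" appears in the output above; "int64", "float" and "double" are assumed names.
    static final Map<String, Class<?>> TYPES = new LinkedHashMap<>();
    static {
        TYPES.put("int32", Integer.class);
        TYPES.put("int64", Long.class);
        TYPES.put("float", Float.class);
        TYPES.put("double", Double.class);
    }

    // Wrap a registered number type as {"@type": name, "value": number};
    // any other value (for example a String) is returned unchanged.
    static Object wrap(Object value) {
        for (Map.Entry<String, Class<?>> e : TYPES.entrySet()) {
            if (e.getValue().isInstance(value)) {
                Map<String, Object> typed = new LinkedHashMap<>();
                typed.put("@type", e.getKey());
                typed.put("value", value);
                return typed;
            }
        }
        return value;
    }

    // Restore the Java type named by "@type". A JSON parser may hand back any
    // Number subclass, so the value is converted rather than cast.
    static Object unwrap(Object json) {
        if (!(json instanceof Map)) {
            return json;
        }
        Map<?, ?> map = (Map<?, ?>) json;
        Object type = map.get("@type");
        Object value = map.get("value");
        if (!(value instanceof Number) || !TYPES.containsKey(type)) {
            return json; // not a typed number this sketch knows about
        }
        Number n = (Number) value;
        switch ((String) type) {
            case "int32": return n.intValue();
            case "int64": return n.longValue();
            case "float": return n.floatValue();
            default:      return n.doubleValue();
        }
    }

    public static void main(String[] args) {
        System.out.println(wrap(10));
        System.out.println(wrap("person"));
        System.out.println(unwrap(wrap(10L)).getClass().getSimpleName());
    }
}
```

The point of the round trip is that a reader in another language can hand back 10 as a plain JSON number and the receiving side still knows whether an Integer or a Long was intended.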
Thoughts?
Marko.
http://markorodriguez.com
> On Jul 13, 2016, at 9:30 AM, Stephen Mallette <[email protected]> wrote:
>
> Marko, the namespacing idea seems smart.
>
> Robert, I think other graphs have similar behavior to TinkerGraph's
> default. In Titan, the absence of a schema (default, obviously) produces
> this:
>
> gremlin> graph = TitanFactory.open('conf/titan-cassandra-es.properties')
> ==>standardtitangraph[cassandrathrift:[127.0.0.1]]
> gremlin> graph.addVertex("n",100D)
> ==>v[4288]
> gremlin> graph.traversal().V().has('n',100f)
> gremlin> graph.traversal().V().has('n',100d)
> ==>v[4288]
>
> This kind of problem has caused trouble for years and years in TinkerPop
> and allowing the type to be embedded seemed like a good solution. Of
> course, you bring up a good point about javascript - to this point we've
> relied on JS devs to conform to java/groovy types by forcing conversion in
> their gremlin scripts or configuring their graphs to avoid use of types
> that would produce these kinds of ambiguous results.
>
>
>
> On Wed, Jul 13, 2016 at 9:51 AM, Robert Dale <[email protected]> wrote:
>
>> And just to be clear, I'm not necessarily disagreeing. But I think
>> it's important to understand where and why it's necessary.
>>
>> For example, if I'm writing a gremlin script (string), I don't type my
>> input numbers. It's rightly converted by the underlying architecture.
>> (I'm guessing groovy which has enhanced number support). Also, if a
>> GLV is submitting typed numbers, how would that work? For example, in
>> Javascript?
>>
>> On Wed, Jul 13, 2016 at 9:16 AM, Robert Dale <[email protected]> wrote:
>>> Hi, Stephen. I think that's a bad example. You may recall I brought
>>> up that issue in the forum. However, it's actually attributed to the
>>> default ID manager of ANY (for historical reasons) which I think is a really
>>> bad default (and reason) because it only leads to confusion. Java is
>>> one of the few, if not only, brain-damaged languages where 5 != 5 !=
>>> 5. In Java, number objects must be coerced into like form for
>>> comparison. The other ID managers do this coercion. Saner languages
>>> do this under the covers.
>>>
>>> On Wed, Jul 13, 2016 at 8:56 AM, Stephen Mallette <[email protected]>
>> wrote:
>>>> Robert, thanks for joining this discussion.
>>>>
>>>>> I wonder if it even makes sense to type numbers according to their
>>>> memory model. As objects, Byte, Short, and Integer occupy the same
>>>> space. Long isn't much more. So in Java we're not saving much space.
>>>> Jackson will attempt to parse in order: int, long, BigInt, BigDecimal.
>>>> The JSON JSR uses only BigDecimal. Some non-jvm languages don't even
>>>> have this concept. Does anything in gremlin actually require this?
>>>>
>>>> If the intended numeric type isn't preserved, weird things can happen
>> with
>>>> graphs that have a schema (like Titan/DSE). Even TinkerGraph using the
>>>> default ID manager will not be happy if you try to do a lookup of Long
>>>> identifiers with an Integer:
>>>>
>>>> gremlin> graph = TinkerFactory.createModern()
>>>> ==>tinkergraph[vertices:6 edges:6]
>>>> gremlin> graph.vertices(1)
>>>> ==>v[1]
>>>> gremlin> graph.vertices(1L)
>>>> gremlin>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Jul 13, 2016 at 8:17 AM, Robert Dale <[email protected]> wrote:
>>>>
>>>>> Marko, I agree that empty object properties should not be represented.
>>>>> I think if you saw that in an example then it was probably for
>>>>> demonstration purposes.
>>>>>
>>>>> Kevin, can you expand on this comment:
>>>>>
>>>>>> the format you suggest would lead to the same inconsistencies as in
>>>>> GraphSON 1.0.
>>>>>> Since the type is at the same level as the data itself, whether the
>>>>> container is an Array or an Object
>>>>>> https://github.com/apache/tinkerpop/pull/351#issuecomment-231351653
>>>>>
>>>>> What exactly are the inconsistencies? What is the problem in
>>>>> determining an array or object?
>>>>> This is a natural JSON array (or list): []
>>>>> This is a natural JSON object: {}
>>>>>
>>>>> Type at the object level is a common pattern and supported feature of
>>>>> Jackson. Also, GeoJSON would be a natural fit as it also stores
>>>>> 'type' at the object level. Titan supports GeoJSON currently. I
>>>>> wonder if it would make sense to promote geometry to gremlin.
>>>>>
>>>>> We should probably start documenting a table of supported types. (If
>>>>> there is one, please provide link)
>>>>>
>>>>> I wonder if it even makes sense to type numbers according to their
>>>>> memory model. As objects, Byte, Short, and Integer occupy the same
>>>>> space. Long isn't much more. So in Java we're not saving much space.
>>>>> Jackson will attempt to parse in order: int, long, BigInt, BigDecimal.
>>>>> The JSON JSR uses only BigDecimal. Some non-jvm languages don't even
>>>>> have this concept. Does anything in gremlin actually require this?
>>>>> I'm thinking that this is only going to be relevant at the domain
>>>>> model level. This way json native numbers can be used and not need
>>>>> typing.
>>>>>
>>>>> Additionally, I think that all things that will be typed should always
>>>>> be typed. For the use cases of ingesting a saved graph from a file, it
>>>>> can probably be assumed that the top-level objects are vertices since
>>>>> the graph is vertex-centric and everything else follows naturally.
>>>>> I'm not entirely sure what is required for submitting traversals to
>>>>> gremlin server from GLV. However, if this is used for the results
>>>>> from gremlin server then the results could start with any one of path,
>>>>> vertex, edge, property, vertex property, etc. So you'll need that type
>>>>> data there.
>>>>>
>>>>> --
>>>>> Robert Dale
>>>>>
>>>>> On Tue, Jul 12, 2016 at 8:35 AM, Marko Rodriguez <[email protected]
>>>
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I’m not following this PR too closely so what I might be saying is
>>>>> already known/argued against/etc.
>>>>>>
>>>>>> 1. I think we should go with Robert Dale’s proposal of int32,
>>>>> int64, Vertex, uuid, etc. instead of Java class names.
>>>>>> 2. In Java we then have a Map<String,Class> for typecasting
>>>>> accordingly.
>>>>>> 3. This would make GraphSON 2.0 perfect for Bytecode
>>>>> serialization in TINKERPOP-1278.
>>>>>> 4. I think that if a Vertex, Edge, etc. doesn’t have
>> properties,
>>>>> outV, etc. then don’t even have those fields in the representation.
>>>>>> 5. Most of the serialization back and forth will be
>> ReferenceXXX
>>>>> elements and thus, don’t create more Maps/lists for no reason. — less
>> chars.
>>>>>>
>>>>>> For me, my interest in this work is all about a language agnostic
>> way
>>>>> of sending Gremlin traversal bytecode between different languages. This
>>>>> work is exactly what I am looking for.
>>>>>>
>>>>>> Thanks,
>>>>>> Marko.
>>>>>>
>>>>>> http://markorodriguez.com
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Jul 9, 2016, at 9:48 AM, Stephen Mallette <[email protected]>
>>>>> wrote:
>>>>>>>
>>>>>>> With all the work on GLVs and the recent work on GraphSON 2.0, I
>> think
>>>>> it's
>>>>>>> important that we have a solid, efficient, programming language
>> neutral,
>>>>>>> lossless serialization format. Right now that format is GraphSON
>> and it
>>>>>>> works for that purpose (ever more so with 2.0). Given some
>> discussion
>>>>> on
>>>>>>> the GraphSON 2.0 PR driven a bit by Robert Dale:
>>>>>>>
>>>>>>> https://github.com/apache/tinkerpop/pull/351#issuecomment-231157389
>>>>>>>
>>>>>>> I wonder if we shouldn't consider another IO format that has Gremlin
>>>>>>> Server/GLVs in mind. At this point I'm not suggesting anything
>> specific
>>>>> -
>>>>>>> I'm just hanging the idea out for further discussion and brain
>> storming.
>>>>>>> Thoughts?
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Robert Dale
>>>>>
>>>
>>>
>>>
>>> --
>>> Robert Dale
>>
>>
>>
>> --
>> Robert Dale
>>