Re: [DISCUSS] Graph Core API and TraversalSource

Marko Rodriguez Thu, 12 Nov 2015 09:36:54 -0800

Hi,

Didn't see that you had the code in there. I ran it myself; here are the results:


gremlin> clock {testAddVertex(100000)}
==>40.35240715
gremlin> clock {testAddV(100000)}
==>426.87777037

Marko.

http://markorodriguez.com

On Nov 12, 2015, at 10:33 AM, Marko Rodriguez <[email protected]> wrote:

> Just out of curiosity -- do the testAddVertex() first then do testAddV().
> 
> Marko.
> 
> http://markorodriguez.com
> 
> On Nov 12, 2015, at 10:29 AM, Daniel Kuppitz <[email protected]> wrote:
> 
>> Here's the result of a pretty simple performance comparison:
>> 
>> gremlin> testAddV = { num -> graph = TinkerGraph.open(); g = graph.traversal(); for (i = 0; i < num; i++) { g.addV(id, i).next() } }
>> ==>groovysh_evaluate$_run_closure1@30b9eadd
>> gremlin> testAddVertex = { num -> graph = TinkerGraph.open(); for (i = 0; i < num; i++) { graph.addVertex(id, i) } }
>> ==>groovysh_evaluate$_run_closure1@2e647e59
>> gremlin> clock {testAddV(100000)}
>> ==>462.04376528
>> gremlin> clock {testAddVertex(100000)}
>> ==>70.90365949999999
>> 
>> 
>> As you can see, addVertex() is roughly 6.5x as fast as addV() in this run.
>> However, if you rely on traversal strategies, you would, of course, still
>> prefer addV() over addVertex().
>> 
>> Cheers,
>> Daniel
>> 
>> 
>> On Thu, Nov 12, 2015 at 6:13 PM, Stephen Mallette <[email protected]>
>> wrote:
>> 
>>> I think we have a somewhat confusing story about Graph.addVertex() and
>>> GraphTraversalSource.addV().  We've wanted to promote use of
>>> TraversalSource, but our docs make a fair bit of use of Graph.addVertex()
>>> and Vertex.addEdge() in various places.  It seems that if we want to
>>> downplay core Graph API methods, we should limit their use in the docs to
>>> this section only:
>>> 
>>> http://tinkerpop.incubator.apache.org/docs/3.0.2-incubating/#_the_graph_structure
>>> 
>>> Of course, @dkuppitz made the side comment to me that he would never use
>>> addV() when data loading, citing possible performance reasons.
>>> 
>>> I'd also note that for simple data loading use cases the
>>> GraphTraversalSource.addE() isn't quite as intuitive to use as
>>> Vertex.addEdge():
>>> 
>>> gremlin> v1 = g.addV(id, 1, label, "person", "name", "marko", "age", 29).next()
>>> ==>v[1]
>>> gremlin> v2 = g.addV(id, 3, label, "software", "name", "lop", "lang", "java").next()
>>> ==>v[3]
>>> gremlin> g.V(v1).as('a').V(v2).addInE('created', 'a', "weight", 0.4)
>>> ==>e[4][1-created->3]
>>> 
>>> compared with just:
>>> 
>>> gremlin> v1.addEdge("created", v2, id, 9, "weight", 0.4)
>>> ==>e[9][1-created->3]
>>> 
>>> So, up for discussion is: Do we promote core Graph API methods for bulk
>>> loading? Or do we promote consistent use of GraphTraversalSource in all
>>> cases?
>>> 
>
