Re: [Neo4j] Gremlin performance?

Josef Holy Sat, 23 Jul 2011 09:40:36 -0700

Thanks a lot, guys, for the quick and comprehensive answers!

I was assuming that Gremlin serves only to assemble Pipes, which shouldn't 
impact overall performance too much. We've 'hit the ceiling' implementing 
various custom traversals with the native Neo4j APIs: the algorithms are quite 
lengthy and thus quite hard to maintain and tune. We are hoping that the 
expressiveness of Gremlin (+Groovy) will make things easier, even at the cost 
of a little performance. The numbers Marko provided are promising!


I will give Gremlin a shot and report back with some real numbers.

Thanks a lot!


Cheers!

Josef.

On Saturday, 23 July 2011 at 17:25, Marko Rodriguez wrote: 
> Hi,
> 
> Finally, one point to add.
> 
> If I only need to do a ShortestPath over a particular edge type or a "find all 
> paths" between two vertices and I'm using Neo4j as the graph backend, then I 
> will drop down and use Neo4j's Algo library. This is because their 
> ShortestPath implementation is bi-directional (efficient), and I would 
> otherwise have to write that myself in Gremlin, as Gremlin doesn't provide 
> "out of the box" textbook algorithm support.
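
For readers unfamiliar with the bi-directional search mentioned above, here is a minimal Java sketch of the idea: a breadth-first search is grown from both endpoints at once, and the search stops as soon as the two frontiers touch. This is not Neo4j's Algo implementation and it uses no Neo4j or Gremlin API. The class name BidirectionalBfs, the adjacency-map graph representation, and the restriction to an unweighted, undirected graph are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical helper, not part of Neo4j, Blueprints, or Gremlin.
final class BidirectionalBfs {

    // Returns the number of edges on a shortest path between source and
    // target, or -1 if they are not connected. Assumes an undirected graph,
    // i.e. every edge appears in the adjacency lists of both endpoints.
    static int shortestPathLength(Map<Integer, List<Integer>> adj, int source, int target) {
        if (source == target) {
            return 0;
        }
        Map<Integer, Integer> distFromSource = new HashMap<>();
        Map<Integer, Integer> distFromTarget = new HashMap<>();
        Queue<Integer> sourceFrontier = new ArrayDeque<>();
        Queue<Integer> targetFrontier = new ArrayDeque<>();
        distFromSource.put(source, 0);
        sourceFrontier.add(source);
        distFromTarget.put(target, 0);
        targetFrontier.add(target);

        while (!sourceFrontier.isEmpty() && !targetFrontier.isEmpty()) {
            // Always expand the smaller frontier by one full BFS level.
            int found = sourceFrontier.size() <= targetFrontier.size()
                    ? expandLevel(adj, sourceFrontier, distFromSource, distFromTarget)
                    : expandLevel(adj, targetFrontier, distFromTarget, distFromSource);
            if (found >= 0) {
                return found;
            }
        }
        return -1; // one side ran out of vertices without meeting the other
    }

    // Expands one BFS level of `frontier`. Returns the shortest path length
    // discovered while doing so, or -1 if the two searches did not meet.
    private static int expandLevel(Map<Integer, List<Integer>> adj,
                                   Queue<Integer> frontier,
                                   Map<Integer, Integer> mine,
                                   Map<Integer, Integer> other) {
        int best = -1;
        int levelSize = frontier.size();
        for (int i = 0; i < levelSize; i++) {
            int u = frontier.remove();
            for (int v : adj.getOrDefault(u, List.of())) {
                if (other.containsKey(v)) {
                    // The searches meet on edge (u, v).
                    int length = mine.get(u) + 1 + other.get(v);
                    if (best < 0 || length < best) {
                        best = length;
                    }
                }
                if (!mine.containsKey(v)) {
                    mine.put(v, mine.get(u) + 1);
                    frontier.add(v);
                }
            }
        }
        return best;
    }
}
```

For example, on a graph with the edges 1-2, 2-3, 3-4, 4-5, 1-6 and 6-5, shortestPathLength(adj, 1, 5) returns 2 (the path through vertex 6). The gain over a one-sided search comes from exploring two small neighbourhoods instead of one large one. A directed graph would additionally need a reverse adjacency map for the target side.
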
> 
> TinkerPop plans an algo library for standard graph algorithms whose paths are 
> defined by Pipes/Gremlin, but as of yet, it doesn't exist.
>  See http://markorodriguez.com/2011/02/08/property-graph-algorithms/
> 
> Thanks,
> Marko.
> 
> http://markorodriguez.com
> 
> On Jul 23, 2011, at 9:16 AM, Marko Rodriguez wrote:
> 
> > Hey,
> > 
> > Groovy is only used to compile a statement like "g.v(1).out.in.blah" into a 
> > Pipes pipeline, which is native Java. As such, once the compilation is 
> > complete (milliseconds), it is simply native Java (this is not completely 
> > true, as there are some Gremlin-specific pipes). Next, for the relationship 
> > between Blueprints Neo4jGraph and the native EmbeddedGraphDatabase, see this 
> > from some time ago:
> > 
> > http://groups.google.com/group/gremlin-users/msg/c94dfef8352f68d3
> > 
> > In short, traversing 29.6 million things took:
> >  5.6 seconds via EmbeddedGraphDatabase
> >  6.0 seconds via Neo4jGraph
> > 
> > ** As an aside, the same experiment was run for OrientDB: 7.2 (native 
> > OrientDB) vs. 7.9 (Blueprints OrientGraph).
> > http://groups.google.com/group/gremlin-users/msg/ff5c03e188efcffe
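
To put the figures above in relative terms, the small Java sketch below computes the overhead of the Blueprints wrapper as a percentage of the native time. The class and method names are made up for illustration, and the OrientDB figures are assumed to be in seconds like the Neo4j ones, which the message does not state explicitly.

```java
// Hypothetical helper for reading the benchmark figures quoted above.
final class WrapperOverhead {

    // Relative slowdown of the wrapped API versus the native API, in percent.
    static double relativeOverheadPercent(double nativeTime, double wrappedTime) {
        return (wrappedTime - nativeTime) / nativeTime * 100.0;
    }

    // The two pairs of timings quoted in the thread.
    static double neo4jOverheadPercent() {
        return relativeOverheadPercent(5.6, 6.0);
    }

    static double orientDbOverheadPercent() {
        return relativeOverheadPercent(7.2, 7.9);
    }
}
```

With the quoted numbers this works out to roughly 7% for Neo4jGraph over EmbeddedGraphDatabase and roughly 10% for OrientGraph over native OrientDB.
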
> > 
> > There is more discussion in that particular thread if you are interested.
> > 
> > Finally, with respect to production, I have many clients that use Gremlin 
> > in production. Here are the benefits of doing so:
> >  1. Traversal descriptions are concise and expressive.
> >  - any arbitrary graph computation can be represented and evaluated.
> >  - in language theoretic terms, it can recognize Turing complete paths.
> >  2. Traversal descriptions can be expressed as classes in Groovy and are 
> > thus IDE friendly.
> >  - syntax highlighting, easy to write test cases/debug, etc.
> >  - See slides 234 and 235 from 
> > http://www.slideshare.net/slidarko/the-pathology-of-graph-databases
> > 
> > Thanks,
> > Marko.
> > 
> > http://markorodriguez.com
> > 
> > On Jul 23, 2011, at 2:30 AM, Michael Hunger wrote:
> > 
> > > If you look at the comments on that post:
> > > 
> > > Groovy is only that slow if you implement all the algorithm details in 
> > > Groovy!
> > > 
> > > Gremlin uses Blueprints, which is written in Java. Gremlin is just a DSL 
> > > on top of that API, so it is only used for the construction of the 
> > > underlying pipeline.
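
As a rough illustration of "the DSL only constructs the pipeline", the Java sketch below assembles a chain of steps once and then runs it as ordinary Java. It does not use the real Pipes or Blueprints classes. The Step interface, the out/in/assemble helpers, and the adjacency-map graph are hypothetical stand-ins chosen to mirror an expression such as g.v(1).out.in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for a pipeline library; not the Pipes API.
final class PipelineSketch {

    // A step maps the current list of vertex ids to the next list.
    interface Step extends Function<List<Integer>, List<Integer>> {}

    // Follows outgoing edges from every current vertex.
    static Step out(Map<Integer, List<Integer>> outEdges) {
        return vertices -> vertices.stream()
                .flatMap(v -> outEdges.getOrDefault(v, List.of()).stream())
                .collect(Collectors.toList());
    }

    // Follows incoming edges. The reverse index is built once, at assembly
    // time, so running the step later is a plain map lookup.
    static Step in(Map<Integer, List<Integer>> outEdges) {
        Map<Integer, List<Integer>> inEdges = new HashMap<>();
        outEdges.forEach((from, targets) -> targets.forEach(
                to -> inEdges.computeIfAbsent(to, k -> new ArrayList<>()).add(from)));
        return out(inEdges);
    }

    // The "compile" phase: chain the steps into a single reusable pipeline.
    static Step assemble(Step... steps) {
        return vertices -> {
            List<Integer> current = vertices;
            for (Step step : steps) {
                current = step.apply(current);
            }
            return current;
        };
    }
}
```

A pipeline built with assemble(out(g), in(g)) can be applied to any number of start vertices afterwards. The one-time assembly cost is what the DSL layer contributes, while each later apply call runs only the pre-built Java objects.
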
> > > 
> > > Anyway, the easiest way to see if that holds true is to write a PoC for 
> > > _your_ domain; I think general statements are difficult.
> > > 
> > > But probably Marko has some nice performance benchmarks at hand.
> > > 
> > > Michael
> > > 
> > > On 23.07.2011 at 09:51, Josef Holy wrote:
> > > 
> > > > Hi all,
> > > > 
> > > > does anyone on this list have practical experience with using Gremlin 
> > > > for traversing the EmbeddedGraphDatabase in a production environment? 
> > > > What interests me is how it performs compared to traversal 
> > > > algorithms written directly against the Neo4j APIs (using Traverser, 
> > > > TraversalDescription, etc.). 
> > > > 
> > > > As Gremlin runs on top of Groovy + Pipes + Blueprints, I would expect 
> > > > it to be much slower than the pure Neo4j Java APIs (but is it really SO 
> > > > much slower? 
> > > > http://stronglytypedblog.blogspot.com/2009/07/java-vs-scala-vs-groovy-performance.html
> > > > ).
> > > > 
> > > > 
> > > > Thanks for any comments/experiences!
> > > > 
> > > > 
> > > > Josef.
> > > > 
> > > > _______________________________________________
> > > > Neo4j mailing list
> > > > [email protected]
> > > > https://lists.neo4j.org/mailman/listinfo/user
