Re: [Neo4j] How to boost performance?

Michael Hunger Wed, 23 Nov 2011 04:13:03 -0800

Vinicius,

first: did you have any issues importing the data into Neo4j?
second: your example used Cypher, which is not optimized for performance (yet!).
Optimizing it is in our plans for the next two releases of Neo4j.


So if you want to see the real performance of Neo4j, please use the traversal
framework or the core API:

Cypher & Traversals:

// define
cypherQuery = cypherParser.parse("start n=node({start_node}) match n-->()-->x return x")
traversalQuery = Traversal.description()
    .evaluator(Evaluators.atDepth(2))
    .expand(Traversal.expanderForAllTypes(Direction.OUTGOING))

// execute
for (Node n : cypherQuery.execute({"start_node":startNode})) { ... }
for (Node n : traversalQuery.traverse(startNode).nodes()) { ... }

If you're interested in the paths, remove the ".nodes()" call on the traverser.

In Java core-API code:

Node start = db.getNodeById(3);

// follow outgoing relationships only, to match n-->()-->x
for (Relationship first : start.getRelationships(Direction.OUTGOING)) {
    Node second = first.getOtherNode(start);
    for (Relationship next : second.getRelationships(Direction.OUTGOING)) {
        Node third = next.getOtherNode(second);
        // do something with the 3 nodes, 2 relationships which form your path
    }
}
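
To make the shape of those nested loops concrete without a running database, here is a minimal plain-Java sketch. It does not use the Neo4j API at all: the adjacency map, the class name TwoHopSketch and the method twoHopsOut are hypothetical stand-ins. Collecting the end nodes in a set is also an assumption of the sketch; keep a list instead if you want one entry per path.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the graph: node id -> ids of its outgoing neighbours.
// This is NOT the Neo4j API; it only mirrors the two nested loops.
class TwoHopSketch {

    // Returns every node reachable from 'start' in exactly two outgoing hops.
    static Set<Long> twoHopsOut(Map<Long, List<Long>> outgoing, long start) {
        Set<Long> result = new LinkedHashSet<>();
        for (long second : outgoing.getOrDefault(start, List.of())) {
            for (long third : outgoing.getOrDefault(second, List.of())) {
                // the set keeps a node once even if several paths reach it
                result.add(third);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> outgoing = Map.of(
                3L, List.of(4L, 5L),
                4L, List.of(6L, 7L),
                5L, List.of(7L, 8L));
        System.out.println(twoHopsOut(outgoing, 3L)); // prints [6, 7, 8]
    }
}
```

Node 7 is reachable via both 4 and 5 but shows up only once, which is the effect of using a set here.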

In the REST API the traversal would look like: (see 
http://docs.neo4j.org/chunked/snapshot/rest-api-traverse.html#rest-api-traversal-using-a-return-filter)
    * POST http://localhost:7474/db/data/node/3/traverse/node
    * Accept: application/json
    * Content-Type: application/json 

{
  "relationships" : [ {"direction" : "out" } ],
  "max_depth" : 3
}
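
If you end up calling this from Java, the request could be put together roughly as follows. This is only a sketch under assumptions: it uses the JDK's java.net.http client (Java 11 or later), the class and method names are made up for illustration, and it only builds the request, it does not send it. The URL, headers and JSON shape are the ones shown above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; builds (but does not send) the traversal request.
class RestTraversalRequest {

    // JSON traversal description in the same shape as the body above.
    static String body(int maxDepth) {
        return "{ \"relationships\" : [ { \"direction\" : \"out\" } ], "
                + "\"max_depth\" : " + maxDepth + " }";
    }

    static HttpRequest build(String baseUrl, long nodeId, int maxDepth) {
        URI uri = URI.create(baseUrl + "/db/data/node/" + nodeId + "/traverse/node");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(maxDepth)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:7474", 3, 2);
        // prints: POST http://localhost:7474/db/data/node/3/traverse/node
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The depth is left as a parameter: 2 matches the two hops of n-->()-->x, while the JSON above asks for 3. To actually execute the request you would pass it to HttpClient.send(...) against a running server.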


On 23.11.2011 at 11:54, Vinicius Carvalho wrote:

> Hi there, I posted a few days ago about the POC I'm doing here at my
> company. I have some initial numbers and I'd like to ask for some help
> in order to promote Neo4j here at LMI Ericsson.
> 
> I've loaded a MySQL db with a really simple entity that pretty much only
> represents a node and relations (the only properties it has are a UID and
> x/y space coordinates for each node).
> 
> The DB contains 250.000 cells and 19. relations stored in a MyISAM table,
> indexed only by its primary key. Please find the DDL for the two tables.
> 
> CREATE TABLE  `pci`.`cells` (
>  `id` varchar(32) collate utf8_bin NOT NULL,
>  `x_pos` double default NULL,
>  `y_pos` double default NULL,
>  `pci` smallint(6) default '0',
>  PRIMARY KEY  (`id`)
> )
> 
> CREATE TABLE  `pci`.`relations` (
>  `id` int(11) NOT NULL auto_increment,
>  `source` varchar(32) collate utf8_bin default NULL,
>  `target` varchar(32) collate utf8_bin default NULL,
>  PRIMARY KEY  (`id`),
>  KEY `src_idx` (`source`),
>  KEY `src_target` (`target`)
> )
> 
> So as you can see, a simple secondary table contains the relationship with
> source and targets pointing to the cells table.
> 
> I've loaded this exact same DB into a Neo4j server running on the same
> machine: a blade with 26 CPUs (6 cores each) and 16 GB RAM.
> 
> One of the requirements we have is to find all associations of my
> associations. In Neo4j I did it like this:
> 
> START n = node(3)
> MATCH n-->()-->(x)
> return x
> 
> For this specific node it returns 6475 nodes.
> 
> I have tested this before using Hibernate in two modes: without an L2 cache,
> and with an L2 cache (Ehcache standalone, no replication).
> Here's a snippet of the code that loads it, so you can understand what's
> going on under the hood:
> 
> 
> @Override
> public List<Cell> loadCellWithRealtions(String... ids) {
>     Session session = (Session) em.getDelegate();
>     Criteria c = session.createCriteria(Cell.class)
>         .setFetchMode("incomingRelations", FetchMode.SELECT)
>         .setFetchMode("outgoingRelations", FetchMode.SELECT)
>         .add(Restrictions.in("id", Arrays.asList(ids)));
>     List<Cell> results = c.list();
>     for (Cell cell : results) {
>         Hibernate.initialize(cell.getIncomingRelations());
>         Hibernate.initialize(cell.getOutgoingRelations());
>     }
>     return results;
> }
> 
> @Override
> public List<Cell> loadCellWithNeighbourRelations(String... ids) {
>     List<Cell> cells = loadCellWithRealtions(ids);
>     for (Cell c : cells) {
>         for (Relation r : c.getIncomingRelations()) {
>             Hibernate.initialize(r.getSource().getIncomingRelations());
>             Hibernate.initialize(r.getSource().getOutgoingRelations());
>         }
>         for (Relation r : c.getOutgoingRelations()) {
>             Hibernate.initialize(r.getTarget().getIncomingRelations());
>             Hibernate.initialize(r.getTarget().getOutgoingRelations());
>         }
>     }
>     return cells;
> }
> 
> 
> 
> So the first method executes one query and 2 subselects to find a cell and
> all its relations; the second method iterates over each relation and does the
> same. So I will pretty much have something like 3+r*3 selects on the db, where
> r is the number of relations, right?
> 
> OK, to be a bit fair with the tests, I ran this for the same node 10
> times (to get a chance to warm the caches), excluded the longest and shortest
> results, and then took the mean. Here are the results:
> 
> EhCache: 70ms
> Plain Hibernate: 550ms
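
For comparing the approaches on equal footing, the measurement procedure quoted just above (10 runs, drop the slowest and the fastest, average the rest) can be sketched in plain Java as below. The class and method names are hypothetical, and the Runnable is a placeholder for whatever query you want to time; this is one possible reading of the procedure, not the code that produced the numbers above.

```java
import java.util.Arrays;

// Hypothetical timing helper for the "drop min and max, then average" procedure.
class TrimmedMeanTimer {

    // Mean of the samples after dropping the single smallest and largest value.
    static double trimmedMean(long[] samples) {
        if (samples.length < 3) {
            throw new IllegalArgumentException("need at least 3 samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        long sum = 0;
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2);
    }

    // Runs 'query' the given number of times; returns the trimmed mean in ms.
    static double timeMillis(Runnable query, int runs) {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            query.run();
            nanos[i] = System.nanoTime() - t0;
        }
        return trimmedMean(nanos) / 1_000_000.0;
    }
}
```

Calling TrimmedMeanTimer.timeMillis(() -> runQuery(), 10), where runQuery is whatever you want to measure, gives one number per approach that can be compared directly.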
> 
> I still don't have a version of the Neo4j code running integrated in the app
> server, but the idea is to use the REST API. Running the query on the REST API
> took over 2 seconds on average, but due to the large size of the response,
> network lag was the issue. So I ran the same query 10 times using the
> web console, and the average time for Neo4j was 300ms.
> 
> Before asking anything: I do know that we will have more complex queries
> where Neo4j will shine, but I need to improve those results in order to sell
> it here :). With those numbers, people will just say that having a cache and
> using the relational model would suffice.
> 
> Anything I could do to improve this?
> 
> Regards
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
