But wouldn't that mean I need an exclusive lock on the db? I would like to keep the server running against the same data directory.

Regards

On Wed, Nov 23, 2011 at 1:50 PM, Michael Hunger <
michael.hun...@neotechnology.com> wrote:

> Please use EmbeddedGraphDatabase.
>
> EmbeddedReadOnlyGraphDatabase keeps a snapshot of the data in its caches
> and doesn't see later updates.
>
> Michael
>
> On 23.11.2011 at 14:39, Vinicius Carvalho wrote:
>
> > Hi Michael, thanks. The data load was fine, I've used your script with
> > the BatchInserter. Memory footprint was really low, I think the peak
> > was 200mb of heap usage. I did something really silly and left a
> > logger.info in, which slowed things a bit, but the process was really
> > smooth.
> >
> > Many thanks for the help with the query. I'll try this, I'm putting the
> > read-only embedded Neo4j inside our app right now. I expect to see a
> > good performance boost :)
> >
> > Best Regards
> >
> > On Wed, Nov 23, 2011 at 12:12 PM, Michael Hunger <
> > michael.hun...@neotechnology.com> wrote:
> >
> >> Vinicius,
> >>
> >> first: did you have any issues importing the data into Neo4j?
> >> second: your example used cypher, which is not optimized for
> >> performance (yet!). This is in our plans for the next two releases of
> >> neo4j.
> >>
> >> So if you want to see the real performance of neo4j, please use the
> >> traversal framework or the core-API:
> >>
> >> Cypher & Traversals:
> >>
> >>     // define
> >>     cypherQuery = cypherParser.parse("start n=node({start_node}) match n-->()-->x return x")
> >>     traversalQuery = Traversal.description().evaluator(Evaluators.atDepth(2)).expand(Traversal.expanderForAllTypes(Direction.OUTGOING))
> >>
> >>     // execute
> >>     for (Node n : cypherQuery.execute({"start_node":startNode})) { ... }
> >>     for (Node n : traversalQuery.traverse(startNode).nodes()) { ... }
> >>
> >> If you're interested in the paths, remove the ".nodes()" call on the
> >> traverser.
> >>
> >> In java core-api code:
> >>
> >>     Node start = db.getNodeById(3);
> >>
> >>     for (Relationship first : start.getRelationships()) {
> >>         Node second = first.getOtherNode(start);
> >>         for (Relationship next : second.getRelationships()) {
> >>             Node third = next.getOtherNode(second);
> >>             // do something with the 3 nodes, 2 relationships which
> >>             // form your path
> >>         }
> >>     }
> >>
> >> In the REST API the traversal would look like this (see
> >> http://docs.neo4j.org/chunked/snapshot/rest-api-traverse.html#rest-api-traversal-using-a-return-filter
> >> ):
> >>
> >> * POST http://localhost:7474/db/data/node/3/traverse/node
> >> * Accept: application/json
> >> * Content-Type: application/json
> >>
> >>     {
> >>       "relationships" : [ {"direction" : "out" } ],
> >>       "max_depth" : 3
> >>     }
> >>
> >>
> >> On 23.11.2011 at 11:54, Vinicius Carvalho wrote:
> >>
> >>> Hi there, I posted a few days ago about the POC I'm doing here at my
> >>> company. I have some initial numbers and I'd like to ask for some
> >>> help here in order to promote neo4j here in LMI Ericsson.
> >>>
> >>> I've loaded a MySQL db with a really simple entity that pretty much
> >>> only represents a node and its relations (the only properties it has
> >>> are a UID and an x/y space coordinate for each node).
> >>>
> >>> The DB contains 250.000 cells and 19. relations stored in a MyISAM
> >>> table, indexed only by its primary key. Please find the DDL for the
> >>> two tables.
> >>>
> >>>     CREATE TABLE `pci`.`cells` (
> >>>       `id` varchar(32) collate utf8_bin NOT NULL,
> >>>       `x_pos` double default NULL,
> >>>       `y_pos` double default NULL,
> >>>       `pci` smallint(6) default '0',
> >>>       PRIMARY KEY (`id`)
> >>>     )
> >>>
> >>>     CREATE TABLE `pci`.`relations` (
> >>>       `id` int(11) NOT NULL auto_increment,
> >>>       `source` varchar(32) collate utf8_bin default NULL,
> >>>       `target` varchar(32) collate utf8_bin default NULL,
> >>>       PRIMARY KEY (`id`),
> >>>       KEY `src_idx` (`source`),
> >>>       KEY `src_target` (`target`)
> >>>     )
> >>>
> >>> So as you can see, a simple secondary table contains the
> >>> relationships, with source and target pointing to the cells table.
> >>>
> >>> I've loaded this exact same DB into a neoserver running on the same
> >>> machine: a blade with 26 CPUs (6 cores each) and 16 GB RAM.
> >>>
> >>> One of the requirements we have is to find all associations of my
> >>> associations. Something that in neo I did like this:
> >>>
> >>>     START n = node(3)
> >>>     MATCH n-->()-->(x)
> >>>     return x
> >>>
> >>> For this specific node it returns 6475 nodes.
> >>>
> >>> I have tested this before using Hibernate in two modes: without an L2
> >>> cache, and with an L2 cache (Ehcache standalone, no replication).
> >>> Here's a snippet of the code that loads it, so you can understand
> >>> what's going on under the hood:
> >>>
> >>>     @Override
> >>>     public List<Cell> loadCellWithRealtions(String... ids) {
> >>>         Session session = (Session) em.getDelegate();
> >>>         Criteria c = session.createCriteria(Cell.class)
> >>>             .setFetchMode("incomingRelations", FetchMode.SELECT)
> >>>             .setFetchMode("outgoingRelations", FetchMode.SELECT)
> >>>             .add(Restrictions.in("id", Arrays.asList(ids)));
> >>>         List<Cell> results = c.list();
> >>>         for (Cell cell : results) {
> >>>             Hibernate.initialize(cell.getIncomingRelations());
> >>>             Hibernate.initialize(cell.getOutgoingRelations());
> >>>         }
> >>>         return results;
> >>>     }
> >>>
> >>>     @Override
> >>>     public List<Cell> loadCellWithNeighbourRelations(String... ids) {
> >>>         List<Cell> cells = loadCellWithRealtions(ids);
> >>>         for (Cell c : cells) {
> >>>             for (Relation r : c.getIncomingRelations()) {
> >>>                 Hibernate.initialize(r.getSource().getIncomingRelations());
> >>>                 Hibernate.initialize(r.getSource().getOutgoingRelations());
> >>>             }
> >>>             for (Relation r : c.getOutgoingRelations()) {
> >>>                 Hibernate.initialize(r.getTarget().getIncomingRelations());
> >>>                 Hibernate.initialize(r.getTarget().getOutgoingRelations());
> >>>             }
> >>>         }
> >>>         return cells;
> >>>     }
> >>>
> >>> So the first method executes one query and 2 subselects to find a
> >>> cell and all its relations; the second method iterates over each
> >>> relation and does the same. So I will end up with roughly 3 + r*3
> >>> selects on the db, where r is the number of relations, right?
> >>>
> >>> OK, to be a bit fair with the tests, I ran this for the same node 10
> >>> times (to give the caches a chance to warm up), excluded the longest
> >>> and shortest results, and then took the mean of the rest. Here are
> >>> the results:
> >>>
> >>> EhCache: 70ms
> >>> Plain Hibernate: 550ms
> >>>
> >>> I still don't have a version of the neo4j code running integrated in
> >>> the app server, but the idea is to use the REST API. Running the
> >>> query over the REST API took over 2 seconds on average, but due to
> >>> the large size of the response, network lag was the issue. So I ran
> >>> the same query 10 times using the web console, and the average time
> >>> for neo was 300ms.
> >>>
> >>> Before asking anything: I do know that we will have more complex
> >>> queries where neo will shine, but I need to improve those results in
> >>> order to sell it here :). With those numbers, people will just say
> >>> that having a cache and using the relational model would suffice.
> >>>
> >>> Anything I could do to improve this?
> >>>
> >>> Regards
> >>> _______________________________________________
> >>> Neo4j mailing list
> >>> User@lists.neo4j.org
> >>> https://lists.neo4j.org/mailman/listinfo/user
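
For anyone who wants to repeat the measurement described in this thread, here is a minimal sketch of the two pieces involved: the two-hop lookup (the x in n-->()-->x) and the "run 10 times, drop the longest and shortest, average the rest" timing. It is not Neo4j code. A plain in-memory adjacency map stands in for the graph, the lookup returns distinct nodes rather than one row per path, and the names TwoHopBench, twoHop and trimmedMean are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TwoHopBench {

    // Distinct nodes reachable by exactly two outgoing hops from start.
    // "out" maps a node id to the ids its outgoing relations point to (assumed layout).
    public static Set<Integer> twoHop(Map<Integer, List<Integer>> out, int start) {
        Set<Integer> result = new LinkedHashSet<>();
        for (int second : out.getOrDefault(start, List.of())) {
            for (int third : out.getOrDefault(second, List.of())) {
                result.add(third);
            }
        }
        return result;
    }

    // Mean of the samples after dropping the single smallest and single largest value.
    public static double trimmedMean(long[] samples) {
        if (samples.length < 3) {
            throw new IllegalArgumentException("need at least 3 samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        long sum = 0;
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2);
    }

    public static void main(String[] args) {
        // Tiny made-up graph: 1 -> 2, 3; 2 -> 4, 5; 3 -> 5, 6
        Map<Integer, List<Integer>> out = Map.of(
                1, List.of(2, 3),
                2, List.of(4, 5),
                3, List.of(5, 6));

        long[] nanos = new long[10];
        Set<Integer> found = Set.of();
        for (int run = 0; run < nanos.length; run++) {
            long begin = System.nanoTime();
            found = twoHop(out, 1);
            nanos[run] = System.nanoTime() - begin;
        }

        System.out.println("two-hop nodes: " + found);
        System.out.println("trimmed mean (ns): " + trimmedMean(nanos));
    }
}
```

To time a real store, the body of twoHop would be swapped for one of the traversal or core-API calls quoted earlier in the thread, with the timing loop left unchanged, so that every variant is measured the same way.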