Right, there could be even faster ways, but they would need a few additional methods in NodeManager :)
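
For reference, the id-scan idea discussed in the quoted mail below can be sketched without any Neo4j dependency. This is only an illustration under stated assumptions: `IdScanIterator` and its `lookup` function are hypothetical names that are not part of Neo4j, the lookup signals an unused id by returning null (the quoted code catches NotFoundException instead), and it uses Java 8 lambdas for brevity.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.LongFunction;

// Hypothetical helper, not part of Neo4j: scans ids 0..highId and skips gaps.
final class IdScanIterator<T> implements Iterator<T> {
    private final long highId;             // highest id that may be in use
    private final LongFunction<T> lookup;  // assumed to return null for an unused id
    private long nextId = 0;
    private T current = null;

    IdScanIterator(long highId, LongFunction<T> lookup) {
        this.highId = highId;
        this.lookup = lookup;
    }

    @Override
    public synchronized boolean hasNext() {
        // Advance until an id resolves or the id range is exhausted.
        while (current == null && nextId <= highId) {
            current = lookup.apply(nextId++);
        }
        return current != null;
    }

    @Override
    public synchronized T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = current;
        current = null; // forces the next hasNext() call to keep scanning
        return result;
    }

    public static void main(String[] args) {
        // Pretend that only even ids are in use; odd ids behave like deleted records.
        Iterator<String> it = new IdScanIterator<>(5, id -> id % 2 == 0 ? "rel" + id : null);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

The cost of such a scan grows with the highest id rather than with the number of live records, so a store with many deleted ids pays for every gap.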

Cheers
Michael

On 23.07.2011 at 15:30, John cyuczieekc wrote:

> Hey Michael,
> I took a very quick look and I think I understand it: it looks like it
> attempts to get nodes by id, starting from 0 up to the highest possible ID.
>     public synchronized boolean hasNext()
>     {
>         while ( currentNode == null && currentNodeId <= highId )
>         {
>             try
>             {
>                 currentNode = getNodeById( currentNodeId++ );
>             }
>             catch ( NotFoundException e )
>             {
>                 // ok we try next
>             }
>         }
>         return currentNode != null;
>     }
> (seems to work even when neo4j recycles deleted ids;
> highId=100012 in my case where I had 100,011 relationships each with unique
> nodes, so likely 100,012 nodes)
>
> Thank you for pointing me to that, I will consider doing the same with
> getRelationshipById() and bench them both xD
> Done, looks like it halves the time when using getAllRels()
> i.e. (output)
> counting inside the same transaction...
> Node `one` has 1,753,000 out rels, time=10,235,246,378
> tx.finish() time=1,070,892,633
> counting one's outgoing rels (outside of initial transaction)...
> Node `one` has 1,753,000 out rels, time=1,743,920,993
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=26,404,843,957 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=699,410,620 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,667,748,552 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=655,286,990 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=*1,261*,898,459 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=*663*,559,121 ns
> counting all outgoing rels of all nodes ...
> via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,257,557,629 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=637,826,233 ns
> Shutting down database ...
>
> and the sample program for this(I did have to add getAllRels() which is
> similar with getAllNodes()):
> -------------------- in EmbeddedGraphDbImpl.java
>     public Iterable<Relationship> getAllRels() {
>         return new Iterable<Relationship>() {
>
>             @Override
>             public Iterator<Relationship> iterator() {
>                 long highId = nodeManager.getHighestPossibleIdInUse( Relationship.class );
>                 return new AllRelsIterator( highId );
>             }
>         };
>     }
>
>     private class AllRelsIterator implements Iterator<Relationship> {
>
>         private final long highId;
>         private long currentRelId = 0;
>         private Relationship currentRel = null;
>
>
>         AllRelsIterator( long highId ) {
>             this.highId = highId;
>         }
>
>
>         @Override
>         public synchronized boolean hasNext() {
>             while ( currentRel == null && currentRelId <= highId ) {
>                 try {
>                     currentRel = getRelationshipById( currentRelId++ );
>                 } catch ( NotFoundException e ) {
>                     // ok we try next
>                 }
>             }
>             return currentRel != null;
>         }
>
>
>         @Override
>         public synchronized Relationship next() {
>             if ( !hasNext() ) {
>                 throw new NoSuchElementException();
>             }
>
>             Relationship nextNode = currentRel;
>             currentRel = null;
>             return nextNode;
>         }
>
>
>         @Override
>         public void remove() {
>             throw new UnsupportedOperationException();
>         }
>     }
>
> --------------
> /**
>  * Licensed to Neo Technology under one or more contributor
>  * license agreements. See the NOTICE file distributed with
>  * this work for additional information regarding copyright
>  * ownership. Neo Technology licenses this file to you under
>  * the Apache License, Version 2.0 (the "License"); you may
>  * not use this file except in compliance with the License.
>  * You may obtain a copy of the License at
>  *
>  * http://www.apache.org/licenses/LICENSE-2.0
>  *
>  * Unless required by applicable law or agreed to in writing,
>  * software distributed under the License is distributed on an
>  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
>  * KIND, either express or implied. See the License for the
>  * specific language governing permissions and limitations
>  * under the License.
>  */
> package org.neo4j.examples;
>
> import java.io.*;
> import java.text.*;
>
> import org.neo4j.graphdb.*;
> import org.neo4j.graphdb.index.*;
> import org.neo4j.kernel.*;
>
>
>
> public class CalculateShortestPath {
>
>     private static final int SHOWINFO_IF_COUNTING_REL_TOOK_MORE_THAN_ns = 2 * 300;
>     private static final int SHOWINFO_IF_REL_TOOK_MORE_THAN_ns = 30000;
>     private static final int SHOWEVERY_xTH_REL = 10000;
>     private static final int HOWMANY_RELATIONSHIPS = 109000;
>     private static final String DB_PATH = "neo4j-shortest-path";
>     private static final String NAME_KEY = "name";
>     private static RelationshipType KNOWS = DynamicRelationshipType.withName( "KNOWS" );
>
>     private static GraphDatabaseService graphDb;
>     private static Index<Node> indexService;
>     private static DecimalFormat commaDelimitedFormatter = new DecimalFormat( "###,###" );
>
>
>     public static String number( double val ) {
>         return commaDelimitedFormatter.format( val );
>     }
>
>
>     public static void main( final String[] args ) {
>         // deleteFileOrDirectory( new File( DB_PATH ) );// XXX:
>         graphDb = new EmbeddedGraphDatabase( DB_PATH );
>         registerShutdownHook();
>         indexService = graphDb.index().forNodes( "nodes" );
>         Transaction rootTx;
>         rootTx = graphDb.beginTx();
>         Node one = getOrCreateNode( "one" );
>         DynamicRelationshipType moo = DynamicRelationshipType.withName( "moo" );
>         try {
>             for ( int i = 1; i <= HOWMANY_RELATIONSHIPS; i++ ) {
>                 long start = System.nanoTime();
>                 Relationship rel = one.createRelationshipTo( graphDb.createNode(),
>                         moo );
>                 long end = System.nanoTime();
>                 if ( ( i % SHOWEVERY_xTH_REL == 0 ) || ( end - start > SHOWINFO_IF_REL_TOOK_MORE_THAN_ns ) ) {
>                     System.out.println( number( i ) + " timeDelta=" + number( end - start ) );
>                 }
>             }
>
>             System.out.println( "counting inside the same transaction..." );
>             long start = System.nanoTime();
>             Iterable<Relationship> rel = one.getRelationships( Direction.OUTGOING, moo );
>             long count = 0;
>             long tstart = 0;
>             for ( Relationship relationship : rel ) {
>                 // long tend = System.nanoTime();
>                 count++;
>                 // if ( ( tend - tstart > SHOWINFO_IF_COUNTING_REL_TOOK_MORE_THAN_ns ) ) {
>                 // System.out.println( number( count ) + " timeDelta=" + number( tend - tstart ) );
>                 // }
>                 // tstart = System.nanoTime();
>             }
>             long end = System.nanoTime();
>             System.out.println( "Node `" + one.getProperty( NAME_KEY ) + "` has " + number( count ) + " out rels, time="
>                     + number( end - start ) );
>
>             rootTx.success();
>         } finally {
>             long start = System.nanoTime();
>             rootTx.finish();
>             long end = System.nanoTime();
>             System.out.println( "tx.finish() time=" + number( end - start ) );
>         }
>
>
>         System.out.println( "counting one's outgoing rels (outside of initial transaction)..." );
>         long start = System.nanoTime();
>         Iterable<Relationship> rel = one.getRelationships( Direction.OUTGOING, moo );
>         long count = 0;
>         for ( Relationship relationship : rel ) {
>             count++;
>         }
>         long end = System.nanoTime();
>         System.out.println( "Node `" + one.getProperty( NAME_KEY ) + "` has " + number( count ) + " out rels, time="
>                 + number( end - start ) );
>
>         int repeat = 3;
>         do {
>             System.out.println( "counting all outgoing rels of all nodes ... via getAllNodes" );
>             count = 0;
>             start = System.nanoTime();
>             Iterable<Node> allNodes = graphDb.getAllNodes();
>             for ( Node node : allNodes ) {
>                 Iterable<Relationship> allRels = node.getRelationships( Direction.OUTGOING );
>                 for ( Relationship relationship : allRels ) {
>                     count++;
>                 }
>             }
>             end = System.nanoTime();
>             System.out.println( "all relationships in the database = " + number( count ) + " timedelta=" + number( end - start )
>                     + " ns" );
>
>
>             System.out.println( "counting all Relationships ... via getAllRels" );
>             count = 0;
>             start = System.nanoTime();
>             Iterable<Relationship> allRels = graphDb.getAllRels();
>             for ( Relationship relationship : allRels ) {
>                 count++;
>             }
>             end = System.nanoTime();
>             System.out.println( "all relationships in the database = " + number( count ) + " timedelta=" + number( end - start )
>                     + " ns" );
>         } while ( repeat-- > 0 );
>     }
>
>
>     private static Node getOrCreateNode( String name ) {
>         Node node = indexService.get( NAME_KEY, name ).getSingle();
>         if ( node == null ) {
>             System.out.println( "creating new node with name=" + name );
>             node = graphDb.createNode();
>             node.setProperty( NAME_KEY, name );
>             indexService.add( node, NAME_KEY, name );
>         }
>         return node;
>     }
>
>
>     private static void registerShutdownHook() {
>         // Registers a shutdown hook for the Neo4j instance so that it
>         // shuts down nicely when the VM exits (even if you "Ctrl-C" the
>         // running example before it's completed)
>         Runtime.getRuntime().addShutdownHook( new Thread() {
>
>             @SuppressWarnings( "synthetic-access" )
>             @Override
>             public void run() {
>                 System.out.println( "Shutting down database ..."
>                         );
>                 graphDb.shutdown();
>             }
>         } );
>     }
>
>
>     private static void deleteFileOrDirectory( File file ) {
>         if ( file.exists() ) {
>             if ( file.isDirectory() ) {
>                 for ( File child : file.listFiles() ) {
>                     deleteFileOrDirectory( child );
>                 }
>             }
>             file.delete();
>         }
>     }
> }
>
> ========
> here's another output when no additions were done (using same sample
> program) but trying to count the relationships via getAllRels first, then
> getAllNodes, and repeating this block:
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=*10,1*02,256,698 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=*29,9*02,912,485 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=687,562,153 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,377,601,567 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=*648*,269,229 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=*1,371*,030,811 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=756,425,427 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,256,402,192 ns
> Shutting down database ...
>
> ======
> and here's another way:
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=*10,154*,915,686 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=*690*,298,744 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=649,536,192 ns
> counting all Relationships ...
> via getAllRels
> all relationships in the database = 1,753,000 timedelta=1,978,228,693 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=653,655,768 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=647,365,027 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=*28,549*,060,316 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,355,109,837 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,442,695,434 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,438,563,566 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,366,895,645 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=*1,384*,237,380 ns
> Shutting down database ...
>
> I'll paste the program as it is for this last test (though it's the same
> one, but a bit reordered):
> I also cleaned it up some:
> /**
>  * Licensed to Neo Technology under one or more contributor
>  * license agreements. See the NOTICE file distributed with
>  * this work for additional information regarding copyright
>  * ownership. Neo Technology licenses this file to you under
>  * the Apache License, Version 2.0 (the "License"); you may
>  * not use this file except in compliance with the License.
>  * You may obtain a copy of the License at
>  *
>  * http://www.apache.org/licenses/LICENSE-2.0
>  *
>  * Unless required by applicable law or agreed to in writing,
>  * software distributed under the License is distributed on an
>  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
>  * KIND, either express or implied. See the License for the
>  * specific language governing permissions and limitations
>  * under the License.
>  */
> package org.neo4j.examples;
>
> import java.io.*;
> import java.text.*;
>
> import org.neo4j.graphdb.*;
> import org.neo4j.graphdb.index.*;
> import org.neo4j.kernel.*;
>
>
>
> public class Copy_2_of_CalculateShortestPath {
>
>     private static final String DB_PATH = "neo4j-shortest-path";
>     private static final String NAME_KEY = "name";
>
>     private static GraphDatabaseService graphDb;
>     private static Index<Node> indexService;
>     private static DecimalFormat commaDelimitedFormatter = new DecimalFormat( "###,###" );
>
>
>     public static String number( double val ) {
>         return commaDelimitedFormatter.format( val );
>     }
>
>
>     public static void main( final String[] args ) {
>         // deleteFileOrDirectory( new File( DB_PATH ) );// XXX:
>         graphDb = new EmbeddedGraphDatabase( DB_PATH );
>         registerShutdownHook();
>         indexService = graphDb.index().forNodes( "nodes" );
>         Transaction rootTx;
>         Node one = getOrCreateNode( "one" );
>         DynamicRelationshipType moo = DynamicRelationshipType.withName( "moo" );
>
>         int repeat = 5;
>         do {
>             System.out.println( "counting all Relationships ... via getAllRels" );
>             long count = 0;
>             long start = System.nanoTime();
>             Iterable<Relationship> allRels = graphDb.getAllRels();
>             for ( Relationship relationship : allRels ) {
>                 count++;
>             }
>             long end = System.nanoTime();
>             System.out.println( "all relationships in the database = " + number( count ) + " timedelta=" + number( end - start )
>                     + " ns" );
>         } while ( repeat-- > 0 );
>
>         repeat = 5;
>         do {
>             System.out.println( "counting all outgoing rels of all nodes ... via getAllNodes" );
>             long count = 0;
>             long start = System.nanoTime();
>             Iterable<Node> allNodes = graphDb.getAllNodes();
>             for ( Node node : allNodes ) {
>                 Iterable<Relationship> allRels2 = node.getRelationships( Direction.OUTGOING );
>                 for ( Relationship relationship : allRels2 ) {
>                     count++;
>                 }
>             }
>             long end = System.nanoTime();
>             System.out.println( "all relationships in the database = " + number( count ) + " timedelta=" + number( end - start )
>                     + " ns" );
>         } while ( repeat-- > 0 );
>     }
>
>
>     private static Node getOrCreateNode( String name ) {
>         Node node = indexService.get( NAME_KEY, name ).getSingle();
>         if ( node == null ) {
>             System.out.println( "creating new node with name=" + name );
>             node = graphDb.createNode();
>             node.setProperty( NAME_KEY, name );
>             indexService.add( node, NAME_KEY, name );
>         }
>         return node;
>     }
>
>
>     private static void registerShutdownHook() {
>         // Registers a shutdown hook for the Neo4j instance so that it
>         // shuts down nicely when the VM exits (even if you "Ctrl-C" the
>         // running example before it's completed)
>         Runtime.getRuntime().addShutdownHook( new Thread() {
>
>             @SuppressWarnings( "synthetic-access" )
>             @Override
>             public void run() {
>                 System.out.println( "Shutting down database ..." );
>                 graphDb.shutdown();
>             }
>         } );
>     }
>
>
>     private static void deleteFileOrDirectory( File file ) {
>         if ( file.exists() ) {
>             if ( file.isDirectory() ) {
>                 for ( File child : file.listFiles() ) {
>                     deleteFileOrDirectory( child );
>                 }
>             }
>             file.delete();
>         }
>     }
> }
>
> All in all, a little better, but not making much difference for me at the
> moment, until reaching some very high amount of relationships or doing lots
> of counts.
> Well great, see you later :)
>
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=11,904,833,914 ns
> counting all Relationships ...
> via getAllRels
> all relationships in the database = 1,753,000 timedelta=715,916,208 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=652,136,373 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=770,903,756 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=651,695,964 ns
> counting all Relationships ... via getAllRels
> all relationships in the database = 1,753,000 timedelta=658,016,686 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=28,067,437,117 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,347,472,686 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,282,134,781 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,309,284,532 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,358,393,266 ns
> counting all outgoing rels of all nodes ... via getAllNodes
> all relationships in the database = 1,753,000 timedelta=1,219,737,032 ns
> Shutting down database ...
>
>
> On Sat, Jul 23, 2011 at 10:51 AM, Michael Hunger <michael.hun...@neotechnology.com> wrote:
>
>> An internal implementation would be probably faster.
>>
>> If timing is that critical for you, you can have a look in
>> EmbeddedGraphDbImpl.getAllNodes() and implement a similar solution for
>> relationships.
>>
>> Cheers
>>
>> Michael
>>
>> On 23.07.2011 at 04:20, John cyuczieekc wrote:
>>
>>> Hey Jim,
>>> I am sort of glad to hear that, maybe in the future I could see a method
>>> like getAllRelationships(), or not, np :)
>>> Yes, using Michael's code works, but ...
>>> total relations count=100,011 timedelta=3,075,897,991 ns
>>> it kind of takes 3 seconds (when not cached) to count 100k relationships
>>> (considering there are 100k+2 unique nodes too)
>>> when cached:
>>> total relations count=100,011 timedelta=154,673,763 ns
>>>
>>> Still, it's pretty fast, but I have to wonder if it would be faster if using
>>> relationships directly :)
>>>
>>> Either way, wish y'all a great day!
>>>
>>>
>>> On Sat, Jul 23, 2011 at 3:57 AM, Jim Webber <j...@neotechnology.com> wrote:
>>>
>>>> Hi John,
>>>>
>>>> Relationships are stored in a different store than nodes. This enables
>>>> Neo4j to manage lifecycle events (like caching) for nodes and relationships
>>>> separately.
>>>>
>>>> Neo4j really is a graph DB, not a triple store masquerading as a graph DB.
>>>>
>>>> Nonetheless, that code Michael sent still works :-)
>>>>
>>>> Jim
>>>> _______________________________________________
>>>> Neo4j mailing list
>>>> User@lists.neo4j.org
>>>> https://lists.neo4j.org/mailman/listinfo/user