Hey Michael,
I took a very quick look and I think I understand it: it tries to get nodes
by ID, starting from 0 and going up to the highest possible ID.
    public synchronized boolean hasNext()
    {
        while ( currentNode == null && currentNodeId <= highId )
        {
            try
            {
                currentNode = getNodeById( currentNodeId++ );
            }
            catch ( NotFoundException e )
            {
                // ok we try next
            }
        }
        return currentNode != null;
    }
(This seems to work even when Neo4j recycles deleted IDs;
highId=100012 in my case, where I had 100,011 relationships, each with its
own unique node, so likely 100,012 nodes.)
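To make the scan concrete, here is a minimal self-contained sketch of the same pattern with no Neo4j dependency. Everything in it is a stand-in I made up for illustration: `STORE` plays the role of the node store, `getById()` plays the role of getNodeById(), and the local `NotFoundException` is not the Neo4j class. It only shows how IDs that are not in use get skipped.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

public class IdScanSketch {
    // Hypothetical stand-in for the store: id -> record, with gaps for unused ids.
    static final Map<Long, String> STORE = new TreeMap<>();

    // Local stand-in exception, NOT the Neo4j NotFoundException.
    static class NotFoundException extends RuntimeException {
        NotFoundException(String message) {
            super(message);
        }
    }

    // Mimics a get-by-id lookup that throws when the id is not in use.
    static String getById(long id) {
        String record = STORE.get(id);
        if (record == null) {
            throw new NotFoundException("no record with id " + id);
        }
        return record;
    }

    // Scans ids 0..highId (inclusive) and skips the ones that are not in use.
    static Iterator<String> scan(final long highId) {
        return new Iterator<String>() {
            private long currentId = 0;
            private String current = null;

            @Override
            public boolean hasNext() {
                while (current == null && currentId <= highId) {
                    try {
                        current = getById(currentId++);
                    } catch (NotFoundException e) {
                        // id not in use, try the next one
                    }
                }
                return current != null;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String result = current;
                current = null; // forces hasNext() to look ahead again
                return result;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    // Drains the scan into a list so the result is easy to inspect.
    static List<String> collect(long highId) {
        List<String> found = new ArrayList<>();
        Iterator<String> it = scan(highId);
        while (it.hasNext()) {
            found.add(it.next());
        }
        return found;
    }

    public static void main(String[] args) {
        STORE.put(0L, "a");
        STORE.put(2L, "c");
        STORE.put(5L, "f");
        System.out.println(collect(5)); // ids 1, 3 and 4 are skipped
    }
}
```

Note that the cost of such a scan grows with highId rather than with the number of live records, so a store with many unused IDs pays for every failed lookup.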
Thank you for pointing me to that. I'll try doing the same with
getRelationshipById() and benchmark them both xD
Done. Once the cache is warm, counting via getAllRels() takes roughly half
the time of going through getAllNodes(). Output:
counting inside the same transaction...
Node `one` has 1,753,000 out rels, time=10,235,246,378
tx.finish() time=1,070,892,633
counting one's outgoing rels (outside of initial transaction)...
Node `one` has 1,753,000 out rels, time=1,743,920,993
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=26,404,843,957 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=699,410,620 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,667,748,552 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=655,286,990 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=*1,261*,898,459 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=*663*,559,121 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,257,557,629 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=637,826,233 ns
Shutting down database ...
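As a quick check of the "halves the time" reading, the ratio can be computed straight from the timings printed above. This is only arithmetic on the reported numbers; the class and method names are made up for the sketch.

```java
import java.util.Locale;

public class SpeedupSketch {
    // Ratio of two timings given in nanoseconds.
    static double ratio(long slowerNs, long fasterNs) {
        return (double) slowerNs / fasterNs;
    }

    public static void main(String[] args) {
        // Third getAllNodes / getAllRels pair from the output above (warm cache).
        long warmViaGetAllNodes = 1_261_898_459L;
        long warmViaGetAllRels = 663_559_121L;

        // First pair from the output above. getAllNodes ran first here, so it
        // most likely also paid for loading everything into the cache.
        long firstViaGetAllNodes = 26_404_843_957L;
        long firstViaGetAllRels = 699_410_620L;

        System.out.println(String.format(Locale.ROOT, "warm runs:  %.2fx",
                ratio(warmViaGetAllNodes, warmViaGetAllRels)));
        System.out.println(String.format(Locale.ROOT, "first pair: %.2fx",
                ratio(firstViaGetAllNodes, firstViaGetAllRels)));
    }
}
```

The warm ratio comes out at a bit under 2, which matches "halves the time"; the first pair is not a fair comparison because of the run order.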
And here is the sample program that produced it (I had to add getAllRels(),
which is similar to getAllNodes()):
-------------------- in EmbeddedGraphDbImpl.java
    public Iterable<Relationship> getAllRels() {
        return new Iterable<Relationship>() {
            @Override
            public Iterator<Relationship> iterator() {
                long highId = nodeManager.getHighestPossibleIdInUse( Relationship.class );
                return new AllRelsIterator( highId );
            }
        };
    }

    private class AllRelsIterator implements Iterator<Relationship> {
        private final long highId;
        private long currentRelId = 0;
        private Relationship currentRel = null;

        AllRelsIterator( long highId ) {
            this.highId = highId;
        }

        @Override
        public synchronized boolean hasNext() {
            // look ahead until a relationship is found or all ids are used up
            while ( currentRel == null && currentRelId <= highId ) {
                try {
                    currentRel = getRelationshipById( currentRelId++ );
                } catch ( NotFoundException e ) {
                    // ok we try next
                }
            }
            return currentRel != null;
        }

        @Override
        public synchronized Relationship next() {
            if ( !hasNext() ) {
                throw new NoSuchElementException();
            }
            Relationship nextRel = currentRel;
            currentRel = null;
            return nextRel;
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }
    }
--------------
/**
* Licensed to Neo Technology under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Neo Technology licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.neo4j.examples;
import java.io.*;
import java.text.*;
import org.neo4j.graphdb.*;
import org.neo4j.graphdb.index.*;
import org.neo4j.kernel.*;
public class CalculateShortestPath {
private static final int
SHOWINFO_IF_COUNTING_REL_TOOK_MORE_THAN_ns = 2 * 300;
private static final int SHOWINFO_IF_REL_TOOK_MORE_THAN_ns
= 30000;
private static final int SHOWEVERY_xTH_REL
= 10000;
private static final int HOWMANY_RELATIONSHIPS
= 109000;
private static final String DB_PATH
= "neo4j-shortest-path";
private static final String NAME_KEY
= "name";
private static RelationshipType KNOWS
= DynamicRelationshipType
.withName( "KNOWS" );
private static GraphDatabaseService graphDb;
private static Index<Node> indexService;
private static DecimalFormat commaDelimitedFormatter
= new DecimalFormat( "###,###" );
public static String number( double val ) {
return commaDelimitedFormatter.format( val );
}
public static void main( final String[] args ) {
// deleteFileOrDirectory( new File( DB_PATH ) );// XXX:
graphDb = new EmbeddedGraphDatabase( DB_PATH );
registerShutdownHook();
indexService = graphDb.index().forNodes( "nodes" );
Transaction rootTx;
rootTx = graphDb.beginTx();
Node one = getOrCreateNode( "one" );
DynamicRelationshipType moo = DynamicRelationshipType.withName(
"moo" );
try {
for ( int i = 1; i <= HOWMANY_RELATIONSHIPS; i++ ) {
long start = System.nanoTime();
Relationship rel = one.createRelationshipTo(
graphDb.createNode(), moo );
long end = System.nanoTime();
if ( ( i % SHOWEVERY_xTH_REL == 0 ) || ( end - start >
SHOWINFO_IF_REL_TOOK_MORE_THAN_ns ) ) {
System.out.println( number( i ) + " timeDelta=" +
number( end - start ) );
}
}
System.out.println( "counting inside the same transaction..." );
long start = System.nanoTime();
Iterable<Relationship> rel = one.getRelationships(
Direction.OUTGOING, moo );
long count = 0;
long tstart = 0;
for ( Relationship relationship : rel ) {
// long tend = System.nanoTime();
count++;
// if ( ( tend - tstart > SHOWINFO_IF_COUNTING_REL_TOOK_MORE_THAN_ns ) ) {
// System.out.println( number( count ) + " timeDelta=" + number( tend - tstart ) );
// }
// tstart = System.nanoTime();
}
long end = System.nanoTime();
System.out.println( "Node `" + one.getProperty( NAME_KEY ) + "` has "
+ number( count ) + " out rels, time=" + number( end - start ) );
rootTx.success();
} finally {
long start = System.nanoTime();
rootTx.finish();
long end = System.nanoTime();
System.out.println( "tx.finish() time=" + number( end - start ) );
}
System.out.println( "counting one's outgoing rels (outside of initial transaction)..." );
long start = System.nanoTime();
Iterable<Relationship> rel = one.getRelationships(
Direction.OUTGOING, moo );
long count = 0;
for ( Relationship relationship : rel ) {
count++;
}
long end = System.nanoTime();
System.out.println( "Node `" + one.getProperty( NAME_KEY ) + "` has "
+ number( count ) + " out rels, time=" + number( end - start ) );
int repeat = 3;
do {
System.out.println( "counting all outgoing rels of all nodes ... via getAllNodes" );
count = 0;
start = System.nanoTime();
Iterable<Node> allNodes = graphDb.getAllNodes();
for ( Node node : allNodes ) {
Iterable<Relationship> allRels = node.getRelationships(
Direction.OUTGOING );
for ( Relationship relationship : allRels ) {
count++;
}
}
end = System.nanoTime();
System.out.println( "all relationships in the database = " +
number( count ) + " timedelta=" + number( end - start )
+ " ns" );
System.out.println( "counting all Relationships ... via getAllRels" );
count = 0;
start = System.nanoTime();
Iterable<Relationship> allRels = graphDb.getAllRels();
for ( Relationship relationship : allRels ) {
count++;
}
end = System.nanoTime();
System.out.println( "all relationships in the database = " +
number( count ) + " timedelta=" + number( end - start )
+ " ns" );
} while ( repeat-- > 0 );
}
private static Node getOrCreateNode( String name ) {
Node node = indexService.get( NAME_KEY, name ).getSingle();
if ( node == null ) {
System.out.println( "creating new node with name=" + name );
node = graphDb.createNode();
node.setProperty( NAME_KEY, name );
indexService.add( node, NAME_KEY, name );
}
return node;
}
private static void registerShutdownHook() {
// Registers a shutdown hook for the Neo4j instance so that it
// shuts down nicely when the VM exits (even if you "Ctrl-C" the
// running example before it's completed)
Runtime.getRuntime().addShutdownHook( new Thread() {
@SuppressWarnings( "synthetic-access" )
@Override
public void run() {
System.out.println( "Shutting down database ..." );
graphDb.shutdown();
}
} );
}
private static void deleteFileOrDirectory( File file ) {
if ( file.exists() ) {
if ( file.isDirectory() ) {
for ( File child : file.listFiles() ) {
deleteFileOrDirectory( child );
}
}
file.delete();
}
}
}
========
Here's another output where nothing was added to the database (same sample
program), but counting the relationships via getAllRels first, then via
getAllNodes, and repeating this block:
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=*10,1*02,256,698 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=*29,9*02,912,485 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=687,562,153 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,377,601,567 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=*648*,269,229 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=*1,371*,030,811 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=756,425,427 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,256,402,192 ns
Shutting down database ...
======
And here's another ordering, with all the getAllRels runs first and then all
the getAllNodes runs:
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=*10,154*,915,686 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=*690*,298,744 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=649,536,192 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=1,978,228,693 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=653,655,768 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=647,365,027 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=*28,549*,060,316 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,355,109,837 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,442,695,434 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,438,563,566 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,366,895,645 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=*1,384*,237,380 ns
Shutting down database ...
Here is the program as used for this last test (it's the same one, just a bit
reordered and cleaned up):
package org.neo4j.examples;
import java.io.*;
import java.text.*;
import org.neo4j.graphdb.*;
import org.neo4j.graphdb.index.*;
import org.neo4j.kernel.*;
public class Copy_2_of_CalculateShortestPath {
private static final String DB_PATH =
"neo4j-shortest-path";
private static final String NAME_KEY = "name";
private static GraphDatabaseService graphDb;
private static Index<Node> indexService;
private static DecimalFormat commaDelimitedFormatter = new
DecimalFormat( "###,###" );
public static String number( double val ) {
return commaDelimitedFormatter.format( val );
}
public static void main( final String[] args ) {
// deleteFileOrDirectory( new File( DB_PATH ) );// XXX:
graphDb = new EmbeddedGraphDatabase( DB_PATH );
registerShutdownHook();
indexService = graphDb.index().forNodes( "nodes" );
Transaction rootTx;
Node one = getOrCreateNode( "one" );
DynamicRelationshipType moo = DynamicRelationshipType.withName(
"moo" );
int repeat = 5;
do {
System.out.println( "counting all Relationships ... via getAllRels" );
long count = 0;
long start = System.nanoTime();
Iterable<Relationship> allRels = graphDb.getAllRels();
for ( Relationship relationship : allRels ) {
count++;
}
long end = System.nanoTime();
System.out.println( "all relationships in the database = " +
number( count ) + " timedelta=" + number( end - start )
+ " ns" );
} while ( repeat-- > 0 );
repeat = 5;
do {
System.out.println( "counting all outgoing rels of all nodes ... via getAllNodes" );
long count = 0;
long start = System.nanoTime();
Iterable<Node> allNodes = graphDb.getAllNodes();
for ( Node node : allNodes ) {
Iterable<Relationship> allRels2 = node.getRelationships(
Direction.OUTGOING );
for ( Relationship relationship : allRels2 ) {
count++;
}
}
long end = System.nanoTime();
System.out.println( "all relationships in the database = " +
number( count ) + " timedelta=" + number( end - start )
+ " ns" );
} while ( repeat-- > 0 );
}
private static Node getOrCreateNode( String name ) {
Node node = indexService.get( NAME_KEY, name ).getSingle();
if ( node == null ) {
System.out.println( "creating new node with name=" + name );
node = graphDb.createNode();
node.setProperty( NAME_KEY, name );
indexService.add( node, NAME_KEY, name );
}
return node;
}
private static void registerShutdownHook() {
// Registers a shutdown hook for the Neo4j instance so that it
// shuts down nicely when the VM exits (even if you "Ctrl-C" the
// running example before it's completed)
Runtime.getRuntime().addShutdownHook( new Thread() {
@SuppressWarnings( "synthetic-access" )
@Override
public void run() {
System.out.println( "Shutting down database ..." );
graphDb.shutdown();
}
} );
}
private static void deleteFileOrDirectory( File file ) {
if ( file.exists() ) {
if ( file.isDirectory() ) {
for ( File child : file.listFiles() ) {
deleteFileOrDirectory( child );
}
}
file.delete();
}
}
}
All in all, it's a little better, but it won't make much difference for me at
the moment, not until I reach a very large number of relationships or do lots
of counts.
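Since the first run in every series above is far slower than the rest (presumably the cold cache), it may help to report it separately from the warm runs when comparing the two approaches. A minimal sketch of such a timing helper follows; the names `timeRuns` and `warmMedian` are made up, and the workload in main() is just a placeholder loop, not a database call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WarmupTimingSketch {
    // Runs the task `runs` times and returns each run's duration in nanoseconds.
    static long[] timeRuns(Runnable task, int runs) {
        long[] durations = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            durations[i] = System.nanoTime() - start;
        }
        return durations;
    }

    // Median of all runs after the first, so the cold run does not skew the
    // result. With an even number of warm runs this picks the upper middle one.
    static long warmMedian(long[] durations) {
        if (durations.length < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        long[] warm = Arrays.copyOfRange(durations, 1, durations.length);
        Arrays.sort(warm);
        return warm[warm.length / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload: summing a list stands in for counting relationships.
        final List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            data.add(i);
        }
        long[] durations = timeRuns(() -> {
            long sum = 0;
            for (int value : data) {
                sum += value;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        }, 6);
        System.out.println("first run   = " + durations[0] + " ns");
        System.out.println("warm median = " + warmMedian(durations) + " ns");
    }
}
```

Feeding it the six getAllRels timings from the last series would give a warm median that ignores the roughly 10-second first run.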
Well great, see you later :)
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=11,904,833,914 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=715,916,208 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=652,136,373 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=770,903,756 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=651,695,964 ns
counting all Relationships ... via getAllRels
all relationships in the database = 1,753,000 timedelta=658,016,686 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=28,067,437,117 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,347,472,686 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,282,134,781 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,309,284,532 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,358,393,266 ns
counting all outgoing rels of all nodes ... via getAllNodes
all relationships in the database = 1,753,000 timedelta=1,219,737,032 ns
Shutting down database ...
On Sat, Jul 23, 2011 at 10:51 AM, Michael Hunger <[email protected]> wrote:
> An internal implementation would be probably faster.
>
> If timing is that critical for you, you can have a look in
> EmbeddedGraphDbImpl.getAllNodes() and implement a similar solution for
> relationships.
>
> Cheers
>
> Michael
>
> On 23.07.2011 at 04:20, John cyuczieekc wrote:
>
> > Hey Jim,
> > I am sort of glad to hear that, maybe in the future I could see a method
> > like getAllRelationships(), or not, np :)
> > Yes, using Michael's code works, but ...
> > total relations count=100,011 timedelta=3,075,897,991 ns
> > it kind of takes 3 seconds (when not cached) to count 100k relationships
> > (considering there are 100k+2 unique nodes too)
> > when cached:
> > total relations count=100,011 timedelta=154,673,763 ns
> >
> > Still, it's pretty fast, but I have to wonder if it would be faster if using
> > relationships directly :)
> >
> > Either way, wish y'all a great day!
> >
> >
> > On Sat, Jul 23, 2011 at 3:57 AM, Jim Webber <[email protected]> wrote:
> >
> >> Hi John,
> >>
> >> Relationships are stored in a different store than nodes. This enables
> >> Neo4j to manage lifecycle events (like caching) for nodes and relationships
> >> separately.
> >>
> >> Neo4j really is a graph DB, not a triple store masquerading as a graph DB.
> >>
> >> Nonetheless, that code Michael sent still works :-)
> >>
> >> Jim
> >> _______________________________________________
> >> Neo4j mailing list
> >> [email protected]
> >> https://lists.neo4j.org/mailman/listinfo/user
> >>
>
>