Paul, Marko, could you test whether switching to new Random(0) would be a good change? I am not really familiar with that algorithm, so I think you could do a much better job there, given your expertise!
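For reference, here is a minimal sketch of what such a test could look like. It is not the Neo4j implementation: the class name, the 4-node adjacency matrix, and the plain power iteration are all assumptions made for illustration. It only shows that a start vector drawn from java.util.Random with a fixed seed gives bit-identical results from run to run, while a different seed gives a different start vector.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical stand-in for an eigenvector centrality routine; not Neo4j code.
class SeededPowerIteration {

    // Plain power iteration. The start vector is drawn from the supplied
    // Random, so the seed fully determines the result.
    static double[] centrality(double[][] adj, Random rnd, int iterations) {
        int n = adj.length;
        double[] v = new double[n];
        for (int i = 0; i < n; i++) {
            v[i] = rnd.nextDouble();
        }
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    next[i] += adj[i][j] * v[j];
                }
            }
            // Normalize to unit Euclidean length after every step.
            double norm = 0.0;
            for (double x : next) {
                norm += x * x;
            }
            norm = Math.sqrt(norm);
            for (int i = 0; i < n; i++) {
                v[i] = next[i] / norm;
            }
        }
        return v;
    }

    public static void main(String[] args) {
        // Made-up undirected graph: a triangle (0, 1, 2) plus node 3 attached to node 2.
        double[][] adj = {
            {0, 1, 1, 0},
            {1, 0, 1, 0},
            {1, 1, 0, 1},
            {0, 0, 1, 0}
        };
        double[] a = centrality(adj, new Random(0), 5);
        double[] b = centrality(adj, new Random(0), 5);
        double[] c = centrality(adj, new Random(42), 5);
        System.out.println("seed 0, run 1: " + Arrays.toString(a));
        System.out.println("seed 0, run 2: " + Arrays.toString(b));
        System.out.println("seed 42:       " + Arrays.toString(c));
        System.out.println("same seed, identical arrays: " + Arrays.equals(a, b));
    }
}
```

A fixed seed makes runs repeatable, but it does not make the answer more accurate. Whether the result at a given precision is close to the true eigenvector still has to be checked separately.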
Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Wed, Nov 10, 2010 at 12:19 AM, Paul A. Jackson <paul.jack...@pb.com> wrote:
> Perhaps if "new Random( System.currentTimeMillis() )" were replaced with "new
> Random( 0 )", you would get the benefits of pseudo random behavior but also
> deterministic results from run to run.
>
> -Paul
>
> -----Original Message-----
> From: Paul A. Jackson
> Sent: Tuesday, November 09, 2010 6:16 PM
> To: 'Neo4j user discussions'
> Subject: RE: [Neo4j] Eigenvector Centrality subclasses
>
> I'm using:
> import org.neo4j.graphalgo.impl.centrality.EigenvectorCentrality;
> import org.neo4j.graphalgo.impl.centrality.EigenvectorCentralityArnoldi;
> import org.neo4j.graphalgo.impl.centrality.EigenvectorCentralityPower;
>
> The variance I am seeing is far greater than anything that could be explained
> by floating point precision issues. For example, a result coming back after
> one call as 0.045 and then on the next call with identical options it could
> return 0.038.
>
> I glanced over the code and I see that they both use java.util.Random, so
> that could explain why it is not deterministic. Maybe that answers
> everything.
>
> Unfortunately, what it means is that you might randomly have two subsequent
> calls that appear to return similar results, but actually you have not zeroed
> in on the correct answer within the actual level of precision that is desired.
>
> The JavaDoc explicitly states that precision doesn't mean proximity to the
> correct result, but it doesn't make the results less unsatisfying.
>
> -Paul
>
> -----Original Message-----
> From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On
> Behalf Of Marko Rodriguez
> Sent: Tuesday, November 09, 2010 6:06 PM
> To: Neo4j user discussions
> Subject: Re: [Neo4j] Eigenvector Centrality subclasses
>
> Hey Paul,
>
>> I get inconsistent results from run to run using eigenvector centrality. It
>> doesn't seem to matter which implementation I use but I have used Arnoldi
>> most, for no reason other than it returns the iteration count.
>
> Given that eigenvector components sum to 1, and when dealing with large
> graphs, you may be running into floating point precision issues. In general,
> different eigenvector methods may have small variations in their values (even
> though it's the same eigenvector!), but, if you are getting Spearman rank
> order correlation ~1.0, then I think it's 'all good.' Also, note that for
> those eigenvector centrality implementations that are based on random walk,
> variations are sure to show up.
>
>> The iteration count is not consistent from run to run when run against the
>> exact same graph using the exact same precision. In a graph with 32 nodes
>> and 117 edges, I get anywhere from 18 to 24 iterations needed to get a
>> precision of 0.001. The variance is easier to see when the test is run on
>> different computers.
>
> Hmm... What code are you using? I'm talking in general and not specifically
> about anything Neo4j related...
>
> Thanks,
> Marko.
>
> http://markorodriguez.com
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user

_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
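
Marko's Spearman rank order check, quoted above, can be sketched roughly as follows. This is an illustration only: the class name and the two score arrays are made up, and the correlation is computed as the Pearson correlation of average ranks, which is one common way to handle ties. A value near 1.0 means two runs rank the nodes in the same order even when the raw scores differ.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper for comparing the node rankings of two centrality runs.
class SpearmanCheck {

    // 1-based ranks; tied values share the average of their sorted positions.
    static double[] ranks(double[] values) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(idx -> values[idx]));
        double[] rank = new double[n];
        int start = 0;
        while (start < n) {
            int end = start;
            while (end + 1 < n && values[order[end + 1]] == values[order[start]]) {
                end++;
            }
            double avg = (start + end) / 2.0 + 1.0;
            for (int k = start; k <= end; k++) {
                rank[order[k]] = avg;
            }
            start = end + 1;
        }
        return rank;
    }

    // Spearman correlation as the Pearson correlation of the two rank vectors.
    // Assumes equal lengths and that neither input is constant.
    static double spearman(double[] x, double[] y) {
        double[] rx = ranks(x);
        double[] ry = ranks(y);
        int n = rx.length;
        double mx = 0.0, my = 0.0;
        for (int i = 0; i < n; i++) {
            mx += rx[i];
            my += ry[i];
        }
        mx /= n;
        my /= n;
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = rx[i] - mx;
            double dy = ry[i] - my;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Made-up scores for four nodes from two runs; values differ, order does not.
        double[] runA = {0.045, 0.120, 0.300, 0.210};
        double[] runB = {0.038, 0.125, 0.310, 0.200};
        System.out.println("Spearman rho = " + spearman(runA, runB));
    }
}
```

As Paul notes in the thread, agreement in rank order between two runs does not by itself show that either run is close to the true eigenvector.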