[jira] [Commented] (GIRAPH-191) Random Walk with Restart

2012-05-18 Thread Paolo Castagna (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278712#comment-13278712
 ] 

Paolo Castagna commented on GIRAPH-191:
---

It would be good if we could also use this or provide a PageRank implementation 
which deals with dangling nodes/vertexes properly.

Dangling vertexes are vertexes with no outgoing edges.

SimplePageRankVertex has:

{code}
if (getSuperstep() < MAX_SUPERSTEPS) {
  long edges = getNumOutEdges();
  sendMsgToAllEdges(
  new DoubleWritable(getVertexValue().get() / edges));
} else {
  voteToHalt();
}
{code}

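This does not work when getNumOutEdges() returns 0: the score is divided by 
zero and, since there are no edges to send it along, it is never propagated.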

Some suggest dividing the PageRank scores of dangling vertexes evenly among 
all other vertexes (it's yet another sort of random jump to propagate PageRank 
scores to all nodes). This can be implemented in Giraph as a separate superstep 
using a SumAggregator.
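
The redistribution idea can be sketched in plain Java, outside Giraph. This is 
only an illustration under stated assumptions: the class and method names are 
hypothetical, the local variable danglingSum merely stands in for the sum 
aggregator, no Giraph API is used, and the dangling mass is spread over all 
vertexes (including the dangling ones themselves), which is one common variant.

```java
import java.util.Arrays;

/** Plain-Java sketch (no Giraph API): PageRank with dangling mass spread evenly. */
final class DanglingPageRank {

  /**
   * @param outEdges outEdges[v] lists the targets of v's outgoing edges
   * @param damping  damping factor, e.g. 0.85
   * @param steps    number of iterations (stands in for supersteps)
   */
  static double[] pageRank(int[][] outEdges, double damping, int steps) {
    int n = outEdges.length;
    double[] rank = new double[n];
    Arrays.fill(rank, 1.0 / n);

    for (int step = 0; step < steps; step++) {
      double[] next = new double[n];
      double danglingSum = 0.0; // plays the role of the sum aggregator

      for (int v = 0; v < n; v++) {
        if (outEdges[v].length == 0) {
          // Dangling vertex: collect its score instead of dividing by zero.
          danglingSum += rank[v];
        } else {
          double share = rank[v] / outEdges[v].length;
          for (int target : outEdges[v]) {
            next[target] += share;
          }
        }
      }

      // Each vertex receives the random-jump term, its incoming shares,
      // and an equal share of the collected dangling mass.
      for (int v = 0; v < n; v++) {
        next[v] = (1.0 - damping) / n + damping * (next[v] + danglingSum / n);
      }
      rank = next;
    }
    return rank;
  }

  public static void main(String[] args) {
    // 0 -> 1 -> 2, where vertex 2 is dangling.
    int[][] graph = {{1}, {2}, {}};
    System.out.println(Arrays.toString(pageRank(graph, 0.85, 50)));
  }
}
```

With this handling the scores keep summing to 1 after every iteration, which 
is not the case when the score of a dangling vertex is simply dropped.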

Discussion on the giraph-user mailing list with further comments and references 
is here:

 - http://mail-archives.apache.org/mod_mbox/incubator-giraph-user/201205.mbox/%3c4fb509f4.4040...@googlemail.com%3E

 Random Walk with Restart
 

 Key: GIRAPH-191
 URL: https://issues.apache.org/jira/browse/GIRAPH-191
 Project: Giraph
  Issue Type: New Feature
Reporter: Gianmarco De Francisci Morales
 Attachments: GIRAPH-191.patch


 Implementing RWR on Giraph should be a very simple modification of the 
 SimplePageRankVertex code.
 {code}
 // Only the source vertex receives the restart probability.
 DoubleWritable vertexValue;
 if (myID == sourceID) {
   vertexValue = new DoubleWritable(0.15f + 0.85f * sum);
 } else {
   vertexValue = new DoubleWritable(0.85f * sum);
 }
 {code}
 It would be nice to make it as configurable as possible by using parametric 
 damping factors, preference vectors, strongly preferential, etc...
 More or less along these lines:
 http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html
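
For illustration, the update rule quoted above can be sketched as a 
self-contained plain-Java program with a configurable restart probability. 
This is not the attached patch and uses no Giraph API; the class name, the 
method name and the choice to send the score of a dangling vertex back to the 
source are assumptions made for this sketch.

```java
import java.util.Arrays;

/** Plain-Java sketch (no Giraph API) of Random Walk with Restart. */
final class RandomWalkWithRestart {

  /**
   * @param outEdges outEdges[v] lists the targets of v's outgoing edges
   * @param sourceId vertex the walk restarts from
   * @param restart  restart probability, e.g. 0.15
   * @param steps    number of iterations (stands in for supersteps)
   */
  static double[] rwr(int[][] outEdges, int sourceId, double restart, int steps) {
    int n = outEdges.length;
    double[] score = new double[n];
    score[sourceId] = 1.0; // the walk starts at the source

    for (int step = 0; step < steps; step++) {
      double[] sum = new double[n];
      for (int v = 0; v < n; v++) {
        if (outEdges[v].length == 0) {
          // Assumption: a walker stuck on a dangling vertex restarts.
          sum[sourceId] += score[v];
        } else {
          double share = score[v] / outEdges[v].length;
          for (int target : outEdges[v]) {
            sum[target] += share;
          }
        }
      }
      double[] next = new double[n];
      for (int v = 0; v < n; v++) {
        // Same rule as the snippet above: only the source gets the restart term.
        next[v] = (1.0 - restart) * sum[v] + (v == sourceId ? restart : 0.0);
      }
      score = next;
    }
    return score;
  }

  public static void main(String[] args) {
    // Directed cycle 0 -> 1 -> 2 -> 0, restarting at vertex 0.
    int[][] graph = {{1}, {2}, {0}};
    System.out.println(Arrays.toString(rwr(graph, 0, 0.15, 50)));
  }
}
```

Replacing the single source with a distribution over vertexes would turn the 
restart term into a preference vector.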

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira





[jira] [Commented] (GIRAPH-146) Maven is running the tests twice during builds

2012-04-30 Thread Paolo Castagna (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264925#comment-13264925
 ] 

Paolo Castagna commented on GIRAPH-146:
---

Giraph is using cobertura-maven-plugin which needs to instrument the byte code 
before running the unit tests for the second time. I've never managed to avoid 
this and some also argue that it would be wrong to do so, for example see:

 - http://stackoverflow.com/questions/8485559/managing-report-plugins-in-maven-site
 - http://stackoverflow.com/questions/4521564/hudson-and-maven-tests-run-twice
 - http://stackoverflow.com/questions/3421582/how-to-avoid-double-compilation-and-testing-with-coberturacheck

One person suggested running <code>mvn clean install 
-Dmaven.test.skip=true</code> first and then <code>mvn cobertura:check</code>.

 Maven is running the tests twice during builds
 --

 Key: GIRAPH-146
 URL: https://issues.apache.org/jira/browse/GIRAPH-146
 Project: Giraph
  Issue Type: Bug
  Components: build
Reporter: Jakob Homan

 I had a feeling the build time had jumped significantly... 
