Actually, I already tried "mvn install package" without skipping the tests, but then it could not pass the tests while building core, so the build could not continue!
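(One guess on my part, not verified against the Hama build: the missing org.apache.hama:hama-core:test-jar:tests artifact could be a side effect of -Dmaven.test.skip=true, which skips compiling the test classes altogether, so the hama-core test-jar is never produced or installed. -DskipTests still compiles and packages the tests but skips running them, so something like "mvn clean install -DskipTests" from the project root might get hama-core and its test-jar into the local repository before hama-graph is built. Also, "mvn install package" is redundant, since the install phase already runs package.)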
2014-02-28 11:54 GMT+08:00 Edward J. Yoon <[email protected]>:

> Please try to build without the -Dmaven.test.skip=true option.
>
> > Problem 2:
> > Compared to 0.6.0, you use VerticesInfo to store the vertices, and in
> > DiskVerticesInfo every vertex is serialized to and deserialized from the
> > local file system.
> > Is this used to provide fault tolerance (the checkpoint part)? Or is it
> > designed for other purposes?
>
> It was part of an effort to reduce memory usage [1]. In the TRUNK version,
> we've optimized memory consumption by serializing vertex objects in memory,
> without a big degradation of performance. As I mentioned before, the
> vertices don't occupy much memory.
>
> > If it is designed for the checkpoint part of fault tolerance, why does it
> > write to local disk and not to HDFS?
> > In my mind, if a machine crashes and the fault tolerance mechanism depends
> > on a manual reboot or repair of the crashed machine, the potentially
> > lengthy recovery time is intolerable.
> > Do you agree with me? Or maybe you have another trade-off?
>
> The user will be able to set the checkpointing interval. Then the content of
> the memory buffers only needs to be written to HDFS when a checkpoint occurs.
>
> 1. https://issues.apache.org/jira/browse/HAMA-704?focusedCommentId=13580454&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13580454
>
> --
> Edward J. Yoon (@eddieyoon)
> Chief Executive Officer
> DataSayer, Inc.
>
> On Fri, Feb 28, 2014 at 11:55 AM, developer wang <[email protected]> wrote:
>
> Thank you very much for the previous useful reply, but I have run into other
> problems.
>
> Problem 1:
> I tried the trunk (git commit 1b3f1744a33a29686c2eafe7764bb3640938fcc8), but
> it cannot get through the build (I used this command: mvn install package
> -Dmaven.test.skip=true). It complained:
>
>   [INFO] ------------------------------------------------------------------------
>   [INFO] Building graph 0.7.0-SNAPSHOT
>   [INFO] ------------------------------------------------------------------------
>   [INFO] Reactor Summary:
>   [INFO]
>   [INFO] Apache Hama parent POM ............................ SUCCESS [2.960s]
>   [INFO] pipes ............................................. SUCCESS [10.033s]
>   [INFO] commons ........................................... SUCCESS [6.664s]
>   [INFO] core .............................................. SUCCESS [23.909s]
>   [INFO] graph ............................................. FAILURE [0.048s]
>   [INFO] machine learning .................................. SKIPPED
>   [INFO] examples .......................................... SKIPPED
>   [INFO] hama-dist ......................................... SKIPPED
>   [INFO] ------------------------------------------------------------------------
>   [INFO] BUILD FAILURE
>   [INFO] ------------------------------------------------------------------------
>   [INFO] Total time: 44.102s
>   [INFO] Finished at: Fri Feb 28 10:06:23 HKT 2014
>   [INFO] Final Memory: 50M/384M
>   [INFO] ------------------------------------------------------------------------
>   [ERROR] Failed to execute goal on project hama-graph: Could not resolve
>   dependencies for project org.apache.hama:hama-graph:jar:0.7.0-SNAPSHOT:
>   Failure to find org.apache.hama:hama-core:jar:tests:0.7.0-SNAPSHOT in
>   https://repository.cloudera.com/artifactory/cloudera-repos was cached in
>   the local repository, resolution will not be reattempted until the update
>   interval of cloudera-repo has elapsed or updates are forced -> [Help 1]
>
> Does this mean you forgot to upload some jars to the remote Maven repository?
> (I can compile 0.6.3 with the above command.)
>
> I searched for this problem on the Internet; some say I should run Maven with
> -U. So I tried "mvn -U compile", but it still fails with almost the same
> error:
>
>   [ERROR] Failed to execute goal
>   org.apache.maven.plugins:maven-remote-resources-plugin:1.1:process (default)
>   on project hama-graph: Failed to resolve dependencies for one or more
>   projects in the reactor. Reason: Missing:
>   [ERROR] ----------
>   [ERROR] 1) org.apache.hama:hama-core:test-jar:tests:0.7.0-SNAPSHOT
>   [ERROR]
>   [ERROR] Try downloading the file manually from the project website.
>   [ERROR]
>   [ERROR] Then, install it using the command:
>   [ERROR]   mvn install:install-file -DgroupId=org.apache.hama
>   [ERROR]     -DartifactId=hama-core -Dversion=0.7.0-SNAPSHOT -Dclassifier=tests
>   [ERROR]     -Dpackaging=test-jar -Dfile=/path/to/file
>   [ERROR]
>   [ERROR] Alternatively, if you host your own repository you can deploy the
>   [ERROR] file there:
>   [ERROR]   mvn deploy:deploy-file -DgroupId=org.apache.hama
>   [ERROR]     -DartifactId=hama-core -Dversion=0.7.0-SNAPSHOT -Dclassifier=tests
>   [ERROR]     -Dpackaging=test-jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
>   [ERROR]
>   [ERROR] Path to dependency:
>   [ERROR]   1) org.apache.hama:hama-graph:jar:0.7.0-SNAPSHOT
>   [ERROR]   2) org.apache.hama:hama-core:test-jar:tests:0.7.0-SNAPSHOT
>   [ERROR]
>   [ERROR] ----------
>   [ERROR] 1 required artifact is missing.
>
> Problem 2:
> Compared to 0.6.0, you use VerticesInfo to store the vertices, and in
> DiskVerticesInfo every vertex is serialized to and deserialized from the
> local file system.
> Is this used to provide fault tolerance (the checkpoint part)? Or is it
> designed for other purposes?
>
> If it is designed for the checkpoint part of fault tolerance, why does it
> write to local disk and not to HDFS?
> In my mind, if a machine crashes and the fault tolerance mechanism depends on
> a manual reboot or repair of the crashed machine, the potentially lengthy
> recovery time is intolerable.
> Do you agree with me? Or maybe you have another trade-off?
>
> 2014-02-28 8:07 GMT+08:00 Edward J. Yoon <[email protected]>:
>
> In 0.6.3, you can use only ListVerticesInfo. Please use the TRUNK if you
> want.
>
> And, the vertices don't occupy much memory. Please use ListVerticesInfo.
> FT is not supported yet.
>
> On Thu, Feb 27, 2014 at 10:06 PM, developer wang <[email protected]> wrote:
>
> Hi, all.
> Thank you for your detailed reply.
>
> In the previous test, I used an incomplete graph to run PageRank, and I got
> this error:
>
>   java.lang.IllegalArgumentException: Messages must never be behind the
>   vertex in ID! Current Message ID:
>
> With your detailed reply, I learned it was because some vertices tried to
> send messages to dangling nodes (actually 0.6.0 could handle this by adding a
> repair phase). So I fixed it by adding the dangling nodes explicitly, each as
> a line that contains only a vertex id.
>
> After this, I could run the PageRank (in the attachment) with
> ListVerticesInfo.
>
> But if I use DiskVerticesInfo instead of ListVerticesInfo, like this:
>
>   pageJob.set("hama.graph.vertices.info",
>       "org.apache.hama.graph.DiskVerticesInfo");
>
> I still get the error below:
>
>   java.lang.IllegalArgumentException: Messages must never be behind the
>   vertex in ID! Current Message ID:
>
> What is the problem? Am I using DiskVerticesInfo correctly?
>
> And if I want to run my application with fault tolerance, what should I do?
>
> Thank you very much.
>
> 2014-02-26 18:27 GMT+08:00 Edward J. Yoon <[email protected]>:
>
> > Could you answer this question:
> > I found that during loading, peers do not exchange vertices with each other
> > as Hama 0.6.0 did.
> > So how does Hama 0.6.3 solve the following problem: a peer loads a vertex
> > which belongs to another peer? (For example, suppose 3 peers for this task
> > and the partitioner is Hash; peer #1 loads vertex 2, but in 0.6.3 peer #1
> > did not send vertex 2 to peer #2.)
>
> Instead of network communication, 0.6.3 uses file communication for input
> data partitioning. Please see
> http://svn.apache.org/repos/asf/hama/trunk/core/src/main/java/org/apache/hama/bsp/PartitioningRunner.java
>
> On Wed, Feb 26, 2014 at 6:03 PM, developer wang <[email protected]> wrote:
>
> Actually, I commented out the set statement:
>
>   //pageJob.set("hama.graph.self.ref", "true");
>
> and in GraphJobRunner:
>
>   final boolean selfReference =
>       conf.getBoolean("hama.graph.self.ref", false);
>
> I will explicitly set hama.graph.self.ref to false and try again with a
> complete graph.
>
> Could you answer this question:
> I found that during loading, peers do not exchange vertices with each other
> as Hama 0.6.0 did.
> So how does Hama 0.6.3 solve the following problem: a peer loads a vertex
> which belongs to another peer? (For example, suppose 3 peers for this task
> and the partitioner is Hash; peer #1 loads vertex 2, but in 0.6.3 peer #1 did
> not send vertex 2 to peer #2.)
>
> Or do I have some misunderstanding about Hama 0.6.3 or above? (In the last
> few years, I used 0.6.0 for my daily jobs.)
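As an aside on the Hash-partitioner example above: the following is a minimal, illustrative sketch of how a hash partitioner typically maps a vertex ID onto one of the peers. It shows the general idea only and is not Hama's actual HashPartitioner code; the class and method names are made up for illustration.

  // Illustrative sketch only -- not Hama's actual HashPartitioner.
  public final class HashPartitionSketch {

    // Map a vertex ID to one of numPeers partitions by hashing it.
    static int partitionFor(String vertexId, int numPeers) {
      // Mask the sign bit so the result is always non-negative.
      return (vertexId.hashCode() & Integer.MAX_VALUE) % numPeers;
    }

    public static void main(String[] args) {
      // With 3 peers, the peer that happens to read vertex "2" from its input
      // split is not necessarily the peer this function assigns it to, which
      // is why a partitioning step (network messages in 0.6.0, partition files
      // written by PartitioningRunner in 0.6.3) has to move each vertex to its
      // owner before the supersteps start.
      System.out.println("vertex 2 is owned by peer #" + partitionFor("2", 3));
    }
  }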
Yoon <[email protected]>: > >> >> > > >> >> >> The background is described here: > >> >> >> https://issues.apache.org/jira/browse/HAMA-758 > >> >> >> > >> >> >> On Wed, Feb 26, 2014 at 5:38 PM, Edward J. Yoon > >> >> >> <[email protected]> > >> >> >> wrote: > >> >> >> > Oh, please try after set "hama.check.missing.vertex" to false in > >> >> >> > job > >> >> >> > configuration. > >> >> >> > > >> >> >> > On Wed, Feb 26, 2014 at 5:14 PM, developer wang > >> >> >> > <[email protected]> > >> >> >> > wrote: > >> >> >> >> Thank you very much. > >> >> >> >> > >> >> >> >> Since I think the framework should not decide whether the graph > >> >> >> >> should > >> >> >> >> self-reference, so I disable this config. (Actually when I used > >> >> >> >> 0.6.0, > >> >> >> >> I > >> >> >> >> also disabled this config) > >> >> >> >> > >> >> >> >> Since I use my PC to test whether my application works, I use a > >> >> >> >> small > >> >> >> >> graph. > >> >> >> >> (It does have a lot of dangling node) > >> >> >> >> > >> >> >> >> The dataset and the PageRank is attached. > >> >> >> >> > >> >> >> >> Thank you very much. > >> >> >> >> > >> >> >> >> > >> >> >> >> 2014-02-26 16:04 GMT+08:00 Edward J. Yoon > >> >> >> >> <[email protected]>: > >> >> >> >> > >> >> >> >>> Hi Wang, > >> >> >> >>> > >> >> >> >>> Can you send me your input data so that I can debug? > >> >> >> >>> > >> >> >> >>> On Wed, Feb 26, 2014 at 4:55 PM, developer wang > >> >> >> >>> <[email protected]> > >> >> >> >>> wrote: > >> >> >> >>> > Firstly, thank you very much for reply. > >> >> >> >>> > > >> >> >> >>> > But in the log, I found "14/02/25 16:45:00 INFO > >> >> >> >>> > graph.GraphJobRunner: > >> >> >> >>> > 2918 > >> >> >> >>> > vertices are loaded into localhost:60340 " > >> >> >> >>> > So it had finished the loading phase. is this true? > >> >> >> >>> > > >> >> >> >>> > Another problem is that: > >> >> >> >>> > I found during the loading, peers would not exchange > vertices > >> >> >> >>> > with > >> >> >> >>> > each > >> >> >> >>> > other as hama 0.6.0 did. > >> >> >> >>> > So how does hama 0.6.3 solve the problem below: a peer load > a > >> >> >> >>> > vertex > >> >> >> >>> > which > >> >> >> >>> > is belong to another peer? (for example, suppose 3 peers for > >> >> >> >>> > this > >> >> >> >>> > task > >> >> >> >>> > and > >> >> >> >>> > the partitoner is Hash, peer #1 loads vertex 2, in 0.6.3, > peer > >> >> >> >>> > #2 > >> >> >> >>> > did > >> >> >> >>> > not > >> >> >> >>> > send vertex 2 to peer #2) > >> >> >> >>> > > >> >> >> >>> > > >> >> >> >>> > 2014-02-26 15:46 GMT+08:00 Edward J. Yoon > >> >> >> >>> > <[email protected]>: > >> >> >> >>> > > >> >> >> >>> >> > I tried PageRank with a small input of my own. > >> >> >> >>> >> > >> >> >> >>> >> Hi Wang, > >> >> >> >>> >> > >> >> >> >>> >> This error often occurs when there is a record conversion > >> >> >> >>> >> error. > >> >> >> >>> >> So, > >> >> >> >>> >> you should check whether the vertex reader works correctly. > >> >> >> >>> >> > >> >> >> >>> >> And, I highly recommend you to use latest TRUNK version[1] > as > >> >> >> >>> >> possible. > >> >> >> >>> >> > >> >> >> >>> >> 1. > >> >> >> >>> >> > >> >> >> >>> >> > >> >> >> >>> >> > >> >> >> >>> >> > >> >> >> >>> >> > http://wiki.apache.org/hama/GettingStarted#Build_latest_version_from_source > >> >> >> >>> >> > >> >> >> >>> >> Thank you. > >> >> >> >>> >> > >> >> >> >>> >> On Wed, Feb 26, 2014 at 1:44 PM, developer wang > >> >> >> >>> >> <[email protected]> > >> >> >> >>> >> wrote: > >> >> >> >>> >> > Hi, all. 
> On Wed, Feb 26, 2014 at 1:44 PM, developer wang <[email protected]> wrote:
>
> Hi, all.
> I am Peng Wang, a student trying to use and learn Hama.
>
> I cloned the Hama development git repository.
>
> I first tried the newest version among the tags, tag 0.7.0-SNAPSHOT:
>
>   commit bef419747695d15de8a1087f44028ee40571b5f9
>   Author: Edward J. Yoon <[email protected]>
>   Date:   Fri Mar 29 00:44:59 2013 +0000
>
>       [maven-release-plugin] copy for tag 0.7.0-SNAPSHOT
>
>       git-svn-id: https://svn.apache.org/repos/asf/hama/tags/0.7.0-SNAPSHOT@1462366 13f79535-47bb-0310-9956-ffa450edef68
>
> compared with the tag 0.6.3-RC3:
>
>   commit c9526b1272c83d641332667ce5d81d7ccc94be06
>   Author: Edward J. Yoon <[email protected]>
>   Date:   Sun Oct 6 08:27:00 2013 +0000
>
>       [maven-release-plugin] copy for tag 0.6.3-RC3
>
>       git-svn-id: https://svn.apache.org/repos/asf/hama/tags/0.6.3-RC3@1529594 13f79535-47bb-0310-9956-ffa450edef68
>
> From the commit log, 0.7.0-SNAPSHOT is earlier than 0.6.3-RC3, so I used
> 0.6.3-RC3 instead of 0.7.0-SNAPSHOT (although on the Hama website,
> 0.7.0-SNAPSHOT is listed as the newest version).
>
> Then I deployed Hama in pseudo-distributed mode on my desktop with 3 task
> runners and tried PageRank with a small input of my own.
> But it fails, and its log is:
>
>   java.lang.IllegalArgumentException: Messages must never be behind the vertex in ID! Current Message ID: 100128 vs. 1004
>       at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:306)
>       at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:254)
>       at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:145)
>       at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:177)
>       at org.apache.hama.bsp.BSPTask.run(BSPTask.java:146)
>       at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1246)
>
> Could you tell me what is the problem in my situation?
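A small aside on the fix mentioned earlier in the thread (listing dangling vertices explicitly): with the IDs from the error above, the input would gain a line for vertex 100128 that carries no outgoing edges, roughly like the sketch below. The pairing of these particular IDs and the tab-separated adjacency-list layout are assumptions for illustration; the exact format depends on your own vertex reader.

  1004      100128    (an ordinary vertex line: vertex 1004 with an edge to 100128)
  100128              (the added line: dangling vertex 100128 with no outgoing edges)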
> >> >> >> >>> >> > > >> >> >> >>> >> > I check whether hama had finished the loading phase, > and > >> >> >> >>> >> > I > >> >> >> >>> >> > found > >> >> >> >>> >> > "14/02/25 16:45:00 INFO graph.GraphJobRunner: 2918 > vertices > >> >> >> >>> >> > are > >> >> >> >>> >> > loaded > >> >> >> >>> >> > into > >> >> >> >>> >> > localhost:60340 "in the log. > >> >> >> >>> >> > So it had finished the loading phase. > >> >> >> >>> >> > > >> >> >> >>> >> > After this, I read the source code, and I found during > >> >> >> >>> >> > the > >> >> >> >>> >> > loading, > >> >> >> >>> >> > peers would not exchange vertices with each other as hama > >> >> >> >>> >> > 0.5.0 > >> >> >> >>> >> > did. > >> >> >> >>> >> > So how does hama 0.6.3 solve the problem below: a peer > >> >> >> >>> >> > load > >> >> >> >>> >> > a > >> >> >> >>> >> > vertex > >> >> >> >>> >> > which is belong to another peer? > >> >> >> >>> >> > > >> >> >> >>> >> > Could you tell which branch or tag is a stable > version? > >> >> >> >>> >> > And does it support fault tolerance for graph > >> >> >> >>> >> > algorithms? > >> >> >> >>> >> > and > >> >> >> >>> >> > how > >> >> >> >>> >> > can I > >> >> >> >>> >> > get it? > >> >> >> >>> >> > > >> >> >> >>> >> > > >> >> >> >>> >> > > >> >> >> >>> >> > >> >> >> >>> >> > >> >> >> >>> >> > >> >> >> >>> >> -- > >> >> >> >>> >> Edward J. Yoon (@eddieyoon) > >> >> >> >>> >> Chief Executive Officer > >> >> >> >>> >> DataSayer, Inc. > >> >> >> >>> > > >> >> >> >>> > > >> >> >> >>> > >> >> >> >>> > >> >> >> >>> > >> >> >> >>> -- > >> >> >> >>> Edward J. Yoon (@eddieyoon) > >> >> >> >>> Chief Executive Officer > >> >> >> >>> DataSayer, Inc. > >> >> >> >> > >> >> >> >> > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > -- > >> >> >> > Edward J. Yoon (@eddieyoon) > >> >> >> > Chief Executive Officer > >> >> >> > DataSayer, Inc. > >> >> >> > >> >> >> > >> >> >> > >> >> >> -- > >> >> >> Edward J. Yoon (@eddieyoon) > >> >> >> Chief Executive Officer > >> >> >> DataSayer, Inc. > >> >> > > >> >> > > >> >> > >> >> > >> >> > >> >> -- > >> >> Edward J. Yoon (@eddieyoon) > >> >> Chief Executive Officer > >> >> DataSayer, Inc. > >> > > >> > > >> > >> > >> > >> -- > >> Edward J. Yoon (@eddieyoon) > >> Chief Executive Officer > >> DataSayer, Inc. > > > > > > > > -- > Edward J. Yoon (@eddieyoon) > Chief Executive Officer > DataSayer, Inc. >
