Avery Ching wrote:
> It shouldn't be, your code looks very similar to the unittests (i.e.
> TestManualCheckpoint.java). So, you're trying to run your test with the
> local hadoop (similar to the unittests)? Or are you using an actual
> hadoop setup?

Hi Avery,
while I am learning and writing the first examples, I am trying to run with a local hadoop (similar to the unit tests). This way, I can easily run and debug the code from the IDE.

Tomorrow, I'll look at the unit tests again trying to see if I can spot what I am doing wrong.

Thanks,
Paolo

>
> Avery
>
> On 4/10/12 11:41 PM, Paolo Castagna wrote:
>> I am using hadoop-core-1.0.1.jar ... could that be a problem?
>>
>> Paolo
>>
>> Paolo Castagna wrote:
>>> Hi Avery,
>>> nope, no luck.
>>>
>>> I have changed all my log.debug(...) into log.info(...). Same behavior.
>>>
>>> I have a log4j.properties [1] file in my classpath and it has:
>>> log4j.logger.org.apache.jena.grande=DEBUG
>>> log4j.logger.org.apache.jena.grande.giraph=DEBUG
>>> I also tried to change that to:
>>> log4j.logger.org.apache.jena.grande=INFO
>>> log4j.logger.org.apache.jena.grande.giraph=INFO
>>> No luck.
>>>
>>> My Giraph job has:
>>> GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>> job.setVertexClass(getClass());
>>> job.setVertexInputFormatClass(TurtleVertexInputFormat.class);
>>> job.setVertexOutputFormatClass(TurtleVertexOutputFormat.class);
>>>
>>> But, if I run in debug with a breakpoint in the TurtleVertexInputFormat.class
>>> constructor, it is never instantiated. How can it be?
>>>
>>> So perhaps the problem is not the logging, it is the fact that
>>> my GiraphJob is not using TurtleVertexInputFormat.class and
>>> TurtleVertexOutputFormat.class, but I don't see what I am doing
>>> wrong. :-/
>>>
>>> Thanks,
>>> Paolo
>>>
>>> [1] https://github.com/castagna/jena-grande/blob/master/src/test/resources/log4j.properties
>>>
>>> Avery Ching wrote:
>>>> I think the issue might be that Hadoop only logs INFO and above messages
>>>> by default. Can you retry with INFO level logging?
>>>>
>>>> Avery
>>>>
>>>> On 4/10/12 12:17 PM, Paolo Castagna wrote:
>>>>> Hi,
>>>>> I am still learning Giraph, so, please, be patient with me and forgive my
>>>>> trivial questions.
>>>>>
>>>>> As a simple initial use case, I want to compute the shortest paths from a single
>>>>> source in a social graph in RDF format using the FOAF [1] vocabulary.
>>>>> This example also will hopefully inform GIRAPH-170 [2] and related issues, such
>>>>> as: GIRAPH-141 [3].
>>>>>
>>>>> Here is an example in Turtle [4] format of a tiny graph using FOAF:
>>>>> ----
>>>>> @prefix : <http://example.org/> .
>>>>> @prefix foaf: <http://xmlns.com/foaf/0.1/> .
>>>>>
>>>>> :alice
>>>>>     a foaf:Person ;
>>>>>     foaf:name "Alice" ;
>>>>>     foaf:mbox <mailto:al...@example.org> ;
>>>>>     foaf:knows :bob ;
>>>>>     foaf:knows :charlie ;
>>>>>     foaf:knows :snoopy ;
>>>>>     .
>>>>>
>>>>> :bob
>>>>>     foaf:name "Bob" ;
>>>>>     foaf:knows :charlie ;
>>>>>     .
>>>>>
>>>>> :charlie
>>>>>     foaf:name "Charlie" ;
>>>>>     foaf:knows :alice ;
>>>>>     .
>>>>> ----
>>>>> This is nice, human friendly (RDF without angle brackets!), but not easily
>>>>> splittable to be processed with MapReduce (or Giraph).
>>>>>
>>>>> Here is the same graph in N-Triples [5] format:
>>>>> ----
>>>>> <http://example.org/alice> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
>>>>> <http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
>>>>> <http://example.org/alice> <http://xmlns.com/foaf/0.1/mbox> <mailto:al...@example.org> .
>>>>> <http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .
>>>>> <http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/charlie> .
>>>>> <http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/snoopy> .
>>>>> <http://example.org/charlie> <http://xmlns.com/foaf/0.1/name> "Charlie" .
>>>>> <http://example.org/charlie> <http://xmlns.com/foaf/0.1/knows> <http://example.org/alice> .
>>>>> <http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .
>>>>> <http://example.org/bob> <http://xmlns.com/foaf/0.1/knows> <http://example.org/charlie> .
>>>>> ----
>>>>> This is more verbose and ugly, but splittable.
>>>>>
>>>>> The graph I am interested in is the graph represented by foaf:knows
>>>>> relationships/links between people (please, note --knows--> relationship here
>>>>> has a direction, this isn't symmetric as in centralized social networking
>>>>> websites such as Facebook or LinkedIn. Alice can claim to know Bob, without Bob
>>>>> knowing it and/or it might even be a false claim):
>>>>>
>>>>> alice --knows--> bob
>>>>> alice --knows--> charlie
>>>>> alice --knows--> snoopy
>>>>> bob --knows--> charlie
>>>>> charlie --knows--> alice
>>>>>
>>>>> As a first step, I wrote a MapReduce job [6] to transform the RDF graph above into
>>>>> a sort of adjacency list using Turtle syntax, here is the output (three lines):
>>>>> ----
>>>>> <http://example.org/alice> <http://xmlns.com/foaf/0.1/mbox>
>>>>> <mailto:al...@example.org>; <http://xmlns.com/foaf/0.1/name> "Alice";
>>>>> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>>>>> <http://xmlns.com/foaf/0.1/Person>; <http://xmlns.com/foaf/0.1/knows>
>>>>> <http://example.org/charlie>, <http://example.org/bob>,
>>>>> <http://example.org/snoopy>; . <http://example.org/charlie>
>>>>> <http://xmlns.com/foaf/0.1/knows> <http://example.org/alice>.
>>>>>
>>>>> <http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob";
>>>>> <http://xmlns.com/foaf/0.1/knows> <http://example.org/charlie>; .
>>>>> <http://example.org/alice> <http://xmlns.com/foaf/0.1/knows>
>>>>> <http://example.org/bob>.
>>>>>
>>>>> <http://example.org/charlie> <http://xmlns.com/foaf/0.1/name>
>>>>> "Charlie";
>>>>> <http://xmlns.com/foaf/0.1/knows> <http://example.org/alice>; .
>>>>> <http://example.org/bob> <http://xmlns.com/foaf/0.1/knows>
>>>>> <http://example.org/charlie>. <http://example.org/alice>
>>>>> <http://xmlns.com/foaf/0.1/knows> <http://example.org/charlie>.
>>>>> ----
>>>>> This is legal Turtle, but it is also splittable. Each line has all the RDF
>>>>> statements (i.e. edges) for a person (there are also incoming edges).
>>>>>
>>>>> I wrote a TurtleVertexReader [7] which extends TextVertexReader<NodeWritable,
>>>>> Text, NodeWritable, Text> and a TurtleVertexInputFormat [8] which extends
>>>>> TextVertexInputFormat<NodeWritable, Text, NodeWritable, Text>.
>>>>> I wrote (copying from the example SimpleShortestPathsVertex) a
>>>>> FoafShortestPathsVertex [9] which extends EdgeListVertex<NodeWritable,
>>>>> IntWritable, NodeWritable, IntWritable> and I am running it locally using these
>>>>> arguments: -Dgiraph.maxWorkers=1 -Dgiraph.SplitMasterWorker=false
>>>>> -DoverwriteOutput=true src/test/resources/data3.ttl target/foaf
>>>>> http://example.org/alice 1
>>>>>
>>>>> TurtleVertexReader, TurtleVertexInputFormat and FoafShortestPathsVertex are
>>>>> still work in progress and I am sure there are plenty of stupid errors.
>>>>> However, I do not understand why when I run FoafShortestPathsVertex with the
>>>>> DEBUG level, I see debug statements from FoafShortestPathsVertex:
>>>>> 19:34:44 DEBUG FoafShortestPathsVertex :: main({-Dgiraph.maxWorkers=1, -Dgiraph.SplitMasterWorker=false, -DoverwriteOutput=true, src/test/resources/data3.ttl, target/foaf, http://example.org/alice, 1})
>>>>> 19:34:44 DEBUG FoafShortestPathsVertex :: getConf() --> null
>>>>> 19:34:44 DEBUG FoafShortestPathsVertex :: setConf(Configuration: core-default.xml, core-site.xml)
>>>>> 19:34:44 DEBUG FoafShortestPathsVertex :: run({src/test/resources/data3.ttl, target/foaf, http://example.org/alice, 1})
>>>>> 19:34:44 DEBUG FoafShortestPathsVertex :: getConf() --> Configuration: core-default.xml, core-site.xml
>>>>> 19:34:44 DEBUG FoafShortestPathsVertex :: getConf() --> Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, giraph-site.xml
>>>>>
>>>>> But, I do not see anything else, no log statement from TurtleVertexReader or
>>>>> TurtleVertexInputFormat. Why? What am I doing wrong?
>>>>> Is it because I am running it locally?
>>>>>
>>>>> Thanks,
>>>>> Paolo
>>>>>
>>>>> [1] http://en.wikipedia.org/wiki/FOAF_%28software%29
>>>>> [2] https://issues.apache.org/jira/browse/GIRAPH-170
>>>>> [3] https://issues.apache.org/jira/browse/GIRAPH-141
>>>>> [4] http://en.wikipedia.org/wiki/Turtle_%28syntax%29
>>>>> [5] http://en.wikipedia.org/wiki/N-Triples
>>>>> [6] https://github.com/castagna/jena-grande/blob/a650758a56cfe0680320445434e6d6adf2d7e544/src/main/java/org/apache/jena/grande/mapreduce/Rdf2AdjacencyListDriver.java
>>>>> [7] https://github.com/castagna/jena-grande/blob/a650758a56cfe0680320445434e6d6adf2d7e544/src/main/java/org/apache/jena/grande/giraph/TurtleVertexReader.java
>>>>> [8] https://github.com/castagna/jena-grande/blob/a650758a56cfe0680320445434e6d6adf2d7e544/src/main/java/org/apache/jena/grande/giraph/TurtleVertexInputFormat.java
>>>>> [9] https://github.com/castagna/jena-grande/blob/a650758a56cfe0680320445434e6d6adf2d7e544/src/main/java/org/apache/jena/grande/giraph/FoafShortestPathsVertex.java
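
For reference, the result the job in this thread is meant to produce can be checked by hand on the tiny foaf:knows graph from the original message. The following is only a plain-Java sketch and does not use Giraph or Hadoop at all. It assumes every foaf:knows edge costs one hop and uses short names (alice, bob, ...) instead of full URIs; the class and method names (FoafShortestPathsSketch, exampleGraph, hopsFrom) are made up for illustration and are not part of jena-grande.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FoafShortestPathsSketch {

    // Directed foaf:knows edges from the example graph in the thread.
    // Short names stand in for the full http://example.org/ URIs (assumption).
    static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> knows = new LinkedHashMap<>();
        knows.put("alice", List.of("bob", "charlie", "snoopy"));
        knows.put("bob", List.of("charlie"));
        knows.put("charlie", List.of("alice"));
        knows.put("snoopy", List.of());
        return knows;
    }

    // Breadth-first search: number of foaf:knows hops from the source to every
    // reachable person, assuming each edge has weight 1 (assumption).
    static Map<String, Integer> hopsFrom(String source, Map<String, List<String>> knows) {
        Map<String, Integer> hops = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        hops.put(source, 0);
        queue.add(source);
        while (!queue.isEmpty()) {
            String person = queue.remove();
            for (String friend : knows.getOrDefault(person, List.of())) {
                // The first visit is the shortest one, because BFS expands
                // people in order of increasing distance from the source.
                if (!hops.containsKey(friend)) {
                    hops.put(friend, hops.get(person) + 1);
                    queue.add(friend);
                }
            }
        }
        return hops;
    }

    public static void main(String[] args) {
        System.out.println(hopsFrom("alice", exampleGraph()));
    }
}
```

Because the edges are directed, the answer depends on the source: starting from alice everyone is one hop away, while starting from bob, snoopy is only reachable through charlie and alice. A small expected result like this may be handy to compare against the job output once the input format is actually picked up.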