Hi Jeffrey,

My best attempt at answers is inline.

On 2/16/12 6:12 PM, Jeffrey Yunes wrote:
Hi Giraph community,
I think I followed all of the directions (for Giraph on a pseudo-cluster), 
and it looks like

mvn clean test -Dprop.mapred.job.tracker=localhost:9001
runs fine. However, I'm new to the Hadoop infrastructure, and have a couple of 
questions about getting started with Giraph.

1)
hadoop jar target/giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
org.apache.giraph.benchmark.PageRankBenchmark -e 1 -s 3 -v -V 50 -w 3
gives me the error "java.lang.NullPointerException at 
org.apache.giraph.benchmark.PageRankBenchmark.run(PageRankBenchmark.java:127)". It 
looks like some error with configuration?

This is a bug, sorry about that. I opened an issue for it (https://issues.apache.org/jira/browse/GIRAPH-150) and have a quick fix:

diff --git a/src/main/java/org/apache/giraph/benchmark/PageRankBenchmark.java b/
index 0e76122..4d08929 100644
--- a/src/main/java/org/apache/giraph/benchmark/PageRankBenchmark.java
+++ b/src/main/java/org/apache/giraph/benchmark/PageRankBenchmark.java
@@ -124,7 +124,8 @@ public class PageRankBenchmark extends EdgeListVertex<
     } else {
       job.setVertexClass(PageRankBenchmark.class);
     }
-    LOG.info("Using class " + BspUtils.getVertexClass(getConf()).getName());
+    LOG.info("Using class " +
+        BspUtils.getVertexClass(job.getConfiguration()).getName());
     job.setVertexInputFormatClass(PseudoRandomVertexInputFormat.class);
     job.setWorkerConfiguration(workers, workers, 100.0f);
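
To illustrate why reading from job.getConfiguration() instead of getConf() can avoid the NullPointerException, here is a minimal sketch. It assumes (this is not confirmed above) that the job keeps its own copy of the configuration, so a vertex class set on the job never shows up in the caller's configuration. SketchJob, lookupVertexClass, and the "vertex.class" key are hypothetical stand-ins, not Giraph or Hadoop APIs.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfCopySketch {
    /** Hypothetical stand-in for a job that copies the caller's configuration. */
    static final class SketchJob {
        private final Map<String, String> conf;

        SketchJob(Map<String, String> callerConf) {
            // The job works on a private copy; later changes do not flow back.
            this.conf = new HashMap<>(callerConf);
        }

        void setVertexClass(String className) {
            conf.put("vertex.class", className); // hypothetical key name
        }

        Map<String, String> getConfiguration() {
            return conf;
        }
    }

    /** Returns the configured vertex class name, or null if it was never set. */
    static String lookupVertexClass(Map<String, String> conf) {
        return conf.get("vertex.class");
    }

    public static void main(String[] args) {
        Map<String, String> toolConf = new HashMap<>();
        SketchJob job = new SketchJob(toolConf);
        job.setVertexClass("PageRankBenchmark");

        // Set on the job's copy, so the lookup succeeds here.
        System.out.println(lookupVertexClass(job.getConfiguration()));
        // Never set on the caller's configuration, so this is null; calling a
        // method such as getName() on that result would throw the NPE.
        System.out.println(lookupVertexClass(toolConf));
    }
}
```

Under that assumption, the one-line change in the patch simply asks the object that was actually written to.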

2) How should I / do I enable the log4j? An appender that writes to the HDFS? 
How else could I grep all my logs for errors and things?

log4j is used by the task trackers to write to the job logs. If you click on your running job in the job tracker web page, you can then click into each task and look at the logs under 'Task Logs'. You can configure the task tracker's log4j.properties to set the log level, but I believe the default is INFO.

3) With regard to Giraph and maven, none of the directions suggested doing "local overrides." 
Therefore, why should I expect my Giraph installation to refer to libraries and configuration in 
"~/Applications/hadoop or zookeeper" rather than those in "~/.m2/repository"?

Giraph builds a massive jar that contains all the classes and jars required to launch ZooKeeper and interact with Hadoop. This makes for easy deployment to a running cluster.

4) Why doesn't running maven for Giraph install hadoop along the way (or does 
it)?

Because there are so many versions of Hadoop. Also, if you are launching the job through Hadoop, the Hadoop jar should already be on your classpath automatically.

I'd appreciate if you'd help improve my understanding!
No problem.  Welcome to Giraph!

Thanks!
-Jeff



