Hi, I'm trying to run SSSP, PageRank, and Connected Components from the Giraph examples on some networks, but I keep getting an error. For smaller networks the algorithms run successfully; for somewhat larger ones (say, 1 million edges), however, I get the following kind of message (a failing map task):
hduser@horragalles:~$ $HADOOP_HOME'/bin/hadoop' jar /usr/local/giraph/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-2.7.2-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/hduser/Nets-Giraph/10K-10M-er.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/hduser/output/o7 -w 1 -ca giraph.SplitMasterWorker=false -yj giraph-examples-1.2.0-SNAPSHOT-for-hadoop-2.7.2-jar-with-dependencies.jar -ca giraph.logLevel=debug

16/03/16 14:16:03 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
16/03/16 14:16:03 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
16/03/16 14:16:03 INFO utils.ConfigurationUtils: Setting custom argument [giraph.SplitMasterWorker] to [false] in GiraphConfiguration
16/03/16 14:16:03 INFO utils.ConfigurationUtils: Setting custom argument [giraph.logLevel] to [debug] in GiraphConfiguration
16/03/16 14:16:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/16 14:16:05 INFO Configuration.deprecation: mapreduce.job.counters.limit is deprecated. Instead, use mapreduce.job.counters.max
16/03/16 14:16:05 INFO Configuration.deprecation: mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
16/03/16 14:16:05 INFO Configuration.deprecation: mapred.job.reduce.memory.mb is deprecated. Instead, use mapreduce.reduce.memory.mb
16/03/16 14:16:05 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/03/16 14:16:05 INFO Configuration.deprecation: mapreduce.user.classpath.first is deprecated. Instead, use mapreduce.job.user.classpath.first
16/03/16 14:16:05 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
16/03/16 14:16:05 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 1, old value = 4)
16/03/16 14:16:05 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
16/03/16 14:16:05 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050
16/03/16 14:16:05 INFO mapreduce.JobSubmitter: number of splits:1
16/03/16 14:16:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458141522317_0002
16/03/16 14:16:06 INFO impl.YarnClientImpl: Submitted application application_1458141522317_0002
16/03/16 14:16:06 INFO mapreduce.Job: The url to track the job: http://horragalles:8088/proxy/application_1458141522317_0002/
16/03/16 14:16:06 INFO job.GiraphJob: Tracking URL: http://horragalles:8088/proxy/application_1458141522317_0002/
16/03/16 14:16:06 INFO job.GiraphJob: Waiting for resources...
Job will start only when it gets all 2 mappers
16/03/16 14:16:22 INFO job.HaltApplicationUtils$DefaultHaltInstructionsWriter: writeHaltInstructions: To halt after next superstep execute: 'bin/halt-application --zkServer horragalles:22181 --zkNode /_hadoopBsp/job_1458141522317_0002/_haltComputation'
16/03/16 14:16:22 INFO mapreduce.Job: Running job: job_1458141522317_0002
16/03/16 14:16:23 INFO mapreduce.Job: Job job_1458141522317_0002 running in uber mode : false
16/03/16 14:16:23 INFO mapreduce.Job: map 100% reduce 0%
16/03/16 14:16:43 INFO mapreduce.Job: Job job_1458141522317_0002 failed with state FAILED due to: Task failed task_1458141522317_0002_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
16/03/16 14:16:43 INFO mapreduce.Job: Counters: 8
    Job Counters
        Failed map tasks=1
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=55500
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=27750
        Total vcore-milliseconds taken by all map tasks=27750
        Total megabyte-milliseconds taken by all map tasks=227328000

I've already tried to increase the Hadoop memory settings in mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>mapred.map.java.opts</name>
    <value>-Xmx6144m</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx6144m</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
</configuration>

I've shared the log file at this link (I think it might help; take a look at line 954):
https://drive.google.com/file/d/0B9p1dXvTCnE9SE1sQ3VzcHdaM0U/view?usp=sharing

P.S.: I have 50 GB of RAM (a single-node server); moreover, while I was monitoring resource utilization, the algorithms used less than 2 GB before the map failed.

Giraph 1.2.0, Hadoop 2.7.2, and Apache Maven 3.0.5.

--
Best regards,
Daniel N. R. da Silva
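
P.S. 2: For completeness, my understanding is that the map container size requested above is also bounded by the YARN limits in yarn-site.xml. A minimal sketch of the two properties I believe are relevant is below; the values are only placeholders for illustration, not necessarily my current settings:

<configuration>
  <!-- Placeholder: total memory the single NodeManager may hand out to containers. -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>40960</value>
  </property>
  <!-- Placeholder: largest single container the scheduler will grant;
       mapreduce.map.memory.mb must not exceed this value. -->
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
</configuration>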
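
P.S. 3: For reference, each line of the file passed with -vip follows, as far as I know, the layout that JsonLongDoubleFloatDoubleVertexInputFormat expects, i.e. [source_id, source_value, [[dest_id, edge_value], ...]]. The lines below are just a made-up illustration, not taken from 10K-10M-er.txt:

[0,0,[[1,1],[3,3]]]
[1,0,[[0,1],[2,2],[3,1]]]
[2,0,[[1,2],[4,4]]]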
