Thanks, we'll fix that. Meanwhile, use this patch to get trunk to build.
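(For anyone else hitting the same RAT failure: below is a rough local way to list files that lack the Apache license header, as a quick stand-in for digging through target/rat.txt. This is just an illustrative sketch — the function name and the header string it greps for are my own, not from the project's build.)

```shell
# List .java files under a directory whose first lines lack the ASF
# license header -- a rough local approximation of what the release-audit
# (RAT) check reports in target/rat.txt. Function name is illustrative.
missing_headers() {
  find "$1" -name '*.java' | while read -r f; do
    head -n 5 "$f" | grep -q 'Licensed to the Apache Software Foundation' \
      || echo "$f"
  done
}

# Example: missing_headers src/main/java
```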
On Fri, Nov 18, 2011 at 9:28 AM, Yingyi Bu <buyin...@gmail.com> wrote:
> Could anyone fix the trunk? Two files are missing headers, so the build fails...
>
> Attached is the target/rat.txt from the failed build.
> I fixed them locally anyway...
> Thanks!
> Yingyi
>
> On Thu, Nov 17, 2011 at 11:53 PM, Yingyi Bu <buyin...@gmail.com> wrote:
>>
>> Avery,
>> Thanks a lot for the help!!
>> I'll sync with trunk and try your suggested settings.
>> Best regards,
>> Yingyi
>>
>> On Thu, Nov 17, 2011 at 11:47 PM, Avery Ching <ach...@apache.org> wrote:
>>>
>>> Yingyi,
>>>
>>> It looks like you lost the connection to ZooKeeper. You might want to sync
>>> with trunk: GIRAPH-11 changed the settings to allow longer ZooKeeper
>>> timeouts. Also, ordering of the vertices is no longer required, and the load
>>> balancing should be better. You might also want to add some better GC
>>> options to reduce stop-the-world pauses (which are likely causing the
>>> timeouts).
>>>
>>> Here are some example settings you can try fiddling with as well; just
>>> add them to the other JVM settings you tried out earlier. Let us know how
>>> it goes.
>>>
>>> -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:ParallelGCThreads=8
>>> -XX:+CMSIncrementalPacing -XX:+PrintGCDetails
>>> -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly
>>> -XX:+PrintTenuringDistribution
>>>
>>> Avery
>>>
>>> On 11/17/11 11:24 PM, Yingyi Bu wrote:
>>>
>>> Hi Avery,
>>> Thanks a lot for your help!!
>>> With your settings I no longer hit the OOM! However, after running the
>>> job for 10 minutes, one worker failed, and then, after a while, all the
>>> mappers failed. Attached below are the mapper logs from two nodes. It seems
>>> they cannot connect to ZooKeeper. The workers ran fine until the highlighted
>>> exception. Am I missing something in the job settings?
>>> Thanks again!!
>>> Best regards,
>>> Yingyi
>>>
>>>
>>> Mapper log on Node-1:
>>> 2011-11-17 22:56:39,044 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_asterix-010 0 ' (polling period is 3000)
>>> 2011-11-17 22:56:39,044 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [asterix-010, 0] 2 hosts in filename 'zkServerList_asterix-010 0 '
>>> 2011-11-17 22:56:39,046 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
>>> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper/zoo.cfg in /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper with base port 22181
>>> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
>>> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false
>>> 2011-11-17 22:56:39,050 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/mnt/data/sda/space/yingyi/tools/java/jre/bin/java, -Xmx256m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper/zoo.cfg] in directory /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
>>> 2011-11-17 22:56:39,056 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to asterix-010:22181 with poll msecs = 3000
>>> 2011-11-17 22:56:39,058 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
>>> java.net.ConnectException: Connection refused
>>>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>         at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>>>         at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>>>         at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>>>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>>>         at java.net.Socket.connect(Socket.java:529)
>>>         at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:612)
>>>         at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:401)
>>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>> 2011-11-17 22:56:42,062 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to asterix-010:22181 with poll msecs = 3000
>>> 2011-11-17 22:56:42,063 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected!
>>> 2011-11-17 22:56:42,064 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201111172247_0003/_zkServer/asterix-010 0
>>> 2011-11-17 22:56:42,070 INFO org.apache.giraph.graph.GraphMapper: setup: Starting up BspServiceMaster (master thread)...
>>> 2011-11-17 22:56:42,080 INFO org.apache.giraph.graph.BspService: BspService: Connecting to ZooKeeper with job job_201111172247_0003, 0 on asterix-010:22181
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=asterix-010
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_21
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:java.home=/mnt/data/sda/space/yingyi/tools/java/jre >>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:java.class.path=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/classes:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../conf:/mnt/data/sda/space/yingyi/tools/java/lib/tools.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/test/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/tools:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop- >>> 
0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexe >>> c/../share/hadoop/lib/commons-digester-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/ >>> 
hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/ >>> jersey-server-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../ >>> 
share/hadoop/lib/slf4j-api-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../lib:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work/tmp
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.18-194.26.1.el5
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=yingyib
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/yingyib
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work
>>> 2011-11-17 22:56:42,088 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection,
connectString=asterix-010:22181 sessionTimeout=60000 watcher=org.apache.giraph.graph.BspServiceMaster@13a78071
>>> 2011-11-17 22:56:42,098 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server asterix-010/10.0.0.10:22181
>>> 2011-11-17 22:56:42,099 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to asterix-010/10.0.0.10:22181, initiating session
>>> 2011-11-17 22:56:42,123 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server asterix-010/10.0.0.10:22181, sessionid = 0x133b57675b60000, negotiated timeout = 60000
>>> 2011-11-17 22:56:42,125 INFO org.apache.giraph.graph.BspService: process: Asynchronous connection complete.
>>> 2011-11-17 22:56:42,126 INFO org.apache.giraph.graph.GraphMapper: map: No need to do anything when not a worker
>>> 2011-11-17 22:56:42,126 INFO org.apache.giraph.graph.GraphMapper: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
>>> 2011-11-17 22:56:42,197 INFO org.apache.giraph.graph.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201111172247_0003/_masterElectionDir/asterix-010_00000000000' and my bid is '/_hadoopBsp/job_201111172247_0003/_masterElectionDir/asterix-010_00000000000'
>>> 2011-11-17 22:56:42,197 INFO org.apache.giraph.graph.BspServiceMaster: becomeMaster: I am now the master!
>>> 2011-11-17 22:56:42,208 INFO org.apache.giraph.graph.BspService: process: applicationAttemptChanged signaled
>>> 2011-11-17 22:56:42,216 WARN org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
>>> 2011-11-17 22:56:45,130 INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat: Total input paths to process : 10
>>> 2011-11-17 22:56:45,227 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep: 0 out of 10 chosen workers finished on superstep -1
>>> 2011-11-17 23:01:20,045 ERROR org.apache.zookeeper.ClientCnxn: Error while calling watcher
>>> java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_vertexRangeAssignments
>>>         at org.apache.giraph.graph.BspService.getVertexRangeMap(BspService.java:885)
>>>         at org.apache.giraph.graph.BspServiceMaster.checkHealthyWorkerFailure(BspServiceMaster.java:1946)
>>>         at org.apache.giraph.graph.BspServiceMaster.processEvent(BspServiceMaster.java:1976)
>>>         at org.apache.giraph.graph.BspService.process(BspService.java:1095)
>>>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_vertexRangeAssignments
>>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
>>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:950)
>>>         at org.apache.giraph.graph.BspService.getVertexRangeMap(BspService.java:858)
>>>         ... 4 more
>>> 2011-11-17 23:01:22,009 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep: 0 out of 10 chosen workers finished on superstep -1
>>> 2011-11-17 23:11:27,357 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
>>>
>>> Mapper log on Node-2:
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=asterix-001
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_21
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/mnt/data/sda/space/yingyi/tools/java/jre
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
environment:java.class.path=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/classes:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../conf:/mnt/data/sda/space/yingyi/tools/java/lib/tools.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/test/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/tools:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop- >>> 0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexe >>> 
c/../share/hadoop/lib/commons-digester-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/ >>> hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/ >>> 
jersey-server-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../ >>> share/hadoop/lib/slf4j-api-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar >>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:java.library.path=/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../lib:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work >>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:java.io.tmpdir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work/tmp >>> 2011-11-17 
22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:java.compiler=<NA> >>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:os.name=Linux >>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:os.arch=amd64 >>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:os.version=2.6.18-194.26.1.el5 >>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:user.name=yingyib >>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:user.home=/home/yingyib >>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client >>> environment:user.dir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work >>> 2011-11-17 22:56:44,159 INFO org.apache.zookeeper.ZooKeeper: Initiating >>> client connection, connectString=asterix-010:22181 sessionTimeout=60000 >>> watcher=org.apache.giraph.graph.BspServiceWorker@60ded0f0 >>> 2011-11-17 22:56:44,171 INFO org.apache.zookeeper.ClientCnxn: Opening >>> socket connection to server asterix-010/10.0.0.10:22181 >>> 2011-11-17 22:56:44,173 INFO org.apache.zookeeper.ClientCnxn: Socket >>> connection established to asterix-010/10.0.0.10:22181, initiating session >>> 2011-11-17 22:56:44,178 INFO org.apache.zookeeper.ClientCnxn: Session >>> establishment complete on server asterix-010/10.0.0.10:22181, sessionid = >>> 0x133b57675b60007, negotiated timeout = 60000 >>> 2011-11-17 22:56:44,180 INFO org.apache.giraph.graph.BspService: process: >>> Asynchronous connection complete. >>> 2011-11-17 22:56:44,180 INFO org.apache.giraph.graph.GraphMapper: setup: >>> Registering health of this worker... 
>>> 2011-11-17 22:56:44,191 INFO org.apache.giraph.graph.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201111172247_0003/_masterJobState)
>>> 2011-11-17 22:56:44,195 INFO org.apache.giraph.graph.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir already exists!
>>> 2011-11-17 22:56:44,198 INFO org.apache.giraph.graph.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir already exists!
>>> 2011-11-17 22:56:44,204 INFO org.apache.giraph.graph.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/asterix-001_8 and hostnamePort = ["asterix-001",30008]
>>> 2011-11-17 22:56:45,177 INFO org.apache.giraph.graph.BspService: process: inputSplitsReadyChanged (input splits ready)
>>> 2011-11-17 22:56:45,192 WARN org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2/_inputSplitReserved, type=NodeCreated, state=SyncConnected)
>>> 2011-11-17 22:56:45,192 INFO org.apache.giraph.graph.BspServiceWorker: reserveInputSplit: Reserved input split path /_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2
>>> 2011-11-17 22:56:45,196 INFO org.apache.giraph.graph.BspServiceWorker: loadVertices: Reserved /_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2 from ZooKeeper and got input split 'hdfs://asterix-master:31888/webmap-tiny-sorted/part-00002:0+834285620'
>>> 2011-11-17 23:01:20,608 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 59117ms for sessionid 0x133b57675b60007, closing socket connection and attempting reconnect
>>> 2011-11-17 23:02:06,630 ERROR org.apache.zookeeper.ClientCnxn: Error while calling watcher
>>> java.lang.RuntimeException: process: Disconnected from ZooKeeper, cannot recover.
>>>         at org.apache.giraph.graph.BspService.process(BspService.java:990)
>>>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
>>> 2011-11-17 23:02:35,793 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server asterix-010/10.0.0.10:22181
>>> 2011-11-17 23:02:35,794 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to asterix-010/10.0.0.10:22181, initiating session
>>> 2011-11-17 23:02:35,806 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x133b57675b60007 has expired, closing socket connection
>>>
>>> On Thu, Nov 17, 2011 at 9:46 PM, Avery Ching <ach...@apache.org> wrote:
>>>>
>>>> Hi Yingyi,
>>>>
>>>> Here are some ideas you might want to try:
>>>>
>>>> 1) Limit the thread stack size.
>>>>
>>>> 2) Set the heap available to the mapper JVM.
>>>>
>>>> For example, here's a setting to get 10 GB of heap and use a smaller stack (64k) for the threads:
>>>>
>>>> -Dmapred.child.java.opts="-Xms10g -Xmx10g -Xss64k"
>>>>
>>>> Also, you might want to try using EdgeListVertex instead of Vertex (i.e. GiraphJob.setVertexClass(EdgeListVertex.class)); it is quite a bit smaller.
>>>>
>>>> Let us know if that helps you. You should also check whether your Hadoop installation is using a 32-bit or 64-bit JVM. If it's 32-bit, you will be limited in how much heap you can use.
>>>>
>>>> Avery
>>>>
>>>> On 11/17/11 9:38 PM, Yingyi Bu wrote:
>>>>
>>>> Hi,
>>>>     I'm running a Giraph PageRank job. I tried it with 8GB of input text data over 10 nodes (each has 4 cores, 4 disks, and 12GB of physical memory), that is, 800MB of input data per machine. However, the Giraph job fails because of high GC costs and an Out-of-Memory exception.
>>>>     Do I need to set anything special in the Hadoop configuration, for example, the maximum heap size for the map task JVM?
>>>>     Thanks!!
>>>> Best regards,
>>>> Yingyi

--
Claudio Martella
claudio.marte...@gmail.com
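[Editorial note for readers working through this thread later: the heap and stack options Avery suggests are delivered to each map-task JVM through the `mapred.child.java.opts` property, since Giraph workers run as map tasks. The sketch below only assembles and prints that command-line fragment; the option values are the ones quoted in the thread, and nothing beyond that is implied about any particular cluster.]

```shell
# Sketch: building the mapred.child.java.opts value from the settings
# suggested in this thread: 10 GB initial/max heap, 64 KB per-thread stacks.
# Smaller stacks matter because a Giraph worker spawns many threads, each of
# which reserves stack space on top of the heap.
CHILD_OPTS="-Xms10g -Xmx10g -Xss64k"

# This string is then passed on the job command line, e.g.
#   hadoop jar <job.jar> <MainClass> -Dmapred.child.java.opts="..."
# (job jar and main class are placeholders, not from this thread).
printf -- '-Dmapred.child.java.opts="%s"\n' "$CHILD_OPTS"
```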
rat.diff
Description: Binary data