Hi,

With graphs of up to 15,000 vertices the job runs. I believe it is a memory problem: a quick look at memory usage showed a lot of swap being used.
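A rough sketch of the arithmetic behind that suspicion, using only figures quoted in this thread (35,000 vertices, a 65,537-slot distance vector per vertex, -Xmx6144m per child JVM, 4 GB of RAM per groom). Two things here are assumptions, not facts from the thread: 4 bytes per distance entry, and three tasks per groom (read off the "Launch 3 tasks" lines in the node30 log). The function names are made up for illustration. The first figure counts raw distance data only; Java object overhead and BSP messages come on top of it.

```python
# Back-of-the-envelope memory check for the job discussed in this thread.
# ASSUMPTION: each distance entry is 4 bytes (e.g. a Java int); the real
# element type is not stated anywhere in the thread.
BYTES_PER_ENTRY = 4
GIB = 1024 ** 3


def distance_state_bytes(vertices, slots=65537, bytes_per_entry=BYTES_PER_ENTRY):
    """Raw bytes for one distance vector of `slots` entries per vertex."""
    return vertices * slots * bytes_per_entry


def heap_overcommit(tasks_per_groom, xmx_mb, ram_mb):
    """Ratio of the combined heap ceilings on one groom to its physical RAM."""
    return tasks_per_groom * xmx_mb / ram_mb


for n in (15000, 35000):
    gib = distance_state_bytes(n) / GIB
    print(f"{n} vertices: {gib:.2f} GiB of raw distance data, whole job")

# ASSUMPTION: 3 tasks per groom, taken from the "Launch 3 tasks" log lines.
ratio = heap_overcommit(tasks_per_groom=3, xmx_mb=6144, ram_mb=4096)
print(f"heap ceilings vs. RAM on one groom: {ratio:.1f}x")
```

If these assumptions hold, the child JVMs on one groom are allowed to grow well past the machine's physical memory, which would be consistent with the swapping described above. It does not prove that memory is the cause.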
2013/1/24 Edward J. Yoon <[email protected]>:
> Have you checked your algorithm with small input? I'm not sure that is
> a memory issue.
>
> On Thu, Jan 24, 2013 at 7:53 PM, Francisco Sanches
> <[email protected]> wrote:
>> Hi thanks for the reply,
>>
>> My input is a graph of 35 000 vertices, but my application uses a
>> vector of 65537 to store the distance. Each node has a vector with the
>> number of positions (All-pair).
>>
>> 2013/1/24 Edward J. Yoon <[email protected]>:
>>> Hi,
>>>
>>> How large your input data is?
>>>
>>> On Tue, Jan 22, 2013 at 11:21 PM, Francisco Sanches
>>> <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> When I try to run a larger graph in the application that I developed
>>>> here first get this message:
>>>>
>>>> hduser@node28:/usr/local/hama-trunk$ time bin/hama jar aplic19jan.jar
>>>> 65537 /hdfs/disco/dados/internet_a /hdfs/disco/outinternet_a 42 30
>>>> 13/01/22 10:59:14 INFO bsp.FileInputFormat: Total input paths to process :
>>>> 1
>>>> 13/01/22 10:59:14 INFO bsp.FileInputFormat: Total input paths to process :
>>>> 1
>>>> 13/01/22 10:59:17 INFO bsp.BSPJobClient: Running job: job_201301221058_0002
>>>> 13/01/22 10:59:20 INFO bsp.BSPJobClient: Current supersteps number: 0
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient: Current supersteps number: 1
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient: The total number of supersteps: 1
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient: Counters: 6
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient:
>>>> org.apache.hama.bsp.JobInProgress$JobCounter
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient: SUPERSTEPS=1
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient: LAUNCHED_TASKS=1
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient:
>>>> org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient: SUPERSTEP_SUM=1
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient: TIME_IN_SYNC_MS=70
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient: IO_BYTES_READ=3467587
>>>> 13/01/22 10:59:32 INFO bsp.BSPJobClient: TASK_INPUT_RECORDS=42089
>>>> 13/01/22 10:59:32 INFO bsp.FileInputFormat: Total input paths to process :
>>>> 42
>>>> 13/01/22 10:59:35 INFO bsp.BSPJobClient: Running job: job_201301221058_0001
>>>> 13/01/22 10:59:38 INFO bsp.BSPJobClient: Current supersteps number: 0
>>>> 13/01/22 11:00:38 INFO bsp.BSPJobClient: Current supersteps number: 1
>>>> attempt_201301221058_0001_000001_0: SLF4J: Class path contains
>>>> multiple SLF4J bindings.
>>>> attempt_201301221058_0001_000001_0: SLF4J: Found binding in
>>>> [jar:file:/usr/local/hama-trunk/lib/slf4j-log4j12-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> attempt_201301221058_0001_000001_0: SLF4J: Found binding in
>>>> [file:/tmp/hama-hduser/bsp/local/groomServer/attempt_201301221058_0001_000001_0/work/org/slf4j/impl/StaticLoggerBinder.class]
>>>> attempt_201301221058_0001_000001_0: SLF4J: See
>>>> http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>> attempt_201301221058_0001_000001_0: 13/01/22 11:00:14 INFO
>>>> sync.ZKSyncClient: Initializing ZK Sync Client
>>>> attempt_201301221058_0001_000001_0: 13/01/22 11:00:14 INFO
>>>> sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At
>>>> node30/10.16.2.30:61003
>>>> attempt_201301221058_0001_000001_0: 13/01/22 11:00:16 INFO ipc.Server:
>>>> Starting SocketReader
>>>> attempt_201301221058_0001_000001_0: 13/01/22 11:00:16 INFO ipc.Server:
>>>> IPC Server Responder: starting
>>>> attempt_201301221058_0001_000001_0: 13/01/22 11:00:16 INFO ipc.Server:
>>>> IPC Server listener on 61003: starting
>>>> attempt_201301221058_0001_000001_0: 13/01/22 11:00:16 INFO
>>>> message.HadoopMessageManagerImpl: BSPPeer address:node30 port:61003
>>>> attempt_201301221058_0001_000001_0: 13/01/22 11:00:16 INFO ipc.Server:
>>>> IPC Server handler 0 on 61003: starting
>>>> attempt_201301221058_0001_000001_0: 13/01/22 11:00:34 INFO
>>>> graph.GraphJobRunner: Loading finished at 1 steps.
>>>> 13/01/22 11:04:46 INFO bsp.BSPJobClient: Job failed.
>>>>
>>>> real 5m34.523s
>>>> user 0m4.832s
>>>> sys 0m0.960s
>>>>
>>>> Analyzing logs the one that fails has this log:
>>>>
>>>> /************************************************************
>>>> STARTUP_MSG: Starting GroomServer
>>>> STARTUP_MSG: host = node30/10.16.2.30
>>>> STARTUP_MSG: args = [node30]
>>>> STARTUP_MSG: version = 1.0.0
>>>> STARTUP_MSG: build =
>>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
>>>> 1214675; compiled by 'hortonfo' on Fri Dec 16 20:01:27 UTC 2011
>>>> ************************************************************/
>>>> 2013-01-22 10:58:51,118 INFO org.apache.hama.bsp.GroomServer: groom start
>>>> 2013-01-22 10:58:51,420 INFO org.apache.hama.bsp.GroomServer:
>>>> /tmp/hama-hduser/bsp/local
>>>> 2013-01-22 10:58:55,417 INFO org.apache.hadoop.ipc.Server: Starting
>>>> SocketReader
>>>> 2013-01-22 10:58:55,452 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> listener on 50000: starting
>>>> 2013-01-22 10:58:55,452 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> Responder: starting
>>>> 2013-01-22 10:58:55,455 INFO org.apache.hama.bsp.GroomServer: Worker
>>>> rpc server --> node30:50000
>>>> 2013-01-22 10:58:55,455 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 0 on 50000: starting
>>>> 2013-01-22 10:58:55,560 INFO org.mortbay.log: Logging to
>>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>>> org.mortbay.log.Slf4jLog
>>>> 2013-01-22 10:58:55,645 INFO org.apache.hama.bsp.GroomServer: starting
>>>> webserver: node30
>>>> 2013-01-22 10:58:55,645 INFO org.apache.hama.http.HttpServer: Port
>>>> returned by webServer.getConnectors()[0].getLocalPort() before open()
>>>> is -1. Opening the listener on 40015
>>>> 2013-01-22 10:58:55,646 INFO org.apache.hama.http.HttpServer:
>>>> listener.getLocalPort() returned 40015
>>>> webServer.getConnectors()[0].getLocalPort() returned 40015
>>>> 2013-01-22 10:58:55,646 INFO org.apache.hama.http.HttpServer: Jetty
>>>> bound to port 40015
>>>> 2013-01-22 10:58:55,646 INFO org.mortbay.log: jetty-6.1.14
>>>> 2013-01-22 10:58:56,252 INFO org.mortbay.log: Started
>>>> SelectChannelConnector@node30:40015
>>>> 2013-01-22 10:59:01,574 INFO org.apache.hadoop.ipc.Server: Starting
>>>> SocketReader
>>>> 2013-01-22 10:59:01,576 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> Responder: starting
>>>> 2013-01-22 10:59:01,576 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> listener on 59102: starting
>>>> 2013-01-22 10:59:01,576 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 0 on 59102: starting
>>>> 2013-01-22 10:59:01,580 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 1 on 59102: starting
>>>> 2013-01-22 10:59:01,577 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 2 on 59102: starting
>>>> 2013-01-22 10:59:01,584 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 4 on 59102: starting
>>>> 2013-01-22 10:59:01,583 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 3 on 59102: starting
>>>> 2013-01-22 10:59:01,584 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 5 on 59102: starting
>>>> 2013-01-22 10:59:01,584 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 6 on 59102: starting
>>>> 2013-01-22 10:59:01,585 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 7 on 59102: starting
>>>> 2013-01-22 10:59:01,585 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 8 on 59102: starting
>>>> 2013-01-22 10:59:01,585 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 9 on 59102: starting
>>>> 2013-01-22 10:59:06,899 INFO org.apache.hama.bsp.GroomServer:
>>>> GroomServer up at: 127.0.0.1/127.0.0.1:59102
>>>> 2013-01-22 10:59:06,900 INFO org.apache.hama.bsp.GroomServer: Starting
>>>> groom: node30:50000
>>>> 2013-01-22 10:59:18,083 INFO org.apache.hama.bsp.GroomServer: Launch 1
>>>> tasks.
>>>> 2013-01-22 10:59:18,283 WARN org.apache.hadoop.util.NativeCodeLoader:
>>>> Unable to load native-hadoop library for your platform... using
>>>> builtin-java classes where applicable
>>>> 2013-01-22 10:59:23,057 INFO org.apache.hama.bsp.GroomServer: Task
>>>> 'attempt_201301221058_0002_000000_0' has started.
>>>> 2013-01-22 10:59:31,403 INFO org.apache.hama.bsp.GroomServer: Task
>>>> attempt_201301221058_0002_000000_0 is done.
>>>> 2013-01-22 10:59:35,757 INFO org.apache.hama.bsp.GroomServer: Launch 3
>>>> tasks.
>>>> 2013-01-22 10:59:56,486 INFO org.apache.hama.bsp.GroomServer: Task
>>>> 'attempt_201301221058_0001_000000_0' has started.
>>>> 2013-01-22 10:59:56,487 INFO org.apache.hama.bsp.GroomServer: Launch 3
>>>> tasks.
>>>> 2013-01-22 11:00:06,973 INFO org.apache.hama.bsp.GroomServer: Task
>>>> 'attempt_201301221058_0001_000002_0' has started.
>>>> 2013-01-22 11:00:06,974 INFO org.apache.hama.bsp.GroomServer: Launch 3
>>>> tasks.
>>>> 2013-01-22 11:00:12,927 INFO org.apache.hama.bsp.GroomServer: Task
>>>> 'attempt_201301221058_0001_000001_0' has started.
>>>> 2013-01-22 11:03:50,330 ERROR org.apache.hama.bsp.TaskRunner: Error
>>>> when executing BSPPeer process.
>>>> java.io.IOException: BSP task process exit with nonzero status of 137.
>>>> command = [/usr/lib/jvm/jdk1.7.0_11/jre/bin/java, -Xmx6144m,
>>>> -classpath,
>>>> /usr/local/hama-trunk/bin/../conf:/usr/lib/jvm/java-7-oracle//lib/tools.jar:/usr/local/hama-trunk/bin/../core/target/classes:/usr/local/hama-trunk/bin/../hama-**.jar:/usr/local/hama-trunk/bin/../lib/ant-1.7.1.jar:/usr/local/hama-trunk/bin/../lib/ant-launcher-1.7.1.jar:/usr/local/hama-trunk/bin/../lib/avro-1.6.0.jar:/usr/local/hama-trunk/bin/../lib/avro-ipc-1.6.0.jar:/usr/local/hama-trunk/bin/../lib/commons-cli-1.2.jar:/usr/local/hama-trunk/bin/../lib/commons-configuration-1.7.jar:/usr/local/hama-trunk/bin/../lib/commons-httpclient-3.0.1.jar:/usr/local/hama-trunk/bin/../lib/commons-lang-2.6.jar:/usr/local/hama-trunk/bin/../lib/commons-logging-1.1.1.jar:/usr/local/hama-trunk/bin/../lib/commons-math3-3.0.jar:/usr/local/hama-trunk/bin/../lib/guava-10.0.1.jar:/usr/local/hama-trunk/bin/../lib/hadoop-core-1.0.0.jar:/usr/local/hama-trunk/bin/../lib/hadoop-test-1.0.0.jar:/usr/local/hama-trunk/bin/../lib/jackson-core-asl-1.9.2.jar:/usr/local/hama-trunk/bin/../lib/jackson-mapper-asl-1.9.2.jar:/usr/local/hama-trunk/bin/../lib/jetty-6.1.14.jar:/usr/local/hama-trunk/bin/../lib/jetty-annotations-6.1.14.jar:/usr/local/hama-trunk/bin/../lib/jetty-util-6.1.14.jar:/usr/local/hama-trunk/bin/../lib/jsp-2.1-6.1.14.jar:/usr/local/hama-trunk/bin/../lib/jsp-api-2.1-6.1.14.jar:/usr/local/hama-trunk/bin/../lib/junit-4.8.1.jar:/usr/local/hama-trunk/bin/../lib/log4j-1.2.16.jar:/usr/local/hama-trunk/bin/../lib/netty-3.2.6.Final.jar:/usr/local/hama-trunk/bin/../lib/servlet-api-6.0.32.jar:/usr/local/hama-trunk/bin/../lib/slf4j-api-1.5.8.jar:/usr/local/hama-trunk/bin/../lib/slf4j-log4j12-1.5.8.jar:/usr/local/hama-trunk/bin/../lib/snappy-java-1.0.4.1.jar:/usr/local/hama-trunk/bin/../lib/zookeeper-3.4.5.jar::/tmp/hama-hduser/bsp/local/groomServer/attempt_201301221058_0001_000001_0/work/classes:/tmp/hama-hduser/bsp/local/groomServer/attempt_201301221058_0001_000001_0/work,
>>>> org.apache.hama.bsp.GroomServer$BSPPeerChild, 127.0.0.1, 59102,
>>>> attempt_201301221058_0001_000001_0, node30, -1, RUNNING]
>>>> at
>>>> org.apache.hama.bsp.TaskRunner$BspChildRunner.call(TaskRunner.java:126)
>>>> at
>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>> at
>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>>>> at
>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>>>> at
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>> at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>> at java.lang.Thread.run(Thread.java:722)
>>>> 2013-01-22 11:04:39,789 INFO org.apache.hama.bsp.GroomServer: adding
>>>> purge task: attempt_201301221058_0001_000001_0
>>>> 2013-01-22 11:04:39,884 INFO org.apache.hama.bsp.GroomServer: About to
>>>> purge task: attempt_201301221058_0001_000001_0
>>>> 2013-01-22 11:04:41,170 INFO org.apache.hama.bsp.GroomServer: Kill 1 tasks.
>>>> 2013-01-22 11:04:41,219 INFO org.apache.hama.bsp.GroomServer: Kill 1 tasks.
>>>> 2013-01-22 11:04:41,589 INFO org.apache.hama.bsp.GroomServer: Kill 1 tasks.
>>>>
>>>>
>>>> What can it be? Some configuration? To not get the message of Java
>>>> Heap Space configured:
>>>>
>>>> <name>bsp.child.java.opts</name>
>>>> <value>-Xmx6144m</value>
>>>>
>>>> I have only 4 GB of memory in each Groomserver, can it be that? I am
>>>> using the trunk version of Hama.
>>>>
>>>> --
>>>> Francisco Sanches
>>>
>>>
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
>>
>>
>>
>> --
>> Francisco Sanches
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon

--
Francisco Sanches
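
A note on the "nonzero status of 137" in the groom log quoted above. Under the usual Unix convention, an exit status above 128 means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9, SIGKILL. A JVM that simply exhausts its heap would normally report a Java Heap Space error instead; a SIGKILL comes from outside the process. On a node that is deep in swap, the Linux out-of-memory killer is one plausible sender, but that is an assumption that only the kernel log on node30 could confirm. A minimal sketch of the decoding follows (POSIX only; `describe_exit_status` is a hypothetical helper name, not part of Hama or Hadoop):

```python
import signal


def describe_exit_status(status):
    """Interpret a child-process exit status using the 128 + N signal convention."""
    if status > 128:
        try:
            # signal.Signals maps a signal number to its symbolic name.
            name = signal.Signals(status - 128).name
        except ValueError:
            # Not a known signal number on this platform.
            return f"exited with code {status}"
        return f"killed by {name}"
    return f"exited with code {status}"


print(describe_exit_status(137))
```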
