Some additional information (Hama and Zookeeper are now running in debug mode).
However, the examples still don't work. I have included only the parts that seem most relevant to me.

--------------------------
In the folder tasklogs:
-----------------------------
13/09/05 10:02:12 DEBUG security.Groups: Creating new Groups object
13/09/05 10:02:12 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
...
According to what I found on the Hadoop forum / JIRA, the line below is harmless:
conf.Configuration: java.io.IOException: config()
...
13/09/05 10:02:36 DEBUG security.UserGroupInformation: hadoop login
13/09/05 10:02:36 DEBUG security.UserGroupInformation: hadoop login commit
13/09/05 10:02:38 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser
13/09/05 10:02:38 DEBUG security.UserGroupInformation: UGI loginUser:hduser
...

-----------------------------
File bspmaster.log
-----------------------------
2013-09-05 10:00:55,284 INFO org.apache.hama.bsp.JobInProgress: Job is initialized.
2013-09-05 10:00:55,676 DEBUG org.apache.hama.bsp.Counters: Adding LAUNCHED_TASKS
2013-09-05 10:01:05,945 DEBUG org.apache.hama.bsp.Counters: Adding SUPERSTEP_SUM
...
2013-09-05 10:02:15,201 DEBUG org.apache.zookeeper.ClientCnxn: Got ping response for sessionid: 0x140ed8a99a10001 after 120ms
....
2013-09-05 10:02:46,487 DEBUG org.apache.hama.bsp.Counters: Adding SUPERSTEP_SUM
2013-09-05 10:02:46,531 INFO org.apache.hama.bsp.JobInProgress: Taskid 'attempt_201309050955_0001_000002_0' has failed.
....
2013-09-05 10:02:46,575 DEBUG org.apache.hama.bsp.JobInProgress: Removing /tmp/hadoop-hduser/bsp/local/bspMaster/job_201309050955_0001.xml and /tmp/hadoop-hduser/bsp/local/bspMaster/job_201309050955_0001.jar getJobFile = hdfs://localhost:54310/tmp/hadoop-hduser/bsp/system/submit_elbm0i/job.xml
2013-09-05 10:02:47,175 INFO org.apache.hama.bsp.JobInProgress: Job failed.
2013-09-05 10:02:47,187 DEBUG org.apache.hama.bsp.JobInProgress: Removing null and null getJobFile = hdfs://localhost:54310/tmp/hadoop-hduser/bsp/system/submit_elbm0i/job.xml
2013-09-05 10:02:47,811 DEBUG org.apache.hama.bsp.Counters: Adding SUPERSTEP_SUM
2013-09-05 10:02:48,531 DEBUG org.apache.hama.bsp.Counters: Adding SUPERSTEP_SUM
2013-09-05 10:03:35,246 DEBUG org.apache.zookeeper.ClientCnxn: Got ping response for sessionid: 0x140ed8a99a10001 after 48ms
...

-----------
File zookeeper.out (without errors)
-----------
...
2013-09-05 09:56:27,760 [myid:] - INFO [SyncThread:0:ZooKeeperServer@595] - Established session 0x140ed8a99a10000 with negotiated timeout 240000 for client /127.0.0.1:45872
2013-09-05 09:58:36,924 [myid:] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:21810:NIOServerCnxnFactory@197] - Accepted socket connection from /127.0.0.1:45876
2013-09-05 09:58:37,120 [myid:] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:21810:ZooKeeperServer@839] - Client attempting to establish new session at /127.0.0.1:45876
2013-09-05 09:58:37,272 [myid:] - INFO [SyncThread:0:ZooKeeperServer@595] - Established session 0x140ed8a99a10001 with negotiated timeout 240000 for client /127.0.0.1:45876
...

Any tips on what the problem might be?


2013/9/5 Júlio Pires <[email protected]>

> In this scenario I was not using an external Zookeeper (Hama starts an
> embedded one, correct?).
>
> I redid my environment. I have now located the problem, but not the
> solution.
>
> On physical machines, the example works correctly (localhost/pseudo
> distributed mode).
> In tests on virtual machines via qemu, it also works.
>
> But when I run the tests on virtual machines in the cloud, it doesn't work.
>
> Is there any specific configuration to run on virtual machines in the cloud?
>
>
>
> 2013/9/4 Edward J. Yoon <[email protected]>
>
>> Please see the zookeeper logs and check whether zookeeper is running.
>> Also, you need to check the "hama.zookeeper.quorum",
>> "hama.zookeeper.property.clientPort" properties are correct.
>>
>>
>>
>> On Wed, Sep 4, 2013 at 10:17 AM, Júlio Pires <[email protected]> wrote:
>> > Thanks Edward!
>> >
>> > I'm running with the following scenario:
>> > JDK 7
>> > Hadoop: 1.2.1 (is compatible, correct?)
>> > Hama: 0.6.3
>> >
>> > groomservers and hama.zookeeper.quorum contain the addresses of all three
>> > machines (master, slave_a and slave_b).
>> >
>> > However, when trying to run the PI example, the following errors occur:
>> > ------------------------------------------------
>> > ERROR 1 - File (hama-bspmaster.log)
>> > -------------------------------------------------
>> > 2013-09-04 02:55:39,369 INFO org.apache.hama.bsp.sync.ZKSyncBSPMasterClient: Initialized ZK false
>> > 2013-09-04 02:55:39,369 INFO org.apache.hama.bsp.sync.ZKSyncClient: Initializing ZK Sync Client
>> > 2013-09-04 02:55:39,513 *ERROR org.apache.hama.bsp.sync.ZKSyncBSPMasterClient: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp*
>> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>> >     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
>> >     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
>> >     at org.apache.hama.bsp.sync.ZKSyncBSPMasterClient.init(ZKSyncBSPMasterClient.java:62)
>> >     at org.apache.hama.bsp.BSPMaster.initZK(BSPMaster.java:509)
>> >     at org.apache.hama.bsp.BSPMaster.startMaster(BSPMaster.java:492)
>> >     at org.apache.hama.bsp.BSPMaster.startMaster(BSPMaster.java:475)
>> >     at org.apache.hama.BSPMasterRunner.run(BSPMasterRunner.java:46)
>> >
>> > -----------------------------------------------
>> > ERROR 2 - File: tasklog/attempt_201309040255_0001_000003_0.log
>> > -----------------------------------------------
>> >
>> > 13/09/04 02:57:21 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At /192.168.122.127:61001
>> > 13/09/04 02:57:22 *ERROR sync.ZooKeeperSyncClientImpl: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /bsp/job_201309040255_0001/peers*
>> > 13/09/04 02:57:22 INFO ipc.Server: Starting SocketReader
>> > 13/09/04 02:57:22 INFO ipc.Server: IPC Server Responder: starting
>> > 13/09/04 02:57:22 INFO ipc.Server: IPC Server listener on 61001: starting
>> >
>> > What can it be?
>> >
>> >
>> >
>> > 2013/9/3 Edward J. Yoon <[email protected]>
>> >
>> >> > Questions:
>> >> > 1) In the files "groomservers" and hama-site.xml (property
>> >> > hama.zookeeper.quorum), should all three machines be included, or just
>> >> > slave_a and slave_b?
>> >>
>> >> There's no great difference either way. If the BSP application
>> >> requires a large amount of RAM and hdfs-namenode is also started on
>> >> the master server, the latter is best (large cluster case).
>> >>
>> >> > 2) With the environment running, how do I add a new slave machine
>> >> > on the fly?
>> >>
>> >> Just start a groom server to add the new slave machine:
>> >>
>> >> user@new_slave_c $ bin/hama-daemons.sh --config $HAMA_CONF_DIR start groom
>> >>
>> >> Then, it will be recognized automatically.
>> >>
>> >> On Wed, Sep 4, 2013 at 3:58 AM, Júlio Pires <[email protected]> wrote:
>> >> > Hello,
>> >> >
>> >> > I have three machines, named as follows: master, slave_a, slave_b
>> >> >
>> >> > Questions:
>> >> > 1) In the files "groomservers" and hama-site.xml (property
>> >> > hama.zookeeper.quorum), should all three machines be included, or just
>> >> > slave_a and slave_b?
>> >> >
>> >> > 2) With the environment running, how do I add a new slave machine
>> >> > on the fly?
>> >> >
>> >> > Thanks!
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards, Edward J. Yoon
>> >> @eddieyoon
>> >>
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>
>
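
For anyone following the advice quoted above (check that Zookeeper is running and that "hama.zookeeper.quorum" and "hama.zookeeper.property.clientPort" point at it), here is a minimal sketch of a reachability check that can be run from each machine. It sends Zookeeper's four-letter "ruok" command and expects "imok" back. The function name zk_ruok is hypothetical, port 21810 is simply the one that appears in zookeeper.out above, and the sketch assumes four-letter commands are enabled on the server (newer Zookeeper releases may restrict them), so a False result only means "no healthy answer", not necessarily "not running".

```python
import socket


def zk_ruok(host, port, timeout=3.0):
    """Return True if host:port answers ZooKeeper's 'ruok' with 'imok'.

    Hypothetical helper for illustration; any connection, DNS, or timeout
    error is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # Read until we have the 4-byte answer or the server closes.
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        return False
```

Calling zk_ruok("master", 21810) from slave_a and slave_b (and the reverse) would show whether the port named in the quorum settings is reachable between the cloud VMs; the host names are the ones used in this thread.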
