> 1) each processor (node) can (and normally does) execute several BSP
> Peers (threads). I see the BSPPeerInterface as an abstraction for a
> peer to send and receive messages to/from other peers. To send a
> message to other peer we need to know the host/ip + port of the node
> where that other peer is running. There's no way to dynamically find
> out how many peers there are and their addresses. So this is hardcoded
> into the job (an implementation of the BSP class), right?

Refer to: http://wiki.apache.org/hama/Architecture#The_BSP_Algorithm_for_Pi

In the Pi example, one task collects the results, like the reduce phase
of M/R. I think the hardcoded part can be replaced with a task ID. Each
of the N tasks will have a unique integer ID (0 ~ N). (In short, the
reason is related to numerical matrix computing.)

> 2) After running the PI example, I got absolutely no output from the
> job. The LOG.info() calls in PiEstimator.MyEstimator.bsp() didn't make
> their way into stdout/stderr. Is this the goal of the input/output
> system? To redirect a job's output to somewhere (maybe a file in hdfs)
> ?

If you run only one groom server, try again after setting the value of
the "bsp.peers.num" property in hama-default.xml to "1". My test works,
and the output appears in the log as below:

10/09/29 14:15:09 DEBUG bsp.BSPPeer: [slave.udanax.org:61000] enter the enterbarrier
10/09/29 14:15:10 DEBUG bsp.BSPPeer: [slave.udanax.org:61000] leave from the leaveBarrier
10/09/29 14:15:10 INFO examples.PiEstimator$MyEstimator: Receives messages:3.1848
10/09/29 14:15:10 INFO examples.PiEstimator$MyEstimator: Receives messages:3.1344
10/09/29 14:15:10 INFO examples.PiEstimator$MyEstimator: Estimated value of PI is 2.3634
..

I'm looking at the remaining issues.

On Wed, Sep 29, 2010 at 9:55 AM, Filipe David Manana <[email protected]> wrote:
> Edward,
>
> Thanks for the explanation.
> These steps and the flow are now clear to me.
>
> I just ran the PI example.
>
> Had to change the line:
>
> bspPeer.send(new InetSocketAddress("slave.udanax.org", 61000), estimate);
>
> to
>
> bspPeer.send(new InetSocketAddress("localhost", 61000), estimate);
>
>
> Using the BSP model terminology, I have some questions regarding the
> BSPPeer class:
>
> 1) each processor (node) can (and normally does) execute several BSP
> Peers (threads). I see the BSPPeerInterface as an abstraction for a
> peer to send and receive messages to/from other peers.
> To send a message to other peer we need to know the host/ip + port of
> the node where that other peer is running. There's no way to
> dynamically find out how many peers there are and their addresses. So
> this is hardcoded into the job (an implementation of the BSP class),
> right?
>
> 2) In case we have several peers executing in the same node, how do
> they communicate with each other? I mean, right now it doesn't seem we
> support multiple peers running in the same node. But as soon as it's
> supported, each one will have a different address (same host/ip but
> different port), right?
>
>
> Some other more generic questions:
>
> 1) Is the config option bsp.groom.port (default 40020) equivalent to
> Constant.DEFAULT_PEER_PORT (61000) ? It seems to me the former
> (bsp.groom.port) is not used anywhere.
>
>
> 2) After running the PI example, I got absolutely no output from the
> job. The LOG.info() calls in PiEstimator.MyEstimator.bsp() didn't make
> their way into stdout/stderr. Is this the goal of the input/output
> system? To redirect a job's output to somewhere (maybe a file in hdfs)
> ?
>
>
> Btw, I ran the PI example like this:
>
> In one shell I started zookeeper: $ ./bin/hama zookeeper
> In other shell I started BSP Master: $ ./bin/hama bspmaster
> In another one I started a Groom Server (BSP Slave): $ ./bin/hama groom
>
> And finally, submitted the PI jar in another shell like this:
> ./bin/hama jar build/hama-0.2.0-dev-examples.jar pi
>
> Is this the correct way to launch the job example?
>
>
> cheers
>
> On Mon, Sep 27, 2010 at 5:20 AM, Edward J. Yoon <[email protected]> wrote:
>> Hello,
>>
>> When a user submits a job via the following command ($ bin/hama jar
>> build/hama-0.2.0-dev-examples.jar pi), the flow is as below:
>>
>> 1) The jar and job files (named by job ID) are copied from the local
>> file system to a temp directory (named by job ID) on HDFS.
>>
>> 10/09/27 11:57:27 DEBUG bsp.BSPJobClient: BSPJobClient.submitJobDir:
>> hdfs://slave.udanax.org:9000/tmp/hadoop-edward/bsp/system/job_201009271157_0001
>>
>> 2) Then they are copied from HDFS to the bspMaster's local file
>> system, in JobInProgress, because the job can be submitted from a
>> slave node.
>>
>> 10/09/27 11:57:27 DEBUG bsp.JobInProgress: JobInProgress.localJobFile:
>> /tmp/hadoop-edward/bsp/local/bspMaster/job_201009271157_0001.xml
>> 10/09/27 11:57:27 DEBUG bsp.JobInProgress: JobInProgress.localJarFile:
>> /tmp/hadoop-edward/bsp/local/bspMaster/job_201009271157_0001.jar
>>
>> These files on the bspMaster's local fs will be used for loading the
>> input formatter class and creating tasks on the bspMaster server. BTW,
>> since the input/output system is not planned for the Hama 0.2 version,
>> we don't need to care about them at this time.
>>
>> 3) "actions" will be sent to slave nodes via heartbeat, and tasks will
>> be launched by the launchTask() method.
>>
>> The localizing step is part of this process too, so that the BSP class
>> can be loaded and the user-defined BSP() method invoked on the slave
>> server.
>>
>> 10/09/27 12:27:29 DEBUG bsp.GroomServer: localJobFile:
>> /tmp/hadoop-edward/bsp/local/groomServer/task_0/job.xml
>>
>> On Mon, Sep 27, 2010 at 4:51 AM, Filipe David Manana
>> <[email protected]> wrote:
>>> Hi, one simple question:
>>>
>>>
>>> In BSPJobClient.java, method submitJobInternal() we have:
>>>
>>> Path submitJobDir = new Path(getSystemDir(), jobId.toString());
>>> Path submitJarFile = new Path(submitJobDir, "job.jar");
>>> Path submitJobFile = new Path(submitJobDir, "job.xml");
>>>
>>> getSystemDir() will call BSPMaster.getSystemDir() which reads the
>>> config property "bsp.system.dir".
>>>
>>> However, in the constructor of JobInProgress.java we have:
>>>
>>> this.localJobFile = master.getLocalPath(BSPMaster.SUBDIR + "/" + jobId
>>> + ".xml");
>>> this.localJarFile = master.getLocalPath(BSPMaster.SUBDIR + "/" + jobId
>>> + ".jar");
>>>
>>> and getLocalPath() (BSPMaster class) will read the config property
>>> "bsp.local.dir"
>>>
>>>
>>> So, aren't we supposed to use the same paths in both places for the
>>> job files? (either local or system dir, but not both)
>>>
>>> cheers
>>>
>>>
>>> --
>>> Filipe David Manana,
>>> [email protected], [email protected]
>>>
>>> "Reasonable men adapt themselves to the world.
>>> Unreasonable men adapt the world to themselves.
>>> That's why all progress depends on unreasonable men."
>>>
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> [email protected]
>> http://blog.udanax.org
>>
>
>
>
> --
> Filipe David Manana,
> [email protected], [email protected]
>
> "Reasonable men adapt themselves to the world.
> Unreasonable men adapt the world to themselves.
> That's why all progress depends on unreasonable men."
>

--
Best Regards, Edward J. Yoon
[email protected]
http://blog.udanax.org
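
The task-ID idea from the reply at the top of this thread could look roughly like the sketch below. This is not Hama code. PeerDirectory, BASE_PORT, the host list, and the layout rule are all hypothetical names and assumptions made only for illustration: task IDs are assumed to run from 0 to N-1, consecutive IDs are packed onto the same host, and each peer on a host gets its own port counting up from 61000 (the Constant.DEFAULT_PEER_PORT value mentioned in the thread). It only shows how a job could derive a peer address from a task ID instead of hardcoding a host and port.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Hama: derives a peer address from a task ID.
final class PeerDirectory {
    // Assumption: first peer port on every host, taken from the 61000 default in the thread.
    static final int BASE_PORT = 61000;

    private final List<String> hosts;  // one entry per node (assumed known up front)
    private final int peersPerHost;    // number of peers (threads) running on each node

    PeerDirectory(List<String> hosts, int peersPerHost) {
        if (hosts.isEmpty() || peersPerHost < 1) {
            throw new IllegalArgumentException("need at least one host and one peer per host");
        }
        this.hosts = new ArrayList<>(hosts);
        this.peersPerHost = peersPerHost;
    }

    // Total number of peers, so a job can iterate over all of them.
    int numPeers() {
        return hosts.size() * peersPerHost;
    }

    // Assumed layout: IDs 0..peersPerHost-1 live on the first host, and so on.
    // Peers on the same host share the host name but differ in port.
    InetSocketAddress addressOf(int taskId) {
        if (taskId < 0 || taskId >= numPeers()) {
            throw new IllegalArgumentException("unknown task id: " + taskId);
        }
        String host = hosts.get(taskId / peersPerHost);
        int port = BASE_PORT + (taskId % peersPerHost);
        // createUnresolved avoids a DNS lookup while building the table.
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        PeerDirectory dir = new PeerDirectory(List.of("hostA", "hostB"), 2);
        for (int id = 0; id < dir.numPeers(); id++) {
            InetSocketAddress a = dir.addressOf(id);
            System.out.println(id + " -> " + a.getHostString() + ":" + a.getPort());
        }
    }
}
```

With something like this, the collecting task in the Pi example could simply be "task 0", and every other task would send its estimate to dir.addressOf(0) rather than to a fixed host name. How the host list reaches each task is left open here.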
