Hama Scheduler
Hi,

I want to implement a scheduler for Hama as my thesis project. Here are my ideas:

1. Make the split size as big as possible, so that more jobs can run on the cluster.
2. Let the scheduler select a job based on the job's type, such as a graph job (message intensive) or an iterative job.

Any advice?

Best regards!
Re: Hama Scheduler
BSPMaster makes use of TaskScheduler for scheduling tasks.

BSPMaster.java

    Class<? extends TaskScheduler> schedulerClass = conf.getClass(
        "bsp.master.taskscheduler", SimpleTaskScheduler.class,
        TaskScheduler.class);
    this.taskScheduler = ReflectionUtils.newInstance(schedulerClass, conf);

Then in SimpleTaskScheduler, tasks are scheduled through the schedule function, and the tasks are obtained from the related JobInProgress. That's roughly the execution path.

IIRC the split size is more related to a single job, so increasing the split size may not allow more jobs to be run on the same cluster.

At the moment the scheduling mechanism works by creating a task per GroomServer, and each GroomServer allows up to 3 tasks by default (maxTasks). So increasing maxTasks may be a way to increase the number of jobs running concurrently. Alternatively, restricting the tasks scheduled to each GroomServer and then scheduling the tasks of a new job to the free slots may also help increase concurrent job execution.

Right now scheduling is just simple FCFS. Improvements are welcome, for example adding something like a policy so that the scheduling mechanism is more flexible.

On 24 December 2013 20:06, Yuesheng Hu yueshen...@gmail.com wrote:
> Hi, I want to implement a scheduler for hama as my thesis project. Here are
> my ideas:
> 1. make the split size as big as possible, so more jobs can be run on the
> cluster;
> 2. the scheduler can select a job based on the job's type, like graph job
> (message intensive) or iterative job
> Any advice? Best Regards!
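To make the "policy" idea above more concrete, here is a minimal, self-contained sketch of how a pluggable job-selection policy could look. It deliberately does not use Hama's TaskScheduler or JobInProgress: every type below (JobType, PendingJob, SchedulingPolicy, FcfsPolicy, TypePreferringPolicy) is hypothetical and exists only for illustration. It compares strict FCFS with a policy that prefers one job type among the jobs that fit into the currently free slots.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustration only: none of these types are Hama classes.
final class SchedulerPolicySketch {
    static List<PendingJob> demoQueue() {
        return Arrays.asList(
            new PendingJob("job-1", JobType.ITERATIVE, 1, 2),
            new PendingJob("job-2", JobType.GRAPH, 2, 2),
            new PendingJob("job-3", JobType.GRAPH, 3, 5));
    }

    public static void main(String[] args) {
        List<PendingJob> queue = demoQueue();
        int freeSlots = 3; // e.g. one idle worker with three task slots
        System.out.println(new FcfsPolicy()
            .select(queue, freeSlots).map(j -> j.id).orElse("none"));
        System.out.println(new TypePreferringPolicy(JobType.GRAPH)
            .select(queue, freeSlots).map(j -> j.id).orElse("none"));
    }
}

enum JobType { GRAPH, ITERATIVE }

final class PendingJob {
    final String id;
    final JobType type;
    final long submitOrder; // lower value means submitted earlier
    final int tasks;        // number of task slots the job needs

    PendingJob(String id, JobType type, long submitOrder, int tasks) {
        this.id = id;
        this.type = type;
        this.submitOrder = submitOrder;
        this.tasks = tasks;
    }
}

interface SchedulingPolicy {
    // Returns the job to start next, or empty if nothing should start now.
    Optional<PendingJob> select(List<PendingJob> queue, int freeSlots);
}

// Strict first-come-first-served: only the oldest job is considered.
// If it does not fit into the free slots, nothing is started.
final class FcfsPolicy implements SchedulingPolicy {
    @Override
    public Optional<PendingJob> select(List<PendingJob> queue, int freeSlots) {
        return queue.stream()
            .min(Comparator.comparingLong((PendingJob j) -> j.submitOrder))
            .filter(j -> j.tasks <= freeSlots);
    }
}

// Among the jobs that fit, prefer one job type; break ties by submit order.
final class TypePreferringPolicy implements SchedulingPolicy {
    private final JobType preferred;

    TypePreferringPolicy(JobType preferred) {
        this.preferred = preferred;
    }

    @Override
    public Optional<PendingJob> select(List<PendingJob> queue, int freeSlots) {
        return queue.stream()
            .filter(j -> j.tasks <= freeSlots)
            .min(Comparator
                // Boolean false sorts before true, so preferred jobs come first.
                .comparing((PendingJob j) -> j.type != preferred)
                .thenComparingLong(j -> j.submitOrder));
    }
}
```

With the demo queue and three free slots, the FCFS policy picks job-1 and the graph-preferring policy picks job-2 (job-3 needs five slots and is skipped). A real implementation would have to plug into whatever TaskScheduler exposes; this sketch only shows the selection step.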
Re: Hama Scheduler
Hi Lin,

Thank you for your reply.

2013/12/24 Chia-Hung Lin cli...@googlemail.com
[jira] [Commented] (HAMA-833) Add more finish states for vertices
[ https://issues.apache.org/jira/browse/HAMA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856383#comment-13856383 ]

Ilias Kapouranis commented on HAMA-833:
---------------------------------------

Yeah having this flexibility in aggregating will be great.

> Add more finish states for vertices
> -----------------------------------
>
>                 Key: HAMA-833
>                 URL: https://issues.apache.org/jira/browse/HAMA-833
>             Project: Hama
>          Issue Type: Improvement
>          Components: graph
>    Affects Versions: 0.6.3
>            Reporter: Anastasis Andronidis
>            Assignee: Anastasis Andronidis
>            Priority: Minor
>              Labels: features
>             Fix For: 0.7.0
>
> We should handle more cases on the vertices, like:
> 1) voteToStop() : Immediately stop the vertex compute and suppress any further calculations on top of that. (e.g. aggregation)
> 2) voteToTerminate(): Immediately stop the vertex compute, suppress any further calculations on top of that and deactivate the vertex so even if any message reaches it, will not come alive.
> Any comments?

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
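The difference between the two proposed finish states can be sketched as a small state machine. This is not Hama code: the class VertexLifecycleSketch and its methods onMessage() and contributesToAggregation() are hypothetical, and the sketch assumes that a vertex which called voteToStop() becomes active again when a message arrives, which the issue text implies but does not state outright.

```java
// Illustration only: a hypothetical model of the states proposed in HAMA-833.
final class VertexLifecycleSketch {
    enum State { ACTIVE, STOPPED, TERMINATED }

    private State state = State.ACTIVE;

    // Stop computing now and suppress follow-up work such as aggregation.
    void voteToStop() {
        if (state != State.TERMINATED) {
            state = State.STOPPED;
        }
    }

    // Stop computing and deactivate permanently.
    void voteToTerminate() {
        state = State.TERMINATED;
    }

    // Assumption: a stopped vertex wakes up on a message, a terminated one never does.
    void onMessage() {
        if (state == State.STOPPED) {
            state = State.ACTIVE;
        }
    }

    // Only active vertices take part in further calculations such as aggregation.
    boolean contributesToAggregation() {
        return state == State.ACTIVE;
    }

    State state() {
        return state;
    }

    public static void main(String[] args) {
        VertexLifecycleSketch v = new VertexLifecycleSketch();
        v.voteToStop();
        System.out.println(v.state() + " aggregates=" + v.contributesToAggregation());
        v.onMessage();
        System.out.println(v.state());
        v.voteToTerminate();
        v.onMessage();
        System.out.println(v.state());
    }
}
```

In this model the only difference between the two calls is what happens on the next incoming message: a stopped vertex returns to ACTIVE, while a terminated vertex stays TERMINATED.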
Build failed in Jenkins: Hama-Nightly-for-Hadoop-2.x #128
See https://builds.apache.org/job/Hama-Nightly-for-Hadoop-2.x/128/

--
[...truncated 198 lines...]
    ... 32 more
Caused by: svn: E175002: OPTIONS request failed on '/repos/asf/hama/trunk'
    at org.tmatesoft.svn.core.SVNErrorMessage.create(SVNErrorMessage.java:208)
    at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection._request(HTTPConnection.java:775)
    ... 33 more
Caused by: svn: E175002: timed out waiting for server
    at org.tmatesoft.svn.core.SVNErrorMessage.create(SVNErrorMessage.java:208)
    at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection._request(HTTPConnection.java:514)
    ... 33 more
Caused by: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385)
    at java.net.Socket.connect(Socket.java:546)
    at org.tmatesoft.svn.core.internal.util.SVNSocketConnection.run(SVNSocketConnection.java:57)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    ... 5 more
java.io.IOException: remote file operation failed: https://builds.apache.org/job/Hama-Nightly-for-Hadoop-2.x/ws/ at hudson.remoting.Channel@7a760048:ubuntu5
    at hudson.FilePath.act(FilePath.java:910)
    at hudson.FilePath.act(FilePath.java:887)
    at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:848)
    at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:786)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1414)
    at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:652)
    at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:561)
    at hudson.model.Run.execute(Run.java:1677)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:231)
Caused by: java.io.IOException: Failed to check out http://svn.apache.org/repos/asf/hama/trunk
    at hudson.scm.subversion.CheckoutUpdater$1.perform(CheckoutUpdater.java:110)
    at hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:161)
    at hudson.scm.SubversionSCM$CheckOutTask.perform(SubversionSCM.java:908)
    at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:889)
    at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:872)
    at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2461)
    at hudson.remoting.UserRequest.perform(UserRequest.java:118)
    at hudson.remoting.UserRequest.perform(UserRequest.java:48)
    at hudson.remoting.Request$2.run(Request.java:328)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:679)
Caused by: org.tmatesoft.svn.core.SVNException: svn: E175002: OPTIONS /repos/asf/hama/trunk failed
    at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:388)
    at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:373)
    at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:361)
    at org.tmatesoft.svn.core.internal.io.dav.DAVConnection.performHttpRequest(DAVConnection.java:707)
    at org.tmatesoft.svn.core.internal.io.dav.DAVConnection.exchangeCapabilities(DAVConnection.java:627)
    at org.tmatesoft.svn.core.internal.io.dav.DAVConnection.open(DAVConnection.java:102)
    at org.tmatesoft.svn.core.internal.io.dav.DAVRepository.openConnection(DAVRepository.java:1020)
    at org.tmatesoft.svn.core.internal.io.dav.DAVRepository.getLatestRevision(DAVRepository.java:180)
    at org.tmatesoft.svn.core.internal.wc16.SVNBasicDelegate.getRevisionNumber(SVNBasicDelegate.java:480)
    at org.tmatesoft.svn.core.internal.wc16.SVNBasicDelegate.getLocations(SVNBasicDelegate.java:833)
    at org.tmatesoft.svn.core.internal.wc16.SVNBasicDelegate.createRepository(SVNBasicDelegate.java:527)
    at
Jenkins build is back to normal : Hama-Nightly-for-Hadoop-1.x #1131
See https://builds.apache.org/job/Hama-Nightly-for-Hadoop-1.x/1131/