[
https://issues.apache.org/jira/browse/TAJO-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890213#comment-13890213
]
Min Zhou edited comment on TAJO-540 at 2/4/14 2:12 AM:
-------------------------------------------------------
OK, I finally have time to write a more detailed plan for this ticket.
Historically, the first scheduler in the Hadoop ecosystem was the JobTracker
in MapReduce. The JobTracker plays two roles in a MapReduce cluster: resource
management and job task scheduling. Because it plays both roles, its job
response time and scalability are poor. Google, where MapReduce originated,
ran into the same kind of issue and later started a project named Borg, with
one of its goals being to address this problem. Borg became Google's cluster
resource management scheduler, and according to their paper its current
version is called Omega (see
https://medium.com/large-scale-data-processing/a7a81f278e6f ).
Later, resource schedulers of this kind came into our view: Mesos and Hadoop
YARN. The difference between the two is that Mesos supports gang scheduling
while YARN supports incremental scheduling. Both divide cluster scheduling
into two layers. The higher layer is resource management, which is the
responsibility of Mesos or YARN itself: they control the resources given to
each application/framework/job. The JobTracker's other role, job task
scheduling, is pushed down into a lower layer, where each
application/framework/job's master coordinates the tasks of its own
application/framework/job.
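The two-layer split described above can be illustrated with a small toy model.
This is only a sketch under stated assumptions: the class names
ResourceManager and AppMaster, the slot-counting model, and every method here
are hypothetical and are not the YARN, Mesos, or Tajo APIs.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Upper layer (hypothetical): only tracks slots held per application."""
    total_slots: int
    granted: dict = field(default_factory=dict)

    def free_slots(self) -> int:
        return self.total_slots - sum(self.granted.values())

    def request(self, app_id: str, slots: int) -> int:
        """Grant up to `slots` slots; a partial grant is possible."""
        n = min(slots, self.free_slots())
        self.granted[app_id] = self.granted.get(app_id, 0) + n
        return n

    def release(self, app_id: str, slots: int) -> None:
        self.granted[app_id] -= slots


@dataclass
class AppMaster:
    """Lower layer (hypothetical): one per application, schedules its tasks."""
    app_id: str
    pending: list

    def run(self, rm: ResourceManager) -> list:
        finished = []
        while self.pending:
            slots = rm.request(self.app_id, len(self.pending))
            if slots == 0:
                raise RuntimeError("no free slots in this toy cluster")
            # The master, not the resource manager, decides which tasks run.
            batch, self.pending = self.pending[:slots], self.pending[slots:]
            finished.extend(batch)  # stand-in for actually running the tasks
            rm.release(self.app_id, slots)
        return finished
```

The resource manager never sees individual tasks, only slot counts per
application, which is the point of the split.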
From our benchmarking, a job with 10 tasks that each sleep for zero ms took
about 20 seconds on Hadoop 1.0 because of the JobTracker's scheduling, and
Hadoop YARN takes a similar amount of time. What we need here is not a
scheduler like MRAppMaster but a low-latency scheduler. From Jeff Dean's paper
( http://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/abstract ), we
learn that Google is, as usual, ahead of us: they developed a technique called
tied requests to meet low-latency requirements. If you can't download the ACM
paper, please see the tied request section in
http://static.googleusercontent.com/media/research.google.com/en//people/jeff/MIT_BigData_Sep2012.pdf
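As a rough illustration of the tied request idea from the paper cited above:
the same request is queued on two servers, each copy knows about the other,
and whichever server starts it first cancels the other copy. The sketch below
is a single-process toy model. The names Server and submit_tied are
hypothetical, and it ignores network delay and the case where both copies
start at the same moment.

```python
from collections import deque


class Server:
    """Toy server (hypothetical) with a FIFO queue of pending requests."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()  # (request_id, peer) pairs; peer may be None
        self.executed = []

    def enqueue(self, request_id, peer=None):
        self.queue.append((request_id, peer))

    def cancel(self, request_id):
        # Drop a queued copy that another server has already started.
        self.queue = deque(item for item in self.queue if item[0] != request_id)

    def step(self):
        """Start the next queued request, cancelling its tied copy first."""
        if not self.queue:
            return None
        request_id, peer = self.queue.popleft()
        if peer is not None:
            peer.cancel(request_id)
        self.executed.append(request_id)
        return request_id


def submit_tied(request_id, first, second):
    """Queue the same request on two servers, each aware of the other."""
    first.enqueue(request_id, peer=second)
    second.enqueue(request_id, peer=first)
```

If one server has a backlog and the other is idle, the idle one picks the
request up immediately and the queued copy on the busy server is removed, so
the request does not wait behind the backlog and is not executed twice.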
What we actually need here is scheduling in the style of Google's tied
requests. Fortunately, we have a good candidate, Sparrow, which was the
scheduler of the first version of Impala (the C++ version) and will be plugged
into Spark.
I'd like to port Sparrow into Tajo, but before that I think we need to discuss
a few things first, because the structure will change radically.
To be continued.
> (Umbrella) Implement Tajo Query Scheduler
> -----------------------------------------
>
> Key: TAJO-540
> URL: https://issues.apache.org/jira/browse/TAJO-540
> Project: Tajo
> Issue Type: New Feature
> Reporter: Hyunsik Choi
>
> Currently, there is no Tajo query scheduler, so all queries launched
> simultaneously compete for cluster resources, which are managed by
> TajoResourceManager.
> In this issue, we will investigate, design, and implement a Tajo query
> scheduler. This is an umbrella issue for that work, and we will create
> subtasks for it.