[ https://issues.apache.org/jira/browse/TAJO-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890213#comment-13890213 ]

Min Zhou edited comment on TAJO-540 at 2/4/14 2:12 AM:
-------------------------------------------------------

Ok, I got time to write a more detailed plan for this ticket.

Historically, the first scheduler in the Hadoop ecosystem was the JobTracker 
in MapReduce. The JobTracker actually plays two roles in a MapReduce cluster: 
one is resource management and the other is job task scheduling. Because the 
JobTracker plays both roles, its job response time and scalability are not 
good. Google, the originator of MapReduce, ran into the same kind of issue and 
later started a project named Borg, with one of its goals being to address 
this problem. Borg became the cluster resource management scheduler at Google, 
and according to their paper its current version is named Omega (see 
https://medium.com/large-scale-data-processing/a7a81f278e6f ).

Later, this kind of resource scheduler came into our view: Mesos and Hadoop 
YARN. The difference between the two is that Mesos supports gang scheduling 
while YARN supports incremental scheduling. Both of them divide cluster 
scheduling into two layers. The higher layer is resource management, which is 
the responsibility of Mesos and YARN themselves: they control the resources 
given to each application/framework/job. Meanwhile, the JobTracker's other 
role, job task scheduling, is pushed down into a lower layer: each 
application/framework/job has its own master, which coordinates the tasks of 
that one application/framework/job.
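
To make the two-layer split concrete, here is a minimal Python sketch. It is 
only an illustration under simplifying assumptions: ResourceManager, 
ApplicationMaster, and the notion of a "slot" are hypothetical names chosen for 
this example, not the Mesos, YARN, or Tajo APIs.

```python
class ResourceManager:
    """Upper layer: only decides how many slots each application gets.

    Hypothetical model for illustration; not a Mesos/YARN/Tajo class.
    """

    def __init__(self, total_slots):
        self.free_slots = total_slots

    def allocate(self, requested):
        # Grant as many slots as are free, up to the requested amount.
        granted = min(requested, self.free_slots)
        self.free_slots -= granted
        return granted


class ApplicationMaster:
    """Lower layer: schedules the tasks of exactly one application."""

    def __init__(self, name, tasks):
        self.name = name
        self.pending = list(tasks)
        self.running = []

    def schedule(self, resource_manager):
        # Ask the upper layer for slots, then place our own tasks on them.
        granted = resource_manager.allocate(len(self.pending))
        self.running.extend(self.pending[:granted])
        self.pending = self.pending[granted:]
        return granted


rm = ResourceManager(total_slots=4)
query_a = ApplicationMaster("query-a", ["t1", "t2", "t3"])
query_b = ApplicationMaster("query-b", ["t4", "t5", "t6"])
query_a.schedule(rm)  # gets 3 slots
query_b.schedule(rm)  # only 1 slot is left, so two tasks stay pending
```

The point of the split is that the upper layer never looks at individual 
tasks, and each master never looks at other applications.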

From our benchmarking, a job with 10 tasks that each sleep for zero ms took 
about 20 seconds on Hadoop 1.0 because of the JobTracker's scheduling, and 
Hadoop YARN takes about the same amount of time. What we need here is not a 
scheduler like MRAppMaster but a low-latency scheduler. From Jeff Dean's paper 
( http://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/abstract ), we 
can see that Google is, as usual, ahead of us: they developed a technique 
called tied requests to meet low-latency requirements. Please see the tied 
request section in 
http://static.googleusercontent.com/media/research.google.com/en//people/jeff/MIT_BigData_Sep2012.pdf
if you can't download the ACM paper.

What we actually need here is scheduling along the lines of Google's tied 
requests. Fortunately, we have a good candidate, Sparrow, which was actually 
the scheduler of the first version of Impala (the C++ version) and will be 
plugged into Spark.
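
For readers unfamiliar with tied requests, the idea as described in the Tail 
at Scale paper is to enqueue the same request on two servers, tell each copy 
which server holds the other one, and have whichever server starts first 
cancel its twin. Below is a minimal single-threaded Python sketch of that idea. 
The Server class and submit_tied function are hypothetical names for 
illustration, not Sparrow or Tajo code, and the sketch ignores the case where 
both servers start the request before either cancellation arrives.

```python
from collections import deque


class Server:
    """A worker with a FIFO queue (hypothetical model for illustration)."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()
        self.executed = []

    def enqueue(self, request_id, peer=None):
        # A tied copy remembers which server holds the other copy.
        self.queue.append((request_id, peer))

    def cancel(self, request_id):
        # Drop the queued copy if it has not started yet.
        self.queue = deque(
            item for item in self.queue if item[0] != request_id
        )

    def run_next(self):
        # Start the request at the head of the queue and cancel its twin.
        if not self.queue:
            return None
        request_id, peer = self.queue.popleft()
        if peer is not None:
            peer.cancel(request_id)
        self.executed.append(request_id)
        return request_id


def submit_tied(request_id, first, second):
    # Enqueue the same request on both servers, each tied to the other.
    first.enqueue(request_id, peer=second)
    second.enqueue(request_id, peer=first)


busy, idle = Server("busy"), Server("idle")
busy.enqueue("backlog")        # untied work already waiting on 'busy'
submit_tied("q1", busy, idle)
idle.run_next()                # 'idle' starts q1 and cancels the copy on 'busy'
busy.run_next()                # 'busy' runs its backlog; q1 is no longer queued
```

The request ends up on whichever server reaches it first, so a single backed-up 
queue does not delay it.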

I'd like to port Sparrow into Tajo, but before that I think we need to discuss 
a few things first, because the structure will change radically.

To be continued.
 





> (Umbrella) Implement Tajo Query Scheduler
> -----------------------------------------
>
>                 Key: TAJO-540
>                 URL: https://issues.apache.org/jira/browse/TAJO-540
>             Project: Tajo
>          Issue Type: New Feature
>            Reporter: Hyunsik Choi
>
> Currently, there is no Tajo query scheduler, so all queries launched 
> simultaneously compete for cluster resources, which are managed by 
> TajoResourceManager.
> In this issue, we will investigate, design, and implement a Tajo query 
> scheduler. This is an umbrella issue for that work; we will create subtasks 
> for it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
