[ 
https://issues.apache.org/jira/browse/TAJO-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893203#comment-13893203
 ] 

Min Zhou commented on TAJO-540:
-------------------------------

Go ahead. From my investigation, we can keep the YARN thread; it only needs 
some refactoring so that it exposes the same interface as standalone-mode 
scheduling.

Currently, standalone-mode scheduling is essentially centralized FIFO 
scheduling: if an earlier query occupies all of the workers' slots, the 
following query is blocked. We have two choices. The first is to replace the 
FIFO strategy with another one, such as fair share. But this kind of Hadoop 
JobTracker-like scheduling cannot achieve very low latency or good 
scalability. The second is to port Sparrow to Tajo. 
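
To illustrate the blocking problem, here is a minimal sketch that compares 
FIFO and fair-share slot allocation. It is not Tajo code; the class and method 
names are made up for illustration, and "fair share" here means a simple 
round-robin (max-min) split of slots between queries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Tajo.
final class SlotAllocationSketch {

    /** FIFO: each query, in arrival order, takes what it asks for until no slots remain. */
    static Map<String, Integer> fifo(Map<String, Integer> demands, int totalSlots) {
        Map<String, Integer> granted = new LinkedHashMap<>();
        int free = totalSlots;
        for (Map.Entry<String, Integer> e : demands.entrySet()) {
            int g = Math.min(e.getValue(), free);
            granted.put(e.getKey(), g);
            free -= g;
        }
        return granted;
    }

    /** Fair share: hand out slots one at a time, round-robin, to queries that still want more. */
    static Map<String, Integer> fairShare(Map<String, Integer> demands, int totalSlots) {
        Map<String, Integer> granted = new LinkedHashMap<>();
        for (String query : demands.keySet()) {
            granted.put(query, 0);
        }
        int free = totalSlots;
        boolean progress = true;
        while (free > 0 && progress) {
            progress = false;
            for (Map.Entry<String, Integer> e : demands.entrySet()) {
                if (free == 0) {
                    break;
                }
                int current = granted.get(e.getKey());
                if (current < e.getValue()) {
                    granted.put(e.getKey(), current + 1);
                    free--;
                    progress = true;
                }
            }
        }
        return granted;
    }
}
```

With 8 slots and demands of 10 (first query) and 4 (second query), the FIFO 
version grants 8 and 0, so the second query is blocked, while the fair-share 
version grants 4 and 4.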

If we want to port Sparrow, we need to do one thing in advance. Because 
Sparrow is a decentralized algorithm, typically every node has a scheduling 
service deployed, and those schedulers need to know every node's status. 
Impala, for example, has a Statestore daemon that offers this kind of service.
see *The Impala Catalog Service* in 
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_concepts.html
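
To show why each scheduler needs the worker list, here is a rough sketch of 
the probe-based placement that Sparrow-style schedulers build on: probe a few 
randomly chosen workers and pick the one with the shortest queue. This is the 
plain "power of d choices" idea only, not Sparrow's full batch sampling with 
late binding, and the names are hypothetical. The queueLength function stands 
in for a probe RPC to a worker.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.ToIntFunction;

// Hypothetical illustration only; not Sparrow's or Tajo's actual API.
final class ProbeSamplingSketch {

    /**
     * Probes up to `probes` randomly chosen workers and returns the one
     * reporting the shortest queue.
     */
    static String pickWorker(List<String> workers, int probes,
                             ToIntFunction<String> queueLength, Random rnd) {
        if (workers.isEmpty() || probes < 1) {
            throw new IllegalArgumentException("need at least one worker and one probe");
        }
        List<String> shuffled = new ArrayList<>(workers);
        Collections.shuffle(shuffled, rnd);
        String best = null;
        int bestLen = 0;
        for (String worker : shuffled.subList(0, Math.min(probes, shuffled.size()))) {
            int len = queueLength.applyAsInt(worker); // assumed to be a probe RPC in practice
            if (best == null || len < bestLen) {
                best = worker;
                bestLen = len;
            }
        }
        return best;
    }
}
```

The scheduler cannot sample workers it does not know about, which is why an 
up-to-date membership list is a prerequisite.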
  
This is also called service discovery. Facebook Presto has such a component 
as well.
see 
https://github.com/facebook/presto/blob/master/presto-main/src/main/java/com/facebook/presto/metadata/DiscoveryNodeManager.java

In the long term, we need to add a service discovery component, not only for 
scheduling but also for high availability. Fortunately, we needn't build 
service discovery from scratch. There are a lot of open source projects for 
this. One of the most famous is ZooKeeper. 
see 
http://www.javacodegeeks.com/2013/11/coordination-and-service-discovery-with-apache-zookeeper.html
A better library built on top of ZooKeeper:
https://github.com/Netflix/curator/wiki/Service-Discovery
see 
http://blog.palominolabs.com/2012/08/14/using-netflix-curator-for-service-discovery/

In the short term, I think TajoMaster already holds the status of all workers. 
Each worker can fetch all workers' addresses through an RPC sent to 
TajoMaster. If we have that information on the worker side, we can embed a 
Sparrow-like scheduler into the worker as an optional service. 
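
A rough sketch of the worker-side part, assuming a simple time-based refresh: 
the worker caches the address list and asks the master again only when the 
cached copy is older than some TTL. The Supplier stands in for the RPC to 
TajoMaster; the class name, the TTL policy, and the string addresses are all 
assumptions for illustration, not existing Tajo code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration only; the Supplier stands in for an RPC to TajoMaster.
final class WorkerListCache {
    private final Supplier<List<String>> fetchFromMaster;
    private final long ttlMillis;
    private List<String> cached = Collections.emptyList();
    private long fetchedAtMillis;
    private boolean loaded;

    WorkerListCache(Supplier<List<String>> fetchFromMaster, long ttlMillis) {
        this.fetchFromMaster = fetchFromMaster;
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached worker addresses, refetching when the copy is missing or expired. */
    synchronized List<String> workers(long nowMillis) {
        if (!loaded || nowMillis - fetchedAtMillis >= ttlMillis) {
            cached = Collections.unmodifiableList(new ArrayList<>(fetchFromMaster.get()));
            fetchedAtMillis = nowMillis;
            loaded = true;
        }
        return cached;
    }
}
```

An embedded scheduler could then call workers(System.currentTimeMillis()) 
before each placement decision, trading some staleness for fewer calls to the 
master.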


> (Umbrella) Implement Tajo Query Scheduler
> -----------------------------------------
>
>                 Key: TAJO-540
>                 URL: https://issues.apache.org/jira/browse/TAJO-540
>             Project: Tajo
>          Issue Type: New Feature
>            Reporter: Hyunsik Choi
>
> Currently, there is no Tajo query scheduler. So, all queries launched 
> simultaneously compete for cluster resources, which are managed by 
> TajoResourceManager.
> In this issue, we will investigate,  design, and implement a Tajo query 
> scheduler. This is an umbrella issue for that. We will create subtasks for 
> them.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
