[
https://issues.apache.org/jira/browse/TAJO-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyunsik Choi updated TAJO-1397:
-------------------------------
Attachment: resource_circuit.png
Here is my proposed resource circulation. See the attached resource_circuit.png file.
* Each node periodically sends a heartbeat with its resources and status.
* ResourceTracker maintains a global view of node resources.
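A minimal sketch of how heartbeats could feed the global view, written in Python for brevity. The class layout, field names, units, and the "RUNNING" status value are all assumptions for illustration; the proposal only states that a heartbeat carries a node's resource and status.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Heartbeat:
    # Hypothetical fields: the proposal does not define the heartbeat payload.
    node_id: str
    available_memory_mb: int
    available_cpu_cores: int
    status: str  # assumed status value, e.g. "RUNNING"


class ResourceTracker:
    """Keeps the latest heartbeat per node as the global resource view."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Heartbeat] = {}

    def on_heartbeat(self, hb: Heartbeat) -> None:
        # A newer heartbeat replaces the previous one for the same node.
        self._nodes[hb.node_id] = hb

    def total_available_memory_mb(self) -> int:
        # Only nodes reporting the assumed "RUNNING" status are counted.
        return sum(hb.available_memory_mb
                   for hb in self._nodes.values()
                   if hb.status == "RUNNING")


tracker = ResourceTracker()
tracker.on_heartbeat(Heartbeat("node-1", 4096, 4, "RUNNING"))
tracker.on_heartbeat(Heartbeat("node-2", 2048, 2, "RUNNING"))
# node-1 reports again with fewer free resources; the old entry is replaced.
tracker.on_heartbeat(Heartbeat("node-1", 1024, 1, "RUNNING"))
```

Because the view is only as fresh as the last heartbeat, it can lag behind what a node actually has available.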
When a query master is launched, it requests the necessary resources from
ResourceTracker. It repeats this request while the query is executing.
The request will include the following information:
* query id
* user
* A set of resource requests, each of which will include
** priority
** the number of containers and resource capacity for containers
** desired host names
** type of task (leaf or intermediate)
* ...
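The request described above might be modelled roughly as follows. This is only a sketch in Python: all class and field names are hypothetical, the memory unit is an assumption, and it treats priority, container count and capacity, desired hosts, and task type as fields of each individual resource request. Whatever the "..." item stands for is left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class TaskType(Enum):
    LEAF = "leaf"
    INTERMEDIATE = "intermediate"


@dataclass
class ResourceRequest:
    # One entry in the set of resource requests (names are hypothetical).
    priority: int
    num_containers: int
    memory_mb_per_container: int  # assumed way to express container capacity
    desired_hosts: List[str]
    task_type: TaskType


@dataclass
class QueryResourceRequest:
    # What a query master would send to ResourceTracker in this sketch.
    query_id: str
    user: str
    requests: List[ResourceRequest] = field(default_factory=list)

    def total_containers(self) -> int:
        return sum(r.num_containers for r in self.requests)


req = QueryResourceRequest(
    query_id="q_0001",
    user="alice",
    requests=[
        ResourceRequest(1, 4, 512, ["host-a", "host-b"], TaskType.LEAF),
        ResourceRequest(2, 2, 1024, [], TaskType.INTERMEDIATE),
    ],
)
```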
When the query master receives the resources it requested, it schedules tasks to
nodes. If a node cannot accept a scheduled task, it can reject the request.
This can occur when the available resources recorded by ResourceTracker differ
from those the node actually has.
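The rejection case can be illustrated with a small Python sketch. The Node class, its try_accept method, and the memory-only accounting are assumptions made for this example, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical node model that tracks only free memory.
    node_id: str
    free_memory_mb: int

    def try_accept(self, task_memory_mb: int) -> bool:
        """Accept the task only if the node has enough memory right now."""
        if task_memory_mb > self.free_memory_mb:
            return False  # reject: the scheduler's view was out of date
        self.free_memory_mb -= task_memory_mb
        return True


# The node really has 1024 MB free, while the tracker still believes 2048 MB.
node = Node("node-1", free_memory_mb=1024)
tracker_view_mb = 2048  # stale value from an older heartbeat

accepted_small = node.try_accept(512)   # fits into the real free memory
accepted_large = node.try_accept(1024)  # fits the stale view, not the node
```

In this sketch the node is the final authority on its own resources, so a stale global view costs a rejected request rather than an over-committed node.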
In the proposed resource circulation, resource allocation is performed at the
task level. This gives more flexibility to adjust the resource capacity per
user or per query during query processing.
> Resource allocation should be fine grained.
> -------------------------------------------
>
> Key: TAJO-1397
> URL: https://issues.apache.org/jira/browse/TAJO-1397
> Project: Tajo
> Issue Type: Improvement
> Components: query master, resource manager, worker
> Reporter: Hyunsik Choi
> Fix For: 0.11
>
> Attachments: old_resource_circuit.png, resource_circuit.png
>
>
> See the comment:
> https://issues.apache.org/jira/browse/TAJO-540?focusedCommentId=14359478&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14359478
> From the discussion in TAJO-540
> {quote}
> In general, a query (or job) scheduler aims at maximum resource
> utilization. For multi-tenancy, we also need to consider fairness among
> multiple users (or queries). However, maximum resource utilization and
> fairness usually conflict with each other. To mitigate this
> problem, many schedulers seem to use a preemption approach.
> In this regard, our resource and scheduler system has the following problems:
> * A query exclusively holds the resources allocated at the start until the
> query completes or fails.
> * There is no mechanism to deallocate resources during query processing.
> * Preemption is also not supported.
> To achieve multi-tenancy, we should change our resource circulation.
> In particular, resource allocation must be fine grained instead of per query.
> So, I'll create a jira issue to change the resource circulation. In my
> opinion, we have to do this issue first. Once we achieve this, implementing
> a multi-tenant scheduler would be much easier than now. It would be a good
> starting point for this issue.
> {quote}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)