[jira] [Created] (TAJO-473) Improve the fault tolerance of LazyTaskScheduler

Jihoon Son (JIRA) Thu, 02 Jan 2014 21:55:23 -0800

Jihoon Son created TAJO-473:
-------------------------------

             Summary: Improve the fault tolerance of LazyTaskScheduler 
                 Key: TAJO-473
                 URL: https://issues.apache.org/jira/browse/TAJO-473
             Project: Tajo
          Issue Type: New Feature
          Components: query master
    Affects Versions: 0.2-incubating
            Reporter: Jihoon Son



As discussed in TAJO-385 and https://reviews.apache.org/r/16455/, the 
LazyTaskScheduler has a problem when tasks fail.
When a failed task that consists of multiple fragments is re-assigned to a 
node, the locality of its fragments is extremely hard to preserve, because it 
is very unlikely that all of the fragments are stored on two or more common 
hosts.
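
To make the locality problem concrete, here is a minimal sketch in plain Java 
(9 or later). It is not Tajo code: the class name, the method names, and the 
host names are all hypothetical. It computes the hosts that hold a replica of 
every fragment of a task, then removes the host where the task failed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Tajo.
public class FragmentLocalityDemo {

    /** Returns the hosts that store a replica of every fragment (set intersection). */
    static Set<String> commonHosts(List<Set<String>> replicaHostsPerFragment) {
        if (replicaHostsPerFragment.isEmpty()) {
            return new HashSet<>();
        }
        Set<String> common = new HashSet<>(replicaHostsPerFragment.get(0));
        for (Set<String> hosts : replicaHostsPerFragment) {
            common.retainAll(hosts);
        }
        return common;
    }

    /** Returns the hosts that would still be local for all fragments after one host failed. */
    static Set<String> retryHosts(List<Set<String>> replicaHostsPerFragment, String failedHost) {
        Set<String> candidates = commonHosts(replicaHostsPerFragment);
        candidates.remove(failedHost);
        return candidates;
    }

    public static void main(String[] args) {
        // Three fragments, each replicated on three hosts (made-up placement).
        List<Set<String>> fragments = List.of(
                Set.of("host1", "host2", "host3"),
                Set.of("host1", "host4", "host5"),
                Set.of("host1", "host2", "host6"));

        System.out.println(commonHosts(fragments));          // prints [host1]
        System.out.println(retryHosts(fragments, "host1"));  // prints []
    }
}
```

In this made-up placement only host1 is local for all three fragments, so once 
the task fails there, no other node can run the whole task with full locality.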

A simple and effective solution is to create a separate query unit attempt for 
each fragment when a failed task is reattempted. To implement this approach, 
we should maintain the query processing attempt information for each 
fragment, not for each query unit.
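
The proposal above could be sketched roughly as follows. This is only an 
illustration under stated assumptions, not the actual Tajo design: the 
Fragment and FragmentAttempt classes, the splitForRetry method, and the rule 
"pick the first replica host other than the failed one" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Tajo.
public class PerFragmentRetryDemo {

    /** Stand-in for a fragment: an id plus the hosts that hold its replicas. */
    static final class Fragment {
        final String id;
        final List<String> hosts;

        Fragment(String id, List<String> hosts) {
            this.id = id;
            this.hosts = hosts;
        }
    }

    /** Attempt record kept per fragment rather than per query unit. */
    static final class FragmentAttempt {
        final Fragment fragment;
        final int attemptNo;
        final String host; // null when no local replica host remains

        FragmentAttempt(Fragment fragment, int attemptNo, String host) {
            this.fragment = fragment;
            this.attemptNo = attemptNo;
            this.host = host;
        }
    }

    /** Splits a failed multi-fragment task into one retry attempt per fragment. */
    static List<FragmentAttempt> splitForRetry(List<Fragment> fragments,
                                               int failedAttemptNo,
                                               String failedHost) {
        List<FragmentAttempt> retries = new ArrayList<>();
        for (Fragment fragment : fragments) {
            String target = null;
            for (String host : fragment.hosts) {
                if (!host.equals(failedHost)) {
                    target = host; // assumed policy: first surviving replica host
                    break;
                }
            }
            retries.add(new FragmentAttempt(fragment, failedAttemptNo + 1, target));
        }
        return retries;
    }

    public static void main(String[] args) {
        List<Fragment> task = List.of(
                new Fragment("f1", List.of("host1", "host2", "host3")),
                new Fragment("f2", List.of("host1", "host4", "host5")));
        for (FragmentAttempt a : splitForRetry(task, 0, "host1")) {
            System.out.println(a.fragment.id + " -> attempt " + a.attemptNo + " on " + a.host);
        }
    }
}
```

Because each retry covers a single fragment, it only needs one surviving 
replica host for that fragment, not a host shared by all fragments.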



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
