Hyunsik Choi created TAJO-1216:
----------------------------------
Summary: Output commit should be two phase commit
Key: TAJO-1216
URL: https://issues.apache.org/jira/browse/TAJO-1216
Project: Tajo
Issue Type: Improvement
Reporter: Hyunsik Choi
*Problem*
The output data of each query is first stored in a temporary staging
directory. It is then moved to the final output directory once all
tasks have completed successfully. We call this step *output commit*.
Currently, we simply move the output data set to the final
directory. However, this approach makes failure handling very hard.
*Solution*
To solve this problem, we need a two-phase commit. The approach works as
follows:
* When a task completes, it sends a *commit pending* request to
QueryMaster.
* QueryMaster chooses exactly one request from the possibly multiple *commit
pending* requests, and then responds with *commit* to the corresponding task.
* Only the task that receives *commit* moves its result data to the final
output directory. The others cancel their commit work.
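The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not Tajo code: the names CommitCoordinator, request_commit_pending, and finish_attempt are hypothetical, plain dictionaries stand in for the staging and final output directories, and it assumes the competing *commit pending* requests come from multiple attempts of the same task.

```python
import threading


class CommitCoordinator:
    """Hypothetical stand-in for the arbitration done by QueryMaster."""

    def __init__(self):
        self._lock = threading.Lock()
        self._winners = {}  # task id -> attempt id that was granted the commit

    def request_commit_pending(self, task_id, attempt_id):
        """Phase 1: an attempt asks for permission to commit.

        Returns True only for the first attempt of a task that asks
        (and for that same attempt if it asks again).
        """
        with self._lock:
            winner = self._winners.setdefault(task_id, attempt_id)
            return winner == attempt_id


def finish_attempt(coordinator, task_id, attempt_id, staging, final):
    """Phase 2: move the output if granted, otherwise cancel the commit.

    `staging` and `final` are dicts standing in for directories
    (an assumption made only to keep the sketch self-contained).
    """
    if coordinator.request_commit_pending(task_id, attempt_id):
        final[task_id] = staging.pop((task_id, attempt_id))
        return "committed"
    staging.pop((task_id, attempt_id), None)  # discard the losing output
    return "cancelled"
```

Because the decision is made in one place under a lock, at most one attempt per task ever moves data into the final output, and a losing or failed attempt only has staging data to clean up.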