GitHub user blrunner opened a pull request:
https://github.com/apache/tajo/pull/979
TAJO-2087: Implement DirectOutputCommitter
Here is prototype code for ``DirectOutputCommitter``. This PR is not ready
for review; it shows my approach to implementing ``DirectOutputCommitter``. The
current version works as follows:
- Register the commit history in the catalog (TODO).
- Each task writes its output data directly to the final location.
- In the commit phase, delete existing files according to the query type, as
follows: first, back up the existing files or directories to the staging
directory, and then delete the backed-up files or directories.
- Update the status of the commit history in the catalog (TODO).
- If the query fails, ``QueryMaster`` will delete the committed files and update
the status of the query history in the catalog (TODO).
- When ``TajoMaster`` starts, it will check the status of the query histories
in the catalog. If it finds a running query, it will delete the committed files
and update the status of the query history (TODO).
- Add unit test cases for failed queries (TODO).
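
The commit and cleanup steps in the list above could be sketched roughly as
follows. This is only an illustration on the local filesystem using
``java.nio.file``, not the code in this PR. The class name, the method names,
and the ``backup`` subdirectory are hypothetical, and it assumes that files
written by a query can be recognized by a file name prefix equal to the query
id.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; names and layout are assumptions, not Tajo's API.
public class DirectCommitSketch {

    // Lists the direct children of dir whose name starts (ownedByQuery = true)
    // or does not start (ownedByQuery = false) with the query id.
    static List<Path> children(Path dir, String queryId, boolean ownedByQuery)
            throws IOException {
        try (Stream<Path> s = Files.list(dir)) {
            return s.filter(p ->
                        p.getFileName().toString().startsWith(queryId) == ownedByQuery)
                    .collect(Collectors.toList());
        }
    }

    // Deletes a file or a directory tree, children before parents.
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> s = Files.walk(root)) {
            List<Path> paths = s.sorted(Comparator.reverseOrder())
                                .collect(Collectors.toList());
            for (Path p : paths) {
                Files.delete(p);
            }
        }
    }

    // Commit for an overwrite-style query: entries that were already in the
    // final location are first moved to a backup directory under staging,
    // and the backup is deleted afterwards.
    static void commitOverwrite(Path finalDir, Path stagingDir, String queryId)
            throws IOException {
        Path backupDir = stagingDir.resolve("backup");
        Files.createDirectories(backupDir);
        for (Path old : children(finalDir, queryId, false)) {
            Files.move(old, backupDir.resolve(old.getFileName()));
        }
        deleteRecursively(backupDir);
    }

    // Cleanup for a failed query: remove everything the query wrote directly
    // to the final location.
    static void cleanupFailedQuery(Path finalDir, String queryId)
            throws IOException {
        for (Path written : children(finalDir, queryId, true)) {
            deleteRecursively(written);
        }
    }
}
```

In this sketch, ``commitOverwrite`` would run in the commit phase, and
``cleanupFailedQuery`` would be called either when the query fails or when a
query is found still running at startup.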
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/blrunner/tajo direct-output-committer
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tajo/pull/979.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #979
----
commit 083bed51db1e68ed840961e2e169695dde60e116
Author: JaeHwa Jung <[email protected]>
Date: 2016-02-24T02:08:08Z
Add the list of output files and backup files to TaskAttemptContext
commit b39c8d1bcb153d53aae028577935499034bd4b6f
Author: JaeHwa Jung <[email protected]>
Date: 2016-02-24T05:31:55Z
Add outputFiles and backupFiles to Protocol Buffer
commit e3b26ea738ba33e1a6c8b8c856793f5a584eb861
Author: JaeHwa Jung <[email protected]>
Date: 2016-02-24T05:48:02Z
Add property for setting Direct Output Committer to TajoConf and SessionVars
commit 9efb4662957ff39ff215a3c829ece5e69d9ebe36
Author: JaeHwa Jung <[email protected]>
Date: 2016-02-25T01:59:26Z
Remove related property from SessionVars
commit 234f2829768f18fab7c7894aab2ccf7780ae3ffb
Author: JaeHwa Jung <[email protected]>
Date: 2016-03-04T02:44:52Z
Add temporary codes for testing
commit 7effec1fc663d246ffd3e25bfd4a98c803b22607
Author: JaeHwa Jung <[email protected]>
Date: 2016-03-15T09:01:43Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
direct-output-committer
commit cb762766848c2af5d25e20ab552a2041c67924cc
Author: JaeHwa Jung <[email protected]>
Date: 2016-03-15T09:30:36Z
Prefix of output file name must be the id of query.
commit dce41c6be686916a346dc15a033bea39cc79550b
Author: JaeHwa Jung <[email protected]>
Date: 2016-03-16T05:50:32Z
Implement direct Output Committer to FileTablespace
commit 908ccd2b6c2ebbd602892b979c1ff41d7ed4a820
Author: JaeHwa Jung <[email protected]>
Date: 2016-03-16T06:30:45Z
Implement a method for renaming recursively directories
commit bd1e1b3f16e8b6263ef4e762b621a4ba2235aa34
Author: JaeHwa Jung <[email protected]>
Date: 2016-03-16T06:43:56Z
Remove proto modifications
commit 95e513a04bcfee10643ebe17b0e21074057f0be2
Author: JaeHwa Jung <[email protected]>
Date: 2016-03-16T10:21:06Z
Add session variable and add more unit test cases
----