[ https://issues.apache.org/jira/browse/TAJO-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155899#comment-14155899 ]
ASF GitHub Bot commented on TAJO-1067:
--------------------------------------
Github user hyunsik commented on a diff in the pull request:
https://github.com/apache/tajo/pull/161#discussion_r18317839
--- Diff: tajo-core/src/main/java/org/apache/tajo/master/querymaster/Query.java ---
@@ -432,19 +432,68 @@ public Path commitOutputData(Query query) {
boolean movedToOldTable = false;
boolean committed = false;
Path oldTableDir = new Path(queryContext.getStagingDir(),
TajoConstants.INSERT_OVERWIRTE_OLD_TABLE_NAME);
- try {
- if (fs.exists(finalOutputDir)) {
- fs.rename(finalOutputDir, oldTableDir);
- movedToOldTable = fs.exists(oldTableDir);
- } else { // if the parent does not exist, make its parent directory.
- fs.mkdirs(finalOutputDir.getParent());
+
+ // INSERT OVERWRITE INTO always moves the result data into the original table location.
+ // As a result, all existing partitions have been removed. The query should not remove all partitions
+ // because existing partitions may be a data-set for a production cluster.
+ if (queryContext.hasPartition()) {
+ Map<Path, Path> renameDirs = TUtil.newHashMap();
+ Map<Path, Path> recoveryDirs = TUtil.newHashMap();
+
+ try {
+ if (!fs.exists(finalOutputDir)) {
+ fs.mkdirs(finalOutputDir);
+ }
+
+ visitPartitionedDirectory(fs, stagingResultDir, finalOutputDir, stagingResultDir.toString(),
+ renameDirs, oldTableDir);
+
+ // Rename target partition directories
+ for(Map.Entry<Path, Path> entry : renameDirs.entrySet()) {
+ // Backup existing data files for recovering
+ if (fs.exists(entry.getValue())) {
+ String recoveryPathString = entry.getValue().toString().replaceAll(finalOutputDir.toString(),
+ oldTableDir.toString());
+ Path recoveryPath = new Path(recoveryPathString);
+ fs.rename(entry.getValue(), recoveryPath);
+ fs.exists(recoveryPath);
+ recoveryDirs.put(entry.getValue(), recoveryPath);
+ }
+ // Delete existing directory
+ fs.deleteOnExit(entry.getValue());
+ // Rename staging directory to final output directory
+ fs.rename(entry.getKey(), entry.getValue());
+ }
+
+ } catch (IOException ioe) {
+ // Remove created dirs
+ for(Map.Entry<Path, Path> entry : renameDirs.entrySet()) {
+ fs.deleteOnExit(entry.getValue());
+ }
+
+ // Recovery renamed dirs
+ for(Map.Entry<Path, Path> entry : recoveryDirs.entrySet()) {
+ fs.deleteOnExit(entry.getValue());
--- End diff --
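Could you check the use of FileSystem::deleteOnExit? It seems to be misused here: it only marks a path for deletion when the FileSystem is closed, so it does not remove anything at the point of the call.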
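To illustrate the concern with deferred deletion, here is a minimal sketch. It deliberately uses the JDK's java.io.File.deleteOnExit() and java.nio.file.Files.delete() as stand-ins so that it runs without Hadoop on the classpath; it is not Tajo or Hadoop code, and the class and method names are hypothetical. Hadoop's FileSystem.deleteOnExit(Path) defers in a similar way (until the FileSystem is closed), whereas FileSystem.delete(Path, boolean) removes the path immediately.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: JDK stand-in for the deferred vs. immediate delete distinction.
class DeleteOnExitDemo {

    // Returns whether the file still exists right after deleteOnExit() is called.
    static boolean existsAfterDeleteOnExit() {
        try {
            File f = File.createTempFile("delete-on-exit-demo", ".tmp");
            f.deleteOnExit();   // only registers the file for removal at JVM shutdown
            return f.exists();  // the file has not been removed yet
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns whether the file still exists right after an immediate delete.
    static boolean existsAfterDelete() {
        try {
            Path p = Files.createTempFile("delete-now-demo", ".tmp");
            Files.delete(p);    // removes the file at the point of the call
            return Files.exists(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("exists after deleteOnExit: " + existsAfterDeleteOnExit());
        System.out.println("exists after delete:       " + existsAfterDelete());
    }
}
```

If the same holds for the diff above, a path that is marked with deleteOnExit and then used as a rename target would be a candidate for removal later, when the FileSystem is closed, which is probably not what the cleanup intends.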
> INSERT OVERWRITE INTO should not remove all partitions.
> -------------------------------------------------------
>
> Key: TAJO-1067
> URL: https://issues.apache.org/jira/browse/TAJO-1067
> Project: Tajo
> Issue Type: Bug
> Components: query master
> Reporter: Jaehwa Jung
> Assignee: Jaehwa Jung
> Priority: Critical
> Fix For: 0.9.0
>
>
> Currently, INSERT OVERWRITE INTO always moves the result data into the
> original table location. As a result, all existing partitions are
> removed. The query should not remove all partitions, because existing
> partitions may hold a dataset for a production cluster.
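The behaviour requested in the issue description can be sketched roughly as follows: only partitions that appear in the staging output are replaced, each existing partition is backed up before being replaced, and all other partitions stay untouched. This is a simplified illustration on the local file system using java.nio.file, not the Tajo implementation; the class name, method names, and the single-level "col=N" directory layout are assumptions made for the example, and rollback on failure is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: overwrite only the partitions produced by the query.
class PartitionOverwriteSketch {

    // Moves each directory directly under staging into finalDir. An existing
    // partition of the same name is first moved under backupDir.
    // Returns a map of replaced target -> backup location.
    static Map<Path, Path> overwritePartitions(Path staging, Path finalDir, Path backupDir) {
        Map<Path, Path> backups = new LinkedHashMap<>();
        try {
            Files.createDirectories(finalDir);
            Files.createDirectories(backupDir);

            // Snapshot the staged partitions so nothing is moved mid-iteration.
            List<Path> staged = new ArrayList<>();
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(staging)) {
                for (Path p : stream) {
                    staged.add(p);
                }
            }

            for (Path src : staged) {
                Path target = finalDir.resolve(src.getFileName());
                if (Files.exists(target)) {
                    // Back up the existing partition so it can be recovered later.
                    Path backup = backupDir.resolve(src.getFileName());
                    Files.move(target, backup);
                    backups.put(target, backup);
                }
                // Promote the staged partition to the final location.
                Files.move(src, target);
            }
            return backups;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small table with two partitions, overwrites one, and checks the result.
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("overwrite-demo");
            Path staging = root.resolve("staging");
            Path table = root.resolve("table");
            Path backup = root.resolve("old_table");
            Files.createDirectories(table.resolve("col=1"));
            Files.createDirectories(table.resolve("col=2"));
            Files.createDirectories(staging.resolve("col=1"));
            Files.createFile(table.resolve("col=1").resolve("old.dat"));
            Files.createFile(table.resolve("col=2").resolve("keep.dat"));
            Files.createFile(staging.resolve("col=1").resolve("new.dat"));

            overwritePartitions(staging, table, backup);

            return Files.exists(table.resolve("col=1").resolve("new.dat"))
                && !Files.exists(table.resolve("col=1").resolve("old.dat"))
                && Files.exists(table.resolve("col=2").resolve("keep.dat"))
                && Files.exists(backup.resolve("col=1").resolve("old.dat"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("untouched partitions preserved: " + demo());
    }
}
```

Here the "col=2" partition is never touched because the query produced no output for it, which is the property the issue asks for.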