Hiroyuki Nagaya created TEZ-4604:
------------------------------------

             Summary: Hive compaction in Tez does not delete files under 
staging directory
                 Key: TEZ-4604
                 URL: https://issues.apache.org/jira/browse/TEZ-4604
             Project: Apache Tez
          Issue Type: Improvement
            Reporter: Hiroyuki Nagaya


I am using a combination of Hadoop, Hive and Tez.
When I run a major compaction with Hive on Tez, the files under the staging 
directory are not deleted.
With MapReduce, the files are deleted from the staging directory and history 
files are created in the history directory.

Hadoop 3.3.6
Hive 4.0.1
Tez 0.10.4

1. When using MapReduce

The following files are deleted:

/tmp/hadoop-yarn/staging/hadoop/.staging/job_1705466455536_3620
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1705466455536_3620/job.jar
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1705466455536_3620/job.split
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1705466455536_3620/job.splitmetainfo
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1705466455536_3620/job.xml

History data is created in the following directory:
/tmp/hadoop-yarn/staging/history/done

2. When using Tez

The following files are not deleted:

/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002/.tez
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002/job.jar
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002/job.split
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002/job.splitmetainfo
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002/job.xml

No history data is created.
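
To check which job directories are left behind, a recursive listing of the
staging directory can be grouped by job. The following is only a minimal
sketch in Python, assuming the listing has already been collected into a list
of path strings (for example from the output of `hdfs dfs -ls -R`). The
function name `group_by_job_dir` and the input format are hypothetical and are
not part of Hive, Tez or Hadoop.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Staging root taken from the paths in this report.
STAGING_ROOT = PurePosixPath("/tmp/hadoop-yarn/staging/hadoop/.staging")


def group_by_job_dir(paths, staging_root=STAGING_ROOT):
    """Group listed paths by the job_* directory directly under staging_root.

    Returns a dict mapping each job directory name to a sorted list of the
    entries found inside it (relative to that job directory).
    """
    jobs = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        try:
            rel = path.relative_to(staging_root)
        except ValueError:
            continue  # path is not under the staging root
        if not rel.parts or not rel.parts[0].startswith("job_"):
            continue  # the root itself, or something that is not a job directory
        job_dir = rel.parts[0]
        entries = jobs[job_dir]  # registers the job directory even if it is empty
        if len(rel.parts) > 1:
            entries.append("/".join(rel.parts[1:]))
    return {job: sorted(entries) for job, entries in jobs.items()}


if __name__ == "__main__":
    listing = [
        "/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002",
        "/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002/.tez",
        "/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002/job.jar",
        "/tmp/hadoop-yarn/staging/history/done",
    ]
    for job, entries in group_by_job_dir(listing).items():
        print(job, entries)
```

Any job directory that still appears in the result after its compaction has
finished is a leftover of the kind described in this report.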


Is it a bug that the following directory is not deleted, or is it a Tez 
configuration problem?
I would like it to be deleted, because the compaction completed successfully 
and the directory is about 80 MB in size.
/tmp/hadoop-yarn/staging/hadoop/.staging/job_1740026697751_0002



