[
https://issues.apache.org/jira/browse/HIVE-18922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773947#comment-16773947
]
Dzianis Sokal edited comment on HIVE-18922 at 2/21/19 10:55 AM:
----------------------------------------------------------------
Steps to reproduce:
* Use Cloudera QuickStart VM 5.13.0 with Hive 1.1.0:
{code:java}
$ hive --version
Hive 1.1.0-cdh5.13.0
Subversion file:///data/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hive-1.1.0-cdh5.13.0 -r Unknown
Compiled by jenkins on Wed Oct 4 11:06:55 PDT 2017
From source with checksum 4c9678e964cc1d15a0190a0a1867a837{code}
* In the browser, open Hue -> Hive editor and execute the following:
{code:java}
CREATE EXTERNAL TABLE person (
name STRING,
surname STRING
) STORED AS ORC LOCATION '/user/cloudera/hive/default/person';
INSERT INTO person values ('Denis', 'Sokol');
INSERT INTO person values ('John', 'Smith');
INSERT INTO person values ('Paul', 'Lauren');
INSERT INTO person values ('David', 'Black');{code}
Afterwards, I see the following in HDFS:
{code:java}
[cloudera@quickstart ~]$ hadoop fs -ls /user/cloudera/hive/default/person/
Found 8 items
drwxr-xr-x - cloudera cloudera 0 2019-02-21 02:00 /user/cloudera/hive/default/person/.hive-staging_hive_2019-02-21_02-00-25_976_3499323211348894665-1
drwxr-xr-x - cloudera cloudera 0 2019-02-21 02:01 /user/cloudera/hive/default/person/.hive-staging_hive_2019-02-21_02-00-58_192_198398826154921102-3
drwxr-xr-x - cloudera cloudera 0 2019-02-21 02:03 /user/cloudera/hive/default/person/.hive-staging_hive_2019-02-21_02-03-26_141_6885357932995739759-3
drwxr-xr-x - cloudera cloudera 0 2019-02-21 02:05 /user/cloudera/hive/default/person/.hive-staging_hive_2019-02-21_02-04-43_828_3388080610685722623-1
-rwxr-xr-x 1 cloudera cloudera 302 2019-02-21 02:00 /user/cloudera/hive/default/person/000000_0
-rwxr-xr-x 1 cloudera cloudera 334 2019-02-21 02:01 /user/cloudera/hive/default/person/000000_0_copy_1
-rwxr-xr-x 1 cloudera cloudera 317 2019-02-21 02:03 /user/cloudera/hive/default/person/000000_0_copy_2
-rwxr-xr-x 1 cloudera cloudera 337 2019-02-21 02:05 /user/cloudera/hive/default/person/000000_0_copy_3
{code}
* Compaction doesn't help either:
{code:java}
ALTER TABLE person CONCATENATE;{code}
leads to:
{code:java}
[cloudera@quickstart ~]$ hadoop fs -ls /user/cloudera/hive/default/person/
Found 6 items
drwxr-xr-x - cloudera cloudera 0 2019-02-21 02:00 /user/cloudera/hive/default/person/.hive-staging_hive_2019-02-21_02-00-25_976_3499323211348894665-1
drwxr-xr-x - cloudera cloudera 0 2019-02-21 02:01 /user/cloudera/hive/default/person/.hive-staging_hive_2019-02-21_02-00-58_192_198398826154921102-3
drwxr-xr-x - cloudera cloudera 0 2019-02-21 02:03 /user/cloudera/hive/default/person/.hive-staging_hive_2019-02-21_02-03-26_141_6885357932995739759-3
drwxr-xr-x - cloudera cloudera 0 2019-02-21 02:05 /user/cloudera/hive/default/person/.hive-staging_hive_2019-02-21_02-04-43_828_3388080610685722623-1
drwxr-xr-x - cloudera cloudera 0 2019-02-21 02:06 /user/cloudera/hive/default/person/.hive-staging_hive_2019-02-21_02-06-21_176_6497325643092016331-3
-rwxr-xr-x 1 cloudera cloudera 934 2019-02-21 02:06 /user/cloudera/hive/default/person/000000_0{code}
Please let me know whether this is a known issue and whether it is safe to remove
the +staging+ directories.
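To make the leftover directories easier to track, here is a minimal sketch that filters `hadoop fs -ls` output down to the `.hive-staging_hive_*` directories. The function name `list_staging_dirs` is my own, not part of Hive or Hadoop, and the sketch assumes paths without spaces. It only lists candidates and does not delete anything, since whether removal is safe is exactly the open question here.

```shell
# Hypothetical helper (not part of Hive/Hadoop): reads `hadoop fs -ls`
# output on stdin and prints the paths of Hive staging directories.
list_staging_dirs() {
  awk '
    # Keep directory entries only (permission string starts with "d").
    $1 ~ /^d/ {
      # The path is the last field; look at its final path component.
      n = split($NF, parts, "/")
      if (parts[n] ~ /^\.hive-staging_hive_/) print $NF
    }
  '
}

# Usage (assumes the table location from the steps above):
#   hadoop fs -ls /user/cloudera/hive/default/person/ | list_staging_dirs
```

Applied to the listing taken after `ALTER TABLE person CONCATENATE`, this would print the five `.hive-staging_hive_*` paths and skip `000000_0`.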
> Hive is not cleaning up staging directories
> --------------------------------------------
>
> Key: HIVE-18922
> URL: https://issues.apache.org/jira/browse/HIVE-18922
> Project: Hive
> Issue Type: Bug
> Reporter: Anant Mittal
> Priority: Major
>
> Hive is creating HDFS folders with the format
> <table_location>/.hive-staging_hive_<date>_<time>-xx/-ext-xxxxx
> These are not cleaned up even after a long time. The folder is used
> to load data into the table. Example:
> Loading data to table default.tablename from
> hdfs://clustermachine/apps/hive/warehouse/tablename/.hive-staging_hive_2018-01-31_11-45-14_005_1129336997995057804-51/-ext-10000
>
> This might be covered to some extent by HIVE-11940, but I want to make sure all
> cases are addressed.