[ 
https://issues.apache.org/jira/browse/HAWQ-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16077442#comment-16077442
 ] 

Harald Bögeholz commented on HAWQ-1498:
---------------------------------------

Here are concrete steps for reproducing the problem.

First I check the state of things on the slave node mds-hdp-04:


{noformat}
[root@mds-hdp-04 ~]# df /mds
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/vdc       515928320 98522628 391174908  21% /mds
[root@mds-hdp-04 ~]# du -s /mds
98448860        /mds
[root@mds-hdp-04 ~]# lsof +L1 > lsof-before
{noformat}

(Note that Jira tends to eat the plus character in the lsof +L1 command; +L1 restricts the listing to files with a link count below 1, i.e. files that have been deleted but are still held open.)

Then on the master mds-hdp-03 I start psql and create a table:

{noformat}
[gpadmin@mds-hdp-03 ~]$ psql -d graphs
psql (8.2.15)
Type "help" for help.

graphs=# create table junk as select * from bara10m;
SELECT 19999997
{noformat}

On the slave node I see that the table is taking ~120 MB of storage:


{noformat}
[root@mds-hdp-04 ~]# df /mds
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/vdc       515928320 98642512 391055024  21% /mds
[root@mds-hdp-04 ~]# du -s /mds
98568532        /mds
{noformat}
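
(For reference: 98642512 - 98522628 = 119884 KB, i.e. about 117 MB more in use according to df, which matches the ~120 MB of HDFS block files that show up in the lsof output further below.)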

I then drop the table:

{noformat}
graphs=# drop table junk;
DROP TABLE
{noformat}

On the slave node I check disk space and open files:

{noformat}
[root@mds-hdp-04 ~]# df /mds
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/vdc       515928320 98642512 391055024  21% /mds
[root@mds-hdp-04 ~]# du -s /mds
98449024        /mds
[root@mds-hdp-04 ~]# lsof +L1 >lsof-dropped
{noformat}

Note that the output of du is roughly back to where it was, but the amount of used space reported by df stays the same. I will show the lengthy output of lsof below.
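
To quantify how much of the df/du discrepancy is pinned by deleted-but-still-open files, something along these lines should work. This is only a rough sketch: the awk field positions assume the lsof output layout shown below, and /mds must be a mount point so that lsof lists all open files on that filesystem.

{noformat}
# Sum SIZE/OFF (bytes) of regular files on /mds that are open but already unlinked.
lsof +L1 /mds | awk '$5 == "REG" && $NF == "(deleted)" { sum += $7 }
                     END { printf "%.1f MB held by deleted-but-open files\n", sum / 1048576 }'
{noformat}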

Next, on the master, I close the connection to the database:

{noformat}
graphs=# \q
[gpadmin@mds-hdp-03 ~]$
{noformat}

Check things on the slave:


{noformat}
[root@mds-hdp-04 ~]# df /mds
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/vdc       515928320 98522664 391174872  21% /mds
[root@mds-hdp-04 ~]# du -s /mds
98448896        /mds
[root@mds-hdp-04 ~]# lsof +L1 >lsof-closed
{noformat}

Disk space is freed; everything is (up to a few kilobytes) as it was before. The output of lsof before the experiment (lsof-before) and after closing the connection (lsof-closed) is identical, as shown by diff. Here is lsof-dropped, the complete output of lsof after the table was dropped but before the connection to the database was closed:


{noformat}
COMMAND     PID    USER   FD   TYPE DEVICE  SIZE/OFF NLINK     NODE NAME
tuned       827    root    7u   REG  253,1      4096     0   136067 /tmp/ffijUJke5 (deleted)
bash      28331    yarn  cwd    DIR 253,32         0     0 25167570 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0005/container_e03_1498453636069_0005_01_000949 (deleted)
bash      28331    yarn  255r   REG 253,32       710     0 25167857 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0005/container_e03_1498453636069_0005_01_000949/default_container_executor.sh (deleted)
sleep     28334    yarn  cwd    DIR 253,32         0     0 25167570 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0005/container_e03_1498453636069_0005_01_000949 (deleted)
postgres  76659 gpadmin   36r   REG 253,32    212024     0  9438395 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949181 (deleted)
postgres  76659 gpadmin   37r   REG 253,32      1667     0  9438396 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949181_214299.meta (deleted)
postgres  76698 gpadmin   15r   REG 253,32 121425488     0  9437196 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949177 (deleted)
postgres  76698 gpadmin   16r   REG 253,32    948647     0  9438394 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949177_214295.meta (deleted)
postgres  76698 gpadmin   18r   REG 253,32    212024     0  9438395 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949181 (deleted)
postgres  76698 gpadmin   19r   REG 253,32      1667     0  9438396 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949181_214299.meta (deleted)
bash     175610    yarn  cwd    DIR 253,32         0     0 25168110 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0006/container_e03_1498453636069_0006_01_001196 (deleted)
bash     175610    yarn  255r   REG 253,32       710     0 25168161 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0006/container_e03_1498453636069_0006_01_001196/default_container_executor.sh (deleted)
sleep    175612    yarn  cwd    DIR 253,32         0     0 25168110 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0006/container_e03_1498453636069_0006_01_001196 (deleted)
bash     658810    yarn  cwd    DIR 253,32         0     0 25167030 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0004/container_e03_1498453636069_0004_01_000509 (deleted)
bash     658810    yarn  255r   REG 253,32       710     0 25167246 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0004/container_e03_1498453636069_0004_01_000509/default_container_executor.sh (deleted)
sleep    658813    yarn  cwd    DIR 253,32         0     0 25167030 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0004/container_e03_1498453636069_0004_01_000509 (deleted)
postgres 761948 gpadmin    3r   REG 253,16        69     0  6553920 /data/hawq/segment/global/pgstat.stat (deleted)
{noformat}

Note the lines showing that a postgres process (PID 76698) is still holding handles on the deleted HDFS block files of that ~120 MB table. For completeness, here is the identical output of lsof from before and after the experiment (lsof-before and lsof-closed):


{noformat}
COMMAND     PID    USER   FD   TYPE DEVICE SIZE/OFF NLINK     NODE NAME
tuned       827    root    7u   REG  253,1     4096     0   136067 /tmp/ffijUJke5 (deleted)
bash      28331    yarn  cwd    DIR 253,32        0     0 25167570 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0005/container_e03_1498453636069_0005_01_000949 (deleted)
bash      28331    yarn  255r   REG 253,32      710     0 25167857 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0005/container_e03_1498453636069_0005_01_000949/default_container_executor.sh (deleted)
sleep     28334    yarn  cwd    DIR 253,32        0     0 25167570 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0005/container_e03_1498453636069_0005_01_000949 (deleted)
bash     175610    yarn  cwd    DIR 253,32        0     0 25168110 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0006/container_e03_1498453636069_0006_01_001196 (deleted)
bash     175610    yarn  255r   REG 253,32      710     0 25168161 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0006/container_e03_1498453636069_0006_01_001196/default_container_executor.sh (deleted)
sleep    175612    yarn  cwd    DIR 253,32        0     0 25168110 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0006/container_e03_1498453636069_0006_01_001196 (deleted)
bash     658810    yarn  cwd    DIR 253,32        0     0 25167030 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0004/container_e03_1498453636069_0004_01_000509 (deleted)
bash     658810    yarn  255r   REG 253,32      710     0 25167246 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0004/container_e03_1498453636069_0004_01_000509/default_container_executor.sh (deleted)
sleep    658813    yarn  cwd    DIR 253,32        0     0 25167030 /mds/yarn/local/usercache/gpadmin/appcache/application_1498453636069_0004/container_e03_1498453636069_0004_01_000509 (deleted)
postgres 761948 gpadmin    3r   REG 253,16       69     0  6553920 /data/hawq/segment/global/pgstat.stat (deleted)
{noformat}
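
On a busier cluster it may also help to see how much space each segment postgres process is pinning, for example with something like the following. Again this is just a sketch that assumes the lsof output layout shown above.

{noformat}
# Per-PID total of deleted HDFS block files still held open by postgres processes.
lsof +L1 | awk '$1 == "postgres" && $NF == "(deleted)" && $10 ~ /\/mds\/hdfs\// { held[$2] += $7 }
                END { for (pid in held) printf "postgres pid %s holds %.1f MB of deleted HDFS blocks\n", pid, held[pid] / 1048576 }'
{noformat}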

If there is any timeout involved, it must be on the order of several days. I discovered the problem after having psql open for several days, starting computations by hand that each took hours. After one or two days I was missing a terabyte of disk space and couldn't help noticing that.

> Segments keep open file descriptors for deleted files
> -----------------------------------------------------
>
>                 Key: HAWQ-1498
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1498
>             Project: Apache HAWQ
>          Issue Type: Bug
>            Reporter: Harald Bögeholz
>            Assignee: Radar Lei
>             Fix For: 2.2.0.0-incubating
>
>
> I have been running some large computations in HAWQ using psql on the master. 
> These computations created temporary tables and dropped them again. 
> Nevertheless, free disk space in HDFS decreased by much more than it should have. 
> While the psql session on the master was still open I investigated on one of 
> the slave machines.
> HDFS is stored on /mds:
> {noformat}
> [root@mds-hdp-04 ~]# ls -l /mds
> total 36
> drwxr-xr-x. 3 root      root    4096 Jun 14 04:23 falcon
> drwxr-xr-x. 3 root      root    4096 Jun 14 04:42 hdfs
> drwx------. 2 root      root   16384 Jun  8 02:48 lost+found
> drwxr-xr-x. 5 storm     hadoop  4096 Jun 14 04:45 storm
> drwxr-xr-x. 4 root      root    4096 Jun 14 04:43 yarn
> drwxr-xr-x. 2 zookeeper hadoop  4096 Jun 14 04:39 zookeeper
> [root@mds-hdp-04 ~]# df /mds
> Filesystem     1K-blocks      Used Available Use% Mounted on
> /dev/vdc       515928320 314560220 175137316  65% /mds
> [root@mds-hdp-04 ~]# du -s /mds
> 89918952      /mds
> {noformat}
> Note that there is a more than 200 GB difference between the disk space used 
> according to df and the sum of all files on that file system according to du.
> I have found the culprit to be several postgres processes running as gpadmin 
> and holding open file descriptors to deleted files. Here are the first few:
> {noformat}
> [root@mds-hdp-04 ~]# lsof +L1 | grep /mds/hdfs | head -10
> postgres 665334 gpadmin   18r   REG 253,32 134217728     0  9438234 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922482 (deleted)
> postgres 665334 gpadmin   34r   REG 253,32     24488     0  9438114 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922398 (deleted)
> postgres 665334 gpadmin   35r   REG 253,32       199     0  9438115 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922398_187044.meta (deleted)
> postgres 665334 gpadmin   37r   REG 253,32 134217728     0  9438208 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922446 (deleted)
> postgres 665334 gpadmin   38r   REG 253,32   1048583     0  9438209 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922446_187092.meta (deleted)
> postgres 665334 gpadmin   39r   REG 253,32   1048583     0  9438235 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922482_187128.meta (deleted)
> postgres 665334 gpadmin   40r   REG 253,32 134217728     0  9438262 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922555 (deleted)
> postgres 665334 gpadmin   41r   REG 253,32   1048583     0  9438263 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922555_187201.meta (deleted)
> postgres 665334 gpadmin   42r   REG 253,32 134217728     0  9438285 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir194/blk_1073922602 (deleted)
> postgres 665334 gpadmin   43r   REG 253,32   1048583     0  9438286 /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir194/blk_1073922602_187248.meta (deleted)
> {noformat}
> As soon as I close the psql session on the master, the disk space is freed on the slaves:
> {noformat}
> [root@mds-hdp-04 ~]# df /mds
> Filesystem     1K-blocks     Used Available Use% Mounted on
> /dev/vdc       515928320 89992720 399704816  19% /mds
> [root@mds-hdp-04 ~]# du -s /mds
> 89918952      /mds
> [root@mds-hdp-04 ~]# lsof +L1 | grep /mds/hdfs | head -10
> {noformat}
> I believe this to be a bug. At least to me it looks like very undesirable behavior.


