[ 
https://issues.apache.org/jira/browse/HAWQ-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279759#comment-16279759
 ] 

Kuien Liu commented on HAWQ-1567:
---------------------------------

Thanks to [~yjin] and [~wlin], I will keep an eye on it, once it hangs again, 
more info may be available to locate the issue.

> Unknown process holds the lock causes DROP TABLE hangs forever
> --------------------------------------------------------------
>
>                 Key: HAWQ-1567
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1567
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Core
>            Reporter: Kuien Liu
>            Assignee: Radar Lei
>
> On Hawq 2.2.0.0-incubating (Jun 2017), we meet several times that query is 
> hanging for long time:
> # 1. DROP TABLE hangs for tens of minutes, because it waits for 
> AccessExclusiveLock.
> # 2. BUT the lock is held by a ghost process ( not alive, and little message  
> in log file is availabe to know what's up)
> A detailed context is pasted:
> postgres=# select procpid, sess_id, usesysid, xact_start, waiting, 
> current_query from pg_stat_activity where current_query <> '<IDLE>';
>  procpid | sess_id | usesysid |          xact_start           | waiting |     
>                                                                               
>                                                               current_query
> ---------+---------+----------+-------------------------------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>    91321 |  120242 |   328199 | 2017-11-28 14:45:52.631739+08 | t       |  
> drop table if exists ads_is_svc_rcv_approval_detail_df
> postgres=# select * from pg_locks where pid = 91321;
>    locktype    | database | relation | page | tuple | transactionid | classid 
> | objid | objsubid | transaction |  pid  |        mode         | granted | 
> mppsessionid | mppiswriter | gp_segment_id
> ---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-------+---------------------+---------+--------------+-------------+---------------
>  transactionid |          |          |      |       |      21867785 |         
> |       |          |    21867785 | 91321 | ExclusiveLock       | t       |    
>    120242 | f           |        -10000
>  relation      |    16510 |     2608 |      |       |               |         
> |       |          |    21867785 | 91321 | RowExclusiveLock    | t       |    
>    120242 | f           |        -10000
>  relation      |    16510 |     1259 |      |       |               |         
> |       |          |    21867785 | 91321 | RowExclusiveLock    | t       |    
>    120242 | f           |        -10000
>  relation      |    16510 |  3212612 |      |       |               |         
> |       |          |    21867785 | 91321 | AccessExclusiveLock | f       |    
>    120242 | f           |        -10000
>  relation      |    16510 |     1247 |      |       |               |         
> |       |          |    21867785 | 91321 | RowExclusiveLock    | t       |    
>    120242 | f           |        -10000
> (5 rows)
> postgres=# select * from pg_locks where relation = 3212612;
>  locktype | database | relation | page | tuple | transactionid | classid | 
> objid | objsubid | transaction |  pid   |        mode         | granted | 
> mppsessionid | mppiswriter | gp_segment_id
> ----------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+--------+---------------------+---------+--------------+-------------+---------------
>  relation |    16510 |  3212612 |      |       |               |         |    
>    |          |    21867785 |  91321 | AccessExclusiveLock | f       |       
> 120242 | f           |        -10000
>  relation |    16510 |  3212612 |      |       |               |         |    
>    |          |           0 | 107940 | AccessShareLock     | t       |       
> 120553 | f           |        -10000
> (2 rows)
> postgres=# select * from pg_stat_activity where procpid = 107940;
>  datid | datname | procpid | sess_id | usesysid | usename | current_query | 
> waiting | query_start | backend_start | client_addr | client_port | 
> application_name | xact_start | waiting_resource
> -------+---------+---------+---------+----------+---------+---------------+---------+-------------+---------------+-------------+-------------+------------------+------------+------------------
> (0 rows)
> postgres=# select * from pg_locks  where pid = 107940 or mppsessionid = 
> 120553;
>  locktype | database | relation | page | tuple | transactionid | classid | 
> objid | objsubid | transaction | pid | mode | granted | mppsessionid | 
> mppiswriter | gp_segment_id
> ----------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-----+------+---------+--------------+-------------+---------------
> (0 rows)
> postgres=# select logtime, logpid, logmessage from 
> hawq_toolkit.__hawq_log_master_latest where logsession = 'con120553';
>             logtime            | logpid  |                   logmessage
> -------------------------------+---------+-------------------------------------------------
>  2017-11-28 15:14:38.277254+08 | p107940 | clean up communication to resource 
> manager now.
>  2017-11-28 15:14:38.322206+08 | p107940 | generateResourceRefreshHeartBeat 
> exits.
>  2017-11-28 15:14:38.388077+08 | p107940 | APSARA uninit:DONE
> (3 rows)
> In the master log file, it is not easy to know who is accessing the relation.
> To [~yjin]  and all: any suggestion on it? thanks a lot.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to