Oh, my copy paste skills failed me.
Here's output from HIVE_LOCKS:

# HL_LOCK_EXT_ID, HL_LOCK_INT_ID, HL_TXNID, HL_DB, HL_TABLE, HL_PARTITION,
HL_LOCK_STATE, HL_LOCK_TYPE, HL_LAST_HEARTBEAT, HL_ACQUIRED_AT, HL_USER,
HL_HOST
'8496355', '1', '78461824', 'default', 'data_http', 'dt=20160821', 'a',
'r', '0', '1471768489000', 'storm', 'sorm-data03.msk.mts.ru'

TXNS, TXN_COMPONENTS:
# TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru,
78461824, default, data_http, dt=201608213


Good news, I've found a log when it all started:

Here comes heartbeat:
2016-08-21 11:34:48,934 DEBUG [pool-3-thread-155]: txn.TxnHandler
(TxnHandler.java:heartbeatTxn(1882)) - Going to execute update <update TXNS
set txn_last_heartbeat = 1471768488000 where txn_id = 78461824 and
txn_state = 'o'>

Concurrently we are trying to abort transaction
2016-08-21 11:34:48,907 DEBUG [DeadTxnReaper-0]: txn.TxnHandler
(TxnHandler.java:abortTxns(1429)) - Going to execute update <delete from
HIVE_LOCKS where hl_txnid in
(78461824,78461825,78461826,78461827,78461828,78461829)>
2016-08-21 11:34:48,943 DEBUG [DeadTxnReaper-0]: txn.TxnHandler
(TxnHandler.java:abortTxns(1446)) - Going to execute update <update TXNS
set txn_state = 'a' where txn_state = 'o' and txn_id in
(78461824,78461825,78461826,78461827,78461828,78461829) and
txn_last_heartbeat < 1471764233000>
2016-08-21 11:34:48,957 DEBUG [pool-3-thread-11]: txn.TxnHandler
(TxnHandler.java:abortTxns(1429)) - Going to execute update <delete from
HIVE_LOCKS where hl_txnid in (78461824)>
2016-08-21 11:34:48,979 DEBUG [pool-3-thread-11]: txn.TxnHandler
(TxnHandler.java:abortTxns(1446)) - Going to execute update <update TXNS
set txn_state = 'a' where txn_state = 'o' and txn_id in (78461824)>

At this point transaction aborted and thers no lock.
But heartbeat thread still working and:
2016-08-21 11:34:49,025 DEBUG [pool-3-thread-155]: txn.TxnHandler
(TxnHandler.java:lock(1546)) - Going to execute update <insert into
TXN_COMPONENTS (tc_txnid, tc_database, tc_table, tc_partition) values
(78461824, 'default', 'data_http', 'dt=20160821')>
2016-08-21 11:34:49,027 DEBUG [pool-3-thread-155]: txn.TxnHandler
(TxnHandler.java:lock(1576)) - Going to execute update <insert into
HIVE_LOCKS  (hl_lock_ext_id, hl_lock_int_id, hl_txnid, hl_db, hl_table,
hl_partition, hl_lock_state, hl_lock_type, hl_last_heartbeat, hl_user,
hl_host) values (8496355, 1,78461824, 'default', 'data_http',
'dt=20160821', 'w', 'r', 0, 'storm', 'sorm-data03.msk.mts.ru')>
2016-08-21 11:34:49,029 DEBUG [pool-3-thread-155]: txn.TxnHandler
(TxnHandler.java:checkLock(1696)) -   lockid:8496355 intLockId:1
txnid:78461824 db:default table:data_http partition:dt=20160821
state:WAITING type:SHARED_READ


So I guess, that it's race condition between heartbeat thread and TxnReaper
thread. Last heartbeat information in HIVE_LOCKS table differs from TXNS
table.
Full log here
<https://drive.google.com/open?id=0ByB92PAoAkrKSFFaQjBNSnNXZFk>.

On Tue, Aug 23, 2016 at 8:20 PM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> your query is “select * from HIVE_LOCKS” but the output is not from
> HIVE_LOCKS.
> What entries do you have in HIVE_LOCKS for this txn_id?
>
> If all you see is an entry in TXN table in ‘a’ state – that is OK.  that
> just mean that this transaction was aborted.
>
> Eugene
>
> From: Igor Kuzmenko <f1she...@gmail.com>
> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
> Date: Monday, August 22, 2016 at 8:27 AM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Hive transaction doesn't release lock.
>
> Hello, I'm using Apache Hive 1.2.1 and Apache Storm to stream data in hive
> table.
> After making some tests I tried to truncate my table, but sql execution
> doesn't complete because of the lock on table:
>
>
> select * from HIVE_LOCKS;
>
>
> # TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
> TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
> '78461824', 'a', '1471762974000', '1471768488000', 'storm', 'sorm-data03.
> msk.mts.ru', '78461824', 'default', 'data_http', 'dt=20160821'
>
>
>
> At the same time:
>
> select * from TXNS, TXN_COMPONENTS
> where 1=1
> and TXNS.TXN_ID = TXN_COMPONENTS.TC_TXNID
> and TXNS.TXN_ID = 78461824
> ;
>
>
> # TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
> TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
> 78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru,
> 78461824, default, data_http, dt=20160821
>
>
> Txn 78461824 is in aborted state, but still keep lock on table
> 'data_http'. How can I avoid this? What should I do to get rid of that
> lock?
>
>

Reply via email to