On Wed, Sep 7, 2016 at 8:55 PM, Dickson, Matt MR
<matt.dick...@defence.gov.au> wrote:
> UNOFFICIAL
>
>
> In ZooKeeper there don't appear to be any locks with the same txid that is
> listed via Accumulo. However, under /accumulo/xxxxxxxx/table_locks/+default/
> there are the same number of files as orphaned locks, labelled
> 'lock-000000xx'. Are these the locks I can delete?

Every table lock should have an associated fate transaction. The
fate print tool has analysis[1] to look for situations where this is
not the case. It seems this analysis is finding locks without
transactions. Looking at the code, I see there is a slight possibility
of a race condition. It reads the list of locks into memory and then
reads the transactions, so if a transaction completed and deleted its
lock between those steps, the tool could report a false positive. If
you run the tool multiple times and it reports the same thing, then
that race condition is not the cause.
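
To make the false-positive scenario concrete, here is a rough sketch of the check as described above. It is not the code from AdminUtil; the names `find_orphaned_locks` and `stable_orphans`, the two reader callables, and the lock-name-to-txid mapping are all hypothetical stand-ins for whatever actually reads ZooKeeper.

```python
def find_orphaned_locks(read_locks, read_txids):
    """Return names of locks whose txid has no matching transaction.

    read_locks and read_txids are hypothetical callables (assumptions,
    not Accumulo APIs). Locks are read first and transactions second,
    mirroring the order described above.
    """
    locks = read_locks()        # step 1: snapshot of lock name -> txid
    txids = set(read_txids())   # step 2: snapshot of transaction ids
    # A transaction that finished between step 1 and step 2 shows up
    # here as an orphan even though its lock is already gone.
    return {name for name, txid in locks.items() if txid not in txids}


def stable_orphans(read_locks, read_txids, runs=3):
    """Keep only the locks reported as orphaned on every run."""
    result = find_orphaned_locks(read_locks, read_txids)
    for _ in range(runs - 1):
        result &= find_orphaned_locks(read_locks, read_txids)
    return result
```

A lock that appears in only one of the runs was most likely a victim of the race; a lock that appears in all of them is probably a real orphan.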

Yes, you should delete the locks.  These orphaned locks can hold up
future table operations.  It would be nice to find out what caused
this.  You can grep for the fate txid related to the locks in the
master log to gather info about the fate tx that created the lock.
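
If plain grep is awkward to use, a small script along these lines would do the same job. This is only a sketch: the function name is made up, and the case-insensitive match is an assumption in case the txid is logged in a different hex case than the tool prints it.

```python
def lines_mentioning_txid(lines, txid):
    """Return the log lines that contain the given fate txid.

    lines can be any iterable of strings, for example an open log file.
    Matching is case-insensitive (an assumption about how the txid may
    be printed, not something confirmed from the Accumulo source).
    """
    needle = txid.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]
```

It can be called with an open file handle for the master log, passing the txid exactly as the fate print output shows it.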

Do you know if the tables whose IDs it's complaining about as having
locks were deleted? Do you know if any fate transactions were deleted
(manually in ZK or using Accumulo's tool)?


[1]: https://github.com/apache/accumulo/blob/rel/1.8.0/fate/src/main/java/org/apache/accumulo/fate/AdminUtil.java#L69

>
> I should note that while investigating this there were no other fate 
> transactions being listed by Accumulo for this table, +default, so the system 
> was in a stable state.
>
>
> -----Original Message-----
>
>
> From: Josh Elser [mailto:josh.el...@gmail.com]
> Sent: Thursday, 8 September 2016 01:07
> To: user@accumulo.apache.org
> Subject: Re: Orphaned FATE Locks [SEC=UNOFFICIAL]
>
> Hi Matt,
>
> What version of Accumulo are you using? Figuring out why those transactions
> aren't automatically getting removed is something else we would want to look into.
>
> It sounds like these transactions are just vestigial (not actually running), 
> so I wouldn't think that they would affect current bulk loads.
>
> I believe you could just stop the Master and remove the corresponding nodes 
> in ZooKeeper (as that's where the txns are stored and `fate print` is reading 
> from), but I would defer to Keith for confirmation first :)
>
> Dickson, Matt MR wrote:
>> *UNOFFICIAL*
>>
>> When running 'fate print -t IN_PROGRESS' to list fate transactions
>> there are approximately eight orphaned locks listed as:
>>
>> txid: xxxxxxxxxxxxxxx locked: [R:+default]
>>
>> I'm looking into these because bulk ingests are failing and there are
>> a lot of CopyFailed transactions in the fate lock list. Could these
>> orphaned locks block further bulk ingests, and is there a way to kill them?
>>
>> When I run 'fate fail xxxxxxxx' it states there is no fate transaction
>> associated with the transaction id.
>>
>> Thanks in advance,
>> Matt
