Re: Orphaned FATE Locks [SEC=UNOFFICIAL]
On Wed, Sep 7, 2016 at 8:55 PM, Dickson, Matt MR wrote:
> UNOFFICIAL
>
> In ZooKeeper there don't appear to be any locks with the same txid that is
> listed via Accumulo. However, under /accumulo//table_locks/+default/
> there are the same number of files as orphaned locks, labelled
> 'lock-00xx'. Are these the locks I can delete?

Every table lock should have an associated fate transaction. The fate print tool has an analysis[1] that looks for situations where this is not the case. It seems this analysis is finding locks without transactions.

Looking at the code, I see there is a slight possibility of a race condition. It reads the list of locks into memory and then reads the transactions. So if a transaction completed and deleted its lock between those two steps, the tool could report a false positive. If you run the tool multiple times and it reports the same thing, then that race condition is not the cause.

Yes, you should delete the locks. These orphaned locks can hold up future table operations.

It would be nice to find out what caused this. You can grep for the fate txid related to the locks in the master log to gather info about the fate tx that created the lock. Do you know if the tables whose IDs it's complaining about as having locks were deleted? Do you know if fate transactions were deleted (manually in ZK or using Accumulo's tool)?

[1]: https://github.com/apache/accumulo/blob/rel/1.8.0/fate/src/main/java/org/apache/accumulo/fate/AdminUtil.java#L69

> I should note that while investigating this there were no other fate
> transactions being listed by Accumulo for this table, +default, so the system
> was in a stable state.
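The race described in this reply, and the advice to rerun the tool, can be illustrated with a small Java sketch. This is not the AdminUtil code; the class, the method names `findOrphans` and `stableOrphans`, and the txids are all hypothetical, and the two sets simply stand in for the lock list and the transaction list that the tool reads one after the other.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not taken from Accumulo's AdminUtil.
class OrphanLockCheck {

    // A lock is reported as orphaned when its txid has no matching
    // fate transaction in the (later) transaction snapshot.
    static Set<String> findOrphans(Set<String> lockTxids, Set<String> fateTxids) {
        Set<String> orphans = new TreeSet<>(lockTxids);
        orphans.removeAll(fateTxids);
        return orphans;
    }

    // Keep only the txids that were reported in every run. A false
    // positive caused by the race should drop out on a later run.
    static Set<String> stableOrphans(List<Set<String>> runs) {
        Set<String> stable = new TreeSet<>();
        if (runs.isEmpty()) {
            return stable;
        }
        stable.addAll(runs.get(0));
        for (Set<String> run : runs) {
            stable.retainAll(run);
        }
        return stable;
    }

    public static void main(String[] args) {
        // Run 1: locks are read first, then tx "3c4d" completes and removes
        // its lock and its transaction before the transactions are read.
        Set<String> run1 = findOrphans(
                new TreeSet<>(Arrays.asList("1a2b", "3c4d")),
                new TreeSet<String>());

        // Run 2: only the genuinely orphaned lock is left.
        Set<String> run2 = findOrphans(
                new TreeSet<>(Arrays.asList("1a2b")),
                new TreeSet<String>());

        System.out.println("run 1 reports: " + run1);
        System.out.println("run 2 reports: " + run2);
        System.out.println("reported every time: "
                + stableOrphans(Arrays.asList(run1, run2)));
    }
}
```

In this sketch the first run reports both txids because its lock snapshot is stale, while only the lock that is reported on every run is a candidate for manual deletion.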
RE: Orphaned FATE Locks [SEC=UNOFFICIAL]
UNOFFICIAL

Josh,

Have you had an opportunity to run this by Keith?

-Original Message-
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: Thursday, 8 September 2016 12:21
To: user@accumulo.apache.org
Subject: Re: Orphaned FATE Locks [SEC=UNOFFICIAL]

Ah, this would be why I deferred to Keith. I apparently am not as knowledgeable as I thought :)

I'll try to catch him in IRC tmrw and see if we can get you an answer. Otherwise, I'll have to go digging into code to try to figure out an answer.
Re: Orphaned FATE Locks [SEC=UNOFFICIAL]
Ah, this would be why I deferred to Keith. I apparently am not as knowledgeable as I thought :)

I'll try to catch him in IRC tmrw and see if we can get you an answer. Otherwise, I'll have to go digging into code to try to figure out an answer.

Dickson, Matt MR wrote:
> UNOFFICIAL
>
> In ZooKeeper there don't appear to be any locks with the same txid that is
> listed via Accumulo. However, under /accumulo//table_locks/+default/
> there are the same number of files as orphaned locks, labelled
> 'lock-00xx'. Are these the locks I can delete?
>
> I should note that while investigating this there were no other fate
> transactions being listed by Accumulo for this table, +default, so the system
> was in a stable state.
RE: Orphaned FATE Locks [SEC=UNOFFICIAL]
UNOFFICIAL

In ZooKeeper there don't appear to be any locks with the same txid that is listed via Accumulo. However, under /accumulo//table_locks/+default/ there are the same number of files as orphaned locks, labelled 'lock-00xx'. Are these the locks I can delete?

I should note that while investigating this there were no other fate transactions being listed by Accumulo for this table, +default, so the system was in a stable state.

-Original Message-
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: Thursday, 8 September 2016 01:07
To: user@accumulo.apache.org
Subject: Re: Orphaned FATE Locks [SEC=UNOFFICIAL]

Hi Matt,

What version of Accumulo are you using? Figuring out why those transactions aren't automatically getting removed is something else we would want to look into.

It sounds like these transactions are just vestigial (not actually running), so I wouldn't think that they would affect current bulk loads.

I believe you could just stop the Master and remove the corresponding nodes in ZooKeeper (as that's where the txns are stored and `fate print` is reading from), but I would defer to Keith for confirmation first :)
Re: Orphaned FATE Locks [SEC=UNOFFICIAL]
Hi Matt,

What version of Accumulo are you using? Figuring out why those transactions aren't automatically getting removed is something else we would want to look into.

It sounds like these transactions are just vestigial (not actually running), so I wouldn't think that they would affect current bulk loads.

I believe you could just stop the Master and remove the corresponding nodes in ZooKeeper (as that's where the txns are stored and `fate print` is reading from), but I would defer to Keith for confirmation first :)

Dickson, Matt MR wrote:
> *UNOFFICIAL*
>
> When running 'fate print -t IN_PROGRESS' to list fate transactions
> there are approximately eight orphaned locks listed as:
>
> txid: xxx locked: [R:+default]
>
> I'm looking into these because bulk ingests are failing and there are
> a lot of CopyFailed transactions in the fate lock list. Could these
> orphaned locks block further bulk ingests, and is there a way to kill them?
>
> When I run 'fate fail ' it states there is no fate transaction
> associated with the transaction id.
>
> Thanks in advance,
> Matt