No, xxx... is your instance id. You can find it at the top of the monitor page. It's the ugly UUID there.
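So the zkCli.sh steps from the earlier mail would look roughly like this, with that UUID in place of the xx... (the ZooKeeper host, port and the empty node data are just placeholders for your own setup):

  $ zkCli.sh -server zkhost:2181
  rmr /accumulo/<instance-uuid>/fate
  rmr /accumulo/<instance-uuid>/table_locks
  create /accumulo/<instance-uuid>/fate ""
  create /accumulo/<instance-uuid>/table_locks ""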
-Eric

On Thu, Feb 20, 2014 at 3:26 PM, Dickson, Matt MR <[email protected]> wrote:

> UNOFFICIAL
>
> Is the xxx... the transaction id returned by the 'fate.Admin print'?
>
> What's involved with recreating a node?
>
> Matt
>
> ------------------------------
> From: Eric Newton [mailto:[email protected]]
> Sent: Friday, 21 February 2014 01:35
> To: [email protected]
> Subject: Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>
> You can use the zkCli.sh utility to "rmr" /accumulo/xx.../fate and
> /accumulo/xx.../table_locks, and then recreate those nodes.
>
> -Eric
>
> On Wed, Feb 19, 2014 at 5:58 PM, Dickson, Matt MR <[email protected]> wrote:
>
>> UNOFFICIAL
>>
>> Thanks for your help on this, Eric.
>>
>> I've started deleting the transactions by running ./accumulo
>> ...fate.Admin delete <txid>, and notice this takes about 20 seconds per
>> transaction. With 7500 to delete this is going to take a long time (almost
>> 2 days), so I tried running several threads, each with a separate range of
>> ids to delete. Unfortunately this seemed to have some contention and I
>> kept receiving an InvocationTargetException .... Caused by
>> zookeeper.KeeperException: KeeperErrorCode = NoNode for
>> /accumulo/xxxxx-xxxx-xxxx-xxxx/table_locks/3n/lock-xxxxxx
>>
>> When I go back to one thread this error disappears.
>>
>> Is there a better way to run this?
>>
>> Thanks in advance,
>> Matt
>>
>> ------------------------------
>> From: Eric Newton [mailto:[email protected]]
>> Sent: Wednesday, 19 February 2014 01:21
>> To: [email protected]
>> Subject: Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>
>> The "LeaseExpiredException" is part of the recovery process. The
>> master determines that a tablet server has lost its lock, or that it is
>> unresponsive and has been halted, possibly indirectly by removing the lock.
>>
>> The master then steals the write lease on the WAL file, which causes
>> future writes to the WALog to fail. The message you have seen is part of
>> that failure. You should have seen a tablet server failure associated with
>> this message on the machine with <ip>.
>>
>> Having 50K FATE IN_PROGRESS lines is bad. That is preventing your bulk
>> imports from getting run.
>>
>> Are there any lines that show locked: [W:3n]? The other FATE
>> transactions are waiting to get a READ lock on table id 3n.
>>
>> -Eric
>>
>> On Sun, Feb 16, 2014 at 7:59 PM, Dickson, Matt MR <[email protected]> wrote:
>>
>>> UNOFFICIAL
>>>
>>> Josh,
>>>
>>> ZooKeeper - 3.4.5-cdh4.3.0
>>> Accumulo - 1.5.0
>>> Hadoop - cdh 4.3.0
>>>
>>> In the Accumulo console we are getting:
>>>
>>> ERROR RemoteException(...LeaseExpiredException): Lease mismatch on
>>> /accumulo/wal/<ip>+9997/<uid> owned by DFSClient_NONMAPREDUCE_699577321_12
>>> but is accessed by DFSClient_NONMAPREDUCE_903051502_12
>>>
>>> We can scan the table without issues and can load rows directly, i.e. not
>>> using bulk import.
>>>
>>> A bit more information - we recently extended how we manage old tablets
>>> in the system. We load data by date, creating splits for each day, and then
>>> age data off using the ageoff filters. This leaves empty tablets, so we now
>>> merge these old tablets together to effectively remove them. I mention it
>>> because I'm not sure if this might have introduced another issue.
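>>> Roughly, that merge step is just the shell's merge command run over each
>>> expired day's range, something like this (table name and rows below are
>>> only placeholders):
>>>
>>>   root@instance> merge -t mytable -b 20140101 -e 20140102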
>>> >>> Matt >>> >>> -----Original Message----- >>> From: Josh Elser [mailto:[email protected]] >>> Sent: Monday, 17 February 2014 11:32 >>> To: [email protected] >>> Subject: Re: Failing to BulkIngest [SEC=UNOFFICIAL] >>> >>> Matt, >>> >>> Can you provide Hadoop, ZK and Accumulo versions? Does the cluster >>> appear to be functional otherwise (can you scan that table you're bulk >>> importing to? any other errors on the monitor? etc) >>> >>> On 2/16/14, 7:07 PM, Dickson, Matt MR wrote: >>> > *UNOFFICIAL* >>> > >>> > I have a situation where bulk ingests are failing with a "Thread >>> "shell" >>> > stuck on IO to xxx:9999:99999 ... >>> > From the management console the table we are loading to has no >>> > compactions running, yet we ran "./accumulo >>> > org.apache.accumulo.server.fate.Admin print and can see 50,000 lines >>> > stating >>> > txid: xxxx status:IN_PROGRESS op: CompactRange locked: [] >>> > locking: [R:3n] top: Compact:Range >>> > Does this mean there are actually compactions running or old >>> > comapaction locks still hanging around that will be preventing the >>> builk ingest to run? >>> > Thanks in advance, >>> > Matt >>> >> >> >
