Good to know. Thanks for the feedback.

-Eric
On Mon, Feb 24, 2014 at 10:07 PM, Dickson, Matt MR <[email protected]> wrote:

> *UNOFFICIAL*
>
> For completion of this post, the solution was:
>
> After shutting down Accumulo and digging around, we found a cron-initiated process under an old user account. This was firing off a shard management task that ran large merges on ranges of old splits.
>
> After killing 15 of these processes, cleaning out the table_locks and fate directories in zookeeper, and then restarting Accumulo, there were no more locks and Accumulo is now running as expected, with bulk ingests succeeding.
>
> Thanks for all your help, Eric.
>
> ------------------------------
> *From:* Eric Newton [mailto:[email protected]]
> *Sent:* Tuesday, 25 February 2014 11:41
> *To:* [email protected]
> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>
> There is some bookkeeping in the metadata table, but the drivers are the FATE entries in zookeeper.
>
> If you have some clients running in the background, they will re-create their FATE requests on restart of the master.
>
> -Eric
>
> On Mon, Feb 24, 2014 at 6:14 PM, Dickson, Matt MR <[email protected]> wrote:
>
>> *UNOFFICIAL*
>>
>> Based on that we definitely need to alter our strategy.
>>
>> To come back to rectifying the issue, we tried deleting the fate and table_locks nodes in zookeeper and recreated them; however, when Accumulo restarted it went through and recreated the locks. In zookeeper there are now 22,500 table locks and 32,300 fate locks.
>>
>> Would the details of these locks be in the !METADATA table and therefore be removable? I hesitate to do any more removing via the zookeeper directories and the FATE commands, because it looks like Accumulo will just restore the locks anyway, and removing the locks via FATE will take several days.
>>
>> I've tried running operations like 'compact --cancel' in Accumulo, but these get stuck waiting on IO, which is presumably due to all the locks.
>>
>> ------------------------------
>> *From:* Eric Newton [mailto:[email protected]]
>> *Sent:* Tuesday, 25 February 2014 01:07
>> *To:* [email protected]
>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>
>> A merge should only take a few seconds, unless a tablet has files with data outside its range. This is normal for files that are bulk imported, or that are the product of splits.
>>
>> If you are planning to merge many tablets, compact the ranges of tablets you are planning to merge first. This ensures that the file data for each range is limited to that range. After that, the merge should only take a few seconds.
>>
>> Merges require a write lock on *the entire table*, and only one merge can run at a time. If you are merging many tablets, merging them into one giant tablet and letting it re-split might be faster.
>>
>> The merge mechanism is intended as a once-in-a-while housecleaning mechanism, not a daily activity. It would need to be much more efficient in terms of resources (locks) and data re-writing to use frequently.
>>
>> Merging and bulk-importing conflict over the same resources: the table lock, and files that contain data strictly within the range of a tablet. So doing them separately makes sense. The master should be able to figure out how to run them fairly, regardless.
>>
>> We only have the primitive FATE admin command to kill, delete and list the running FATE operations.
>>
>> -Eric
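As a rough illustration of the compact-then-merge sequence Eric describes (a sketch only; the table name and date-based row bounds below are placeholders, not values from this thread), the Accumulo shell commands would look something like:

    # rewrite the files in the range first, so each tablet's files hold only
    # in-range data; -w waits for the compaction to finish
    compact -t mytable -b 20140101 -e 20140107 -w
    # the merge over the same, now range-local, tablets should then take seconds
    merge -t mytable -b 20140101 -e 20140107

Since a merge takes a write lock on the whole table and only one merge runs at a time, ranges would have to be processed one after another rather than in parallel.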
>> On Sun, Feb 23, 2014 at 5:22 PM, Dickson, Matt MR <[email protected]> wrote:
>>
>>> *UNOFFICIAL*
>>>
>>> We have recently added functionality to merge old tablets, so these merges will be running while bulk ingest is going. The bulk ingest process checks for running compactions and will wait until there are none, but it does no co-ordination with running merges. The bulk ingest and merging are both automated: bulk ingest runs hourly and the merge of old tablets runs daily. Once we get bulk ingest working again, should we pause ingest while the merge operations are run, to avoid/minimise FATE operations?
>>>
>>> To help get the ingest working again, is there a way to list running merges? Is it possible to cancel a merge?
>>>
>>> Matt
>>>
>>> ------------------------------
>>> *From:* Eric Newton [mailto:[email protected]]
>>> *Sent:* Friday, 21 February 2014 15:49
>>> *To:* [email protected]
>>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>
>>> The locks are not a problem.
>>>
>>> The problem is the creation of FATE transactions which require locks.
>>>
>>> Why are you creating FATE operations? Are you merging tablets? Are you bulk importing while other table-wide operations are in progress? Are these processes automated?
>>>
>>> There is some bookkeeping in the !METADATA table for FATE transactions, but not the other way around.
>>>
>>> On Thu, Feb 20, 2014 at 10:54 PM, Dickson, Matt MR <[email protected]> wrote:
>>>
>>>> *UNOFFICIAL*
>>>>
>>>> Thanks for that.
>>>>
>>>> We recreated the nodes and restarted Accumulo, but it went through and added the locks back during start-up, so it appears Accumulo has knowledge of the locks, maybe in the metadata table(?), and has updated the fate locks in zookeeper. The issue of bulk ingest failing is still occurring.
>>>>
>>>> How can we investigate within Accumulo how it tracks these locks, so that we can flush this information as well or identify the issue?
>>>>
>>>> Matt
>>>>
>>>> ------------------------------
>>>> *From:* Eric Newton [mailto:[email protected]]
>>>> *Sent:* Friday, 21 February 2014 14:27
>>>> *To:* [email protected]
>>>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>>
>>>> Sorry... I should have been more clear.
>>>>
>>>> "-e" is for ephemeral, and these are not ephemeral nodes. "-s" creates a sequential node, which you don't need either; with neither flag you get a plain persistent node, which is what you want here.
>>>>
>>>> You can put anything in for the data; it is unimportant:
>>>>
>>>> cli> create /accumulo/xx.../fate foo
>>>> cli> create /accumulo/xx.../table_locks bar
>>>>
>>>> I think that you can give the zkCli.sh shell quotes for an empty string:
>>>>
>>>> cli> create /accumulo/xx.../fate ""
>>>>
>>>> But I can't remember if that works. Accumulo never reads the contents of those nodes, so anything you put in there will be ignored.
>>>>
>>>> The master may even re-create these nodes on start-up, but I did not test it.
>>>>
>>>> -Eric
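Putting Eric's steps together, a zkCli.sh session to wipe and recreate the two nodes would look roughly like this (a sketch; <instance-id> stands for the instance UUID shown at the top of the monitor page, and the data value is arbitrary because Accumulo never reads it):

    # remove the stuck lock state
    cli> rmr /accumulo/<instance-id>/fate
    cli> rmr /accumulo/<instance-id>/table_locks
    # recreate both as plain persistent znodes (no -s or -e); the data value is ignored
    cli> create /accumulo/<instance-id>/fate x
    cli> create /accumulo/<instance-id>/table_locks x

As the replies further up the thread show, this only helps if whatever is submitting the merges has been stopped first; otherwise the FATE entries are simply re-created when the master restarts.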
>>>> On Thu, Feb 20, 2014 at 6:18 PM, Dickson, Matt MR <[email protected]> wrote:
>>>>
>>>>> *UNOFFICIAL*
>>>>>
>>>>> After running the zkCli.sh rmr on the directories, we are having difficulties recreating the nodes.
>>>>>
>>>>> The zookeeper create command has two options, -s and -e, but it's not clear what each of these does or which one to use to recreate the accumulo node. Also, the create command requires a 'data' value to be specified, yet when we look at our QA system the accumulo node has no data within it.
>>>>>
>>>>> What is the zookeeper command to run to recreate the /accumulo/xx.../fate and /accumulo/xx.../table_locks nodes?
>>>>>
>>>>> ------------------------------
>>>>> *From:* Eric Newton [mailto:[email protected]]
>>>>> *Sent:* Friday, 21 February 2014 07:31
>>>>> *To:* [email protected]
>>>>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>>>
>>>>> No, xxx... is your instance id. You can find it at the top of the monitor page. It's the ugly UUID there.
>>>>>
>>>>> -Eric
>>>>>
>>>>> On Thu, Feb 20, 2014 at 3:26 PM, Dickson, Matt MR <[email protected]> wrote:
>>>>>
>>>>>> *UNOFFICIAL*
>>>>>>
>>>>>> Is the xxx... the transaction id returned by 'fate.Admin print'?
>>>>>>
>>>>>> What's involved with recreating a node?
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>> ------------------------------
>>>>>> *From:* Eric Newton [mailto:[email protected]]
>>>>>> *Sent:* Friday, 21 February 2014 01:35
>>>>>> *To:* [email protected]
>>>>>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>>>>
>>>>>> You can use the zkCli.sh utility to "rmr" /accumulo/xx.../fate and /accumulo/xx.../table_locks, and then recreate those nodes.
>>>>>>
>>>>>> -Eric
>>>>>>
>>>>>> On Wed, Feb 19, 2014 at 5:58 PM, Dickson, Matt MR <[email protected]> wrote:
>>>>>>
>>>>>>> *UNOFFICIAL*
>>>>>>>
>>>>>>> Thanks for your help on this, Eric.
>>>>>>>
>>>>>>> I've started deleting the transactions by running ./accumulo ...fate.Admin delete <txid>, and notice this takes about 20 seconds per transaction. With 7,500 to delete this is going to take a long time (almost 2 days), so I tried running several threads, each with a separate range of ids to delete. Unfortunately this seemed to have some contention and I kept receiving an InvocationTargetException .... Caused by zookeeper.KeeperException: KeeperErrorCode = NoNode for /accumulo/xxxxx-xxxx-xxxx-xxxx/table_locks/3n/lock-xxxxxx
>>>>>>>
>>>>>>> When I go back to one thread this error disappears.
>>>>>>>
>>>>>>> Is there a better way to run this?
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>> Matt
>>>>>>>
>>>>>>> ------------------------------
>>>>>>> *From:* Eric Newton [mailto:[email protected]]
>>>>>>> *Sent:* Wednesday, 19 February 2014 01:21
>>>>>>> *To:* [email protected]
>>>>>>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>>>>>
>>>>>>> The "LeaseExpiredException" is part of the recovery process. The master determines that a tablet server has lost its lock, or that it is unresponsive and has been halted, possibly indirectly by removing the lock.
>>>>>>>
>>>>>>> The master then steals the write lease on the WAL file, which causes future writes to the WALog to fail. The message you have seen is part of that failure. You should have seen a tablet server failure associated with this message on the machine with <ip>.
>>>>>>>
>>>>>>> Having 50K FATE IN_PROGRESS lines is bad. That is preventing your bulk imports from getting run.
>>>>>>>
>>>>>>> Are there any lines that show locked: [W:3n]? The other FATE transactions are waiting to get a READ lock on table id 3n.
>>>>>>>
>>>>>>> -Eric
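One way to answer that last question is to dump the FATE state once and grep it (a sketch; the Admin invocation is the one already used in this thread, and /tmp/fate.txt is just a hypothetical scratch file):

    # dump the FATE state, then inspect it
    ./accumulo org.apache.accumulo.server.fate.Admin print > /tmp/fate.txt
    # how many transactions are queued
    grep -c IN_PROGRESS /tmp/fate.txt
    # any transaction that actually HOLDS the write lock on table 3n,
    # as opposed to the thousands merely waiting on R:3n
    grep -F 'locked: [W:3n]' /tmp/fate.txt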
>>>>>>> On Sun, Feb 16, 2014 at 7:59 PM, Dickson, Matt MR <[email protected]> wrote:
>>>>>>>
>>>>>>>> UNOFFICIAL
>>>>>>>>
>>>>>>>> Josh,
>>>>>>>>
>>>>>>>> Zookeeper - 3.4.5-cdh4.3.0
>>>>>>>> Accumulo - 1.5.0
>>>>>>>> Hadoop - cdh 4.3.0
>>>>>>>>
>>>>>>>> In the accumulo console we are getting:
>>>>>>>>
>>>>>>>> ERROR RemoteException(...LeaseExpiredException): Lease mismatch on /accumulo/wal/<ip>+9997/<uid> owned by DFSClient_NONMAPREDUCE_699577321_12 but is accessed by DFSClient_NONMAPREDUCE_903051502_12
>>>>>>>>
>>>>>>>> We can scan the table without issues and can load rows directly, i.e. not using bulk import.
>>>>>>>>
>>>>>>>> A bit more information - we recently extended how we manage old tablets in the system. We load data by date, creating splits for each day, and then age off data using the ageoff filters. This leaves empty tablets, so we now merge these old tablets together to effectively remove them. I mention it because I'm not sure whether this might have introduced another issue.
>>>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Josh Elser [mailto:[email protected]]
>>>>>>>> Sent: Monday, 17 February 2014 11:32
>>>>>>>> To: [email protected]
>>>>>>>> Subject: Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>>>>>>
>>>>>>>> Matt,
>>>>>>>>
>>>>>>>> Can you provide Hadoop, ZK and Accumulo versions? Does the cluster appear to be functional otherwise (can you scan the table you're bulk importing to? any other errors on the monitor? etc.)
>>>>>>>>
>>>>>>>> On 2/16/14, 7:07 PM, Dickson, Matt MR wrote:
>>>>>>>> > *UNOFFICIAL*
>>>>>>>> >
>>>>>>>> > I have a situation where bulk ingests are failing with a "Thread 'shell' stuck on IO to xxx:9999:99999 ..." error.
>>>>>>>> >
>>>>>>>> > From the management console, the table we are loading to has no compactions running, yet we ran ./accumulo org.apache.accumulo.server.fate.Admin print and can see 50,000 lines stating:
>>>>>>>> >
>>>>>>>> > txid: xxxx status: IN_PROGRESS op: CompactRange locked: [] locking: [R:3n] top: CompactRange
>>>>>>>> >
>>>>>>>> > Does this mean there are actually compactions running, or are old compaction locks still hanging around that will be preventing the bulk ingest from running?
>>>>>>>> >
>>>>>>>> > Thanks in advance,
>>>>>>>> > Matt
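Finally, a minimal sketch of scripting the FATE deletions discussed earlier in the thread (txids.txt is a hypothetical file with one transaction id per line, taken from the Admin print output; as Matt found, running several of these loops in parallel contends on the table_locks nodes and fails with KeeperErrorCode = NoNode, so keep it to a single loop):

    # delete stuck FATE transactions one at a time
    while read txid; do
        ./accumulo org.apache.accumulo.server.fate.Admin delete "$txid"
    done < txids.txt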
