There is some bookkeeping in the metadata table, but the drivers are the FATE entries in zookeeper.
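You can confirm that by looking at those entries directly; a rough sketch, using
the thread's xx... placeholder for the instance id (the UUID at the top of the
monitor page) and cli> for the zkCli.sh prompt:

cli> ls /accumulo/xx.../fate
cli> ls /accumulo/xx.../table_locks/3n

./accumulo org.apache.accumulo.server.fate.Admin print | grep -c IN_PROGRESS

If those counts keep climbing while your ingest and merge jobs are stopped,
something is still submitting work.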
If you have some clients running in the background, they will re-create
their FATE requests on restart of the master.

-Eric

On Mon, Feb 24, 2014 at 6:14 PM, Dickson, Matt MR <
[email protected]> wrote:

> *UNOFFICIAL*
> Based on that we definitely need to alter our strategy.
>
> To come back to rectifying the issue, we tried deleting the fate and
> table_locks in zookeeper and recreated the nodes, however when Accumulo
> restarted it went through and recreated the locks. In zookeeper there are
> now 22,500 table locks and 32,300 fate locks.
>
> Would the details of these locks be in the !METADATA table and therefore
> be removable? I hesitate to do any more removing via the zookeeper
> directories and the FATE commands because it looks like Accumulo will just
> restore the locks anyway, and removing the locks via FATE will take several
> days.
>
> I've tried running operations like 'compact --cancel' in Accumulo but
> these get stuck waiting on IO, which is presumably due to all the locks.
>
> ------------------------------
> *From:* Eric Newton [mailto:[email protected]]
> *Sent:* Tuesday, 25 February 2014 01:07
> *To:* [email protected]
> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>
> A merge should only take a few seconds, unless a tablet has files with
> data outside its range. This is normal for files that are bulk imported
> or the product of splits.
>
> If you are planning to merge many tablets, compact the ranges of tablets
> you are planning to merge. This will ensure that the file data for each
> range is limited to that range. After that, the merge should only take a
> few seconds.
>
> Merges require a write lock on *the entire table*, and only one merge can
> run at a time. If you are merging many tablets, merging them into one
> giant tablet and letting it re-split might be faster.
>
> Merging is intended as a once-in-a-while housecleaning operation, not a
> daily activity. It would need to be a lot more efficient in terms of
> resources (locks) and data re-writing to use frequently.
>
> Merging and bulk-importing are in conflict over the same resources: the
> table lock and files that contain data strictly within the range of the
> tablet. So, doing them separately makes sense. The master should be able
> to figure out how to run them fairly, regardless.
>
> We only have the primitive FATE admin command to kill, delete and list the
> running FATE operations.
>
> -Eric
>
> On Sun, Feb 23, 2014 at 5:22 PM, Dickson, Matt MR <
> [email protected]> wrote:
>
>> *UNOFFICIAL*
>> We have recently added functionality to merge old tablets, so these will
>> be running while bulk ingest is in progress. The bulk ingest process checks
>> for running compactions and will wait until there are none, but does no
>> co-ordination with running merges. The bulk ingest and merging are
>> automated: bulk ingest runs hourly and the merge of old tablets runs
>> daily. Once we get bulk ingest working again, should we pause ingest while
>> the merge operations are run to avoid/minimise FATE operations?
>>
>> To help get the ingest working again, is there a way to list running
>> merges? Is it possible to cancel a merge?
>>
>> Matt
>> ------------------------------
>> *From:* Eric Newton [mailto:[email protected]]
>> *Sent:* Friday, 21 February 2014 15:49
>> *To:* [email protected]
>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>
>> The locks are not a problem.
>>
>> The problem is the creation of FATE transactions which require locks.
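>>
>> A quick way to see what kind of transactions are piling up is to tally the
>> op column of the FATE listing; a rough sketch (the exact print format can
>> vary a little, so adjust the pattern if it doesn't match):
>>
>> ./accumulo org.apache.accumulo.server.fate.Admin print | grep -o 'op: [A-Za-z]*' | sort | uniq -c
>>
>> That will show whether these are all CompactRange operations or whether
>> merges and bulk imports are queuing up behind them as well.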
>>
>> Why are you creating FATE operations? Are you merging tablets? Are you
>> bulk importing while other table-wide operations are in progress? Are these
>> processes automated?
>>
>> There is some bookkeeping in the !METADATA table for FATE transactions,
>> but not the other way around.
>>
>> On Thu, Feb 20, 2014 at 10:54 PM, Dickson, Matt MR <
>> [email protected]> wrote:
>>
>>> *UNOFFICIAL*
>>> Thanks for that.
>>>
>>> We recreated the nodes and restarted Accumulo, but it went through and
>>> added the locks back during start up, so it appears Accumulo has knowledge
>>> of the locks, maybe in the metadata table(?), and has updated the fate
>>> locks in zookeeper. The issue of bulk ingest failing is still occurring.
>>>
>>> How can we investigate within Accumulo how it tracks these locks so that
>>> we can flush this information also or identify the issue?
>>>
>>> Matt
>>> ------------------------------
>>> *From:* Eric Newton [mailto:[email protected]]
>>> *Sent:* Friday, 21 February 2014 14:27
>>> *To:* [email protected]
>>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>
>>> Sorry... I should have been clearer.
>>>
>>> "-e" is for ephemeral; these are not ephemeral nodes. I think "-s" is
>>> the default, so you don't need to specify it.
>>>
>>> You can put anything in for the data... it is unimportant:
>>>
>>> cli> create /accumulo/xx.../fate foo
>>> cli> create /accumulo/xx.../table_locks bar
>>>
>>> I think that you can give the zkCli.sh shell quotes for an empty string:
>>>
>>> cli> create /accumulo/xx.../fate ""
>>>
>>> But I can't remember if that works. Accumulo never reads the contents
>>> of those nodes, so anything you put in there will be ignored.
>>>
>>> The master may even re-create these nodes on start-up, but I did not
>>> test it.
>>>
>>> -Eric
>>>
>>> On Thu, Feb 20, 2014 at 6:18 PM, Dickson, Matt MR <
>>> [email protected]> wrote:
>>>
>>>> *UNOFFICIAL*
>>>> After running the zkCli.sh rmr on the directories, we are
>>>> having difficulties recreating the nodes.
>>>>
>>>> The zookeeper create command has two options, -s and -e, but it's not
>>>> clear what each of these does and which one to use to recreate the accumulo
>>>> node. Also, the create command requires a 'data' argument, however when we
>>>> look at our qa system the accumulo node has no data within it.
>>>>
>>>> What is the zookeeper command to run to recreate the /accumulo/xx.../fate
>>>> and /accumulo/xx.../table_locks nodes?
>>>>
>>>> ------------------------------
>>>> *From:* Eric Newton [mailto:[email protected]]
>>>> *Sent:* Friday, 21 February 2014 07:31
>>>> *To:* [email protected]
>>>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>>
>>>> No, xxx... is your instance id. You can find it at the top of the
>>>> monitor page. It's the ugly UUID there.
>>>>
>>>> -Eric
>>>>
>>>> On Thu, Feb 20, 2014 at 3:26 PM, Dickson, Matt MR <
>>>> [email protected]> wrote:
>>>>
>>>>> *UNOFFICIAL*
>>>>> Is the xxx... the transaction id returned by the 'fate.Admin print'?
>>>>>
>>>>> What's involved with recreating a node?
>>>>>
>>>>> Matt
>>>>>
>>>>> ------------------------------
>>>>> *From:* Eric Newton [mailto:[email protected]]
>>>>> *Sent:* Friday, 21 February 2014 01:35
>>>>> *To:* [email protected]
>>>>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>>>
>>>>> You can use the zkCli.sh utility to "rmr" /accumulo/xx.../fate and
>>>>> /accumulo/xx.../table_locks, and then recreate those nodes.
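>>>>>
>>>>> Roughly, from the zkCli.sh prompt (xx... is your instance id; the node
>>>>> data is never read, so any placeholder value will do):
>>>>>
>>>>> cli> rmr /accumulo/xx.../fate
>>>>> cli> rmr /accumulo/xx.../table_locks
>>>>> cli> create /accumulo/xx.../fate x
>>>>> cli> create /accumulo/xx.../table_locks x
>>>>>
>>>>> It is probably safest to do this with the master stopped, and anything
>>>>> that was in flight will be lost.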
>>>>>
>>>>> -Eric
>>>>>
>>>>> On Wed, Feb 19, 2014 at 5:58 PM, Dickson, Matt MR <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> *UNOFFICIAL*
>>>>>> Thanks for your help on this Eric.
>>>>>>
>>>>>> I've started deleting the transactions by running ./accumulo
>>>>>> ...fate.Admin delete <txid>, and noticed this takes about 20 seconds per
>>>>>> transaction. With 7500 to delete this is going to take a long time (almost
>>>>>> 2 days), so I tried running several threads, each with a separate range of
>>>>>> ids to delete. Unfortunately this seemed to have some contention and I
>>>>>> kept receiving an InvocationTargetException .... Caused by
>>>>>> zookeeper.KeeperException: KeeperErrorCode = NoNode for
>>>>>> /accumulo/xxxxx-xxxx-xxxx-xxxx/table_locks/3n/lock-xxxxxx
>>>>>>
>>>>>> When I go back to one thread this error disappears.
>>>>>>
>>>>>> Is there a better way to run this?
>>>>>>
>>>>>> Thanks in advance,
>>>>>> Matt
>>>>>>
>>>>>> ------------------------------
>>>>>> *From:* Eric Newton [mailto:[email protected]]
>>>>>> *Sent:* Wednesday, 19 February 2014 01:21
>>>>>> *To:* [email protected]
>>>>>> *Subject:* Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>>>>
>>>>>> The "LeaseExpiredException" is part of the recovery process. The
>>>>>> master determines that a tablet server has lost its lock, or that it is
>>>>>> unresponsive and has been halted, possibly indirectly by removing the
>>>>>> lock.
>>>>>>
>>>>>> The master then steals the write lease on the WAL file, which causes
>>>>>> future writes to the WALog to fail. The message you have seen is part of
>>>>>> that failure. You should have seen a tablet server failure associated with
>>>>>> this message on the machine with <ip>.
>>>>>>
>>>>>> Having 50K FATE IN_PROGRESS lines is bad. That is preventing your
>>>>>> bulk imports from getting run.
>>>>>>
>>>>>> Are there any lines that show locked: [W:3n]? The other FATE
>>>>>> transactions are waiting to get a READ lock on table id 3n.
>>>>>>
>>>>>> -Eric
>>>>>>
>>>>>> On Sun, Feb 16, 2014 at 7:59 PM, Dickson, Matt MR <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> UNOFFICIAL
>>>>>>>
>>>>>>> Josh,
>>>>>>>
>>>>>>> Zookeeper - 3.4.5-cdh4.3.0
>>>>>>> Accumulo - 1.5.0
>>>>>>> Hadoop - cdh 4.3.0
>>>>>>>
>>>>>>> In the accumulo console we are getting
>>>>>>>
>>>>>>> ERROR RemoteException(...LeaseExpiredException): Lease mismatch on
>>>>>>> /accumulo/wal/<ip>+9997/<uid> owned by DFSClient_NONMAPREDUCE_699577321_12
>>>>>>> but is accessed by DFSClient_NONMAPREDUCE_903051502_12
>>>>>>>
>>>>>>> We can scan the table without issues and can load rows directly, i.e.
>>>>>>> not using bulk import.
>>>>>>>
>>>>>>> A bit more information - we recently extended how we manage old
>>>>>>> tablets in the system. We load data by date, creating splits for each day,
>>>>>>> and then age off data using the ageoff filters. This leaves empty tablets,
>>>>>>> so we now merge these old tablets together to effectively remove them. I
>>>>>>> mention it because I'm not sure if this might have introduced another
>>>>>>> issue.
>>>>>>>
>>>>>>> Matt
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Josh Elser [mailto:[email protected]]
>>>>>>> Sent: Monday, 17 February 2014 11:32
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: Failing to BulkIngest [SEC=UNOFFICIAL]
>>>>>>>
>>>>>>> Matt,
>>>>>>>
>>>>>>> Can you provide Hadoop, ZK and Accumulo versions?
>>>>>>> Does the cluster appear to be functional otherwise? (Can you scan that
>>>>>>> table you're bulk importing to? Any other errors on the monitor? etc.)
>>>>>>>
>>>>>>> On 2/16/14, 7:07 PM, Dickson, Matt MR wrote:
>>>>>>> > *UNOFFICIAL*
>>>>>>> >
>>>>>>> > I have a situation where bulk ingests are failing with a 'Thread
>>>>>>> > "shell" stuck on IO to xxx:9999:99999 ...' error.
>>>>>>> > From the management console the table we are loading to has no
>>>>>>> > compactions running, yet we ran "./accumulo
>>>>>>> > org.apache.accumulo.server.fate.Admin print" and can see 50,000 lines
>>>>>>> > stating
>>>>>>> > txid: xxxx status:IN_PROGRESS op: CompactRange locked: []
>>>>>>> > locking: [R:3n] top: Compact:Range
>>>>>>> > Does this mean there are actually compactions running, or old
>>>>>>> > compaction locks still hanging around that will be preventing the
>>>>>>> > bulk ingest from running?
>>>>>>> > Thanks in advance,
>>>>>>> > Matt
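A note on reading that fate.Admin print output: "locking" is the lock a
transaction is still waiting to acquire, while "locked" is the lock it already
holds, so 50,000 lines showing locked: [] locking: [R:3n] are all queued behind
whatever currently holds the write lock on table 3n, most likely one of the
automated merges. Two rough sketches (table id 3n as in this thread; txids.txt
is a hypothetical file of transaction ids, one per line, pulled from the print
output):

./accumulo org.apache.accumulo.server.fate.Admin print | grep 'W:3n'

for tx in $(cat txids.txt); do
  ./accumulo org.apache.accumulo.server.fate.Admin delete "$tx"
done

The first finds the transaction holding the table write lock; the second clears
transactions with the admin tool from a single process, which avoids the NoNode
contention seen when several delete threads ran in parallel.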
