Mike,
  That's a good point. My thoughts on this are that we lack the utilities
to help, since the five largish instances I've seen recently have
required their maintainers to edit the metadata table manually. Perhaps
CheckForMetadataProblems could prompt the user with suggested ways to fix
certain issues? I'd love to have more context too, but I'd
be more eager to learn of reasonably sized instances that have not had to
do this type of triage manually.
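
To make the idea concrete, here is a rough sketch in Python of the kind of check-plus-suggestion I have in mind. It is not how CheckForMetadataProblems is actually implemented. It assumes the ~tab:~pr entries for one table have already been scanned out in sorted order, as in Keith's example below, and that values are plain strings where \x00 means "no previous tablet" and \x01 prefixes the previous end row. The function names (parse_prev, end_row, find_holes) are made up for illustration.

```python
from typing import List, Optional, Tuple


def parse_prev(value: str) -> Optional[str]:
    """Decode a ~tab:~pr value: "\\x00" means no previous tablet,
    "\\x01" followed by a row means that row is the previous end row."""
    if value.startswith("\x00"):
        return None
    if value.startswith("\x01"):
        return value[1:]
    raise ValueError(f"unexpected prev-row encoding: {value!r}")


def end_row(row: str, table_id: str) -> Optional[str]:
    """Extract the end row from a metadata row such as "3;2222222".
    The last (default) tablet, "3<", has no end row."""
    if row == table_id + "<":
        return None
    prefix = table_id + ";"
    if row.startswith(prefix):
        return row[len(prefix):]
    raise ValueError(f"row {row!r} does not belong to table {table_id!r}")


def find_holes(table_id: str,
               entries: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Walk the tablets in sorted order and report every tablet whose
    prev-row pointer does not match the end row of the tablet before it.
    Returns (row, message, suggested shell command) tuples."""
    problems = []
    last_end: Optional[str] = None
    for row, value in entries:
        prev = parse_prev(value)
        if prev != last_end:
            fixed = "\\x00" if last_end is None else "\\x01" + last_end
            message = f"Table {table_id} has a hole {prev} != {last_end}"
            suggestion = f"insert {row} ~tab ~pr {fixed}"
            problems.append((row, message, suggestion))
        last_end = end_row(row, table_id)
    return problems


# The broken state from Keith's walkthrough below.
broken = [
    ("3;11111111", "\x00"),
    ("3;2222222", "\x0111111111"),
    ("3;3333333", "\x0111111111"),
    ("3<", "\x013333333"),
]
for row, message, suggestion in find_holes("3", broken):
    print(message)
    print("  suggested fix:", suggestion)
```

A real utility would have to read the metadata table directly and would need to take in-progress splits into account before suggesting any write, so treat this only as an outline of the prompt-with-a-fix idea.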

On Wed, Mar 1, 2017 at 7:56 AM, Michael Wall <mjw...@gmail.com> wrote:

> Matt,
>
> This sentence is concerning to me: "I've always removed the referenced
> tablet in the metadata table to fix this and had no issues in the past."  I
> rarely make edits to the metadata table and am very, very cautious when I
> do.  This should not be part of normal operating procedures.  Can you
> provide more context?
>
> Mike
>
> On Wed, Mar 1, 2017 at 12:23 AM, Dickson, Matt MR <
> matt.dick...@defence.gov.au> wrote:
>
>> UNOFFICIAL
>>
>> Thanks for that Keith,
>>
>> That's got it working again.  As for the cause, I had an error in the
>> logs stating a tablet was hosted and assigned.  I've always removed the
>> referenced tablet in the metadata table to fix this and had no issues in
>> the past.  It looks like I fat-fingered the deletion, which removed the
>> wrong entry, so this is not an issue with Accumulo.
>>
>> Thanks.
>>
>> -----Original Message-----
>> From: Keith Turner [mailto:ke...@deenlo.com]
>> Sent: Wednesday, 1 March 2017 03:36
>> To: user@accumulo.apache.org
>> Subject: Re: Fix "Table x has a hole" [SEC=UNOFFICIAL]
>>
>> Below are some commands that show how to recreate this problem and how
>> to fix it.   Each tablet in the metadata table has a pointer to the
>> previous tablet.  Adding or removing splits in a table changes these
>> pointers.
>>
>>   root@uno> createtable test
>>
>> Get the table's ID below; we will need it later.
>>
>>   root@uno test> tables -l
>>   accumulo.metadata    =>        !0
>>   accumulo.replication =>      +rep
>>   accumulo.root        =>        +r
>>   test                 =>         3
>>   trace                =>         1
>>
>> Add some splits and then scan the metadata table.  The pointers to the
>> previous tablet are in the ~tab:~pr column.  The scan below uses the table
>> id above.
>>
>>   root@uno test> addsplits 11111111 3333333
>>   root@uno test> scan -t accumulo.metadata -c ~tab:~pr -b 3; -e 3<
>>   3;11111111 ~tab:~pr []    \x00
>>   3;3333333 ~tab:~pr []    \x0111111111
>>   3< ~tab:~pr []    \x013333333
>>
>> Add another split and rescan the metadata table.
>>
>>   root@uno test> addsplits 2222222
>>   root@uno test> scan -t accumulo.metadata -c ~tab:~pr -b 3; -e 3<
>>   3;11111111 ~tab:~pr []    \x00
>>   3;2222222 ~tab:~pr []    \x0111111111
>>   3;3333333 ~tab:~pr []    \x012222222
>>   3< ~tab:~pr []    \x013333333
>>
>> Grant permission to write to the metadata table and then recreate the
>> problem you have.
>>
>>   root@uno test> grant Table.WRITE -u root -t accumulo.metadata
>>   root@uno test> table accumulo.metadata
>>   root@uno accumulo.metadata> insert 3;3333333 ~tab ~pr \x0111111111
>>   root@uno accumulo.metadata> scan -t accumulo.metadata -c ~tab:~pr -b 3; -e 3<
>>   3;11111111 ~tab:~pr []    \x00
>>   3;2222222 ~tab:~pr []    \x0111111111
>>   3;3333333 ~tab:~pr []    \x0111111111
>>   3< ~tab:~pr []    \x013333333
>>
>> If you ran check for metadata problems here, you should see the error
>> message you saw.  Below, the pointer is fixed and write permission is
>> revoked (to prevent accidental writes in the future).
>>
>>   root@uno accumulo.metadata> insert 3;3333333 ~tab ~pr \x012222222
>>   root@uno accumulo.metadata> revoke Table.WRITE -u root -t accumulo.metadata
>>   root@uno accumulo.metadata>
>>
>> After running the command above to fix the pointer, check for metadata
>> problems should be happy.
>>
>> It would be nice to try to track down the cause of this.  Splitting a
>> tablet involves three metadata operations.  For fault tolerance, the
>> columns ~tab:oldprevrow and ~tab:splitRatio are temporarily written.
>> If a tablet server dies in the middle of splitting a tablet, then
>> Accumulo will see these temporary columns and attempt to continue the
>> split.  So I am curious if you see these columns?
>>
>> On Sun, Feb 26, 2017 at 6:49 PM, Dickson, Matt MR <
>> matt.dick...@defence.gov.au> wrote:
>> > UNOFFICIAL
>> >
>> > Running the CheckForMetadataProblems on Accumulo is listing
>> >
>> > Table xxx has a hole 11111111 != 2222222
>> >
>> > Is there a correct way to repair this?
>> >
>> > Thanks in advance.
>>
>
>
