CheckForMetadataProblems is limited, and I'm not sure it checks for many
different types of problems.
If you have tablets still in the metadata table for tables that no longer
exist, that indicates you probably had some sort of crash and possible
corruption of your metadata.
A command to automatically prune these would probably be dangerous:
running it during a transient ZooKeeper problem, for example, could
end up deleting all your tables, which would be bad. So although it is
also risky, manual surgery on the metadata table to remove these entries, as
you suggested, is probably the best option.
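
Before deleting anything by hand, it may help to list exactly which
metadata rows belong to table IDs that no longer exist. Below is a minimal
Python sketch of that check, not an Accumulo tool. It assumes you have saved
the text output of a metadata scan and collected the IDs of the tables that
still exist (for instance from the shell's table listing), that tablet rows
look like "tableId;endRow" or "tableId<", and that rows contain no
whitespace. The function names and the sample lines are made up for
illustration. It only reports, it never deletes, and it refuses to run with
an empty list of live tables, which is the transient-ZooKeeper failure mode
described above.

```python
def table_id_of(row):
    """Return the table ID of a metadata tablet row, or None for other rows.

    Assumes tablet rows are "<tableId>;<endRow>" or "<tableId><" (last tablet)
    and that rows starting with "~" are not tablets.
    """
    if row.startswith("~"):
        return None
    head, sep, _ = row.partition(";")
    if sep:
        return head
    if row.endswith("<"):
        return row[:-1]
    return None


def orphaned_rows(scan_lines, live_table_ids):
    """Map each table ID that is not live to the sorted tablet rows it still has."""
    if not live_table_ids:
        # An empty list more likely means the lookup failed than that
        # every table is gone, so stop rather than flag everything.
        raise ValueError("no live table IDs supplied; refusing to continue")
    orphans = {}
    for line in scan_lines:
        parts = line.split()
        if not parts:
            continue
        row = parts[0]
        table_id = table_id_of(row)
        if table_id is None or table_id in live_table_ids:
            continue
        orphans.setdefault(table_id, set()).add(row)
    return {tid: sorted(rows) for tid, rows in orphans.items()}


# Made-up sample lines in the "row family:qualifier [] value" scan format.
sample = [
    "2;m loc:1000041fbf00006 []    host1:9997",
    "2< srv:dir []    t-0000001",
    "9;a srv:dir []    t-0000002",
    "9< srv:dir []    t-0000003",
    "~del/9/t-0000002 : []    ",
]

if __name__ == "__main__":
    for tid, rows in orphaned_rows(sample, {"2"}).items():
        print(tid, rows)
```

The output is only a candidate list to review. The removal itself would
still be the manual step discussed above, done one table ID at a time after
checking that the ID really is gone.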


On Tue, Oct 6, 2020 at 12:03 PM Hart, Andrew <and.h...@cgi.com> wrote:

> I am still trying to find the one “unloaded tablet” that is preventing the
> cluster from balancing; however, there are a lot of unassigned tablets.
>
>
>
> I have been getting rid of them by onlining tables and completing failed
> table deletes but I am still left with many tablets that are unassigned.
> They seem to be mostly from old deleted tables and so I am not sure why
> they are there at all.
>
> The unassigned tablets are shown in accumulo
> org.apache.accumulo.server.util.FindOfflineTablets and in accumulo admin
> checkTablets
>
> And as I said, some are assigned to a dead server, but the server
> isn’t actually dead at all.
>
>
>
> CheckForMetadataProblems reports “All is well”
>
>
>
> I thought that if I could clear up this mess I could then eventually get
> to just one unassigned tablet which would be the “1 tablets are unloaded”
> one.  (I would then clone the table or copy the data out or something)
>
>
>
> So the problem remains.  The cluster doesn’t balance due to migrations.  I
> don’t find a tablet with a future entry and I can’t find it in unassigned
> or offline tablets due to the large number of other (presumably defunct)
> tablets with unassigned problems in tables that no longer exist.
>
>
>
> There are warnings in the documentation about manually editing the
> accumulo metadata table but it seems that the only option is to go in with
> a deletemany on any rows that start with an old deleted table.  There does
> not seem to be an “accumulo admin pruneDefunctTablets -t tid” command! :D
>
>
>
>
>
>
>
> *From:* Mike Miller <mmil...@apache.org>
> *Sent:* 06 October 2020 16:27
> *To:* user@accumulo.apache.org
> *Subject:* Re: Continuous tablets unloaded and fails to balance from
> accumulo master
>
>
>
> EXTERNAL SENDER: Do not click any links or open any attachments unless
> you trust the sender and know the content is safe.
> EXPÉDITEUR EXTERNE: Ne cliquez sur aucun lien et n’ouvrez aucune pièce
> jointe à moins qu’ils ne proviennent d’un expéditeur fiable, ou que vous
> ayez l'assurance que le contenu provient d'une source sûre.
>
>
>
> Do you want to merge old tablets that don't exist anymore?  I am not sure
> what you are asking... you might have better luck if you provide some more
> info and ask on Slack: https://accumulo.apache.org/contact-us/#slack
>
>
>
> On Tue, Oct 6, 2020 at 7:25 AM Hart, Andrew <and.h...@cgi.com> wrote:
>
> What is the way to remove tablets that still exist in accumulo but do not
> have an online, offline or deleting table?
>
>
>
> Some of these tablets say ASSIGNED TO DEAD SERVER but the tserver they
> refer to is up and working properly.
>
>
>
> *From:* Hart, Andrew <and.h...@cgi.com>
> *Sent:* 25 September 2020 13:52
> *To:* user@accumulo.apache.org
> *Subject:* RE: Continuous tablets unloaded and fails to balance from
> accumulo master
>
>
>
> Thanks for your help.  In looking for this I think I have found that there
> are deleted tables that still have a lot of tablets in the metadata table.
>
> I need to solve that before coming back to find the 1 unloaded tablet.
>
>
>
> Cheers And.
>
>
>
> *From:* Mike Miller <mmil...@apache.org>
> *Sent:* 24 September 2020 16:08
> *To:* user@accumulo.apache.org
> *Subject:* Re: Continuous tablets unloaded and fails to balance from
> accumulo master
>
>
>
> That might be OK, could just mean it hasn't been assigned yet.  The only
> way I can think of is to populate a list of all tablets from the metadata
> table and find the one without a "loc" column family.
>
>
>
> On Thu, Sep 24, 2020 at 10:55 AM Hart, Andrew <and.h...@cgi.com> wrote:
>
> No, no future entries in the table.
>
>
>
> *From:* Mike Miller <mmil...@apache.org>
> *Sent:* 24 September 2020 15:10
> *To:* user@accumulo.apache.org
> *Subject:* Re: Continuous tablets unloaded and fails to balance from
> accumulo master
>
>
>
> You should be able to figure out the unloaded tablet from the
> "accumulo.metadata" table.  The metadata table will list the tablet
> location using the "loc" column family to indicate it has loaded a tablet
> that it was assigned.
>
> For example the tablet "n;9" will have an entry like:
>
> n;9 loc:1000041fbf00006 []    ip-172-31-87-51.ec2.internal:9997
>
>
>
> From my understanding, the unloaded tablet should have a "future" column
> family, meaning it has been assigned a new location but not loaded yet.  If
> the tablet doesn't have a "loc" or "future" column family then that is a
> problem.
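
The check described above (find tablets that have neither a "loc" nor a
"future" entry) can be sketched in Python against saved scan output. This is
an illustration under assumptions, not an Accumulo utility: it assumes each
line looks like the "n;9 loc:..." example above, that rows contain no
whitespace, and that rows starting with "~" are not tablets. The function
names and the sample lines other than the "n;9" entry are made up.

```python
from collections import defaultdict


def families_by_row(scan_lines):
    """Collect the set of column families seen for each tablet row."""
    families = defaultdict(set)
    for line in scan_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        row, column = parts[0], parts[1]
        if row.startswith("~"):
            continue  # assumed to be a non-tablet row
        families[row].add(column.split(":", 1)[0])
    return families


def unlocated_tablets(scan_lines):
    """Return the sorted tablet rows that have neither 'loc' nor 'future'."""
    wanted = {"loc", "future"}
    return sorted(
        row
        for row, fams in families_by_row(scan_lines).items()
        if not (fams & wanted)
    )


# The first line is the example from this thread; the rest are made up.
sample = [
    "n;9 loc:1000041fbf00006 []    ip-172-31-87-51.ec2.internal:9997",
    "n;9 srv:dir []    t-0000001",
    "n< future:1000041fbf00006 []    ip-172-31-87-51.ec2.internal:9997",
    "p;3 srv:dir []    t-0000002",
]

if __name__ == "__main__":
    print(unlocated_tablets(sample))
```

Tablets of tables that are deliberately offline would presumably show up in
this list too, so the result is a starting point for inspection rather than
a list of broken tablets.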
>
>
>
> On Thu, Sep 24, 2020 at 6:32 AM Hart, Andrew <and.h...@cgi.com> wrote:
>
> Hi,
>
>
>
> I am getting “Not balancing due to 1 outstanding migrations” and “[Normal
> tablets]: 1 tablets unloaded”.
>
> This means that the cluster never balances unless I restart the master,
> after which I get a one-off balance and then it returns to the above messages.
>
>
>
> How do I identify the tablet that is unloaded?  It isn’t in the logs that
> I can see.  Is it possible to tell from the contents of the
> accumulo.metadata table?
>
>
>
> Is there a way to use FindOfflineTablets?
>
>
>
> And.
>
>
