Thanks for clarifying. So given the region was already open for a while, I
guess those were just empty recovered.edits dirs under the region dir, and
my previous assumption does not really apply here. I also checked further
on TableSnapshotInputFormat, and realised it actually performs a copy of
the table dir to a temporary *restoreDir*, which should be passed as a
parameter to the *TableSnapshotInputFormat.setInput* initialisation method:
https://github.com/apache/hbase/blob/branch-1.4/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormat.java#L212
Note the method's javadoc comments on this *restoreDir* param:
>
> *restoreDir a temporary directory to restore the snapshot into. Current
> user should have write permissions to this directory, and this should not
> be a subdirectory of rootdir. After the job is finished, restoreDir can
> be deleted.*
>
Here's the point where the snapshot data gets copied to the restoreDir:
https://github.com/apache/hbase/blob/branch-1.4/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl.java#L509
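For reference, here's a minimal sketch of how a client job could wire this
up (the snapshot name, restore path and mapper are placeholders, and I'm
assuming the 1.x MapReduce API; TableMapReduceUtil.initTableSnapshotMapperJob
delegates to TableSnapshotInputFormat.setInput under the hood):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class SnapshotScanJob {

  // Trivial mapper that just counts the rows read from the snapshot files.
  static class RowCounterMapper extends TableMapper<NullWritable, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context ctx) {
      ctx.getCounter("snapshot", "rows").increment(1);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "snapshot-scan");
    job.setJarByClass(SnapshotScanJob.class);

    // Per the javadoc: writable by the job user, NOT under hbase.rootdir,
    // and disposable once the job completes.
    Path restoreDir = new Path("/tmp/restore-" + System.currentTimeMillis());

    // Restores (copies) the snapshot into restoreDir and configures
    // TableSnapshotInputFormat as the job's input format.
    TableMapReduceUtil.initTableSnapshotMapperJob(
        "my_snapshot",            // snapshot name (placeholder)
        new Scan(),               // scan applied over the restored files
        RowCounterMapper.class,
        NullWritable.class, NullWritable.class,
        job, true, restoreDir);

    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}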
So as long as we follow the javadoc advice, our concerns about potential
data loss are not valid. I guess the problem here is that when the table
dir is recreated/copied to the *restoreDir*, the original
ownership/permissions are preserved for the subdirs, such as the regions'
recovered.edits.
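As a quick illustration (this is not from the HBase codebase, and the
restore path is hypothetical), a small recursive walker like the one below
could be used to spot subdirs under the restoreDir that kept the original
owner/permissions, such as recovered.edits:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RestoreDirAudit {

  // Recursively print permissions and owner:group for everything under p,
  // so any subdir that kept the original RS owner stands out.
  static void walk(FileSystem fs, Path p) throws IOException {
    for (FileStatus st : fs.listStatus(p)) {
      System.out.printf("%s %s:%s %s%n",
          st.getPermission(), st.getOwner(), st.getGroup(), st.getPath());
      if (st.isDirectory()) {
        walk(fs, st.getPath());
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical restore dir that was passed to setInput.
    Path restoreDir = new Path(args.length > 0 ? args[0] : "/tmp/snapshot-restore");
    walk(restoreDir.getFileSystem(new Configuration()), restoreDir);
  }
}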
On Tue, 18 Jun 2019 at 01:03, Jacob LeBlanc <
jacob.lebl...@microfocus.com> wrote:
> First of all, thanks for the reply! I appreciate the time taken addressing
> our issues.
>
> > It seems the mentioned "hiccup" caused RS(es) crash(es), as you got RITs
> > and recovered edits under these region dirs.
>
> To give more context, I was making changes to increase the snapshot
> timeout on region servers and did a graceful restart, so I didn't mean to
> crash anything, but it seems I restarted too many region servers at once
> (about half the cluster), which resulted in some number of regions getting
> stuck in transition. This was attempted on a live production cluster, so
> the hope was to do this without downtime, but it resulted in an outage to
> our application instead. Unfortunately, the master and region server logs
> have since rolled and aged out, so I don't have them anymore.
>
> > The fact there was a "recovered" dir under some region dirs means that
> > when the snapshot was taken, crashed RS(es) WAL(s) had been split, but
> > not completely replayed yet.
>
> The snapshot was taken many days later. File timestamps under the
> recovered.edits directory were from June 6th and the snapshot from the
> pastebin was taken on June 14th, but snapshots were actually taken many
> times with the same result (ETL jobs are launched at least daily in
> oozie). Do you mean that if a snapshot was taken before the region was
> fully recovered, it could result in this state even if the snapshot was
> subsequently deleted?
>
> > Would you know which specific HBase version this is?
>
> It is EMR 5.22, which runs HBase 1.4.9 (with some Amazon-specific edits,
> maybe? I noticed the line numbers in HRegion.java in the stack trace don't
> quite line up with those in the 1.4.9 tag on github).
>
> > Could your job restore the snapshot into a temp table and then read
> > from this temp table using TableInputFormat, instead?
>
> Maybe we could do this, but it would take us some effort to make the
> changes, test, release, etc. Of course, we'd rather not jump through hoops
> like this.
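>
> If we did go that route, I imagine it would look roughly like the sketch
> below (untested; the snapshot and temp table names are placeholders):
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Admin;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.Scan;
> import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
> import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
> import org.apache.hadoop.hbase.mapreduce.TableMapper;
> import org.apache.hadoop.io.NullWritable;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
>
> public class TempTableScanJob {
>
>   static class NoopMapper extends TableMapper<NullWritable, NullWritable> {
>     @Override
>     protected void map(ImmutableBytesWritable row, Result value, Context ctx) {
>       ctx.getCounter("etl", "rows").increment(1);
>     }
>   }
>
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     TableName tmp = TableName.valueOf("etl_tmp_table"); // placeholder
>
>     // Materialise the snapshot as a regular table served by the RSes.
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>          Admin admin = conn.getAdmin()) {
>       admin.cloneSnapshot("my_snapshot", tmp); // placeholder snapshot name
>     }
>
>     // Read through the normal TableInputFormat path instead of
>     // TableSnapshotInputFormat, sidestepping the recovered.edits issue.
>     Job job = Job.getInstance(conf, "temp-table-scan");
>     job.setJarByClass(TempTableScanJob.class);
>     TableMapReduceUtil.initTableMapperJob(
>         tmp.getNameAsString(), new Scan(), NoopMapper.class,
>         NullWritable.class, NullWritable.class, job);
>     job.setNumReduceTasks(0);
>     job.setOutputFormatClass(NullOutputFormat.class);
>     boolean ok = job.waitForCompletion(true);
>     // Disable + drop etl_tmp_table once the job is done.
>     System.exit(ok ? 0 : 1);
>   }
> }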
>
> > In this case, it's finding a "recovered" folder under the region dir,
> > so it will replay the edits there. Looks like a problem with
> > TableSnapshotInputFormat; it seems weird that it tries to delete edits
> > in a non-staging dir (your path suggests it's trying to delete the
> > actual edits folder), which could cause data loss if it succeeded in
> > deleting edits before the RSes actually replayed them.
>
> I agree that this "seems weird" to me as well, given that I am not
> intimately familiar with all of the inner workings of the HBase code. The
> potential data loss is what I'm wondering about - would data loss have
> occurred if we happened to execute our job under a user that had delete
> permissions on the HDFS directories? Or did the edits actually get
> replayed while regions were stuck in transition and the files just didn't
> get cleaned up? Is this something for which I should file a defect in
> JIRA?
>
> Thanks again,
>
> --Jacob LeBlanc
>
>
> -----Original Message-----
> From: Wellington Chevreuil [mailto:wellington.chevre...@gmail.com]
> Sent: Monday, June 17, 2019 3:55 PM
> To: user@hbase.apache.org
> Subject: Re: TableSnapshotInputFormat failing to delete files under
> recovered.edits
>
> It seems the mentioned "hiccup" caused RS(es) crash(es), as you got RITs
> and recovered edits under these region dirs. The fact there was a
> "recovered" dir under some region dirs means that when the snapshot was
> taken, crashed RS(es) WAL(s) had been split, but not completely replayed
> yet.
>
> Since you are facing this error when reading from