Thanks for reporting it. Good find.

On Wed, May 24, 2017 at 11:16 AM, Jerry He <[email protected]> wrote:

> In theory, the snapshot is a best-effort point-in-time snapshot.  There
> is no guarantee of any clean cutoff.  The data you missed in this
> snapshot will appear in the next one.
> But the current problem you saw can and should be fixed.
>
> Jerry
>
>
> On Wed, May 24, 2017 at 9:04 AM, LeBlanc, Jacob <[email protected]>
> wrote:
>
>> Will do. I'll build off 1.1.4 with the patch, apply it to the region
>> servers, and capture logs and let you know if I see the exception occur.
>>
>> --Jacob
>>
>> -----Original Message-----
>> From: Ted Yu [mailto:[email protected]]
>> Sent: Wednesday, May 24, 2017 11:57 AM
>> To: [email protected]
>> Subject: Re: Missing data in snapshot - possible flush timing issue?
>>
>> I attached tentative fix to HBASE-18099.
>>
>> If you have bandwidth, you can try it out.
>>
>> On Wed, May 24, 2017 at 8:53 AM, LeBlanc, Jacob <[email protected]>
>> wrote:
>>
>> > Great! I see the JIRA bug you just opened. I'll enable debug logging
>> > on FlushSnapshotSubprocedure and HRegion on the region servers in the
>> > cluster to see if I can capture log messages as better evidence. Since
>> > it's a timing issue I'm not sure when we might see it again, but I'll
>> > keep an eye out.
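>> >
>> > In case it's useful, this is the kind of log4j.properties change I'm
>> > planning to push to the region servers (package names are my best
>> > guess from browsing the source, so treat this as a sketch rather than
>> > something verified):
>> >
>> >   # log4j.properties on each region server
>> >   log4j.logger.org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure=DEBUG
>> >   log4j.logger.org.apache.hadoop.hbase.regionserver.HRegion=DEBUG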
>> >
>> > Thanks so much for your help,
>> >
>> > --Jacob
>> >
>> > -----Original Message-----
>> > From: Ted Yu [mailto:[email protected]]
>> > Sent: Wednesday, May 24, 2017 11:29 AM
>> > To: [email protected]
>> > Subject: Re: Missing data in snapshot - possible flush timing issue?
>> >
>> > In FlushSnapshotSubprocedure (running on region server), there is
>> > debug
>> > log:
>> >
>> >           LOG.debug("... Flush Snapshotting region " + region.toString()
>> >               + " completed.");
>> >
>> > If you enable debug logging, we would know whether the underlying
>> > region is considered to have completed the flush.
>> >
>> > Higher up in the call() method there is this:
>> >
>> >           region.flush(true);
>> >
>> > The return value is not checked.
>> >
>> > In HRegion#flushcache(), Result.CANNOT_FLUSH may be returned due to:
>> >
>> >           String msg = "Not flushing since "
>> >               + (writestate.flushing ? "already flushing"
>> >               : "writes not enabled");
>> >
>> > The above seems to correlate with your description.
>> >
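>> > The fix would be along these lines (just a sketch of the idea, not the
>> > patch I attached to the JIRA; the helper name and retry budget below
>> > are made up for illustration): have the snapshot task check the flush
>> > result and retry when another flush is already in progress.
>> >
>> >   import org.apache.hadoop.hbase.regionserver.Region;
>> >   import org.apache.hadoop.hbase.regionserver.Region.FlushResult;
>> >
>> >   // Hypothetical helper called from the snapshot flush task.
>> >   static void flushForSnapshot(Region region) throws Exception {
>> >     final int maxRetries = 5;                    // arbitrary retry budget
>> >     for (int i = 0; i < maxRetries; i++) {
>> >       FlushResult res = region.flush(true);
>> >       if (res.getResult() != FlushResult.Result.CANNOT_FLUSH) {
>> >         return;  // flush ran (or memstore was empty); snapshot can proceed
>> >       }
>> >       // Another thread (e.g. MemStoreFlusher) is already flushing this
>> >       // region; back off and try again so the snapshot sees its data.
>> >       Thread.sleep(100);
>> >     }
>> >   }
>> >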
>> > Let me log a JIRA referring to this thread.
>> >
>> > On Wed, May 24, 2017 at 8:08 AM, LeBlanc, Jacob
>> > <[email protected]>
>> > wrote:
>> >
>> > > Thanks for looking Ted!
>> > >
>> > > My understanding of the log messages is that the last line of the
>> > > pastebin is the end of the flush of the memstore for the region where
>> > > we missed data, but that line is tagged with "[MemStoreFlusher.1]",
>> > > whereas the other regions that were getting flushed as part of the
>> > > snapshot are tagged with
>> > > "[rs(a1-qa-hbr31416d.lab.lynx-connected.com,16020,1494432106955)-snapshot-pool81-thread-1]".
>> > > With only a superficial understanding, it seems like the flush of the
>> > > region whose messages were tagged with "[MemStoreFlusher.1]", while
>> > > happening at the same time, wasn't really part of the snapshot
>> > > process. For example, line 3 in the pastebin shows the flush of one
>> > > region starting, tagged with snapshot-pool81-thread-1; line 4 shows
>> > > the flush starting for the region where we missed data, tagged with
>> > > MemStoreFlusher.1; and line 5 continues with the flush of the first
>> > > region as part of the snapshot. So it definitely looks like multiple
>> > > flushes were occurring at the same time, whereas elsewhere in the
>> > > logs the flushes always seem to be done sequentially as part of the
>> > > snapshot. So I came to the theory that there may be a timing issue
>> > > where the flushed data for a region is missed by the snapshot because
>> > > the flush is occurring on another thread as part of the normal,
>> > > periodic flushing of the memstore.
>> > >
>> > > The last line I see in the full region server log that has anything
>> > > to do with the snapshot is line 11 in the pastebin, at 2017-05-12
>> > > 02:06:05,577, where it's processing events from ZooKeeper. Again,
>> > > with only a superficial understanding, I was assuming this had
>> > > something to do with the master signaling that the snapshot was
>> > > complete. We'll be sure to capture the master log next time.
>> > >
>> > > And thanks for also checking JIRA for me. If there is a bug here, it
>> > > seems as though we don't have an option to upgrade to fix it, and
>> > > we'll have to plan on coding around it for now.
>> > >
>> > > Thanks,
>> > >
>> > > --Jacob
>> > >
>> > > -----Original Message-----
>> > > From: Ted Yu [mailto:[email protected]]
>> > > Sent: Wednesday, May 24, 2017 8:47 AM
>> > > To: [email protected]
>> > > Subject: Re: Missing data in snapshot - possible flush timing issue?
>> > >
>> > > bq. the snapshot finishes before the flush of that last region
>> > > finishes
>> > >
>> > > According to the last line in the pastebin, the flush finished at
>> > > 2017-05-12 02:06:06,063.
>> > > Did you find anything in the master log indicating that the snapshot
>> > > finished before that time?
>> > >
>> > > I went through the snapshot bug fixes in branch-1.1 backward until
>> > > reaching HBASE-15430, which was in 1.1.4. Among the more recent
>> > > snapshot fixes, none matched your description.
>> > >
>> > > Next time this happens on your cluster, can you capture the master
>> > > log (hopefully with DEBUG logging)? We would get more clues from the
>> > > master log.
>> > >
>> > > Thanks
>> > >
>> > > On Wed, May 24, 2017 at 1:21 AM, LeBlanc, Jacob
>> > > <[email protected]>
>> > > wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > We've recently been testing a Spark job that takes a snapshot of
>> > > > an HBase table and then creates an RDD using the
>> > > > TableSnapshotInputFormat class, and we've run into an interesting
>> > > > issue. It appears that the snapshot, and therefore the resulting
>> > > > RDD, was missing some data that was recently written to the table.
>> > > > After some digging, it looks like the missing data was limited to
>> > > > a particular region, as if the memstore for that region failed to
>> > > > flush before the snapshot was taken.
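>> > > >
>> > > > For reference, the RDD is built roughly like this (simplified; the
>> > > > snapshot name and restore directory below are placeholders, not
>> > > > our real values):
>> > > >
>> > > >   import org.apache.hadoop.fs.Path;
>> > > >   import org.apache.hadoop.hbase.HBaseConfiguration;
>> > > >   import org.apache.hadoop.hbase.client.Result;
>> > > >   import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
>> > > >   import org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat;
>> > > >   import org.apache.hadoop.mapreduce.Job;
>> > > >   import org.apache.spark.api.java.JavaPairRDD;
>> > > >   import org.apache.spark.api.java.JavaSparkContext;
>> > > >
>> > > >   public class SnapshotRdd {
>> > > >     static JavaPairRDD<ImmutableBytesWritable, Result> fromSnapshot(
>> > > >         JavaSparkContext sc, String snapshotName) throws Exception {
>> > > >       Job job = Job.getInstance(HBaseConfiguration.create());
>> > > >       // Restore dir is a scratch location outside the HBase root dir.
>> > > >       TableSnapshotInputFormat.setInput(job, snapshotName,
>> > > >           new Path("/tmp/snapshot-restore"));
>> > > >       return sc.newAPIHadoopRDD(job.getConfiguration(),
>> > > >           TableSnapshotInputFormat.class,
>> > > >           ImmutableBytesWritable.class, Result.class);
>> > > >     }
>> > > >   }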
>> > > >
>> > > > For our testing, the job was scheduled in Oozie to run hourly, and
>> > > > we are missing data from the region for exactly the last hour:
>> > > > when the job ran at 1:05 and took the snapshot, everything was
>> > > > flushed and worked fine, but when the job ran at 2:05 we missed
>> > > > everything written to the region since the flush at 1:05.
>> > > >
>> > > > Here is an excerpt from the region server at that time:
>> > > >
>> > > > https://pastebin.com/1ECXjhRp
>> > > >
>> > > > Of the three regions for the table owned by this region server,
>> > > > two of them have messages from "snapshot-pool81-thread-1" saying
>> > > > it's flushing the memstore. But at "2017-05-12 02:06:03,229",
>> > > > before the flush of the region with rowkey starting with ";uP\xE1"
>> > > > finishes, there is a message from "MemStoreFlusher.1" saying it's
>> > > > starting the flush of the memstore for the region with rowkey
>> > > > starting with "\xC0\x8F". Then it appears that the snapshot
>> > > > finishes before the flush of that last region finishes, and that
>> > > > is the exact region from which we are missing data.
>> > > >
>> > > > So it looks like a timing issue where a normally scheduled flush
>> > > > of the region's memstore kicked in on one thread while the regions
>> > > > were being flushed for the snapshot by another thread, causing us
>> > > > to miss the flushed data for that one region in the snapshot. Out
>> > > > of something like 50-100 runs of this job, we saw this occur twice.
>> > > >
>> > > > We are running HBase 1.1.4. Is this a known problem? I've tried
>> > > > looking around JIRA but didn't find anything obvious. Could anyone
>> > > > familiar with the code see how this might happen?
>> > > >
>> > > > We can code defensively against this, but wanted to understand
>> > > > what we are dealing with.
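>> > > >
>> > > > One defensive idea we've been kicking around (purely a sketch; the
>> > > > table, column family, and row key here are made up): write a
>> > > > canary cell right before taking the snapshot, then check that it
>> > > > is visible when reading through the snapshot, and retake the
>> > > > snapshot if it isn't. Since the missing data was limited to one
>> > > > region, we would probably need a canary per region for this to be
>> > > > reliable.
>> > > >
>> > > >   import org.apache.hadoop.hbase.TableName;
>> > > >   import org.apache.hadoop.hbase.client.Connection;
>> > > >   import org.apache.hadoop.hbase.client.Put;
>> > > >   import org.apache.hadoop.hbase.client.Table;
>> > > >   import org.apache.hadoop.hbase.util.Bytes;
>> > > >
>> > > >   public class SnapshotCanary {
>> > > >     // Written just before the snapshot; the job later filters the
>> > > >     // RDD for this row key and retries the snapshot if it is absent.
>> > > >     static byte[] writeCanary(Connection conn, long runId) throws Exception {
>> > > >       byte[] row = Bytes.toBytes("__canary__" + runId);
>> > > >       try (Table table = conn.getTable(TableName.valueOf("my_table"))) {
>> > > >         Put put = new Put(row);
>> > > >         put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("ts"),
>> > > >             Bytes.toBytes(System.currentTimeMillis()));
>> > > >         table.put(put);  // lands in the memstore of the owning region
>> > > >       }
>> > > >       return row;
>> > > >     }
>> > > >   }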
>> > > >
>> > > > BTW, I don't have the logs from the master at that time as they've
>> > > > been rolled since then.
>> > > >
>> > > > Thanks!
>> > > >
>> > > > --Jacob LeBlanc
>> > > >
>> > >
>> >
>>
>
>
