Will do. I'll build off 1.1.4 with the patch, apply it to the region servers,
capture logs, and let you know if I see the exception occur.

--Jacob

-----Original Message-----
From: Ted Yu [mailto:[email protected]] 
Sent: Wednesday, May 24, 2017 11:57 AM
To: [email protected]
Subject: Re: Missing data in snapshot - possible flush timing issue?

I attached a tentative fix to HBASE-18099.

If you have the bandwidth, you can try it out.

On Wed, May 24, 2017 at 8:53 AM, LeBlanc, Jacob <[email protected]>
wrote:

> Great! I see the JIRA bug you just opened. I'll enable debug logging 
> on FlushSnapshotSubprocedure and HRegion on the region servers in the 
> cluster to see if I can capture log messages as better evidence. Since 
> it's a timing issue I'm not sure when we might see it again, but I'll 
> keep an eye out.
>
> Thanks so much for your help,
>
> --Jacob
>
> -----Original Message-----
> From: Ted Yu [mailto:[email protected]]
> Sent: Wednesday, May 24, 2017 11:29 AM
> To: [email protected]
> Subject: Re: Missing data in snapshot - possible flush timing issue?
>
> In FlushSnapshotSubprocedure (running on the region server), there is this
> debug log:
>
>           LOG.debug("... Flush Snapshotting region " + region.toString() + " completed.");
>
> If you enable debug logging, we would know whether the underlying region
> is considered to have completed the flush.
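>
> Something like this in conf/log4j.properties on the region servers should
> do it (package names are from memory, so please double-check them against
> your source tree):
>
>   log4j.logger.org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure=DEBUG
>   log4j.logger.org.apache.hadoop.hbase.regionserver.HRegion=DEBUG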
>
> Higher up in the call() method there is this:
>
>           region.flush(true);
>
> The return value is not checked.
>
> In HRegion#flushcache(), Result.CANNOT_FLUSH may be returned due to:
>
>           String msg = "Not flushing since "
>               + (writestate.flushing ? "already flushing"
>               : "writes not enabled");
>
> The above seems to correlate with your description.
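>
> For illustration only (a sketch, with the FlushResult names written from
> memory of branch-1.1), one direction for a fix would be to have the
> subprocedure check the result instead of dropping it:
>
>           Region.FlushResult res = region.flush(true);
>           if (res.getResult() == Region.FlushResult.Result.CANNOT_FLUSH) {
>             // Another thread (e.g. MemStoreFlusher) is already flushing this
>             // region; the snapshot should wait for that flush to finish (or
>             // retry) instead of proceeding and missing the flushed data.
>           }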
>
> Let me log a JIRA referring to this thread.
>
> On Wed, May 24, 2017 at 8:08 AM, LeBlanc, Jacob 
> <[email protected]>
> wrote:
>
> > Thanks for looking Ted!
> >
> > My understanding of the log messages is that the last line of the
> > pastebin is the end of the flush of the memstore for the region where
> > we missed data, but that line is tagged with "[MemStoreFlusher.1]",
> > whereas the other regions that were getting flushed as part of the
> > snapshot are tagged with
> > "[rs(a1-qa-hbr31416d.lab.lynx-connected.com,16020,1494432106955)-snapshot-pool81-thread-1]".
> > With only a superficial understanding, it seems like the flush of that
> > region, whose messages were tagged with "[MemStoreFlusher.1]", wasn't
> > really part of the snapshot process even though it happened at the same
> > time. For example, line 3 in the pastebin shows the flush of one region
> > starting, tagged with snapshot-pool81-thread-1; line 4 shows the flush
> > starting for the region where we missed data, tagged with
> > MemStoreFlusher.1; and line 5 continues with the flush of the region as
> > part of the snapshot. So it definitely looks like multiple flushes were
> > occurring at the same time, whereas elsewhere in the logs the flushes
> > always seem to be done sequentially as part of the snapshot. That led me
> > to the theory that there may be a timing issue where the flushed data
> > for a region is missed by a snapshot because the flush is occurring on
> > another thread as part of normal, periodic flushing of the memstore.
> >
> > The last line I see in the full region server log that has anything 
> > to do with the snapshot is line 11 in the pastebin at 2017-05-12
> > 02:06:05,577 where it's processing events from zookeeper. Again with 
> > only a superficial understanding, I was assuming this had something 
> > to do with the master signaling that the snapshot was complete. 
> > We'll be sure to capture the master log next time.
> >
> > And thanks for also checking JIRA for me. If there is a bug here, it
> > seems as though we don't have an option to upgrade to fix it, and we'll
> > have to plan on coding around it for now.
> >
> > Thanks,
> >
> > --Jacob
> >
> > -----Original Message-----
> > From: Ted Yu [mailto:[email protected]]
> > Sent: Wednesday, May 24, 2017 8:47 AM
> > To: [email protected]
> > Subject: Re: Missing data in snapshot - possible flush timing issue?
> >
> > bq. the snapshot finishes before the flush of that last region 
> > finishes
> >
> > According to the last line in the pastebin, the flush finished at
> > 2017-05-12 02:06:06,063.
> > Did you find something in the master log which indicated that the
> > snapshot finished before the above time?
> >
> > I went through snapshot bug fixes in branch-1.1 backward until reaching
> > HBASE-15430, which was in 1.1.4. Among more recent snapshot fixes,
> > there was none matching your description.
> >
> > Next time this happens to your cluster, can you capture the master log
> > (hopefully with DEBUG logging)? We would get more clues from the master log.
> >
> > Thanks
> >
> > On Wed, May 24, 2017 at 1:21 AM, LeBlanc, Jacob 
> > <[email protected]>
> > wrote:
> >
> > > Hi all,
> > >
> > > We've recently been testing a Spark job that takes a snapshot of an
> > > HBase table and then creates an RDD using the
> > > TableSnapshotInputFormat class, and we've run into an interesting
> > > issue. It appears that the snapshot, and therefore the resulting RDD,
> > > was missing some data that was recently written to the table. After
> > > some digging, it looks like the missing data was limited to a
> > > particular region, as if the memstore for the region failed to flush
> > > before the snapshot was taken.
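> > >
> > > For context, the job builds the RDD roughly like this (a simplified
> > > sketch; sc is the JavaSparkContext, and the snapshot name and restore
> > > directory are placeholders):
> > >
> > >   Configuration conf = HBaseConfiguration.create();
> > >   Job job = Job.getInstance(conf);
> > >   // expose the snapshot's files under a restore dir for reading
> > >   TableSnapshotInputFormat.setInput(job, "hourly_snapshot",
> > >       new Path("/tmp/snapshot_restore"));
> > >   JavaPairRDD<ImmutableBytesWritable, Result> rdd =
> > >       sc.newAPIHadoopRDD(job.getConfiguration(),
> > >           TableSnapshotInputFormat.class,
> > >           ImmutableBytesWritable.class, Result.class);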
> > >
> > > For our testing, the job was scheduled in Oozie to run hourly, and we
> > > are missing data from the region for exactly the last hour. That
> > > indicates that when the job ran at 1:05 and took the snapshot,
> > > everything was flushed and worked fine, but when the job ran at 2:05
> > > we missed everything written to the region since the flush at 1:05.
> > >
> > > Here is an excerpt from the region server at that time:
> > >
> > > https://pastebin.com/1ECXjhRp
> > >
> > > Of the three regions for the table owned by this region server, two
> > > of them have messages from "snapshot-pool81-thread-1" saying it's
> > > flushing the memstore. But at "2017-05-12 02:06:03,229", before the
> > > flush of the region with the rowkey starting with ";uP\xE1" finishes,
> > > there is a message from "MemStoreFlusher.1" saying it's starting the
> > > flush of the memstore for the region with the rowkey starting with
> > > "\xC0\x8F". Then it appears that the snapshot finishes before the
> > > flush of that last region finishes, and that is the exact region from
> > > which we are missing data.
> > >
> > > So it kind of looks like a timing issue where a normally scheduled
> > > flush of the region's memstore kicked in on one thread while the
> > > regions were being flushed for the snapshot by another thread, causing
> > > us to miss the flushed data for that one region in the snapshot. Out
> > > of something like 50-100 runs of this job, we saw this occur twice.
> > >
> > > We are running HBase 1.1.4. Is this a known problem? I've tried
> > > looking around JIRA but didn't find anything obvious. Could anyone
> > > familiar with the code see how this might happen?
> > >
> > > We can code defensively against this, but we wanted to understand
> > > what we are dealing with.
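> > >
> > > (As one example of what I mean by defensive coding, and only as an
> > > illustration: we could explicitly flush the table via Admin right
> > > before taking the snapshot, which narrows the window but doesn't
> > > eliminate it. "my_table" and the snapshot name are placeholders.)
> > >
> > >   try (Connection conn = ConnectionFactory.createConnection(conf);
> > >        Admin admin = conn.getAdmin()) {
> > >     TableName table = TableName.valueOf("my_table");
> > >     admin.flush(table);                       // force a memstore flush
> > >     admin.snapshot("hourly_snapshot", table); // then take the snapshot
> > >   }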
> > >
> > > BTW, I don't have the logs from the master at that time as they've 
> > > been rolled since then.
> > >
> > > Thanks!
> > >
> > > --Jacob LeBlanc
> > >
> >
>
