[
https://issues.apache.org/jira/browse/HBASE-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Feng Honghua updated HBASE-10679:
---------------------------------
Status: Patch Available (was: Open)
> Both clients operating on a same region will get wrong scan results if the
> first scanner expires and the second scanner is created with the same
> scannerId
> ----------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-10679
> URL: https://issues.apache.org/jira/browse/HBASE-10679
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Feng Honghua
> Assignee: Feng Honghua
> Priority: Critical
> Attachments: HBASE-10679-trunk_v1.patch, HBASE-10679-trunk_v2.patch
>
>
> The scenario is as follows (Client A and Client B both scan Region R):
> # A opens a scanner SA on R with scannerId N and successfully gets its
> first row "a"
> # SA's lease expires and SA is removed from scanners
> # B opens a scanner SB on R, and its scannerId is also N. It successfully
> gets its first row "m"
> # A issues its second scan request with scannerId N. The regionserver finds
> that N is a valid scannerId and that the region matches too (the region has
> been online on this regionserver the whole time and both scanners are
> against it), so it executes the scan request on SB and returns "n" to A.
> Wrong! A gets data from another client's scanner; it expects a row such as
> "b" that follows "a"
> # B issues its second scan request with scannerId N. The regionserver also
> considers it valid, executes the scan on SB, and returns "o" to B. Wrong! B
> should get "n", but "n" has just been scanned out by A
> The consequence is that both clients get wrong scan results:
> # A gets data from a scanner created by another client, because its own
> scanner has expired and been removed
> # B misses data that it should have received but that A has wrongly scanned
> out
> The root cause is that a scannerId generated by the regionserver is not
> guaranteed to be unique over the regionserver's whole lifecycle. *The only
> guarantee is that scannerIds of scanners that are currently still valid (not
> expired) are unique*, so the same scannerId can appear in scanners again
> after an earlier scanner with that scannerId has expired and been removed
> from scanners. If the second scanner is against the same region, the bug
> arises.
> In theory, the above scenario should be very rare (two consecutive scans on
> the same region from two different clients get the same scannerId, and the
> first expires before the second is created), but it can happen. Once it
> happens, the consequence is severe (all clients involved get wrong data) and
> should be extremely hard to diagnose/debug.
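
The scenario in the description can be reproduced with a toy model. The sketch below is not HBase code: ToyRegionServer, openScanner, expireLease and next are hypothetical names invented for illustration, the row set is made up, and the scannerId collision is forced by passing the same id twice rather than arising from the regionserver's real id generation. It only shows why a lookup keyed on scannerId alone cannot tell client A's expired scanner from client B's live one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are HBase classes or methods.
class ScannerIdReuseDemo {

    /** Hypothetical stand-in for a regionserver's scanner registry on one region. */
    static class ToyRegionServer {
        private final Map<Long, Iterator<String>> scanners = new HashMap<>();
        private final List<String> rows;

        ToyRegionServer(List<String> rows) {
            this.rows = rows;
        }

        /** Registers a scanner under the given id, positioned at startRow. */
        void openScanner(long scannerId, String startRow) {
            int from = rows.indexOf(startRow);
            scanners.put(scannerId, rows.subList(from, rows.size()).iterator());
        }

        /** Models lease expiry: the scanner is dropped and its id becomes free again. */
        void expireLease(long scannerId) {
            scanners.remove(scannerId);
        }

        /** Looks the scanner up by id only, so it cannot tell which client owns it. */
        String next(long scannerId) {
            Iterator<String> it = scanners.get(scannerId);
            if (it == null) {
                throw new IllegalStateException("unknown scanner " + scannerId);
            }
            return it.next();
        }
    }

    /** Replays the four steps of the scenario and records what each client receives. */
    static List<String> runScenario() {
        ToyRegionServer rs = new ToyRegionServer(List.of("a", "b", "m", "n", "o"));
        long n = 42L; // assumption: both scanners end up with this same id
        List<String> seen = new ArrayList<>();

        rs.openScanner(n, "a");      // step 1: A opens SA with id N
        seen.add("A:" + rs.next(n)); // A reads "a"
        rs.expireLease(n);           // step 2: SA's lease expires
        rs.openScanner(n, "m");      // step 3: B opens SB, also with id N
        seen.add("B:" + rs.next(n)); // B reads "m"
        seen.add("A:" + rs.next(n)); // step 4: A expects "b" but reads "n" from SB
        seen.add("B:" + rs.next(n)); // step 5: B expects "n" but reads "o"
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runScenario()); // prints [A:a, B:m, A:n, B:o]
    }
}
```

In this model, A's second call succeeds silently instead of failing with an unknown-scanner error, which is what makes the real bug hard to notice.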