One question about the rollback:
If the journal contains a "PONR" entry, rollback returns immediately.
The regionserver should abort if rollback returns false, right?

Jieshan
-----------------------------------------------------------------

On Fri, Aug 19, 2011 at 12:05 AM, Joseph Pallas wrote:
> The test program has multiple client threads, each of which is performing a 
> stream of operations (it's actually a custom workload running in the YCSB 
> framework).  The program is keeping track of data that was inserted by write 
> operations, and subsequent read operations only retrieve data that was 
> previously written.  The read operation involves first doing a 
> HTableInterface.exists call on a row/cf/qual that is expected to exist.  It 
> is this exists call that we have seen fail.  When the failure occurs, the 
> client reports an exception and stops.  Then we examine the data using the 
> HBase shell, and the item we were looking for is there: the exists call 
> should have succeeded.  Furthermore, the item has a timestamp that shows it 
> really was inserted several minutes previously; it was not inserted right 
> around the time of the failure (which might happen if there were a race 
> condition of some sort in our client).
>

OK.  The exists call is rarely used, I'd say, which may be why you are
seeing something we don't.



> So, what is interesting is when we look at the log files for the region 
> server, and at the time this happens, the region involved is in the middle of 
> a split. Also, the key we failed on is greater than the split key.  After 
> much reading of the code in SplitTransaction and HRegionServer, I came up 
> with a theory.
>
> When a region splits, daughter regions are created and the region is marked 
> as offline/splitting in META (by MetaEditor.offlineParentInMeta).  The 
> daughter regions are brought online and added to META by 
> SplitTransaction.openDaughterRegion and HRegionServer.postOpenDeployTasks.  
> Later, the META entry for the original region is cleaned up.  The two 
> daughter regions are each managed in their own DaughterOpener thread.  This is 
> where I am suspicious: if daughter A's thread updates META before daughter 
> B's thread does, then there's a window of time on the client when 
> HConnectionManager.locateRegionInMeta, if looking for a key in daughter B, will 
> see only daughter A.  The client, I believe, does not check end rows in META, 
> so it will think that daughter A is the region to handle the request.
>

ooooh.
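
The window Joe describes can be modelled with a toy lookup.  This is only a
sketch: the `meta` list, the `locate` function and its `check_end_key` flag
are all hypothetical and are not the HBase client code.  It assumes the
client picks the online region with the closest start key at or below the
row, and shows how that picks daughter A for a key past the split point
unless the end key is also checked.

```python
from bisect import bisect_right

def locate(meta, key, check_end_key=False):
    """Toy META lookup: online region with the closest start key <= key."""
    online = sorted((r for r in meta if r["online"]), key=lambda r: r["start"])
    starts = [r["start"] for r in online]
    i = bisect_right(starts, key) - 1
    if i < 0:
        return None
    region = online[i]
    # An empty end key means "no upper bound" in this model.
    if check_end_key and region["end"] != "" and key >= region["end"]:
        return None  # the caller would have to re-read META and retry
    return region["name"]

# Parent already offline, split key "m", only daughter A has reached META.
meta = [
    {"name": "parent", "start": "a", "end": "z", "online": False},
    {"name": "daughterA", "start": "a", "end": "m", "online": True},
]

print(locate(meta, "q"))                      # daughterA (wrong region)
print(locate(meta, "q", check_end_key=True))  # None (lookup must retry)
```

Whether the real client behaves like the unchecked lookup is exactly the
open question in this thread; the sketch only shows what would follow if it
does.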

> Now, the question is: are there any circumstances under which sending that 
> request to the wrong region (daughter A instead of daughter B) might yield 
> incorrect results, instead of an exception?  My gut says maybe, but my 
> experiments have not yet managed to find it.


Well, we can do a transaction that involved multiple rows.  Currently
(as I'm sure you know by now), the steps are:

1. close region (NSRE if anyone asks for the region after close)
2. offline region in edit (still NSRE'ing)
3. Open Daughters in parallel and then in parallel update .META.

We should add daughters, daughter B first, then daughter A, and then
offline parent?  If we do it in this sequence, if you are looking for
a row in daughter A, you'll get the parent still and then an NSRE
because it's closed.... so you'll go back to .META. and then find
daughter A eventually.  If instead A is online first and you are
looking for a row in B, you'll think A has it when it doesn't... which
would be bad.

If we offline parent first and then add daughter B first... and we're
looking for a row in daughter A, but it's not online yet, we'll get
WrongRegionException, which would be a blast from the past... something
we used to get in the old days but, like polio, managed to eradicate.
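
The three intermediate windows discussed above can be compared in a toy
model.  This is a sketch under stated assumptions, not SplitTransaction
code: the region tuples, `locate`, `outcome` and the neighbouring `PREV`
region are all made up for illustration, the lookup ignores end keys, and a
daughter is assumed to win over the parent when both share a start key.

```python
SPLIT = "m"
PREV = ("prev", "", "a")          # assumed neighbouring region before the parent
PARENT = ("parent", "a", "z")     # already closed, so any request gets an NSRE
A = ("daughterA", "a", SPLIT)
B = ("daughterB", SPLIT, "z")

def locate(online, row):
    """Closest start key <= row, end key ignored; later entries win ties."""
    best = None
    for region in online:
        if region[1] <= row and (best is None or region[1] >= best[1]):
            best = region
    return best

def outcome(online, row):
    name, start, end = locate(online, row)
    if name == "parent":
        return "NSRE, retry"
    if start <= row < end:
        return "ok"
    return "wrong region"

# META contents during each window, before the split has fully completed.
windows = {
    "B added, parent still in META": [PREV, PARENT, B],
    "A added, parent still in META": [PREV, PARENT, A],
    "parent offlined, only B added": [PREV, B],
}

# Row "c" belongs to daughter A, row "q" to daughter B.
results = {label: (outcome(state, "c"), outcome(state, "q"))
           for label, state in windows.items()}
for label, (row_in_a, row_in_b) in results.items():
    print(f"{label}: row in A -> {row_in_a}; row in B -> {row_in_b}")
```

In this model only the first window (B added first, parent offlined last)
avoids a request landing on a region that does not hold the row.  What the
server actually does with such a request is not modelled here.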

How does this sound, Joe?  We could rig you a SplitTransaction to do
the above.  We could hack one up first and if it did away with your
issue, we'd then spend a bit of time making sure it rolled back
properly on fail (need to make sure rollback works properly).

St.Ack
