[
https://issues.apache.org/jira/browse/HBASE-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100139#comment-13100139
]
nkeywal commented on HBASE-4195:
--------------------------------
@stack: yes I am ok with all your points. Thanks! Some details below:
bq. Are seek and reseek the same now? Or it seems like they have a bunch of
common code... can we factor it out to common method if so?
The initialization of kvTail & snapshotTail differs, then it's the same code.
There are only 6 lines of code, but I aggree, it would be cleaner if shared in
a private method (this would simplify as well the improvement on peek)
bq. We're fixing a bug where we may miss a Put if a flush comes in in meantime
because we won't have a running Iterator on new KVSet (but maybe this is not
such a big deal - perhaps - because its unlikely the new Put will be within the
purview of the current read point?
That's what I expect. Note that between the 3 implementations:
- the initial one: it was impossible because we were just using the iterator
without going back to the list.
- the one currently in the tunk: possible because we're restarting from the
very beginning of the list.
- the proposed one; in the middle: we're not restarting from the beginning from
from an intermediate point of the list.
So we're not in the same situation as we were 2 years ago, but I expect
(without having done a full analysis) that the readpoint will hide this.
The best of the best, in terms of performance and similarity to the initial
implementation, would be to get the sub-skiplist implictly pointed by the
iterator, but there is nothing in the Java API to do it today: it would require
to implement a specific skip list.
> Possible inconsistency in a memstore read after a reseek, possible
> performance improvement
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-4195
> URL: https://issues.apache.org/jira/browse/HBASE-4195
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.90.4
> Environment: all
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 20110824_4195_MemStore.patch,
> 20110824_4195_TestHRegion.patch
>
>
> This follows the dicussion around HBASE-3855, and the random errors (20%
> failure on trunk) on the unit test
> org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting
> I saw some points related to numIterReseek, used in the
> MemStoreScanner#getNext (line 690):
> {noformat}679 protected KeyValue getNext(Iterator it) {
> 680 KeyValue ret = null;
> 681 long readPoint = ReadWriteConsistencyControl.getThreadReadPoint();
> 682 //DebugPrint.println( " MS@" + hashCode() + ": threadpoint = " +
> readPoint);
> 683
> 684 while (ret == null && it.hasNext()) {
> 685 KeyValue v = it.next();
> 686 if (v.getMemstoreTS() <= readPoint) {
> 687 // keep it.
> 688 ret = v;
> 689 }
> 690 numIterReseek--;
> 691 if (numIterReseek == 0) {
> 692 break;
> 693 }
> 694 }
> 695 return ret;
> 696 }{noformat}
> This function is called by seek, reseek, and next. The numIterReseek is only
> usefull for reseek.
> There are some issues, I am not totally sure it's the root cause of the test
> case error, but it could explain partly the randomness of the error, and one
> point is for sure a bug.
> 1) In getNext, numIterReseek is decreased, then compared to zero. The seek
> function sets numIterReseek to zero before calling getNext. It means that the
> value will be actually negative, hence the test will always fail, and the
> loop will continue. It is the expected behaviour, but it's quite smart.
> 2) In "reseek", numIterReseek is not set between the loops on the two
> iterators. If the numIterReseek is equals to zero after the loop on the first
> one, the loop on the second one will never call seek, as numIterReseek will
> be negative.
> 3) Still in "reseek", the test to call "seek" is (kvsetNextRow == null &&
> numIterReseek == 0). In other words, if kvsetNextRow is not null when
> numIterReseek equals zero, numIterReseek will start to be negative at the
> next iteration and seek will never be called.
> 4) You can have side effects if reseek ends with a numIterReseek > 0: the
> following calls to the "next" function will decrease numIterReseek to zero,
> and getNext will break instead of continuing the loop. As a result, later
> calls to next() may return null or not depending on how is configured the
> default value for numIterReseek.
> To check if the issue comes from point 4, you can set the numIterReseek to
> zero before returning in reseek:
> {noformat} numIterReseek = 0;
> return (kvsetNextRow != null || snapshotNextRow != null);
> }{noformat}
> On my env, on trunk, it seems to work, but as it's random I am not really
> sure. I also had to modify the test (I added a loop) to make it fails more
> often, the original test was working quite well here.
> It has to be confirmed that this totally fix (it could be partial or
> unrelated)
> org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting
> before implementing a complete solution.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira