Still a 'hole' in scanners, even after HBASE-532
------------------------------------------------

                 Key: HBASE-588
                 URL: https://issues.apache.org/jira/browse/HBASE-588
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack
            Priority: Blocker


Before HBASE-532, as soon as a flush started, we called snapshot.  Snapshot 
used to copy current live memcache into a 'snapshot' TreeMap inside in 
Memcache.  This snapshot TreeMap was an accumulation of all snapshots since 
last flush.   Whenever we took out a scanner, we'd do a copy of this snapshot 
into a new backing map carried by the scanner (Every outstanding Scanner had 
complete copy).  Memcache snapshots were cleared when a flush started.   
Flushing could take near no time to up to tens of seconds during which an 
scanners taken out meantime would not see the edits in the snapshot currently 
being flushed and gets or getFull would also return incorrect answers because 
the content of the snapshot was not available to them.

HBASE-532 made it so the snapshot was available until flush was done -- until a 
file had made it out to disk.  This fixed gets and getFull and any scanners 
taken out during flushing.  But there is still a hole.  Any outstanding 
scanners will be going against the state of Store Readers at time scanner was 
opened; they will not see the new flush file.

Chatting about this on IRC, Jim suggests that we pass either memcache or 
current snapshot to each Scanner (Pass the snapshot if not empty).  The notion 
is that the Scanner would hold on to the Scanner reference should it be cleared 
by flushing.  Upside is that scanner wouldn't have to be concerned with the new 
flush that has been put out to disk.  Downsides are that Scanner data could be 
way stale if for instance the memcache was near to flushing but we hadn't done 
it yet.  And we wouldn't be clearing the snapshot promptly so would be some 
memory pressure.

Another suggestion is that flushing send an event.  Listeners such as 
outstanding scanners would notice event and open the new Reader.  Would have to 
skip forward in the new Reader to catch up with the current set but shouldn't 
be bad.  Same mechanism could be used to let compactions be moved into place 
while scanners were outstanding closing down all existing readers skipping to 
the current 'next' location in the new compacted store file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to