[jira] Updated: (HBASE-588) Still a 'hole' in scanners, even after HBASE-532

stack (JIRA) Wed, 23 Apr 2008 10:42:50 -0700

     [ https://issues.apache.org/jira/browse/HBASE-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


stack updated HBASE-588:
------------------------

    Attachment: 588-v5.patch

Add in a fix for the deadlock below, though it is not caused by this patch.  
Also added some cleanup in Flusher.  Please review.

{code}
Java stack information for the threads listed above:
===================================================
"IPC Server handler 3 on 60020":
        at org.apache.hadoop.hbase.regionserver.Flusher.request(Flusher.java:118)
        - waiting to lock <0xb6526348> (a java.util.HashSet)
        at org.apache.hadoop.hbase.regionserver.HRegion.update(HRegion.java:1527)
        - locked <0xb66c0708> (a java.lang.Integer)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1318)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1101)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:413)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
"regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher":
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:987)
        - waiting to lock <0xb66c0708> (a java.lang.Integer)
        at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:923)
        at org.apache.hadoop.hbase.regionserver.Flusher.flushRegion(Flusher.java:171)
        - locked <0xb6526348> (a java.util.HashSet)
        at org.apache.hadoop.hbase.regionserver.Flusher.run(Flusher.java:94)

Found 1 deadlock.
{code}
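
The trace shows a lock-order inversion: the IPC handler holds the region's 
Integer lock and wants the Flusher's HashSet lock, while the cacheFlusher 
thread holds the HashSet lock and wants the Integer lock.  One common way to 
break such a cycle is to stop holding the set lock while calling into the 
region.  The sketch below illustrates that idea only; it is not necessarily 
what 588-v5.patch does, and SketchRegion and SketchFlusher are hypothetical 
stand-ins, not the real HRegion and Flusher classes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for HRegion: update() and flush() share one monitor.
class SketchRegion {
    private final Object updateLock = new Object();
    private final SketchFlusher flusher;
    int flushes = 0;

    SketchRegion(SketchFlusher flusher) {
        this.flusher = flusher;
    }

    void update() {
        synchronized (updateLock) {
            // Like the IPC handler in the trace: region lock held,
            // then the flusher's set lock is requested.
            flusher.request(this);
        }
    }

    void flush() {
        synchronized (updateLock) {
            flushes++;
        }
    }
}

// Hypothetical stand-in for Flusher.
class SketchFlusher {
    private final Set<SketchRegion> queued = new HashSet<>();

    void request(SketchRegion region) {
        synchronized (queued) {
            queued.add(region);
        }
    }

    // Drains the queue under the set lock, then flushes with the lock released,
    // so the set lock is never held while waiting for a region lock.
    int flushQueued() {
        Set<SketchRegion> batch;
        synchronized (queued) {
            batch = new HashSet<>(queued);
            queued.clear();
        }
        for (SketchRegion region : batch) {
            region.flush();
        }
        return batch.size();
    }
}
```

With this shape the only nesting left is region lock then set lock, so the 
two threads can no longer wait on each other in a cycle.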

> Still a 'hole' in scanners, even after HBASE-532
> ------------------------------------------------
>
>                 Key: HBASE-588
>                 URL: https://issues.apache.org/jira/browse/HBASE-588
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.1.2
>
>         Attachments: 588-v2.patch, 588-v3.patch, 588-v4.patch, 588-v5.patch, 588.patch
>
>
> Before HBASE-532, as soon as a flush started, we called snapshot.  Snapshot 
> copied the current live memcache into a 'snapshot' TreeMap inside 
> Memcache.  This snapshot TreeMap was an accumulation of all snapshots since 
> the last flush.  Whenever we took out a scanner, we'd copy this snapshot 
> into a new backing map carried by the scanner (every outstanding Scanner had 
> a complete copy).  Memcache snapshots were cleared when a flush started.  
> Flushing could take anywhere from near no time up to tens of seconds, during 
> which any scanners taken out would not see the edits in the snapshot 
> currently being flushed, and gets or getFull would also return incorrect 
> answers because the content of the snapshot was not available to them.
> HBASE-532 made it so the snapshot was available until the flush was done, 
> that is, until a file had made it out to disk.  This fixed gets and getFull 
> and any scanners taken out during flushing.  But there is still a hole.  Any 
> outstanding scanners will be going against the state of the Store Readers at 
> the time the scanner was opened; they will not see the new flush file.
> Chatting about this on IRC, Jim suggests that we pass either the memcache or 
> the current snapshot to each Scanner (pass the snapshot if it is not empty).  
> The notion is that the Scanner would hold on to the reference it was passed 
> should it be cleared by flushing.  The upside is that the scanner wouldn't 
> have to be concerned with the new flush file that has been put out to disk.  
> The downsides are that Scanner data could be way stale if, for instance, the 
> memcache was near to flushing but we hadn't done it yet.  And we wouldn't be 
> clearing the snapshot promptly, so there would be some memory pressure.
> Another suggestion is that flushing send an event.  Listeners such as 
> outstanding scanners would notice the event and open the new Reader.  They 
> would have to skip forward in the new Reader to catch up with the current 
> set, but that shouldn't be bad.  The same mechanism could be used to let 
> compactions be moved into place while scanners were outstanding: close down 
> all existing readers and skip to the current 'next' location in the new 
> compacted store file.
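
The event-based suggestion in the description can be sketched roughly as 
follows.  This is a simplified model under stated assumptions, not HBase 
code: FlushListener, SketchStore and SketchScanner are hypothetical names, 
row keys are plain Strings, and a "Reader" is reduced to a sorted set of 
keys.  On a flush event the scanner takes in the new file's keys but skips 
forward past the last key it has already returned.

```java
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener interface; not an HBase API.
interface FlushListener {
    void flushed(TreeSet<String> newFileKeys);
}

// Hypothetical store that announces each completed flush to its listeners.
class SketchStore {
    private final List<FlushListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(FlushListener listener) {
        listeners.add(listener);
    }

    void flush(TreeSet<String> flushedKeys) {
        for (FlushListener listener : listeners) {
            listener.flushed(flushedKeys);
        }
    }
}

// Hypothetical scanner that picks up newly flushed keys while it is open.
class SketchScanner implements FlushListener {
    private final TreeSet<String> pending = new TreeSet<>();
    private String lastReturned = null;

    SketchScanner(TreeSet<String> keysAtOpen) {
        pending.addAll(keysAtOpen);
    }

    @Override
    public synchronized void flushed(TreeSet<String> newFileKeys) {
        if (lastReturned == null) {
            // Nothing returned yet, so every flushed key is still ahead of us.
            pending.addAll(newFileKeys);
        } else {
            // Skip forward: only keys after the current position are relevant.
            pending.addAll(newFileKeys.tailSet(lastReturned, false));
        }
    }

    // Returns the next key in sorted order, or null when exhausted.
    synchronized String next() {
        String key = pending.pollFirst();
        if (key != null) {
            lastReturned = key;
        }
        return key;
    }
}
```

A scanner opened over row1 and row3 that has already returned row1 would, 
after a flush containing row0 and row2, go on to return row2 and then row3; 
row0 is behind its position and is skipped.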

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
