[
https://issues.apache.org/jira/browse/HBASE-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-588:
------------------------
Attachment: 588-v5.patch
Adds a fix for the deadlock below, though it is not caused by this patch. Also
includes some cleanup in Flusher. Please review.
{code}
Java stack information for the threads listed above:
===================================================
"IPC Server handler 3 on 60020":
    at org.apache.hadoop.hbase.regionserver.Flusher.request(Flusher.java:118)
    - waiting to lock <0xb6526348> (a java.util.HashSet)
    at org.apache.hadoop.hbase.regionserver.HRegion.update(HRegion.java:1527)
    - locked <0xb66c0708> (a java.lang.Integer)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1318)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1101)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:413)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
"regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher":
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:987)
    - waiting to lock <0xb66c0708> (a java.lang.Integer)
    at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:923)
    at org.apache.hadoop.hbase.regionserver.Flusher.flushRegion(Flusher.java:171)
    - locked <0xb6526348> (a java.util.HashSet)
    at org.apache.hadoop.hbase.regionserver.Flusher.run(Flusher.java:94)
Found 1 deadlock.
{code}
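The trace shows a lock-order inversion: the IPC handler holds the Integer lock and waits for the HashSet, while the cacheFlusher holds the HashSet and waits for the Integer lock. One common way out, sketched here, is to keep the set lock only for the set operation and release it before taking the second lock. This is an illustration only, not the contents of 588-v5.patch; the class and member names (FlushQueueSketch, updateLock, regionsInQueue) are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the Flusher/HRegion interaction; names are illustrative only.
class FlushQueueSketch {
    private final Set<String> regionsInQueue = new HashSet<>();
    // Plays the role of the java.lang.Integer lock seen in the trace.
    private final Object updateLock = new Object();
    private int flushCount = 0;

    // Update path, same order as the trace: update lock first, then the set lock.
    boolean update(String region) {
        synchronized (updateLock) {
            synchronized (regionsInQueue) {
                return regionsInQueue.add(region);
            }
        }
    }

    // Flusher path: touch the set under its lock, release it, and only then
    // take the update lock, so the two locks are never held in the opposite order.
    boolean flushRegion(String region) {
        boolean wasQueued;
        synchronized (regionsInQueue) {
            wasQueued = regionsInQueue.remove(region);
        }
        if (!wasQueued) {
            return false;
        }
        synchronized (updateLock) {
            flushCount++; // the actual flush work would go here
        }
        return true;
    }

    int flushCount() {
        synchronized (updateLock) {
            return flushCount;
        }
    }
}
```

With this shape the flusher never waits on the update lock while holding the set lock, so the cycle in the dump cannot form.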
> Still a 'hole' in scanners, even after HBASE-532
> ------------------------------------------------
>
> Key: HBASE-588
> URL: https://issues.apache.org/jira/browse/HBASE-588
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Blocker
> Fix For: 0.1.2
>
> Attachments: 588-v2.patch, 588-v3.patch, 588-v4.patch, 588-v5.patch,
> 588.patch
>
>
> Before HBASE-532, as soon as a flush started, we called snapshot. Snapshot
> used to copy the current live memcache into a 'snapshot' TreeMap inside
> Memcache. This snapshot TreeMap was an accumulation of all snapshots since
> the last flush. Whenever we took out a scanner, we'd copy this snapshot
> into a new backing map carried by the scanner (every outstanding Scanner had
> a complete copy). Memcache snapshots were cleared when a flush started.
> Flushing could take anywhere from almost no time to tens of seconds, during
> which any scanners taken out would not see the edits in the snapshot
> being flushed, and gets or getFull would also return incorrect answers because
> the content of the snapshot was not available to them.
> HBASE-532 made it so the snapshot was available until the flush was done, that
> is, until a file had made it out to disk. This fixed gets and getFull and any
> scanners taken out during flushing. But there is still a hole. Any outstanding
> scanners will be going against the state of the Store Readers at the time the
> scanner was opened; they will not see the new flush file.
> Chatting about this on IRC, Jim suggests that we pass either the memcache or
> the current snapshot to each Scanner (pass the snapshot if it is not empty).
> The notion is that the Scanner would hold on to that reference should it
> be cleared by flushing. The upside is that the scanner wouldn't have to be
> concerned with the new flush file that has been put out to disk. The downsides
> are that Scanner data could be very stale if, for instance, the memcache was
> near to flushing but we hadn't done it yet, and that we wouldn't be clearing
> the snapshot promptly, so there would be some memory pressure.
> Another suggestion is that flushing send an event. Listeners such as
> outstanding scanners would notice the event and open the new Reader. They would
> have to skip forward in the new Reader to catch up with the current set, but
> that shouldn't be bad. The same mechanism could be used to let compactions be
> moved into place while scanners were outstanding: close down all existing
> readers and skip to the current 'next' location in the new compacted store file.
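
A rough sketch of the flush-event idea from the description, for discussion only. It models the memcache and flushed files as sorted maps and is single-threaded on purpose (no locking). FlushListener, StoreSketch and ScannerSketch are hypothetical names, not HBase classes. The scanner remembers the last row it returned, so "skipping forward" in a newly announced file is just asking for the first row after that one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical listener interface; the name and signature are assumptions.
interface FlushListener {
    void flushed(NavigableMap<String, String> newFile);
}

// Toy store: a live memcache plus flushed "files", all modelled as sorted maps.
class StoreSketch {
    private final List<FlushListener> listeners = new ArrayList<>();
    private final List<NavigableMap<String, String>> files = new ArrayList<>();
    private NavigableMap<String, String> memcache = new TreeMap<>();

    void put(String row, String value) {
        memcache.put(row, value);
    }

    NavigableMap<String, String> memcache() {
        return memcache;
    }

    ScannerSketch openScanner() {
        ScannerSketch scanner = new ScannerSketch(this, files);
        listeners.add(scanner);
        return scanner;
    }

    // Move the memcache contents into a new "file" and tell every listener about it.
    void flush() {
        NavigableMap<String, String> newFile = memcache;
        memcache = new TreeMap<>();
        files.add(newFile);
        for (FlushListener listener : listeners) {
            listener.flushed(newFile);
        }
    }
}

// Scanner that starts from the files present at open time and adds new ones on flush events.
class ScannerSketch implements FlushListener {
    private final StoreSketch store;
    private final List<NavigableMap<String, String>> readers;
    private String lastRow = null;

    ScannerSketch(StoreSketch store, List<NavigableMap<String, String>> filesAtOpen) {
        this.store = store;
        this.readers = new ArrayList<>(filesAtOpen);
    }

    @Override
    public void flushed(NavigableMap<String, String> newFile) {
        readers.add(newFile); // open a "reader" on the new flush file
    }

    // Returns the next row key in sorted order, or null when the scan is exhausted.
    String next() {
        String best = firstAfterLast(store.memcache());
        for (NavigableMap<String, String> reader : readers) {
            String candidate = firstAfterLast(reader);
            if (candidate != null && (best == null || candidate.compareTo(best) < 0)) {
                best = candidate;
            }
        }
        if (best != null) {
            lastRow = best;
        }
        return best;
    }

    // Skip forward: first key strictly after the last row already returned.
    private String firstAfterLast(NavigableMap<String, String> map) {
        if (map.isEmpty()) {
            return null;
        }
        return lastRow == null ? map.firstKey() : map.higherKey(lastRow);
    }
}
```

In this toy model, a scanner opened before a flush keeps returning rows in order after the flush, because the event hands it the new file. Without the flushed() callback it would lose every row that moved out of the memcache.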