[
https://issues.apache.org/jira/browse/HBASE-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-588:
------------------------
Fix Version/s: 0.1.2
The Jim soln. means keeping around possibly two Maps where one is at least 64M
per columnfamily, per region while a scanner is outstanding.
Making this issue a fix for 0.1.2.
> Still a 'hole' in scanners, even after HBASE-532
> ------------------------------------------------
>
> Key: HBASE-588
> URL: https://issues.apache.org/jira/browse/HBASE-588
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Blocker
> Fix For: 0.1.2
>
>
> Before HBASE-532, as soon as a flush started, we called snapshot. Snapshot
> used to copy current live memcache into a 'snapshot' TreeMap inside in
> Memcache. This snapshot TreeMap was an accumulation of all snapshots since
> last flush. Whenever we took out a scanner, we'd do a copy of this snapshot
> into a new backing map carried by the scanner (Every outstanding Scanner had
> complete copy). Memcache snapshots were cleared when a flush started.
> Flushing could take near no time to up to tens of seconds during which an
> scanners taken out meantime would not see the edits in the snapshot currently
> being flushed and gets or getFull would also return incorrect answers because
> the content of the snapshot was not available to them.
> HBASE-532 made it so the snapshot was available until flush was done -- until
> a file had made it out to disk. This fixed gets and getFull and any scanners
> taken out during flushing. But there is still a hole. Any outstanding
> scanners will be going against the state of Store Readers at time scanner was
> opened; they will not see the new flush file.
> Chatting about this on IRC, Jim suggests that we pass either memcache or
> current snapshot to each Scanner (Pass the snapshot if not empty). The
> notion is that the Scanner would hold on to the Scanner reference should it
> be cleared by flushing. Upside is that scanner wouldn't have to be concerned
> with the new flush that has been put out to disk. Downsides are that Scanner
> data could be way stale if for instance the memcache was near to flushing but
> we hadn't done it yet. And we wouldn't be clearing the snapshot promptly so
> would be some memory pressure.
> Another suggestion is that flushing send an event. Listeners such as
> outstanding scanners would notice event and open the new Reader. Would have
> to skip forward in the new Reader to catch up with the current set but
> shouldn't be bad. Same mechanism could be used to let compactions be moved
> into place while scanners were outstanding closing down all existing readers
> skipping to the current 'next' location in the new compacted store file.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.