[
https://issues.apache.org/jira/browse/HBASE-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090790#comment-13090790
]
[email protected] commented on HBASE-4241:
------------------------------------------------------
bq. On 2011-08-25 05:16:29, Michael Stack wrote:
bq. >
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CollectionBackedScanner.java,
line 33
bq. > <https://reviews.apache.org/r/1650/diff/3/?file=35483#file35483line33>
bq. >
bq. > Nice
I stole most of this from the test code :)
bq. On 2011-08-25 05:16:29, Michael Stack wrote:
bq. >
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java,
line 505
bq. > <https://reviews.apache.org/r/1650/diff/3/?file=35482#file35482line505>
bq. >
bq. > This is nice, the way you are going to the scanner to next rather
than what we did before where we'd iterate a set.
It's also nice because now it uses the exact same logic that is used during
compaction.
bq. On 2011-08-25 05:16:29, Michael Stack wrote:
bq. >
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java,
line 488
bq. > <https://reviews.apache.org/r/1650/diff/3/?file=35482#file35482line488>
bq. >
bq. > oh. so we never cared about this before? we'd just keep doing
versions even if well in excess of max? We were pruning versions though?
Higher in the stack? In the scanner that called this one?
Versions were only ever pruned as part of compactions. (A StoreScanner is also
used in Store.compactStore(...).)
We can't get rid of all excess versions at flush time (as not all of them are
in the memstore), but we can make a good effort to avoid flushing some of the
versions.
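
To make the flush-time rule concrete, here is a minimal sketch of the pruning
decision for the versions of a single column held in the memstore. It is not
the code from the patch: the class and method names are hypothetical, versions
are reduced to bare timestamps, and delete markers are ignored. It only
illustrates the two cases from the summary (maxVersions and minVersions).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the HBase implementation.
final class FlushPruningSketch {

    /**
     * Decides which versions of one column are worth flushing.
     *
     * @param timestampsNewestFirst version timestamps, sorted newest first
     * @param maxVersions           maximum number of versions to retain
     * @param minVersions           versions to retain even when expired
     * @param ttlMillis             time to live; Long.MAX_VALUE means no expiry
     * @param nowMillis             current time
     */
    static List<Long> versionsToFlush(List<Long> timestampsNewestFirst,
                                      int maxVersions,
                                      int minVersions,
                                      long ttlMillis,
                                      long nowMillis) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsNewestFirst) {
            // Case 1: maxVersions newer versions are already in the memstore,
            // so the next compaction would drop this one anyway.
            if (kept.size() >= maxVersions) {
                break;
            }
            boolean expired = nowMillis - ts > ttlMillis;
            // Case 2: minVersions newer versions are already kept, so this
            // expired version (and every older one) can be skipped.
            if (expired && kept.size() >= minVersions) {
                break;
            }
            kept.add(ts);
        }
        return kept;
    }
}
```

The sketch can only be a best effort, for the reason given above: versions
that already sit in store files are invisible at flush time, so the compaction
still has to do the final pruning.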
bq. On 2011-08-25 05:16:29, Michael Stack wrote:
bq. >
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java,
line 516
bq. > <https://reviews.apache.org/r/1650/diff/3/?file=35482#file35482line516>
bq. >
bq. > Should this be in a finally block? Would it make a diff if this was
left hanging open because of some exception? We can get one up out of the
scanner.next, right?
bq. >
bq. > oh, there is a finally just after... maybe move the scanner.close in
there?
Hmm... since KeyValueScanner.close() can throw an IOException, I did not want
it in the finally clause.
I also think this close() is not actually needed.
Looking at StoreScanner.close(), though, I see that it does not throw anything.
I'm not sure about this one, so I'll move it into the finally block, after the
close of the writer, just to be safe.
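
As a rough illustration of that ordering, the sketch below closes the writer
first and the scanner second, both from the finally path. The names are
hypothetical and both resources are modeled as plain java.io.Closeable rather
than the real HBase types. The nested try/finally is my own choice here, so
that the scanner is still closed if closing the writer throws.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

// Hypothetical illustration only; not the code in Store.java.
final class FlushCloseSketch {

    /** Stand-in resource that records when it is closed. */
    static final class Recorder implements Closeable {
        private final String name;
        private final List<String> log;

        Recorder(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() throws IOException {
            log.add("close " + name);
        }
    }

    /**
     * Simulates a flush loop. When failMidway is true, the loop throws, the
     * way a call to scanner.next() might.
     */
    static void flush(Closeable scanner, Closeable writer, boolean failMidway)
            throws IOException {
        try {
            if (failMidway) {
                throw new IOException("simulated failure in scanner.next()");
            }
            // ... read from the scanner and append to the writer ...
        } finally {
            try {
                writer.close();
            } finally {
                // Runs even if writer.close() itself throws.
                scanner.close();
            }
        }
    }
}
```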
I'll upload a new version soon that also includes the suggestions from Ted.
- Lars
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1650/#review1628
-----------------------------------------------------------
On 2011-08-25 04:49:36, Lars Hofhansl wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1650/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-08-25 04:49:36)
bq.
bq.
bq. Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. This avoids flushing row versions to disk that are known to be GC'd by the
next compaction anyway.
bq. This covers two scenarios:
bq. 1. maxVersions=N and we find at least N versions in the memstore. We can
safely avoid flushing any further versions to disk.
bq. 2. similarly minVersions=N and we find at least N versions in the
memstore. Now we can safely avoid flushing any further *expired* versions to
disk.
bq.
bq. This changes the Store flush to use the same mechanism that is used for
compactions.
bq. I borrowed some code from the tests and refactored the test code to use a
new utility class that wraps a sorted collection and then behaves like a
KeyValueScanner. The same class is used to create a scanner over the memstore's
snapshot.
bq.
bq.
bq. This addresses bug HBASE-4241.
bq. https://issues.apache.org/jira/browse/HBASE-4241
bq.
bq.
bq. Diffs
bq. -----
bq.
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
1161347
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CollectionBackedScanner.java
PRE-CREATION
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java
1161347
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueHeap.java
1161347
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueScanFixture.java
1161347
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java
1161347
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
1161347
bq.
bq. Diff: https://reviews.apache.org/r/1650/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. Ran all tests. TestHTablePool and TestDistributedLogSplitting error out
(with or without my change).
bq. I had to change three tests that incorrectly relied on old rows hanging
around after a flush (or were otherwise incorrect).
bq.
bq. No new test, as this should cause no functional change.
bq.
bq.
bq. Thanks,
bq.
bq. Lars
bq.
bq.
> Optimize flushing of the Store cache for max versions and (new) min versions
> ----------------------------------------------------------------------------
>
> Key: HBASE-4241
> URL: https://issues.apache.org/jira/browse/HBASE-4241
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Affects Versions: 0.92.0
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Attachments: 4241-v2.txt, 4241.txt
>
>
> As discussed with Jon, there is room for improvement in how the memstore
> is flushed to disk.
> Currently only expired KVs are pruned before flushing, but we can also prune
> versions if we find at least maxVersions versions in the memstore.
> The same holds for the new minversion feature: If we find at least minVersion
> versions in the store we can remove all further versions that are expired.
> Generally we should use the same mechanism here that is used for Compaction.
> I.e. StoreScanner. We only need to add a scanner to Memstore that can scan
> along the current snapshot.
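
For readers unfamiliar with the idea, a scanner over an in-memory sorted
collection can be sketched roughly as follows. This is not the
CollectionBackedScanner from the patch: the class name is hypothetical, the
element type is generic instead of KeyValue, and only peek, next, seek and
close are modeled, loosely following the shape of KeyValueScanner.

```java
import java.util.Iterator;
import java.util.SortedSet;

// Hypothetical illustration only; not the class added by the patch.
final class SortedCollectionScannerSketch<T> {
    private final SortedSet<T> data;
    private Iterator<T> iter;
    private T current;

    SortedCollectionScannerSketch(SortedSet<T> data) {
        this.data = data;
        this.iter = data.iterator();
        advance();
    }

    private void advance() {
        current = iter.hasNext() ? iter.next() : null;
    }

    /** Returns the next element without consuming it, or null at the end. */
    T peek() {
        return current;
    }

    /** Returns the next element and moves past it, or null at the end. */
    T next() {
        T result = current;
        advance();
        return result;
    }

    /** Positions the scanner at the first element greater than or equal to key. */
    boolean seek(T key) {
        iter = data.tailSet(key).iterator();
        advance();
        return current != null;
    }

    /** Nothing to release for an in-memory collection. */
    void close() {
    }
}
```

Because the collection is already sorted, a scanner of this shape can be
handed to the same merge and pruning logic that reads store files, which is
the point made in the description above.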