[
https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210742#comment-13210742
]
Phabricator commented on HBASE-5347:
------------------------------------
mbautin has commented on the revision "[HBASE-5347] [jira] GC free memory
management in Level-1 Block Cache".
Looks pretty good (provided that it works as intended). Some comments inline.
Could you please add some description of the pinning/unpinning and ref/deref
concept (and the difference between the two) somewhere in javadoc? Otherwise a
lot of this code is very difficult to read. The added complexity is justified
by the performance benefit, but we will have to maintain this code for a long
time, so it should be as self-documenting as possible (think about new
developers joining the project).
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/KeyValue.java:2180 Remove this line
src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListElement.java:1
License
src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListHead.java:1 License
src/main/java/org/apache/hadoop/hbase/KeyValue.java:1222 It would be useful
to mention in the javadoc that the KV is being dereferenced in this method
because it does not refer to the backing array of an HFile block anymore.
src/main/java/org/apache/hadoop/hbase/KeyValue.java:2091 Could you put the
code specific to reference counting code into some other class? KeyValue is
already huge.
src/main/java/org/apache/hadoop/hbase/KeyValue.java:2141-2186 Make these
counters final.
src/main/java/org/apache/hadoop/hbase/KeyValue.java:2192 Remove this line and
other commented-out code in this function.
src/main/java/org/apache/hadoop/hbase/KeyValue.java:2222 Is this increment
thread-safe?
src/main/java/org/apache/hadoop/hbase/KeyValue.java:2237 Naming conventions:
listNum.
A more serious comment: is it possible that one thread will call ref() much
more often? If so, wouldn't it be better to load-balance between in-use lists
randomly? Note that with the current solution we still have to lock at an
individual list level, because multiple threads can still map to the same
in-use list. But I agree that hashing on thread id decreases the amount of
contention.
src/main/java/org/apache/hadoop/hbase/client/HTable.java:623-658 • It might
not be a good idea to add test-only methods to HTable, because that class is
used directly by a lot of clients. In other words, clients pass around HTable
instances and don't cast them to HTableInterface.
I think we should move these methods to a test-only class and/or make them
package-private.
• Replace "anyrandomrow" with a constant.
src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java:112 This
method name change is not terribly useful without an explanation of what
"pinned" is. Would it be better to explain this in javadoc and keep the old
name? This also applies to all other changes of method names including the word
"Pinned" or "AndPin".
src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java:39 Are these
name changes necessary? Maybe it is better to just add javadoc explaining the
"pinning" part.
src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java:322 Nice
refactoring
src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java:39-43 We
definitely need javadoc describing what pin/unpin is and how to use it.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:373 What exactly is
pinned here, and how would the caller unpin it, if that's the intent of the
method name change? The return value is an output stream.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:119 Make this
final?
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:168
initialize2 is not a very descriptive counter name
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1092 Is it
possible to separate "pool allocator" interface from implementation, rather
than just calling a static method? This way we could swap in alternative
allocators in tests.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1499 Please
move the pool code to a separate class. Also, it would be nice to separate pool
allocator interface from implementation, as I mentioned.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java:196
Should these be IOExceptions, since the method already throws them?
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:412 "Inx"
is a very rare abbreviation for "index". Better to spell out "Index".
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:694-695
Make final
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1628-1669 Make
these final.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:408 Would
the additional copy cause a performance regression based on this method's use
cases?
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:417 Would
the additional copy cause a performance regression based on this method's use
cases?
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java:128 add a
"}" at the end of the "{@link KeyValue"
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java:127 in any
HFileBlock
src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java:390-391
Make final
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:2874-2876
Make final
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3071 Fix
formatting ("i =0").
src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:2812-2878
This has to be in a separate class.
src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java:410-411
Make final
src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java:1404 Do we
unpin on close? As far as I remember, this used to be a wrapper over a
ByteArrayInputStream.
src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java:399-400
Make final
src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:547-548
Make final
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java:37
Why would we need to log into this class's logger from a subclass? Please
define another logger in the subclass directly instead.
src/main/java/org/apache/hadoop/hbase/util/Counters.java:1 License
src/main/java/org/apache/hadoop/hbase/util/Counters.java:25 Naming
convention: rename to COUNTER_CLASSES
src/main/java/org/apache/hadoop/hbase/util/Counters.java:23 This class
probably needs a better name, since it is very specific kind of counters
accessed through reflection.
src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListElement.java:1
License
src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListHead.java:1 License
src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListHead.java:4 Make
private?
src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java:83 Most
changes in this class and a lot of other changes in this diff are simply name
changes. Not sure if they are necessary. One advantage is definitely in
reminding the caller to unpin blocks.
REVISION DETAIL
https://reviews.facebook.net/D1635
> GC free memory management in Level-1 Block Cache
> ------------------------------------------------
>
> Key: HBASE-5347
> URL: https://issues.apache.org/jira/browse/HBASE-5347
> Project: HBase
> Issue Type: Improvement
> Reporter: Prakash Khemani
> Assignee: Prakash Khemani
> Attachments: D1635.5.patch
>
>
> On eviction of a block from the block-cache, instead of waiting for the
> garbage collecter to reuse its memory, reuse the block right away.
> This will require us to keep reference counts on the HFile blocks. Once we
> have the reference counts in place we can do our own simple
> blocks-out-of-slab allocation for the block-cache.
> This will help us with
> * reducing gc pressure, especially in the old generation
> * making it possible to have non-java-heap memory backing the HFile blocks
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira