[
https://issues.apache.org/jira/browse/HBASE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873880#action_12873880
]
HBase Review Board commented on HBASE-2468:
-------------------------------------------
Message from: "Todd Lipcon" <[email protected]>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/98/#review107
-----------------------------------------------------------
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment635>
typo: locationS
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment633>
I think any of getCachedRegionCount, countCachedRegions, or
getNumCachedRegions would be a clearer name.
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment634>
style-wise, why "final" here?
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment636>
remove these @param javadocs - they just take up space if the param names
are self-explanatory (which these are)
same thing above
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment637>
this needs a Comparator argument, otherwise it does object identity rather
than bytewise comparison of the table names
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment640>
javadoc out of date - it prefetches the region for the given row, and
prefetchLimit regions ahead
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment638>
should return false to stop scanning at this point, right?
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment639>
// cache this meta entry
(it's not caching all)
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment641>
this block should go after the cache check below, right?
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment647>
this function needs to synchronized on cachedRegionLocations which is an
unsynchronized HashMap
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment648>
return location != null;
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/98/#comment654>
kill this function
src/main/java/org/apache/hadoop/hbase/client/HTable.java
<http://review.hbase.org/r/98/#comment649>
you should prefix the file with writeInt(allRegions.size()) so you don't
have to use EOFException catching below
src/main/java/org/apache/hadoop/hbase/client/HTable.java
<http://review.hbase.org/r/98/#comment650>
yuck, see above
src/main/java/org/apache/hadoop/hbase/client/HTable.java
<http://review.hbase.org/r/98/#comment651>
I feel like it would be cleaner to have the following methods be
non-static, so they don't need any arguments. that would reduce the number of
functions by factor of two
src/main/java/org/apache/hadoop/hbase/client/HTable.java
<http://review.hbase.org/r/98/#comment653>
incorrect javadoc, also a few places below
src/main/java/org/apache/hadoop/hbase/client/HTable.java
<http://review.hbase.org/r/98/#comment652>
why do we need a separate function for enabled and disabled? aren't they
always inverse of each other?
src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
<http://review.hbase.org/r/98/#comment642>
should make clear this is the row name in the user table, not the meta
table.
should also be clarified that it will start scanning with the region
*after* row, because it doesn't use getClosestRowBefore
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
<http://review.hbase.org/r/98/#comment643>
useless @throws
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
<http://review.hbase.org/r/98/#comment645>
you should use util.getTestDir here
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
<http://review.hbase.org/r/98/#comment644>
import java.io.File
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
<http://review.hbase.org/r/98/#comment646>
I really like these unit tests, but I think you should use a row key for
the Get that isn't also a region start key, as it may expose different
behavior. eg I think you will end up with 11 cached regions instead of 10
- Todd
> Improvements to prewarm META cache on clients
> ---------------------------------------------
>
> Key: HBASE-2468
> URL: https://issues.apache.org/jira/browse/HBASE-2468
> Project: HBase
> Issue Type: Improvement
> Components: client
> Reporter: Todd Lipcon
> Assignee: Mingjie Lai
> Fix For: 0.21.0
>
> Attachments: HBASE-2468-trunk.patch
>
>
> A couple different use cases cause storms of reads to META during startup.
> For example, a large MR job will cause each map task to hit meta since it
> starts with an empty cache.
> A couple possible improvements have been proposed:
> - MR jobs could ship a copy of META for the table in the DistributedCache
> - Clients could prewarm cache by doing a large scan of all the meta for the
> table instead of random reads for each miss
> - Each miss could fetch ahead some number of rows in META
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.