[
https://issues.apache.org/jira/browse/HBASE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874492#action_12874492
]
HBase Review Board commented on HBASE-2468:
-------------------------------------------
Message from: "Mingjie Lai" <[email protected]>
bq. On 2010-05-31 21:52:12, Todd Lipcon wrote:
bq. > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1075
bq. > <http://review.hbase.org/r/98/diff/4/?file=872#file872line1075>
bq. >
bq. > I feel like it would be cleaner to have the following methods be non-static, so they don't need any arguments. That would reduce the number of functions by a factor of two
That was what I wanted to do at the very beginning. I don't like having so many
functions either.
But there is already an existing pair of static methods:
public static boolean isTableEnabled(String tableName) ...
public static boolean isTableEnabled(Configuration conf, String tableName) ...
That is one of the reasons I kept the original coding style: to be
consistent.
In addition, I agree we can make isRegionCachePrefetchEnabled() etc.
non-static. However, enableRegionCachePrefetch() and
disableRegionCachePrefetch() must be static, since we want to enable/disable
prefetch for a table without instantiating HTable. In other words, we cannot
really enable/disable cache prefetch separately for each HTable instance that
shares the same table name; we can only enable/disable it by table name. This
is very similar to table enable/disable.
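A minimal sketch of the point above, not the code in the patch: the setting is stored per table name, so the enable/disable calls are static, while an instance-level query can be a thin non-static wrapper. The classes PrefetchSettings and TableHandle are hypothetical stand-ins, and "prefetch is on unless disabled" is an assumption made for the illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not the HTable/HConnectionManager code under review.
final class PrefetchSettings {
    // Table names with prefetch switched off (assumption: prefetch is on by default).
    private static final Set<String> DISABLED = ConcurrentHashMap.newKeySet();

    private PrefetchSettings() {}

    // Static: callable without constructing any table handle.
    static void enableRegionCachePrefetch(String tableName) {
        DISABLED.remove(tableName);
    }

    static void disableRegionCachePrefetch(String tableName) {
        DISABLED.add(tableName);
    }

    static boolean isRegionCachePrefetchEnabled(String tableName) {
        return !DISABLED.contains(tableName);
    }
}

// Stand-in for a table handle; many instances may share one table name.
final class TableHandle {
    private final String tableName;

    TableHandle(String tableName) {
        this.tableName = tableName;
    }

    // Non-static convenience: no argument needed, the instance knows its table.
    boolean isRegionCachePrefetchEnabled() {
        return PrefetchSettings.isRegionCachePrefetchEnabled(tableName);
    }
}
```

With this shape, disabling prefetch for a table name is visible to every handle opened on that name, which is why a per-instance setter would be misleading.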
bq. On 2010-05-31 21:52:12, Todd Lipcon wrote:
bq. > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1608
bq. > <http://review.hbase.org/r/98/diff/4/?file=871#file871line1608>
bq. >
bq. > kill this function
I don't like it either. I will kill all "is...Disabled()" methods.
bq. On 2010-05-31 21:52:12, Todd Lipcon wrote:
bq. > src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java, line 3639
bq. > <http://review.hbase.org/r/98/diff/4/?file=874#file874line3639>
bq. >
bq. > I really like these unit tests, but I think you should use a row key for the Get that isn't also a region start key, as it may expose different behavior. E.g. I think you will end up with 11 cached regions instead of 10
As mentioned above, I reimplemented MetaScanner so that it starts scanning Meta
from the region row where the given table row resides, instead of from
the next region row in Meta. MetaScanner calls HTable.getRowOrBefore()
to achieve this (although I'm not sure whether that is the most efficient way
to do it).
So for this unit test, whether or not the passed row is a region start key,
we will always get 10 here.
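A small sketch of why the count does not depend on the row, assuming a simplified model in which Meta is a sorted map from region start key to region name. MetaScanSketch and its methods are hypothetical; TreeMap.floorKey() stands in for the "row or before" lookup and is not the HBase call itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of a meta table: region start key -> region name.
final class MetaScanSketch {
    private final NavigableMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String regionName) {
        regionsByStartKey.put(startKey, regionName);
    }

    // Returns up to `limit` regions, beginning with the region that contains `row`.
    List<String> prefetch(String row, int limit) {
        List<String> result = new ArrayList<>();
        // Greatest start key <= row: the region that holds the row.
        String start = regionsByStartKey.floorKey(row);
        if (start == null) {
            return result; // row sorts before the first region
        }
        for (String name : regionsByStartKey.tailMap(start, true).values()) {
            if (result.size() == limit) {
                break;
            }
            result.add(name);
        }
        return result;
    }
}
```

Because the scan begins at the containing region in both cases, a row equal to a start key and a row in the middle of the same region yield the same list of regions.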
bq. On 2010-05-31 21:52:12, Todd Lipcon wrote:
bq. > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, lines 776-781
bq. > <http://review.hbase.org/r/98/diff/4/?file=871#file871line776>
bq. >
bq. > this block should go after the cache check below, right?
I reimplemented MetaScanner so that it starts scanning Meta from the region
where the given user table row resides, instead of from the next
region row in Meta.
In this case, prefetchRegionCache() also fetches the queried table+row into the
cache. I put it before the cache check block so that the result can be loaded
from the cache directly. Otherwise an extra meta scan may be needed when cache
prefetch is enabled.
However, if multiple threads enter this block concurrently, the cache prefetch
may be executed twice. I think this case is pretty rare, so the penalty should
be relatively small.
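To make the ordering concrete, here is a rough sketch under simplifying assumptions; it is not the HConnectionManager code. RegionLocatorSketch, Region, locate() and the scan counter are all hypothetical, and both Meta and the client cache are modelled as sorted maps keyed by region start key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and structure are assumptions, not the patch under review.
final class RegionLocatorSketch {
    // A region covers [startKey, endKey); an empty endKey means no upper bound.
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    private final ConcurrentSkipListMap<String, Region> meta = new ConcurrentSkipListMap<>();
    private final ConcurrentSkipListMap<String, Region> cache = new ConcurrentSkipListMap<>();
    private final AtomicInteger metaScans = new AtomicInteger();
    private final int prefetchLimit; // assumed to be at least 1

    RegionLocatorSketch(int prefetchLimit, String... splitKeys) {
        this.prefetchLimit = prefetchLimit;
        String start = "";
        for (String split : splitKeys) {
            meta.put(start, new Region(start, split));
            start = split;
        }
        meta.put(start, new Region(start, ""));
    }

    int metaScans() {
        return metaScans.get();
    }

    private Region cached(String row) {
        Map.Entry<String, Region> e = cache.floorEntry(row);
        return (e != null && e.getValue().contains(row)) ? e.getValue() : null;
    }

    // One meta scan: starts at the region containing `row`, caches up to prefetchLimit regions.
    private void prefetch(String row) {
        metaScans.incrementAndGet();
        int loaded = 0;
        for (Region r : meta.tailMap(meta.floorKey(row), true).values()) {
            if (loaded == prefetchLimit) {
                break;
            }
            cache.put(r.startKey, r);
            loaded++;
        }
    }

    Region locate(String row) {
        // Prefetch before the final cache check. No lock is held here, so two
        // threads that miss at the same time may both scan; the result stays correct.
        if (cached(row) == null) {
            prefetch(row);
        }
        // The prefetch also loaded the region for `row`, so this check hits
        // and no separate single-row meta lookup is needed.
        return cached(row);
    }
}
```

In this model a miss costs exactly one meta scan per thread that observes it. Serializing the prefetch behind a lock would remove the duplicate scan at the cost of blocking concurrent callers.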
What do you think?
- Mingjie
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/98/#review107
-----------------------------------------------------------
> Improvements to prewarm META cache on clients
> ---------------------------------------------
>
> Key: HBASE-2468
> URL: https://issues.apache.org/jira/browse/HBASE-2468
> Project: HBase
> Issue Type: Improvement
> Components: client
> Reporter: Todd Lipcon
> Assignee: Mingjie Lai
> Fix For: 0.21.0
>
> Attachments: HBASE-2468-trunk.patch
>
>
> A couple different use cases cause storms of reads to META during startup.
> For example, a large MR job will cause each map task to hit meta since it
> starts with an empty cache.
> A couple possible improvements have been proposed:
> - MR jobs could ship a copy of META for the table in the DistributedCache
> - Clients could prewarm cache by doing a large scan of all the meta for the
> table instead of random reads for each miss
> - Each miss could fetch ahead some number of rows in META