Do you mean I need to open a new issue?

2011/7/16 Stack <[email protected]>
> Yes.  Please file an issue.  A few fellas are messing with block cache
> at the moment so they might be up for taking a detour to figure the
> why on your interesting observation.
>
> Thanks,
> St.Ack
>
> On Thu, Jul 14, 2011 at 8:41 PM, Mingjian Deng <[email protected]> wrote:
> > Hi stack:
> > Server A and the other servers in the cluster are the same. If I set
> > hfile.block.cache.size=0.1 on another server, the problem reappears.
> > But when I set hfile.block.cache.size=0.15 or more, it does not
> > reappear, so I think you can reproduce it on your own cluster.
> > With the following btrace code:
> > --------------------------------------------------------------
> > import static com.sun.btrace.BTraceUtils.*;
> > import com.sun.btrace.annotations.*;
> >
> > import java.nio.ByteBuffer;
> > import org.apache.hadoop.hbase.io.hfile.*;
> >
> > @BTrace public class TestRegion1{
> >   @OnMethod(
> >     clazz="org.apache.hadoop.hbase.io.hfile.HFile$Reader",
> >     method="decompress"
> >   )
> >   public static void traceCacheBlock(final long offset, final int compressedSize,
> >       final int decompressedSize, final boolean pread){
> >     println(strcat("decompress: ", str(decompressedSize)));
> >   }
> > }
> > --------------------------------------------------------------
> >
> > If I set hfile.block.cache.size=0.1, the result is:
> > -----------
> > .......
> > decompress: 6020488
> > decompress: 6022536
> > decompress: 5991304
> > decompress: 6283272
> > decompress: 5957896
> > decompress: 6246280
> > decompress: 6041096
> > decompress: 6541448
> > decompress: 6039560
> > .......
> > -----------
> > If I set hfile.block.cache.size=0.12, the result is:
> > -----------
> > ......
> > decompress: 65775
> > decompress: 65556
> > decompress: 65552
> > decompress: 9914120
> > decompress: 6026888
> > decompress: 65615
> > decompress: 65627
> > decompress: 6247944
> > decompress: 5880840
> > decompress: 65646
> > ......
> > -----------
> > If I set hfile.block.cache.size=0.15 or more, the result is:
> > -----------
> > ......
> > decompress: 65646
> > decompress: 65615
> > decompress: 65627
> > decompress: 65775
> > decompress: 65556
> > decompress: 65552
> > decompress: 65646
> > decompress: 65615
> > decompress: 65627
> > decompress: 65775
> > decompress: 65556
> > decompress: 65552
> > ......
> > -----------
> >
> > All of the above tests ran for more than 10 minutes at a high read
> > rate, so it is a very strange phenomenon.
> >
> > 2011/7/15 Stack <[email protected]>
> >
> >> This is interesting.  Any chance that the cells on the regions hosted
> >> on server A are 5M in size?
> >>
> >> The hfile block sizes are by default configured to be 64k, but rarely
> >> would an hfile block be exactly 64k.  We do not cut the hfile block
> >> content at exactly 64k; the hfile block boundary will be at a
> >> keyvalue boundary.
> >>
> >> If a cell is 5MB, it does not get split across multiple hfile
> >> blocks.  It will occupy one hfile block.
> >>
> >> Could it be that the region hosted on A is not like the others and it
> >> has lots of these 5MB cells?
> >>
> >> Let us know.  If the above is not the case, then you have an
> >> interesting phenomenon going on and we need to dig in more.
> >>
> >> St.Ack
> >>
> >>
> >> On Thu, Jul 14, 2011 at 5:27 AM, Mingjian Deng <[email protected]> wrote:
> >> > Hi:
> >> > We found a strange problem in our read test.
> >> > It is a 5-node cluster. Four of our 5 regionservers set
> >> > hfile.block.cache.size=0.4; on one of them it is 0.1 (node A). When
> >> > we randomly read from a 2TB data table, we found node A's network
> >> > traffic reached 100MB while the others' was less than 10MB. We know
> >> > node A needs to read data from disk and put it in the block cache.
> >> > In the following code in LruBlockCache:
> >> >
> >> > --------------------------------------------------------------------------------------------------------------------------
> >> > public void cacheBlock(String blockName, ByteBuffer buf, boolean inMemory) {
> >> >   CachedBlock cb = map.get(blockName);
> >> >   if(cb != null) {
> >> >     throw new RuntimeException("Cached an already cached block");
> >> >   }
> >> >   cb = new CachedBlock(blockName, buf, count.incrementAndGet(), inMemory);
> >> >   long newSize = size.addAndGet(cb.heapSize());
> >> >   map.put(blockName, cb);
> >> >   elements.incrementAndGet();
> >> >   if(newSize > acceptableSize() && !evictionInProgress) {
> >> >     runEviction();
> >> >   }
> >> > }
> >> > --------------------------------------------------------------------------------------------------------------------------
> >> >
> >> > We debugged this code with btrace using the following script:
> >> >
> >> > --------------------------------------------------------------------------------------------------------------------------
> >> > import static com.sun.btrace.BTraceUtils.*;
> >> > import com.sun.btrace.annotations.*;
> >> >
> >> > import java.nio.ByteBuffer;
> >> > import org.apache.hadoop.hbase.io.hfile.*;
> >> >
> >> > @BTrace public class TestRegion{
> >> >   @OnMethod(
> >> >     clazz="org.apache.hadoop.hbase.io.hfile.LruBlockCache",
> >> >     method="cacheBlock"
> >> >   )
> >> >   public static void traceCacheBlock(@Self LruBlockCache instance, String blockName,
> >> >       ByteBuffer buf, boolean inMemory){
> >> >     println(strcat("size: ",
> >> >         str(get(field("org.apache.hadoop.hbase.io.hfile.LruBlockCache","size"), instance))));
> >> >     println(strcat("elements: ",
> >> >         str(get(field("org.apache.hadoop.hbase.io.hfile.LruBlockCache","elements"), instance))));
> >> >   }
> >> > }
> >> > --------------------------------------------------------------------------------------------------------------------------
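The eviction trigger in the cacheBlock code above depends on acceptableSize(), which is derived from the heap size and hfile.block.cache.size. As a back-of-the-envelope sketch, using a hypothetical 8GB heap and an assumed acceptable factor of 0.85 (neither figure comes from this thread, and the factor may differ across HBase versions), the threshold at 0.1 sits far lower than at 0.4:

```java
public class CacheSizingSketch {
    // All values below are illustrative assumptions, not measurements from this cluster.
    static long acceptableSize(long heapBytes, double cacheFraction, double acceptableFactor) {
        long maxSize = (long) Math.floor(heapBytes * cacheFraction); // total cache budget
        return (long) Math.floor(maxSize * acceptableFactor);        // eviction threshold
    }

    public static void main(String[] args) {
        long heap = 8L * 1024 * 1024 * 1024; // hypothetical 8GB heap
        double acceptable = 0.85;            // assumed acceptable factor

        // With hfile.block.cache.size=0.1 the eviction threshold is roughly 700MB;
        // a stream of ~5-6MB blocks crosses it far sooner than 64KB blocks would.
        System.out.println(acceptableSize(heap, 0.1, acceptable)); // ~696MiB
        System.out.println(acceptableSize(heap, 0.4, acceptable)); // ~2.7GiB
    }
}
```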
> >> >
> >> > We found that the "size" increases by 5 MB each time on node A! Why
> >> > not 64 KB each time?? But the "size" increases by 64 KB when we run
> >> > this btrace script on the other nodes at the same time.
> >> >
> >> > The following script also confirms the problem, because the
> >> > "decompressedSize" is 5 MB each time on node A:
> >> >
> >> > -------------------------------------------------------------------------------------------------------------------------
> >> > import static com.sun.btrace.BTraceUtils.*;
> >> > import com.sun.btrace.annotations.*;
> >> >
> >> > import java.nio.ByteBuffer;
> >> > import org.apache.hadoop.hbase.io.hfile.*;
> >> >
> >> > @BTrace public class TestRegion1{
> >> >   @OnMethod(
> >> >     clazz="org.apache.hadoop.hbase.io.hfile.HFile$Reader",
> >> >     method="decompress"
> >> >   )
> >> >   public static void traceCacheBlock(final long offset, final int compressedSize,
> >> >       final int decompressedSize, final boolean pread){
> >> >     println(strcat("decompressedSize: ", str(decompressedSize)));
> >> >   }
> >> > }
> >> > -------------------------------------------------------------------------------------------------------------------------
> >> >
> >> > Why not 64 KB?
> >> >
> >> > BTW: When we set hfile.block.cache.size=0.4 on node A, the
> >> > "decompressedSize" drops to 64 KB, and the tps goes up to a high level.
> >
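Stack's explanation of hfile block boundaries upthread can be sketched as a toy model (this is illustrative logic written for this thread, not the actual HFile writer code): keyvalues are appended whole, and a block is only closed once its accumulated size reaches the 64KB target, so many small cells produce blocks slightly over 64KB, while a single 5MB cell produces a single ~5MB block.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockBoundarySketch {
    static final int TARGET_BLOCK_SIZE = 64 * 1024; // 64k default block size

    // Returns the sizes of the blocks produced for the given keyvalue sizes.
    static List<Integer> cutBlocks(int[] kvSizes) {
        List<Integer> blocks = new ArrayList<Integer>();
        int current = 0;
        for (int kv : kvSizes) {
            current += kv;                       // a keyvalue is never split
            if (current >= TARGET_BLOCK_SIZE) {  // close the block only at a kv boundary
                blocks.add(current);
                current = 0;
            }
        }
        if (current > 0) blocks.add(current);    // trailing partial block
        return blocks;
    }

    public static void main(String[] args) {
        // 100 small 1000-byte cells: blocks end up slightly over 64k.
        int[] small = new int[100];
        java.util.Arrays.fill(small, 1000);
        System.out.println(cutBlocks(small)); // [66000, 34000]

        // One 5MB cell: it occupies a single ~5MB block.
        int[] big = {5 * 1024 * 1024};
        System.out.println(cutBlocks(big)); // [5242880]
    }
}
```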

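One way to see why a too-small cache on node A would turn almost every read into a multi-megabyte decompress is a toy LRU simulation (the 100MB cache, block sizes, and uniform access pattern below are made-up illustrations, not measurements from this cluster): once the hot set of blocks no longer fits in the cache, nearly every read misses and a full block must be read and decompressed again.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruChurnSketch {
    // Tiny byte-budgeted LRU cache built on LinkedHashMap's access order.
    static double hitRate(final long cacheBytes, long blockBytes, int distinctBlocks, int reads) {
        final long[] used = {0};
        LinkedHashMap<Integer, Long> cache = new LinkedHashMap<Integer, Long>(16, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<Integer, Long> e) {
                if (used[0] > cacheBytes) { used[0] -= e.getValue(); return true; }
                return false;
            }
        };
        java.util.Random rnd = new java.util.Random(42); // fixed seed for repeatability
        int hits = 0;
        for (int i = 0; i < reads; i++) {
            int key = rnd.nextInt(distinctBlocks);       // uniform reads over the hot set
            if (cache.containsKey(key)) { hits++; cache.get(key); } // touch to refresh LRU order
            else { used[0] += blockBytes; cache.put(key, blockBytes); } // miss: load the block
        }
        return (double) hits / reads;
    }

    public static void main(String[] args) {
        long cache = 100L * 1024 * 1024; // hypothetical 100MB block cache
        // Hot set of 1000 blocks: 64KB blocks all fit, 5MB blocks do not.
        System.out.printf("64KB blocks: %.2f%n", hitRate(cache, 64 * 1024, 1000, 100000));
        System.out.printf("5MB blocks:  %.2f%n", hitRate(cache, 5L * 1024 * 1024, 1000, 100000));
    }
}
```

With 64KB blocks the whole hot set (~64MB) fits and the hit rate stays near 1; with 5MB blocks only about 20 of the 1000 blocks fit at once, so the cache churns and the hit rate collapses, which matches the constant large decompress calls observed on node A.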