Yes.
On Saturday, July 16, 2011, Mingjian Deng <[email protected]> wrote:
> Do you mean I need to open a new issue?
>
> 2011/7/16 Stack <[email protected]>
>
>> Yes. Please file an issue. A few fellas are messing with the block cache
>> at the moment, so they might be up for taking a detour to figure out the
>> why of your interesting observation.
>>
>> Thanks,
>> St.Ack
>>
>> On Thu, Jul 14, 2011 at 8:41 PM, Mingjian Deng <[email protected]> wrote:
>> > Hi Stack:
>> > Servers A and B are the same in the cluster. If I set
>> > hfile.block.cache.size=0.1 on another server, the problem reappears.
>> > But when I set hfile.block.cache.size=0.15 or more, it does not. So I
>> > think you can test on your own cluster.
>> > With the following btrace code:
>> > --------------------------------------------------------------
>> > import static com.sun.btrace.BTraceUtils.*;
>> > import com.sun.btrace.annotations.*;
>> >
>> > import java.nio.ByteBuffer;
>> > import org.apache.hadoop.hbase.io.hfile.*;
>> >
>> > @BTrace public class TestRegion1 {
>> >     @OnMethod(
>> >         clazz="org.apache.hadoop.hbase.io.hfile.HFile$Reader",
>> >         method="decompress"
>> >     )
>> >     public static void traceCacheBlock(final long offset,
>> >             final int compressedSize, final int decompressedSize,
>> >             final boolean pread) {
>> >         println(strcat("decompress: ", str(decompressedSize)));
>> >     }
>> > }
>> > --------------------------------------------------------------
>> >
>> > If I set hfile.block.cache.size=0.1, the result is:
>> > -----------
>> > .......
>> > decompress: 6020488
>> > decompress: 6022536
>> > decompress: 5991304
>> > decompress: 6283272
>> > decompress: 5957896
>> > decompress: 6246280
>> > decompress: 6041096
>> > decompress: 6541448
>> > decompress: 6039560
>> > .......
>> > -----------
>> > If I set hfile.block.cache.size=0.12, the result is:
>> > -----------
>> > ......
>> > decompress: 65775
>> > decompress: 65556
>> > decompress: 65552
>> > decompress: 9914120
>> > decompress: 6026888
>> > decompress: 65615
>> > decompress: 65627
>> > decompress: 6247944
>> > decompress: 5880840
>> > decompress: 65646
>> > ......
>> > -----------
>> > If I set hfile.block.cache.size=0.15 or more, the result is:
>> > -----------
>> > ......
>> > decompress: 65646
>> > decompress: 65615
>> > decompress: 65627
>> > decompress: 65775
>> > decompress: 65556
>> > decompress: 65552
>> > decompress: 65646
>> > decompress: 65615
>> > decompress: 65627
>> > decompress: 65775
>> > decompress: 65556
>> > decompress: 65552
>> > ......
>> > -----------
>> >
>> > All of the above tests ran for more than 10 minutes at a high read
>> > rate, so it is a very strange phenomenon.
>> >
>> > 2011/7/15 Stack <[email protected]>
>> >
>> >> This is interesting. Any chance that the cells in the regions hosted
>> >> on server A are 5MB in size?
>> >>
>> >> The hfile block size is by default configured to be 64KB, but rarely
>> >> would an hfile block be exactly 64KB. We do not cut the hfile block
>> >> content at exactly 64KB; the hfile block boundary falls on a keyvalue
>> >> boundary.
>> >>
>> >> If a cell were 5MB, it would not get split across multiple hfile
>> >> blocks. It would occupy one hfile block.
>> >>
>> >> Could it be that the region hosted on A is not like the others and
>> >> has lots of these 5MB cells?
>> >>
>> >> Let us know. If the above is not the case, then you have an
>> >> interesting phenomenon going on and we need to dig in more.
>> >>
>> >> St.Ack
>> >>
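
As a side note, here is a minimal Java sketch of the block-cutting rule Stack describes in the quoted thread: the 64KB target is only checked after a whole keyvalue has been appended, so a block is never cut mid-cell. This is not HBase's actual HFile writer code; the class and method names (BlockBoundarySketch, blockSizes) are made up purely for illustration.
--------------------------------------------------------------
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockBoundarySketch {
    // Default hfile block size target (64KB).
    static final int BLOCK_SIZE_TARGET = 64 * 1024;

    // Sizes of the blocks a stream of keyvalues would be cut into.
    static List<Integer> blockSizes(int[] keyValueSizes) {
        List<Integer> blocks = new ArrayList<>();
        int current = 0;
        for (int kvSize : keyValueSizes) {
            current += kvSize; // a keyvalue is never split across blocks
            // The boundary is checked only after a whole keyvalue has been
            // appended, so a block usually closes slightly past the target.
            if (current >= BLOCK_SIZE_TARGET) {
                blocks.add(current);
                current = 0;
            }
        }
        if (current > 0) {
            blocks.add(current); // trailing partial block
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Many ~200-byte cells: every block closes just past 64KB, like the
        // "decompress: 65775"-style figures in the trace above.
        int[] small = new int[1000];
        Arrays.fill(small, 200);
        System.out.println(blockSizes(small).get(0)); // 65600

        // A single 5MB cell occupies one ~5MB block by itself, consistent
        // with Stack's hypothesis about the ~6MB decompress figures.
        int[] big = {5 * 1024 * 1024};
        System.out.println(blockSizes(big)); // [5242880]
    }
}
--------------------------------------------------------------
With many small cells every block lands just past the 64KB target, matching the 65xxx figures in the trace; a single ~5MB cell produces one ~5MB block, which would be consistent with Stack's hypothesis that server A's regions hold unusually large cells.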
