[ 
https://issues.apache.org/jira/browse/HBASE-17849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948063#comment-15948063
 ] 

Enis Soztutar commented on HBASE-17849:
---------------------------------------

I have noticed something similar: the regions created by the pre-split do not 
grow at an equal rate. The last region(s) in particular are relatively small. 

> PE tool randomness is not totally random
> ----------------------------------------
>
>                 Key: HBASE-17849
>                 URL: https://issues.apache.org/jira/browse/HBASE-17849
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 2.0.0
>            Reporter: ramkrishna.s.vasudevan
>             Fix For: 2.0.0
>
>
> Recently we were using the PE tool for some bucket cache related 
> performance tests. One thing we noticed was that the random read is not 
> truly random.
> Suppose we load 200G of data using the --size param and then use 
> --rows=500000 to do the randomRead. Our assumption was that it would generate 
> 500000 random row keys from across the whole 200G of data to do the reads.
> But it turns out that the PE tool generates random rows only from the set of 
> row keys that fall within the first 500000 rows. 
> This was quite evident when we tried to use HBASE-15314 in our testing. 
> Suppose we split a 200G bucket cache into 2 files of 100G each: the 
> randomReads with --rows=500000 always land in the first file and never in the 
> 2nd file. It would be better to make PE purely random.
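
To make the reported skew concrete, here is a minimal simulation sketch. It is 
not the PE tool's code: the function names, the TOTAL_ROWS figure standing in 
for the 200G load, and the idea that the second half of the key space 
corresponds to the second bucket cache file are all assumptions made for 
illustration. It only contrasts drawing keys from the first --rows rows with 
drawing them from the whole table.

```python
import random


def sample_row_indices(total_rows, rows_to_read, restrict_to_prefix, seed=0):
    """Draw rows_to_read random row indices from a table of total_rows rows.

    restrict_to_prefix=True models the behavior described in the report:
    keys come only from the first rows_to_read rows.
    restrict_to_prefix=False models the expected behavior: keys come from
    the whole table.
    """
    rng = random.Random(seed)
    upper = rows_to_read if restrict_to_prefix else total_rows
    return [rng.randrange(upper) for _ in range(rows_to_read)]


def fraction_in_second_half(indices, total_rows):
    """Fraction of sampled indices that fall in the upper half of the key space."""
    half = total_rows // 2
    return sum(1 for i in indices if i >= half) / len(indices)


# Hypothetical numbers: the real row count for a 200G load depends on the
# row and value sizes used, which the report does not state.
TOTAL_ROWS = 200_000_000
ROWS_TO_READ = 500_000

reported = sample_row_indices(TOTAL_ROWS, ROWS_TO_READ, restrict_to_prefix=True)
expected = sample_row_indices(TOTAL_ROWS, ROWS_TO_READ, restrict_to_prefix=False)

print(fraction_in_second_half(reported, TOTAL_ROWS))
print(fraction_in_second_half(expected, TOTAL_ROWS))
```

Under these assumptions the first fraction is exactly 0.0, since every sampled 
index is below 500000, while the second comes out close to 0.5. That matches 
the observation that the reads never reach the second file.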



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
