[ https://issues.apache.org/jira/browse/HBASE-17849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040593#comment-16040593 ]
Hudson commented on HBASE-17849:
--------------------------------
FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3151 (See
[https://builds.apache.org/job/HBase-Trunk_matrix/3151/])
HBASE-17849 PE tool random read is not totally random (Ram) (ramkrishna: rev
1d3252eb59a0e7dbc2f120e68a22d9429bc596a9)
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/TestPerformanceEvaluation.java
> PE tool random read is not totally random
> -----------------------------------------
>
> Key: HBASE-17849
> URL: https://issues.apache.org/jira/browse/HBASE-17849
> Project: HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 2.0.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-17849_1.patch, HBASE-17849_2.patch,
> HBASE-17849.patch, HBASE-17849.patch
>
>
> Recently we were using the PE tool for some bucket cache related
> performance tests. One thing we noticed is that the random read is not
> totally random.
> Suppose we load 200G of data using the --size param and then use --rows=500000
> for the randomRead. The assumption was that, among the 200G of data, it would
> randomly generate 500000 row keys to read.
> But it turns out that the PE tool generates random row keys only from the
> set of row keys that fall within the first 500000 rows.
> This was quite evident when we tried to use HBASE-15314 in our testing.
> Suppose we split a bucket cache of size 200G into 2 files of 100G each: the
> randomReads with --rows=500000 always land in the first file and never in the
> 2nd file. It would be better to make PE purely random.
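
The behavior described above can be illustrated with a minimal sketch. This is not the actual PerformanceEvaluation code or the committed patch: the class name, the method names, and the total row count of 200,000,000 are hypothetical and chosen only for illustration. It contrasts drawing keys from the first --rows keys with drawing them from the whole loaded key space.

```java
import java.util.Random;

/** Hypothetical sketch only; not taken from PerformanceEvaluation.java. */
final class RandomRowSketch {

    /** Behavior described in the report: keys come only from [0, rowsToRead). */
    static long pickFromFirstRows(Random rng, int rowsToRead) {
        return rng.nextInt(rowsToRead);
    }

    /**
     * Alternative: keys come from the whole loaded key space [0, totalRows).
     * floorMod keeps the result non-negative; the slight modulo bias is
     * ignored here because this is only a sketch.
     */
    static long pickFromAllRows(Random rng, long totalRows) {
        return Math.floorMod(rng.nextLong(), totalRows);
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int rowsToRead = 500_000;        // the --rows value from the report
        long totalRows = 200_000_000L;   // assumed row count, illustration only
        long maxNarrow = 0;
        long maxWide = 0;
        for (int i = 0; i < rowsToRead; i++) {
            maxNarrow = Math.max(maxNarrow, pickFromFirstRows(rng, rowsToRead));
            maxWide = Math.max(maxWide, pickFromAllRows(rng, totalRows));
        }
        // The narrow picker can never reach a key at or beyond rowsToRead.
        System.out.println("largest key, first-rows picker: " + maxNarrow);
        System.out.println("largest key, all-rows picker:   " + maxWide);
    }
}
```

With the first picker, every read stays inside the first 500000 keys, which would match the observation that reads always land in the first bucket cache file. The second picker spreads the same number of reads over all loaded keys.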