[ https://issues.apache.org/jira/browse/HBASE-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175152#comment-13175152 ]
Phabricator commented on HBASE-4916:
------------------------------------

jdcryans has commented on the revision "HBASE-4916 [jira] LoadTest MR Job".

  Another day of testing later, here's what I think of the current patch:

   - I like being able to drive a lot of load from just a few mappers; this is definitely something YCSB and PE can't do.
   - Being able to mix things is fun too; with PE I'd have to start 2 tests. YCSB is more versatile, though, since you can tweak more than just random reads and writes.
   - Regarding the workloads, they're also much easier to set up in YCSB, as you don't have to write code.

  My other comments are in the code below.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadTest.java:197 HBC.create() ??
  src/main/java/org/apache/hadoop/hbase/loadtest/Workload.java:100 I understand why the table is recreated every time (you need to know the existing keys), but this is also a big flaw for this load tester. If you have a respectable number of nodes, it takes a lot of time to generate enough data so that your block cache/OS cache cannot hold all of it. Being able to restart using an existing table is a must-have IMO. How are you guys using it?
  src/main/java/org/apache/hadoop/hbase/loadtest/Workload.java:136 I think this should be configurable.
  src/main/java/org/apache/hadoop/hbase/loadtest/GetGenerator.java:97 This is one of the other big problems with this load generator: the only pattern is random. YCSB lets you do that but also offers zipf or latest distributions.
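
  To make the GetGenerator.java:97 comment concrete, here is a rough sketch of what a pluggable key pattern could look like. It is not code from the patch and it is not YCSB's implementation: the class name KeyChooser, the Pattern enum, and the constructor parameters are all hypothetical, and the Zipf sampler is a simple table-based one that assumes the key count fits comfortably in memory.

```java
import java.util.Random;

/**
 * Hypothetical key chooser (not part of the HBASE-4916 patch).
 * Picks a key index in [0, keyCount) according to an access pattern.
 */
final class KeyChooser {
    enum Pattern { UNIFORM, ZIPF, LATEST }

    private final Pattern pattern;
    private final int keyCount;
    private final Random random;
    // Cumulative Zipf probabilities; cdf[i] = P(rank <= i).
    private final double[] cdf;

    KeyChooser(Pattern pattern, int keyCount, double skew, long seed) {
        if (keyCount <= 0) {
            throw new IllegalArgumentException("keyCount must be positive");
        }
        this.pattern = pattern;
        this.keyCount = keyCount;
        this.random = new Random(seed);
        this.cdf = new double[keyCount];
        double sum = 0.0;
        for (int rank = 1; rank <= keyCount; rank++) {
            sum += 1.0 / Math.pow(rank, skew); // weight of this rank
            cdf[rank - 1] = sum;
        }
        for (int i = 0; i < keyCount; i++) {
            cdf[i] /= sum; // normalize so the last entry is 1.0
        }
    }

    /** Draws a Zipf-distributed rank in [0, keyCount); rank 0 is the hottest. */
    private int nextRank() {
        double u = random.nextDouble();
        int lo = 0;
        int hi = keyCount - 1;
        // Binary search for the first index whose cumulative probability reaches u.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cdf[mid] < u) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Returns the next key index to read. */
    int nextKey() {
        switch (pattern) {
            case UNIFORM:
                return random.nextInt(keyCount);
            case ZIPF:
                return nextRank();              // low indexes are hot
            case LATEST:
                return keyCount - 1 - nextRank(); // most recently written keys are hot
            default:
                throw new IllegalStateException("unknown pattern: " + pattern);
        }
    }
}
```

  A generator would then call something like new KeyChooser(KeyChooser.Pattern.ZIPF, numKeys, 1.0, seed).nextKey() instead of a plain random pick, with the pattern and skew read from the job configuration. The table costs O(keyCount) memory, so a very large key space would need a different sampler.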
REVISION DETAIL
  https://reviews.facebook.net/D741

> LoadTest MR Job
> ---------------
>
>                 Key: HBASE-4916
>                 URL: https://issues.apache.org/jira/browse/HBASE-4916
>             Project: HBase
>          Issue Type: Sub-task
>          Components: client, regionserver
>            Reporter: Nicolas Spiegelberg
>            Assignee: Christopher Gist
>             Fix For: 0.94.0
>
>         Attachments: HBASE-4916.D741.1.patch
>
>
> Add a script to start a streaming map-reduce job where each map task runs an
> instance of the load tester for a partition of the key-space. Ensure that the
> load tester takes a parameter indicating the start key for write operations.