[ 
https://issues.apache.org/jira/browse/HDFS-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990244#comment-14990244
 ] 

Rakesh R commented on HDFS-8968:
--------------------------------

Good work [~lirui]. I have a few comments; please take a look.

# Can we make this configurable, for example via 
{{System.getProperty("test.benchmark.data", "/tmp/benchmark/data")}}?
{code}
private static final String DFS_TMP_DIR = "/tmp/benchmark";
{code}
# {{printUsage}} can be highlighted by printing with {{System.err.println}}. 
Also, the message could begin with {{"Usage: ErasureCodeBenchmarkThroughput ..."}}.
{code}
System.out.println("ErasureCodeBenchmarkThroughput <read|write|gen|clean> "
        + "<size in MB> <ec|rep> [num clients] [stf|pos]\n" +
        "Stateful and positional option is only available for read.");
{code}
# It would be good to use the Hadoop utility {{StopWatch}} for the elapsed 
time computations. Presently the code uses 
{{(System.currentTimeMillis() - start) / 1000.0}}.
Sample usage:
{code}
    org.apache.hadoop.util.StopWatch sw = new StopWatch().start();
    // do the operation
    sw.stop();
    long elapsedtime = sw.now(TimeUnit.SECONDS);
{code}
# Just a suggestion: use {{java.util.concurrent.ExecutorCompletionService}} 
here, which hands back results as tasks complete, rather than trying to find 
out which task has completed.
{code}
+    for (Future<Long> future : futures) {
+      results.add(future.get());
+    }
{code}
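
To illustrate the {{ExecutorCompletionService}} suggestion above, here is a minimal sketch that uses only the JDK. It assumes each client task is a {{Callable<Long>}}; the class name {{CompletionServiceSketch}}, the {{runAll}} helper and the byte counts in {{main}} are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionServiceSketch {

  // Hypothetical helper: submits every task, then collects the results
  // in the order the tasks finish rather than the order they were submitted.
  static List<Long> runAll(List<Callable<Long>> tasks, int numClients)
      throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newFixedThreadPool(numClients);
    try {
      CompletionService<Long> cs = new ExecutorCompletionService<>(executor);
      for (Callable<Long> task : tasks) {
        cs.submit(task);
      }
      List<Long> results = new ArrayList<>(tasks.size());
      for (int i = 0; i < tasks.size(); i++) {
        // take() blocks until the next task finishes, whichever one it is.
        results.add(cs.take().get());
      }
      return results;
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    List<Callable<Long>> tasks = new ArrayList<>();
    for (long i = 1; i <= 4; i++) {
      final long bytes = i * 1024L; // made-up per-client result
      tasks.add(() -> bytes);
    }
    long total = 0;
    for (long r : runAll(tasks, 2)) {
      total += r;
    }
    System.out.println("total: " + total);
  }
}
```

With this shape a failed task surfaces through {{get()}} as soon as it completes, instead of only after all earlier futures in the list have been waited on.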

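For the first comment (making the benchmark directory configurable), a minimal sketch could look like the following. The property name and default are the ones suggested above; the class name {{BenchmarkDirSketch}} and the {{benchmarkDir}} method are hypothetical.

```java
public class BenchmarkDirSketch {

  // Property name and default value as suggested in the review comment.
  static final String DIR_PROPERTY = "test.benchmark.data";
  static final String DEFAULT_DIR = "/tmp/benchmark/data";

  // Hypothetical accessor: returns the -Dtest.benchmark.data value if set,
  // otherwise falls back to the default directory.
  static String benchmarkDir() {
    return System.getProperty(DIR_PROPERTY, DEFAULT_DIR);
  }

  public static void main(String[] args) {
    System.out.println("benchmark dir: " + benchmarkDir());
  }
}
```

Running with {{-Dtest.benchmark.data=<dir>}} on the JVM command line would then override the default without a code change.
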
bq. As to unit test, maybe I can add a test where the tool runs against a 
MiniDFSCluster.
How about supporting runs against both a real cluster and a MiniDFSCluster 
inside the ErasureCodeBenchmarkThroughput tool, similar to 
{{org.apache.hadoop.hdfs.BenchmarkThroughput}}?

> New benchmark throughput tool for striping erasure coding
> ---------------------------------------------------------
>
>                 Key: HDFS-8968
>                 URL: https://issues.apache.org/jira/browse/HDFS-8968
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Rui Li
>         Attachments: HDFS-8968-HDFS-7285.1.patch, 
> HDFS-8968-HDFS-7285.2.patch, HDFS-8968.3.patch
>
>
> We need a new benchmark tool to measure the throughput of client writing and 
> reading considering cases or factors:
> * 3-replica or striping;
> * write or read, stateful read or positional read;
> * which erasure coder;
> * striping cell size;
> * concurrent readers/writers using processes or threads.
> The tool should be easy to use and better to avoid unnecessary local 
> environment impact, like local disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)