To benchmark CPU, I start a pseudo-distributed HDFS cluster, put a smallish file on the local datanode (small enough to fit in the buffer cache), and then run the following script with various parameters to measure the CPU used to cat the file. For example:

$ REPS_PER_RUN=50 NUM_TRIALS=10 ./read-benchmark.sh hdfs://localhost/128M-file /tmp/benchmark-results.txt

Script:

#!/bin/sh -x
set -e

BINDIR=$(dirname "$0")
INPUT=$1
OUTPUT=$2
NUM_TRIALS=${NUM_TRIALS:-10}
HADOOP=${HADOOP:-./bin/hadoop}
# POSIX arithmetic expansion; the older $[...] form is a bash-only extension
HADOOP_FLAGS=${HADOOP_FLAGS:--Dio.file.buffer.size=$((64*1024))}
REPS_PER_RUN=${REPS_PER_RUN:-1}

# Column names, in the same order as the fields in TIME_FORMAT
HEADER="major\tminor\tfs_in\tfs_out\twall\tuser\tsys\tctx_invol\tctx_vol\n"
TIME_FORMAT="%F\t%R\t%I\t%O\t%e\t%U\t%S\t%c\t%w"

# Write the header only when the output file does not exist yet
! test -f "$OUTPUT" && printf "$HEADER" > "$OUTPUT"

# Each trial cats the input REPS_PER_RUN times in a single JVM invocation
for x in `seq 1 $NUM_TRIALS` ; do
  /usr/bin/time --append -o "$OUTPUT" -f "$TIME_FORMAT" \
    $HADOOP fs $HADOOP_FLAGS -cat $(for rep in $(seq 1 $REPS_PER_RUN) ; do echo $INPUT ; done) > /dev/null
done

On Wed, Jul 6, 2011 at 1:16 AM, Keren Ouaknine <ker...@gmail.com> wrote:

> Hello,
>
> I am working on the optimization of task scheduling for Hadoop and would
> like to benchmark with *Apache Hadoop's standards benchmarks*. So far, I
> used my own scripts to measure and monitor. Where can I find the
> benchmarking you are referring to please?
>
> Thanks,
> Keren
>
> On Wed, Jul 6, 2011 at 7:32 AM, Todd Lipcon (JIRA) <j...@apache.org>
> wrote:
>
> > Simplify BlockReader to not inherit from FSInputChecker
> > -------------------------------------------------------
> >
> >                 Key: HDFS-2129
> >                 URL: https://issues.apache.org/jira/browse/HDFS-2129
> >             Project: Hadoop HDFS
> >          Issue Type: Sub-task
> >          Components: hdfs client
> >            Reporter: Todd Lipcon
> >            Assignee: Todd Lipcon
> >
> >
> > BlockReader is currently quite complicated since it has to conform to the
> > FSInputChecker inheritance structure. It would be much simpler to implement
> > it standalone. Benchmarking indicates it's slightly faster, as well.
> >
> > --
> > This message is automatically generated by JIRA.
> > For more information on JIRA, see: http://www.atlassian.com/software/jira
> >
>
> --
> Keren Ouaknine
> Cell: +972 54 2565404
> Web: www.kereno.com
>

--
Todd Lipcon
Software Engineer, Cloudera