Author: cdouglas
Date: Tue Jan 8 17:00:02 2008
New Revision: 610248

URL: http://svn.apache.org/viewvc?rev=610248&view=rev
Log:
HADOOP-2369. Adds a set of scripts for simulating a mix of user map/reduce
workloads. (Runping Qi via cdouglas)
Added:
    lucene/hadoop/trunk/src/test/gridmix/
    lucene/hadoop/trunk/src/test/gridmix/README
    lucene/hadoop/trunk/src/test/gridmix/generateData.sh
    lucene/hadoop/trunk/src/test/gridmix/gridmix-env
    lucene/hadoop/trunk/src/test/gridmix/javasort/
    lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.large
    lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.medium
    lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.small
    lucene/hadoop/trunk/src/test/gridmix/maxent/
    lucene/hadoop/trunk/src/test/gridmix/maxent/maxent.large
    lucene/hadoop/trunk/src/test/gridmix/monsterQuery/
    lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.large
    lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.medium
    lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.small
    lucene/hadoop/trunk/src/test/gridmix/pipesort/
    lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.large
    lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.medium
    lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.small
    lucene/hadoop/trunk/src/test/gridmix/streamsort/
    lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.large
    lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.medium
    lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.small
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allThroughHod
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allToSameCluster
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentHod
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentToSameCluster
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesHod
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesToSameCluster
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/sleep_if_too_busy
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortHod
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortToSameCluster
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanHod
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanToSameCluster
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortHod
    lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortToSameCluster
    lucene/hadoop/trunk/src/test/gridmix/webdatascan/
    lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.large
    lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.medium
    lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.small
    lucene/hadoop/trunk/src/test/gridmix/webdatasort/
    lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.large
    lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.medium
    lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.small
Modified:
    lucene/hadoop/trunk/CHANGES.txt

Modified: lucene/hadoop/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/CHANGES.txt?rev=610248&r1=610247&r2=610248&view=diff
==============================================================================
--- lucene/hadoop/trunk/CHANGES.txt (original)
+++ lucene/hadoop/trunk/CHANGES.txt Tue Jan 8 17:00:02 2008
@@ -172,6 +172,9 @@
     HADOOP-2233. Adds a generic load generator for modeling MR jobs. (cdouglas)
 
+    HADOOP-2369. Adds a set of scripts for simulating a mix of user map/reduce
+    workloads. (Runping Qi via cdouglas)
+
   OPTIMIZATIONS
 
     HADOOP-1898.
    Release the lock protecting the last time of the last stack

Added: lucene/hadoop/trunk/src/test/gridmix/README
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/README?rev=610248&view=auto
==============================================================================
--- lucene/hadoop/trunk/src/test/gridmix/README (added)
+++ lucene/hadoop/trunk/src/test/gridmix/README Tue Jan 8 17:00:02 2008
@@ -0,0 +1,168 @@
+### "Gridmix" Benchmark ###
+
+Contents:
+
+0 Overview
+1 Getting Started
+  1.0 Build
+  1.1 Configure
+  1.2 Generate test data
+2 Running
+  2.0 General
+  2.1 Non-Hod cluster
+  2.2 Hod
+    2.2.0 Static cluster
+    2.2.1 Hod cluster
+
+
+* 0 Overview
+
+The scripts in this package model a cluster workload. The workload is
+simulated by generating random data and submitting map/reduce jobs that
+mimic observed data-access patterns in user jobs. The full benchmark
+generates approximately 2.5TB of (often compressed) input data operated on
+by the following simulated jobs:
+
+1) Three stage map/reduce job
+   Input:       500GB compressed (2TB uncompressed) SequenceFile
+                (k,v) = (5 words, 100 words)
+   gridmix-env: FIXCOMPSEQ
+   Compute1:    keep 10% map, 40% reduce
+   Compute2:    keep 100% map, 77% reduce
+                Input from Compute1
+   Compute3:    keep 116% map, 91% reduce
+                Input from Compute2
+   Motivation:  Many user workloads are implemented as pipelined map/reduce
+                jobs, including Pig workloads
+
+2) Large sort of variable key/value size
+   Input:       500GB compressed (2TB uncompressed) SequenceFile
+                (k,v) = (5-10 words, 100-10000 words)
+   gridmix-env: VARCOMPSEQ
+   Compute:     keep 100% map, 100% reduce
+   Motivation:  Processing large, compressed datasets is common.
+
+3) Reference select
+   Input:       500GB compressed (2TB uncompressed) SequenceFile
+                (k,v) = (5-10 words, 100-10000 words)
+   gridmix-env: VARCOMPSEQ
+   Compute:     keep 0.2% map, 5% reduce
+                1 Reducer
+   Motivation:  Sampling from a large, reference dataset is common.
+
+4) Indirect Read
+   Input:       500GB compressed (2TB uncompressed) Text
+                (k,v) = (5 words, 20 words)
+   gridmix-env: FIXCOMPTEXT
+   Compute:     keep 50% map, 100% reduce
+                Each map reads 1 input file, adding additional input files
+                from the output of the previous iteration for 10 iterations
+   Motivation:  User jobs in the wild will often take input data without
+                consulting the framework. This simulates an iterative job
+                whose input data is all "indirect," i.e. given to the
+                framework sans locality metadata.
+
+5) API text sort (java, pipes, streaming)
+   Input:       500GB uncompressed Text
+                (k,v) = (1-10 words, 0-200 words)
+   gridmix-env: VARINFLTEXT
+   Compute:     keep 100% map, 100% reduce
+   Motivation:  This benchmark should exercise each of the APIs to
+                map/reduce
+
+Each of these jobs may be run individually or, using the scripts provided,
+as a simulation of user activity sized to run in approximately 4 hours on a
+480-500 node cluster using Hadoop 0.15.0. The benchmark runs a mix of small,
+medium, and large jobs simultaneously, submitting each at fixed intervals.
+
+Notes(1-4): Since the input data are compressed, each mapper outputs far
+more bytes than it reads, typically causing map output spills.
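
[Editorial illustration, not part of the committed files.] The keep percentages
above are passed directly to the generic load generator (loadgen) added in
HADOOP-2233, which most of the job scripts in this commit wrap. A minimal
sketch of that mapping for job 3, the reference select; the webdata_scan.*
scripts added below are the authoritative versions, and the paths and jar
locations come from gridmix-env:

    #!/bin/bash
    # Sketch only: reference select expressed as a single loadgen run.
    # Keeps 0.2% of map input and 5% of reduce input through one reducer.
    source ./gridmix-env

    OUTDIR=perf-out/reference-select-sketch_`date +%F-%H-%M-%S`
    ${HADOOP_HOME}/bin/hadoop jar ${APP_JAR} loadgen \
        -keepmap 0.2 -keepred 5 \
        -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat \
        -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat \
        -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text \
        -indir ${VARCOMPSEQ} -outdir ${OUTDIR} -r 1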
+
+
+* 1 Getting Started
+
+1.0 Build
+
+1) Compile the examples, including the C++ sources:
+   > ant -Dcompile.c++=yes examples
+2) Copy the pipe sort example to a location in the default filesystem
+   (usually HDFS, default /gridmix/programs)
+   > $HADOOP_HOME/bin/hadoop dfs -mkdir $GRID_MIX_PROG
+   > $HADOOP_HOME/bin/hadoop dfs -put build/c++-examples/$PLATFORM_STR/bin/pipes-sort $GRID_MIX_PROG
+
+1.1 Configure
+
+One must modify gridmix-env to supply the following information:
+
+HADOOP_HOME     The hadoop install location
+GRID_MIX_HOME   The location of these scripts
+APP_JAR         The location of the hadoop test jar
+GRID_MIX_DATA   The location of the datasets for these benchmarks
+GRID_MIX_PROG   The location of the pipe-sort example
+
+Reasonable defaults are provided for all but HADOOP_HOME. The datasets used
+by each of the respective benchmarks are recorded in the Input::gridmix-env
+comment in section 0 and their location may be changed in gridmix-env. Note
+that each job expects particular input data and the parameters given to it
+must be changed in each script if a different InputFormat, keytype, or
+valuetype is desired.
+
+Note that the NUM_OF_REDUCERS_FOR_*_JOB properties should be sized to the
+cluster on which the benchmarks will be run. The default assumes a large
+(450-500 node) cluster.
+
+1.2 Generate test data
+
+Test data is generated using the generateData.sh script. While one may
+modify the structure and size of the data generated here, note that many of
+the scripts, particularly for medium and small sized jobs, rely not only on
+specific InputFormats and key/value types, but also on a particular
+structure to the input data. Changing these values will likely be necessary
+to run on small and medium-sized clusters, but any modifications must be
+informed by an explicit familiarity with the underlying scripts.
+
+It is sufficient to run the script without modification, though it may
+require up to 4TB of free space in the default filesystem. Changing the size
+of the input data (COMPRESSED_DATA_BYTES, UNCOMPRESSED_DATA_BYTES,
+INDIRECT_DATA_BYTES) is safe. A 4x compression ratio for generated, block
+compressed data is typical.
+
+* 2 Running
+
+2.0 General
+
+The submissionScripts directory contains the high-level scripts submitting
+sized jobs for the gridmix benchmark. Each submits $NUM_OF_*_JOBS_PER_CLASS
+instances as specified in the gridmix-env script, where an instance is an
+invocation of a script as in $JOBTYPE/$JOBTYPE.$CLASS (e.g.
+javasort/text-sort.large). Each instance may submit one or more map/reduce
+jobs.
+
+There is a backoff script, submissionScripts/sleep_if_too_busy, that can be
+modified to define throttling criteria. By default, it simply counts running
+java processes.
+
+2.1 Non-Hod cluster
+
+The submissionScripts/allToSameCluster script will invoke each of the other
+submission scripts for the gridmix benchmark. Depending on how your cluster
+manages job submission, these scripts may require modification. The details
+are very context-dependent.
+
+2.2 Hod
+
+Note that there are options in gridmix-env that control jobs submitted
+through Hod. One may specify the location of a config (HOD_CONFIG), the
+number of nodes to allocate for classes of jobs, and any additional options
+one wants to apply. The default includes an example for supplying a Hadoop
+tarball for testing platform changes (see Hod documentation).
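
[Editorial illustration, not part of the committed files.] For reference, these
knobs are a hedged example echoing the commented-out settings shipped in
gridmix-env; the config path, tarball path, and node counts below are
placeholders to adapt, not defaults:

    # Hod-related settings in gridmix-env; values here are illustrative only.
    export HOD_CONFIG=/path/to/hodrc
    export HOD_OPTIONS="--ringmaster.hadoop-tar-ball=/path/to/hadoop-0.15.0-dev.tar.gz"
    export ALL_HOD_OPTIONS="$HOD_OPTIONS -c ${HOD_CONFIG}"
    export SMALL_JOB_HOD_OPTIONS="$ALL_HOD_OPTIONS -m 5"    # nodes per small job
    export MEDIUM_JOB_HOD_OPTIONS="$ALL_HOD_OPTIONS -m 50"  # nodes per medium job
    export LARGE_JOB_HOD_OPTIONS="$ALL_HOD_OPTIONS -m 100"  # nodes per large job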
+
+2.2.0 Static Cluster
+
+> hod --hod.script=submissionScripts/allToSameCluster -m 500
+
+2.2.1 Hod-allocated cluster
+
+> ./submissionScripts/allThroughHod

Added: lucene/hadoop/trunk/src/test/gridmix/generateData.sh
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/generateData.sh?rev=610248&view=auto
==============================================================================
--- lucene/hadoop/trunk/src/test/gridmix/generateData.sh (added)
+++ lucene/hadoop/trunk/src/test/gridmix/generateData.sh Tue Jan 8 17:00:02 2008
@@ -0,0 +1,69 @@
+#!/bin/bash
+
+GRID_DIR=`dirname "$0"`
+GRID_DIR=`cd "$GRID_DIR"; pwd`
+source $GRID_DIR/gridmix-env
+
+# 2TB data compressing to approx 500GB
+#COMPRESSED_DATA_BYTES=2147483648000
+COMPRESSED_DATA_BYTES=2147483648
+# 500GB
+#UNCOMPRESSED_DATA_BYTES=536870912000
+UNCOMPRESSED_DATA_BYTES=536870912
+# Number of partitions for output data
+NUM_MAPS=100
+# Default approx 70MB per data file, compressed
+#INDIRECT_DATA_BYTES=58720256000
+INDIRECT_DATA_BYTES=58720256
+INDIRECT_DATA_FILES=200
+
+${HADOOP_HOME}/bin/hadoop jar \
+    ${EXAMPLE_JAR} randomtextwriter \
+    -D test.randomtextwrite.total_bytes=${COMPRESSED_DATA_BYTES} \
+    -D test.randomtextwrite.bytes_per_map=$((${COMPRESSED_DATA_BYTES} / ${NUM_MAPS})) \
+    -D test.randomtextwrite.min_words_key=5 \
+    -D test.randomtextwrite.max_words_key=10 \
+    -D test.randomtextwrite.min_words_value=100 \
+    -D test.randomtextwrite.max_words_value=10000 \
+    -D mapred.output.compress=true \
+    -D mapred.map.output.compression.type=BLOCK \
+    -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat \
+    ${VARCOMPSEQ}
+
+${HADOOP_HOME}/bin/hadoop jar \
+    ${EXAMPLE_JAR} randomtextwriter \
+    -D test.randomtextwrite.total_bytes=${COMPRESSED_DATA_BYTES} \
+    -D test.randomtextwrite.bytes_per_map=$((${COMPRESSED_DATA_BYTES} / ${NUM_MAPS})) \
+    -D test.randomtextwrite.min_words_key=5 \
+    -D test.randomtextwrite.max_words_key=5 \
+    -D test.randomtextwrite.min_words_value=100 \
+    -D test.randomtextwrite.max_words_value=100 \
+    -D mapred.output.compress=true \
+    -D mapred.map.output.compression.type=BLOCK \
+    -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat \
+    ${FIXCOMPSEQ}
+
+${HADOOP_HOME}/bin/hadoop jar \
+    ${EXAMPLE_JAR} randomtextwriter \
+    -D test.randomtextwrite.total_bytes=${UNCOMPRESSED_DATA_BYTES} \
+    -D test.randomtextwrite.bytes_per_map=$((${UNCOMPRESSED_DATA_BYTES} / ${NUM_MAPS})) \
+    -D test.randomtextwrite.min_words_key=1 \
+    -D test.randomtextwrite.max_words_key=10 \
+    -D test.randomtextwrite.min_words_value=0 \
+    -D test.randomtextwrite.max_words_value=200 \
+    -D mapred.output.compress=false \
+    -outFormat org.apache.hadoop.mapred.TextOutputFormat \
+    ${VARINFLTEXT}
+
+${HADOOP_HOME}/bin/hadoop jar \
+    ${EXAMPLE_JAR} randomtextwriter \
+    -D test.randomtextwrite.total_bytes=${INDIRECT_DATA_BYTES} \
+    -D test.randomtextwrite.bytes_per_map=$((${INDIRECT_DATA_BYTES} / ${INDIRECT_DATA_FILES})) \
+    -D test.randomtextwrite.min_words_key=5 \
+    -D test.randomtextwrite.max_words_key=5 \
+    -D test.randomtextwrite.min_words_value=20 \
+    -D test.randomtextwrite.max_words_value=20 \
+    -D mapred.output.compress=true \
+    -D mapred.map.output.compression.type=BLOCK \
+    -outFormat org.apache.hadoop.mapred.TextOutputFormat \
+    ${FIXCOMPTEXT}

Added: lucene/hadoop/trunk/src/test/gridmix/gridmix-env
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/gridmix-env?rev=610248&view=auto
==============================================================================
--- lucene/hadoop/trunk/src/test/gridmix/gridmix-env (added)
+++ lucene/hadoop/trunk/src/test/gridmix/gridmix-env Tue Jan 8 17:00:02 2008
@@ -0,0 +1,50 @@
+#!/bin/bash
+
+
+## Environment configuration
+# Hadoop installation
+export HADOOP_HOME=
+# Base directory for gridmix install
+export GRID_MIX_HOME=${GRID_DIR}
+# Hadoop example jar
+export EXAMPLE_JAR=${HADOOP_HOME}/hadoop-0.15.2-dev-examples.jar
+# Hadoop test jar
+export APP_JAR=${HADOOP_HOME}/hadoop-0.15.2-dev-test.jar
+# Hadoop streaming jar
+export STREAM_JAR=${HADOOP_HOME}/contrib/hadoop-0.15.2-streaming.jar
+# Location on default filesystem for writing gridmix data (usually HDFS)
+# Default: /gridmix/data
+export GRID_MIX_DATA=/gridmix/data
+# Location of executables in default filesystem (usually HDFS)
+# Default: /gridmix/programs
+export GRID_MIX_PROG=/gridmix/programs
+
+## Data sources
+# Variable length key, value compressed SequenceFile
+export VARCOMPSEQ=${GRID_MIX_DATA}/WebSimulationBlockCompressed
+# Fixed length key, value compressed SequenceFile
+export FIXCOMPSEQ=${GRID_MIX_DATA}/MonsterQueryBlockCompressed
+# Variable length key, value uncompressed Text File
+export VARINFLTEXT=${GRID_MIX_DATA}/SortUncompressed
+# Fixed length key, value compressed Text File
+export FIXCOMPTEXT=${GRID_MIX_DATA}/EntropySimulationCompressed
+
+## Job sizing
+export NUM_OF_LARGE_JOBS_PER_CLASS=3
+export NUM_OF_MEDIUM_JOBS_PER_CLASS=20
+export NUM_OF_SMALL_JOBS_PER_CLASS=40
+
+export NUM_OF_REDUCERS_FOR_LARGE_JOB=370
+export NUM_OF_REDUCERS_FOR_MEDIUM_JOB=170
+export NUM_OF_REDUCERS_FOR_SMALL_JOB=15
+
+## Throttling
+export INTERVAL_BETWEEN_SUBMITION=20
+
+## Hod
+#export HOD_OPTIONS="--ringmaster.hadoop-tar-ball=/path/to/hadoop-0.15.0-dev.tar.gz"
+#export HOD_CONFIG=
+#export ALL_HOD_OPTIONS="$HOD_OPTIONS -c ${HOD_CONFIG}"
+#export SMALL_JOB_HOD_OPTIONS="$ALL_HOD_OPTIONS -m 5"
+#export MEDIUM_JOB_HOD_OPTIONS="$ALL_HOD_OPTIONS -m 50"
+#export LARGE_JOB_HOD_OPTIONS="$ALL_HOD_OPTIONS -m 100"

Added: lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.large
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.large?rev=610248&view=auto
==============================================================================
--- lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.large (added)
+++ lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.large Tue Jan 8 17:00:02 2008
@@ -0,0 +1,14 @@
+#!/bin/bash
+
+GRID_DIR=`dirname "$0"`
+GRID_DIR=`cd "$GRID_DIR"; pwd`
+source $GRID_DIR/../gridmix-env
+
+INDIR=${VARINFLTEXT}
+
+Date=`date +%F-%H-%M-%S`
+OUTDIR=perf-out/sort-out-dir-large_$Date
+${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR
+
+${HADOOP_HOME}/bin/hadoop jar ${APP_JAR} sort -m 1 -r $NUM_OF_REDUCERS_FOR_LARGE_JOB -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat -outFormat org.apache.hadoop.mapred.TextOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text $INDIR $OUTDIR
+

Added: lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.medium
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.medium?rev=610248&view=auto
==============================================================================
--- lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.medium (added)
+++ lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.medium Tue Jan 8 17:00:02 2008
@@ -0,0 +1,14 @@
+#!/bin/bash
+
+GRID_DIR=`dirname "$0"`
+GRID_DIR=`cd "$GRID_DIR"; pwd`
+source $GRID_DIR/../gridmix-env
+
+INDIR=${VARINFLTEXT}/part-000*0,${VARINFLTEXT}/part-000*1,${VARINFLTEXT}/part-000*2 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/sort-out-dir-medium_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar ${APP_JAR} sort -m 1 -r $NUM_OF_REDUCERS_FOR_MEDIUM_JOB -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat -outFormat org.apache.hadoop.mapred.TextOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text $INDIR $OUTDIR + Added: lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.small URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.small?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.small (added) +++ lucene/hadoop/trunk/src/test/gridmix/javasort/text-sort.small Tue Jan 8 17:00:02 2008 @@ -0,0 +1,14 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +INDIR=${VARINFLTEXT}/part-00000,${VARINFLTEXT}/part-00001,${VARINFLTEXT}/part-00002 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/sort-out-dir-small_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar ${APP_JAR} sort -m 1 -r $NUM_OF_REDUCERS_FOR_SMALL_JOB -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat -outFormat org.apache.hadoop.mapred.TextOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text $INDIR $OUTDIR + Added: lucene/hadoop/trunk/src/test/gridmix/maxent/maxent.large URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/maxent/maxent.large?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/maxent/maxent.large (added) +++ lucene/hadoop/trunk/src/test/gridmix/maxent/maxent.large Tue Jan 8 17:00:02 2008 @@ -0,0 +1,26 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=100 +INDIR=${FIXCOMPTEXT} +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/maxent-out-dir-large_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 50 -keepred 100 -inFormatIndirect org.apache.hadoop.mapred.TextInputFormat -outFormat org.apache.hadoop.mapred.TextOutputFormat -outKey org.apache.hadoop.io.LongWritable -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR.1 -r $NUM_OF_REDUCERS + +ITER=11 +for ((i=1; i<$ITER; ++i)) +do + ${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 50 -keepred 100 -inFormatIndirect org.apache.hadoop.mapred.TextInputFormat -outFormat org.apache.hadoop.mapred.TextOutputFormat -outKey org.apache.hadoop.io.LongWritable -outValue org.apache.hadoop.io.Text -indir $INDIR -indir $OUTDIR.$i -outdir $OUTDIR.$(($i+1)) -r $NUM_OF_REDUCERS + if [ $? -ne "0" ] + then exit $? 
+ fi + ${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR.$i +done + +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR.$ITER Added: lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.large URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.large?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.large (added) +++ lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.large Tue Jan 8 17:00:02 2008 @@ -0,0 +1,27 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_LARGE_JOB +INDIR=${FIXCOMPSEQ} +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/mq-out-dir-large_$Date.1 +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 10 -keepred 40 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + +INDIR=$OUTDIR +OUTDIR=perf-out/mq-out-dir-large_$Date.2 +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 100 -keepred 77 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + +INDIR=$OUTDIR +OUTDIR=perf-out/mq-out-dir-large_$Date.3 +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 116 -keepred 91 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + Added: lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.medium URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.medium?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.medium (added) +++ lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.medium Tue Jan 8 17:00:02 2008 @@ -0,0 +1,27 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_MEDIUM_JOB +INDIR=${FIXCOMPSEQ}/part-000*0,${FIXCOMPSEQ}/part-000*1,${FIXCOMPSEQ}/part-000*2 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/mq-out-dir-medium_$Date.1 +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 10 -keepred 40 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + +INDIR=$OUTDIR +OUTDIR=perf-out/mq-out-dir-medium_$Date.2 +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 100 -keepred 77 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text 
-indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + +INDIR=$OUTDIR +OUTDIR=perf-out/mq-out-dir-medium_$Date.3 +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 116 -keepred 91 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + Added: lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.small URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.small?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.small (added) +++ lucene/hadoop/trunk/src/test/gridmix/monsterQuery/monster_query.small Tue Jan 8 17:00:02 2008 @@ -0,0 +1,27 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_SMALL_JOB +INDIR=${FIXCOMPSEQ}/part-00000,${FIXCOMPSEQ}/part-00001,${FIXCOMPSEQ}/part-00002 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/mq-out-dir-small_$Date.1 +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 10 -keepred 40 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + +INDIR=$OUTDIR +OUTDIR=perf-out/mq-out-dir-small_$Date.2 +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 100 -keepred 77 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + +INDIR=$OUTDIR +OUTDIR=perf-out/mq-out-dir-small_$Date.3 +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 116 -keepred 91 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + Added: lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.large URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.large?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.large (added) +++ lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.large Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_LARGE_JOB +INDIR=${VARINFLTEXT} +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/pipe-out-dir-large_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + + +${HADOOP_HOME}/bin/hadoop pipes -input $INDIR -output $OUTDIR -inputformat org.apache.hadoop.mapred.KeyValueTextInputFormat -program ${GRID_MIX_PROG}/pipes-sort -reduces $NUM_OF_REDUCERS -jobconf mapred.output.key.class=org.apache.hadoop.io.Text,mapred.output.value.class=org.apache.hadoop.io.Text -writer org.apache.hadoop.mapred.TextOutputFormat + Added: 
lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.medium URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.medium?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.medium (added) +++ lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.medium Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_MEDIUM_JOB +INDIR=${VARINFLTEXT}/part-000*0,${VARINFLTEXT}/part-000*1,${VARINFLTEXT}/part-000*2 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/pipe-out-dir-medium_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + + +${HADOOP_HOME}/bin/hadoop pipes -input $INDIR -output $OUTDIR -inputformat org.apache.hadoop.mapred.KeyValueTextInputFormat -program ${GRID_MIX_PROG}/pipes-sort -reduces $NUM_OF_REDUCERS -jobconf mapred.output.key.class=org.apache.hadoop.io.Text,mapred.output.value.class=org.apache.hadoop.io.Text -writer org.apache.hadoop.mapred.TextOutputFormat + Added: lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.small URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.small?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.small (added) +++ lucene/hadoop/trunk/src/test/gridmix/pipesort/text-sort.small Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_SMALL_JOB +INDIR=${VARINFLTEXT}/part-00000,${VARINFLTEXT}/part-00001,${VARINFLTEXT}/part-00002 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/pipe-out-dir-small_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + + +${HADOOP_HOME}/bin/hadoop pipes -input $INDIR -output $OUTDIR -inputformat org.apache.hadoop.mapred.KeyValueTextInputFormat -program ${GRID_MIX_PROG}/pipes-sort -reduces $NUM_OF_REDUCERS -jobconf mapred.output.key.class=org.apache.hadoop.io.Text,mapred.output.value.class=org.apache.hadoop.io.Text -writer org.apache.hadoop.mapred.TextOutputFormat + Added: lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.large URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.large?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.large (added) +++ lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.large Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +export NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_LARGE_JOB +export INDIR=${VARINFLTEXT} +Date=`date +%F-%H-%M-%S` + +export OUTDIR=perf-out/stream-out-dir-large_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + + +${HADOOP_HOME}/bin/hadoop jar ${STREAM_JAR} -input $INDIR -output $OUTDIR -mapper cat -reducer cat -numReduceTasks $NUM_OF_REDUCERS + Added: lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.medium URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.medium?rev=610248&view=auto ============================================================================== --- 
lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.medium (added) +++ lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.medium Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_MEDIUM_JOB +INDIR=${VARINFLTEXT}/part-000*0,${VARINFLTEXT}/part-000*1,${VARINFLTEXT}/part-000*2 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/stream-out-dir-medium_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + + +${HADOOP_HOME}/bin/hadoop jar ${STREAM_JAR} -input $INDIR -output $OUTDIR -mapper cat -reducer cat -numReduceTasks $NUM_OF_REDUCERS + Added: lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.small URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.small?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.small (added) +++ lucene/hadoop/trunk/src/test/gridmix/streamsort/text-sort.small Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_SMALL_JOB +INDIR=${VARINFLTEXT}/part-00000,${VARINFLTEXT}/part-00001,${VARINFLTEXT}/part-00002 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/stream-out-dir-small_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + + +${HADOOP_HOME}/bin/hadoop jar ${STREAM_JAR} -input $INDIR -output $OUTDIR -mapper cat -reducer cat -numReduceTasks $NUM_OF_REDUCERS + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allThroughHod URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allThroughHod?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allThroughHod (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allThroughHod Tue Jan 8 17:00:02 2008 @@ -0,0 +1,13 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +$GRID_MIX_HOME/submissionScripts/textSortHod 2>&1 > textSortHod.out & +$GRID_MIX_HOME/submissionScripts/monsterQueriesHod 2>&1 > monsterQueriesHod.out & +$GRID_MIX_HOME/submissionScripts/webdataScanHod 2>&1 > webdataScanHod.out & +$GRID_MIX_HOME/submissionScripts/webdataSortHod 2>&1 > webdataSortHod.out & +$GRID_MIX_HOME/submissionScripts/maxentHod 2>&1 > maxentHod.out & + + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allToSameCluster URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allToSameCluster?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allToSameCluster (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/allToSameCluster Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +$GRID_MIX_HOME/submissionScripts/textSortToSameCluster 2>&1 > textSortToSameCluster.out & +sleep 20 +$GRID_MIX_HOME/submissionScripts/monsterQueriesToSameCluster 2>&1 > monsterQueriesToSameCluster.out & +sleep 20 +$GRID_MIX_HOME/submissionScripts/webdataScanToSameCluster 2>&1 > webdataScanToSameCluster.out & +sleep 20 
+$GRID_MIX_HOME/submissionScripts/webdataSortToSameCluster 2>&1 > webdataSortToSameCluster.out & +sleep 20 +$GRID_MIX_HOME/submissionScripts/maxentToSameCluster 2>&1 > maxentToSameCluster.out & + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentHod URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentHod?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentHod (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentHod Tue Jan 8 17:00:02 2008 @@ -0,0 +1,12 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + hod $LARGE_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/maxent/maxent.large 2>&1 > maxent.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentToSameCluster URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentToSameCluster?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentToSameCluster (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/maxentToSameCluster Tue Jan 8 17:00:02 2008 @@ -0,0 +1,12 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/maxent/maxent.large 2>&1 > maxent.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesHod URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesHod?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesHod (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesHod Tue Jan 8 17:00:02 2008 @@ -0,0 +1,26 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +for ((i=0; i < $NUM_OF_SMALL_JOBS_PER_CLASS; i++)) +do + echo $i + hod $SMALL_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/monsterQuery/monster_query.small 2>&1 > monster_query.small.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_MEDIUM_JOBS_PER_CLASS; i++)) +do + echo $i + hod $MEDIUM_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/monsterQuery/monster_query.medium 2>&1 > monster_query.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + hod $LARGE_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/monsterQuery/monster_query.large 2>&1 > monster_query.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesToSameCluster URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesToSameCluster?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesToSameCluster (added) +++ 
lucene/hadoop/trunk/src/test/gridmix/submissionScripts/monsterQueriesToSameCluster Tue Jan 8 17:00:02 2008 @@ -0,0 +1,27 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +for ((i=0; i < $NUM_OF_SMALL_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/monsterQuery/monster_query.small 2>&1 > monster_query.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_MEDIUM_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/monsterQuery/monster_query.medium 2>&1 > monster_query.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/monsterQuery/monster_query.large 2>&1 > monster_query.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/sleep_if_too_busy URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/sleep_if_too_busy?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/sleep_if_too_busy (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/sleep_if_too_busy Tue Jan 8 17:00:02 2008 @@ -0,0 +1,11 @@ +#!/bin/bash + +sleep 1 +for ((java_process=$((`ps -ef|grep java|wc|awk '{print $1}'`-1)); \ + java_process > 60; \ + java_process=$((`ps -ef|grep java|wc|awk '{print $1}'`-1)))) +do + sleep 10 + echo $java_process +done + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortHod URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortHod?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortHod (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortHod Tue Jan 8 17:00:02 2008 @@ -0,0 +1,39 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +for ((i=0; i < $NUM_OF_SMALL_JOBS_PER_CLASS; i++)) +do + echo $i + hod $SMALL_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/pipesort/text-sort.small 2>&1 > pipesort.small.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + hod $SMALL_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/streamsort/text-sort.small 2>&1 > streamsort.small.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + hod $SMALL_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/javasort/text-sort.small 2>&1 > javasort.small.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_MEDIUM_JOBS_PER_CLASS; i++)) +do + echo $i + hod $MEDIUM_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/pipesort/text-sort.medium 2>&1 > pipesort.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + hod $MEDIUM_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/streamsort/text-sort.medium 2>&1 > streamsort.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + hod $MEDIUM_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/javasort/text-sort.medium 2>&1 > javasort.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + hod $LARGE_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/pipesort/text-sort.large 2>&1 > pipesort.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy 
+ hod $LARGE_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/streamsort/text-sort.large 2>&1 > streamsort.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + hod $LARGE_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/javasort/text-sort.large 2>&1 > javasort.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortToSameCluster URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortToSameCluster?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortToSameCluster (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/textSortToSameCluster Tue Jan 8 17:00:02 2008 @@ -0,0 +1,39 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +for ((i=0; i < $NUM_OF_SMALL_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/pipesort/text-sort.small 2>&1 > pipesort.small.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + $GRID_MIX_HOME/streamsort/text-sort.small 2>&1 > streamsort.small.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + $GRID_MIX_HOME/javasort/text-sort.small 2>&1 > javasort.small.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_MEDIUM_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/pipesort/text-sort.medium 2>&1 > pipesort.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + $GRID_MIX_HOME/streamsort/text-sort.medium 2>&1 > streamsort.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + $GRID_MIX_HOME/javasort/text-sort.medium 2>&1 > javasort.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/pipesort/text-sort.large 2>&1 > pipesort.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + $GRID_MIX_HOME/streamsort/text-sort.large 2>&1 > pipesort.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy + $GRID_MIX_HOME/javasort/text-sort.large 2>&1 > pipesort.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanHod URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanHod?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanHod (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanHod Tue Jan 8 17:00:02 2008 @@ -0,0 +1,28 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +for ((i=0; i < $NUM_OF_SMALL_JOBS_PER_CLASS; i++)) +do + echo $i + hod $SMALL_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/webdatascan/webdata_scan.small 2>&1 > webdata_scan.small.$i.out& + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + + +for ((i=0; i < $NUM_OF_MEDIUM_JOBS_PER_CLASS; i++)) +do + echo $i + hod $MEDIUM_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/webdatascan/webdata_scan.medium 2>&1 > webdata_scan.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + hod $LARGE_JOB_HOD_OPTIONS 
--hod.script=$GRID_MIX_HOME/webdatascan/webdata_scan.large 2>&1 > webdata_scan.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanToSameCluster URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanToSameCluster?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanToSameCluster (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataScanToSameCluster Tue Jan 8 17:00:02 2008 @@ -0,0 +1,28 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +for ((i=0; i < $NUM_OF_MEDIUM_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/webdatascan/webdata_scan.medium 2>&1 > webdata_scan.medium.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_SMALL_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/webdatascan/webdata_scan.small 2>&1 > webdata_scan.small.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/webdatascan/webdata_scan.large 2>&1 > webdata_scan.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortHod URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortHod?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortHod (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortHod Tue Jan 8 17:00:02 2008 @@ -0,0 +1,14 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + hod $LARGE_JOB_HOD_OPTIONS --hod.script=$GRID_MIX_HOME/webdatasort/webdata_sort.large 2>&1 > webdata_sort.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + Added: lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortToSameCluster URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortToSameCluster?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortToSameCluster (added) +++ lucene/hadoop/trunk/src/test/gridmix/submissionScripts/webdataSortToSameCluster Tue Jan 8 17:00:02 2008 @@ -0,0 +1,13 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +for ((i=0; i < $NUM_OF_LARGE_JOBS_PER_CLASS; i++)) +do + echo $i + $GRID_MIX_HOME/webdatasort/webdata_sort.large 2>&1 > webdata_sort.large.$i.out & + $GRID_MIX_HOME/submissionScripts/sleep_if_too_busy +done + Added: lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.large URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.large?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.large (added) +++ lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.large Tue Jan 8 17:00:02 2008 @@ -0,0 +1,14 
@@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=1 +INDIR=${VARCOMPSEQ} +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/webdata-scan-out-dir-large_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 0.2 -keepred 5 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS Added: lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.medium URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.medium?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.medium (added) +++ lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.medium Tue Jan 8 17:00:02 2008 @@ -0,0 +1,14 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=1 +INDIR=${VARCOMPSEQ}/part-000*0,${VARCOMPSEQ}/part-000*1,${VARCOMPSEQ}/part-000*2 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/webdata-scan-out-dir-medium_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar ${APP_JAR} loadgen -keepmap 1 -keepred 5 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS Added: lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.small URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.small?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.small (added) +++ lucene/hadoop/trunk/src/test/gridmix/webdatascan/webdata_scan.small Tue Jan 8 17:00:02 2008 @@ -0,0 +1,14 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=1 +INDIR=${VARCOMPSEQ}/part-00000,${VARCOMPSEQ}/part-00001,${VARCOMPSEQ}/part-00002 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/webdata-scan-out-dir-small_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 1 -keepred 5 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS Added: lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.large URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.large?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.large (added) +++ lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.large Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_LARGE_JOB +INDIR=${VARCOMPSEQ}/part-000*0,${VARCOMPSEQ}/part-000*1 +Date=`date +%F-%H-%M-%S` + 
+OUTDIR=perf-out/webdata-sort-out-dir-large_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 100 -keepred 100 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + + Added: lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.medium URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.medium?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.medium (added) +++ lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.medium Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_MEDIUM_JOB +INDIR=${VARCOMPSEQ}/part-0000,${VARCOMPSEQ}/part-0001 +Date=`date +%F-%H-%M-%S` + +OUTDIR=perf-out/webdata-sort-out-dir-medium_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 100 -keepred 100 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + + Added: lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.small URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.small?rev=610248&view=auto ============================================================================== --- lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.small (added) +++ lucene/hadoop/trunk/src/test/gridmix/webdatasort/webdata_sort.small Tue Jan 8 17:00:02 2008 @@ -0,0 +1,16 @@ +#!/bin/bash + +GRID_DIR=`dirname "$0"` +GRID_DIR=`cd "$GRID_DIR"; pwd` +source $GRID_DIR/../gridmix-env + +NUM_OF_REDUCERS=$NUM_OF_REDUCERS_FOR_SMALL_JOB +INDIR=${VARCOMPSEQ}/part-00000 +Date=`date +%F-%H-%M-%S` + +export OUTDIR=perf-out/webdata-sort-out-dir-small_$Date +${HADOOP_HOME}/bin/hadoop dfs -rmr $OUTDIR + +${HADOOP_HOME}/bin/hadoop jar $APP_JAR loadgen -keepmap 100 -keepred 100 -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text -indir $INDIR -outdir $OUTDIR -r $NUM_OF_REDUCERS + +
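
[Editorial illustration, not part of the committed files.] Taken together, the
files above suggest the following end-to-end run on a non-Hod cluster. This is
a sketch that stitches the README steps together under stated assumptions:
HADOOP_SRC_ROOT and PLATFORM_STR are placeholders for the local checkout and
build platform, and gridmix-env must already point at a working HADOOP_HOME.

    #!/bin/bash
    # Sketch: build, stage the pipes binary, generate data, submit the mix.
    cd ${HADOOP_SRC_ROOT}                      # placeholder: the hadoop checkout
    ant -Dcompile.c++=yes examples             # builds the examples and pipes-sort
    cd src/test/gridmix
    source ./gridmix-env
    ${HADOOP_HOME}/bin/hadoop dfs -mkdir ${GRID_MIX_PROG}
    ${HADOOP_HOME}/bin/hadoop dfs -put \
        ${HADOOP_SRC_ROOT}/build/c++-examples/${PLATFORM_STR}/bin/pipes-sort ${GRID_MIX_PROG}
    ./generateData.sh                          # may need up to ~4TB free in the default FS
    ./submissionScripts/allToSameCluster       # submit the full small/medium/large job mix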