Hi,
I have been working on Thotti for some time. Thotti aims to be a
performance measurement framework for Mahout. It allows you to monitor
the performance of Mahout and to compare different setups (JVM settings,
Mahout settings).
Thotti allows you to define your own test cases with normal Java classes
and annotations, much as you would with TestNG and comparable
frameworks. At the moment only non-distributed tests can be executed,
but support for distributed tests is planned.
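For readers unfamiliar with annotation-driven test frameworks, here is a
minimal sketch of the general technique: a runner finds test methods by
looking for an annotation via reflection. The annotation PerfTest and
the class names are hypothetical and exist only for this illustration;
they are not Thotti's actual API.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class AnnotationDiscoverySketch {

    // Hypothetical marker annotation (an assumption, NOT part of Thotti).
    // RUNTIME retention is required so reflection can see it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface PerfTest {
        String id();
        String jvmArgs() default "";
    }

    // A plain Java class with one annotated test method and one helper.
    public static class SampleTests {
        @PerfTest(id = "sample", jvmArgs = "-server")
        public void executeTest() {
            // the workload to be measured would go here
        }

        public void helper() {
            // not annotated, so the runner ignores it
        }
    }

    // Collects "id:method:jvmArgs" for every method carrying @PerfTest.
    public static List<String> discover(Class<?> testClass) {
        List<String> found = new ArrayList<>();
        for (Method m : testClass.getDeclaredMethods()) {
            PerfTest t = m.getAnnotation(PerfTest.class);
            if (t != null) {
                found.add(t.id() + ":" + m.getName() + ":" + t.jvmArgs());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        for (String entry : discover(SampleTests.class)) {
            System.out.println(entry);
        }
    }
}
```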
For test execution Thotti currently uses EC2 instances and contains a
component to manage them (creation, termination). It also makes heavy
use of S3 to store and distribute tests, test data and test results.
With a little work it could be extended to support other cloud services
or local servers.
Since Thotti is now stable enough for non-distributed tests, I would
like to implement a reference test suite for Mahout's non-distributed
algorithms.
To build this reference test suite I need your help. Please send me your
test cases. Thotti is able to run the same test multiple times, with
different JVM settings and different parameters, so you can send me
your test cases and test data along with several test setups.
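To illustrate what "the same test with different JVM settings and
different parameters" amounts to, here is a rough sketch that expands a
list of JVM settings and a list of parameters into one run per
combination. The class RunMatrixSketch, the main class name TestMain and
the example values are assumptions made for this illustration; this is
not how Thotti itself is implemented or configured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RunMatrixSketch {

    // Builds one command line per (JVM setting, parameter) combination.
    // "TestMain" is a hypothetical entry point, used only for illustration.
    public static List<String> expand(List<String> jvmSettings, List<String> params) {
        List<String> runs = new ArrayList<>();
        for (String jvm : jvmSettings) {
            for (String param : params) {
                runs.add("java " + jvm + " TestMain " + param);
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        List<String> jvmSettings = Arrays.asList("-server", "-server -Xmx512m");
        List<String> params = Arrays.asList("neighbors=2", "neighbors=10");
        // Two JVM settings times two parameters gives four runs.
        for (String run : expand(jvmSettings, params)) {
            System.out.println(run);
        }
    }
}
```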
The example test case below will be executed by Thotti once. The JVM
will run with -server.
public class SimpleRecommenderTest {

    // One run, in a JVM started with -server.
    @NDTest(id = "BForJVM909",
            run = @Run(jvmArgs = @JVMArgs(id = "jvm909", value = "-server")))
    public void executeTest() throws IOException, TasteException {
        // Load the data model from intro.csv.
        DataModel model =
            new FileDataModel(prependDataDir(new File("intro.csv")));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        // Use the 2 nearest users as the neighborhood.
        UserNeighborhood neighborhood =
            new NearestNUserNeighborhood(2, similarity, model);
        Recommender recommender =
            new GenericUserBasedRecommender(model, neighborhood, similarity);
        // Request 1 recommendation for user 1.
        recommender.recommend(1, 1);
    }
}
I would be grateful for your support with this work.
Bye,
Oliver