The problem with this POV is that it assumes it's obvious what the right outcome is. With a transaction test, a disk-write test, or a big sort, it's obvious, and you can make a benchmark. With ML, it's not even close.
For example, I can make you a recommender that is literally as fast as you like by picking any random set of items (see the sketch below the quoted message). A classifier can likewise be made arbitrarily fast by picking a class at random. Specifying even a desired answer isn't useful, since then you are just selecting for a process that picks a particular answer on a particular data set. I don't think that works, since the classic idea of a benchmark is not well-defined here, but you're welcome to go create and run whatever tests you like.

On Sat, Feb 2, 2013 at 3:19 PM, jordi <[email protected]> wrote:
> Hi Sean! First of all, thanks for your reply!
> I do agree that it's very complicated to size an environment, since there are
> many variables that should be considered. You have mentioned some of them:
> the algorithm, the distribution of the data, the amount of data, the type of
> hardware, etc.
> But I don't agree that it's impossible to give a baseline.
> Maybe it would be a great idea for the Mahout+Hadoop community to take a look
> at these guys (the Standard Performance Evaluation Corporation,
> http://www.spec.org/). They run the same benchmark on different types of
> architectures, empirically establishing a baseline that can be used as a good
> starting point for capacity planning.
> They have many benchmarks covering CPU, Java client/server, etc.
> Obviously, that's only a starting point: before your software goes live in
> production, it's desirable to benchmark your software again under a load
> test, adapting your infrastructure based on the performance results.
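To make the point at the top concrete, here is a minimal sketch of such a degenerate recommender. The RandomRecommender class and its recommend() signature are invented for illustration; this is not Mahout's Recommender API.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// A degenerate "recommender": it answers in O(howMany) time regardless of
// data set size, because it ignores the data entirely and returns random
// item IDs. It would win any latency-only benchmark while being useless.
public class RandomRecommender {

  private final Random random = new Random();
  private final long numItems;

  public RandomRecommender(long numItems) {
    this.numItems = numItems;
  }

  public List<Long> recommend(long userID, int howMany) {
    List<Long> items = new ArrayList<>(howMany);
    for (int i = 0; i < howMany; i++) {
      // Pick any item ID uniformly at random; no model, no data access.
      items.add((long) (random.nextDouble() * numItems));
    }
    return items;
  }
}

A benchmark that measures only throughput or latency would rate this "recommender" highest, which is the point: "fast" means nothing until you have also fixed what counts as a good answer, and that depends on the data and the application.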
