On Jul 12, 2012, at 14:10, Lars Buitinck wrote:

> 2012/7/12 Gael Varoquaux <[email protected]>:
>> I haven't followed the codebase too closely (I should have, but...). That
>> said, I must confess that I am a bit frightened by the number of
>> different technologies that are being put together. Enabling ease of
>> installation/reproducibility seems to me a priority.
> 
> What exactly are the dependencies? It would be nice if they were
> listed in the README. Is it just vbench and Sphinx + transitive
> closure?

Also sqlalchemy, it seems. In vbench, pytz is imported but I'm not sure if it 
ends up being needed. I will try to remove it.

Also memory_profiler, but it's not a hard dependency.

> 
> Btw., I notice that the "code" is very repetitive and the modules
> contain mostly data (though I haven't read all of them yet). Vlad,
> have you considered only storing the _benchmarks value in a file and
> reading that? E.g., if you have a file called "cluster.bench"
> containing only
> 
>  [
>    {
>     'obj': 'KMeans',
>     'init_params': {'n_clusters': 9},
>     'datasets': ('minimadelon', 'blobs'),
>     'statements': ('fit', 'predict')
>    },
>    {
>     'obj': 'MiniBatchKMeans',
>     'init_params': {'n_clusters': 9},
>     'datasets': ('minimadelon', 'madelon'),
>     'statements': ('fit', 'predict')
>    }
>  ]
> 
> then you can parse that using
> 
>    ast.literal_eval(open("cluster.bench").read())
> 
> in a driver script, so you're effectively using PYON, the Python
> Object Notation (note to self: must file patent).
> 
> [It might be possible to get rid of the [,] too.]
> 
> My 2c.

This is actually the kind of thing I was heading for, but first I think the 
current form needs to be used for a while, so we can see whether this 
template is expressive enough for what we need to benchmark.
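
For reference, a rough sketch of what such a driver could look like. The 
function names are made up for illustration and the file layout is just the 
one proposed above. Note that ast.literal_eval expects a string rather than 
a file object, so the file contents are read first.

```python
import ast


def parse_benchmarks(text):
    """Parse a benchmark list written as a plain Python literal."""
    # literal_eval only accepts literals (dicts, lists, tuples, strings,
    # numbers, ...), so the file cannot run arbitrary code.
    benchmarks = ast.literal_eval(text)
    if not isinstance(benchmarks, list):
        raise ValueError("expected a list of benchmark descriptions")
    return benchmarks


def load_benchmarks(path):
    """Read a .bench file and return its list of benchmark dicts."""
    with open(path) as f:
        return parse_benchmarks(f.read())
```

Calling load_benchmarks("cluster.bench") would then return the same list of 
dicts that the modules currently store in _benchmarks.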

For example, one thing that bothers me is that the model is refitted for 
every 'predict' benchmark, since benchmarks are independent.
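
To make the refitting issue concrete, here is a simplified sketch of how one 
description could expand into independent benchmarks. It is not the actual 
code: expand and load_data are hypothetical names, and the setup and 
statement are only built as strings.

```python
def expand(benchmarks):
    """Yield one (name, setup, statement) triple per obj/dataset/statement."""
    for bench in benchmarks:
        create = "est = %s(**%r)" % (bench['obj'], bench['init_params'])
        for dataset in bench['datasets']:
            # load_data is a hypothetical helper, used for illustration only.
            load = "X, y = load_data(%r)" % dataset
            for statement in bench['statements']:
                setup = [load, create]
                if statement == 'fit':
                    code = "est.fit(X, y)"
                else:
                    # Independent benchmarks share no state, so the fit has
                    # to be repeated in the setup of every non-fit statement.
                    setup.append("est.fit(X, y)")
                    code = "est.%s(X)" % statement
                name = "%s-%s-%s" % (bench['obj'], dataset, statement)
                yield name, "\n".join(setup), code
```

With the KMeans entry from the example above, this gives four benchmarks, 
and each of them ends up fitting the model, either as the timed statement or 
as part of the setup.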

> 
> -- 
> Lars Buitinck
> Scientific programmer, ILPS
> University of Amsterdam
> 
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and 
> threat landscape has changed and how IT managers can respond. Discussions 
> will include endpoint security, mobile security and the latest in malware 
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

------------------
Vlad N.
http://vene.ro





