2012/7/12 Gael Varoquaux <[email protected]>:
> I haven't followed too much the codebase (I should have but...). That
> said, I must confess that I am a bit frightened at the number of
> different technologies that are being put together. Enabling ease of
> installation/reproducibility seems to me a priority.
What exactly are the dependencies? It would be nice if they were
listed in the README. Is it just vbench and Sphinx + transitive
closure?
Btw., I notice that the "code" is very repetitive and the modules
contain mostly data (though I haven't read all of them yet). Vlad,
have you considered only storing the _benchmarks value in a file and
reading that? E.g., if you have a file called "cluster.bench"
containing only
    [
        {
            'obj': 'KMeans',
            'init_params': {'n_clusters': 9},
            'datasets': ('minimadelon', 'blobs'),
            'statements': ('fit', 'predict')
        },
        {
            'obj': 'MiniBatchKMeans',
            'init_params': {'n_clusters': 9},
            'datasets': ('minimadelon', 'madelon'),
            'statements': ('fit', 'predict')
        }
    ]
then you can parse that using
    ast.literal_eval(open("cluster.bench").read())
in a driver script, so you're effectively using PYON, the Python
Object Notation (note to self: must file patent).
[It might be possible to get rid of the [,] too.]
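
A minimal sketch of what such a driver could look like, assuming the file layout above. The load_benchmarks helper and its wrap flag are hypothetical names made up for illustration, not part of vbench or the existing code. With wrap=True the file may leave out the enclosing [ ] and just list the dicts separated by commas.

```python
import ast
import os
import tempfile


def load_benchmarks(path, wrap=False):
    """Read a .bench file that contains only Python literals.

    Hypothetical helper for illustration. With wrap=True the
    enclosing [ ] is added here, so the file can omit it.
    """
    with open(path) as f:
        text = f.read()
    if wrap:
        text = "[" + text + "]"
    # literal_eval accepts only literals (dicts, lists, tuples,
    # strings, numbers, ...), so the file cannot execute code.
    return ast.literal_eval(text)


# Write a sample cluster.bench to a temporary directory so the
# sketch runs on its own.
sample = """[
    {
        'obj': 'KMeans',
        'init_params': {'n_clusters': 9},
        'datasets': ('minimadelon', 'blobs'),
        'statements': ('fit', 'predict')
    },
    {
        'obj': 'MiniBatchKMeans',
        'init_params': {'n_clusters': 9},
        'datasets': ('minimadelon', 'madelon'),
        'statements': ('fit', 'predict')
    }
]
"""
bench_dir = tempfile.mkdtemp()
bench_path = os.path.join(bench_dir, "cluster.bench")
with open(bench_path, "w") as f:
    f.write(sample)

benchmarks = load_benchmarks(bench_path)
for bench in benchmarks:
    print(bench['obj'], bench['datasets'], bench['statements'])
```

Since literal_eval refuses anything that is not a plain literal, a malformed or malicious .bench file raises an error instead of running code, which is the main reason to prefer it over eval or importing the module.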
My 2c.
--
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general