Dear Dr. Danka,

This is a very nice generalization you have built.

My group and I have published multiple papers on using active learning for drug discovery model creation, built on top of scikit-learn.

(2017) Future Med Chem : https://dx.doi.org/10.4155/fmc-2016-0197 (*Most downloaded paper of the year) (Open Access)
(2017) J Comput-Aided Chem : https://dx.doi.org/10.2751/jcac.18.124 (Open Access)
(2018) ChemMedChem : https://dx.doi.org/10.1002/cmdc.201700677

In our work, we built a similar framework to modAL, though in our framework the iterative model building is done on a fully labeled (Y) set of examples, and we are more interested in knowing:

(1) How fast learning converges within some convergence criteria (e.g., how many drugs must be in a model, given an evaluation metric),
(2) Which examples are picked across repeated executions of AL (e.g., which drugs appear to be the most informative for model construction),
(3) How much diversity is there in the examples picked (e.g., how different are the drugs selected by AL - visualized in the 2017 FutureMedChem paper), and
(4) How dependent are actively learned models on descriptors (e.g., do different representations affect the speed of performance convergence?).

I think some, if not all, of these questions are also answerable in your framework.

Also, with regard to point (1) and evaluation metrics, I recently came up with an idea to generically analyze the nature of 2-class prediction performance metrics independent of the model methodology used:

(2018) Molecular Informatics : https://dx.doi.org/10.1002/minf.201700127 (Open Access)

You can find the philosophy of this article embedded in the active learning experiments performed in the 2018 ChemMedChem article.

If you or anyone else on this list is interested in active learning and chemistry, please drop me a line.

Again - very nice job, and best wishes for continued development.

Sincerely,
J.B. Brown
Kyoto University Graduate School of Medicine

2018-02-19 16:45 GMT+09:00 Tivadar Danka <theodore.da...@gmail.com>:

> Dear scikit-learn community!
>
> It is my pleasure to announce modAL, a modular active learning framework
> for Python3, built on top of scikit-learn. Designed with modularity,
> flexibility and extensibility in mind, it allows the rapid development of
> active learning workflows with nearly complete freedom. It is aimed for
> researchers and practitioners, where fast prototyping is essential for
> testing and developing active learning pipelines.
>
> modAL is quite young and under constant improvement. Any feedback, feature
> request or contribution are very welcome!
>
> The package can be installed via pip:
> pip3 install modAL
>
> The repository, tutorials and documentation are available at
> - GitHub: https://github.com/cosmic-cortex/modAL
> - Webpage: https://cosmic-cortex.github.io/modAL
>
> Cheers,
> Tivadar
>
> --------------------------------------
> Tivadar Danka
> postdoctoral researcher
> BIOMAG group, MTA-BRC
> http://www.tivadardanka.com
> twitter: @TivadarDanka <https://twitter.com/TivadarDanka>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn