Hello,

I'm interested in contributing to scikit-learn by implementing some 
algorithms for the optimal selection of objects in an N-dimensional 
space. These techniques are used when a large data set must be sampled 
and the sampling must follow a specific criterion:

- Most Descriptive Compound: The aim of this algorithm is to select a 
subset of compounds that most effectively represents the compounds in 
the original population [Hudson, B; Quantitative Structure-Activity 
Relationships 1996, 15, 285].

- Dissimilarity Selection: The aim of this algorithm is to select a 
subset of compounds that are as different from one another as possible 
[Lajiness, M; Perspectives in Drug Discovery and Design 1997, 7(8), 65].


- others....
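
As a rough illustration of the dissimilarity idea, a minimal sketch 
follows. It assumes a greedy max-min rule with Euclidean distance, which 
is one common way to pick mutually distant objects and not necessarily 
the exact procedure of the cited paper. The function name 
maxmin_selection and its parameters are hypothetical and are not part of 
scikit-learn.

```python
import numpy as np


def maxmin_selection(X, n_select, start=0):
    """Greedy max-min selection (illustrative sketch, hypothetical API).

    X        : array-like of shape (n_samples, n_features)
    n_select : number of objects to pick
    start    : index of the first object (assumed choice, here row 0)
    Returns the list of selected row indices, in selection order.
    """
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    if not 1 <= n_select <= n_samples:
        raise ValueError("n_select must be between 1 and n_samples")

    selected = [start]
    # Distance from every object to its nearest already-selected object.
    min_dist = np.linalg.norm(X - X[start], axis=1)
    min_dist[start] = -np.inf  # never pick the same object twice

    while len(selected) < n_select:
        # Pick the object farthest from everything selected so far.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        # Update nearest-selected distances with the new object.
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
        min_dist[nxt] = -np.inf

    return selected


if __name__ == "__main__":
    points = [[0, 0], [1, 0], [10, 0], [0.5, 0], [10, 10]]
    print(maxmin_selection(points, 3))  # prints [0, 4, 2]
```

Each step costs one pass over the data, so the sketch is O(n_samples * 
n_select) in distance evaluations and never builds the full pairwise 
distance matrix, which matters when sampling large data.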

For the moment I can implement Dissimilarity Selection and Most 
Descriptive Compound, and maybe other algorithms later.


Are you interested?

Giuseppe Marco Randazzo

-- 
Giuseppe Marco Randazzo, Chemist, Ph.D
Collaborateur Ens. Recherche - UniGE Post-Doc Fellow
School of Pharmaceutical Sciences
University of Geneva, University of Lausanne
Pharmacochemistry and Pharmaceutical Analytical Chemistry
Pavillon des isotopes
20, Bd d'Yvoy
CH-1211 Geneva 4 (Switzerland)
Office: I20B
Portable : +41 76 262 67 12
Phone    : +41 22 37 968 94
skype    : gmrandazzo




------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
