Re: [Scikit-learn-general] Instance Reduction on scikit-learn

2014-06-20 Thread Dayvid Victor
Thanks Joel and Mathieu. I'll start a new project and ask to add a reference in the Wiki (hopefully this week). I don't think all the techniques should be included, but at least the most popular (ENN, RENN, OSS) and efficient ones (SSMA, PSO). mblondel, you think 'fit_transform' would be bett

Re: [Scikit-learn-general] Instance Reduction on scikit-learn

2014-06-20 Thread Mathieu Blondel
+1 to starting a separate project in order to receive early feedback. Besides popularity and number of citations, an issue is that our API doesn't currently support instance reduction. We need to decide whether to introduce a new method (e.g., "reduce" as you did) or use fit_transform (so far fit_
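To make the API question concrete, here is a minimal sketch of the two options. The class `ClassCentroidReducer`, its trivial "one centroid per class" rule, and the attribute `y_reduced_` are all hypothetical and chosen only for illustration; they are not taken from Dayvid's code or from scikit-learn. The point is that a reducer changes the number of samples, so it has to hand back a new `y` as well as a new `X`, whereas `fit_transform` in scikit-learn conventionally returns only the transformed `X`.

```python
class ClassCentroidReducer:
    """Hypothetical reducer: replaces each class by the mean of its samples."""

    def reduce(self, X, y):
        # Option 1: a dedicated method that returns both reduced arrays.
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        labels = sorted(groups)
        X_reduced = [
            [sum(col) / len(groups[label]) for col in zip(*groups[label])]
            for label in labels
        ]
        return X_reduced, labels

    def fit_transform(self, X, y):
        # Option 2: transformer-style method that returns X only, so the
        # reduced labels must be read from an attribute afterwards.
        X_reduced, self.y_reduced_ = self.reduce(X, y)
        return X_reduced


X = [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0], [12.0, 14.0]]
y = [0, 0, 1, 1]

reducer = ClassCentroidReducer()
X_r, y_r = reducer.reduce(X, y)        # both arrays come back together
X_t = reducer.fit_transform(X, y)      # labels are left in reducer.y_reduced_
```

With `reduce`, the caller gets a matching `(X, y)` pair in one call. With `fit_transform`, the labels travel out of band, which is the awkwardness the discussion is about.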

Re: [Scikit-learn-general] Instance Reduction on scikit-learn

2014-06-20 Thread Joel Nothman
Hi Dayvid, For now, a number of projects that follow the scikit-learn interface but sit outside it for one reason or another (often just being out of scope) are listed at https://github.com/scikit-learn/scikit-learn/wiki/Third-party-projects-and-code-snippets . I would recommend against keeping everything in a scikit

Re: [Scikit-learn-general] Instance Reduction on scikit-learn

2014-06-20 Thread Dayvid Victor
Hi Joel, Thanks for your feedback. Let me see if I got this straight: you think I should open a new repository and then add an entry in the Wiki? Do you have an example of some other project that did the same? How do I organize it: do I start a new project or do I build a new project inside my sklea

Re: [Scikit-learn-general] Instance Reduction on scikit-learn

2014-06-19 Thread Joel Nothman
PS: Kyle, from a brief look, I would summarise it as sampling a small set of KNN centroids.

On 19 June 2014 12:05, Joel Nothman wrote:
> Hi Dayvid,
>
> Although it could potentially be included in scikit-learn, it looks like
> your components do not require modifying the existing codebase, and

Re: [Scikit-learn-general] Instance Reduction on scikit-learn

2014-06-19 Thread Joel Nothman
Hi Dayvid, Although it could potentially be included in scikit-learn, it looks like your components do not require modifying the existing codebase, and could be construed as an entirely independent project. This could be referenced from the Scikit-learn Wiki or similar, without having to decide wh

Re: [Scikit-learn-general] Instance Reduction on scikit-learn

2014-06-19 Thread Dayvid Victor
Hi Kyle, (sorry for the long answer). Instance Reduction techniques aim to reduce the amount of data manipulated in order to perform a classification/prediction ... Depending on the approach, they can remove noisy data and outliers, remove redundant data, generate new generalized data by combin
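As an illustration of the "remove noisy data" case, below is a minimal pure-Python sketch of Edited Nearest Neighbours (ENN), one of the techniques named elsewhere in this thread. It follows the usual textbook form of the rule (drop a sample when its label disagrees with the majority label of its k nearest neighbours); that this matches the implementation being discussed is an assumption, and the function names `squared_distance` and `enn_reduce` and the toy data are made up for the example.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def enn_reduce(X, y, k=3):
    """Keep only samples whose label matches the majority of their k nearest neighbours."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Rank every other sample by its distance to sample i and take the k closest.
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: squared_distance(xi, X[j]),
        )[:k]
        majority_label = Counter(y[j] for j in neighbours).most_common(1)[0][0]
        if majority_label == yi:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Two well-separated 1-D clusters, plus one mislabelled point inside the first.
X = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3], [0.15]]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

X_reduced, y_reduced = enn_reduce(X, y, k=3)
```

On this toy data only the mislabelled point at 0.15 is dropped, because all three of its nearest neighbours carry label 0; the eight clean samples are kept.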

Re: [Scikit-learn-general] Instance Reduction on scikit-learn

2014-06-18 Thread Kyle Kastner
Do you have any references for this technique? What is it typically used for?

On Wed, Jun 18, 2014 at 12:26 PM, Dayvid Victor wrote:
> Hi there,
>
> Is anybody working on an Instance Reduction module for sklearn?
>
> I started working on those and I already have more than 10 IR (PS and PG)
> al