On 11 Mar 2017 22:32, <[email protected]> wrote:
> Send scikit-learn mailing list submissions to
> 	[email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
> 	https://mail.python.org/mailman/listinfo/scikit-learn
> or, via email, send a message with subject or body 'help' to
> 	[email protected]
>
> You can reach the person managing the list at
> 	[email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of scikit-learn digest..."
>
> Today's Topics:
>
>    1. Label encoding for classifiers and soft targets
>       (Javier López Peña)
>    2. issue suggestion - decision trees - GSoC (Konstantinos Katrioplas)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 11 Mar 2017 13:04:57 +0000
> From: Javier López Peña <[email protected]>
> To: [email protected]
> Subject: [scikit-learn] Label encoding for classifiers and soft targets
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=utf-8
>
> Hi there!
>
> I have recently been experimenting with model regularization through the
> use of soft targets, and I'd like to be able to play with that from sklearn.
>
> The main idea is as follows: imagine I want to fit a (probabilistic)
> classifier with three possible targets, 0, 1, 2.
>
> If I pass my training set (X, y) to a sklearn classifier, the target
> vector y gets encoded so that each target becomes an array:
> [1, 0, 0], [0, 1, 0], or [0, 0, 1].
>
> What I would like is to pass the targets directly in encoded form and
> avoid any further encoding. This allows, for instance, passing targets
> such as [0.9, 0.05, 0.05] if I want to prevent my classifier from
> becoming too opinionated about its predicted probabilities.
>
> Ideally I would like to do something like this:
>
> ```
> clf = SomeClassifier(*parameters, encode_targets=False)
> ```
>
> and then call
>
> ```
> clf.fit(X, encoded_y)
> ```
>
> Would it be simple to modify sklearn code to do this, or would it require
> a lot of tinkering, such as modifying every single classifier under the sun?
>
> Cheers,
> J
>
> ------------------------------
>
> Message: 2
> Date: Sat, 11 Mar 2017 15:29:30 +0200
> From: Konstantinos Katrioplas <[email protected]>
> To: [email protected]
> Subject: [scikit-learn] issue suggestion - decision trees - GSoC
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Hello all,
>
> While I am waiting for the PR I have submitted to be evaluated
> (https://github.com/scikit-learn/scikit-learn/pull/8563), would you
> suggest another (easy) issue for me to work on? Ideally something for
> which I would write some substantial code, so that I can present it in
> my application for GSoC.
>
> Is anyone interested in mentoring me on the parallelization of decision
> trees? I admit I am not yet really familiar with the current tree code
> (although I have used the method for regression on a research project),
> but I am very much intrigued by the idea and willing to learn all about
> it before the summer.
>
> Regards,
> Konstantinos
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> scikit-learn mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/scikit-learn
>
> ------------------------------
>
> End of scikit-learn Digest, Vol 12, Issue 18
> ********************************************
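Regarding Javier's question: short of modifying every classifier, one known workaround that needs no changes to sklearn is to emulate soft targets with `sample_weight`. For a log-loss model, cross-entropy against a soft target distribution is identical to a weighted hard-label log-loss in which each sample appears once per class, weighted by that class's soft probability. A minimal sketch below (the data and the names `X` and `soft_y` are made up for illustration, not taken from the original post):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data: 100 samples, 4 features, 3 classes, with soft targets
# (one probability row per sample) instead of hard labels.
rng = np.random.RandomState(0)
X = rng.randn(100, 4)
logits = X @ rng.randn(4, 3)
soft_y = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

n_samples, n_classes = soft_y.shape

# Replicate every sample once per class; the weight of copy k is the
# soft probability that the sample belongs to class k.
X_rep = np.repeat(X, n_classes, axis=0)
y_rep = np.tile(np.arange(n_classes), n_samples)
w_rep = soft_y.ravel()

# Any classifier whose fit() accepts sample_weight and whose loss is
# log-loss will then optimize the soft-target cross-entropy exactly.
clf = LogisticRegression()
clf.fit(X_rep, y_rep, sample_weight=w_rep)

proba = clf.predict_proba(X)
```

Note this equivalence is exact only for losses that are linear in the per-class target (log-loss is); for other objectives, such as hinge loss, the replication trick is at best an approximation.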
