The issue is that anything other than fit(X, y) would break
cross_val_score, GridSearchCV and Pipeline.
I agree that more control is good, but having functions that don't work
well with the rest of the package is not great.
Only being able to "transform" to a distance to the training set is a
bit limiting, but I don't see a different way to do
it within the current API.
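To make that concrete, here is a minimal sketch (purely illustrative, none of
the names below are an agreed design) of a metric learner that stays inside
the standard fit(X, y) / transform contract, with transform returning
distances to the training set as described above:

    import numpy as np
    from sklearn.base import BaseEstimator, TransformerMixin
    from sklearn.metrics import pairwise_distances


    class MetricLearnerSketch(BaseEstimator, TransformerMixin):
        """Placeholder metric learner that fits the standard transformer API."""

        def fit(self, X, y):
            # A real implementation would learn a Mahalanobis-type matrix
            # from y; the identity matrix stands in for it here.
            self.X_train_ = np.asarray(X, dtype=float)
            self.components_ = np.eye(self.X_train_.shape[1])
            return self

        def transform(self, X):
            # Map into the learned space, then report distances to the
            # training samples: output shape is (n_samples, n_train_samples).
            X = np.asarray(X, dtype=float)
            return pairwise_distances(X @ self.components_.T,
                                      self.X_train_ @ self.components_.T)

Something like make_pipeline(MetricLearnerSketch(), KNeighborsClassifier())
would then pass through cross_val_score and GridSearchCV untouched, at the
price of the distance-to-training-set output mentioned above.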
Can you explain this statement a bit more: "We can go with usual y
vector consisting of feature labels"?
Thanks,
Andy
On 03/18/2015 12:55 PM, Artem wrote:
Well, we could go with fit(X, y), but since the algorithms use S and D,
it'd be better to give the user a way to specify them directly if (s)he
wants to. Either way, LMNN works with raw labels, so it doesn't require
any changes to the existing API.
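To make the raw-labels point concrete, here is a tiny illustration (not part
of any proposal) of how the S/D pairs that ITML-style algorithms consume can
be derived from an ordinary label vector:

    import itertools
    import numpy as np

    def pairs_from_labels(y):
        # Same label -> similar pair, different label -> dissimilar pair.
        y = np.asarray(y)
        similar, dissimilar = [], []
        for i, j in itertools.combinations(range(len(y)), 2):
            (similar if y[i] == y[j] else dissimilar).append((i, j))
        return np.array(similar), np.array(dissimilar)

    # pairs_from_labels([0, 0, 1]) -> S = [(0, 1)], D = [(0, 2), (1, 2)]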
On Wed, Mar 18, 2015 at 7:26 PM, Gael Varoquaux
<gael.varoqu...@normalesup.org>
wrote:
On Wed, Mar 18, 2015 at 07:21:18PM +0300, Artem wrote:
> As to what y should look like, it depends on what we'd like the
> algorithm to do. We can go with usual y vector consisting of feature
> labels. Actually, LMNN is done this way, the optimization objective
> depends on the equality of labels only. For ITML (and many others) we
> need sets of (S)imilar and (D)issimilar pairs, which can also be
> inferred from labels.
> This is a bit less generic since we would imply that similarity is
> transitive, and that's not true in the general case. For the general
> case we'd need a way to feed in actual pairs. This could be done with
> fit having 2 optional arguments (similar and dissimilar) defaulted to
> None, which are inferred from y in case of absence.
For now, I don't think that we want to add new variants of the
scikit-learn API.
G
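For reference, a rough sketch of the fit signature described in the quoted
message (the argument names come from the email; everything else is
illustrative, not an agreed design):

    import itertools
    import numpy as np
    from sklearn.base import BaseEstimator


    class PairConstrainedMetricLearner(BaseEstimator):
        def fit(self, X, y=None, similar=None, dissimilar=None):
            # Explicit pair constraints take precedence; otherwise infer them
            # from y (same label -> similar, different label -> dissimilar).
            if similar is None and dissimilar is None:
                if y is None:
                    raise ValueError(
                        "need y or explicit similar/dissimilar pairs")
                y = np.asarray(y)
                similar, dissimilar = [], []
                for i, j in itertools.combinations(range(len(y)), 2):
                    (similar if y[i] == y[j] else dissimilar).append((i, j))
            self.similar_ = np.asarray(similar)
            self.dissimilar_ = np.asarray(dissimilar)
            # ... an ITML-style optimisation over these constraints would go here ...
            return self

Note that the extra keyword arguments are exactly what the reply above pushes
back on: cross_val_score, GridSearchCV and Pipeline only ever pass X and y to
fit, so explicit pair constraints would be unreachable through them.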