The issue is that anything other than fit(X, y) would break cross_val_score, GridSearchCV and Pipeline. I agree that more control is good, but having functions that don't work well with the rest of the package is not great.

Only being able to "transform" to a distance to the training set is a bit limiting, but I don't see a different way to do it within the current API.
Can you explain this statement a bit more: "We can go with usual y vector consisting of feature labels"?

Thanks,
Andy

On 03/18/2015 12:55 PM, Artem wrote:
Well, we could go with fit(X, y), but since the algorithms use S and D, it'd be better to give the user a way to specify them directly if (s)he wants to. Either way, LMNN works with raw labels, so it doesn't require any changes to the existing API.
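
A minimal sketch of the idea discussed in this thread, assuming pure Python and no scikit-learn dependency. The names pairs_from_labels and PairwiseMetricLearner are hypothetical and are not part of scikit-learn; the sketch only shows how S and D could be inferred from y when they are not passed explicitly, and does no metric learning.

```python
from itertools import combinations


def pairs_from_labels(y):
    """Infer (similar, dissimilar) index pairs from a label vector.

    Hypothetical helper, not part of scikit-learn. Two samples are
    treated as similar exactly when their labels are equal.
    """
    similar, dissimilar = [], []
    for i, j in combinations(range(len(y)), 2):
        if y[i] == y[j]:
            similar.append((i, j))
        else:
            dissimilar.append((i, j))
    return similar, dissimilar


class PairwiseMetricLearner:
    """Hypothetical estimator skeleton; only constraint handling is shown."""

    def fit(self, X, y=None, similar=None, dissimilar=None):
        # Fall back to label-derived pairs for whichever set is missing.
        if similar is None or dissimilar is None:
            if y is None:
                raise ValueError("Provide y, or both similar and dissimilar.")
            inferred_s, inferred_d = pairs_from_labels(y)
            similar = inferred_s if similar is None else similar
            dissimilar = inferred_d if dissimilar is None else dissimilar
        self.similar_ = list(similar)
        self.dissimilar_ = list(dissimilar)
        # The actual metric optimization would go here.
        return self
```

With this signature, a plain fit(X, y) call still works, so label-based use would not need anything beyond the usual two arguments. Pairs inferred from labels are necessarily transitive, whereas pairs passed explicitly need not be.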

On Wed, Mar 18, 2015 at 7:26 PM, Gael Varoquaux <gael.varoqu...@normalesup.org> wrote:

    On Wed, Mar 18, 2015 at 07:21:18PM +0300, Artem wrote:
    > As to what y should look like, it depends on what we'd like the algorithm to
    > do. We can go with usual y vector consisting of feature labels. Actually, LMNN
    > is done this way, the optimization objective depends on the equality of labels
    > only. For ITML (and many others) we need sets of (S)imilar and (D)issimilar
    > pairs, which can also be inferred from labels.

    > This is a bit less generic since we would imply that similarity is transitive,
    > and that's not true in a general case. For the general case we'd need a way to
    > feed in actual pairs. This could be done with fit having 2 optional arguments
    > (similar and dissimilar) defaulted to None, which are inferred from y in case
    > of absence.

    For now, I don't think that we want to add new variants of the scikit-learn API.

    G

    
------------------------------------------------------------------------------
    Dive into the World of Parallel Programming The Go Parallel
    Website, sponsored
    by Intel and developed in partnership with Slashdot Media, is your
    hub for all
    things parallel software development, from weekly thought
    leadership blogs to
    news, videos, case studies, tutorials and more. Take a look and
    join the
    conversation now. http://goparallel.sourceforge.net/
    _______________________________________________
    Scikit-learn-general mailing list
    Scikit-learn-general@lists.sourceforge.net
    <mailto:Scikit-learn-general@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/scikit-learn-general





