Re: [Scikit-learn-general] Multilabel sequences of sequences considered harmful

2013-06-02 Thread Mathieu Blondel
On Sun, Jun 2, 2013 at 1:44 PM, Joel Nothman jnoth...@student.usyd.edu.au wrote: From the sounds of things, it would be easier and probably more efficient to just always convert to dense binarized matrices, unless we have a good case for requiring sparse handling of labels. In particular,

Re: [Scikit-learn-general] Multilabel sequences of sequences considered harmful

2013-06-02 Thread Joel Nothman
On Sun, Jun 2, 2013 at 4:43 PM, Mathieu Blondel math...@mblondel.org wrote: Sounds good to me. Only I would like some confirmation on whether deprecating support for sequences of sequences is sensible. Sequences of sequences and arrays of sets are both iterables of iterables, right? So,

Re: [Scikit-learn-general] Multilabel sequences of sequences considered harmful

2013-06-02 Thread Mathieu Blondel
On Sun, Jun 2, 2013 at 4:26 PM, Joel Nothman jnoth...@student.usyd.edu.au wrote: That's only true if users know they are required to pass binarized input to cross-validation routines such as GridSearchCV and cross_val_score, or else they might land up with a 2d array of ints instead of a 1d

Re: [Scikit-learn-general] Multilabel sequences of sequences considered harmful

2013-06-02 Thread Joel Nothman
On Sun, Jun 2, 2013 at 6:08 PM, Mathieu Blondel math...@mblondel.org wrote: On Sun, Jun 2, 2013 at 4:26 PM, Joel Nothman jnoth...@student.usyd.edu.au wrote: That's only true if users know they are required to pass binarized input to cross-validation routines such as GridSearchCV and

Re: [Scikit-learn-general] Multilabel sequences of sequences considered harmful

2013-06-02 Thread Joel Nothman
On Sun, Jun 2, 2013 at 6:34 PM, Joel Nothman jnoth...@student.usyd.edu.au wrote: On Sun, Jun 2, 2013 at 6:08 PM, Mathieu Blondel math...@mblondel.org wrote: On Sun, Jun 2, 2013 at 4:26 PM, Joel Nothman jnoth...@student.usyd.edu.au wrote: That's only true if users know they are required to

Re: [Scikit-learn-general] Multilabel sequences of sequences considered harmful

2013-06-01 Thread Joel Nothman
On Sun, Jun 2, 2013 at 1:35 PM, Mathieu Blondel math...@mblondel.org wrote: Sorry for the late answer. It's hard for me to keep track of all the design-related discussions lately. No worries. Thanks for the reply! For me, the advantages of the sequences of sequences format are: - they are

[Scikit-learn-general] Multilabel sequences of sequences considered harmful

2013-05-29 Thread Joel Nothman
TL;DR: do we need to support two forms of multilabel targets? sequences of sequences may have unexpected behaviour. With Arnaud Joly's recent implementation of multilabel support in a number of metrics, there has been some extensive discussion of multilabel targets and their format at
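
For readers unfamiliar with the two target formats the thread discusses, here is a minimal sketch in plain Python. It is not scikit-learn code: the helper name `binarize_labels` and the sample data are hypothetical, chosen only to show how a sequence of label sequences maps to a dense binary indicator matrix.

```python
def binarize_labels(label_sets):
    """Convert a sequence of label sequences into a binary indicator matrix.

    Hypothetical helper for illustration only; it is not part of scikit-learn.
    Returns the sorted list of classes and one 0/1 row per sample.
    """
    # Collect every distinct label and fix a column order for it.
    classes = sorted({label for labels in label_sets for label in labels})
    column = {label: j for j, label in enumerate(classes)}

    matrix = []
    for labels in label_sets:
        row = [0] * len(classes)
        for label in labels:
            row[column[label]] = 1  # mark each label present for this sample
        matrix.append(row)
    return classes, matrix


# Sequence-of-sequences form: one (possibly empty) list of labels per sample.
y_seq = [[0, 2], [1], [], [0, 1, 2]]

# Binary indicator form: one row per sample, one column per class.
classes, y_bin = binarize_labels(y_seq)
print(classes)
print(y_bin)
```

The indicator form always has a fixed, rectangular shape, whereas the sequence-of-sequences form has rows of varying length, which is one reason the two can be treated differently by array-conversion code.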