On Sun, Jun 2, 2013 at 1:44 PM, Joel Nothman
<jnoth...@student.usyd.edu.au> wrote:

>
> From the sounds of things, it would be easier and probably more efficient
> to just always convert to dense binarized matrices, unless we have a good
> case for requiring sparse handling of labels. In particular, scipy.sparse
> does not currently support important operations for metrics: ==, !=, &, |,
> ^.
>

Ok, let's keep the label indicator matrix as a numpy array then. We can
always change the multi-label metrics implementations to use CSR in the
future if the need arises (with something like atleast2d_or_csr at the
beginning of each metric).
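
To illustrate why dense indicator matrices keep the metrics simple, here is
a minimal sketch using plain NumPy boolean arrays. The function names are
hypothetical and not part of scikit-learn; they only show that the
elementwise operators mentioned above (!=, &, |) apply directly to dense
arrays.

```python
import numpy as np


def hamming_loss_dense(y_true, y_pred):
    """Fraction of label slots that differ (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    # Elementwise != works out of the box on dense arrays.
    return np.mean(y_true != y_pred)


def jaccard_per_sample(y_true, y_pred):
    """Per-sample |intersection| / |union| (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    intersection = (y_true & y_pred).sum(axis=1)
    union = (y_true | y_pred).sum(axis=1)
    # Assumption: two empty label sets count as a perfect match.
    return np.where(union == 0, 1.0, intersection / np.maximum(union, 1))


# Two samples, three possible labels.
y_true = np.array([[1, 0, 1], [0, 1, 0]], dtype=bool)
y_pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)

loss = hamming_loss_dense(y_true, y_pred)
scores = jaccard_per_sample(y_true, y_pred)
```

If a CSR code path were added later, only the first lines of each function
would presumably need to change.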

>
>
>> This way we would have the following advantages:
>> - easy incremental building of the labels, thanks to the support for
>> sequences of sequences or arrays of sets
>> - simplified implementation of metrics (and future estimators), since the
>> handling of sequences of sequences / arrays of sets would be delegated to
>> LabelBinarizer
>>
>
> Sounds good to me. Only I would like some confirmation on whether
> deprecating support for sequences of sequences is sensible.
>

Sequences of sequences and arrays of sets are both iterables of iterables,
right? So, since it only affects LabelBinarizer, I'd think we can support
both.
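
As a rough sketch of that idea (not the actual LabelBinarizer code), a
single helper that treats its input as an iterable of iterables can
handle both representations. The name binarize_labels and its return
values are assumptions made up for this example.

```python
import numpy as np


def binarize_labels(y):
    """Turn an iterable of iterables of labels into a dense indicator matrix.

    Hypothetical helper: it only assumes each element of `y` is iterable
    and that the labels are hashable and sortable.
    """
    rows = [set(labels) for labels in y]
    classes = sorted(set().union(*rows))
    column = {label: j for j, label in enumerate(classes)}
    indicator = np.zeros((len(rows), len(classes)), dtype=int)
    for i, labels in enumerate(rows):
        for label in labels:
            indicator[i, column[label]] = 1
    return indicator, classes


# Sequence of sequences.
seq_of_seq = [[0, 2], [1], []]

# Array of sets, filled element by element to keep it a 1-D object array.
array_of_sets = np.empty(3, dtype=object)
for i, labels in enumerate([{0, 2}, {1}, set()]):
    array_of_sets[i] = labels

Y_seq, classes_seq = binarize_labels(seq_of_seq)
Y_set, classes_set = binarize_labels(array_of_sets)
```

Both inputs give the same indicator matrix, so the metrics downstream
never need to know which representation the user started from.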

Mathieu