2013/4/15 Vlad Niculae <[email protected]>
> It really depends on each estimator and there is not one format that's
> better every time. It's the same as with dense arrays, with C versus
> Fortran ordering.
>
I did a quick check on the supervised methods:
the coordinate descent methods (ElasticNet, Lasso) use CSC format for
sparse data and Fortran order for dense data.
All others (SGD, LinearSVC, SVC, NaiveBayes, Ridge) assume CSR format for
sparse data and C order for dense data.
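
To make the CSR/CSC difference concrete, here is a minimal pure-Python
sketch of the two compressed layouts. It is only an illustration, not how
scikit-learn or scipy implement it; the helper names (compress,
major_slice) and the toy matrix are made up for this example. The point is
that CSR stores each row (sample) contiguously and CSC stores each column
(feature) contiguously, which is my understanding of why sample-wise
solvers such as SGD favour CSR and feature-wise coordinate descent favours
CSC.

```python
def compress(triplets, n_major, major):
    """Build (indptr, indices, data) from (row, col, value) triplets.

    major=0 compresses along rows (CSR layout);
    major=1 compresses along columns (CSC layout).
    Hypothetical helper for illustration only.
    """
    minor = 1 - major
    # Sort entries so that each major slice is stored contiguously.
    ordered = sorted(triplets, key=lambda t: (t[major], t[minor]))
    # Count entries per major slice, then turn counts into offsets.
    indptr = [0] * (n_major + 1)
    for t in ordered:
        indptr[t[major] + 1] += 1
    for i in range(n_major):
        indptr[i + 1] += indptr[i]
    indices = [t[minor] for t in ordered]
    data = [t[2] for t in ordered]
    return indptr, indices, data


def major_slice(matrix, i):
    """Return the (minor index, value) pairs of the i-th row (CSR) or column (CSC)."""
    indptr, indices, data = matrix
    start, stop = indptr[i], indptr[i + 1]
    return list(zip(indices[start:stop], data[start:stop]))


# Toy 3 x 4 matrix in COO (triplet) form:
# [[1, 0, 2, 0],
#  [0, 0, 3, 0],
#  [4, 0, 0, 5]]
coo = [(0, 0, 1.0), (0, 2, 2.0), (1, 2, 3.0), (2, 0, 4.0), (2, 3, 5.0)]

csr = compress(coo, n_major=3, major=0)  # row-compressed
csc = compress(coo, n_major=4, major=1)  # column-compressed

row_2 = major_slice(csr, 2)  # one sample: cheap in CSR
col_2 = major_slice(csc, 2)  # one feature: cheap in CSC
```

In practice you would not write this yourself: scipy.sparse matrices
provide tocsr() and tocsc(), and numpy provides asfortranarray() and
ascontiguousarray() for the dense case. Converting once up front avoids a
conversion inside every fit call.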
>
> Unfortunately I can't give an example off the top of my head, but I
> think that among SVC, LinearSVC and SGDClassifier, two of them must
> disagree on this.
>
> The best way to know is to thoroughly check the docs of the objects
> you're working with. If nothing is said there, go to the source code and
> maybe the first couple of lines will clue you in. Algorithms that have
> already been optimized for a specific format will usually convert the
> data to that format with ``utils.check_arrays`` before starting.
>
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/validation.py#L127
>
> Cheers,
> Vlad
>
> On Mon, Apr 15, 2013 at 4:00 AM, Philipp Singer <[email protected]> wrote:
> > AFAIK scikit-learn works with CSR matrices internally, as many
> > mathematical operations are only possible on CSR matrices.
> >
> > On 14.04.2013 20:01, Alex Kopp wrote:
> >
> > Is there a sparse matrix format that is most efficient for sklearn?
> > (COO vs CSR vs LIL)
> >
> > Thanks
> >
> >
> >
> ------------------------------------------------------------------------------
> > Precog is a next-generation analytics platform capable of advanced
> > analytics on semi-structured data. The platform includes APIs for
> building
> > apps and a phenomenal toolset for data science. Developers can use
> > our toolset for easy data analysis & visualization. Get a free account!
> > http://www2.precog.com/precogplatform/slashdotnewsletter
> >
> >
> >
> > _______________________________________________
> > Scikit-learn-general mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>
--
Peter Prettenhofer