Hi Lars,
I am trying to train a classifier on the categories present in
Wikipedia, of which there are approximately 1 million.

Is there a way to accomplish this?

Any help would be appreciated.

Thanks,
Kartik Perisetla
On Jul 3, 2014 6:28 PM, "Lars Buitinck" <[email protected]> wrote:

> 2014-07-03 12:23 GMT+02:00 Kartik Kumar Perisetla <[email protected]>:
> > I am trying to use the naive_bayes algorithm for training the model using
> > partial_fit in scikit-learn.
> >
> > I tried with 16011 features, 100 training instances and 1018664 classes
> > in total, and I get an error when I invoke the partial_fit method. I
> > think there is an upper limit on ma
> >
> > I see that partial_fit will compute np.zeros((1018664, 16011)) for this,
> > which gives an "Array is too big" exception.
>
> That array would take 121 GB of storage. In any case, 1e6 classes is
> *extremely* multiclass. What are you trying to model?
>
>
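
The storage figure quoted above can be checked with a quick calculation.
This is only a sketch: it assumes the array uses NumPy's default dtype,
float64 (8 bytes per element), which the thread does not state explicitly.
The variable names are made up for illustration.

```python
import numpy as np

n_classes = 1018664   # total number of classes, from the message above
n_features = 16011    # number of features, from the message above

# Assumption: np.zeros is called without a dtype, so it defaults to float64.
itemsize = np.dtype(np.float64).itemsize  # 8 bytes

# Python integers do not overflow, so this product is exact.
n_bytes = n_classes * n_features * itemsize
size_gib = n_bytes / 2**30   # binary gigabytes (GiB)
size_gb = n_bytes / 1e9      # decimal gigabytes (GB)

print(f"{n_bytes} bytes = {size_gib:.1f} GiB = {size_gb:.1f} GB")
```

Under that assumption the dense array needs roughly 121.5 GiB (about
130.5 GB in decimal units), which matches the 121 GB figure if it is read
as binary gigabytes.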
> ------------------------------------------------------------------------------
> Open source business process management suite built on Java and Eclipse
> Turn processes into business applications with Bonita BPM Community Edition
> Quickly connect people, data, and systems into organized workflows
> Winner of BOSSIE, CODIE, OW2 and Gartner awards
> http://p.sf.net/sfu/Bonitasoft
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>