I think it's a bit weird if we're returning sparse output from OneVsRestClassifier.predict if it wasn't fit on sparse Y.
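For what it's worth, here is a small sketch of the behaviour I would expect. It assumes a reasonably recent scikit-learn, uses LogisticRegression purely as a stand-in for XGBClassifier, and runs on made-up data; as far as I can tell the output type simply follows whether Y was sparse at fit time.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Made-up data: 40 samples, 4 features, 3 binary labels (illustrative only).
rng = np.random.RandomState(0)
X = rng.normal(size=(40, 4))
Y_dense = (X[:, :3] > 0).astype(int)   # dense multilabel indicator matrix
Y_sparse = sp.csr_matrix(Y_dense)      # same labels, stored sparsely

# LogisticRegression is an assumption standing in for XGBClassifier.
dense_pred = OneVsRestClassifier(LogisticRegression()).fit(X, Y_dense).predict(X)
sparse_pred = OneVsRestClassifier(LogisticRegression()).fit(X, Y_sparse).predict(X)

# Report whether each prediction came back sparse.
print(type(dense_pred), sp.issparse(dense_pred))
print(type(sparse_pred), sp.issparse(sparse_pred))
```

If that holds for the pipeline in question, it may be worth checking whether Y is sparse when the model is fit (for example, produced by a binarizer configured for sparse output).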
Actually, I would be in favour of deprecating multilabel support in OneVsRestClassifier, since what it performs for multilabel is the "binary relevance" method, not actually OvR. MultiOutputClassifier duplicates this functionality (more or less), outputs a dense array (indeed, it doesn't support sparse Y, and perhaps it should) and lives closer to functional alternatives to binary relevance, such as ClassifierChain.

On Thu, 11 Apr 2019 at 05:32, Liam Geron <l...@chatdesk.com> wrote:

> Unfortunately I don't believe that you get that level of freedom. It's an
> API call that automatically calls the model's predict method, so I don't
> think that I get to specify something like model.predict(X).toarray(). I
> could be wrong, however; I don't pretend to be an expert on Cloud ML by any
> stretch.
>
> Thanks,
> Liam
>
> On Wed, Apr 10, 2019 at 3:23 PM Sebastian Raschka <
> m...@sebastianraschka.com> wrote:
>
>> Hm, weird that their platform seems to be so picky about it. Have you
>> tried to just make the output of the pipeline dense? I.e.,
>>
>> (model.predict(X)).toarray()
>>
>> Best,
>> Sebastian
>>
>> > On Apr 10, 2019, at 1:10 PM, Liam Geron <l...@chatdesk.com> wrote:
>> >
>> > Hi Sebastian,
>> >
>> > Thanks for the advice! Luckily the model works fine on its own in
>> > Python, so I don't think that is exactly the issue. I have tried
>> > rolling my own estimator to wrap the pipeline and have it call the
>> > predict_proba method to return a dense array. However, I then ran into
>> > the problem that I would have to have that custom estimator defined on
>> > the Cloud ML end, which I'm unsure how to do.
>> >
>> > Thanks,
>> > Liam
>> >
>> > On Wed, Apr 10, 2019 at 2:06 PM Sebastian Raschka <
>> > m...@sebastianraschka.com> wrote:
>> > Hi Liam,
>> >
>> > not sure what your exact error message is, but it may also be that the
>> > XGBClassifier only accepts dense arrays? I think the TfidfVectorizer
>> > returns sparse arrays. You could probably fix your issues by inserting a
>> > "DenseTransformer" into your pipeline (a simple class that just
>> > transforms an array from a sparse to a dense format). I've implemented
>> > something like that, which you can import or copy & paste from here:
>> >
>> > https://github.com/rasbt/mlxtend/blob/master/mlxtend/preprocessing/dense_transformer.py
>> >
>> > The usage would then basically be
>> >
>> > model = Pipeline([('tfidf', TfidfVectorizer()), ('to_dense',
>> > DenseTransformer()), ('clf', OneVsRestClassifier(XGBClassifier()))])
>> >
>> > Best,
>> > Sebastian
>> >
>> > > On Apr 10, 2019, at 12:25 PM, Liam Geron <l...@chatdesk.com> wrote:
>> > >
>> > > Hi all,
>> > >
>> > > I was hoping to get some guidance re: changing the result of the
>> > > predict method of the OneVsRestClassifier to return a dense array
>> > > rather than a sparse array, given that Google Cloud ML only accepts
>> > > dense numpy arrays as a result of a given model's predict method.
>> > > Right now my model architecture looks like:
>> > >
>> > > model = Pipeline([('tfidf', TfidfVectorizer()), ('clf',
>> > > OneVsRestClassifier(XGBClassifier()))])
>> > >
>> > > which returns a sparse array from the predict method. I saw the Stack
>> > > Overflow post here:
>> > > https://stackoverflow.com/questions/52151548/google-cloud-ml-engine-scikit-learn-prediction-probability-predict-proba
>> > >
>> > > which recommends overwriting the predict method with the
>> > > predict_proba method; however, I found that I can't serialize the
>> > > model after doing so. I also have a Stack Overflow post here:
>> > > https://stackoverflow.com/questions/55366454/how-to-convert-scikit-learn-onevsrestclassifier-predict-method-output-to-dense-a
>> > > which details the specific pickling error.
>> > >
>> > > Is this a known issue? Is there an accepted way to convert this into
>> > > a dense array?
>> > >
>> > > Thanks,
>> > > Liam Geron
>> > > _______________________________________________
>> > > scikit-learn mailing list
>> > > scikit-learn@python.org
>> > > https://mail.python.org/mailman/listinfo/scikit-learn
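P.S. A minimal sketch of the MultiOutputClassifier route mentioned at the top of this reply, in case it helps. The toy documents and labels are invented for illustration, and LogisticRegression again stands in for XGBClassifier; treat it as a starting point rather than a tested replacement for the original pipeline.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Invented toy corpus with two binary labels per document (illustrative only).
docs = [
    "refund my order please",
    "where is my order",
    "refund and cancel please",
    "cancel my subscription",
    "order arrived late",
    "please cancel the refund",
]
Y = np.array([[1, 0], [0, 0], [1, 1], [0, 1], [0, 0], [1, 1]])  # dense indicator matrix

# Same pipeline shape as in the thread, with MultiOutputClassifier in place of
# OneVsRestClassifier and LogisticRegression as an assumed stand-in classifier.
model = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(LogisticRegression())),
])
model.fit(docs, Y)

pred = model.predict(docs)
print(type(pred), pred.shape)  # predict returns a dense NumPy array here
```

Note that Y has to be dense for this to work, since MultiOutputClassifier does not support sparse Y.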