I think it's a bit weird if we're returning sparse output from
OneVsRestClassifier.predict if it wasn't fit on sparse Y.
Actually, I would be in favour of deprecating multilabel support in
OneVsRestClassifier, since it performs the "binary relevance" method for
multilabel, not actual OvR.
Unfortunately, I don't believe you get that level of freedom: it's an
API call that automatically calls the model's predict method, so I don't
think I get to specify something like model.predict(X).toarray(). I
could be wrong, however; I don't pretend to be an expert on Cloud ML by any
Hm, weird that their platform seems to be so picky about it. Have you tried to
just make the output of the pipeline dense? I.e.,
(model.predict(X)).toarray()
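If the caller cannot add .toarray() itself, one option is to push the conversion into the model object. The following is only a rough sketch of that idea: DensePredictWrapper is a hypothetical name (not part of scikit-learn), LogisticRegression and the toy data are stand-ins for the real pipeline, and whether a hosted platform accepts such a custom class is not something this sketch can confirm.

```python
import numpy as np
from scipy import sparse
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier


class DensePredictWrapper(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper: delegates to `estimator`, densifies predict()."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        # Fit a clone so the estimator passed in is left untouched.
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def predict(self, X):
        y_pred = self.estimator_.predict(X)
        # Convert only when the wrapped model returned a scipy sparse matrix.
        return y_pred.toarray() if sparse.issparse(y_pred) else np.asarray(y_pred)


# Toy multilabel data (illustrative only): 10 samples, 2 features, 2 labels.
X = np.array([[i, i % 2] for i in range(10)], dtype=float)
Y = np.column_stack([X[:, 1], X[:, 0] < 5]).astype(int)

# Fitting on a sparse Y is the case in which predict may return sparse output.
wrapper = DensePredictWrapper(OneVsRestClassifier(LogisticRegression()))
wrapper.fit(X, sparse.csr_matrix(Y))
y_dense = wrapper.predict(X)
```

Whatever the wrapped model returns, y_dense here is a plain NumPy array with one column per label.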
Best,
Sebastian
> On Apr 10, 2019, at 1:10 PM, Liam Geron wrote:
>
> Hi Sebastian,
>
> Thanks for the advice! The model actually
Hi Nicolas,
You are right, I am just checking this in the source code.
Sorry for the confusion, and thanks for the quick response.
Cheers
Sole
On Wed, 10 Apr 2019 at 18:43, Nicolas Goix wrote:
> Hi Sole,
>
> I'm not sure the 2 limitations you mentioned are correct.
> 1) in your example, using
Hi Sebastian,
Thanks for the advice! Luckily, the model actually works fine on its own in
Python, so I don't think that is exactly the issue. I have tried
rolling my own estimator to wrap the pipeline to have it call the
predict_proba method to return a dense array, however I then came
Hi Liam,
I'm not sure what your exact error message is, but it may also be that the
XGBClassifier only accepts dense arrays? I think the TfidfVectorizer returns
sparse arrays. You could probably fix your issues by inserting a
"DenseTransformer" into your pipeline (a simple class that just
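A minimal sketch of what such a "DenseTransformer" might look like follows. The class is not part of scikit-learn, the two-document corpus is made up, and LogisticRegression is used purely as a stand-in for XGBClassifier so that the example needs no extra dependency.

```python
import numpy as np
from scipy import sparse
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


class DenseTransformer(TransformerMixin, BaseEstimator):
    """Illustrative transformer that turns sparse input into a dense array."""

    def fit(self, X, y=None):
        # Nothing to learn; the step is stateless.
        return self

    def transform(self, X):
        return X.toarray() if sparse.issparse(X) else np.asarray(X)


docs = ["sparse output here", "dense output there"]  # made-up corpus
labels = [0, 1]

# TfidfVectorizer emits a sparse matrix; DenseTransformer densifies it
# before it reaches the final estimator (a stand-in classifier here).
pipe = make_pipeline(TfidfVectorizer(), DenseTransformer(), LogisticRegression())
pipe.fit(docs, labels)

tfidf_dense = DenseTransformer().fit_transform(TfidfVectorizer().fit_transform(docs))
```

The same effect can probably be had without a custom class via FunctionTransformer(lambda X: X.toarray(), accept_sparse=True). Either way, densifying a large TF-IDF matrix can use a lot of memory.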
Hi Sole,
I'm not sure the 2 limitations you mentioned are correct.
1) in your example, using the ColumnTransformer you can impute different
values for different columns.
2) the sklearn transformers do learn on the training set and are able to
perpetuate the values learnt from the train set to
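Both points can be illustrated with a small sketch. The data and the step names are made up for the example: each column gets its own imputation strategy through ColumnTransformer, and the fill values are learnt on the training set and then reused on the test set.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer

# Made-up data: two numeric columns with missing values.
X_train = np.array([[1.0, 10.0],
                    [3.0, np.nan],
                    [np.nan, 30.0]])
X_test = np.array([[np.nan, np.nan]])

# Column 0 is imputed with the training mean, column 1 with the training median.
ct = ColumnTransformer([
    ("mean_col0", SimpleImputer(strategy="mean"), [0]),
    ("median_col1", SimpleImputer(strategy="median"), [1]),
])

ct.fit(X_train)                        # statistics are learnt on the train set only
X_test_imputed = ct.transform(X_test)  # and reused, unchanged, on the test set
```

Here the test row is filled with the statistics of the training data (mean 2.0 for the first column, median 20.0 for the second), not with anything computed from the test set.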
Hi all,
I was hoping to get some guidance re: changing the result of the predict
method of the OneVsRestClassifier to return a dense array rather than a
sparse array, given that Google Cloud ML only accepts dense numpy arrays as
a result of a given model's predict method. Right now my model
>
> Dear Scikit-Learn team,
>
> Feature engineering is a big task ahead of building machine learning
> models. It involves imputation of missing values, encoding of categorical
> variables, discretisation, variable transformation etc.
>
> Sklearn includes some functionality for feature