+1 from me
On Sat, Sep 14, 2019 at 8:12 AM Joel Nothman wrote:
> I am +1 for this change.
>
> I agree that users will accommodate the syntax sooner or later.
>
> On Fri., 13 Sep. 2019, 7:54 pm Jeremie du Boisberranger, <
> jeremie.du-boisberran...@inria.fr> wrote:
>
>> I don't know what is the
Thanks, Guillaume.
ColumnTransformer looks pretty neat. I've also heard, though, that this kind of
pipeline can be tedious to set up? Specifying what you want for every feature is a
pain.
Javier,
Actually, you guessed right. My real data has only one numerical
variable; it looks more like this:
Gender Date
Sayak Paul | sayak.dev
---------- Forwarded message ---------
From:
Date: Fri, Sep 13, 2019 at 10:46 AM
Subject: scikit-learn Digest, Vol 42, Issue 15
To:
If you have datasets with many categorical features, and perhaps many
categories, the tools in sklearn are quite limited,
but there are alternative implementations of boosted trees that are
designed with categorical features in mind. Take a look
at catboost [1], which has an sklearn-compatible interface.
I am +1 for this change.
I agree that users will accommodate the syntax sooner or later.
On Fri., 13 Sep. 2019, 7:54 pm Jeremie du Boisberranger, <
jeremie.du-boisberran...@inria.fr> wrote:
> I don't know what is the policy about a sklearn 1.0 w.r.t api changes.
>
> If it's meant to be a
I will just add that if you have heterogeneous types, you might want to
look at the ColumnTransformer:
https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html
You might want to apply some scaling (this would not be relevant for trees,
though) and encode the categories.
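A minimal sketch of that pattern, using a toy frame with one categorical and one numerical column (the column names and data are made up for illustration):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy frame with one categorical and one numerical column (illustrative).
X = pd.DataFrame({
    "gender": ["M", "F", "F", "M"],
    "amount": [10.0, 20.0, 15.0, 30.0],
})

# One-hot encode the categorical column and scale the numeric one;
# for a pure tree model the scaling step could simply be dropped.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["gender"]),
    ("num", StandardScaler(), ["amount"]),
])

Xt = preprocess.fit_transform(X)
print(Xt.shape)  # (4, 3): two one-hot columns plus the scaled column
```

This is the same pattern as the linked mixed-types example; the ColumnTransformer can then be chained with any estimator in a Pipeline.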