Hi,
how can I properly handle categorical values in scikit-learn?
https://stackoverflow.com/questions/45727934/pandas-categories-new-levels?noredirect=1#comment78424496_45727934
goals
- scikit-learn style fit/transform methods to encode labels of
categorical features of X
- should handl
e the transform is costly? Or is it
> more a matter of you wanting to store the transformed data at each step?
>
> There are custom ways to do this sort of thing generically with a mixin if
> you really want.
>
> On 16 August 2017 at 21:28, Georg Heiler
> wrote:
>
>>
There is a new option in the pipeline:
http://scikit-learn.org/stable/modules/pipeline.html#pipeline-cache
How can I use this to also store the transformed data? I only want to
recompute the last step, i.e. the estimator, during hyperparameter tuning,
and not the transform methods of the cleaning steps.
Is
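A minimal sketch of the caching option asked about above, assuming a toy dataset and a PCA step standing in for the costly cleaning steps (both are placeholders, not taken from the thread). With `memory` set, the pipeline can reuse a fitted transformer when its parameters and input data are unchanged, so a grid search that only varies the final estimator should not refit the earlier steps for every candidate.

```python
from shutil import rmtree
from tempfile import mkdtemp

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy data; stands in for the real feature matrix (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

cache_dir = mkdtemp()  # temporary directory used as the pipeline cache
try:
    pipe = Pipeline(
        [
            ("reduce", PCA(n_components=5)),  # placeholder for a costly transform
            ("clf", LogisticRegression()),
        ],
        memory=cache_dir,
    )
    # Only the last step is tuned, so the cached transformer fit can be
    # reused across the candidate values of C within each CV split.
    search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=3)
    search.fit(X, y)
    best_C = search.best_params_["clf__C"]
finally:
    rmtree(cache_dir)  # remove the cache when done
```

The cache is keyed on the transformer's parameters and its input, so changing a parameter of an earlier step triggers a refit of that step.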
To my understanding, pandas.factorize only works for the static case where
no unseen values can occur.
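One possible workaround, sketched here with made-up data: keep the levels returned by `pandas.factorize` at training time and encode later data with `pandas.Categorical` against those fixed levels. Values outside the stored levels then get the code -1 instead of a fresh integer. The series below are illustrative only.

```python
import pandas as pd

# Training data: factorize assigns codes in order of first appearance.
train = pd.Series(["a", "b", "a", "c"])
codes, uniques = pd.factorize(train)

# New data containing a level ("d") that was not seen during training.
new = pd.Series(["b", "d"])

# Encoding against the stored levels: unseen values receive the code -1.
new_codes = pd.Categorical(new, categories=uniques).codes
```

Whether -1 is an acceptable marker for unseen levels depends on the downstream classifier, so treat this as a starting point rather than a complete solution.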
Georg Heiler wrote on Mon, 7 Aug 2017 at 08:40:
> I will need to look into factorize. Here is the result from profiling the
> transform method on a single new observation
all possible values that could
> occur, do the transformation, and then only pass the 1 transformed sample
> to the classifier. I guess that could be even slow though ...
>
> Best,
> Sebastian
>
> > On Aug 6, 2017, at 6:30 AM, Georg Heiler
> wrote:
> >
there's no
>> way around doing this manually; for example you could create mapping
>> dictionaries for that (most conveniently done in pandas).
>>
>> Best,
>> Sebastian
>>
>> > On Aug 5, 2017, at 5:10 AM, Georg Heiler
>> wrote:
>> >
>
Hi,
the LabelEncoder is only meant for a single column, i.e. the target variable.
Is the DictVectorizer or a manual chaining of multiple LabelEncoders (one
per categorical column) the desired way to get values which can be fed into
a subsequent classifier?
Is there some way I have overlooked which w
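A rough sketch of the mapping-dictionary idea mentioned earlier in this thread, wrapped in a scikit-learn style transformer with one mapping per categorical column. The class name `MappingEncoder` and the `unknown_value` parameter are hypothetical names chosen for illustration, not part of scikit-learn, and the input is assumed to be a pandas DataFrame.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class MappingEncoder(BaseEstimator, TransformerMixin):
    """Hypothetical helper: integer-encode each column via its own dict."""

    def __init__(self, unknown_value=-1):
        self.unknown_value = unknown_value

    def fit(self, X, y=None):
        # Build one {level: code} dictionary per column from the training data.
        self.mappings_ = {
            col: {level: code for code, level in enumerate(pd.unique(X[col]))}
            for col in X.columns
        }
        return self

    def transform(self, X):
        out = pd.DataFrame(index=X.index)
        for col, mapping in self.mappings_.items():
            # Levels not seen during fit map to NaN, then to unknown_value.
            out[col] = X[col].map(mapping).fillna(self.unknown_value).astype(int)
        return out


# Illustrative data only.
train = pd.DataFrame({"color": ["red", "blue", "red"], "size": ["S", "M", "L"]})
new = pd.DataFrame({"color": ["blue", "green"], "size": ["L", "XL"]})

encoder = MappingEncoder().fit(train)
encoded = encoder.transform(new)
```

Because the mappings are plain dictionaries, transforming a single new observation is only a lookup per column, and unseen levels do not raise an error.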
May 10, 2017 at 5:17 AM, Georg Heiler
> wrote:
> > Hi Matthew,
> >
> > indeed, that works fine. But what was the problem? Installation from
> > source should have worked fine?
>
> Yes, it should, and I don't know what the problem is.
>
> I just compi
> On Tue, May 9, 2017 at 6:27 PM, Georg Heiler
> wrote:
> >> Yes just like that.
> >
> > Hum - you shouldn't get what I got, because I was installing for
> > Python 3.5, and there is a wheel for Python 3.5. I now see there
> > isn't a wheel for OSX Pyth
Yes, just like that. Even after completely removing the Python library
folder, the error persists.
Meanwhile I set up a conda environment that works, but I would prefer a
plain pip installation.
Matthew Brett wrote on Tue, 9 May 2017 at 19:17:
> Hi,
>
> On Tue, May 9, 2017 at 6:00 PM, Geo
cked that it
> is available? E.g. via xcode-select -p
> BTW does NumPy / SciPy work on your install or is it just sklearn?
>
> Best,
> Sebastian
>
>
>
> Sent from my iPhone
> On May 9, 2017, at 11:36 AM, Georg Heiler
> wrote:
>
> Hi,
>
> unfortunately, the c
Hi,
unfortunately, the C dependencies of my scikit-learn installation broke and
I get the following error on OS X:
dlopen(/usr/local/lib/python3.6/site-packages/sklearn/svm/libsvm.cpython-36m-darwin.so,
2): Symbol not found: __ZdlPvm
Referenced from:
/usr/local/lib/python3.6/site-packages/sklear