Yeah, the input format is a bit odd; usually it should be n_samples x
n_features, so something like
[['A'], ['C'], ['T'], ['G']]
Though this is currently also hard to do :(
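(For readers of the archive: "hard to do" refers to the scikit-learn of 2016, where OneHotEncoder only accepted integer input. In newer releases, roughly 0.20 onward, it accepts string categories directly, so the n_samples x n_features layout above works as-is. A minimal sketch, assuming a recent scikit-learn:)

```python
from sklearn.preprocessing import OneHotEncoder
import numpy as np

# n_samples x n_features: one nucleotide per row, a single feature column
X = np.array([['A'], ['C'], ['T'], ['G']])

# Newer scikit-learn (>= 0.20) accepts string categories directly;
# the default output is sparse, so convert for inspection.
enc = OneHotEncoder()
encoded = enc.fit_transform(X).toarray()
print(encoded.shape)  # (4, 4): one row per sample, one column per category
```

Categories are sorted, so the columns correspond to A, C, G, T in that order.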
On 09/20/2016 05:50 AM, Lee Zamparo wrote:
Hi Joel,
Yea, seems that the one-hot encoding of the transpose solves the issue. As
you say, and as I mentioned to Sebastian, it seems a bit off-usage for
OneHotEncoder.
Thanks for the solution all the same though.
--
Lee Zamparo
On September 19, 2016 at 7:48:15 PM, Joel Nothman (joel.noth...
Hi Sebastian,
Great, thanks!
The docstring doesn't make it very clear that using the default
`n_values='auto'` infers the number of different values column-wise; maybe I
could do a quick PR to update it? Or, maybe I could make your example into
a, well, example for the documentation online?
Alte
OneHotEncoder has issues, but I think all you want here is
ohe.fit_transform(np.transpose([le.fit_transform([c for c in myguide])]))
Still, this seems like it is far from the intended use of OneHotEncoder
(which should not really be stacked with LabelEncoder), so it's not
surprising it's tricky.
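(Spelled out as a runnable sketch, with `myguide` assumed to be a plain string of bases, since the thread doesn't show it: LabelEncoder maps the characters to integers, the result is reshaped into the n_samples x 1 column that the integer-mode OneHotEncoder expects, and the encoder then infers the categories from that single column.)

```python
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
import numpy as np

myguide = "GATTACA"  # assumed: a guide sequence as a plain string

le = LabelEncoder()
labels = le.fit_transform(list(myguide))  # characters -> integer codes
column = labels.reshape(-1, 1)            # n_samples x 1, as the encoder expects

onehot = OneHotEncoder().fit_transform(column).toarray()
print(onehot.shape)  # (7, 4): one row per base, one column per distinct base
```

Stacking the two encoders this way is exactly the off-label usage being discussed; it works, but it is not what OneHotEncoder was designed for.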
Hi, Lee,
maybe set `n_values=4`; this seems to do the job. I think the problem you
encountered is due to the fact that the one-hot encoder infers the number of
values for each feature (column) from the dataset, and in your example each
column had only one unique value.
> array([[0, 1,
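(The column-wise inference above is easy to demonstrate. The sketch below uses made-up integer codes for a 7-base sequence; note that the `n_values` parameter from this 2016 thread was later removed from scikit-learn in favor of `categories`, but the shape behavior is the same: a 1 x n row gives one inferred value per column, while the transposed n x 1 column gives all four.)

```python
from sklearn.preprocessing import OneHotEncoder
import numpy as np

# 1 sample x 7 features: each column contains a single value,
# so 'auto' inference finds only one category per column.
row = np.array([[0, 2, 3, 3, 0, 1, 0]])
wide = OneHotEncoder().fit_transform(row).toarray()
print(wide.shape)  # (1, 7): one inferred category per column

# 7 samples x 1 feature: the single column now holds all four values.
col = row.reshape(-1, 1)
tall = OneHotEncoder().fit_transform(col).toarray()
print(tall.shape)  # (7, 4): four categories inferred for the one column
```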