Re: [scikit-learn] One-hot encoding

Joel Nothman Sun, 04 Feb 2018 21:04:34 -0800

If each input column is encoded as integers from 0 to (the number of
possible values for that column - 1), then n_values for that column should
be the highest possible value + 1, which is also the number of levels in
that column. Does that make sense?
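
To make the arithmetic concrete, here is a minimal pure-Python sketch. The
helper name infer_n_values and the toy data are made up for illustration and
are not part of scikit-learn. It assumes every level of each column actually
occurs in the data; if the highest level is missing, the number of levels has
to be supplied by hand instead.

```python
def infer_n_values(X):
    """Return (highest value + 1) for each column of X.

    X is a list of rows, each row a list of non-negative integer codes.
    Hypothetical helper for illustration only, not a scikit-learn function.
    """
    n_features = len(X[0])
    return [max(row[j] for row in X) + 1 for j in range(n_features)]


# Toy data: column 0 uses codes {0, 1}, column 1 uses codes {0, 1, 2}.
X = [[0, 2],
     [1, 0],
     [0, 1]]

n_values = infer_n_values(X)  # one entry per column: levels in that column
```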


Actually, I've realised there's a somewhat slow and unnecessary bit of code
in the one-hot encoder: where the COO matrix is converted to CSR. I suspect
this was done because most of our ML algorithms perform better on CSR, or
else to maintain backwards compatibility with an earlier implementation.
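
As a rough illustration of why the conversion might be avoidable (this is
only a sketch, not the scikit-learn code): assuming dense integer input and
known n_values, every row has exactly one nonzero per feature, so the CSR
arrays can be written out directly. The function name one_hot_csr is
hypothetical.

```python
def one_hot_csr(X, n_values):
    """Build CSR arrays (data, indices, indptr) for a one-hot encoding of X.

    X is a list of rows of integer codes; n_values[j] is the number of
    levels in column j. Hypothetical sketch, not the scikit-learn code.
    """
    n_features = len(n_values)

    # First output column of each feature's block.
    offsets = [0]
    for n in n_values[:-1]:
        offsets.append(offsets[-1] + n)

    indices = []
    indptr = [0]
    for row in X:
        # One nonzero per feature, already in increasing column order,
        # so no sorting or COO-to-CSR conversion is needed.
        indices.extend(offsets[j] + row[j] for j in range(n_features))
        indptr.append(len(indices))

    data = [1.0] * len(indices)
    return data, indices, indptr


data, indices, indptr = one_hot_csr([[0, 2], [1, 0], [0, 1]], [2, 3])
```

The three lists have the layout that scipy.sparse.csr_matrix accepts as
(data, indices, indptr), with shape (n_samples, sum(n_values)).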

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
