If each set can draw from n_categories possible categories, then yes.
I read it as the categories being also disjoint, which is why I said 8.
On 08/14/2015 01:04 PM, federico vaggi wrote:
Originally, I thought I'd have to have a vector of length n_categories
per feature - so in this case, 8*8. I just realized however, that
since the order does not matter, and I just want to indicate the
presence or absence of a categorical feature in a set, I can simply
use two vectors (stacked together) of length n_categories (or 2*8).
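A minimal sketch of the stacked encoding described above, in plain Python. The 8-letter vocabulary `VOCAB` and the helper names `encode_set` and `encode_pair` are assumptions made for illustration (the thread only says n_categories is 8); the two example sets are the ones from the original question.

```python
# Hypothetical vocabulary: 8 categories, so that n_categories = 8 as in the thread.
VOCAB = list("abcdefgh")


def encode_set(items, vocab=VOCAB):
    """Return a 0/1 indicator vector of length len(vocab).

    Only membership is recorded, so the order of `items` is irrelevant.
    """
    members = set(items)
    unknown = members - set(vocab)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [1 if category in members else 0 for category in vocab]


def encode_pair(x1, x2, vocab=VOCAB):
    """Concatenate the two indicator vectors.

    X1 occupies the first len(vocab) positions and X2 the last len(vocab),
    so the same category in X1 and in X2 maps to different features.
    """
    return encode_set(x1, vocab) + encode_set(x2, vocab)


X1 = {"a", "b", "c", "d"}
X2 = {"e", "d", "b", "f"}

features = encode_pair(X1, X2)
print(len(features))  # 2 * n_categories
print(features)
```

The result has 2 * n_categories entries rather than n_categories entries per element, and shuffling the elements within either set leaves it unchanged.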
On Fri, 14 Aug 2015 at 16:04 Andreas Mueller <t3k...@gmail.com> wrote:
Why do you think one-hot will be an "explosion"?
In your example, the vector would be length 8 (if there are values
from a to f, that is, you gave the largest possible sets).
On 08/14/2015 09:01 AM, federico vaggi wrote:
Hi,
Simple example:
Let's say that I have a binary classification task, and my input
vector consists of two separate sets of categorical variables -
something like:
X1 = {'a', 'b', 'c', 'd'} and X2 = {'e', 'd', 'b', 'f'}
The order within the sets does not matter (obviously), but it
matters that the elements of X1 are conceptually separate from
those of X2.
All the categorical variables come from the same set.
Is there a clever encoding that:
- Emphasizes that order within each set does not matter
- Avoids explosion with one-hot encoding everything?
Federico
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
<mailto:Scikit-learn-general@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general