Hi Sarah, I have a few questions for you to think about. You don't need to
answer all of them :) Approximately how many categories do you have in each
of those 20M categorical variables? How many samples do you have? Maybe you should
consider different encoding strategies such as binary encoding. Also, this

Hi all -
I can't use binary encoding because I need to be able to trace back to the
exact categorical variable, and I believe that is difficult with binary
encoding. The number of categories varies by variable, but the average is
about 10. My encoder returns a sparse matrix. Regardless of the
enc
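
To make the traceability point concrete, here is a minimal sketch (plain Python, not the encoder actually used in this thread) of a one-hot layout where every output column maps back to exactly one (variable, category) pair. The names fit_columns and encode, and the toy data, are made up for illustration; the sparse matrix is represented only as the (row, column) coordinates of its nonzero entries.

```python
from typing import List, Sequence, Tuple

Column = Tuple[int, str]  # (variable index, category value)


def fit_columns(rows: Sequence[Sequence[str]]) -> List[Column]:
    """Assign one output column to every (variable, category) pair seen."""
    n_vars = len(rows[0])
    seen = [set() for _ in range(n_vars)]
    for row in rows:
        for j, value in enumerate(row):
            seen[j].add(value)
    # Deterministic layout: all categories of variable 0 first, then 1, ...
    return [(j, v) for j in range(n_vars) for v in sorted(seen[j])]


def encode(rows: Sequence[Sequence[str]],
           columns: List[Column]) -> List[Tuple[int, int]]:
    """Return the one-hot matrix as (row, column) coordinates of its 1s."""
    index = {col: k for k, col in enumerate(columns)}
    return [(i, index[(j, v)])
            for i, row in enumerate(rows)
            for j, v in enumerate(row)]


# Hypothetical toy data: two categorical variables, three samples.
rows = [["red", "small"], ["blue", "large"], ["red", "large"]]
columns = fit_columns(rows)
coords = encode(rows, columns)

# columns[k] traces output column k back to its variable and category.
for k, (var, cat) in enumerate(columns):
    print(f"column {k} -> variable {var}, category {cat!r}")
```

With a layout like this, the lookup from an output column to its source variable is a single list index. A binary encoding would instead spread each category over several shared bit columns, so a single column no longer identifies one category.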