Hi List,
I'm using Mahout Logistic Regression for a prediction task. As a test, I am
running the classification with a single feature, a categorical one with 26
levels.
When I run the logistic regression in R or Python, I get the expected 25
coefficients (corresponding to 25 of the 26 levels, due to the "contrast
coding") + the intercept. However, when I run it in Mahout, I get 26
coefficients + the intercept. Is there any way to force contrast coding in
Mahout (i.e. treat one of the levels as the default level)? Isn't there a risk
of matrix singularity when all 26 levels are included in the logistic
regression?
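To make the singularity concern concrete, here is a minimal NumPy sketch (this is not Mahout code; the data are synthetic and all variable names are just for illustration). It builds the design matrix both ways and compares the number of columns with the matrix rank: with an intercept plus all 26 indicator columns, the indicators sum to the intercept column, so one column is redundant.

```python
import numpy as np

n_levels = 26
rng = np.random.default_rng(0)

# Synthetic categorical feature (assumption): 500 rows, every level present at least once.
levels = np.concatenate([np.arange(n_levels),
                         rng.integers(0, n_levels, size=474)])

one_hot = np.eye(n_levels)[levels]        # one indicator column per level
intercept = np.ones((len(levels), 1))

# Intercept + all 26 indicators (what I seem to get from Mahout): 27 columns.
full = np.hstack([intercept, one_hot])

# Intercept + 25 indicators, level 0 used as the default level: 26 columns.
treatment = np.hstack([intercept, one_hot[:, 1:]])

rank_full = np.linalg.matrix_rank(full)
rank_treatment = np.linalg.matrix_rank(treatment)

print("full coding:     ", full.shape[1], "columns, rank", rank_full)
print("contrast coding: ", treatment.shape[1], "columns, rank", rank_treatment)
```

The full coding has 27 columns but only rank 26, while the contrast-coded matrix has 26 columns and rank 26, which is why I would expect an unregularized fit on the full coding to be non-identifiable.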
Let me know if it's not clear.
Thanks in advance for your answers,
Aymen