Dear all,

As I understand it, feature hashing in SGD compresses the feature
dimensions into a fixed-length vector. Won't that make the training result
incorrect when a hash collision happens? Won't two features that hash to
the same slot be treated as the same feature? Even if we use multiple
probes to reduce collisions overall, as a Bloom filter does, won't a slot
that has a collision still look like a combined feature?
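
To make the concern concrete, here is a minimal Python sketch of what I
understand feature hashing to do. It is only an illustration under my own
assumptions, not the code of any particular library: the names slot,
hash_features and find_collision are made up, and MD5 just stands in for
whatever hash function a real implementation would use.

```python
import hashlib


def slot(feature: str, probe: int, size: int) -> int:
    """Map a feature name to a slot index; each probe uses a different seed."""
    digest = hashlib.md5(f"{probe}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size


def hash_features(features: dict, size: int, probes: int = 1) -> list:
    """Fold named feature values into a fixed-length vector."""
    vec = [0.0] * size
    for name, value in features.items():
        for p in range(probes):
            # Colliding features simply add into the same slot.
            vec[slot(name, p, size)] += value
    return vec


def find_collision(names, size: int, probe: int = 0):
    """Return (first_name, second_name, slot) for the first collision, or None."""
    seen = {}
    for name in names:
        s = slot(name, probe, size)
        if s in seen:
            return seen[s], name, s
        seen[s] = name
    return None


# Nine names into eight slots must collide somewhere (pigeonhole).
a, b, s = find_collision([f"f{i}" for i in range(9)], size=8)
vec = hash_features({a: 1.0, b: 1.0}, size=8, probes=1)
print(a, b, "share slot", s, "which now holds", vec[s])
```

With one probe, the two colliding features are indistinguishable to the
learner: it sees a single slot whose value is their sum. With more probes
each feature is spread over several slots, so two features are less likely
to overlap in all of them, but any shared slot still holds a mix of both.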

Thanks.

Best wishes,
Stanley Xu
