Thanks Sebastian, downloading. Need 20 hours...

Best wishes,
Stanley Xu

On Wed, Apr 20, 2011 at 9:09 PM, Sebastian Schelter <[email protected]> wrote:

> Have a look at Ted's talk about Mahout's SGD classifier:
> http://vimeo.com/21273655
>
> As far as I remember, he also covers the hashing issues you describe.
>
> --sebastian
>
>
> On 20.04.2011 15:06, Stanley Xu wrote:
>
>> Dear all,
>>
>> As I understand it, feature hashing in SGD compresses the feature
>> dimensions into a fixed-length vector. Won't that make the training
>> result incorrect if a hash collision happens? Won't two features
>> hashed to the same slot be treated as the same feature? Even if we
>> use multiple probes to reduce the total number of collisions, like a
>> Bloom filter, won't a slot that has a collision still look like a
>> combination feature?
>>
>> Thanks.
>>
>> Best wishes,
>> Stanley Xu
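
To make the collision question above concrete, here is a minimal sketch of feature hashing with multiple probes. It is not Mahout's implementation: the helper names (`slot`, `hash_encode`, `slots_for`), the use of MD5 as the hash, and the default vector size and probe count are all assumptions chosen for illustration.

```python
import hashlib


def slot(feature: str, probe: int, size: int) -> int:
    """Map a (feature, probe) pair to a slot index in [0, size).

    Hypothetical helper: MD5 is used only because it is deterministic
    and in the standard library, not because Mahout uses it.
    """
    digest = hashlib.md5(f"{probe}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size


def slots_for(feature: str, size: int = 16, probes: int = 2) -> set:
    """Return the set of slots a feature touches across all probes."""
    return {slot(feature, p, size) for p in range(probes)}


def hash_encode(features, size: int = 16, probes: int = 2) -> list:
    """Encode a bag of feature names into a fixed-length vector.

    Each feature adds 1.0 to one slot per probe, so colliding
    features simply sum into the same slot.
    """
    vec = [0.0] * size
    for name in features:
        for p in range(probes):
            vec[slot(name, p, size)] += 1.0
    return vec


if __name__ == "__main__":
    a, b = "user=42", "ad=7"
    print("slots for a:", sorted(slots_for(a)))
    print("slots for b:", sorted(slots_for(b)))
    print("shared slots:", sorted(slots_for(a) & slots_for(b)))
    print("vector:", hash_encode([a, b]))
```

With a single probe, two features that land in the same slot are indistinguishable to the learner. With several probes, two features become indistinguishable only if every one of their slots coincides, which is much less likely; a partial overlap still leaves unshared slots whose weights the learner can use to tell the features apart.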
