Thanks Sebastian, I'm downloading it now. It will take about 20 hours...
Best wishes,
Stanley Xu


On Wed, Apr 20, 2011 at 9:09 PM, Sebastian Schelter <[email protected]> wrote:

> Have a look at Ted's talk about Mahout's SGD classifier:
> http://vimeo.com/21273655
>
> As far as I remember he also covers the hashing issues you describe.
>
> --sebastian
>
>
> On 20.04.2011 15:06, Stanley Xu wrote:
>
>> Dear all,
>>
>> As I understand it, feature hashing in SGD compresses the feature
>> dimensions into a fixed-length vector. Won't that make the training result
>> incorrect when a hash collision occurs? Won't two features that hash to
>> the same slot be treated as the same feature? Even if we use multiple
>> probes to reduce the total number of collisions, as in a Bloom filter,
>> won't a slot with a collision still look like a combination feature?
>>
>> Thanks.
>>
>> Best wishes,
>> Stanley Xu
>>
>>
>
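
To make the question above concrete, here is a minimal sketch of hashed feature encoding with multiple probes. It is an illustration only, not Mahout's encoder code: the names slot, encode and probe_slots, the use of MD5, and the defaults of 16 slots and 2 probes are all assumptions chosen for the example.

```python
import hashlib


def slot(feature: str, probe: int, num_slots: int) -> int:
    """Map (feature, probe) to a slot index with a deterministic hash.

    MD5 is an assumption made for reproducibility; any well-mixed
    hash function would serve the same purpose here.
    """
    digest = hashlib.md5(f"{probe}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slots


def probe_slots(feature: str, num_slots: int = 16, num_probes: int = 2) -> list:
    """Return the slots one feature touches, one per probe."""
    return [slot(feature, p, num_slots) for p in range(num_probes)]


def encode(features: dict, num_slots: int = 16, num_probes: int = 2) -> list:
    """Fold a {name: value} feature map into a fixed-length vector.

    Each feature adds its value to num_probes slots. When two features
    land in the same slot, their values are summed, so that slot holds
    a mix of both features.
    """
    vec = [0.0] * num_slots
    for name, value in features.items():
        for s in probe_slots(name, num_slots, num_probes):
            vec[s] += value
    return vec


if __name__ == "__main__":
    example = {"word=cheap": 1.0, "word=pills": 1.0, "sender=unknown": 1.0}
    for name in example:
        print(name, "->", probe_slots(name))
    print(encode(example))
```

With a single probe, two features that collide become indistinguishable to the learner. With two or more probes, a pair of features is fully confounded only if they collide on every probe, which is much less likely, so a partially shared slot acts as noise rather than as a complete merge of the two features.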
