Ted,
RI seems pretty interesting.

Do you have any reference papers or systems showing how people have used
it to improve recommendation systems, and how they define context vectors
using extra information?

A quick idea I had was to use LDA to build topic vectors and use them
as context vectors. Any thoughts on that?
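
To make the context-vector question concrete, here is a minimal sketch of
how I understand basic RI for item similarity, in plain Python. Everything
in it is an assumption for illustration: the function names (index_vector,
context_vectors, cosine), the dimension of 512, the 8 nonzero entries, and
the toy interactions are mine and do not come from Mahout or any other
library. The LDA variant would presumably swap the random index vector for
a topic vector, which this sketch does not attempt.

```python
import math
import random


def index_vector(dim, nonzeros, rng):
    """Sparse ternary index vector: a few random +1/-1 entries, zeros elsewhere."""
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def context_vectors(interactions, dim=512, nonzeros=8, seed=42):
    """Item context vector = sum of the index vectors of the users who touched it."""
    rng = random.Random(seed)
    user_index = {}    # user -> fixed random index vector
    item_context = {}  # item -> accumulated context vector
    for user, item in interactions:
        if user not in user_index:
            user_index[user] = index_vector(dim, nonzeros, rng)
        ctx = item_context.setdefault(item, [0.0] * dim)
        for i, value in enumerate(user_index[user]):
            ctx[i] += value
    return item_context


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy data: items "a" and "b" share users, item "c" does not.
interactions = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "c")]
vectors = context_vectors(interactions)
print(cosine(vectors["a"], vectors["b"]))  # same users, so very close to 1
print(cosine(vectors["a"], vectors["c"]))  # no shared users, so much smaller
```

Items seen by the same users end up with nearly identical context vectors,
while items with no users in common are only similar by chance collisions
between the sparse index vectors.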

RI seems to be a good candidate for contribution to Mahout.


On Wed, May 23, 2012 at 12:11 PM, Ted Dunning <[email protected]> wrote:
> RI, per se, probably won't help that much with the coincidence problem.
>
> The Mahout math libraries would help a lot with a random indexing
> implementation.
>
> Kitenga has some very nice random indexing support.  See
> http://www.kitenga.com/
>
> They offer commercial software, but you get what you pay for.
>
> On Wed, May 23, 2012 at 12:18 AM, Mugoma Joseph Okomba 
> <[email protected]> wrote:
>
>>
>> Thanks for all the comments. They give us an idea of what direction to take.
>>
>> We have been zeroing in on the idea of Random Indexing, but RI seems to be
>> missing from Mahout currently. Are there plans to implement RI in Mahout?
>> Are there any libraries out there that would be useful for RI?
>>
>> On Sun, May 20, 2012 9:47 am, Ted Dunning wrote:
>> > The basic reasoning here is that any cooccurrence measure without
>> > smoothing will have zero overlap whenever all the others have zero
>> > overlap.  This seems to be the root of your problem.  The solution is
>> > to increase overlap or increase data.
>> >
>> > The problem with correlation-based approaches is that they overstate
>> > coincidental overlaps.  Fixing that can't fix the problem of no overlap.
>> >
>>
>>
>>
