Ah! So if it were a sparse vector it could be indexed directly. Or the
mapping could use a hash-indexed representation like the one used with
Lucene vectors.
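For what it's worth, here is a minimal sketch of the kind of hashing
being discussed: folding a 64-bit ID into a non-negative int that can
serve as a vector offset. The class and method names are illustrative
only, not Mahout's actual API.

```java
// Hypothetical sketch: map a long ID to a non-negative int vector offset.
// Names are illustrative and not taken from the Mahout codebase.
public final class IdHashing {

  // XOR-fold the high and low 32-bit halves, then clear the sign bit
  // so the result is always a valid non-negative array/vector index.
  static int idToIndex(long id) {
    return 0x7FFFFFFF & (int) (id ^ (id >>> 32));
  }

  public static void main(String[] args) {
    long itemId = 9876543210L; // an ID too large to fit in an int
    int index = idToIndex(itemId);
    System.out.println(index); // non-negative, usable as an int offset
  }
}
```

Note the trade-off the thread touches on: hashing is one-way, so a side
file (or some other reverse mapping) is still needed to recover the
original long IDs from the int offsets.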

On Sun, Jun 12, 2011 at 3:43 AM, Sean Owen <sro...@gmail.com> wrote:
> The keys have to be hashed to be used as int offsets into a vector. While
> loading the mapping isn't ideal, it only scales with the number of items
> and users.
>  On Jun 12, 2011 3:47 AM, "Lance Norskog" <goks...@gmail.com> wrote:
>> The RecommenderJob makes a "side" file which maps a fabricated integer
>> index to a long ItemID. Why is this needed? Couldn't the
>> RecommenderJob propagate the long ItemID directly? Note that this
>> forces all instances of AggregateAndReduceRecommender to load the
>> entire map. Part of the Map/Reduce rules are 'nothing needs to know
>> everything'.
>>
>> Is this a sparse/dense optimization? If so, have the distributed
>> algorithms advanced enough to make this indirection unnecessary?
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>



-- 
Lance Norskog
goks...@gmail.com
