On Sun, Aug 4, 2013 at 5:34 PM, Pat Ferrel <[email protected]> wrote:
> Actually this brings up another point that I've harped on before. It sure
> would be nice to have a vector representation where you could attach
> arbitrary data to items or vectors. Not so memory efficient but it makes
> things like ID translation and timestamping actions trivial. If these could
> be attached and survive all the Mahout jobs there would be no need for the
> in-memory hashmap I'm using to translate IDs and the actions could be
> timestamped or other metadata could be attached. At present I guess
> everyone knows that only weights are attached to actions/matrix values and
> in some cases names to rows/vectors in DRMs.
>

This is where we started, actually. The memory cost was fairly massive for arbitrary objects being attached to sparse matrices. The problem is that the cost of the annotations isn't amortized very far in long-tail situations. If we restrict our attention to text annotations, then a heavily compressed form might well be feasible.
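
For anyone following along, the side-table approach mentioned in the quoted message (an in-memory hashmap that translates IDs while the vectors carry only weights) can be sketched roughly as below. This is a minimal illustration in Python, not Mahout code: `IdDictionary`, `index_of`, and `id_of` are hypothetical names, and the sparse row is just a plain dict standing in for a real vector.

```python
class IdDictionary:
    """Bidirectional map between external string IDs and dense int indices.

    Hypothetical helper for illustration only; not part of Mahout.
    """

    def __init__(self):
        self._to_index = {}  # external ID -> int index
        self._to_id = []     # int index -> external ID

    def index_of(self, external_id):
        """Return the index for external_id, assigning the next free one if new."""
        idx = self._to_index.get(external_id)
        if idx is None:
            idx = len(self._to_id)
            self._to_index[external_id] = idx
            self._to_id.append(external_id)
        return idx

    def id_of(self, index):
        """Return the external ID that was assigned to index."""
        return self._to_id[index]


ids = IdDictionary()

# The sparse row holds only int index -> weight; no metadata is attached.
row = {ids.index_of("item-a"): 1.0, ids.index_of("item-b"): 2.0}

# After processing, translate the indices back to external IDs.
translated = {ids.id_of(i): w for i, w in row.items()}
```

The dictionary stores each ID once per item rather than once per matrix entry, but it has to be kept in memory and carried alongside the data outside the jobs, which is the inconvenience described in the quoted message.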
