If using Strings internally as IDs costs too much from a performance perspective, that's totally fine; I wasn't trying to pick that fight. It sounds like there isn't much appetite for String wrappers, however.
In any event, your suggestion to switch to numeric IDs is a non-starter: re-keying the tables in the database system I'm using would break all the other jobs running against those tables.

-chuck

On Aug 10, 2011, at 11:34 PM, Sean Owen wrote:

> Yes, it's just that it's much slower and takes up much more memory. You are
> strongly encouraged to use numeric IDs and not bother with this adapter at
> all. It's not a question of interning strings, and they need not be
> consecutive IDs, but avoiding them entirely.
>
> On Thu, Aug 11, 2011 at 1:02 AM, Charles McBrearty <[email protected]> wrote:
>
>> Hi,
>>
>> I am taking a look at running some of the recommender examples from Mahout
>> in action on a data set that I have that uses strings as the ItemID's and it
>> looks to me like the suggested way to do this is to subclass FileDataModel
>> and then use FileIdMigrator to manage the String <-> Long mapping.
>>
>> This seems like a lot of complication to deal with what I would imagine is
>> a pretty common use case. Is there something that I'm missing here?
>>
>> Thanks for any info that anyone can provide.
>>
>> -chuck
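For readers following the thread, the String <-> Long mapping under discussion can be sketched roughly as below. This is only an illustration, not Mahout's own code: the class name `StringIdMapper` and its methods are hypothetical, and the choice of MD5 truncated to 64 bits is an assumption made for the example. The idea is that the string keys in the database stay untouched, while the recommender sees a derived numeric ID and an in-memory reverse map translates results back.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Mahout): derives a stable long ID from a
// String ID and remembers the reverse mapping in memory.
public class StringIdMapper {

    // Reverse lookup: numeric ID -> original string ID.
    private final Map<Long, String> longToString = new HashMap<>();

    // Returns the numeric ID for a string ID and records the reverse mapping.
    public long toLongId(String stringId) {
        long id = hash(stringId);
        String previous = longToString.putIfAbsent(id, stringId);
        if (previous != null && !previous.equals(stringId)) {
            // Two different strings hashed to the same 64-bit value.
            throw new IllegalStateException(
                "Hash collision between '" + previous + "' and '" + stringId + "'");
        }
        return id;
    }

    // Returns the original string ID, or null if this numeric ID was never seen.
    public String toStringId(long longId) {
        return longToString.get(longId);
    }

    // Assumption for this sketch: use the first 8 bytes of the MD5 digest.
    private static long hash(String value) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(value.getBytes(StandardCharsets.UTF_8));
            long result = 0L;
            for (int i = 0; i < 8; i++) {
                result = (result << 8) | (digest[i] & 0xFFL);
            }
            return result;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Because the numeric ID is derived from the string rather than stored, nothing in the database has to be re-keyed. The costs are the extra memory for the reverse map and the hashing work on every lookup, which is presumably part of the overhead Sean describes.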
