Hello! I exported the Netflix data to MongoDB and then tried to build a MongoDBDataModel, but it is taking too long. When I inspected the MongoDBDataModel class, I found that it converts ids from string to long, because Mongo uses strings for user_id and item_id while Mahout uses long for ids.
MongoDBDataModel stores these conversions in another collection, and as it iterates over all the documents in the ratings collection, it checks this conversion collection to see whether a long id has already been assigned to each string id (user and item). I think checking this collection, and creating a new entry when necessary, becomes a big overhead when the data set is very large. Is there a solution to this included in Mahout, or do I have to write my own optimized code? Regards, Onur
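
A minimal sketch of one possible optimization, assuming the whole string-to-long id mapping fits in memory: keep the mapping in a hash map during the pass over the ratings instead of querying the conversion collection for every document. The class `InMemoryIdMapper` and its methods are hypothetical names used for illustration; they are not part of Mahout or the MongoDB driver.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout: holds the string-to-long id
// mapping in memory so resolving an id needs no database round trip.
final class InMemoryIdMapper {
    private final Map<String, Long> assignedIds = new HashMap<>();
    private long nextId = 0L;

    // Returns the long id already assigned to stringId,
    // or assigns the next free long id and remembers it.
    long idFor(String stringId) {
        Long existing = assignedIds.get(stringId);
        if (existing != null) {
            return existing;
        }
        long assigned = nextId++;
        assignedIds.put(stringId, assigned);
        return assigned;
    }

    // Number of distinct string ids seen so far.
    int size() {
        return assignedIds.size();
    }
}
```

One mapper instance could be used for user ids and another for item ids. If the mapping has to survive a restart, it could be written to the conversion collection once after the pass rather than once per rating.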
