Hello!

I exported the Netflix data to a MongoDB database and then tried to build a 
MongoDBDataModel, but it is taking too long. When I inspected the 
MongoDBDataModel class, I found that it converts IDs from string to long, 
because Mongo uses strings for user_id and item_id while Mahout uses longs for 
IDs.

MongoDBDataModel stores these conversions in another collection. As it 
iterates over all the documents in the ratings collection, it checks this 
conversion collection to see whether a long ID has already been assigned to 
each string ID (user and item). I think that checking this collection, and 
creating a new entry when necessary, becomes a significant overhead when the 
data set is very large.
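
For illustration, here is a minimal sketch of the kind of optimization I mean: 
keep the string-to-long mapping in memory while loading, so that each rating 
costs a hash lookup instead of a query against the conversion collection. The 
class InMemoryIdMapper and its methods are hypothetical names, not part of 
Mahout, and the sketch assumes the whole mapping fits in memory.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Mahout: assigns a long ID to each string ID
// the first time it is seen, using in-memory maps instead of a per-document
// lookup in a separate collection.
class InMemoryIdMapper {
    private final Map<String, Long> stringToLong = new HashMap<>();
    private final Map<Long, String> longToString = new HashMap<>();
    private long nextId = 0L;

    // Returns the long ID for stringId, assigning a new one if it is unseen.
    long toLong(String stringId) {
        Long existing = stringToLong.get(stringId);
        if (existing != null) {
            return existing;
        }
        long assigned = nextId++;
        stringToLong.put(stringId, assigned);
        longToString.put(assigned, stringId);
        return assigned;
    }

    // Returns the original string ID, or null if longId was never assigned.
    String toStringId(long longId) {
        return longToString.get(longId);
    }

    // Number of distinct string IDs seen so far.
    int size() {
        return stringToLong.size();
    }
}
```

If the mapping has to be persisted, it could presumably be written to the 
conversion collection once, in bulk, after the ratings have been read, rather 
than one entry at a time.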

Is there a solution to this already included in Mahout, or do I have to write 
my own optimized code?

Regards,
Onur
