I have a couple of MongoDB structures that contain docs, terms associated
with each vector dimension, term weights, docids for similar docs,
clusters, docs included in the clusters, etc. They come from several
sequence files in HDFS, so I'm just looking for a way to conveniently do
the post-Mahout processing. If each sequence file were in Mongo with
keys indexed, I can imagine how to connect the dots. Also, I'm creating a
prototype, so I'm trying to find the easiest way to do it. Since the data has
to get into Mongo anyway, I thought doing it sooner in the pipeline would be
simplest. I realize that I don't need to export into human-readable JSON and
could write to Mongo directly, and that is certainly an option.
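To be concrete about what I mean by writing directly, here is a rough sketch
of the kind of thing I have in mind. It's just the plain Hadoop and Mongo Java
APIs; it assumes the sequence files hold Text/VectorWritable key-value pairs,
and the HDFS path, database, and collection names are all made up:

    import java.util.Iterator;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.mahout.math.Vector;
    import org.apache.mahout.math.VectorWritable;

    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.Mongo;

    public class SeqToMongo {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // hypothetical input: one part file from the Mahout vector output
        Path path = new Path("hdfs:///mahout/vectors/part-r-00000");

        Mongo mongo = new Mongo("localhost", 27017);
        // made-up database and collection names
        DBCollection docs = mongo.getDB("mydb").getCollection("docs");

        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        Text key = new Text();                       // doc id / name
        VectorWritable value = new VectorWritable(); // term weight vector
        while (reader.next(key, value)) {
          BasicDBObject doc = new BasicDBObject("_id", key.toString());
          BasicDBObject weights = new BasicDBObject();
          Vector v = value.get();
          for (Iterator<Vector.Element> it = v.iterateNonZero(); it.hasNext();) {
            Vector.Element e = it.next();
            // dimension index -> weight; mapping the index back to a term
            // would come from the dictionary file
            weights.put(String.valueOf(e.index()), e.get());
          }
          doc.put("weights", weights);
          docs.insert(doc);
        }
        reader.close();
        mongo.close();
      }
    }

That's roughly the "write to Mongo directly" option; the question is whether
something like it already exists, or whether seqdumper could grow an output
format so I don't have to write it per structure.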
I looked for a way to use Mongo as a generic backing store for
Hadoop/Mahout but struck out (I'm not even sure that would be a good idea
anyway). I did see the Pig integration and your code for the
MongoDBDataModel in the recommender, but neither seemed to apply to my
case.
Any advice is appreciated.
On 3/17/12 4:01 PM, Sean Owen wrote:
What do you mean by indexed here?
On Sat, Mar 17, 2012 at 10:56 PM, Pat Ferrel<[email protected]> wrote:
I need to digest some Mahout files and merge them into a MongoDB database.
Since digesting would be a lot easier if the Mahout keys were indexed, I
wonder if a "seqdumper --format json or mongodb" option might be useful. It would
make my life easier, but maybe there is already a better way to do this?