Hello,

I am new to Mahout. I have set up Hadoop, Nutch, and Pig, and I feel I am very 
knowledgeable about Solr and fully understand Lucene. I am a PHP developer and 
have only tinkered with Java code.

I have 2 million job postings and I need to build a categorization system. I 
figured Mahout should do the trick, so I set up the 20 Newsgroups example and 
ran it. Now I am trying to figure out how Mahout will fit into the 
job-posting-into-Solr chain.

Currently a job posting goes into a queue to be processed into a Solr 
document. We have a number of worker processes that each add to the document, 
for example by calling Google to get a latitude/longitude for the job posting 
location. I figure Mahout would sit in one of these worker queues.

What are my options for accessing Mahout from PHP? A web service? Calling it 
from bash? I would like a system where I post it a chunk of text and it 
returns a list of suggested categories, since a job posting can belong to 
multiple categories.
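
To make the request concrete, here is a rough sketch of the kind of interface I 
have in mind, written with the HTTP server that ships with the JDK. Everything 
in it is an assumption for illustration: the class name CategoryService, the 
/categorize path, port 8080, and especially the keyword table, which is only a 
stand-in for whatever trained Mahout model would really do the scoring. No 
Mahout calls appear here because I do not yet know which ones I would need.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class CategoryService {

    // Hypothetical stand-in for a trained classifier: category -> keywords.
    static final Map<String, List<String>> KEYWORDS = new LinkedHashMap<>();
    static {
        KEYWORDS.put("engineering", List.of("java", "php", "developer"));
        KEYWORDS.put("healthcare", List.of("nurse", "clinic", "patient"));
        KEYWORDS.put("sales", List.of("sales", "account", "quota"));
    }

    // Returns every category with at least one keyword hit, highest score first.
    // A posting can match several categories, so the result is a list.
    static List<String> suggest(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        Map<String, Integer> scores = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : KEYWORDS.entrySet()) {
            int hits = 0;
            for (String keyword : entry.getValue()) {
                if (lower.contains(keyword)) {
                    hits++;
                }
            }
            if (hits > 0) {
                scores.put(entry.getKey(), hits);
            }
        }
        List<String> result = new ArrayList<>(scores.keySet());
        // Stable sort: ties keep the order in which categories were declared.
        result.sort((a, b) -> scores.get(b) - scores.get(a));
        return result;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/categorize", exchange -> {
            // The request body is the raw job posting text.
            String text = new String(exchange.getRequestBody().readAllBytes(),
                    StandardCharsets.UTF_8);
            // One suggested category per line; the trailing newline keeps the
            // body non-empty even when nothing matches.
            String reply = String.join("\n", suggest(text)) + "\n";
            byte[] body = reply.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }
}
```

A PHP worker could then POST the posting text to that URL (with PHP's curl 
functions, or from the shell with curl --data-binary) and split the response 
on newlines. Whether a small service like this is a sensible way to wrap 
Mahout, compared with shelling out to it, is exactly what I am unsure about.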


Any pointers in the right direction would be appreciated.

dan

