Hi Grant,
Thanks for your reply. The log from Mahout is attached.
I have also hit another problem: when I run this command in
pseudo-distributed mode, the first job hangs for a very long time (10+
minutes) after the mappers finish and before the reducers start, even
though it is a very small training set (12 samples, 4 classes).
I also found reports of people getting an EOF exception when using
decision forest, caused by the "_SUCCESS" file created by map-reduce. I
suspect the same thing may be causing the problem above.
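
For illustration only, here is a minimal Python sketch of a reader that
skips Hadoop's marker files instead of trying to parse them. The function
names `is_data_file` and `data_files` and the sample listing are
hypothetical. The sketch only mirrors the Hadoop convention that names
starting with `_` or `.` are treated as hidden; it is not Mahout's actual
code.

```python
def is_data_file(name):
    """Return True for names that look like real job output (e.g. part files)."""
    # Hadoop treats names beginning with "_" or "." as hidden side files,
    # e.g. the _SUCCESS marker and the _logs directory.
    return not (name.startswith("_") or name.startswith("."))


def data_files(names):
    """Keep only the entries a model reader should try to open."""
    return [n for n in names if is_data_file(n)]


# Hypothetical listing of one job output directory (assumption, not real output).
listing = ["_SUCCESS", "_logs", "part-00000", ".part-00000.crc"]
print(data_files(listing))
```

A reader that applies a check like this before opening each file would never
try to deserialize the zero-byte "_SUCCESS" marker.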
Thanks



On 10/17/11 4:08 PM, "Grant Ingersoll" <[email protected]> wrote:

>Hi Wangda,
>
>Can you include the logs that were spit out by Mahout?
>
>On Oct 16, 2011, at 10:46 PM, <[email protected]> wrote:
>
>> Hi All,
>> I use a very simple input file as the Bayes input (I also tried the
>>20newsgroups example and got the same result):
>> ------
>> mahout  Mahout's goal is to build scalable machine learning libraries.
>>With scalable we mean: Scalable to reasonably large data sets. Our core
>>algorithms for clustering, classfication and batch based collaborative
>>filtering are implemented on top of Apache Hadoop using the map/reduce
>>paradigm. However we do not restrict contributions to Hadoop based
>>implementations: Contributions that run on
>> lucene  All deprecations targeted to be removed in version 3.0 were
>>removed. If you are upgrading from version 2.9.1 of Lucene, you have to
>>fix all deprecation warnings in your code base to be able to recompile
>>against this version. This is the first Lucene
>> spamassasin SpamAssassin is a mail filter to identify spam. It is an
>>intelligent email filter which uses a diverse range of tests to identify
>>unsolicited bulk email, more commonly known as Spam. These tests are
>>applied to email headers and content to classify email using advanced
>>statistical methods. In addition,
>> ------
>> 
>> I put the input in a directory named bayes-input and ran the
>>command line:
>>    bin/mahout trainclassifier -i bayes-input -o bayes-model
>>--classifierType bayes -ng 1 -source hdfs
>> ----
>> After training finished, every file in the bayes-model path has size 0:
>> 
>> bin/hadoop fs -ls bayes-model
>> Found 5 items
>> -rw-r--r--   3 hadoop supergroup          0 2011-10-17 10:16
>>/user/hadoop/bayes-model/_SUCCESS
>> drwxrwxrwx   - hadoop supergroup          0 2011-10-17 10:16
>>/user/hadoop/bayes-model/_logs
>> drwxrwxrwx   - hadoop supergroup          0 2011-10-17 10:19
>>/user/hadoop/bayes-model/trainer-tfIdf
>> drwxrwxrwx   - hadoop supergroup          0 2011-10-17 10:19
>>/user/hadoop/bayes-model/trainer-thetaNormalizer
>> drwxrwxrwx   - hadoop supergroup          0 2011-10-17 10:18
>>/user/hadoop/bayes-model/trainer-weights
>> ----
>> When I use this model to classify new data, every sample is
>>classified as "unknown".
>> 
>> My Environment:
>> 
>> 1.  OS     : CentOS 5
>> 2.  Mahout : 0.5
>> 3.  Hadoop : 0.20.205
>> 
>> Thanks,
>> Wangda
>> 
>
>--------------------------------------------
>Grant Ingersoll
>http://www.lucidimagination.com
>Lucene Eurocon 2011: http://www.lucene-eurocon.com
>
